AI, machine learning & audiovisual collections at NFSA - Rebecca Coronel, Dr Keir Winesmith
Rebecca Coronel, Head of Collection Preservation, and Dr Keir Winesmith, Chief Digital Officer, from the National Film and Sound Archive (NFSA) present at NDF23.
Abstract:
Artificial intelligence and machine learning (AI and ML) present a significant opportunity to increase the accessibility and discoverability of audiovisual collections. Until now, the ability to find content has depended on the quality and extent of data held in the collection catalogue. But each audiovisual item contains a wealth of additional information outside the scope of its catalogue entry. AI and ML allow us to use tools such as facial recognition, optical character recognition and voice-to-text transcription to search the collection. AI will revolutionise our ability to discover AV collections and allow us to uncover content that was previously invisible. In the past, content creators, researchers and the public needed the help of NFSA staff to identify relevant collection items, but an AI search tool opens up enormous opportunities for everyone to be able to find what they are looking for using simple search terms, images or location information.
This presentation will explore the NFSA’s early experiences in developing an AI search tool for the Australian national audiovisual collection. The presentation will look at lessons learned, challenges presented by machine learning and the future potential for this tool in assisting the NFSA to unlock content in the collection.
Transcript:
Session 3A: AI, Machine Learning, and Audiovisual Collections at NFSA
Moderator:
Welcome to the session AI, Machine Learning, and Audiovisual Collections at NFSA—the National Film and Sound Archive of Australia. We have:
Rebecca Coronel – Head of Collection Preservation, NFSA
Keir Winesmith – Chief Digital Officer, NFSA
Over to you.
Presentation by Rebecca Coronel
Rebecca Coronel:
Thanks very much, everyone, and thank you for such a warm welcome. It’s been a long time since I’ve been to Wellington, and it’s been great to be here.
We’re going to do a bit of a double act today. I’m going to be talking a little bit about the NFSA, the National Film and Sound Archive’s collection, our digitisation work, and a machine learning pilot that we undertook. Then, I’m going to hand over to Keir, and he’s going to be talking about the exciting work we’re just launching into right now.
Just for those who maybe haven’t been to the NFSA or haven’t had a chance to go to Canberra—this is our building in the national capital on Ngunnawal land. We are the Australian National Audiovisual Collection, so we hold anything that can be played on a screen and audio—all kinds of collection items.
This paper is about our journey to improve the discoverability of our collection. We’re just at the very beginning of that journey, so this is really about a trial, a proof of concept, and working towards more specific collection discovery tools—and what we’re calling the Conversational Archive, which is our next step.
You might wonder why the non-tech person is speaking, and I am very much the non-tech person! That’s because I am the Head of Collection Preservation, so I look after the analogue collection at the National Film and Sound Archive, which numbers close to probably a million items. And, of course, it’s very hard to apply AI and ML unless that collection is digitised—and that’s the other part of my responsibility.
What We Collect
As I said, we’re a wide-ranging film and sound archive, so this is just an example to give you a sense of the variety—Mad Max: Fury Road. But we collect all components of audiovisual material:
Film
Magnetic tape
All kinds of audio
All kinds of recordings, recorded sound, music, oral histories—all kinds of things
To give you an idea of our classic institutional backlog—for instance, in the case of Mad Max, we have seven pallets of outtakes from the original Mad Max film, none of which are digitised. So, there’s a lot of content that we have that is available, and a lot that is not yet available.
Who Do We Serve?
We are a national collecting institution, so we are an Australian government organisation. We have:
Public visitors coming into our building in Canberra
Researchers, documentary filmmakers, and other clients accessing our collection
Educational activities, with a particular focus on media literacy
One of the key things for us is that for every one visitor who comes into our Canberra building physically, we have something in the order of 4,000 visits online.
So, for us, being online and making sure our collection is discoverable, useful, and accessible is the way we connect with the maximum number of people.
Some interesting stats—while we’re an Australian government organisation, our audience definitely extends internationally. When we look at the statistics for people accessing our collection online, for some areas of the collection, it’s more international than it is Australian.
Expanding the Collection
Our current mission is to tell the national story by collecting, preserving, and sharing audiovisual media—and also the cultural experience platforms of our time.
That last part is important in terms of where we’re heading. We’ve just started collecting:
Video games
AR (augmented reality) experiences
VR (virtual reality) experiences
Social media content
These are the cultural experience platforms we’re all using now. But collecting them is a challenge—so that’s also part of where we’re heading.
The Urgency of Digitisation
Some of you who work in archive land might know us from Deadline 2025, which talks about the risk to magnetic tape media.
We did a lot of work in the last few years to promote and advocate for the preservation of tape, which we understand largely will not be accessible if it’s not digitised in the next couple of years.
2025 is a bit of an arbitrary date, but it’s a good indication of the urgency and the priorities we need to apply to magnetic media preservation.
We’re lucky—we were recently the recipients of $41.9 million specifically to help us in that task. So, we’re currently digitising not just our own collection but also the collections of other Australian institutions.
However, our curators keep collecting. In the last year or so, we digitised around 27,000 magnetic tape items, but our curators collected another 16,000 in that same year—so it’s a never-ending exercise.
Digitisation & Infrastructure Upgrades
Before I hand over to Keir, here’s an example of our magnetic media digitisation setup.
We’ve scaled up to be able to run 29 video streams simultaneously—which is fantastic.
We run one preservation scanner and one small-gauge scanner.
Our new funding has allowed us to purchase another scanner and additional film equipment.
We have around 150,000 film items that are not yet digitised.
Keir’s team has been working very hard on our new offsite data storage centre in Canberra. It’s in a secure Australian government-approved facility.
This is a huge step forward for our:
Digitisation strategy
IT infrastructure
Security and disaster recovery
Previously, we had to juggle LTO tapes between different buildings to ensure we always had a secure backup. Now, we have a dedicated offsite centre, which makes a massive difference.
For those who love the machines, this is our new tape library—LTO-8 is our next step forward. This library will hold about 27 petabytes of data.
Our current collection sits at around 7 petabytes and grows all the time.
That’s the technology and digitisation end. Now, I just want to briefly talk about the machine learning pilot we ran before I hand over to Keir.
Presentation by Keir Winesmith:
Kia ora, everyone.
Trigger warning: I talk really fast, I swear, and I make jokes. So if you feel like you should laugh, you should laugh. If you feel like, is he?—he is.
The Conversational Archive
So, we started thinking about the premise of a conversational archive, and we have to go back a little bit to understand what we mean by that.
Way before writing, when we were keeping cultural content and passing it through generations, it was stories and dance, it was movement and song.
After that came lists—the beginning of writing in Australia.
After lists came lists with lines that looked like spreadsheets but on paper—hundreds of years before spreadsheets.
Then we took the spreadsheets and made those into cards and put them in little boxes.
Then we took the cards, made them into dots—ones and zeros—and put them in computers 55 years ago.
So, this metaphor for finding cultural content has not changed—not for your institutions, and not for mine.
It’s still strings of characters in lists in the right order, the way the cataloguer put it in—not the way you, as a punter, think about it.
The computational archive is a new idea—maybe the last decade.
We think that after the computational archive comes the conversational archive.
And because we’re an institution that likes to build things as well as think about things, we thought—we can’t just say that, we just have to do it.
The Idea Behind "Ask AA"
We also want to acknowledge that these ideas of passing cultural content through time are not new.
Foucault was writing about the idea of an archive not being a bunch of books, but a bunch of ideas joined up by humans—and we totally, totally agree.
So, we got a whole bunch of money that was called Audiovisual Australia—or AA for short.
"Is that AA-funded?"
"Is our trip AA-funded?"
"Is the scanner AA-funded?"
"Are the servers AA-funded?"
AA just became a stand-in for all of it.
And so we thought—what happens if you ask AA?
What if you take all those entities that are inside the institution, and all of those partners that Rebecca talked about that are outside the institution, and you create a conversational archive so that people can engage with the collection in a way that feels really natural?
Because once it’s four million things, you can’t know all the things.
You need a way into the things.
Where Does Machine Learning Fit?
This is actually a scrubbed and differentiated version of our ecosystem.
We have a lot of technology. We are a technology hub—hundreds and hundreds of machines.
So we asked ourselves:
If we were going to put machine learning inside the NFSA, where would we put it?
What would make sense?
What’s logical?
And so we stepped back, and we thought:
The three things we exist to do are:
Collect
Preserve (the objects, metadata, and digital copies)
Share (online and in person)
We value both in-person and digital access, but we know 4,000 people online engage for every 1 in person.
So, where does machine learning fit in that?
It doesn’t replace what we do—it’s an enabler, not a replacement.
And so we thought—it goes there, and it generates that, and it goes there too.
And actually, this is the photo to take a picture of—because this is what Ask AA is.
It’s not just an LLM instead of an archive.
It’s a whole bunch of very targeted, intentional decisions to place machines in specific spots.
From Dreaming AI to Primary Source AI
When you finish that work, you get to ask questions.
So, the first question I was interested in was:
"Write me an essay on the history of Australian film."
And this is what came out—this is PowerPoint, but these are the actual words copy-pasted from our tool:
"The Arrival of Sound and Decline (1930–1940)"
What even is that? I would watch it—but it doesn’t exist.
Nothing in there exists.
Muriel’s Wedding exists, but the date’s wrong.
Everything is bollocks.
Even the IDs are made up.
So we were like, "Yeah, let’s not do that."
Instead, we programmed it to say no—"Write your own essay. Maybe I can find you some films to watch or listen to, and you can write your own essay."
Lazy students.
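The "programmed it to say no" behaviour described above can be sketched as a simple intent guardrail in front of the search tool. This is a hypothetical illustration, not the NFSA's actual implementation: the function names and cue phrases are invented, and a real system would use a proper classifier rather than keyword matching.

```python
# Toy guardrail: refuse requests that ask the model to author content,
# and redirect the user to collection search instead.
# All names here (classify_intent, answer) are illustrative assumptions.

GENERATIVE_CUES = ("write me", "write an essay", "compose", "write a history")

def classify_intent(query: str) -> str:
    """Crude intent check: is the user asking the model to write something?"""
    q = query.lower()
    return "write_essay" if any(cue in q for cue in GENERATIVE_CUES) else "find_items"

def answer(query: str) -> str:
    if classify_intent(query) == "write_essay":
        # The refusal from the talk: don't generate, offer retrieval instead.
        return ("Write your own essay. Maybe I can find you some films to "
                "watch or listen to, and you can write your own essay.")
    return f"Searching the collection for: {query}"

print(answer("Write me an essay on the history of Australian film"))
print(answer("interviews with Ben Mendelsohn"))
```

The point of the sketch is the shape of the decision, not the classifier: generative requests get a refusal plus a redirect, and everything else falls through to retrieval.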
A Smarter Search: Finding Ben Mendelsohn
I’m a massive fan of Ben Mendelsohn—I think he’s a genius actor, one of Australia’s finest.
I was convinced in my head that his career blew up due to Animal Kingdom.
So I asked:
"When did Ben Mendelsohn’s career take off?"
And it said:
"It actually took off when he acted in the film The Year My Voice Broke."
And we know that because he said that—and the link exists—and that ID is real.
You click the link, and you hear Ben Mendelsohn talking about his own career.
That’s the difference between a large language model and what we’re calling a primary source model.
Our model doesn’t invent things.
It gets you to the bit of the collection that is relevant to you.
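The primary source model described above can be sketched as retrieval-grounded answering: the system only quotes retrieved catalogue records, so every ID and link it returns is real, and it says "nothing found" rather than inventing. The records, field names, and keyword-overlap retrieval below are invented for illustration; a production system would use embeddings and the real catalogue.

```python
# Minimal sketch of a "primary source model": assemble answers only from
# retrieved records. Records and IDs here are made up for illustration.
from dataclasses import dataclass

@dataclass
class Record:
    item_id: str
    title: str
    transcript: str

CATALOGUE = [
    Record("12345", "Ben Mendelsohn oral history",
           "My career really took off when I acted in The Year My Voice Broke."),
    Record("67890", "Animal Kingdom press kit", "Feature film, 2010."),
]

def retrieve(query: str) -> list[Record]:
    """Toy retrieval via keyword overlap; stands in for a vector search."""
    terms = set(query.lower().split())
    return [r for r in CATALOGUE
            if terms & set((r.title + " " + r.transcript).lower().split())]

def answer(query: str) -> str:
    hits = retrieve(query)
    if not hits:
        # Never fall back to generation: the model doesn't invent things.
        return "Nothing found in the collection."
    top = hits[0]
    return f"{top.transcript} (source: item {top.item_id}, '{top.title}')"

print(answer("When did Ben Mendelsohn's career take off?"))
```

The design choice being illustrated: because every sentence in the answer is copied from a record, the citation (item ID and title) can be attached mechanically, which is what makes the Ben Mendelsohn link clickable and real.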
Building an Australian AI Model
Here’s the problem—everything we use speaks American.
You put Australian faces on TV in America? Does okay.
You put non-white Australian faces, especially First Nations faces? Does not find them.
For example:
"Chook roll" (Australian for chicken roll)? Nope.
"Servo" (Australian for service station)? Nope.
"Queanbeyan" (First Nations place name)? Nope.
The American models get "David", but they don’t get anything else.
So we’re calling our new model "Bowerbird".
Bowerbird is an Australian speech engine.
It understands Australian entities.
It knows Kylie Minogue is a person.
It knows Queanbeyan is a place.
It does better transcription than the American models.
Because we speak Australian, and they don’t.
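One way to picture the kind of local knowledge Bowerbird adds is a post-processing pass that tags transcripts against an Australian entity lexicon, so "Queanbeyan" resolves to a place and "Kylie Minogue" to a person. This is a hedged sketch under assumed names; the lexicon, tags, and matching here are illustrative only, not Bowerbird's actual design.

```python
# Illustrative sketch: tag Australian entities in a transcript using a
# local lexicon. Entries and tag labels are invented for this example.

AU_ENTITIES = {
    "kylie minogue": "PERSON",
    "queanbeyan": "PLACE",
    "servo": "TERM",   # Australian English: service station
    "chook": "TERM",   # Australian English: chicken
}

def tag_entities(transcript: str) -> list[tuple[str, str]]:
    """Return (entity, type) pairs found in a transcript, longest names first
    so multi-word names win over any substrings."""
    text = transcript.lower()
    return [(name, AU_ENTITIES[name])
            for name in sorted(AU_ENTITIES, key=len, reverse=True)
            if name in text]

print(tag_entities("Kylie Minogue played a gig near the servo in Queanbeyan"))
```

A generic American-trained model has no reason to know these strings; shipping a local lexicon (or fine-tuning on the national collection) is what makes the difference the talk describes.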
Final Thoughts
Spotify does this. Shopify does this. Everyone is doing this.
We are only using the national collection to train our AI.
We are not sucking from the internet.
Because if you use Whisper—when it dreams, it dreams in "Like and Subscribe."
Have a guess what it’s trained on.
Wrapping Up
So what did we learn?
Experimentation leads to good infrastructure—but only with proper governance.
We’re hosting a conference on AI in libraries, archives, and museums in Canberra in October 2024. You should come. Thanks.