Ballarat's two flagship cultural institutions are midway through a coordinated audit of their digitised collections, targeting thousands of duplicate image files that have accumulated over two decades of scanning programs. Archivists say the problem quietly erodes the usefulness of public heritage databases and wastes server infrastructure budgets that run into tens of thousands of dollars annually.
The Art Gallery of Ballarat, on Lydiard Street North, and the Museum of Australian Democracy at Eureka — known as MADE, on Stawell Street — began a joint digital asset review in the first quarter of 2026, aligning with a broader push by Museums Victoria and the Australian Institute for the Conservation of Cultural Material to standardise metadata and eliminate redundant files across regional collections. The timing is not accidental. Federal cultural infrastructure funding attached to the 2025–26 budget cycle required regional recipients to demonstrate digital collection integrity as a condition of ongoing grant eligibility.
Why Duplicates Are a Bigger Problem Than They Sound
A single digitisation campaign for a collection of 19th-century goldfields photographs — the kind Sovereign Hill and the Ballarat Historical Society have been running since the early 2000s — can generate three or four versions of the same image at different resolutions, under slightly different file names, catalogued by different staff members across different years. Multiply that across 30,000 objects in a mid-sized regional gallery and the duplication rate can reach 15 to 20 percent of total stored assets, according to published benchmarks from the Digital Preservation Coalition, a UK-based nonprofit whose 2024 annual report documented the problem across member institutions in eleven countries.
That figure matters in Ballarat's case because the Art Gallery of Ballarat holds one of the most significant colonial-era collections outside Melbourne, including works acquired as far back as 1884. Every duplicate file that sits undetected in the catalogue creates a potential indexing error — a single painting appearing twice in a search result, or worse, metadata from one work accidentally attached to another during bulk exports to the national aggregator Trove.
Cities with comparable heritage profiles have been at this longer. Bath's Holburne Museum completed a full duplicate-image audit of its digitised collection in late 2024, using open-source deduplication software developed by the Rijksmuseum in Amsterdam. Bruges, whose public library system digitised more than 400,000 medieval manuscript pages between 2018 and 2023, built automated hash-matching into its scanning workflow from the outset, meaning duplicates are flagged before they enter the database rather than cleaned out afterwards. Ballarat's institutions are doing the clean-up retrospectively, which is slower and more labour-intensive.
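For readers curious what hash-matching at the point of ingest looks like in practice, the idea can be sketched in a few lines of Python. This is a minimal illustration, not the Bruges or Rijksmuseum implementation: the `IngestIndex` class, its method names and the choice of SHA-256 are assumptions made for the example. It flags only byte-identical files. Copies of the same image saved at a different resolution produce different hashes and would need a perceptual-hashing technique, which is not shown here.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


class IngestIndex:
    """Hypothetical in-memory index: content digest -> first file seen."""

    def __init__(self) -> None:
        self._seen: dict[str, Path] = {}

    def check_and_add(self, path: Path) -> Path | None:
        """Return the earlier file with identical bytes, if any.

        If the content is new, record it and return None.
        """
        digest = file_digest(path)
        existing = self._seen.get(digest)
        if existing is not None:
            return existing  # duplicate: flag it before it enters the database
        self._seen[digest] = path
        return None
```

A scanning workflow would call `check_and_add` on each new scan and route any non-`None` result to a reviewer instead of the catalogue. A production system would keep the index in a database rather than in memory.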
What Ballarat Is Actually Doing About It
The joint audit between the Art Gallery of Ballarat and MADE is using a combination of manual curatorial review and a commercially licensed deduplication tool. The process is expected to run through to December 2026. Staff at both institutions declined interview requests for this article, but publicly available tender documents lodged with the City of Ballarat in March 2026 show the software licensing component of the project was budgeted at $18,500 for the 2025–26 financial year.
The Ballarat Heritage Office, which operates under the City of Ballarat's planning and environment directorate and maintains its own photographic archive of local built heritage, is not part of the joint audit. Its collection sits on separate server infrastructure and uses a different cataloguing system. Archival specialists writing in journals such as Archives and Manuscripts have flagged that kind of gap as a common fault line in regional council digital strategies.
Sovereign Hill, which manages its own photographic and artefact database independently of council, has not publicly detailed whether it is undertaking similar deduplication work, though its collection of goldfields-era imagery is among the most frequently licensed in the central highlands region.
For visitors and researchers using the Art Gallery of Ballarat's online collection portal or accessing Ballarat material through Trove, the practical upside of a completed audit is a cleaner, more reliable search experience. Institutions that have finished this work — Bath being the nearest comparable example — report measurable reductions in user-reported catalogue errors in the 12 months following completion. Ballarat's collections team is aiming for the same outcome by mid-2027, when the audit results are scheduled to feed into an updated digital access strategy for the gallery's 150th anniversary programming.