The City of Ballarat holds one of regional Victoria's largest photographic archives, with tens of thousands of digitised images spanning the 1850s gold rush through to the late twentieth century. A growing share of that collection — archivists have flagged the figure internally as potentially one in five scanned items — consists of duplicates: the same print or glass negative captured multiple times across different digitisation rounds, then catalogued under different file names and metadata tags. The result is inflated storage costs, slower public search results, and hours of staff time spent on manual review.
The problem has sharpened in 2026 because several major funding deadlines are converging. The State Library of Victoria's Digitisation Partnership Program is mid-cycle, and Sovereign Hill Museums Association — which manages its own photographic records related to Ballarat's colonial and mining heritage on Bradshaw Street — has been expanding its digital access offerings ahead of a planned gallery refresh. When institutions apply for grants under programs like the Australian Research Council's Linkage Projects scheme, duplicate-heavy collections weaken the case for fresh capital by suggesting poor data governance.
A Problem With Deep Roots in Gold Country
Ballarat's situation is not unique, but its heritage intensity makes it acute. The city's gold-rush origins mean that institutions including the Art Gallery of Ballarat on Lydiard Street North and the Ballarat Mechanics' Institute on Sturt Street have independently digitised overlapping holdings — sometimes the same donated photograph appearing in three separate institutional repositories. Deduplication software can flag near-identical pixel arrays, but heritage photographs present a specific challenge: a faded original and a restored copy of the same image will often fool automated tools into treating them as distinct items.
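The gap between exact matching and perceptual matching can be shown with a minimal sketch. It assumes a difference hash (dHash), which is one common perceptual hashing technique and not necessarily what any Ballarat institution runs. The 8x9 greyscale grid is a synthetic stand-in for a scan, and the threshold of 10 bits is a hypothetical cut-off chosen only for illustration.

```python
import hashlib

ROWS, COLS = 8, 9   # 9 columns give 8 comparisons per row: a 64-bit hash
THRESHOLD = 10      # hypothetical cut-off for "probably the same image"


def dhash(pixels):
    """Difference hash: one bit per pair of horizontally adjacent pixels."""
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left < right else 0)
    return bits


def hamming(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def checksum(pixels):
    """Exact-match fingerprint over the raw pixel values."""
    raw = bytes(v for row in pixels for v in row)
    return hashlib.sha256(raw).hexdigest()


# Synthetic stand-in for a scan already reduced to an 8x9 greyscale grid.
original = [[40 + 20 * ((r + c) % 5) for c in range(COLS)] for r in range(ROWS)]

# Uniform fading: lower contrast and lifted blacks; brightness order is kept.
faded = [[round(0.6 * v + 80) for v in row] for row in original]

# Crude "restoration": the top half is repainted as a flat tone.
restored = [[120] * COLS if r < 4 else row[:] for r, row in enumerate(original)]

same_file = checksum(original) == checksum(faded)
fade_distance = hamming(dhash(original), dhash(faded))
restore_distance = hamming(dhash(original), dhash(restored))

print("checksums match:", same_file)
print("faded copy within threshold:", fade_distance <= THRESHOLD)
print("restored copy within threshold:", restore_distance <= THRESHOLD)
```

In this toy case the faded copy fails an exact checksum comparison but keeps the same perceptual hash, while the locally repainted copy lands well outside the threshold and would be catalogued as a separate item. Real scans are messier, with cropping, skew and retouching all shifting the distance, so any threshold would need tuning against a hand-checked sample.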
Globally, cities with comparable heritage photography burdens have taken different approaches. Salzburg, Austria, working through its Stadtarchiv, completed a two-year AI-assisted deduplication project in 2024 that it says reduced its digital image count by roughly 18 percent while retaining all unique archival content. Dunedin, New Zealand — a city whose Scottish settler heritage gives it a photographic archive density similar to Ballarat's — deployed open-source perceptual hashing tools across the Hocken Collections at the University of Otago starting in late 2023, a lower-cost approach that required significant volunteer indexing hours. Neither model transplants directly to Ballarat, where institutional ownership is fragmented across council, state-linked bodies, and private associations.
What Ballarat's Institutions Are Doing Now
The Art Gallery of Ballarat, which holds more than 6,000 works including a substantial photography collection, has been working with the Collections Council of Australia's frameworks for shared metadata standards. The Mechanics' Institute — which celebrated 170 years of operation in 2021 — manages its own catalogue through the Australian Microforms and Digitised Records Program. Neither institution has publicly committed to a unified deduplication initiative, though the City of Ballarat's Digital Strategy, adopted in 2023, flags interoperability between council-managed datasets as a medium-term goal.
The financial stakes are real. Cloud storage for uncompressed archival-quality TIFF files, the standard format for heritage digitisation, runs to roughly $80 to $120 per terabyte per month on Australian-hosted platforms, according to publicly available pricing from providers including AWS Sydney and Microsoft Azure. A collection carrying 20 percent redundant files is spending roughly one dollar in every five of its storage bill on duplicates, every month they sit unchecked.
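To put a rough number on that redundancy, the arithmetic can be sketched as follows. The 50-terabyte archive size is a hypothetical figure used only for illustration and is not drawn from any Ballarat institution; the price range and duplicate share are the ones cited above.

```python
def redundant_storage_cost(archive_tb, duplicate_percent, price_per_tb_month):
    """Monthly spend attributable to duplicate files, in the pricing currency."""
    redundant_tb = archive_tb * duplicate_percent / 100
    return redundant_tb * price_per_tb_month


ARCHIVE_TB = 50           # hypothetical archive size, for illustration only
DUPLICATE_PERCENT = 20    # the duplicate share cited above
PRICE_RANGE = (80, 120)   # dollars per terabyte per month, as cited above

low, high = (
    redundant_storage_cost(ARCHIVE_TB, DUPLICATE_PERCENT, price)
    for price in PRICE_RANGE
)

print(f"Monthly cost of duplicates: ${low:,.0f} to ${high:,.0f}")
print(f"Annual cost of duplicates:  ${low * 12:,.0f} to ${high * 12:,.0f}")
```

Under those assumptions the duplicates alone would cost $800 to $1,200 a month, or $9,600 to $14,400 a year. The figure scales linearly, so an archive half that size would carry half the cost.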
Heritage Victoria's current grant rounds, open through August 2026, include provisions for collections management infrastructure. Archivists in the sector have noted that these provisions could be applied to deduplication tooling, although the program's published criteria prioritise physical conservation over digital workflows.
The most practical near-term step available to Ballarat's institutions is a coordinated audit using existing tools. The Smithsonian Institution in Washington published its deduplication methodology under a Creative Commons licence in 2022, and several Australian regional galleries have adapted it at minimal cost. Ballarat's archivists, spread across Lydiard Street, Bradshaw Street and the Sturt Street corridor, already have the expertise. The missing ingredient is a formal agreement to look at each other's catalogues in the same room at the same time.