Ballarat's public digital image libraries contain thousands of duplicate files. That is the blunt starting point for understanding why the City of Ballarat and several of its cultural partners launched a coordinated duplicate-image replacement program earlier this year — and why archivists say the problem was decades in the making.
The issue matters now because two major digitisation pushes are converging. Sovereign Hill's expanded archival project, which received tourism and heritage grant support in the 2024–25 federal budget cycle, generated a fresh tranche of high-resolution scans of goldfields-era photographs. Simultaneously, the Ballarat Heritage Office has been migrating legacy records from an older content management system into a contemporary cloud-based catalogue. When those two streams merged, staff found the same images, sometimes in three or four versions, sitting under different file names and metadata tags.
How the Duplicates Accumulated
The roots go back to at least 2003, when the Ballarat Fine Art Gallery — now the Art Gallery of Ballarat on Lydiard Street — began its first systematic digitisation of the permanent collection. Equipment at the time produced files at resolutions that later proved inadequate for print reproduction, so batches were rescanned in 2009 and again in 2014 without the earlier versions being retired from the system. Each round of scanning was handled by a different contractor working to a different technical brief, and no single taxonomy governed how files were named or tagged.
The Eureka Centre redevelopment in the early 2010s added another layer of complexity. When interpretive materials from the original Eureka Centre on Stawell Street were digitised and handed to the Museum of Australian Democracy at Eureka — known locally as MADE — some files arrived already duplicated from the source collection. Archival staff absorbed them without a deduplication step because, at the time, storage was cheap and the priority was preservation speed over catalogue hygiene.
Regional library networks compounded the problem. The Ballarat Library on Doveton Street North participates in the Libraries Victoria shared catalogue, and image assets pulled from that system during local exhibition projects were sometimes saved locally without being cross-referenced against what already existed on council servers. By the time anyone ran a systematic audit, the problem was structural rather than incidental.
What the Clean-Up Actually Involves
Duplicate-image replacement is not simply deleting extras. Archivists must determine which version of an image is the canonical one — the highest resolution, the best-preserved, the most accurately catalogued — before the others can be marked for removal or demotion. In Ballarat's case, that process has required physical cross-referencing against original glass plates and prints held at the Ballarat Historical Society on Barkly Street, because metadata errors in the digital records sometimes mean the file labelled as a 2014 high-resolution rescan is actually lower quality than the 2009 version.
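As a rough illustration of that selection step, the sketch below ranks one group of duplicate scans by pixel dimensions measured from the files themselves, and flags the group for human review when the newest-labelled scan is not the largest. The record fields, file names and figures are hypothetical and are not drawn from the Ballarat catalogue. The real process also weighs preservation quality and cataloguing accuracy, which this sketch does not model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanRecord:
    file_name: str
    labelled_year: int  # scan year claimed by the catalogue metadata
    width_px: int       # measured from the file, not taken from the label
    height_px: int


def pick_canonical(group):
    """Return (canonical, needs_review) for one group of duplicate scans.

    The canonical candidate is the scan with the most measured pixels.
    The group is flagged for review when that scan is not the one the
    metadata labels as most recent, since the label may be wrong.
    """
    canonical = max(group, key=lambda r: r.width_px * r.height_px)
    newest_label = max(group, key=lambda r: r.labelled_year)
    return canonical, canonical is not newest_label


# Hypothetical group: the file labelled 2014 is smaller than the 2009 scan.
group = [
    ScanRecord("plate_017_v1.tif", 2003, 1200, 900),
    ScanRecord("plate_017_2009.tif", 2009, 6000, 4500),
    ScanRecord("plate_017_hires.tif", 2014, 3000, 2250),
]
canonical, needs_review = pick_canonical(group)
print(canonical.file_name, needs_review)
```

A flagged group is the kind of case that would go back to an archivist for comparison against the physical original rather than being resolved automatically.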
The City of Ballarat's 2025–26 operational budget allocated funding to the Heritage Services team specifically for catalogue remediation work, though the council has not publicly itemised that line against the broader heritage portfolio. Staff are using open-source deduplication tools alongside manual review, a combination that archivists at similar regional institutions — including the Bendigo Regional Archives Centre — have used in comparable projects.
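The council has not named the tools in use, so the following is only a minimal sketch of the simplest automated pass such a workflow might include: grouping files whose contents are byte-for-byte identical, using SHA-256 digests from Python's standard library. The function name and sample data are illustrative assumptions. Exact hashing catches only copies saved under different names; a rescan of the same photograph produces different bytes and would still need perceptual comparison or manual review.

```python
import hashlib
from collections import defaultdict


def group_exact_duplicates(files):
    """Group file names whose contents are byte-for-byte identical.

    `files` maps a file name to its raw bytes (for files on disk, the
    bytes could come from pathlib.Path.read_bytes()). Only groups with
    more than one member are returned.
    """
    by_digest = defaultdict(list)
    for name, data in files.items():
        by_digest[hashlib.sha256(data).hexdigest()].append(name)
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


# Hypothetical sample: two names for the same bytes, plus one unrelated file.
sample = {
    "eureka_flag.tif": b"same image bytes",
    "IMG_0042_copy.tif": b"same image bytes",
    "main_road_1861.tif": b"different image bytes",
}
print(group_exact_duplicates(sample))
```

Each returned group is a candidate for retirement down to a single file, subject to the manual check on which copy carries the correct metadata.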
For the public, the benefits will arrive gradually. Researchers accessing Ballarat's digital collections through the PictureVictoria portal or the council's own Ballarat Heritage gateway have sometimes seen the same image returned as multiple search results, occasionally with contradictory dates or subject tags. As duplicates are retired and canonical versions are keyed to consistent metadata, those inconsistencies should resolve.
The program is expected to run through to mid-2027. Institutions involved have been advised to hold off on any new bulk digitisation intake until the existing catalogue is stabilised — a discipline that will test patience given how much goldfields material still sits unscanned in private and institutional collections across the Central Highlands. For now, the work is methodical, unglamorous, and long overdue.