The Daily Ballarat


How Ballarat's Digital Archives Ended Up Full of the Same Image Twice: The Road to Duplicate Replacement

A years-long accumulation of scanning shortcuts, budget constraints and shifting software platforms left the region's public digital collections riddled with duplicate images — and the clean-up is now well underway.


By Ballarat News Desk · Published 5 July 2026, 5:45 am · Updated 5 July 2026, 1:47 pm

Photo: Mitchell Luo on Pexels

Ballarat's public digital image libraries contain thousands of duplicate files. That is the blunt starting point for understanding why the City of Ballarat and several of its cultural partners launched a coordinated duplicate-image replacement program earlier this year — and why archivists say the problem was decades in the making.

The issue matters now because two major digitisation pushes are converging. Sovereign Hill's expanded archival project, which received tourism and heritage grant support in the 2024–25 federal budget cycle, generated a fresh tranche of high-resolution scans of goldfields-era photographs. At the same time, the Ballarat Heritage Office has been migrating legacy records from an older content management system into a newer cloud-based catalogue. When those two streams merged, staff found the same images, sometimes in three or four versions, sitting under different file names and metadata tags.

How the Duplicates Accumulated

The roots go back to at least 2003, when the Ballarat Fine Art Gallery — now the Art Gallery of Ballarat on Lydiard Street — began its first systematic digitisation of the permanent collection. Equipment at the time produced files at resolutions that later proved inadequate for print reproduction, so batches were rescanned in 2009 and again in 2014 without the earlier versions being retired from the system. Each round of scanning was handled by a different contractor working to a different technical brief, and no single taxonomy governed how files were named or tagged.

The Eureka Centre redevelopment in the early 2010s added another layer of complexity. When interpretive materials from the original Eureka Centre on Stawell Street were digitised and handed to the Museum of Australian Democracy at Eureka — known locally as MADE — some files arrived already duplicated from the source collection. Archival staff absorbed them without a deduplication step because, at the time, storage was cheap and the priority was preservation speed over catalogue hygiene.

Regional library networks compounded things further. The Ballarat Library on Doveton Street North participates in the Libraries Victoria shared catalogue, and image assets pulled from that system during local exhibition projects were sometimes saved locally without cross-referencing what already existed on council servers. By the time anyone ran a systematic audit, the problem was structural rather than incidental.

What the Clean-Up Actually Involves

Duplicate-image replacement is not simply deleting extras. Archivists must determine which version of an image is the canonical one — the highest resolution, the best-preserved, the most accurately catalogued — before the others can be marked for removal or demotion. In Ballarat's case, that process has required physical cross-referencing against original glass plates and prints held at the Ballarat Historical Society on Barkly Street, because metadata errors in the digital records sometimes mean the file labelled as a 2014 high-resolution rescan is actually lower quality than the 2009 version.
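The selection step described above can be illustrated with a short sketch. It is not the program's actual tooling: the record fields, the file names and the pixel dimensions are all hypothetical, chosen only to show why ranking by measured dimensions, instead of by the year on the label, can favour an older scan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanVersion:
    """One digital version of an image (all fields are illustrative)."""
    file_name: str
    labelled_year: int               # year claimed in the metadata
    width_px: int                    # measured from the file itself
    height_px: int
    verified_against_original: bool  # set after a manual check against the print


def pixel_count(version: ScanVersion) -> int:
    return version.width_px * version.height_px


def pick_canonical(versions: list[ScanVersion]) -> ScanVersion:
    """Pick the version with the most measured pixels.

    The labelled year is deliberately ignored; manual verification
    only breaks ties between versions of equal size.
    """
    if not versions:
        raise ValueError("no versions to choose from")
    return max(
        versions,
        key=lambda v: (pixel_count(v), v.verified_against_original),
    )


# Hypothetical example: the file labelled 2014 is smaller than the 2009 scan.
versions = [
    ScanVersion("plate_0042_2003.tif", 2003, 800, 600, False),
    ScanVersion("plate_0042_2009.tif", 2009, 4000, 3000, True),
    ScanVersion("plate_0042_2014.tif", 2014, 2400, 1800, False),
]
canonical = pick_canonical(versions)
```

In this made-up case the 2009 file is selected, which mirrors the kind of mislabelling archivists say they have had to catch by hand. Preservation quality and cataloguing accuracy, the other two criteria, are not modelled here.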

The City of Ballarat's 2025–26 operational budget allocated funding to the Heritage Services team specifically for catalogue remediation work, though the council has not publicly itemised that line against the broader heritage portfolio. Staff are using open-source deduplication tools alongside manual review, a combination that archivists at similar regional institutions — including the Bendigo Regional Archives Centre — have used in comparable projects.
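The article does not name the tools in use, but the simplest automated check of this kind groups files by a hash of their contents. The sketch below is an assumption about how such a pass might look, using only Python's standard library; the function names are hypothetical. It finds byte-identical copies stored under different file names, and nothing more.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large scans are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Return groups of files under root whose contents are byte-identical."""
    by_hash: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_hash[sha256_of_file(path)].append(path)
    # Only hashes shared by two or more files indicate duplicates.
    return [paths for paths in by_hash.values() if len(paths) > 1]
```

A pass like this would not catch rescans of the same photograph at different resolutions, since their bytes differ. Those cases need image-similarity tools or the manual review the archivists describe.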

For the public, the practical effect will be gradual. Researchers accessing Ballarat's digital collections through the PictureVictoria portal or the council's own Ballarat Heritage gateway have sometimes encountered the same image under multiple search results, occasionally with contradictory dates or subject tags. As duplicates are retired and canonical versions are keyed to consistent metadata, those inconsistencies should resolve.

The program is expected to run through to mid-2027. Institutions involved have been advised to hold off on any new bulk digitisation intake until the existing catalogue is stabilised — a discipline that will test patience given how much goldfields material still sits unscanned in private and institutional collections across the Central Highlands. For now, the work is methodical, unglamorous, and long overdue.


About this article

Published by The Daily Ballarat

This article was produced by The Daily Ballarat editorial desk and covers news in Ballarat. See our editorial standards for how we use AI.
