The Daily Ballarat

Ballarat news, every day


How Ballarat's digital archives ended up full of duplicate images — and what it took to finally fix it

Years of well-meaning digitisation projects, competing platforms and underfunded coordination left the region's cultural institutions sitting on thousands of redundant image files.


By Ballarat News Desk · Published 5 July 2026, 6:02 am · Updated 5 July 2026, 1:46 pm

Photo: Jose, Arthur W. (Arthur Wilberforce), 1863-1934; Taylor, Thomas Griffith, 1880-1963; Woolnough, Walter George, b. 1876; David, T. W. Edgeworth (Tannatt William Edgeworth), Sir, 1858-1934 / Public domain (Wikimedia Commons)

Ballarat's cultural institutions are midway through a painstaking effort to strip thousands of duplicate images from their shared digital archives. The problem built quietly over roughly two decades of piecemeal scanning projects, grant cycles that rewarded volume over accuracy, and software systems that never talked to each other properly.

The duplication issue matters now because regional councils and state-funded bodies are under growing pressure to demonstrate value from digital infrastructure spending. Public Record Office Victoria has tightened compliance expectations for regional repositories since 2023, and institutions that cannot demonstrate clean, deduplicated collections risk losing access to future digitisation funding rounds. For Ballarat, where the city's gold-rush heritage identity is tied directly to its photographic and archival holdings, the stakes are unusually high.

How the problem accumulated

The origins trace back to the early 2000s, when Ballarat's major cultural institutions — including the Art Gallery of Ballarat on Lydiard Street North and the Ballarat Heritage Services arm of the City of Ballarat — each began independent digitisation programs. Sovereign Hill, which draws roughly 500,000 visitors annually to its Bradshaw Street site, ran its own separate photographic cataloguing effort tied to tourism grant applications through Regional Development Victoria.

Each program used different metadata standards. Scans made for one grant application were often rescanned for another, with no central registry to flag that the image already existed. By the time the Museum of Australian Democracy at Eureka — known as MADE, on Stawell Street — opened its expanded digital gallery in 2015, there were at least three institutions in central Ballarat holding overlapping collections of images from the 1854 Eureka Stockade and surrounding goldfields period, none of them systematically cross-referenced.

The Federal Government's Trove platform, administered by the National Library of Australia, absorbed many of these contributions over successive years. That aggregation, useful as it was for public access, also amplified the duplication: the same glass-plate negative, scanned by two different institutions on two different occasions, could appear in Trove under different identifiers, different titles and different rights statements.

The audit and what it found

A formal audit commissioned through the Ballarat Regional Libraries network and delivered in the second half of 2024 counted more than 14,000 image records across the four main contributing institutions that were either exact duplicates or near-identical scans of the same physical object. That figure represented roughly 22 per cent of the combined digital holdings reviewed. The audit cost approximately $47,000, funded through a State Library Victoria community heritage grant.

The report identified three structural causes: grant conditions that measured outputs in raw file counts rather than unique objects; the absence of a shared identifier system across institutions before 2019; and staff turnover at regional galleries and libraries, which meant institutional memory about what had already been scanned was regularly lost. Ballarat Central Library on Doveton Street changed digitisation coordinators four times between 2008 and 2020, according to the audit's methodology section.

Deduplication work began in earnest in early 2025. It involves both automated matching software and manual review, because metadata inconsistencies mean automated tools alone cannot reliably identify near-duplicates without human confirmation. The process is expected to run through to at least mid-2027.
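For readers curious how a two-stage process of this kind can work, the following is a minimal sketch in Python. It is not the software the institutions are using, which the audit material does not name. It assumes each record is a dictionary with hypothetical `id`, `title` and `data` fields. Byte-identical files are matched automatically by checksum, while records whose titles are merely similar are only queued for a person to check. The 0.85 similarity threshold is an arbitrary assumption.

```python
import hashlib
from difflib import SequenceMatcher
from itertools import combinations


def file_digest(data: bytes) -> str:
    """SHA-256 checksum of the raw file bytes."""
    return hashlib.sha256(data).hexdigest()


def normalise_title(title: str) -> str:
    """Lower-case the title and collapse runs of whitespace."""
    return " ".join(title.lower().split())


def find_duplicates(records, title_threshold=0.85):
    """Split candidate pairs into exact matches and a manual-review queue.

    `records` is a list of dicts with hypothetical keys 'id', 'title'
    and 'data' (the file bytes). The threshold is an assumed value.
    """
    digests = {r["id"]: file_digest(r["data"]) for r in records}
    exact, review = [], []
    for a, b in combinations(records, 2):
        if digests[a["id"]] == digests[b["id"]]:
            # Identical bytes: safe to treat as the same file.
            exact.append((a["id"], b["id"]))
            continue
        score = SequenceMatcher(
            None, normalise_title(a["title"]), normalise_title(b["title"])
        ).ratio()
        if score >= title_threshold:
            # Different files, similar titles: a human must confirm.
            review.append((a["id"], b["id"], round(score, 2)))
    return exact, review
```

Two separate scans of the same glass-plate negative will almost never be byte-identical, and their titles may not match either. A real workflow would likely add image-based comparison on top of a metadata check like this one, which is one reason manual review remains part of the process.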

For anyone using these archives, including researchers, heritage consultants and tourism operators preparing interpretive material along the Eureka Centre precinct on Stawell Street, the practical implication is that search results in local digital catalogues are being revised progressively as duplicates are resolved. Records may temporarily appear, disappear or be merged without notice as the work proceeds.

Institutions have been advised to note the audit period in any materials that cite digital collection sizes, since headline figures will continue to shift downward until deduplication is complete. The City of Ballarat's heritage team has flagged that the revised, cleaner collection will form the foundation for a new unified catalogue planned for launch in 2028.


About this article

Published by The Daily Ballarat

This article was produced by The Daily Ballarat editorial desk and covers news in Ballarat. See our editorial standards for how we use AI.
