No. 94, July 2019
< The Australian Web Archive lets you time travel! >
Australians now have the chance to unlock the last twenty years of our digital history, thanks to the new web archive at the National Library of Australia.
The Australian Web Archive (AWA) is one of the biggest web archives in the world. It features snapshots of hundreds of thousands of Australian websites from 1996 to 2018, and contains about 600 terabytes of data across nine billion records. If you printed out those records and stacked them in a bookcase, the bookcase would stretch from Canberra to Cairns.
< Australian Web Archive available on Trove >
The AWA combines records from the PANDORA Archive, the Australian Government Web Archive (AGWA) and websites relating to Australia collected annually through large-scale crawl harvests.
Data is collected through three main activities:
- National domain harvesting: A snapshot of the .au domain taken each year in collaboration with the Internet Archive (a not-for-profit organisation based in the USA)
- Selective collection: Individual websites and web documents chosen for inclusion in the PANDORA Archive, undertaken by the Library in collaboration with PANDORA partner agencies
- Bulk harvesting: Undertaken in-house by the Library, and primarily relating to government websites
The AWA collection includes only website material that was publicly accessible and free of charge. It does not include content behind paywalls or on the ‘dark web’.
Users can perform full-text searches within the archive, and the collection is freely accessible to users outside the Library's physical buildings. They just need to go online and head to Trove.
Website address: https://trove.nla.gov.au/website?q=
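For readers who want to link straight to a set of search results, the address above can be filled in with search terms. The following is a minimal sketch only: it assumes that the trailing "q=" in the address takes the search terms, and the function name build_search_url is a hypothetical helper, not something provided by Trove.

```python
from urllib.parse import urlencode

# Base address as given in the article. Treating "q" as the search-terms
# parameter is an assumption based on the trailing "?q=" in that address.
TROVE_WEBSITE_SEARCH = "https://trove.nla.gov.au/website"


def build_search_url(terms: str) -> str:
    """Return the archive search address with `terms` filled in as the query."""
    # urlencode escapes spaces and special characters so the address stays valid.
    return f"{TROVE_WEBSITE_SEARCH}?{urlencode({'q': terms})}"


if __name__ == "__main__":
    # Hypothetical example search terms.
    print(build_search_url("bushfire recovery"))
```

Pasting the printed address into a browser should open the archive search page with those terms already entered, provided the assumption about "q" holds.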
Copyright (C) 2019 National Library of Australia