The Internet Archive is making it easier for web users to access archived versions of dead web pages with a new official add-on for the Google Chrome browser.
Once you install the Wayback Machine extension, whenever you land on a once-valid web page that now delivers an error code — such as “page not found” or a “404” — the extension will query the Wayback Machine to check whether there is anything in the archives. If there is, you’ll be asked to click to view the most recently archived version.
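For readers curious how such a lookup might work, here is a minimal Python sketch. It is not the extension's actual code, which the article does not describe. It assumes the Internet Archive's public availability endpoint (https://archive.org/wayback/available) and its JSON response shape, and it assumes that any HTTP status of 400 or above counts as an error worth checking. The function names and the `fetch` parameter are hypothetical, chosen for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the public Wayback availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(page_url):
    """Build the lookup URL for a given page address."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": page_url})


def closest_snapshot(payload):
    """Return the archived URL from a response payload, or None if absent."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def fetch_json(url):
    """Fetch a URL over the network and decode its JSON body."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def find_archived_copy(page_url, status_code, fetch=fetch_json):
    """Look for an archived copy only when the live page returned an error.

    Assumption: any status of 400 or above is treated as an error.
    """
    if status_code < 400:
        return None
    return closest_snapshot(fetch(build_query(page_url)))
```

Calling `find_archived_copy("example.com", 404)` would query the archive over the network and return a snapshot URL if one is reported, or `None` otherwise. The `fetch` parameter exists so the lookup can be exercised with a stub instead of a live request.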
For the uninitiated, the Internet Archive has been documenting the web’s evolution since 1996, crawling millions of websites and recording changes and edits at intervals. So, for example, anyone wishing to return to the Twitter homepage of 2006, or the FBI homepage of 1996, can do so.
The broader Internet Archive is an incredibly useful tool for curious geeks interested in the history of the web. But it also serves a more important purpose as it prevents content publishers, from newspapers to government agencies, from whitewashing their online history.
By way of example, an estimated 83 percent of PDF documents on .gov domains disappeared during President Obama’s first term in the White House. Such vanishing acts aren’t always sinister, however, as there may be any number of legitimate reasons for content disappearing — departments may merge, or projects may become obsolete.
But the Internet Archive and its partner organizations have made it their mission to document government website data. As George W. Bush’s time in office was coming to an end in 2008, the End of Term Web Archive was launched with a single purpose: to serve as a permanent record of government-related communications during presidential transitions. Last month, the Internet Archive revealed plans to preserve 100 terabytes of government website data.
Furthermore, a Harvard study from 2013 found that 49 percent of hyperlinks relating to Supreme Court decisions no longer work. And this is the problem that the Wayback Machine is looking to solve. So-called link rot is a growing concern, and online archives are vital to preserving a vast swathe of important data.
As for the new Wayback Machine extension, the Internet Archive says it is continuing to protect user privacy by not recording users’ IP addresses. It is also in discussions with Google about “adding a proxy server as an additional layer of protection,” the Wayback Machine’s director, Mark Graham, noted in a blog post.
Additionally, in response to concerns over what a Donald Trump presidency may mean for privacy and censorship on the internet, the Internet Archive recently announced it was building a replica database in Canada.