The Internet Archive has been documenting the Web’s evolution since the dawdling days of dial-up — or 1996, to be exact. Anyone wishing to return to the Apple homepage of 1998, or the New York Times of the early 2000s, can simply plug their desired URL into the Wayback Machine, and it takes care of the rest.
The Internet Archive crawls the Web, taking snapshots at intervals, and serves as a public record of how the Internet is changing. At the time of writing, it has snapped almost 440 billion “captures,” covering web pages, video, and images. But the online world has changed greatly over the past 19 years and, as such, the Internet Archive is evolving too.
The San Francisco-based not-for-profit has received a grant from the Laura and John Arnold Foundation (LJAF) to help rebuild the Wayback Machine for the modern era. The update is intended to make the Wayback Machine easier to search and more user-friendly.
“The Internet Archive is helping to preserve the world’s digital history in a transformational way,” said Kelli Rhee, vice president of venture development at LJAF. “Taking the Wayback Machine to the next level will make the entire web more reliable, stable, and retrievable for everyone.”
The entire Wayback Machine code will be rewritten to “improve reliability and functionality,” and a new interface will make it easier to dig out archived websites. This includes being able to find websites by keywords — at present, you have to type in the web address manually. But it won’t include every single page — only homepages will be indexed.
The Internet Archive says it’s looking to improve multimedia websites it has crawled, which means supporting new formats as well as older formats to ensure that so-called “bit-rot” doesn’t set in. It’s also partnering with third-party services such as Wikipedia to help repair broken links. If a link on a Wikipedia page leads to a now-deleted page, for example, it will link to an archived page on the Wayback Machine instead.
“Today, people’s work, and to some extent their lives, are conducted and shared largely online,” said Wendy Hanamura, director of partnerships at the Internet Archive. “That means a portion of the world’s cultural heritage now resides only on the Web. And we estimate the average life of a Web page is only one hundred days before it is either altered or deleted.”
The all-new Wayback Machine is expected to be finished some time in 2017.