Diving deep into the web: Glenbrook Networks

You think the Web is big? In truth, it’s far bigger than it appears. It is made up of hundreds of billions of documents, far more than the 8 billion to 20 billion claimed by Google or Yahoo.

But the bulk of these pages are out of reach of most search engines because they are stored in databases that Web crawlers cannot access. Glenbrook Networks has been working on reaching these documents with technology that crawls databases, automatically completes online forms, and extracts the data behind them. Here’s our Merc story today (free registration). The company can do some interesting things, like this map of jobs in Silicon Valley.
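
To give a rough feel for what "automatically completing online forms" involves, here is a minimal Python sketch. It is not Glenbrook's method, which the story does not describe in detail. It simply finds a search form in a page, fills it with candidate values, and builds the URLs a crawler would then fetch. The sample page, the `jobs.example.com` address, the field names, and the helper names `FormFinder` and `build_query_url` are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin


class FormFinder(HTMLParser):
    """Collect the action, method, and named fields of each <form> in a page.

    Simplification: fields are attached to the most recently opened form,
    and closing </form> tags are not tracked.
    """

    def __init__(self):
        super().__init__()
        self.forms = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.forms.append({
                "action": attrs.get("action") or "",
                "method": (attrs.get("method") or "get").lower(),
                "fields": {},
            })
        elif tag in ("input", "select", "textarea") and self.forms and attrs.get("name"):
            # Keep any default value (for example, hidden fields).
            self.forms[-1]["fields"][attrs["name"]] = attrs.get("value") or ""


def build_query_url(page_url, form, values):
    """Fill a GET form with candidate values and return the URL to fetch."""
    fields = dict(form["fields"])
    fields.update(values)
    return urljoin(page_url, form["action"]) + "?" + urlencode(fields)


# Hypothetical search page sitting in front of a job database.
SAMPLE_PAGE = """
<form action="/search" method="get">
  <input type="text" name="keyword">
  <input type="hidden" name="region" value="silicon-valley">
  <input type="submit" value="Go">
</form>
"""

finder = FormFinder()
finder.feed(SAMPLE_PAGE)
form = finder.forms[0]

# One query URL per candidate keyword; a real crawler would fetch each
# URL and extract records from the result pages.
urls = [
    build_query_url("http://jobs.example.com/", form, {"keyword": kw})
    for kw in ("engineer", "designer")
]
```

The hard parts of a real system, such as choosing which values to try, handling POST forms and sessions, and turning result pages into structured records, are left out of this sketch.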


About the Author

Matt launched VentureBeat in September of 2006, with the realization that no one else was covering the entrepreneurial and tech innovation scene with the velocity or depth that he was. Prior to founding VentureBeat, he covered venture capital for the San Jose Mercury News from 2001 to 2006. In 2002, Matt was awarded "Journalist of the Year" by the Northern California Society of Professional Journalists. Prior to working at the Merc, he was a correspondent for the Wall Street Journal in Bonn, Germany from 1995 to 1998, and a writer for the Washington Post in 1994. Matt holds a PhD in Government and an MA in German and European Studies from Georgetown University. In addition to VentureBeat, Matt is also the Executive Producer of DEMO, the leading launchpad event for emerging technologies.
