You think the Web is big? In truth, it’s far bigger than it appears. The Web is made up of hundreds of billions of Web documents — far more than the 8 billion to 20 billion claimed by Google or Yahoo.
But most of these Web pages are unreachable by most search engines because they are stored in databases that Web crawlers cannot access. Glenbrook Networks has been working on reaching these documents, using technology that crawls databases, automatically completes online forms, and extracts data. Here’s our Merc story today (free registration). The company can do some interesting things, like this map of jobs in Silicon Valley.