
Back in the late 90s, VCs asked entrepreneurs if they had an Internet vision for their companies. Earlier this decade, they asked about an open source strategy. In two years, VCs will be asking entrepreneurs if they’re powered by Wikipedia.

Wikipedia could be one of the most important resources of our time. Most of us are trained to think about an encyclopedia as a collection of articles. But Wikipedia has some important attributes that can power many interesting future applications:

• Collective Wisdom. This is what most people see in Wikipedia, a place where the world puts its knowledge.
• Realtime Snapshot of History. If I could put only one thing in a time capsule, I would put in Wikipedia. At any moment in time, it is the closest thing to an up-to-date, complete history of the world.
• Extraordinary Breadth, yet not overrun with junk (unlike the Web). 1,673,715 articles in English at the time of this writing, but one still has to be “notable” to get in (note: he’s in now).
• Dependable Quality. Yes, Wikipedia has its problems, but its quality is very strong considering the breadth of the information available on the site.
• Well structured (at least for unstructured data). Despite the unruliness of text (at least for software programming), Wikipedia is actually well structured, with guidelines for everything from cross-referencing to linking to a table of contents.
• “Open Content” license. Wikipedia used the GNU Free Documentation License, which gives the “freedom to copy and redistribute it, with or without modifying it, either commercially or noncommercially.” Some important caveats apply, including rights on derivative works, so read the license prior to using the content.

This unique resource is readily available to entrepreneurs and technologists. I hope to see a new boom of services built with the help of Wikipedia. Here are some examples of people taking advantage of Wikipedia:

General Repurposing, with added value

• Augment a “reference” shelf: The Free Dictionary
• Add to search results:
• Wikipedia translated by machine: Qwika
• Wikipedia entries enhancing other content: See this article for one description.

Teaching computers to be more human

Some researchers from Technion use Wikipedia to “map single words and larger fragments of text to a database of concepts” to give their software “background knowledge” so computers can “filter e-mail spam, perform Web searches and even conduct electronic intelligence gathering at a much more sophisticated level than current programs.”
• Of course, such techniques can also be used by bad guys to send more spam, pose as humans, etc.
• Use Wikipedia to teach computers to understand terms and connections between subjects
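
To make the idea of mapping text to a "database of concepts" concrete, here is a minimal sketch. It is not the Technion researchers' actual method; it assumes a simple TF-IDF weighting, and the three "articles" in ARTICLES are hypothetical toy stand-ins for real Wikipedia entries. The function names are also made up for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stand-ins for Wikipedia articles: concept title -> article text.
ARTICLES = {
    "Email spam": "spam unsolicited email messages sent in bulk advertising",
    "Web search engine": "search engine web pages query results index ranking",
    "Jaguar (animal)": "jaguar big cat predator rainforest americas",
}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_index(articles):
    """Build an inverted index: word -> {concept title: TF-IDF weight}."""
    doc_freq = Counter()
    term_counts = {}
    for title, text in articles.items():
        counts = Counter(tokenize(text))
        term_counts[title] = counts
        doc_freq.update(counts.keys())

    index = defaultdict(dict)
    total = len(articles)
    for title, counts in term_counts.items():
        for word, tf in counts.items():
            # Words that appear in fewer articles get a higher weight.
            idf = math.log(total / doc_freq[word]) + 1.0
            index[word][title] = tf * idf
    return index


def concepts_for(text, index):
    """Score every concept that shares a word with the text, best first."""
    scores = Counter()
    for word in tokenize(text):
        for title, weight in index.get(word, {}).items():
            scores[title] += weight
    return scores.most_common()


index = build_index(ARTICLES)
ranked = concepts_for("bulk email advertising", index)
print(ranked[0][0])
```

With a real dump of Wikipedia in place of the toy dictionary, the same shape of index would let a spam filter or search tool compare texts by the concepts they touch rather than by exact words.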

Trying to make search better through reliable content, safe links


A better atlas


Wikipedia as a print encyclopedia?

• When my immigrant parents stretched their pennies and brought the Encyclopedia Britannica into our public-housing apartment, it was an amazing day and showed my parents’ commitment to education. Can a print version of Wikipedia do the same for a child today, especially those in countries without good Internet access?

Wikipedia powering a database version of the Web

• Earlier today, Danny Hillis announced a new venture called FreeBase, an ambitious project that is attempting to structure Web content into a form that can be queried like a database. If he can make it work, it will certainly allow software developers to more easily take advantage of the open content on the Internet. Of course, Wikipedia is a large component of this project, and without its authoritative entries and logical structure, a project like FreeBase would be much, much harder.

I am sure someone somewhere is doing this, but what about these simple ideas?

• Finding Safe Web Neighborhoods: Use links within Wikipedia to map safe and authoritative web sites
• Photos and Media in Wikimedia Commons: Produce a free/cheap Stock Photo site
• TrendWatching: Use changes, additions, and requested articles to track trends: see this Requested Articles page
• A Kid’s Encyclopedia: show a subset of articles for a kid-friendly encyclopedia
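
The TrendWatching idea, for instance, could start as something as small as the sketch below. It assumes you already have a feed of article changes as (title, timestamp) pairs; the CHANGES list here is hypothetical sample data, and trending() is an illustrative name, not part of any Wikipedia tool.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical change records: (article title, time of edit).
CHANGES = [
    ("Heroes (TV series)", datetime(2007, 3, 9, 10, 0)),
    ("Heroes (TV series)", datetime(2007, 3, 9, 11, 0)),
    ("Freebase", datetime(2007, 3, 9, 11, 15)),
    ("Heroes (TV series)", datetime(2007, 3, 9, 11, 30)),
    ("Encyclopædia Britannica", datetime(2007, 3, 1, 9, 0)),
]


def trending(changes, now, window, top=3):
    """Return the most-edited titles within `window` before `now`."""
    cutoff = now - window
    counts = Counter(title for title, ts in changes if ts >= cutoff)
    return counts.most_common(top)


hot = trending(CHANGES, datetime(2007, 3, 9, 12, 0), timedelta(hours=6))
print(hot)
```

Counting raw edits is a crude signal; a real service would likely also want to discount vandalism reverts and bot edits.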

Increasingly, start-ups of all kinds are relying on Wikipedia. My company, Boxxet, collects content on fan subjects. Take, for example, the TV show Heroes. We use the text from the Wikipedia article on Heroes to train our software to understand the subject better, and we use the article's outbound links to direct our crawlers toward helpful, spam-free Web sites.
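
As a rough illustration of the outbound-links step (a sketch only, not Boxxet's actual code), the snippet below pulls bracketed external links out of an article's wikitext and reduces them to a list of hosts a crawler could start from. SAMPLE_WIKITEXT and its example.com URLs are made up, and the function names are hypothetical.

```python
import re
from urllib.parse import urlparse

# Hypothetical wikitext; the URLs are placeholders, not real fan sites.
SAMPLE_WIKITEXT = """
''Heroes'' is an American television series.
== External links ==
* [http://www.example.com/heroes Official site]
* [https://fans.example.org/wiki/Main Fan wiki]
* [http://www.example.com/heroes/episodes Episode guide]
* [[Tim Kring]] created the series.
"""

# External links in wikitext look like [http://host/path label];
# internal links use double brackets and are skipped by this pattern.
EXTERNAL_LINK = re.compile(r"\[(https?://[^\s\]]+)")


def external_links(wikitext):
    """Return every bracketed external URL found in the wikitext."""
    return EXTERNAL_LINK.findall(wikitext)


def crawl_seeds(wikitext):
    """Return unique hosts, in first-seen order, to seed a crawler."""
    hosts = []
    for url in external_links(wikitext):
        host = urlparse(url).netloc.lower()
        if host and host not in hosts:
            hosts.append(host)
    return hosts


seeds = crawl_seeds(SAMPLE_WIKITEXT)
print(seeds)
```

The appeal of seeding from Wikipedia is that editors have already filtered out most spam, so the crawler starts in a cleaner neighborhood than it would from a general Web search.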

Can our services or products be made better? We made them more widespread through network effects and cheap distribution (client/server and then Internet). We made them cheaper through productivity gains (outsourcing and open source). Now it is time to make them smarter by using Wikipedia.

[Keep up with You Mon Tsang’s thoughts on his personal blog, his corporate blog and his new startup, Boxxet.]

