Compendia Bioscience, cancer-genomics data miner, leverages the Web for biodata analysis

A number of startups are beginning to bring the power of the Web to bear on complex masses of biological data. One of the latest is Compendia Bioscience, an Ann Arbor, Mich., computational biotech that’s focused on mining cancer-genomics data. The company just received a $2.4 million grant from the National Cancer Institute to further the development of its lead product, a program that combs through and analyzes publicly available data on gene activity in a variety of tumors.

In this respect, Compendia’s product Oncomine is conceptually similar to other recent biological data search-and-analysis programs launched by companies such as NextBio and GenomeQuest. (See our previous coverage here and here, respectively.) The main difference is that Compendia is focused more on gene-expression data from microarray experiments (so-called “transcriptomics,” if you like that kind of phrase) than on straight genomic data. These experiments involve testing biopsied tumor tissue with a “gene chip” that can identify which genes are especially active and which have had their activity throttled down or even turned off. This sort of information has all sorts of uses: it can help identify genes responsible for the creation and spread of tumors, and it may also help in the search for new drugs.
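For readers who want a feel for what "especially active" versus "throttled down" means in practice, here is a minimal sketch in Python. It is not Oncomine's method, which isn't described here. It simply assumes the common approach of comparing a gene's expression level in tumor tissue against normal tissue and taking the log2 of the ratio. The gene names, intensity values, function name and the two-fold cutoff are all made up for illustration.

```python
import math

def classify_genes(tumor, normal, threshold=1.0):
    """Label each gene by the log2 fold change of tumor vs. normal expression.

    A threshold of 1.0 means "at least a two-fold change"; this cutoff is an
    illustrative assumption, not a value taken from any real pipeline.
    """
    labels = {}
    for gene, tumor_level in tumor.items():
        log2_fc = math.log2(tumor_level / normal[gene])
        if log2_fc >= threshold:
            labels[gene] = "up"          # especially active in the tumor
        elif log2_fc <= -threshold:
            labels[gene] = "down"        # throttled down or switched off
        else:
            labels[gene] = "unchanged"
    return labels

# Hypothetical intensity readings for three placeholder genes.
tumor = {"GENE_A": 800.0, "GENE_B": 50.0, "GENE_C": 210.0}
normal = {"GENE_A": 100.0, "GENE_B": 400.0, "GENE_C": 200.0}

result = classify_genes(tumor, normal)
print(result)
```

Real analyses add normalization and statistical testing across many samples before calling a gene over- or under-expressed. The sketch only shows the basic comparison.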

Microarray experiments produce an enormous amount of information — a single microarray can easily produce hundreds of gigabytes of data, so these are some pretty huge databases we’re talking about. Oncomine currently claims to index more than 21,000 microarray datasets containing over 500 million data points.

In any event, it’s fascinating to see the Web 2.0 mentality start to take on the huge and rapidly growing piles of biological data that used to be locked up in individual laboratories, or even in somewhat hard-to-interpret public databases like GenBank. These efforts not only make the data more widely accessible, they also make it possible for the first time to run analyses across widely disparate datasets, something that simply couldn’t be done before. Not for free, of course: Compendia, like its fellows, charges academic labs and pharma/biotech companies for the privilege of peering into this data more easily.

Over time, however, I wouldn’t be surprised if these tools continue to get cheaper and easier to use, particularly as individuals obtain greater understanding and control of their own genetic information and begin demanding ways to help them better interpret it. Watch this space. Things should start to get pretty interesting before long.