deCODEme and its questionable disease-risk predictions

(UPDATED: Original final paragraphs on 23andMe broken out as a separate post here.)

A few days ago, I noted that deCODEme, the personal-genomics spinoff of Iceland's deCODE Genetics, looks to be offering disease-risk predictions based on surprisingly thin evidence. I looked into it a little more deeply, and while I'm not a geneticist or even a close approximation thereof, I'm still a little taken aback by how little deCODEme currently seems to have to go on where many of these conditions are concerned.

To recap for a second, deCODEme -- like the much better-publicized 23andMe (now covered in a separate post) -- offers a service for an "introductory" price of $985 that scans customer genomes in a million or so specific locations to yield a rough sense of their genetic inheritance and its potential influence on their health and physical characteristics. Using gene-chip technology, the company looks specifically for individual DNA "letters," or nucleotides, that are known to vary between individuals. These single-letter variations, technically called single-nucleotide polymorphisms, or SNPs, essentially mark genes or other stretches of DNA whose altered function can contribute to (or protect against) disease or determine physical characteristics such as eye color.

deCODEme provides its customers with an analysis of SNPs that have been linked to 18 diseases, calculating a risk summary that compares an individual's odds of getting sick to those for the population -- well, a population -- at large. The trouble, as I noted earlier, is that in many cases deCODEme bases this risk assessment on just one or two SNPs, when most diseases are thought to be influenced by tens or hundreds of different genes. That means the disease risks deCODEme calculates are very likely to be wildly inaccurate -- potentially a serious state of affairs for the folks paying roughly $1,000 for this very analysis, even if deCODEme is careful to caution its users not to rely on the data as medical information. (Exactly what other use it might have isn't entirely clear to me.)

Since I didn't originally go through every one of the 18 diseases deCODEme analyzes, I decided to take a closer look at the scientific foundation for the company's risk assessments. It turns out that for fully half of those conditions, including colon cancer and heart attack, deCODEme is relying on just one or two SNPs to calculate disease risk. Risk for three conditions -- Alzheimer's disease, asthma and obesity -- is based on a single SNP. (I've put together a chart listing the number of SNPs used to assess risk in all these conditions below the fold.)

In several instances, the very scientific publications that deCODEme uses to justify the use of one SNP also provide evidence for others that deCODEme, for some reason, has so far chosen to overlook. In Alzheimer's disease, for instance, deCODEme cites this publication in support of its choice of a SNP called rs4420638, which appears to affect the gene that produces apolipoprotein E, or ApoE, a protein linked to Alzheimer's susceptibility. The same study, however, lists four additional SNPs, all meeting criteria of statistical significance.

In heart attack, deCODEme relies upon this New England Journal of Medicine study to implicate a SNP known as rs599839. The company, however, overlooks thirteen other SNPs linked to heart disease in the same study, including one called rs1333049 that carried "the strongest association with coronary artery disease" in two separate studies involving almost 7,400 patients. Of course, deCODEme doesn't seem to explain why anywhere on its Web site.

Similarly, the study deCODEme cites to support its use of one of two SNPs in colon cancer notes explicitly that "[m]uch of the variation in inherited risk of colorectal cancer (CRC) is probably due to combinations of common low risk variants" -- which, translated into English, essentially means that the genetic risk of colon cancer is most likely spread across a large number of common genetic variants, each of which increases risk of the disease by a small amount. Yet deCODEme uses only two SNPs to assess its customers' risk of colon cancer, and outside of some boilerplate language, mostly leaves it to individuals themselves to interpret what the service is telling them. (The company does make "experts" available to answer questions, although unsurprisingly that feature isn't available to demo users.)

To be fair, the whole field of genetic disease analysis is still an imperfect science, not to mention a work in progress. And there are some conditions -- both types of diabetes and Crohn's disease, in particular -- for which deCODEme bases its calculations on eight or more SNPs, which at least should give a fuller picture of the situation. That said, though, at the moment the site looks very much like it was thrown up in a hurry (it launched just a few days before 23andMe), which may explain the "introductory" pricing and the, well, introductory level of service here.

A chart listing the number of SNPs deCODEme uses for each disease-risk calculation follows after the jump.

NOTE: Links are to the deCODEme "scientific details" page for each condition. Since these pages are maintained inside the company's "demo user" account, you'll first have to activate that account by clicking here for the links to work.

One SNP: Alzheimer's disease, Asthma, Obesity

Two SNPs: Age-related macular degeneration (AMD), Atrial fibrillation, Celiac disease, Colorectal cancer, Glaucoma, Myocardial infarction (heart attack)

Three SNPs: Multiple sclerosis, Psoriasis

Four SNPs: Restless legs syndrome

Five SNPs: Prostate cancer

Six SNPs: Rheumatoid arthritis

Seven SNPs: Breast cancer

Eight SNPs: Type 2 diabetes

Ten SNPs: Type 1 diabetes

Twelve SNPs: Crohn's disease
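
For anyone who wants to check the arithmetic behind the "fully half" claim, the chart above can be tallied in a few lines of Python. Treat this as a rough sketch rather than anything official: the dictionary is a hand transcription of the chart, and the variable names are made up purely for illustration.

```python
# SNP counts per condition, transcribed by hand from the chart above.
# (The dictionary and all variable names are illustrative assumptions,
# not anything published by deCODEme.)
snps_per_condition = {
    "Alzheimer's disease": 1,
    "Asthma": 1,
    "Obesity": 1,
    "Age-related macular degeneration (AMD)": 2,
    "Atrial fibrillation": 2,
    "Celiac disease": 2,
    "Colorectal cancer": 2,
    "Glaucoma": 2,
    "Myocardial infarction (heart attack)": 2,
    "Multiple sclerosis": 3,
    "Psoriasis": 3,
    "Restless legs syndrome": 4,
    "Prostate cancer": 5,
    "Rheumatoid arthritis": 6,
    "Breast cancer": 7,
    "Type 2 diabetes": 8,
    "Type 1 diabetes": 10,
    "Crohn's disease": 12,
}

total = len(snps_per_condition)

# Conditions whose risk estimate rests on a single SNP.
one_snp = [name for name, n in snps_per_condition.items() if n == 1]

# Conditions whose risk estimate rests on one or two SNPs.
one_or_two = [name for name, n in snps_per_condition.items() if n <= 2]

# Conditions with eight or more SNPs behind the estimate.
eight_plus = [name for name, n in snps_per_condition.items() if n >= 8]

print(f"Conditions analyzed: {total}")
print(f"Based on a single SNP: {len(one_snp)}")
print(f"Based on one or two SNPs: {len(one_or_two)} of {total}")
print(f"Based on eight or more SNPs: {len(eight_plus)}")
```

Run as written, the tally comes to 18 conditions, with three resting on a single SNP, nine on one or two SNPs, and three on eight or more.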
