Q&A: Copy number variants
5 November 2009

What are CNVs?
CNVs, or copy number variants, are relatively long sections of DNA gained or lost in one individual relative to another. They've been known about for a long time but, until the last couple of years, we haven't really had the tools to get a sense of how prevalent they are in the genome.
What was the aim of the study?
In 2006, we published a 'first-generation' map of the largest CNVs that could be seen with the technology of the time. Since then we've been working to produce a more comprehensive map that captures the majority of common CNVs, and provides the resources to be able to incorporate those into genetic studies: essentially, a reference map for CNVs to drive research.
To do this, we had to screen the genome at much higher resolution than has been done before. We split the genome into 20 different segments and put each on to its own microarray. We ended up screening the genome with 42 million probes - at least a 20-fold higher density than has been done previously. We looked at CNVs bigger than 443 bp [base pairs] occurring in one in 20 individuals.
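As a rough illustration of the scale described above, the figures quoted can be turned into a back-of-envelope estimate of probe density. This is only a sketch: the genome length is an assumed round figure, and the even split across arrays and the uniform spacing are simplifying assumptions, not details taken from the study.

```python
# Back-of-envelope probe density from the figures quoted in the interview.
TOTAL_PROBES = 42_000_000        # quoted in the interview
NUM_ARRAYS = 20                  # quoted in the interview
# Assumption: approximate length of the human genome, not stated in the interview.
GENOME_SIZE_BP = 3_100_000_000

# Assumption: probes are divided evenly between the arrays.
probes_per_array = TOTAL_PROBES // NUM_ARRAYS

# Assumption: probes are spread uniformly across the whole genome.
mean_spacing_bp = GENOME_SIZE_BP / TOTAL_PROBES

print(f"Probes per array: {probes_per_array:,}")
print(f"Mean probe spacing: {mean_spacing_bp:.0f} bp")
```

Under these assumptions, the average spacing comes out at well under 100 bp. That is consistent with being able to detect variants a few hundred base pairs long, although a real array design would not place probes uniformly.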
What did you find?
We've discovered some interesting things about CNVs, such as how they arise and how frequently - at least one in 17 children every generation will have a new CNV. Something that previous CNV maps haven't shown is that some duplicated sections of DNA have jumped to new locations within the genome, sometimes on to a different chromosome.
CNVs seem to be less common in introns [non-coding DNA sequences within a gene] than we would expect by chance, which suggests that they're selected against there. It is known that deletions in these regions can cause disease, but we don't yet fully understand the mechanism behind this.
Was there anything unexpected?
Studies of SNPs [single nucleotide polymorphisms, sequence variations at a single DNA base] have gone some way to explaining why diseases cluster in families, but the majority of this clustering is not explained. It was thought that CNVs might account for this so-called 'missing heritability', but we found that it's highly unlikely that common CNVs explain a significant proportion of it, although rare CNVs may still play a significant role.
Where is the missing heritability?
We think that it's most likely to be rare variation that hasn't been well captured by the approaches currently available. We do point out in the paper that, because of the negative selection acting on some CNVs, many of these variants will necessarily be rare. Also, the fact that they're being selected against suggests that they may have an effect on disease.
What's next for CNVs?
One of our next steps will be to look for these rarer variants. Ideally, we'll do this in a way that allows us to capture SNPs and CNVs, so we'll be maximising the power to identify any region of the genome that happens to harbour rare variants. It should be possible, but it is a work in progress. While the new sequencing technology can be very good at capturing sequence variation, it's an open question to what degree it can capture structural variation.
What do you do outside of the lab?
I enjoy spending time playing with my young children, and sculling on the river Cam.
Image: Dr Matt Hurles
Reference
Conrad DF et al. Origins and functional impact of copy number variation in the human genome. Nature 2009 [Epub ahead of print].

