Science at the Sanger Institute
The Wellcome Trust Sanger Institute is a leading centre for genetics and genomics studies. Below is a summary of the types of research performed at the Sanger Institute, and some of the services it provides to the scientific community. To learn more, visit the Sanger Institute’s website.
Scientists and equipment in the sequencing room. Credit: Wellcome Library, London
- From sequence to biology
- Sequencing
- Data mining
- High-throughput analysis
- Genetic variation and disease
- Postgenomic projects
- Model organisms
- Pathogens
- Genomic infrastructure
From sequence to biology
The Sanger Institute's reputation has been based on its large-scale, high-quality genome sequencing projects. While these remain a foundation for the Sanger Institute's work, it is equally committed to the development of tools to analyse and annotate genome sequence data, and to biological studies of the function of genes in living systems.
This work lays the foundation for new diagnostics and therapeutics based on an understanding of human and animal health at a molecular level.
Sequencing
The Sanger Institute has made major contributions to the sequencing of the genomes of humans, yeast, the nematode worm and more than 30 pathogens. Next-generation sequencing techniques are making DNA sequencing substantially faster. This matters for projects such as the 1000 Genomes Project, which is sequencing the equivalent of more than two human genomes every day.
All external access to the sequencing facilities is based on collaboration and should involve a member of the Sanger Institute Faculty. Access is granted after a Sequencing Committee has considered the merits of the proposed project.
Data mining
Sophisticated software and a large investment in computer resources keep the data organised and underpin efforts to identify genes and other sequence features. The data stored by the Sanger Institute are freely accessible to researchers across the world via databases such as the genome browser Ensembl, the pathogen and sequence database GeneDB, and the Catalogue of Somatic Mutations in Cancer (COSMIC).
High-throughput analysis
The development of high-throughput tools has opened up new opportunities to explore gene function on a grand scale. Rather than study genes one at a time, these tools enable researchers to track the activity of thousands in a single experiment.
Genetic variation and disease
Genetically, humans are 99.9 per cent identical, but we do differ in minute detail, and these differences can be medically important. A key aspect of the Sanger Institute's work is to identify sequence variation in human populations and to determine how specific variants contribute to health and disease.
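To give a feel for the scale behind the 99.9 per cent figure, here is a rough back-of-the-envelope sketch. The genome size of about 3.1 billion base pairs is an assumption introduced for illustration (it is not stated in this summary), and the function name is hypothetical.

```python
# Assumption: one copy of the human genome is roughly 3.1 billion base pairs.
# This figure is not from the text above; it is a commonly quoted approximation.
GENOME_SIZE_BP = 3_100_000_000

# From the text: humans are 99.9 per cent identical.
IDENTITY = 0.999


def expected_differences(genome_size_bp: int, identity: float) -> int:
    """Approximate number of positions at which two genomes differ."""
    return round(genome_size_bp * (1 - identity))


differing = expected_differences(GENOME_SIZE_BP, IDENTITY)
print(f"Roughly {differing:,} differing positions")
```

Under these assumptions, a 0.1 per cent difference still corresponds to millions of positions, which is why cataloguing variation is a large-scale task.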
It is working to identify genetic variations such as single nucleotide polymorphisms (SNPs) and patterns of SNPs that are inherited together (haplotypes). The results are contributed to the International HapMap Project, which is developing a public resource that will help researchers find genes associated with human disease.
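The idea of SNPs and haplotypes can be illustrated with a toy sketch. The sequences below are invented for illustration, and the function names are hypothetical; this is not how the HapMap Project or the Sanger Institute processes data, only a minimal picture of the concepts.

```python
from collections import Counter

# Toy, made-up aligned sequences from four chromosomes (not real data).
sequences = {
    "chr_a": "ACGTTAGC",
    "chr_b": "ACGTTAGC",
    "chr_c": "ATGTTGGC",
    "chr_d": "ATGTTGGC",
}


def find_snp_positions(seqs):
    """Return the 0-based positions where the aligned sequences differ."""
    return [i for i, column in enumerate(zip(*seqs)) if len(set(column)) > 1]


def extract_haplotypes(seqs_by_name, positions):
    """Read off the bases at the variable positions for each chromosome."""
    return {
        name: "".join(seq[i] for i in positions)
        for name, seq in seqs_by_name.items()
    }


snps = find_snp_positions(list(sequences.values()))
haps = extract_haplotypes(sequences, snps)
counts = Counter(haps.values())

print("Variable positions:", snps)
print("Haplotype counts:", dict(counts))
```

In this toy data the two variable positions always change together, so only two haplotypes appear. That kind of co-inheritance pattern is what lets researchers use a few SNPs to stand in for many.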
Other variations can occur through the gain or loss of large pieces of DNA of between ten thousand and five million bases in size. This Copy Number Variation (CNV) is being investigated for its contribution to human disease, along with other structural variations.
Postgenomic projects
New projects range from addressing biomedical questions, such as how chromosomes rearrange and evolve, to uncovering the molecular basis of disease from deafness to diabetes.
Model organisms
Model organisms are vital tools for scientists studying biological processes in living organisms. The availability of genome sequences greatly aids researchers working on the function of genes in living systems.
The Sanger Institute has made important contributions to yeast, worm and zebrafish genome sequencing. The zebrafish is particularly useful for understanding the development of vertebrates.
The most powerful model for understanding human biology and disease is the mouse, which will be central to future work at the Sanger Institute.
Pathogens
The Sanger Institute is continuing to sequence a wide range of pathogens and other microbes of biological interest, such as those that produce natural antibiotics.
It is also developing research programmes that use pathogen genome sequence data to understand microbial biology, such as why some strains are harmful and others are not.
Genomic infrastructure
All the research at the Sanger Institute is underpinned by world-class core facilities that support large-scale sequencing and analysis. The Institute also has one of the most powerful computer facilities in Europe.
Every day the Sanger Institute generates billions of bases of raw sequence data. This is a major challenge not only for the bioinformaticians and statistical geneticists who use the data, but also for storing the data and providing access to them. In 2009, the Sanger Institute Data Centre storage capacity stood at 4 petabytes.
Other specialist support services provide expert assistance in many new experimental techniques.
This infrastructure is central to the Sanger Institute's productivity, and has underpinned its collaborations with numerous groups around the UK, Europe and the rest of the world.






