A genome fest

25 years of pathogen genome sequencing

The world is awash with pathogen genome sequence data. What difference has it made in research and medicine?

Over the last few years, it seems that not a month has gone by without an organism's genome being sequenced. Pathogens have been particularly popular - they mostly have conveniently small genomes, and genome sequence data should speed the development of new therapies or control measures.

Twenty-five years on from the sequencing of the first pathogen genome - that of the bacteriophage phiX174, whose 5386 bases were decoded by Fred Sanger and colleagues - more than 1000 virus sequences have been completed, together with the genomes of bacteria, single-celled pathogens and parasitic worms. To celebrate the 25th anniversary, a conference at Hinxton Hall in July 2002 brought together pathogen and genomics researchers from around the globe to discuss the development of the field, uses of pathogen genomes in research, and hopes for the future.

Biological insights

As more and more sequence is produced, researchers are able to compare the genomes of related species or of different strains of the same species. Such comparative sequencing - a recurrent theme at the conference - can help identify genes, and provide valuable information about genome organisation, evolution and why some species or strains are harmless and others deadly.

Enteric (intestinal) bacteria, discussed by Professor Gordon Dougan (Imperial College London), are a case in point. Bacteria such as Escherichia coli, Shigella and Salmonella have about 4500 genes, and share a similar 'core' set of genes. Comparisons between their genomes are uncovering the genetic differences that underlie their varied lifestyles and pathogenic effects. For example, Salmonella typhimurium infects many different organisms, residing in the gut and causing gastroenteritis. By contrast, Salmonella typhi, the typhoid bug, infects only humans, spreading through the system and attacking the liver, spleen and bone marrow.

Equally revealing are comparisons between mycobacteria such as M. tuberculosis (which causes TB) and M. leprae (leprosy), discussed by Stewart Cole (Institut Pasteur, Paris). M. leprae lives only in the peripheral nervous system, while M. tuberculosis usually affects the lungs but can also infect the brain, kidneys and bones. In its highly restricted niche, M. leprae appears to have little need of the full complement of M. tuberculosis's 4000 genes - it has just 1600 functional genes, and a genome littered with pseudogenes (inactivated versions of originally functional genes).

Genomic changes drive the emergence of new species, but even within a species there can be considerable genetic variation. Some of the most extreme examples are the RNA viruses and retroviruses such as HIV. As Simon Wain-Hobson (Institut Pasteur, Paris) pointed out, such viruses do not correct errors made during the replication of their genetic material, creating enormous variation. HIV genome sequences can vary significantly between nearby cells, and within a single, multiply infected cell, amino acid variation can reach levels of 30 per cent, as different genomes promiscuously exchange genetic material.

While viruses are especially mutable, bacteria are not far behind in the variation stakes. A good example is Helicobacter pylori, a bacterium that colonises the human stomach, often in childhood, and is carried by about half of all humans, occasionally causing ulcers and gastric cancer. H. pylori, described by Professor Mark Achtman (Berlin), is highly diverse as it swaps DNA constantly. "If we look at the same gene in 20 different isolates, all 20 sequences will be different," he said.

One upshot of sequence variation is the ability to study geographical trends in the spread of an infection - information that can be used to infer an evolutionary history of human-pathogen interactions. For example, Professor Achtman and colleagues compared H. pylori samples from all over the world, separating the bacteria into broad populations that correlate with their geographical source. H. pylori evolved over 11 000 years ago and the distribution of the populations may reflect the history of human migrations.

A similar approach has been taken by Professor Karen Day (Oxford) to explore the evolution of the malaria parasite Plasmodium falciparum. Having dated the parasite's evolution to about 7000 years ago - the time of the dawn of agriculture in Africa - Professor Day's group has developed 'family trees' of malaria parasite genome sequences, which suggest the parasite travelled out of Africa into Asia and beyond with fairly recent migrations. So, for example, it is likely to have reached the Far East about 4000 years ago, and South America about 500 years ago, along the slave routes.

Practical applications

While genome sequencing is bringing insights into pathogens' adaptation and evolution, it is also driving the development of practical applications: new diagnostics, drugs and vaccines.

DNA data can improve diagnosis and help identify precisely which pathogen a patient is infected with - crucial for the correct treatment. For example, Salmonella typhi is notoriously difficult to identify, and typhoid is hard to diagnose because its symptoms resemble those of other diseases such as malaria and dengue fever. DNA-based typing can also be used to distinguish different strains of bacteria - allowing more targeted treatments and the tracking of drug-resistant forms.

A relatively new approach is multilocus sequence typing (MLST), used to 'tag' different strains of bacteria. Another innovation is to put genes for drug resistance, carried by many different types of bacteria, onto a microarray ('gene chip'), so that resistant strains can be identified.
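The 'tagging' idea can be illustrated with a minimal sketch. It assumes the usual MLST scheme, in which each distinct sequence found at a locus is given an allele number and each distinct combination of allele numbers is given a sequence type. The locus names, DNA fragments and isolate names below are invented for illustration and are not taken from any real typing database.

```python
def assign_sequence_types(isolates):
    """Tag each isolate with an allele profile and a sequence type (ST).

    isolates maps an isolate name to a dict of {locus name: DNA fragment}.
    Allele and ST numbers are handed out in order of first appearance.
    """
    allele_ids = {}   # locus -> {fragment: allele number}
    profile_ids = {}  # allele profile (tuple) -> sequence type number
    tagged = {}
    for name, loci in isolates.items():
        profile = []
        for locus in sorted(loci):
            seen = allele_ids.setdefault(locus, {})
            # A fragment not seen before at this locus gets the next number.
            profile.append(seen.setdefault(loci[locus], len(seen) + 1))
        profile = tuple(profile)
        st = profile_ids.setdefault(profile, len(profile_ids) + 1)
        tagged[name] = (profile, st)
    return tagged


# Hypothetical isolates: A and B are identical at both loci, C differs at one.
example_isolates = {
    "isolate_A": {"locus1": "ACGT", "locus2": "GGCC"},
    "isolate_B": {"locus1": "ACGT", "locus2": "GGCC"},
    "isolate_C": {"locus1": "ACGA", "locus2": "GGCC"},
}
example_tags = assign_sequence_types(example_isolates)
for isolate, (profile, st) in example_tags.items():
    print(isolate, profile, "ST", st)
```

Isolates that share every allele receive the same tag, while a single changed base at one locus is enough to give a new one. Real schemes use several loci and a shared reference database so that laboratories assign the same numbers.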

Genome sequences provide rich pickings for drug developers. Sometimes, the genome sequence throws up an obvious set of targets. For example, M. tuberculosis loves feeding off lipids, and about 10 per cent of its 4000 genes are for lipid metabolism. Drug developers are thus working on lipid mimics to block the organism's metabolism. For other organisms, the genome data highlight candidate genes that may be potential vaccine or drug targets. For example, S. typhimurium's genome sequence revealed 50 previously unknown candidate genes coding for surface proteins, which will be accessible to the body's defences.

A complete genome sequence offers an inventory of every possible antigen that could be included in a vaccine. The challenge is to identify which will successfully stimulate an effective immune response, across all strains and in all people. Working with colleagues at Chiron in the USA, Professor Richard Moxon (University of Oxford) has been going through this process for meningococci.

Although good vaccines exist for some groups of meningococci, none has been produced for group B. In group B meningococci, the key target antigen is similar to one found on the surface of human cells, especially in the fetal brain. This poses the theoretical risk that an immune response would be stimulated against a vaccinated person's own cells, and the possibility that antibodies could cross the placenta in immunised women who become pregnant.

A good candidate antigen should have certain qualities - for example, it should be conserved in all pathogenic strains, be accessible to the host's immune system (so usually on the cell surface), and be essential to the pathogen's survival. Much of this information can be deduced from gene sequences or by comparisons between species. Biological experiments to test immune responses can then begin on a selected subset of candidates.
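The screening step can be pictured as a simple filter over the gene inventory. The sketch below is only an illustration of the three criteria named above, not the method used by the Oxford and Chiron teams; the gene names, strain counts and yes/no flags are hypothetical, and in practice each property would itself be a prediction drawn from sequence analysis.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    gene: str               # hypothetical gene name
    strains_present: int    # number of pathogenic strains carrying the gene
    surface_exposed: bool   # predicted to sit on the cell surface
    essential: bool         # predicted to be needed for survival


def shortlist(candidates, total_strains):
    """Keep genes that are conserved in every strain, exposed and essential."""
    return [
        c.gene
        for c in candidates
        if c.strains_present == total_strains
        and c.surface_exposed
        and c.essential
    ]


# Invented inventory: only geneA meets all three criteria.
inventory = [
    Candidate("geneA", strains_present=10, surface_exposed=True, essential=True),
    Candidate("geneB", strains_present=7, surface_exposed=True, essential=True),
    Candidate("geneC", strains_present=10, surface_exposed=False, essential=True),
    Candidate("geneD", strains_present=10, surface_exposed=True, essential=False),
]
selected = shortlist(inventory, total_strains=10)
print(selected)
```

Only the genes that pass every test would go forward to the laboratory, which is why a complete inventory can shrink quickly to a manageable set of experiments.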

By this approach, several highly promising antigens were identified within 18 months; two of these are under intensive study and nearing clinical trials. Given the paucity of novel antigens that have reached this stage in the preceding 30 years, this has been a major step forward.

While pathogen genomes dominated the conference, David Hopwood (John Innes Centre) touched upon Streptomyces coelicolor - an 'antipathogen' as he put it. S. coelicolor is one of the family of common soil bacteria that produce more than two-thirds of the world's antibiotics and other medically important compounds. Now that its genome (which contains a staggering 7825 genes, about half as many as a fruit fly) has been decoded, researchers plan to genetically engineer the bacterium to optimise the production of novel antibiotics, immunosuppressants, and less toxic antivirals and anticancer agents.

The S. coelicolor genome sequence revealed two dozen clusters of genes for 'secondary metabolites' (complex molecules, such as antibiotics, produced by particular strains of an organism), reinforcing the idea of mining the genomes of Streptomyces species for spare parts for use in this endeavour.

A democratic science

One of the beauties of genome sequencing is that almost anyone can join in. "For a developing country, a genome project is far more straightforward to set up than hypothesis-driven research," said Andrew Simpson (Ludwig Institute, São Paulo, Brazil).

In 1997, the state of São Paulo formed a 'virtual genomics institute': a centrally organised genome project involving 31 laboratories. While plans were being discussed, São Paulo orange producers asked the institute to sequence Xylella fastidiosa, a bacterium that infects orange trees, causing them to bear small, hard, juiceless fruit. As 30 per cent of the world's orange juice comes from São Paulo, and 30 per cent of the trees were thought to be infected, the project would be of national importance.

The completed sequence provided important clues to how the bacteria interact with the orange tree, and the institute has continued its sequencing work on other bacteria of agricultural importance. "Genome sequencing has stimulated Brazilian science and technology, and produced new scientific leaders," said Dr Simpson. "Sequencing has brought biologists together and has made science more democratic."

Wellcome Trust, Gibbs Building, 215 Euston Road, London NW1 2BE, UK T:+44 (0)20 7611 8888