Feature: Biobanks - The long game
19 January 2007. By Richard Twyman

The causes of some diseases are straightforward. A relatively simple DNA test can immediately predict whether or not someone will develop cystic fibrosis, for example. But the most common chronic diseases - cancer, diabetes, heart disease, stroke and dementia - are far more complex. Genes do influence their risks, but so too do lifestyle factors such as diet and smoking habits, and the environments in which we live and work.
The problem for researchers studying such diseases is that many different factors may be involved, and any one factor may have only a small effect. To untangle this complexity, large numbers of people with each disease need to be studied in great detail - and this is where biobanks come in. By collecting DNA, medical and lifestyle data from hundreds of thousands of people and following their health long-term, researchers can work out why some people develop a particular disease while others do not.
Although the term 'biobank' is often applied to any database of health-related information and clinical samples (typically blood and urine) from individuals, the projects can be set up in markedly different ways. 'Prospective' biobanks assess and take samples from participants at the start of the study, and then follow their health over subsequent years, even decades. Other studies are 'retrospective', collecting information and samples from people who have already developed a particular disease, or are family-based genetic studies, which aim to track down genes associated with diseases or other traits.
UK Biobank
UK Biobank, a long-term prospective project, was given the full go-ahead in August 2006. Funded by the Wellcome Trust, the Medical Research Council, the Department of Health, the Scottish Executive and the Northwest Regional Development Agency, and hosted by the University of Manchester (with scientific input from more than 20 other British universities), this £61 million project aims to provide the richest source of health-related data and samples for researchers from around the world.
Following a successful test run in March 2006, with 3800 people taking part from the Altrincham area near Manchester, UK Biobank is now expanding its efforts across the country. Over the next three to four years, it will recruit 500 000 adults aged 40 to 69, nearly 1 per cent of the UK population. With their fully informed consent, the participants will complete a detailed lifestyle questionnaire, be interviewed about their medical history, have several standard physical measurements taken (such as blood pressure, body size and lung function), and donate blood and urine samples. The samples - about 15 million in all - will be stored for decades at ultra-low temperatures, with a purpose-designed robotic system in Cheadle handling samples from up to 1000 participants every day.
Information about participants' health will then be obtained, with their permission, from medical and other health-related records. As follow-up continues, medical researchers will be able to compare the lifestyle, genes and other factors among participants who develop some particular disease during long-term follow-up with those among participants who do not. For common conditions, such as heart disease and diabetes, this will be possible within five to ten years of starting the project, whereas for less common diseases it is likely to take much longer before there are sufficient disease cases for reliable analysis.
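The waiting time described here follows from simple arithmetic: the expected number of new cases each year is roughly the cohort size multiplied by the annual incidence of the disease. The short Python sketch below illustrates this under strong simplifying assumptions (constant incidence, no deaths or drop-outs). Only the cohort size of 500 000 comes from the project; the incidence rates and the number of cases needed are hypothetical values chosen for illustration, not UK Biobank figures.

```python
def years_to_reach_cases(cohort_size, incidence_per_100k, cases_needed):
    """Rough years of follow-up until the expected case count reaches cases_needed.

    Assumes a constant incidence (new cases per 100 000 people per year)
    and ignores deaths, drop-out and the ageing of the cohort.
    """
    expected_cases_per_year = cohort_size * incidence_per_100k / 100_000
    return cases_needed / expected_cases_per_year


COHORT_SIZE = 500_000   # UK Biobank recruitment target
CASES_NEEDED = 5_000    # hypothetical threshold for a reliable analysis

# Hypothetical incidence rates, for illustration only.
common_years = years_to_reach_cases(COHORT_SIZE, 200, CASES_NEEDED)
rare_years = years_to_reach_cases(COHORT_SIZE, 20, CASES_NEEDED)

print(f"Common condition: about {common_years:.0f} years")
print(f"Less common condition: about {rare_years:.0f} years")
```

With these made-up rates, a tenfold drop in incidence means a tenfold longer wait for the same number of cases, which is why less common diseases need much longer follow-up.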
By measuring many different exposures (not just genes) in large numbers of people, this prospective study will be able to assess the impact of a wide range of factors, alone or in combination, on many different conditions.
As with any project that is built upon the trust and confidence of those who take part, consent, data security and privacy are crucial. Participants will be given detailed information about UK Biobank's aims and what is required to be involved. In particular, they will be asked to give broad consent for their records and samples to be used for any medical or other health-related research. The data will be stored securely on computer so that all information about participants is well protected. Researchers who use the resource will need to be approved by UK Biobank, but can apply from anywhere in the world and from academia or industry. To protect participants further, the data and samples used by researchers will not include personal identifiers, so that genetic, lifestyle and other factors cannot be traced back to any individual. An independent Ethics and Governance Council has been set up to help monitor, and advise on, the way in which the project is conducted.
International biobanks
There are several other major prospective biobanks worldwide. The largest to date is the European Prospective Investigation into Cancer and Nutrition (EPIC), which was set up to look specifically at the relationship between cancer, genetics and nutrition. Since 1992, the study has recruited 520 000 people in ten European countries (Denmark, France, Germany, Greece, Italy, The Netherlands, Norway, Spain, Sweden and the UK) and has found, for example, that a diet high in fibre reduces the risk of colorectal cancer, as does eating fish, while red and processed meat increase the risk.
On a similar scale, the Chinese Kadoorie Study of Chronic Disease is investigating the roles of genetic and environmental factors, such as tobacco, infections and diet, in premature death and disability. Half a million adults aged 35 and over - 50 000 from each of ten rural and urban areas throughout China - will be taking part in the study, and more than 300 000 have already been recruited.
The Mexico City Prospective Study, which began in 1999, has recruited 160 000 men and women aged over 40 from the city's Coyoacan and neighbouring districts and is looking at the main avoidable causes of chronic diseases. It has collected medical and lifestyle data such as smoking habits, alcohol consumption and diet, as well as blood pressure and blood samples, and is repeating these assessments in subsamples of the group every five years.
In Estonia, a biobank project has been running since 2001 as part of the Estonian Genome Project Foundation. The aim is to create a database of health, genealogy and genome data from a large part of the Estonian population; at present, the biobank contains information from over 10 000 contributors. The project did suffer financial problems when venture-capital funding for the scheme ran out in 2005, but the Estonian Government subsequently injected €8m (around £5.5m) into the project over four years, enough to raise the number of participants to 100 000.
With other prospective biobanks being discussed or established in several countries - including the USA, Mexico, Singapore, Canada, Norway and Sweden - it may be possible in the future to combine the data from many of these studies into a massive epidemiological meta-database. The more people who can be studied, and the more data that can be analysed, the more robust the statistics. But, as discussed at the 'From Biobanks to Biomarkers' conference held in September 2005 (see further reading), there are many issues that need to be addressed before such a plan comes to fruition. The type of data being collected can vary markedly between studies, as can the type of consent gained from volunteers (which may restrict the routine sharing of data unless it is anonymised). To help address such problems, the Public Population Project in Genomics (P3G), a not-for-profit international group, is working to standardise methodologies and improve coordination across biobanks.
Large-scale prospective biobank projects
Project | Country | Target no. of participants
Estonian Genome Project | Estonia | 100 000
Chinese Kadoorie Study of Chronic Disease | China | 500 000
Mexico City Prospective Study | Mexico | 160 000
European Prospective Investigation into Cancer and Nutrition | Denmark, France, Greece, Germany, Italy, Netherlands, Norway, Spain, Sweden, UK | 520 000
UK Biobank | UK | 500 000
Richard Twyman is a science writer based in York.

