The Human Genome Project (HGP) was an international scientific research project whose primary goals were to determine the sequence of chemical base pairs that make up human DNA and to identify and map the approximately 20,000–25,000 genes of the human genome from both a physical and a functional standpoint. The first available assembly of the genome was completed in 2000 by the UCSC Genome Bioinformatics Group, composed of Jim Kent (then a UCSC graduate student of molecular, cell and developmental biology), Patrick Gavin, Terrence Furey and David Kulp.
The project began in 1990, initially headed by James D. Watson at the U.S. National Institutes of Health. A working draft of the genome was released in 2000 and an essentially complete sequence in 2003, with further analysis still being published. A parallel project was conducted outside of government by the Celera Corporation. Most of the government-sponsored sequencing was performed in universities and research centers in the United States, the United Kingdom, Japan, France, Germany, and China. The mapping of human genes is an important step in the development of medicines and other aspects of health care.
While the objective of the Human Genome Project was to understand the genetic makeup of the human species, the project also focused on several nonhuman organisms such as E. coli, the fruit fly, and the laboratory mouse. It remains one of the largest single investigational projects in modern science.
The HGP originally aimed to map the nucleotides contained in a haploid reference human genome (more than three billion). Several groups have announced efforts to extend this to diploid human genomes, including the International HapMap Project, Applied Biosystems, Perlegen, Illumina, JCVI, the Personal Genome Project, and Roche-454.
The "genome" of any given individual (except for identical twins and cloned organisms) is unique; mapping "the human genome" therefore involves sequencing multiple variations of each gene. The project did not sequence all of the DNA found in human cells; some heterochromatic areas (about 8% of the total genome) remain unsequenced.