The first draft of the dog genome sequence has been deposited into free public databases for use by biomedical and veterinary researchers around the globe, the National Human Genome Research Institute (NHGRI), one of the National Institutes of Health (NIH), announced today.

A team led by Kerstin Lindblad-Toh, Ph.D., of the Broad Institute of MIT and Harvard, Cambridge, Mass., and Agencourt Bioscience Corp., Beverly, Mass., successfully assembled the genome of the domestic dog (Canis familiaris). The breed of dog sequenced was the boxer, which was chosen after analyses of 60 dog breeds found it was one of the breeds with the least amount of variation in its genome and therefore likely to provide the most reliable reference genome sequence.

The initial assembly is based on seven-fold coverage of the dog genome. Researchers can access the sequence data through the following public databases: GenBank (www.ncbi.nih.gov/Genbank) at NIH’s National Center for Biotechnology Information (NCBI); EMBL Bank (www.ebi.ac.uk/index.html) at the European Molecular Biology Laboratory’s Nucleotide Sequence Database; and the DNA Data Bank of Japan (www.ddbj.nih.ac.jp). The data can also be viewed through the UCSC Genome Browser (www.genome.ucsc.edu) at the University of California at Santa Cruz and the Ensembl Genome Browser (www.ensembl.org) at the Wellcome Trust Sanger Institute in Cambridge, England. Viewing capabilities also will be available in August at NCBI’s Map Viewer (www.ncbi.nlm.nih.gov/mapview).

The NHGRI-supported researchers are currently comparing the dog and human genome sequences and plan to publish results of their analysis in the next several months.

The dog genome is similar in size to the genomes of humans and other mammals, containing approximately 2.5 billion DNA base pairs. Due to a long history of selective breeding, many types of dogs are prone to genetic diseases that are difficult to study in humans, such as cancer, heart disease, deafness, blindness and autoimmune disorders. In addition, the dog is an important model for the genetics of behavior and is used extensively in pharmaceutical research.

To best characterize disease in dogs, it is important to have a sufficient number of markers in the genome. Therefore, in addition to the boxer, nine other dog breeds, four wolves and a coyote were sampled to generate markers that can be used in disease studies in any dog breed. A preliminary set of about 600,000 single nucleotide polymorphisms (SNPs), which amounts to a SNP roughly every 5,000 DNA base pairs, is currently being aligned to the released assembly. The reads used to identify the SNPs are publicly available in NCBI’s Trace Archive (www.ncbi.nlm.nih.gov/Traces/trace.cgi) and the SNPs will be available shortly at the Single Nucleotide Polymorphism database, dbSNP (www.ncbi.nlm.nih.gov/SNP).

Sequencing of the dog genome began in June 2003. NHGRI provided about $30 million in funding for the project to the Broad Institute, which is part of NHGRI’s Large-Scale Sequencing Research Network.

To learn more about the rapidly expanding field of comparative genomic analysis, go to: www.genome.gov/10005835. To read the white paper that outlines the scientific rationale and strategy for sequencing the dog genome, go to: www.genome.gov/Pages/Research/Sequencing/SeqProposals/CanineSEQedited.pdf

A high-resolution photo of Tasha, the boxer whose DNA was sequenced, is available at: www.genome.gov/11007323.

NHGRI is one of 27 institutes and centers at NIH, an agency of the Department of Health and Human Services. The NHGRI Division of Extramural Research supports grants for research and for training and career development at sites nationwide. Information about NHGRI can be found at: www.genome.gov.