The ALlele FREquency Database   
ALFRED is a resource of gene frequency data on human populations
supported by the Yale Center for Medical Informatics.


Polymorphism and Population Information
Searching ALFRED
Data display formats
Data download formats
Data submission to ALFRED

What is ALFRED?
ALFRED (The ALlele FREquency Database) compiles allele frequencies for DNA sequence polymorphisms in anthropologically defined human populations into a user-friendly scientific framework. ALFRED is free, web-accessible, and actively curated, with data linked to ethnographic and molecular databases and to the corresponding literature.
How often is ALFRED updated?
ALFRED is updated on a daily basis.
What are the sources of data in ALFRED?
ALFRED accumulates gene frequency data on human populations from various sources including published literature, collaborators, and public high throughput data sources such as the HGDP-CEPH website and Human Genome Diversity Project at Stanford University. Some unpublished data from the host laboratory are also being made public in ALFRED.
What kind of data upload contributes to the big increase in table numbers?
The recent dramatic growth in tables in ALFRED is due to the systematic automated uploads of various high throughput datasets.
Where can I find the list of available high throughput datasets in ALFRED?
The summary information, along with the citation reference, for all the high throughput datasets is available under the ‘Summaries’ tab – ‘High Throughput’ link.
Where can I find summary tables providing an overview of ALFRED?
Various summary tables (sites list, table numbers, population list, Fst and average heterozygosity, etc.) are all available under the ‘Summaries’ tab on the ALFRED homepage.
Are there any graphical summaries displaying the contents of ALFRED?
Yes. They are available under the ‘Summaries’ – ‘Table Numbers’ menu tab.
Are the numbers displayed in the ‘Table Numbers' summaries current?
Yes. They are automatically updated whenever new data are added.
Are the Fst and Average heterozygosity values available for all the sites in ALFRED?
This information is available for all the di-allelic sites in ALFRED.
And where can I find this information?
This information is displayed in two places in ALFRED. To view it as a comprehensive table for all the sites, mouse over the ‘Summaries’ tab and click on the ‘Fst and Avg Het’ link. The page that comes up lists the sites organized by chromosome number and can be sorted by any of the available fields. The same information is also displayed on the individual site information page for each polymorphism.
Will I be able to limit the comprehensive ‘Fst and Avg Het’ table by number of populations?
Yes. A minimum number of populations can be set, since the values are less meaningful when data exist for only a few populations.
If ALFRED has multiple samples of a population for a selected site, are all of them included in the Fst calculation?
No. When multiple samples of a population exist, only the sample with the highest sample number is included in the calculation. In other words, each population contributes exactly one sample to the calculation of Fst values.
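The one-sample-per-population rule and the minimum-population filter described above can be sketched as follows. This is only an illustration under stated assumptions, not ALFRED's own code: the record layout and function names are hypothetical, "highest sample number" is read here as the largest number of individuals, and the Fst formula used (Wright's Fst as the variance of the allele frequency across populations divided by p̄(1 − p̄), unweighted by sample size) is an assumption, since this page does not state which estimator ALFRED uses.

```python
from statistics import mean, pvariance

# Hypothetical records for one di-allelic site:
# (population, number of individuals sampled, frequency of allele 1)
SAMPLES = [
    ("PopA", 50, 0.20),
    ("PopA", 120, 0.25),  # larger PopA sample, so this one is kept
    ("PopB", 80, 0.60),
    ("PopC", 40, 0.90),
]

def largest_sample_per_population(samples):
    """Keep one sample per population: the one with the most individuals."""
    best = {}
    for pop, n, freq in samples:
        if pop not in best or n > best[pop][0]:
            best[pop] = (n, freq)
    return best

def fst_and_avg_het(samples, min_populations=2):
    """Return (Fst, average heterozygosity), or None if too few populations."""
    chosen = largest_sample_per_population(samples)
    if len(chosen) < min_populations:
        return None
    freqs = [freq for _, freq in chosen.values()]
    p_bar = mean(freqs)
    # Average expected heterozygosity: mean of 2p(1-p) over populations.
    avg_het = mean([2 * p * (1 - p) for p in freqs])
    denom = p_bar * (1 - p_bar)
    if denom == 0:
        # Same allele fixed in every population: no variation to partition.
        return 0.0, avg_het
    # Assumed estimator: Wright's Fst = Var(p) / (p_bar * (1 - p_bar)).
    return pvariance(freqs) / denom, avg_het

result = fst_and_avg_het(SAMPLES)
print(result)
```

Raising `min_populations` above the number of populations with data makes the function return `None`, mirroring the idea that values based on only a few populations are less meaningful.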
Are there individual genotype data in ALFRED?
No. In general, ALFRED is not funded to provide individual genotype data. However, ALFRED has genotype summary information available for a small number of polymorphisms to demonstrate how such a feature might work if ALFRED’s mission expanded to include genotype data. Genotype summary data for a selected site can be accessed by clicking the “G” icon on the allele frequency page, in either the graph or the tabular view. You can find the polymorphisms with genotype information by typing “Genotype” in the keyword search field. Also available under the ‘Summaries’ tab is a link to the ‘Kidd Lab genotype’ page, which provides individual genotype data for selected sites typed in the laboratory of Kenneth Kidd at Yale School of Medicine.
What does the ‘Educational material’ section under ‘Documentation’ contain?
This section offers some starting pointers for using ALFRED data in classroom projects. It does not cover every way the data can be used for teaching. If you are interested in using the data for teaching and need suggestions on how to use ALFRED data, feel free to contact us. We welcome examples of how ALFRED has been used so we can include them here to help others incorporate this resource into teaching.
Can I include ALFRED in a syllabus I am developing for an undergraduate genetics lecture or laboratory course?
Yes. ALFRED is intended to be a useful scientific resource for everybody, so we welcome your students' use of the database. If you think there is something we can do to make it a more useful teaching tool, please send us your suggestions. Our copyright precludes any for-profit use, but non-profit educational use is welcome.
What is the ALFRED wiki application for?
The wiki application lets users add or modify descriptive text for population annotation in ALFRED, supplementing ALFRED's own population curation efforts. This could be part of a course assignment, allowing parts of a student paper to help the broader community.
Do I have to create an account to make changes in ALFRED wiki?
Yes. The contributor’s name is necessarily associated with the entry.
If I add information to a population in ALFRED wiki will I be able to see that in the population description page of ALFRED?
No. Our curators are responsible for comparing the different Wiki versions and adding relevant information to the population description page in ALFRED.
Do I have to register to get ALFRED newsletters?
Yes. You have to register to receive periodic ALFRED newsletters highlighting the recent updates to the database.
How do I create URL links to ALFRED description pages?
For details on how to create URLs to ALFRED pages, follow ‘Documentation’ -> ‘About ALFRED’ -> ‘How to create URL to ALFRED’. Currently two other databases, dbSNP and PharmGKB, have such links into ALFRED.
Polymorphism and Population Information
How do you name polymorphisms in ALFRED?
Since allele frequency data are extracted from published literature by ALFRED curators, the polymorphism name in ALFRED is generally the same as the name in the literature. If different authors use different names for a polymorphism, we pick the most common one as the primary polymorphism name in ALFRED and enter the remaining names as synonyms. These synonyms include the dbSNP rs#, so that number can always be used as a keyword in the search. More recently, with various high throughput datasets being entered in ALFRED through automated uploads, polymorphisms are named using the dbSNP rs#. We also provide a link to the dbSNP rs# page from the site description page in ALFRED.
What are the different types of Polymorphisms addressed in ALFRED?
There are STRPs (short tandem repeat polymorphisms), VNTRs (variable number tandem repeat polymorphisms), SNPs (single nucleotide polymorphisms), INDELs (insertion/deletion polymorphisms), and RFLPs (restriction fragment length polymorphisms) in ALFRED. ALFRED also contains haplotype data.
Why are there different versions for the same polymorphism?
If different authors use different nomenclatures for the alleles of the same polymorphism, integration of the alleles becomes difficult and is best left to the researcher, not ALFRED’s curators. Therefore, the data are entered as separate versions of the same polymorphism. For example, there are three versions of the 3' VNTR of APOB.
Which databases are loci and polymorphisms cross-referenced to?
Individually entered loci are usually cross-referenced to Entrez Gene, OMIM, GenBank, PubMed, CHLC, CEPH, UniSTS, and LSDBs (locus specific databases), along with other related resources. Polymorphisms are cross-referenced to dbSNP, PharmGKB, CHLC, and CEPH, along with other related resources. However, for the recent automated uploads, the loci are being systematically linked to Entrez Gene and the polymorphisms to dbSNP and PharmGKB.
Is there any redundancy in polymorphism entries in ALFRED?
The curators make every effort to enter each polymorphism into ALFRED only once, though allele frequencies for a polymorphism may be extracted from many publications. However, since the nomenclature used for polymorphisms and alleles is not consistent in the literature, there may be some redundancy among polymorphism entries.
Why are some of the descriptions missing? How can I find this information?
Detailed descriptions may be missing for populations, loci, and polymorphisms that have been added recently. These descriptions often require extensive research. Unfortunately, new populations, loci, and polymorphisms are being added to ALFRED faster than ALFRED staff can adequately research and describe them. If you are interested in a population, locus, or polymorphism that does not have a description and you would like more information, contact ALFRED staff and we will do our best to accommodate your request. If you have such information, we welcome your input; please send it to us.
Are the chromosomal base pair positions in ALFRED from the current NCBI build?
The chromosomal positions in ALFRED are from NCBI build 36.3.
What does ‘Status’ mean in the sites list table in the ‘Locus description’ page?
When we migrated from NCBI build 35.1 to 36.1 some polymorphisms were “moved” from one locus to another according to the latest build. To help users locate polymorphisms that have “moved”, ALFRED provides links from the polymorphism page to its previously associated locus page as well as a link from the locus originally associated with the polymorphism to the present locus. The 'Status’ field provides this information for the sites that have moved to a different locus.
What is the difference between a population and a sample?
A population is the entire set of individuals belonging to a defined group; a sample is the group of individuals from that population on which the allele frequencies are based. Thus, different samples, especially if they contain different numbers of individuals, are expected to give different estimates of the allele frequencies in the population as a whole.
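The effect of sample size on a frequency estimate can be made concrete with the binomial standard error. The sketch below is an illustration only and rests on assumptions not stated on this page: random sampling from the population, a di-allelic autosomal site, and diploid individuals, so that N individuals contribute 2N chromosomes. The function name is hypothetical.

```python
import math

def allele_freq_standard_error(freq, n_individuals):
    """Binomial standard error of an allele frequency estimated
    from 2N chromosomes (N diploid individuals)."""
    chromosomes = 2 * n_individuals
    return math.sqrt(freq * (1 - freq) / chromosomes)

# The same observed frequency is much less certain in a small sample.
for n in (25, 100, 400):
    se = allele_freq_standard_error(0.3, n)
    print(f"N={n}: estimate 0.30, standard error {se:.4f}")
```

Under these assumptions, quadrupling the number of individuals halves the standard error, which is why two samples of different sizes from the same population can reasonably report somewhat different frequencies.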
How are the geographic regions divided?
The geographic regions are divided strictly on an arbitrary basis. The divisions are used only to facilitate browsing of the large number of populations ALFRED currently has.
In frequency search results, why do I see sample duplications, such as two samples for the same population with the same sample size and the same frequencies for the same allele?
This may be a case where the frequencies for the two samples were extracted from two different publications. ALFRED’s curators do their best to extract allele frequencies from the published literature. If the papers are from the same lab, this might be a case of sample duplication. Before data are uploaded into ALFRED, the corresponding authors are contacted in case of conflicting or erroneous data. If you are using data from ALFRED for analysis purposes and come across data duplications or other ambiguities we advise you to consult the original author for clarification. Every frequency in ALFRED is linked to the publication it was extracted from.
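If you have already collected frequency records for your own analysis, a quick check like the following can flag candidates for the kind of duplication described above. It is a minimal sketch: the record layout, the reference labels, and the function name are all hypothetical and do not reflect any ALFRED file format.

```python
from collections import defaultdict

# Hypothetical records: (population, sample size, allele, frequency, reference)
RECORDS = [
    ("PopA", 120, "A", 0.25, "Ref1"),
    ("PopA", 120, "A", 0.25, "Ref2"),  # same numbers, different publication
    ("PopB", 80, "A", 0.60, "Ref1"),
]

def possible_duplicates(records):
    """Group records that agree on population, sample size, allele, and
    frequency, and return the groups reported by more than one reference."""
    groups = defaultdict(list)
    for pop, n, allele, freq, ref in records:
        groups[(pop, n, allele, freq)].append(ref)
    return {key: refs for key, refs in groups.items() if len(refs) > 1}

for key, refs in possible_duplicates(RECORDS).items():
    print(key, "reported in", refs)
```

A match is only a hint that the same sample may appear twice; the original authors remain the authority on whether two entries really describe the same individuals.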
Frequency data display formats
Does ALFRED provide various frequency display formats?
Yes. The allele frequency data are displayed as a graphical stacked bar chart, as a table of numbers, and as pie charts on Google map and Google Earth.
What does the icon in the allele frequency display formats represent?
This is a clickable icon which provides the additional typing information linked to that frequency data such as the typing method used, the contributor of the data, and the publication reference from which the frequency data were extracted.
For a selected site are all the samples for a particular population displayed in Google map display format?
No. If there are multiple samples of a population for a particular site, only the one with the highest sample number is displayed on the Google map. However, you can view the different samples of the same population separately on Google Earth by clicking on the pie chart for the population.
Do I have to install Google Earth on my computer to view the pie-charts on Google Earth?
Yes. To view on Google Earth the user's computer must have the Google Earth application downloaded and installed.
Frequency data download formats
In what different formats can I download allele frequency data from ALFRED?
Go to 'Downloads' under the 'Summaries' tab to download the populations, polymorphisms, and frequencies tables in text format. These files can be opened in Microsoft Excel. The data dumps in XML format have not been updated since the high throughput data uploads started and are currently disabled. We are working on new functions to streamline data downloads from ALFRED. Frequency tables in tab-delimited, XML, PML, and Arlequin formats for individual sites can be downloaded from the corresponding description pages. However, these files are not up to date either.
Where can I find information about what each download format means?
Click the ‘Help’ button in the ‘Frequency download’ section available from each site description page.
How can I convert the semi-colon delimited format files to tab delimited format files?
Copy the displayed semi-colon delimited text, paste it into a plain-text editor such as Notepad, and save it as a text file (for example, file.txt). Then open the text file in Microsoft Excel and save it as a tab-delimited file.
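For users who prefer a script to the manual steps above, the same conversion can be sketched with Python's standard `csv` module. The column names and values in the example are made up for illustration; real ALFRED downloads have their own columns.

```python
import csv
import io

def semicolon_to_tab(src, dst):
    """Copy rows from a semicolon-delimited stream to a tab-delimited one."""
    reader = csv.reader(src, delimiter=";")
    writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
    for row in reader:
        writer.writerow(row)

# Made-up sample content standing in for text pasted from a download page.
example = "population;sample_size;allele_1\nPopA;120;0.25\n"
out = io.StringIO()
semicolon_to_tab(io.StringIO(example), out)
print(out.getvalue())
```

To convert a saved file instead, open the input and output with `open(..., newline="")` and pass the two file objects to `semicolon_to_tab` in place of the in-memory streams.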
Data submission to ALFRED
What are the criteria for entering data into ALFRED?
See ‘Criteria for Data Entry into ALFRED’.
How can I submit data to ALFRED?
The ‘Data submission to ALFRED’ section in ‘About ALFRED’ provides the information needed to submit frequency data to ALFRED. You can also contact our staff for assistance.
How should I cite ALFRED?
Use for URL citing.
For publications
a. Kidd KK, Rajeevan H, Osier MV, Cheung KH, Deng H, Druskin L, Heinzen R, Kidd JR, Stein S, Pakstis AJ, Tosches NP, Yeh CC, Miller PL. "ALFRED – the ALlele FREquency Database – update." Am J Phys Anthropol. Annual Meeting Issue: Supplement S36:128 (2003).

b. Rajeevan H, Osier MV, Cheung KH, Deng H, Druskin L, Heinzen R, Kidd JR, Stein S, Pakstis AJ, Tosches NP, Yeh CC, Miller PL, Kidd KK. "ALFRED – the ALlele FREquency Database – update." Nucleic Acids Research 31(1):270-271 (2003).

c. Rajeevan H, Cheung KH, Gadagkar R, Stein S, Soundararajan U, Kidd JR, Pakstis AJ, Miller P, Kidd KK. "ALFRED: An allele frequency database for Microevolutionary Studies." Evolutionary Bioinformatics Online 2005:1 (2005).

d. Rajeevan H, Soundararajan U, Kidd JR, Pakstis AJ, Kidd KK. "ALFRED: an allele frequency resource for research and teaching." Nucleic Acids Research 40(D1):D1010-D1015 (2012).

© 2019 Kenneth K Kidd, Yale University. All rights reserved. The full Copyright Notification is also available.
Originally prototyped by Michael Osier with the aid of Kei Cheung
Upgrades and maintenance since 2002 by Haseena Rajeevan

Last Modified 12/3/2019 6:55:29 PM