featSNP

Data Summary

The SNP information used on this website is based on human dbSNP build 144, downloaded from ftp.ncbi.nih.gov/snp.

The dbSNP build 144 file contains 84,435,229 SNP records; the remaining records are insertions, deletions, indels and MNPs (see table A). Of these 84,435,229 SNP records, we used the 82,257,578 with exactly two observed alleles to build this website (see table B). We plan to add other variants, such as SNPs with more alleles and indels, in the next version of the website.

Among the 82,257,578 SNP records with two observed alleles, 903,636 SNPs are located on non-reference sequences (scaffolds, assembly patches and haplotypes); 102,353 SNPs are mapped to two different chromosomes; and 4,364 SNPs have assembly alleles that do not match their observed alleles (flagged “InconsistentAlleles” in the records). After filtering out these 1,010,353 SNPs, we obtained 81,144,876 SNPs and used their information to perform the analysis.
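The selection and filtering steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline actually used for the website: the `SnpRecord` fields, the `REFERENCE_CHROMS` set and the `filter_snps` helper are hypothetical names, and the sketch assumes that observed alleles are given as a slash-separated string such as `A/G`.

```python
from collections import defaultdict
from dataclasses import dataclass

# Assumption: only these names count as reference chromosomes; scaffolds,
# assembly patches and haplotypes are expected to carry other names.
REFERENCE_CHROMS = {f"chr{c}" for c in [*range(1, 23), "X", "Y", "M"]}


@dataclass(frozen=True)
class SnpRecord:
    rs_id: str                           # e.g. "rs123"
    chrom: str                           # sequence the record is mapped to
    observed: str                        # slash-separated alleles, e.g. "A/G"
    exceptions: frozenset = frozenset()  # flags such as "InconsistentAlleles"


def filter_snps(records):
    """Keep biallelic SNPs mapped to a single reference chromosome."""
    records = list(records)

    # Collect every chromosome on which each rs ID appears.
    chroms_by_id = defaultdict(set)
    for rec in records:
        chroms_by_id[rec.rs_id].add(rec.chrom)

    kept = []
    for rec in records:
        if len(rec.observed.split("/")) != 2:
            continue  # not exactly two observed alleles
        if rec.chrom not in REFERENCE_CHROMS:
            continue  # scaffold, assembly patch or haplotype
        if len(chroms_by_id[rec.rs_id]) > 1:
            continue  # found on two different chromosomes
        if "InconsistentAlleles" in rec.exceptions:
            continue  # assembly allele does not match observed alleles
        kept.append(rec)
    return kept
```

Each filter is a separate test, so the number of records removed at each step can be tallied independently if needed.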

A
          SNP          Insertion   Deletion    Indel    MNP
Number    84,435,229   1,591,294   2,595,517   33,234   110

SNP: all observed alleles are single nucleotides (can have 0, 1, 2, 3 or 4 alleles)
Insertion: the polymorphism is an insertion relative to the reference assembly
Deletion: the polymorphism is a deletion relative to the reference assembly
Indel: insertion/deletion
MNP: Multiple Nucleotide Polymorphism



B
# of observed alleles   0           1        2            ≥3
Number                  1,866,280   36,667   82,257,578   274,704

[Figures: four histograms (number1 to number4)]

The four histograms above show the SNP distribution in different sections of our website.

The first histogram shows the SNP distribution for the Transcription Binding Motifs Prediction section. Of the 81,144,876 SNPs used on our website, 16,031,189 (~19.76%) have no predicted motifs, which suggests that these SNPs may not have a significant effect on binding motifs.

The second histogram shows the SNP distribution by number of associated genes. An SNP located within a gene (intron or exon) is assigned to that gene, while an SNP located in an intergenic region is assigned to the gene whose TSS is closest to the SNP. Under this definition, every SNP has at least one associated gene, and some have more than two.
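The assignment rule above can be sketched in a few lines. This is only an illustration of the stated definition, not the code behind the website: the `Gene` class and the `associated_genes` function are hypothetical, coordinates are assumed to be inclusive, and the TSS is taken as the gene start on the `+` strand and the gene end on the `-` strand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int   # leftmost coordinate (inclusive)
    end: int     # rightmost coordinate (inclusive)
    strand: str  # "+" or "-"

    @property
    def tss(self):
        # Transcription start site depends on the strand.
        return self.start if self.strand == "+" else self.end


def associated_genes(chrom, pos, genes):
    """Return the names of the genes associated with a SNP at chrom:pos."""
    same_chrom = [g for g in genes if g.chrom == chrom]

    # Case 1: the SNP lies inside one or more genes (intron or exon).
    containing = [g for g in same_chrom if g.start <= pos <= g.end]
    if containing:
        return [g.name for g in containing]

    # Case 2: intergenic SNP, so pick the gene(s) with the closest TSS.
    if not same_chrom:
        return []
    nearest = min(abs(g.tss - pos) for g in same_chrom)
    return [g.name for g in same_chrom if abs(g.tss - pos) == nearest]
```

Overlapping genes explain how a single SNP can end up with more than one associated gene: every gene that contains the position is returned.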

The third histogram shows the SNP distribution by number of correlation heatmap plots. Some SNPs have no correlation heatmap plot, either because the RPKM of the associated gene is lower than 0.2 or because the SNP has no predicted motifs.
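The two exclusion conditions above amount to a simple predicate. The sketch below is an assumption about how such a check could be written; `has_heatmap`, its parameters and `RPKM_THRESHOLD` are hypothetical names, and only the 0.2 cutoff comes from the text.

```python
RPKM_THRESHOLD = 0.2  # cutoff stated in the text


def has_heatmap(gene_rpkm, n_predicted_motifs):
    """Return True if a SNP/gene pair qualifies for a correlation heatmap.

    gene_rpkm: expression (RPKM) of the associated gene.
    n_predicted_motifs: number of motifs predicted for the SNP.
    """
    if n_predicted_motifs == 0:
        return False  # no predicted motifs, nothing to correlate
    # "Lower than 0.2" is excluded, so exactly 0.2 still qualifies.
    return gene_rpkm >= RPKM_THRESHOLD
```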

The fourth histogram shows the SNP distribution across brain tissues that have eQTL information with a p-value smaller than 1e-5. For each tissue, clicking its link shows the SNP distribution within that tissue across the eQTL records for different genes.
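A per-tissue count of this kind could be computed as in the sketch below. It is a hedged illustration only: the record layout `(snp_id, gene, tissue, p_value)` and the function name `snps_per_tissue` are assumptions, and counting distinct SNPs per tissue is one plausible reading of the histogram.

```python
from collections import defaultdict

P_VALUE_CUTOFF = 1e-5  # threshold stated in the text


def snps_per_tissue(eqtl_records):
    """Count distinct SNPs per tissue among eQTL records with p < 1e-5.

    eqtl_records: iterable of (snp_id, gene, tissue, p_value) tuples
    (a hypothetical layout chosen for this example).
    """
    snps = defaultdict(set)
    for snp_id, _gene, tissue, p_value in eqtl_records:
        if p_value < P_VALUE_CUTOFF:
            # A SNP linked to several genes in one tissue is counted once.
            snps[tissue].add(snp_id)
    return {tissue: len(ids) for tissue, ids in snps.items()}
```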