Table of Contents
- Column Label Descriptions
- Chromosome Location Search
- Contact us
1. Collecting SNPs in the 3'-UTR
We collected the locations of the 3'-UTRs of all known genes from Ensembl BioMart (Ref. 1) for the mouse (mm9) and human (hg19) genomes. SNPs located in these 3'-UTRs were then extracted from dbSNP build 132.
2. Identifying and annotating PolymiRTS
For each SNP, we assessed whether its two alleles lead to different miRNA target sites. We considered only those 3'-UTR SNPs that affect the match to the seed region of the miRNA. Mature miRNA sequences were downloaded from miRBase (mirbase.org). We used the TargetScan criteria (Ref. 2) to predict miRNA sites: besides a perfect Watson-Crick match to seed nucleotides 2-7 of the miRNA, we required either a perfect match to the 8th nucleotide of the miRNA or an anchor adenosine immediately downstream of the 2-7 seed match in the target.
We assigned each PolymiRTS to one of four classes: 'D' (the derived allele disrupts a conserved miRNA site), 'N' (the derived allele disrupts a nonconserved miRNA site), 'C' (the derived allele creates a new miRNA site) and 'O' (other cases, in which the ancestral allele cannot be determined unambiguously). PolymiRTS of class 'C' may cause abnormal gene repression, and PolymiRTS of class 'D' may cause loss of normal repression control. These two classes are the most likely to have functional impact. We used the pre-calculated 46-way Multiz alignments of vertebrate genomes to derive the annotations. For a miRNA site to be considered conserved, we required that it be present in at least two other vertebrate genomes in addition to the query genome. For mouse SNPs, ancestral alleles were determined from the mouse vs. rat (rn4) genome alignment. For human SNPs, ancestral alleles were determined from the human vs. chimpanzee (panTro2) genome alignment. We also flagged PolymiRTS with A/G alleles, because these are expected to be less deleterious owing to their ability to form G:U wobble base pairs with miRNAs.
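The site criteria described above can be sketched in a few lines of Python. This is only an illustration of the stated rule (seed match to nucleotides 2-7, plus either a match to nucleotide 8 or an adenosine immediately downstream of the seed match), not the TargetScan implementation or the pipeline used to build the database. The function names and the short test sequences are hypothetical; the miRNA shown is the let-7a mature sequence, used purely as an example.

```python
# Minimal sketch of the seed-site rule, assuming RNA sequences written 5'->3'.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Reverse complement of an RNA string (Watson-Crick pairs only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_site_positions(utr, mirna):
    """Return 0-based start positions of qualifying seed matches in the UTR.

    A position qualifies if the UTR pairs perfectly with miRNA nucleotides
    2-7 and, in addition, either pairs with nucleotide 8 (the base just
    upstream of the seed match) or has an 'A' just downstream of it.
    """
    utr = utr.upper().replace("T", "U")
    mirna = mirna.upper().replace("T", "U")
    seed_match = reverse_complement(mirna[1:7])  # pairs with miRNA nt 2-7
    m8_base = COMPLEMENT[mirna[7]]               # pairs with miRNA nt 8
    hits = []
    start = utr.find(seed_match)
    while start != -1:
        end = start + 6
        has_m8 = start > 0 and utr[start - 1] == m8_base
        has_a1 = end < len(utr) and utr[end] == "A"
        if has_m8 or has_a1:
            hits.append(start)
        start = utr.find(seed_match, start + 1)
    return hits


def allele_effect(flank5, alleles, flank3, mirna):
    """Report whether each of the two SNP alleles yields a site."""
    first, second = alleles
    return (
        bool(seed_site_positions(flank5 + first + flank3, mirna)),
        bool(seed_site_positions(flank5 + second + flank3, mirna)),
    )


if __name__ == "__main__":
    let7a = "UGAGGUAGUAGGUUGUAUAGUU"
    # Hypothetical UTR fragment with a C/G SNP inside the seed match.
    print(allele_effect("AAGCUAC", ("C", "G"), "UCGAA", let7a))
```

A SNP whose two alleles give different results from `allele_effect` would be a candidate PolymiRTS under these assumptions; which allele is ancestral, and whether the site is conserved, would still have to come from the genome alignments.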
3. Experimentally validated targets
Experimentally supported miRNA-target interactions were collected from several sources. Three databases, miRecords (http://mirecords.biolead.org/), TarBase (http://diana.cslab.ece.ntua.gr/tarbase/), and miRTarBase (http://mirtarbase.mbc.nctu.edu.tw/), contain collections of miRNA targets from both low- and high-throughput experiments. Additionally, several experimental techniques, such as HITS-CLIP (Ref. 3) and PAR-CLIP (Ref. 4), have recently been developed and used to identify the specific mRNA sequences that bind to miRNAs in the RNA-induced silencing complex (RISC). To include data from these experiments, we first obtained the mRNA sequences bound in RISCs from ago.rockefeller.edu, which is associated with the HITS-CLIP experiment, and the supplementary material from Ref. 5. We also included the 11 miRNA-mRNA pairs with the highest allelic imbalance ratios from Kim and Bartel (Ref. 5). For each interaction, we also recorded the type of experiment (high- or low-throughput) supporting it, and whether the experiment identified only the mRNAs that bind to the miRNA or the specific locations targeted by the miRNA. See the "Column Label Descriptions" section for further discussion of the classification scheme used.
4. SNPs in miRNA seed regions
For each miRNA, we collected all SNPs in the seed region from dbSNP build 132. We identified 5 and 24 SNPs in miRNA seed regions for mouse and human, respectively. We extracted the entire 3'-UTRs of all known genes from Ensembl BioMart for the mouse and human genomes and used TargetScan to identify all predicted target sites that would be either disrupted or created by the SNPs in the miRNA seed regions. Disrupted sites are targets of the miRNA whose seed carries the reference allele at the SNP location, while created sites are targets of the miRNA whose seed carries the derived allele.
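As a rough illustration of the disrupted/created distinction, the sketch below scans a set of 3'-UTRs with both versions of a miRNA and reports genes targeted only by the reference seed (disrupted) or only by the derived seed (created). It reuses the seed rule from section 2 in a compact form rather than calling TargetScan itself, and the gene names, UTR fragments, and the seed SNP are all hypothetical.

```python
# Minimal sketch, assuming RNA sequences written 5'->3' and a simplified
# seed rule (match to miRNA nt 2-7 plus a match to nt 8 or a downstream A).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def has_site(utr, mirna):
    """True if the UTR contains at least one qualifying seed match."""
    seed_match = "".join(COMPLEMENT[b] for b in reversed(mirna[1:7]))
    m8_base = COMPLEMENT[mirna[7]]
    i = utr.find(seed_match)
    while i != -1:
        has_m8 = i > 0 and utr[i - 1] == m8_base
        has_a1 = i + 6 < len(utr) and utr[i + 6] == "A"
        if has_m8 or has_a1:
            return True
        i = utr.find(seed_match, i + 1)
    return False


def seed_snp_targets(ref_mirna, derived_mirna, utrs):
    """Split genes into those losing and those gaining a predicted site."""
    disrupted, created = [], []
    for gene, utr in sorted(utrs.items()):
        ref_hit = has_site(utr, ref_mirna)
        derived_hit = has_site(utr, derived_mirna)
        if ref_hit and not derived_hit:
            disrupted.append(gene)   # targeted by the reference seed only
        elif derived_hit and not ref_hit:
            created.append(gene)     # targeted by the derived seed only
    return disrupted, created


if __name__ == "__main__":
    reference = "UGAGGUAGUAGGUUGUAUAGUU"
    derived = "UGAAGUAGUAGGUUGUAUAGUU"   # hypothetical G>A SNP at seed nt 4
    utrs = {
        "geneA": "AAGCUACCUCGAA",
        "geneB": "AAGCUACUUCGAA",
        "geneC": "GGGGGGGG",
    }
    print(seed_snp_targets(reference, derived, utrs))
```

Genes whose UTRs match both seeds, or neither, are left out of both lists, since the seed SNP does not change their predicted status.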
5. Assessing PolymiRTS in cis-acting eQTLs
Genes with both cis-acting eQTL and PolymiRTS are featured in the database.
For mouse, we examined gene expression levels in nine tissues (whole brain, cerebellum, eye, hippocampus, kidney, liver, nucleus accumbens, prefrontal cortex, and retina) of the BXD recombinant inbred panel, which are publicly available in GeneNetwork (www.genenetwork.org). Gene expression levels were treated as quantitative traits and were mapped onto genomic regions (eQTL) using standard marker regression. A gene is said to have a significant cis-acting eQTL if the QTL peak location is within 5 Mb of the gene's physical location and the genome-wide significance level is < 0.05.
Two methods were used to identify genes with cis-eQTLs in humans. First, gene expression levels in lymphoblastoid cells of 194 human individuals from 14 CEPH families were downloaded from the GEO database, and the raw data were processed using the RMA protocol. Genotypes for 1628 autosomal SNP markers were downloaded from The SNP Consortium database. We used Merlin to remove genotype errors and perform family-based linkage analysis. A gene is said to have a cis-acting eQTL if the LOD peak location is within 10 Mb of the gene's physical location and the p-value is < 0.05. Second, cis-eQTLs identified in a variety of literature sources were included in the database. These eQTLs include all of the records contained in the GTEx eQTL browser (http://www.ncbi.nlm.nih.gov/gtex/test/GTEX2/gtex.cgi) as of September 2011, as well as those from five additional studies in skin (Ref. 6), cortex (Ref. 7), monocytes (Ref. 8), and lymphoblastoid cells (Refs. 9, 10).
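The distance and significance thresholds for calling a cis-acting eQTL can be written as a small predicate. The sketch below is an assumption-laden illustration: it treats the gene's physical location as a single base-pair coordinate, treats "within" as inclusive, and requires the peak to be on the same chromosome as the gene. The function and parameter names are hypothetical and do not come from GeneNetwork or Merlin.

```python
# Minimal sketch of the cis-eQTL call described above.
# Distance limits taken from the text: 5 Mb for mouse, 10 Mb for human.
MAX_DISTANCE_BP = {"mouse": 5_000_000, "human": 10_000_000}


def is_cis_eqtl(gene_chrom, gene_pos_bp, peak_chrom, peak_pos_bp,
                p_value, species, alpha=0.05):
    """True if the QTL peak is close enough to the gene and significant.

    For mouse, p_value stands for the genome-wide significance level;
    for human, for the linkage p-value.
    """
    if gene_chrom != peak_chrom:
        return False  # a peak on another chromosome cannot be cis-acting
    distance = abs(peak_pos_bp - gene_pos_bp)
    return distance <= MAX_DISTANCE_BP[species] and p_value < alpha


if __name__ == "__main__":
    # Hypothetical gene at 10 Mb on chromosome 1 with a peak 7 Mb away.
    print(is_cis_eqtl("1", 10_000_000, "1", 17_000_000, 0.01, "mouse"))
    print(is_cis_eqtl("1", 10_000_000, "1", 17_000_000, 0.01, "human"))
```

The same peak can therefore count as cis-acting under the human threshold but not under the mouse threshold.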
6. Assessing PolymiRTS in pQTLs
For mouse, we first mapped QTLs (genome-wide significance < 0.1) for more than 2000 published BXD phenotypes (physiological/behavioral traits). For each QTL, we linked it with genes that are physically located in the QTL interval and have at least one PolymiRTS. These genes, together with genes with nonsynonymous SNPs, are candidate causal genes underlying the pQTL.
For human, we collected all genes corresponding to SNPs associated with human diseases and traits in the NHGRI GWAS catalog (www.genome.gov/gwastudies) and dbGaP (http://www.ncbi.nlm.nih.gov/gap) and compared them with the list of genes with SNPs in miRNA target sites in our PolymiRTS database.
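Both linking steps amount to simple set operations, which the following sketch illustrates. It assumes that a gene counts as "located in the QTL interval" when its coordinates overlap the interval on the same chromosome; whether the database requires overlap or full containment is not stated here. All gene names, coordinates, and function names are hypothetical.

```python
# Minimal sketch of linking QTL intervals and GWAS genes to PolymiRTS genes.

def candidate_genes(qtl, genes, polymirts_genes):
    """Genes overlapping the QTL interval that carry at least one PolymiRTS.

    qtl:   (chromosome, start_bp, end_bp)
    genes: {gene_name: (chromosome, start_bp, end_bp)}
    """
    chrom, start, end = qtl
    return sorted(
        name
        for name, (g_chrom, g_start, g_end) in genes.items()
        if g_chrom == chrom
        and g_start <= end and g_end >= start   # intervals overlap
        and name in polymirts_genes
    )


def gwas_overlap(gwas_genes, polymirts_genes):
    """Genes reported in GWAS that also have SNPs in miRNA target sites."""
    return sorted(set(gwas_genes) & set(polymirts_genes))


if __name__ == "__main__":
    genes = {
        "GeneA": ("1", 1000, 2000),
        "GeneB": ("1", 5000, 6000),
        "GeneC": ("2", 1500, 1800),
        "GeneD": ("1", 1500, 2500),
    }
    polymirts = {"GeneA", "GeneB", "GeneC"}
    print(candidate_genes(("1", 500, 3000), genes, polymirts))
    print(gwas_overlap({"GeneB", "GeneX"}, polymirts))
```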
The database can be browsed by four different criteria:
1. Genes with SNPs in miRNA target sites
This table displays all genes that contain SNPs in predicted miRNA target sites. It can be filtered to select only genes that contain target sites of specific functional classes or whose gene symbols start with certain characters. Clicking on the RefSeq ID opens a table with details of the SNPs in the miRNA target sites, as well as the regulation of the gene by cis-eQTLs and its associations with complex traits. This table can be filtered based on conservation, functional class, and experimental support. See the "Column Label Descriptions" section for further description of these categories.
2. SNPs in miRNA seeds
This table displays miRNAs with SNPs in seed regions. For each miRNA with a seed SNP, two tables are available: one containing genes with putative target sites that are disrupted by the derived allele in the miRNA seed and one containing genes with putative target sites that are created by the derived allele in the seed.
3. Human diseases and traits
This table displays all genes in the PolymiRTS database that have also been associated with human diseases or traits in genome-wide association studies (GWAS).
4. Experimentally validated targets
This table displays experimentally validated gene-miRNA pairs that contain SNPs in target sites.
| Column Label Descriptions
||SNP location in the mRNA transcript. It is a zero-based number.
||Link to dbSNP.
||Whether the SNP can form a G:U wobble basepair with the miRNA. Y: Yes; N: No.
||If applicable, the ancestral allele is denoted.
||Two alleles of the SNP in the mRNA transcript.
||Genotypes of two mouse inbred strains to be compared. The default compares C57BL/6J with DBA/2J.
||Link to miRBase.
||Occurrence of the miRNA site in other vertebrate genomes in addition to the query genome. By clicking the hyperlink, the users can examine the genomes in which this miRNA target site occurs.
D: The derived allele disrupts a conserved miRNA site (ancestral allele with support >= 2).
N: The derived allele disrupts a nonconserved miRNA site (ancestral allele with support < 2).
C: The derived allele creates a new miRNA site.
O: The ancestral allele cannot be determined.
||Sequence context of the miRNA site. Bases complementary to the seed region are in capital letters and SNPs are highlighted in red.
LT: The miRNA-mRNA interaction is supported by a low-throughput experiment (e.g., luciferase reporter assay or Western blot).
HT: The miRNA-mRNA interaction is supported by a high-throughput experiment (e.g., microarray or pSILAC).
LTL: The miRNA targeting the specific location is supported by a low-throughput experiment (e.g., allelic imbalance sequencing).
HTL: The miRNA targeting the specific location is supported by a high-throughput experiment (e.g., HITS-CLIP).
N: Predicted target site with no experimental support.
|The user can query the database by SNP ID (e.g. rs27454734), miRNA ID (e.g. mmu-miR-140), RefSeq ID (e.g. NM_172694), HUGO gene identifier (e.g. Igfbp1), any word in the gene or trait description, and GO accession number or name. The search results can also be filtered based on functional class, conservation, and experimental support. See the "Column Label Descriptions" section for further description of these categories.
| Chromosome Location Search
|This feature is designed for researchers who have mapped QTLs (genomic loci) controlling traits of interest and want to examine a set of functional polymorphisms (for example, all PolymiRTS) within the genomic region to identify the causal variant. For mouse, we provide an inbred strain comparison option so that the query searches only the SNPs that differ between the two selected strains.
|We offer flat file downloads for the database, including the main records of human and mouse PolymiRTS as well as a list of genes with cis-acting eQTLs. A description of the files available for download is provided in File_description.txt.
| 1. Haider, S., et al. 2009. BioMart Central Portal--unified access to biological data. Nucleic Acids Res, 37, W23-7.
2. Lewis, B. P., et al. 2005. Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell, 120, 15-20.
3. Chi, S. W., et al. 2009. Argonaute HITS-CLIP decodes microRNA-mRNA interaction maps. Nature, 460, 479-486.
4. Hafner, M., et al. 2010. Transcriptome-wide identification of RNA-binding protein and microRNA target sites by PAR-CLIP. Cell, 141, 129-141.
5. Kim, J. and Bartel, D. P. 2009. Allelic imbalance sequencing reveals that single-nucleotide polymorphisms frequently alter microRNA-directed repression. Nat Biotechnol, 27, 472-477.
6. Ding, J., et al. 2010. Gene expression in skin and lymphoblastoid cells: refined statistical method reveals extensive overlap in cis-eQTL signals. Am J Hum Genet, 87, 779-789.
7. Myers, A. J., et al. 2007. A survey of genetic human cortical gene expression. Nat Genet, 39, 1494-9.
8. Zeller, T., et al. 2010. Genetics and beyond: the transcriptome of human monocytes and disease susceptibility. PLoS One, 5, e10693.
9. Morley, M., et al. 2004. Genetic analysis of genome-wide variation in human gene expression. Nature, 430, 743-747.
10. Pickrell, J. K., et al. 2010. Understanding mechanisms underlying human gene expression variation with RNA sequencing. Nature, 464, 768-772.
Please send questions and comments to Dr. Yan Cui at University of Tennessee Health