ERISdb: a Database of Plant Splice Sites


Help contents:
i. BROWSE
ii. U12 INTRONS
iii. microRNA GENES
iv. WEBLOGOS
v. DOWNLOAD
vi. EXAMPLE

BROWSE

Species selection page
Here, one can choose one of eight species: Arabidopsis thaliana, Chlamydomonas reinhardtii, Glycine max, Oryza sativa, Physcomitrella patens, Selaginella moellendorffii, Vitis vinifera, Zea mays.

Transcript selection page
On this page one can select a transcript of interest from the chosen species. Clicking a transcript name redirects to the Splice site selection page. Transcripts can be filtered by biotype, gene name, keywords or chromosome. A keyword search accepts several words separated by spaces, e.g. splicing factor.

Splice site selection page
This page helps to select a splice site of interest.

When hovering over a splice site selection panel, the corresponding splice site in the gene structure is marked with a green line. In the example above, this is the 5'ss of intron 2.
The coding sequence of a gene is marked in red, and noncoding regions in blue.
For display purposes, introns (black line) longer than 1000 bases are not drawn to scale but are truncated to 1000 bases; a truncated intron is marked with a dedicated symbol. Short introns (<1000 bases) and exons are scaled according to their lengths in the genome.

The page states whether the transcript is located on the forward or the reverse strand. In either case, the structure is displayed with the 5' end of the transcript on the left.
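
The orientation rule can be illustrated with a short sketch. It assumes that sequences are stored in forward-strand genomic orientation and that the strand is given as "+" or "-"; the function name is hypothetical and is not part of ERISdb.

```python
# Hypothetical helper illustrating the display orientation; not ERISdb code.
# Both cases are mapped so that the exon (upper case) / intron (lower case)
# convention survives the conversion.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def to_display_orientation(genomic_seq: str, strand: str) -> str:
    """Return the sequence read 5'->3' along the transcript."""
    if strand == "+":
        return genomic_seq
    if strand == "-":
        # Reverse complement: complement every base, then reverse the string.
        return genomic_seq.translate(_COMPLEMENT)[::-1]
    raise ValueError(f"unknown strand: {strand!r}")


if __name__ == "__main__":
    print(to_display_orientation("AAGctgcat", "-"))
```

With this convention a reverse-strand transcript is read left to right in the same way as a forward-strand one.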

Upon clicking the desired splice site, one is redirected to the Splice site data page.

Splice site data page
This is a central point of the database, where different types of data associated with the selected splice site are available from one place.

Here, one can choose the following from the tab menu:
a) General
Displays basic data about the splice site, like the sequence or intron type (U2 or U12).

b) Orthologs
Shows an alignment of the splice site with the corresponding splice site from an orthologous gene. The alignment includes both exonic and intronic sequence.

c) EST
Displays an alignment of genomic sequence with ESTs that support the splice site. It is possible to view the full alignments that often span over multiple introns by clicking a provided link.

d) RNA-Seq
Displays blocks of exonic sequence (supported by RNA-Seq data) that surround the intron of interest.
The score tells about ...
This feature is available only for four species: Selaginella moellendorffii, Oryza sativa, Physcomitrella patens, Zea mays.

Example:

 ATGC   Exonic block supported by RNA-Seq that spans the splice site

 atgc   Truncated intron sequence

Block sizes: 41 (upstream exon), 71 (downstream exon)
Score: 9

CAGGCCGCCAGCGCCGCCGTGCGCGCCGCCGTCTCCCGCGACCCGCTCTTCGTCAACACCGCCGTTTCGCTCCTGCACTCCTCGCTCACGTCGGCCTCCGgtctggcctc ... gtatttgcagTTATCTTTGTTCTTGTCAACCGATGGCATAATAAGGACCTCAAGAACATGTTTGAGCATGAAGAATTGTTTGGTGGCAGTTGGGTTGGAGCATATTCTGC


e) PPT, BS, UA-tract

Polypyrimidine tract (PPT), branch site (BS) and UA-tract are shown here in alignment with an intron.
This feature is available only for 3' splice sites.
PPTs, BSs and UA-tracts are marked with different colors, as in the example below.

 atgc   intronic sequence     ATGC   exonic sequence

Intronic sequence is truncated to 55 bases.

gcaaatgagattatctgattcttccactttcaactgagttgtttcttctatgcagTGTATGGGTTGCTAGAAAACCACCAGGTTTTGCCTTTATTGACTTTGATGACCGCAGGGATGCAGAAGATGCAATTCGTGATTTAGATG
            atctgat  putative branch site (score: 42)
 ttgtttcttct  putative PPT
 agattat  UA-rich tract

If the putative PPT overlaps with the branch site, or if no branch site was found, the tract is called a CU-rich tract.
For the branch site, a score is provided that reflects the similarity of the detected BS to the fungal consensus TACTAAC (the higher the value, the higher the similarity).
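
The scoring scheme used by ERISdb is not described on this page. As a rough illustration only, similarity to the consensus can be sketched as the number of positions that match TACTAAC. The function names below are hypothetical, and the values they return are on a different scale from the score shown in the example above.

```python
from typing import Tuple

CONSENSUS = "TACTAAC"  # fungal branch site consensus quoted in the text


def identity_score(candidate: str, consensus: str = CONSENSUS) -> int:
    """Count the positions at which the candidate matches the consensus."""
    if len(candidate) != len(consensus):
        raise ValueError("candidate and consensus must have equal length")
    return sum(a == b for a, b in zip(candidate.upper(), consensus))


def best_branch_site(intron: str) -> Tuple[int, str, int]:
    """Return (offset, 7-mer, score) of the window closest to the consensus.

    Ties are resolved in favour of the leftmost window (an arbitrary choice).
    """
    k = len(CONSENSUS)
    if len(intron) < k:
        raise ValueError("intron is shorter than the consensus")
    windows = ((i, intron[i:i + k]) for i in range(len(intron) - k + 1))
    offset, kmer = max(windows, key=lambda w: identity_score(w[1]))
    return offset, kmer, identity_score(kmer)


if __name__ == "__main__":
    print(identity_score("atctgat"))
    print(best_branch_site("ggTACTAACgg"))
```

A real predictor would normally weight positions differently (the branch point adenosine, for instance, matters more than the flanking bases), which a plain identity count does not capture.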

f) cis-regulatory sequences

Both exonic and intronic splicing regulatory sequences are presented here. ERISdb predictions (in exons and introns) and the predictions by Pertea et al. are available and are marked with different colors, as in the example:

 atgc intron ATGC exonic elements by Pertea et al.
 ATGC exon atgc putative intronic elements
 ATGC putative exonic elements identified for retained introns
        10        20        30        40        50        60        70        80        90        100       110       120       130       140       150       160       170       180       190       200       210       220 
---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------| 
gtgtgattgtgttcttctacttttggtttggtagagattggtggtggtgcttgttgctttcctcagcgtggtgggagtgatttgtgttgaatacctgcagGGTCGAGCGCGCTCATGGCTCGCGTGTACGTGGGGAACCTGGATCCGCGCGTGACGGCGCGGGAGATCGAGGACGAGTTCCGCGTGTTTGGGGTTCTGAG

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - GGAGAT
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - GAGATC

- -tgattgt
- - - - - - - - - - - - - - - - - - - -ggtggtg
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - cgtggtg
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - tgtgttg
In case of intronic elements, one can click the sequence to see the corresponding data: a WebLogo for a cluster that contains the putative SRE, a corresponding PWM and score value that was calculated based on the PWM.
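
To illustrate how a score can be calculated from a PWM, here is a minimal sketch that builds a log-odds matrix from a set of aligned sites and scores a candidate against it. The pseudocount, the uniform background and the function names are assumptions, and treating the four intronic 7-mers from the example above as one cluster is done purely for illustration; the actual ERISdb clusters and scoring may differ.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence

BASES = "acgt"


def build_pwm(sites: Sequence[str], pseudocount: float = 1.0,
              background: float = 0.25) -> List[Dict[str, float]]:
    """Build a log2-odds PWM from equally long, aligned sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(length):
        counts = Counter(s[pos].lower() for s in sites)
        pwm.append({
            b: math.log2(((counts[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def pwm_score(pwm: List[Dict[str, float]], kmer: str) -> float:
    """Sum the per-position log-odds for a k-mer (a, c, g, t only)."""
    if len(kmer) != len(pwm):
        raise ValueError("k-mer length must match the PWM length")
    return sum(column[base] for column, base in zip(pwm, kmer.lower()))


if __name__ == "__main__":
    # Illustrative grouping only; not an actual ERISdb cluster.
    toy_cluster = ["tgattgt", "ggtggtg", "cgtggtg", "tgtgttg"]
    pwm = build_pwm(toy_cluster)
    for kmer in ("ggtggtg", "aaaaaaa"):
        print(kmer, round(pwm_score(pwm, kmer), 2))
```

A sequence that resembles the cluster members gets a positive score, while an unrelated sequence gets a negative one.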

U12 INTRONS
On this page one can browse and search for U12 introns predicted in all plant species except C. reinhardtii, which lacks the U12 splicing machinery. The most prominent features of U12 introns, i.e. the highly conserved sequence at the donor site, the terminal dinucleotide at the acceptor site and the branch site, are marked with different colors, as in the example:
#21
gene: GRMZM2G154267, transcript: GRMZM2G154267_T01
Orthologs: AT5G22110 GLYMA16G34470 LOC_Os05g06840 LOC_Os08g36330 SELMODRAFT_236997 SELMODRAFT_111965 SELMODRAFT_236823 Vv17s0000g04830
CAATTGACTTATCTAATGCCatatcctaaaaaaacagtgttattgttagcacttatggctagttactatt ... gatttcgagattatcgagaaagtatcactactttccttaacatagaacacAAAATCACGTCAGGTTTTTT
The provided orthologs are orthologous genes in which a U12 intron was also found. Clicking a gene name displays the U12 intron data for that gene.

microRNA GENES
One can choose a microRNA of interest from one of the following sources of miRNA gene structure: ERISdb predictions, miRNAs annotated in Ensembl Plants 15, or the paper by Mica et al.
In the case of Ensembl miRNAs, the user is redirected to the corresponding record on the Splice site selection page.
Upon selecting a microRNA from ERISdb, the following will be displayed (see example below):
- pre-miRNA sequence (yellow)
- EST sequence (green)
- genomic sequence (blue).
The predicted intron is in lower case and is truncated to 50 bases at both ends.
microRNA: gma-MIR394g
EST name: AW099182.1
ACAGAGTTTCTTGGCATTCTGTCCACCTCCACTTCTTGGCCCTATCTACGTACTCGGAGGTGGATATACTGCCAATAGAG
CACGTGGGTTTAACAAAGGGTTGCTTACAGAGTTTCTTGGCATTCTGTCCACCTCCACTTCTTGGCCCTATCTACGTACTCGGAGGTGGATATACTGCCAATAGAGCTGTGTTGGCTTCTCTTTGTCAAGCCTCCGTGATACTATAAATCTCAGTTAAATACATAAACACTTTTTTCCCATTTTTCTTTTCTGGCTATCAATTATATGAG GATCCCACCGGTCGTTTGCAAAAAAATGCTAAGAAATCGGATAAAGAGTAGAACGCCTCGAGTCGCTTGCCGCTTGCTTCTGGTGTTAGAAAGAAACCCTATAATCTTGTTTCCCTCCATTCTTATGAAACCCTATATTTCCCTTAGAACTTGGGCCTTTCCATGTTTCTTTAATTCTACTAGTCTAGCGATTGTTTGATGTTTCTAGTTTATTTCTTAAAATAAATTCTTGACGTTGTGT
CACGTGGGTTTAACAAAGGGTTGCTTACAGAGTTTCTTGGCATTCTGTCCACCTCCACTTCTTGGCCCTATCTACGTACTCGGAGGTGGATATACTGCCAATAGAGCTGTGTTGGCTTCTCTTTGTCAAGCCTCCGTGATACTATAAATCTCAGTTAAATACATAAACACTTTTTTCCCATTTTTCTTTTCTGGCTATCAATTATATGAGgtaaagcaacctccttcctttttctttccttttgttgttttgtttttgtg ... ctagatagcacgatactgtgtgcttttcccctcattttttgattttgcagGATCCCACCGGTCGTTTGCAAAAAAATGCTAAGAAATCGGATAAAGAGTAGAACGCCTCGAGTCGCTTGCCGCTTGCTTCTGGTGTTAGAAAGAAACCCTATAATCTTGTTTCCCTCCATTCTTATGAAACCCTATATTTCCCTTAGAACTTGGGCCTTTCCATGTTTCTTTAATTCTACTAGTCTAGCGATTGTTTGATGTTTCTAGTTTATTTCTTAAAATAAATTCTTGACGTTGTGT
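
The display convention used for the genomic sequence above (exons in upper case, intron in lower case and truncated to 50 bases at both ends) can be sketched as follows. The function name and its arguments are hypothetical, and leaving short introns untruncated is an assumption.

```python
def format_spliced_region(upstream_exon: str, intron: str,
                          downstream_exon: str, flank: int = 50) -> str:
    """Render exon-intron-exon with a lower-case, truncated intron."""
    intron = intron.lower()
    if len(intron) > 2 * flank:
        # Keep `flank` bases at each end and mark the omitted middle part.
        intron = f"{intron[:flank]} ... {intron[-flank:]}"
    return upstream_exon.upper() + intron + downstream_exon.upper()


if __name__ == "__main__":
    long_intron = "gt" + "a" * 200 + "ag"
    print(format_spliced_region("atgag", long_intron, "gatcc"))
```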
For miRNAs from Mica et al., the alignment of the miRNA splice site with deep sequencing reads is presented (see below). Reads that support a splice site are split into two parts: one aligned to the exon sequence at the 5'ss and the other to the exon sequence at the 3'ss. For vvi-MIR162 and vvi-MIR168 multiple splice sites are predicted; they are presented separately because their arrangement in different splice forms is unknown.
 ATGC    exonic sequence ATGC    5' end of a read
 ATGC    intronic sequence ATGC    3' end of a read
vvi-MIR394B chr18 - 1385724 1385362
CTCTCTCGCTCTTCCACTCTAGAGCATCAAGGTGAAAACCCCA ........ CTTGTGTTGCAGGGGTTTCATCAACTCCTCCTCTTTGCCTCTT
                  CTAGAGCATCAAG                                  GGGTTTCATCAACTCCT 1
         TCTTCCACTCTAGAGCATCAAG                                  GGGTTTCATC 1
                     GAGCATCAAG                                  GGGTTTCATCAACTCCTCCT 2
            TCCACTCTAGAGCATCAAG                                  GGGTTTCATCA 1
                        CATCAAG                                  GGGTTTCATCAACTCCTCCTCTT 2
           TTCCACTCTAGAGCATCAAG                                  GGGTTTCATC 5
                    AGAGCATCAAG                                  GGGTTTCATCAACTCCTCC 3
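
The way a supporting read is split at the junction can be sketched as follows. This is not the procedure used to build the database; it only illustrates the idea, with a hypothetical function name. The exon boundaries in the usage example are read off the vvi-MIR394B alignment above and are an assumption, since the colors that distinguish exon from intron are not reproduced in this text.

```python
from typing import Optional, Tuple


def split_read_at_junction(read: str, upstream_exon: str,
                           downstream_exon: str,
                           min_anchor: int = 1) -> Optional[Tuple[str, str]]:
    """Split a read into the part ending at the 5'ss and the part starting at the 3'ss.

    Returns None if the read does not span the exon-exon junction exactly.
    """
    for cut in range(min_anchor, len(read) - min_anchor + 1):
        left, right = read[:cut], read[cut:]
        if upstream_exon.endswith(left) and downstream_exon.startswith(right):
            return left, right
    return None


if __name__ == "__main__":
    # Exon sequences flanking the intron, as inferred from the example above.
    upstream = "CTCTCTCGCTCTTCCACTCTAGAGCATCAAG"
    downstream = "GGGTTTCATCAACTCCTCCTCTTTGCCTCTT"
    read = "CTAGAGCATCAAGGGGTTTCATCAACTCCT"
    print(split_read_at_junction(read, upstream, downstream))
```

Exact matching is used here for simplicity; real read alignment would tolerate mismatches and would typically require a minimum anchor length on each side of the junction.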

WEBLOGOS
This page provides WebLogos for the 5'ss, 3'ss and branch sites of U2 and U12 introns for the analyzed plant species. The corresponding PWMs are also available.

DOWNLOAD
Different types of data and tools are available for download here.

EXAMPLE

Click "Browse" in the main menu and select Zea mays. Then type "GRMZM2G134756" in the "Transcript selection" page (Input gene name or its part field).
There are two transcripts for this gene: GRMZM2G134756_T01 and GRMZM2G134756_T02:
Click GRMZM2G134756_T01 and then choose the 3' splice site of intron 4, as shown in the figure:
Note that intron 3 is marked as truncated, which means that it is more than 1000 bases long and was shortened for display purposes.
Five data tabs will appear: "General", "Orthologs", "EST", "PPT, BS, UA-tract" and "cis-regulatory elements":

By clicking the tabs, one can access the corresponding information.
Let's look at the "General" tab. It states that this is a U12 intron and provides a link to the U12 page. On the U12 page we can see that both splice forms have a U12-type intron and that the corresponding intron in Oryza sativa is of U12 type as well.
 atgc   highly conserved terminal intronic nucleotides at 5' or 3' end.

 atgc   branch site.
#1
gene: GRMZM2G134756, transcript: GRMZM2G134756_T02
Orthologs: LOC_Os08g23110 LOC_Os08g05490
ACCAATTATTCTTTCACATCgtatcctgaatacttattttgatcttgttctctttttaacctttaaatat ... ttgtttgtactttgcagagcatgtatttttttgcttaacctgaaatgcagATATGCTCCCTGGTTTTAAA

#2
gene: GRMZM2G134756, transcript: GRMZM2G134756_T01
Orthologs: LOC_Os08g23110 LOC_Os08g05490
ACCAATTATTCTTTCACATCgtatcctgaatacttattttgatcttgttctctttttaacctttaaatat ... ttgtttgtactttgcagagcatgtatttttttgcttaacctgaaatgcagATATGCTCCCTGGTTTTAAA
If we go back to the "Splice site data" page, in the "Orthologs" tab we can see alignments of the Zea mays splice site with two homologs in Oryza sativa. By clicking the transcript IDs we can access the corresponding "Splice site data" pages.
In the "EST" tab there is experimental support for this splice site in the form of EST sequences. The two homologous splice sites in O. sativa are supported by both EST sequences and RNA-Seq reads.
All three homologs lack a PPT but have quite distinct UA-tracts ("PPT, BS, UA-tract" tab).
Finally, in the "cis-regulatory elements" tab we can see that the homologs share the same three putative regulatory sequences in their exonic sequences.