EucGenIE : help

 

Overview

The GeneList tool allows the user to search for genes using gene IDs, descriptions, experiments, GO IDs and other annotations, and then saves the result in a list that can be used by other tools.

Basic Usage

Simply type in gene IDs, descriptions or other annotations. The matching genes will be displayed with selected annotations. The result can be customized by clicking the Select Displayed Annotations button. Three buttons let you “Save all to Gene List”, “Remove selected from Gene List” or “Empty Gene List”. The “Share table” button allows sharing the current GeneList with other users by way of an automatically generated URL. The GeneList tool is the starting point for most PlantGenIE workflows.

Multiple GeneLists

The GeneList tool can hold several named gene lists for use in other tools. These lists can be added, renamed or deleted. Clicking a GeneList name makes it the active GeneList, which is then displayed in all other tools. GeneLists are kept for seven days, while shared GeneLists are kept for 30 days.

Data

We use both PostgreSQL and MySQL to store annotation data. The GeneList tool uses both in-house annotation data and data from Phytozome and Plaza.

Implementation

This tool uses JavaScript, PHP, MySQL and PostgreSQL together with the jQuery, DataTables and Toastr libraries. SQL views and temporary tables are used to speed up the tool.

Overview

The BLAST (Basic Local Alignment Search Tool) tool compares input sequences to PlantGenIE sequence databases to identify homologous sequence matches.

Basic Usage

Simply paste your sequence (with or without a FASTA header) into the Query Sequence input text box. Alternatively, you can retrieve a transcript sequence by entering a gene ID into the Load example text box, or you can upload a sequence file (less than 100 MB) using the upload file function. Having used one of these input options, select the desired dataset from the list of available BLAST databases. Finally, click the BLAST! button at the bottom of the page.
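As a rough illustration of what "with or without a FASTA header" means, the sketch below turns pasted text into a single FASTA record before it is saved to a file for upload. It is not PlantGenIE code; the function name `to_fasta` and the default identifier `query` are made up for this example.

```python
def to_fasta(pasted: str, default_id: str = "query") -> str:
    """Return pasted text as one FASTA record.

    If the first non-empty line starts with '>', it is kept as the header.
    Otherwise a header built from default_id (a hypothetical name) is added.
    """
    lines = [ln.strip() for ln in pasted.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty query sequence")
    if lines[0].startswith(">"):
        header, body = lines[0], lines[1:]
    else:
        header, body = f">{default_id}", lines
    # Join the sequence lines and drop any spaces pasted inside them.
    sequence = "".join(body).replace(" ", "")
    if not sequence:
        raise ValueError("header without sequence")
    # Wrap at 60 characters per line, a common FASTA convention.
    wrapped = [sequence[i:i + 60] for i in range(0, len(sequence), 60)]
    return "\n".join([header] + wrapped) + "\n"
```

Calling `to_fasta("ACGT\nTTGA")` adds the placeholder header, while input that already starts with `>` keeps its own header.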

PlantGenIE BLAST uses standard default NCBI BLAST options. However, users can change the following advanced options:

Option                Description
Scoring matrix        Substitution matrix that determines the cost of each possible residue mismatch between query and target sequence. See BLAST substitution matrices for more information.
Filtering             Whether to remove low-complexity regions from the query sequence.
E-value cutoff        The maximum expectation value of retained alignments.
Query genetic code    Genetic code to be used in blastx translation of the query.
DB genetic code       Genetic code to be used in tblastn translation of the database sequences.
Frame shift penalty   Penalty for out-of-frame gapping (blastx, tblastn only). Integer, default = 0.
Number of results     The maximum number of results to return.

BLAST results

The BLAST Results page is reloaded automatically until the search results have been retrieved. BLAST results are organized into a table containing Query ID, Hit ID, Average bit score (top), Average e-value (lowest), Average identity (av. similarity) and Links. Clicking a BLAST result opens the GBrowse tool at the corresponding region of identified homology.

Data

The BLAST tool uses public genome assemblies, early release de novo assemblies from UPSC and data from Phytozome (http://www.phytozome.net/) and Plaza.

Implementation

PlantGenIE BLAST search is implemented using NCBI BLAST (v2.2.26) and a backend PostgreSQL Chado database. The graphical user interface extends the open-source GMOD Bioinformatic Software Bench server using PHP, JavaScript, XSL and Perl together with the d3.js and Drupal libraries.


Overview
GBrowse is an open-source genome annotation viewer.

Basic Usage
To find a particular region of the chromosome, type a gene name, a short sequence (minimum of 15 bp) or a nucleotide range in the Landmark or Region box located near the top left of the page and click the Search button. The area shown in the Details panel is highlighted by a box. You can grab the box and slide it left or right within limits (it cannot slide over the whole genome). Once you reach a particular location, you can fine-tune the view with the Scroll/Zoom buttons to move along the chromosome or change magnification.

Data
GBrowse uses in-house annotation data and data from Phytozome and Plaza.

Implementation
PlantGenIE GBrowse uses a customized version of Generic Genome Browser version 2.49. We use dedicated GBrowse servers for each of our PlantGenIE resources.


Overview
exImage provides an intuitive pictographic view of expression data across a diverse range of PlantGenIE datasets.

Basic Usage
Users can either enter a gene ID in the input text area (and hit the "GO" button) or create a gene list, which will then appear as an interactive list in the tool. exImage shades the samples according to expression level, using either absolute or relative values. Relative values display expression relative to the mean expression across all samples. The current view can be exported in various vector formats, including publication-ready PDFs, or as expression values. The ‘Take a tour’ feature provides a brief introduction to the basic functionality of exImage.

Data
exImage uses VST (Variance-Stabilizing Transformation) values for absolute expression; relative values have no unit. Absolute expression values were generated by aligning RNA-Seq reads to the reference genome and gene annotation, with the aligned read numbers then used to calculate VST values.
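As a rough illustration of the relative view, the sketch below centres one gene's VST values on their mean across all samples. This is an assumption about how "relative to the mean" could be computed (a difference rather than a ratio, because VST values are on a log-like scale); it is not taken from the exImage source, and the sample names and values are invented.

```python
from statistics import mean


def relative_expression(vst_values: dict[str, float]) -> dict[str, float]:
    """Centre one gene's VST values on the mean across all samples.

    Assumption: 'relative' is the difference from the mean, not a ratio.
    """
    centre = mean(vst_values.values())
    return {sample: value - centre for sample, value in vst_values.items()}


# Hypothetical VST values for a single gene in three samples.
example = {"leaf": 8.0, "root": 6.0, "xylem": 10.0}
relative = relative_expression(example)
```

Samples above the mean get positive values and samples below it get negative values, which is what a diverging color scale needs.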

Implementation
exImage was developed using PHP, JavaScript, d3.js, rsvg-convert, ImageMagick, librsvg and Batik. exImage uses a MySQL database as a backend data source. exImage was inspired by the eFP resource.

Overview

exMatch is a tool in which the user finds genes by colouring samples. A colour legend is shown on the left side of the tool; the user drags the desired colours from the colour palette and drops them onto the relevant sample image. When the user clicks the “Find Genes” button, the tool finds the corresponding genes based on the shaded samples. The underlying principle is the reverse of the exImage tool. Currently, this new tool exists only in the EucGenIE.org web resource.


Overview
exPlot is an interactive plotting tool that visualizes expression profiles as line graphs for selected genes and experiments.

Basic Usage
Type multiple gene IDs into the input text area, separated by comma, space, tab or new line, and hit the "Search" button. Alternatively, you can create a gene list, in which case genes from the currently active list will be displayed. The tool plots VST-normalized gene expression values across the selected samples or pre-defined sets of samples for the input genes. Different sample sets are available in the ‘SampleList’ in the top-right corner of the page. The plot is interactive and allows the user to select a subset of the displayed genes and to create a new GeneList containing only these genes. Publication-ready figures, in PDF or SVG format, can be exported.
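The accepted separators (comma, space, tab or new line) can be illustrated with a short sketch that splits pasted input into gene IDs. This is not the tool's own parser; the function name is hypothetical, and dropping repeated IDs is an assumption made for the example.

```python
import re


def parse_gene_ids(text: str) -> list[str]:
    """Split input on commas and any whitespace, keeping first-seen order.

    Assumption: repeated IDs are dropped; the real tool may behave differently.
    """
    seen = set()
    ids = []
    for token in re.split(r"[,\s]+", text.strip()):
        if token and token not in seen:
            seen.add(token)
            ids.append(token)
    return ids
```

Mixed separators in the same input, such as a comma followed by a new line, are treated as a single separator.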

The ‘Take a tour’ feature will provide a brief introduction to the basic functionalities available in exPlot.

Data
exPlot uses the same VST (variance-stabilizing transformation) datasets, stored in a MySQL database.

Implementation
exPlot was developed using JavaScript, PHP and MySQL. It uses the Highcharts framework to visualize, draw and export charts interactively.

Overview
The Chromosome diagram tool plots the location of genes in the active gene list.

Basic Usage
Type multiple gene IDs into the input text area, separated by comma, space, tab or new line, and hit the "Submit" button. Click the padlock icon to enable the zoom function. You can scroll or use the zoom slider to zoom in or out of the chromosome diagram. When you mouse over a gene location, a popup shows detailed information and a link to the Gene Information page. You can drag to select the gene locations of interest and export them as TSV or GFF3, or visualize them in Phytozome or agriGO. The Chromosome diagram tool also allows users to upload a gene list and display the chromosomal location of those genes. It has controls to change the color of the output diagram and generates a publication-ready plot that can be exported in common file formats including PDF.

Data
The Chromosome diagram tool uses basic annotation data from the PlantGenIE MySQL database; custom files can also be uploaded.

Implementation
The Chromosome diagram tool was built using ActionScript, PHP and MySQL.

exHeatmap
This tool generates a heatmap plot, useful for clustering and for analyzing the expression of genes relative to each other. The network analysis tool (PopNet) is a useful alternative to clustering, while the expression plotting tool (exPlot) can be a useful alternative for plotting expression profiles. This tool uses the current gene list and sample list available in the Master Menu, so if those lists are empty, users must first populate them using the dedicated tools.

Clustering with the heatmap
The genes are clustered based on the chosen distance function, and the result of the clustering is shown as a dendrogram that can be placed on either the x or the y axis. The color scale indicates how far the actual expression values are from the local consensus. Distance functions quantify how similar the expression of two genes or samples is. For more accurate estimators of gene expression similarity, use the PopNet tool. Based on the all-pairs distance estimates, the genes are clustered using a chosen variant of the hierarchical clustering algorithm. The sample information is selectable from the command panel. Clicking on the heatmap itself opens a publication-ready PDF; alternatively, you can export the heatmap data from the command panel and import it into your favorite plotting program.
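To make the two choices (distance function and clustering variant) concrete, here is a minimal sketch of hierarchical clustering in plain Python. It assumes a correlation-based distance (1 minus Pearson correlation) and average linkage; the distance functions and clustering variants actually offered by the tool are not listed here, so treat both as illustrative assumptions. Gene names and values are invented.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (must not be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_distance(x, y):
    """0 for perfectly correlated profiles, 2 for perfectly anti-correlated."""
    return 1.0 - pearson(x, y)


def average_linkage(profiles, distance=correlation_distance):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance).

    Clusters are tuples of gene names. The merge order is what a
    dendrogram draws from the leaves upwards.
    """
    pair_dist = {
        frozenset((a, b)): distance(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }

    def linkage(c1, c2):
        # Average of all pairwise distances between the two clusters.
        total = sum(pair_dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append((c1, c2, linkage(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges
```

With three genes where two rise together and one falls, the two rising genes are merged first and the third joins last at a much larger distance.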

Overview

exNet (expression Network) is an interactive tool for exploring co-expression networks.

Basic Usage

exNet visualizes co-expression between genes in the active gene list. Co-expression is visualized by drawing a co-expression network where genes are displayed as nodes and co-expression between genes is indicated by connecting nodes with an edge (i.e. if two genes have a line connecting them, they are co-expressed above the selected threshold). Genes are co-expressed if their profiles are correlated above a set co-expression threshold, and both the correlation measure and the co-expression threshold can be changed by the user. There are options to expand the network to include all co-expressed neighbours of genes currently represented in the network, or to remove genes (e.g. unconnected nodes). There are also options to color genes either individually or using pre-defined groups such as clusters or Gene Ontology categories. It is also possible to change the node shapes. Users can select a subset of genes within the network (either by mouse-dragging over the desired genes or by SHIFT-clicking multiple nodes) and the tool will then display, below the network, the expression profiles of the subset as a line plot or a heatmap. For pairs of genes, the tool can visualize their co-expression as a scatterplot.
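The rule "draw an edge when two profiles are correlated above the threshold" can be sketched as follows. The example uses Pearson correlation on invented profiles and computes everything on the fly; it only illustrates the thresholding idea and is not how exNet itself builds its networks.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (must not be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_edges(profiles, threshold):
    """Return (gene_a, gene_b, correlation) for every pair above the threshold.

    Assumption: the signed correlation is compared to the threshold, so
    strongly anti-correlated pairs are not linked.
    """
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if r > threshold:
            edges.append((a, b, r))
    return edges
```

Lowering the threshold passed to `coexpression_edges` adds more edges between the same genes, which matches the behaviour of the display threshold.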

Example workflow:

  1. Add a gene or genes to the currently active gene list.
  2. Go to the exNet tool and display the network.
  3. Select a gene or genes and right click, choose expand (to add co-expression neighbors).
  4. After expansion, add new genes (yellow) to your gene list by selecting them and clicking the 'Add' button in the "GeneList actions:" section of the controls below the network.
  5. You can then use the updated gene list in other PlantGenIE tools, e.g. Enrichment, exImage, exPlot or ComPlEx.

 

exNet display panel

The various elements of the exNet interface are shown in Figure 1 and explained in detail in the following sections. To view a network, exNet requires an active list of genes selected using the gene list tool. The network display shows an editable co-expression network of the current gene selection and allows various operations. Some of these operations are available by selecting genes and right-clicking on the selection to access the selection menu. The actual network being displayed is controlled from the display settings panel. Various plots can be displayed interactively as the user selects network elements, and can be opened in their own specialized tools.

Figure 1. exNet interface.

Display settings

This panel allows users to choose between different co-expression measures (correlation measures). Co-expression can be displayed either as CLR values or as ordinary Pearson correlations (Pear). The CLR values are based on Mutual Information (MI), a pair-wise measure of mutual dependence. The Context Likelihood of Relatedness (CLR) approach transforms each MI value into a z-score indicating how much higher (in standard deviations) that MI value is than the average MI value in the network neighborhood. Hence the CLR value indicates the significance of the co-expression between two genes. The threshold boxes control the network size. For example, a lower threshold for display will result in more links between the selected genes (nodes) in the network panel. A lower threshold for expansion will cause more co-expressed genes to be shown when selected genes are expanded (Figure 3a). You can also change the shapes of the nodes and use separate shapes for the genes annotated as transcription factors.
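The z-score idea behind CLR can be sketched as below. The code follows the commonly used CLR formulation, in which each MI value is standardized against the MI values of both genes involved and the two z-scores are combined; whether exNet uses exactly this variant is an assumption. The MI values in the usage note are invented.

```python
from math import sqrt
from statistics import mean, pstdev


def clr_scores(mi):
    """Compute CLR scores from a symmetric MI table.

    mi maps gene -> {other gene -> MI value}, without self-entries.
    Returns {(gene_a, gene_b): score} with gene_a < gene_b.
    """
    stats = {}
    for gene, row in mi.items():
        values = list(row.values())
        stats[gene] = (mean(values), pstdev(values))

    def z(gene, value):
        mu, sigma = stats[gene]
        if sigma == 0:
            return 0.0
        # Negative z-scores are clipped to zero: only MI values above
        # the gene's own background count as evidence.
        return max(0.0, (value - mu) / sigma)

    scores = {}
    for a in mi:
        for b in mi[a]:
            if a < b:
                scores[(a, b)] = sqrt(z(a, mi[a][b]) ** 2 + z(b, mi[b][a]) ** 2)
    return scores
```

A pair scores highly only when its MI value stands out against the background of both genes, which is why a CLR threshold behaves like a significance cutoff rather than a raw correlation cutoff.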

Figure 2. Display settings panel

Co-expression values between all pairs of genes across all samples are precomputed and stored in the database. However, it is sometimes necessary to compute co-expression for custom selections of samples (selected using the sample list selector in the menu; note that sample selection is currently not available for all datasets). If the checkbox for sub-network is on, a custom co-expression network of the type specified in the correlation drop-down box will be computed on the fly, using the active sample list. By specifying the minimum co-expression level in the attached ‘thresh’ box you can filter the number of links. CLR cannot be computed for custom sample selections, so for this option the network will show MI correlations.

Co-expression network actions

Checking the gene profiles box will show gene expression profiles inside the network nodes for the active sample list. Note, however, that these gene profiles may not always look similar (even for highly co-expressed genes) because the default co-expression network is computed across all samples while the profile is shown for the selected subset of samples. To draw a network for the selected samples, use the sub-network functionality.

Figure 3. Network display panel: a) Selection menu. b) Selection method.

The selection menu (Figure 3a) appears by right-clicking a node selection and allows expanding the network and selecting pathway nodes.

Expanding the network. A node selection can be expanded to include all co-expressed genes at a predefined co-expression threshold (set by changing the expansion threshold in the display settings panel). The network display panel cannot display networks of unlimited size: because of memory restrictions in the user’s web client, it is designed to display a few hundred elements (nodes + links). If the network exceeds a certain number of elements, a warning is displayed and the user has to raise the expansion threshold before trying again.
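The expansion step and its size limit can be sketched as follows. The function name, the data layout and the default limit of 300 elements are hypothetical; the text above only says that the panel is designed for "a few hundred" elements.

```python
def expand_selection(selected, coexpression, threshold, max_elements=300):
    """Add all neighbours co-expressed at or above the expansion threshold.

    coexpression maps gene -> {neighbour -> co-expression value}.
    max_elements is a made-up limit on nodes + links.
    """
    nodes = set(selected)
    links = set()
    for gene in selected:
        for neighbour, value in coexpression.get(gene, {}).items():
            if value >= threshold:
                nodes.add(neighbour)
                links.add(frozenset((gene, neighbour)))
    if len(nodes) + len(links) > max_elements:
        raise ValueError("network too large; raise the expansion threshold")
    return nodes, links
```

Raising the threshold shrinks the result, which is why the warning asks for a higher expansion threshold rather than a smaller selection.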

Pathway nodes. After selecting genes in two or more regions of the displayed network, the ‘Select ptw nodes’ option selects the genes on the shortest paths between these initial selections. The initially selected subnetworks can, for example, be two Gene Ontology categories, and the pathway genes are the genes connecting them.
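The idea of pathway nodes can be illustrated with a breadth-first search on an unweighted network. This sketch finds one shortest path between two genes, which is a simplification: the tool works with selections of several genes and may consider all shortest paths. The function name and the network in the usage note are invented.

```python
from collections import deque


def shortest_path(adjacency, start, goal):
    """Return one shortest path from start to goal, or None if unconnected.

    adjacency maps each gene to an iterable of its linked genes.
    """
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk back through the recorded predecessors.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in adjacency.get(node, ()):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None
```

For a returned path, the pathway nodes are the inner genes, `path[1:-1]`, that is, everything between the two initially selected genes.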

Figure 4. Pathway nodes. In this example the shortest paths are computed between two genes and pathway genes are selected.

Gene list and export panel

GeneList actions. The gene list and export panel can be used to change the gene list based on network selections. Selected genes can be added to or removed from the active gene list (marked in red in the master menu), can replace the active list, or can be saved into a new list (enter the name of the new list and select Save all or Save selected).

Figure 5. Gene list and export panel.

Export options. The ‘Genes’ button exports the gene names in the network as a text file. The ‘SVG’ button exports the current network as a publication-quality figure. The ‘Graphml’ button exports the network in the GraphML format so that it can be further edited in graph editing programs such as Cytoscape and yEd.
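For readers unfamiliar with GraphML, the sketch below writes a minimal GraphML document for a small undirected network using only the Python standard library. It shows the general shape of the format (nodes and edges inside a graph element); the files exported by exNet may carry additional attributes, and the gene names in the usage note are invented.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def to_graphml(nodes, edges):
    """Serialize an undirected network as a minimal GraphML string.

    nodes is a list of gene names; edges is a list of (source, target) pairs.
    """
    root = ET.Element("graphml", xmlns=GRAPHML_NS)
    graph = ET.SubElement(root, "graph", id="G", edgedefault="undirected")
    for name in nodes:
        ET.SubElement(graph, "node", id=name)
    for index, (source, target) in enumerate(edges):
        ET.SubElement(graph, "edge", id=f"e{index}", source=source, target=target)
    return ET.tostring(root, encoding="unicode")
```

Writing the result of `to_graphml(["g1", "g2"], [("g1", "g2")])` to a file with the extension .graphml gives a file in the same basic format that graph editors import.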

Plotting panel

Three types of plots can be generated for the current network selection: expression profiles (‘geneprofile’), scatter-plots and heatmaps. These plots can also be saved (selecting the corresponding button will open the image in a new tab). These plots will automatically refresh as the user changes the gene selection.

Gene profile. This button plots the expression profile of the selected genes for the active sample list. The ‘exPlot’ tool can be opened for a more detailed analysis of the expression profiles.

Figure 6. Profile plot.

Gene scatter plot. This button produces a scatterplot of the expression values of two genes for the current selection of samples. Clicking a link in the network will also produce a scatter plot, but for all samples used to build the network. The scatter plot tool is limited to two genes.

Figure 7. Scatter plot.

Heatmap. The heatmap tool displays a heatmap for the active sample list and the selected genes. A link is also provided for downloading the corresponding expression table. This plot also has a dedicated tool called ‘exHeatmap’, with additional settings and options to generate publication-quality figures.

Figure 8. Heatmap.

The Color panel

The color panel can be used to color named genes in the network or to color genes from the same GO categories. The ‘GO enrichment’ tool can be used to test the statistical enrichment of GO categories in a selection.

Figure 9. Color menu.

The textbox allows the user to type or paste the nodes that should be colored, given as GO terms, ChIP-Seq annotations or gene IDs. At most 500 single genes and eight annotation categories can be colored. If a gene belongs to two or more categories, its color is overwritten in this order: 1. GO terms (in the order they were written), 2. ChIP-Seq annotations (in the order they were written), 3. gene IDs. When coloring several terms from the same Gene Ontology tree, it is therefore recommended to write the terms in order from parent to child.
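The overwrite order can be made concrete with a small sketch in which later assignments simply replace earlier ones. The function name and data layout are hypothetical, the limits of 500 genes and eight categories are not enforced, and all labels, colors and gene names in the usage note are invented.

```python
def resolve_colors(go_terms, chipseq, gene_ids):
    """Return the final color of each gene after all overwrites.

    Each argument is a list of (label, color, genes) entries in the order
    they were written in the textbox. Groups are applied as GO terms, then
    ChIP-Seq annotations, then gene IDs; later entries overwrite earlier ones.
    """
    colors = {}
    for group in (go_terms, chipseq, gene_ids):
        for _label, color, genes in group:
            for gene in genes:
                colors[gene] = color
    return colors
```

If a parent GO term colors genes g1 and g2 blue and its child term then colors g2 green, g2 ends up green, which is why writing terms from parent to child keeps the more specific term visible.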

When the colour genes button is clicked, genes belonging to the selected categories are coloured in the network panel. The colour of each term is listed in the plots and in the color display.

Figure 10. Color display.

Data

There are currently two co-expression networks in our database for PopGenIE (All Affymetrix or Asp201 Expression Atlas) and one each for AtGenIE and ConGenIE.

Implementation

exNet uses Cytoscape Web (Flash) as the core for the network layout and visualization; the web page is coded in HTML, JavaScript and PHP. A Python script is used for structuring and printing the network information. The data is stored in a MySQL database, and the PHP mysql and Python MySQLdb packages are used to access the data.

Best tips to try before you contact us!
We have found that many apparent problems with PlantGenIE tools result from cached earlier results. Before reporting a bug or problem, please clear your browser cache, quit the browser, clear the cache again when you re-open the browser, and then check whether the problem remains.

Overview
The Gene information page contains basic information about a gene including sequence, function and family information.

Basic Usage
The Gene Information page consists of dedicated tabs named Basic Information (including GBrowse details), Sequence, Functional Information, Expression Overview, Gene Family and Publications (including community annotations). The Basic Information tab gives a quick overview (chromosome, description, synonyms, Arabidopsis ID and GBrowse image map) of the gene or gene model. The Sequence tab contains the genomic, CDS, transcript and protein sequences. You can BLAST any of these sequences by clicking the related BLAST button, and you can extract upstream or downstream genomic sequence by adjusting the upstream/downstream input boxes. The 5’ UTR, CDS and 3’ UTR regions are highlighted with dedicated colors. The Functional Information tab includes annotations from different data sources including GO, PFAM, PANTHER, KO, EC and KOG. The Expression Overview tab displays the exImage image for the gene and gives a visual overview of the tissues where the gene is expressed. More tissues/samples are available in the dedicated exImage tool. The Gene Family tab contains gene family information across several species. You can select a species and download a FASTA file or create a phylogenetic tree using either Galaxy or Phylogeny.fr. You can also send a gene family to the PlantGenIE GeneList or visualize expression conservation/divergence using the ComPlEx tool. The Community Annotation tab displays user-submitted annotations of gene models. You can edit the current annotation using the WebApollo annotation editor. Once members of the PlantGenIE team have approved a new submission, it is displayed in the Community Annotation tab.

The Gene Information page is the starting point for WebApollo and is also the final destination for many PlantGenIE tools, for example GBrowse, GeneList or exPlot. There are dedicated pages for both gene and transcript information.

Data
The Gene Information page uses data from various sources including GBrowse, exImage, WebApollo and MySQL. The GeneList tool uses in-house annotation data and data from Phytozome and Plaza.

Implementation
The Gene Information page uses JavaScript, PHP, MySQL, PostgreSQL, jQuery and d3.js.


