In Silico Biology 8, 0038 (2008); ©2008, Bioinformation Systems e.V.  


TRABAS: a database for transcription regulation by ABA signaling


Ananyo Choudhury* and Ansuman Lahiri




Department of Biophysics, Molecular Biology & Bioinformatics, University of Calcutta
92, APC Road, Kolkata 700009, India



* Corresponding author

   Email: ananyo.c@gmail.com





Edited by E. Wingender; received March 29, 2008; revised July 22 and August 19, 2008; accepted August 24, 2008; published September 07, 2008



Abstract

The effects of abscisic acid (ABA) induction on the Arabidopsis thaliana transcriptome have been investigated in various expression studies. We have assembled and analyzed data from available expression studies related to ABA signaling in Arabidopsis, along with other available microarray data, functional annotations and information on the occurrence of cis-regulatory elements in promoters of Arabidopsis genes, in a database called TRABAS. TRABAS is expected to provide a simple, user-friendly platform to facilitate the study of different aspects of ABA mediated transcription regulation and is freely available at http://www.bioinformatics.org/trabas/.

Keywords: ABA signaling, expression profile, cis-regulatory element, gene ontology



Introduction

Abscisic acid is a plant hormone that seems to be present in all higher plants and plays important roles in a number of physiological processes. The processes regulated by ABA include several events in late seed development [McCarty, 1995; Leung and Giraudat, 1998] and response to environmental stress such as desiccation, salt and cold [Bray, 1993; Zhu, 2002]. These common stress conditions adversely affect plant growth and crop production and thus have enormous fundamental and practical significance. Knowledge about how ABA mediates signals about a stressful environment and leads to adaptive responses is important for breeding programs and for devising transgenic strategies to improve stress tolerance in plants.

Regulation of transcription by ABA has been investigated in many different expression studies [Hoth et al., 2002; Seki et al., 2002; Leonhardt et al., 2004; Sanchez et al., 2004; Xin et al., 2005; Li et al., 2006]. These studies have identified more than 2000 ABA regulated genes in the Arabidopsis genome and have demonstrated that different sets of genes are induced and repressed by ABA under different sets of experimental conditions. In addition to ABA mediated gene regulation, some of these expression studies also focused on more specific aspects such as the time course of ABA mediated gene expression [Seki et al., 2002], comparative analysis of ABA mediated gene expression in different tissues [Leonhardt et al., 2004] and ABA mediated gene expression in plants carrying mutations in proteins involved in ABA signaling [Hoth et al., 2002; Suzuki et al., 2003; Xin et al., 2005]. We have assembled data from such expression studies to identify ABA regulated genes and used the behavior of genes under different sets of conditions to organize the ABA regulated genes into expression classes.

Although most of the expression data used in TRABAS can be retrieved from existing microarray databases such as TAIR [Rhee et al., 2003], NASC [Craigon et al., 2004] and Genevestigator [Zimmermann et al., 2004], TRABAS collects expression data related to ABA signaling from experiments conducted on different experimental platforms (cDNA microarrays, Affymetrix microarrays, MPSS (Massively Parallel Signature Sequencing) and slides that are not included in public databases), which is expected to make it very useful to researchers. Furthermore, recognition of subgroups based on expression characteristics within these datasets requires a significant amount of time and effort. Moreover, retrieval of functional annotations and of the distribution of cis-regulatory elements (CRE) usually requires a separate database search. In TRABAS we have attempted to provide an integrated search platform for a multitude of information related to ABA signaling mediated transcription regulation.



The interface of TRABAS

TRABAS integrates available information on ABA signaling mediated gene regulation gathered using different independent methods such as MPSS, cDNA microarrays and Affymetrix 8K and 25K arrays. The interface of TRABAS (Fig. 1) is implemented on an Apache web server and uses a Perl CGI script to query the integrated flat file database. The web interface enables retrieval of expression data for ABA regulated genes under ABA signaling related conditions as well as under other conditions. In addition to expression data, the interface can also be used to retrieve Gene Ontology (GO) annotations and information on the distribution of known CREs in the promoters of Arabidopsis genes.



Figure 1: The web interface of TRABAS database.

To select a set of ABA regulated genes, the interface of TRABAS provides two different filters. The first filter ("score filter") uses the expression values of genes in ABA signaling related experiment slides to find the genes that pass a specified expression value criterion in a selected slide. Using this filter requires selecting a slide, an expression behavior and an expression score cutoff from the corresponding drop down menus. With these inputs the program queries the database and selects the genes which show a particular expression pattern (up or down regulation) and have expression values greater than the selected fold change cut-off in a particular wild type experiment slide.

The second filter ("class filter") selects genes by expression class. Studies on ABA time course, tissue preference and the role of mutants in ABA mediated expression regulation have identified different subgroups of ABA regulated genes. The class IRT (induced rapid transient), for example, contains genes which show rapid and transient induction in ABA time course arrays. During the construction of TRABAS, we have also categorized genes into expression classes on the basis of the number of experiments in which a gene is induced or repressed. The list of expression classes is available at www.bioinformatics.org/trabas/trabas_slides.html. Genes from each of these classes can be selected by choosing the "ON" option from the "class filter menu" and then choosing the desired ABA related expression class from the "Class ID" drop down menu.
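The logic of the score filter can be illustrated with a minimal sketch. This is not the Perl CGI script used by TRABAS, whose code is not shown here; the function name, the slide labels and the in-memory data layout are assumptions made for illustration, and the cut-off is assumed to be given on a log2 scale. The values are taken from the ME00333 columns of Table 1.

```python
def score_filter(expression, slide, direction, cutoff):
    """Return the genes that pass an expression cut-off in one slide.

    expression: {gene_id: {slide_name: log2 fold change}} (hypothetical layout)
    direction:  "up" or "down"
    cutoff:     positive log2 fold change threshold (assumed scale)
    """
    if direction not in ("up", "down"):
        raise ValueError("direction must be 'up' or 'down'")
    selected = []
    for gene, slides in expression.items():
        value = slides.get(slide)
        if value is None:
            # No value recorded for this gene in the chosen slide.
            continue
        if direction == "up" and value > cutoff:
            selected.append(gene)
        elif direction == "down" and value < -cutoff:
            selected.append(gene)
    return sorted(selected)


# Log2 fold changes from the ME00333 slides in Table 1 (slide names are made up).
expression = {
    "At1g01470": {"ME00333_30min": -0.54, "ME00333_3h": 2.18},
    "At2g33380": {"ME00333_30min": -0.97, "ME00333_3h": 5.13},
    "At4g27260": {"ME00333_30min": -0.26, "ME00333_3h": 1.32},
}

print(score_filter(expression, "ME00333_3h", "up", 2.0))
# ['At1g01470', 'At2g33380']
```

Genes without a value in the chosen slide are skipped rather than treated as unchanged, since a missing value may also mean that the gene is absent from the array.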

After a gene set has been selected using either of the filters, TRABAS can be used to retrieve expression data for the selected genes. Different expression categories, each of which comprises a set of microarray experiment slides, can be selected from the options in the "Data to output" drop down menu. The expression category "ABA_all_data" comprises 22 slides derived from eight different expression studies (ME00333 (TAIR), ME00351 (TAIR), Hoth et al., 2002; Seki et al., 2002; Suzuki et al., 2003; Sanchez et al., 2004; Xin et al., 2005; Li et al., 2006). The category "ABA wild type only", on the other hand, contains 13 slides derived from six different expression experiments that are related to ABA signaling in wild type Arabidopsis plants. In addition to expression data related to ABA signaling, we have also included available microarray data from other expression studies in TRABAS. The categories currently available are abiotic stress, biotic stress, hormone treatment, chemical treatment and development. The category "abiotic stress" contains experiments such as heat, osmotic stress and salt stress. The "biotic stress" category includes expression experiments that are concerned with the response of plants to different pathogens. The category "hormone treatment" includes experiments related to the administration of various plant hormones such as ABA, gibberellin, jasmonic acid, auxins and brassinosteroids, whereas the category "chemical treatment" includes experiments related to the administration of other chemicals and drugs such as paclobutrazol, AgNO3 and ibuprofen. The category "development" includes experiments related to the various stages of plant development or to the formation of different plant organs. The list of categories and of the microarray slides included in each category is available at www.bioinformatics.org/trabas/trabas_slides.html.

The "Data to output" menu of TRABAS can also be used to retrieve GO (Gene Ontology) annotations for the selected set of ABA regulated genes. The options GO MF (GO Molecular Function), GO BP (GO Biological Process) and GO CC (GO Cellular Component) in this menu retrieve and summarize the available GO annotation information for the selected group of ABA regulated genes. The distribution of cis-regulatory elements (CRE)/cis-regulatory motifs (CRM) in a set of co-expressed genes often provides significant insight into the underlying regulatory mechanisms. To enable such analysis for any of the ABA regulated gene sets we have included the "PLACE motifs" and "ATCISDB motifs" options in TRABAS, which calculate and compare the occurrence of all known plant CREs from the PLACE [Higo et al., 1999] and ATCISDB [Molina and Grotewold, 2004] databases in an ABA regulated gene set with their respective occurrences in all Arabidopsis promoters. Similar reference frames based on the genomic occurrence of GO classes are also provided in the outputs of the GO annotation retrieval options.
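The comparison of CRE occurrence in a gene set against all promoters can be sketched as follows. The text above states only that occurrences are calculated and compared, so reporting them as counts and fractions is an assumption made here, as are the function name and the toy gene identifiers; the sketch does not reproduce the actual TRABAS output.

```python
def compare_motif_occurrence(genes_with_motif, gene_set, all_genes):
    """Compare how often one motif occurs in a gene set and in all promoters.

    genes_with_motif: genes whose promoter contains the motif
    gene_set:         the selected (for example ABA regulated) genes
    all_genes:        all genes with a promoter sequence (the reference frame)
    """
    genes_with_motif = set(genes_with_motif)
    gene_set = set(gene_set)
    all_genes = set(all_genes)
    if not gene_set or not all_genes:
        raise ValueError("gene_set and all_genes must be non-empty")
    set_hits = len(gene_set & genes_with_motif)
    genome_hits = len(all_genes & genes_with_motif)
    return {
        "set_hits": set_hits,
        "set_fraction": set_hits / len(gene_set),
        "genome_hits": genome_hits,
        "genome_fraction": genome_hits / len(all_genes),
    }


# Toy data with made-up gene identifiers.
all_genes = ["gene%d" % i for i in range(1, 11)]
genes_with_motif = ["gene1", "gene2", "gene3", "gene4"]
gene_set = ["gene1", "gene2", "gene5"]

result = compare_motif_occurrence(genes_with_motif, gene_set, all_genes)
print(result["set_hits"], result["genome_hits"])
# 2 4
```

In this toy case the motif is found in two of three selected promoters but in only four of ten promoters overall. The same kind of reference frame applies to GO classes, with "has the annotation" in place of "contains the motif".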

As the experiments included in TRABAS were performed on different array platforms (8K, 25K and cDNA microarrays), we have also integrated a search tool named "array search tool" into TRABAS. This tool takes a set of gene identifiers as input and shows the presence/absence of each of these genes in the 8K, 25K and cDNA microarrays.

In spite of our best efforts to include all ABA signaling related expression experiments and the subgroups derived from them, we might have missed a few. Moreover, data from newer expression studies might become available in the meantime. Additionally, the TRABAS database and associated tools can be useful for studying expression, functional annotation and CRE distribution in other gene sets as well. Therefore, we have enabled TRABAS to receive a set of gene identifiers from users and to perform retrieval of expression data and functional annotation as well as CRE distribution analysis for such gene sets (www.bioinformatics.org/trabas/trabas_dataset.html).

As an example we used TRABAS to study the expression of sixteen genes, which were found to be invariantly induced by ABA in an earlier comparative study [Xin et al., 2005]. Tab. 1 shows the expression of these genes in two more recent experiments (ME00333 (TAIR) and the slide from Li et al., 2006). Interestingly, all genes from this gene set (with the exception of the gene At4g27260) show significant upregulation by ABA induction in slides from both of these experiments.


Table 1: Genes upregulated by ABA under different experimental conditions.

Gene        Slide Li               ME00333                ME00333                ME00333
            ABA 10 μM 6h*          ABA 10 μM 30 minutes   ABA 10 μM 1 hour       ABA 10 μM 3 hours
At1g01470   3.02                   −0.54                  0.54                   2.18
At1g20440   3.24                   −0.35                  0.64                   1.68
At1g20450   3.42                   −0.19                  1.13                   1.80
At1g62570   5.91                   −0.16                  1.57                   3.44
At2g28400   8                      −0.06                  1.50                   2.70
At2g33380   33.33                  −0.97                  2.50                   5.13
At2g41190   9.49                   −0.99                  0.36                   4.14
At2g46680   6.99                   −0.63                  1.71                   3.74
At3g11410   4.65                   0.74                   1.49                   2.18
At3g61890   5.16                   −0.07                  1.50                   3.03
At4g26080   4.05                   0.69                   1.57                   2.12
At4g27260                          −0.26                  0.45                   1.32
At4g27410   6.99                   0.54                   1.89                   2.87
At4g33550   2.9                    −0.84                  0.65                   3.55
At5g15960   5.17                   −0.74                  0.94                   2.80
At5g52310   6.32                   −0.18                  1.77                   3.19

The small set of genes that is upregulated under different experimental conditions involving ABA administration [Xin et al., 2005] is also seen to be predominantly upregulated in two recent experiments involving ABA induction [Li et al., 2006] (ME00333 (TAIR)).
Slides marked with "*" show expression data as fold change, whereas data for the other slides are given as log2 fold change. Blank entries indicate genes for which exact information about expression in the particular slide was not available, which may be due to non-induction or to absence of the gene from the particular array slide. Using the array search tool it was verified that the gene At4g27260 was present in the array; the blank entry in this case therefore reflects that the gene was not significantly induced/repressed in the slide (Li ABA 10 μM 6h*).
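Because Table 1 mixes fold change and log2 fold change, values from the two kinds of slides have to be brought to a common scale before they can be compared. A minimal sketch of the conversion, assuming that fold change is reported as a positive treated/control ratio (the function names are illustrative and not part of TRABAS):

```python
import math


def fold_to_log2(fold_change):
    """Convert a positive fold change (ratio) to log2 fold change."""
    if fold_change <= 0:
        raise ValueError("fold change must be a positive ratio")
    return math.log2(fold_change)


def log2_to_fold(log2_fold_change):
    """Convert a log2 fold change back to a fold change (ratio)."""
    return 2.0 ** log2_fold_change


# At2g28400 in the slide from Li et al., 2006 (fold change 8 in Table 1).
print(fold_to_log2(8))
# 3.0
```

Under this assumption the 4-fold threshold used to define expression classes corresponds to a log2 fold change of 2 for induction and −2 for repression.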



Materials and methods


TAIR dataset

Normalized and log transformed Arabidopsis 25K microarray data for 1388 slides (TAIR dataset), corresponding to more than 50 independent experiments, were retrieved from the ATTED database [Obayashi et al., 2007]. All control/no treatment slides were removed from the set of 1388 slides. For each set of replicates, a single slide containing the average expression values of the corresponding replicates was used. The resulting slides were then grouped according to the major variable of the corresponding microarray experiments into five different subcategories: abiotic stress, biotic stress, hormone treatment, chemical treatment and development.
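The two preprocessing steps described above (removal of control slides and averaging of replicates) can be sketched as follows. The data layout, the identifiers and the use of a plain arithmetic mean of the log transformed values are assumptions made for illustration; the text does not specify how the averages were computed.

```python
from collections import defaultdict


def average_replicates(slides, replicate_group, controls):
    """Drop control slides and average the remaining slides within each replicate set.

    slides:          {slide_id: {gene_id: log expression value}}
    replicate_group: {slide_id: group_id}; slides without an entry form their own group
    controls:        set of slide_ids to discard
    """
    grouped = defaultdict(list)
    for slide_id, values in slides.items():
        if slide_id in controls:
            continue
        grouped[replicate_group.get(slide_id, slide_id)].append(values)

    averaged = {}
    for group_id, members in grouped.items():
        genes = set().union(*members)  # all genes seen in any replicate
        averaged[group_id] = {}
        for gene in genes:
            observed = [m[gene] for m in members if gene in m]
            averaged[group_id][gene] = sum(observed) / len(observed)
    return averaged


# Toy example with made-up slide and gene identifiers.
slides = {
    "slide1": {"geneA": 1.0, "geneB": 2.0},
    "slide2": {"geneA": 3.0, "geneB": 4.0},
    "control1": {"geneA": 0.0, "geneB": 0.0},
}
replicate_group = {"slide1": "ABA_1h", "slide2": "ABA_1h"}

print(average_replicates(slides, replicate_group, {"control1"}))
# {'ABA_1h': {'geneA': 2.0, 'geneB': 3.0}} (gene order may differ)
```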


ABA datasets

The dataset of ABA induced/repressed genes in TRABAS was prepared on the basis of gene expression data from two experiments related to ABA signaling from the TAIR dataset (ME00333, ME00351) and seven other experiments [Hoth et al., 2002; Seki et al., 2002; Suzuki et al., 2003; Leonhardt et al., 2004; Sanchez et al., 2004; Xin et al., 2005; Li et al., 2006] related to ABA signaling in Arabidopsis (expression data were retrieved from the supplementary materials of the respective publications). Many of the ABA signaling related experiments were designed to study gene expression in mutants, transgenics and specific tissues [Hoth et al., 2002; Leonhardt et al., 2004; Sanchez et al., 2004; Xin et al., 2005] and contain non-wild type/tissue specific data. To identify the ABA regulated transcriptome in wild type Arabidopsis, we selected 13 slides from six experiments (ME00333; ME00351; Hoth et al., 2002; Seki et al., 2002; Xin et al., 2005; Li et al., 2006) which contain data for ABA mediated gene regulation in wild type plants only (ABA wild type dataset).


Expression classes related to ABA signaling

The ABA regulated genes were categorized into different expression classes on the basis of available information about their behavior in time course, tissue preference and effect of different gene mutations on expression profiles. Moreover, for each of the wild type experiments, genes which show at least 4-fold induction/repression in any of the corresponding slides were selected out and these genes were categorized into expression classes on the basis of the number of experiments in which they are induced/repressed. The list of different expression classes is available at www.bioinformatics.org/trabas/trabas_slides.html.
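The count-based part of this classification can be sketched as follows. The sketch assumes that all values are log2 fold changes, so that 4-fold induction or repression corresponds to a value of at least 2 or at most −2; the data layout and identifiers are hypothetical, and the mapping from counts to the class names used in TRABAS is not reproduced here.

```python
def count_regulated_experiments(experiments, threshold=2.0):
    """Count, per gene, the experiments with at least 4-fold induction or repression.

    experiments: {experiment_id: {slide_id: {gene_id: log2 fold change}}}
    threshold:   log2 cut-off; 2.0 corresponds to a 4-fold change
    Returns {gene_id: (n_experiments_induced, n_experiments_repressed)}.
    """
    counts = {}
    for slides in experiments.values():
        induced, repressed = set(), set()
        # A gene counts once per experiment if any slide passes the threshold.
        for values in slides.values():
            for gene, log2_fc in values.items():
                if log2_fc >= threshold:
                    induced.add(gene)
                elif log2_fc <= -threshold:
                    repressed.add(gene)
        for gene in induced:
            up, down = counts.get(gene, (0, 0))
            counts[gene] = (up + 1, down)
        for gene in repressed:
            up, down = counts.get(gene, (0, 0))
            counts[gene] = (up, down + 1)
    return counts


# Toy data with made-up experiment, slide and gene identifiers.
experiments = {
    "exp1": {
        "slide1": {"gene1": 2.5, "gene2": -2.1},
        "slide2": {"gene1": 0.3, "gene2": -0.5},
    },
    "exp2": {
        "slide1": {"gene1": 3.0, "gene2": 0.1, "gene3": 2.0},
    },
}

print(count_regulated_experiments(experiments))
# {'gene1': (2, 0), 'gene2': (0, 1), 'gene3': (1, 0)} (gene order may differ)
```

Genes that never reach the threshold do not appear in the result, which corresponds to their not being selected for any count-based class.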


Annotation and other data

Gene descriptions and GO annotations for all Arabidopsis genes were downloaded from TAIR (http://www.arabidopsis.org). One kb upstream sequences of all Arabidopsis genes, based on TAIR7, were also retrieved from TAIR. Consensus sequences of Arabidopsis cis-regulatory elements were taken from the PLACE [Higo et al., 1999] and AtCISDB [Molina and Grotewold, 2005] databases. A custom Perl script was used to search the Arabidopsis 1 kb upstream sequences for the presence of these cis-regulatory elements. Both the GO annotations and the CRE search results were integrated into the database.
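A search of upstream sequences for consensus motifs can be sketched as follows. This is written in Python and is not the custom Perl script mentioned above. The translation of IUPAC ambiguity codes into a regular expression, the counting of overlapping matches and the optional search of the reverse strand are all assumptions, and the motif and sequence are illustrative only.

```python
import re

# IUPAC nucleotide ambiguity codes expressed as regular expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return sequence.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_motif(consensus, upstream, both_strands=False):
    """Count (possibly overlapping) occurrences of a consensus motif."""
    body = "".join(IUPAC[base] for base in consensus.upper())
    pattern = re.compile("(?=" + body + ")")  # lookahead keeps overlapping hits
    sequence = upstream.upper()
    count = len(pattern.findall(sequence))
    if both_strands:
        count += len(pattern.findall(reverse_complement(sequence)))
    return count


# An ABRE-like consensus (ACGTGKC) and a short made-up upstream fragment.
upstream = "TTACGTGGCAAGCCACGTAA"
print(count_motif("ACGTGKC", upstream))
# 1
print(count_motif("ACGTGKC", upstream, both_strands=True))
# 2
```

Whether a motif is counted on one or on both strands changes the occurrence statistics, so the same choice has to be used for the gene set and for the genome-wide reference.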



Future developments

We plan several improvements for the next version of the TRABAS database. TRABAS currently contains ABA signaling related data for Arabidopsis only; in the next version we would like to include other plant species such as rice and maize. We also plan to include tools for the statistical analysis of known and user-provided CREs and for the study of GO term enrichment. Finally, we plan to integrate other available annotations (such as metabolic pathways) and further expression data (both for ABA signaling and for other experimental categories).




References