In Silico Biology 7, 0036 (2007); ©2006, Bioinformation Systems e.V.  


Correlation between gene silencing activity and structural features of antisense oligodeoxynucleotides and target RNA


Li Liao1 and Zhongwei Li2




1 Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716, USA
   Email: lliao@cis.udel.edu
2 Department of Biomedical Sciences, Florida Atlantic University, 777 Glades Road, Boca Raton, FL 33431, USA
   Email: zli@fau.edu





Edited by H. Michael; received April 26, 2007; revised June 27, 2007; accepted July 05, 2007; published August 18, 2007



Abstract

Antisense oligonucleotides inactivate mRNA targets, providing a tool for post-transcriptional gene silencing and a potential novel treatment for many diseases. Reliable design of active antisense depends on better understanding of the mechanism of antisense-target RNA interaction. We have studied the correlation between activity of antisense oligodeoxynucleotides (ASO) and structural features of both antisense and target RNAs. A total of 348 ASOs with known activities and their target RNA sequences are classified into categories according to their predicted secondary structural features. Statistical analysis showed that higher activity is more likely to happen at RNA stem-loops than at other RNA structural categories. The data suggest a weak correlation between the stability of ASO structure and activity. Remarkably, a structural fit between ASO and target seems important for antisense activity. Significantly higher antisense activity is achieved with stem-loop ASOs on stem-loop or linear RNA targets.

Keywords: antisense, RNA, gene inactivation, mRNA target, secondary structure, oligodeoxynucleotides, stem-loop, statistical analysis



Introduction

An antisense oligodeoxynucleotide (ASO) may inhibit expression of a gene by binding to a target sequence in the mRNA through base complementarity. This regulatory effect of ASOs has been useful in research and has been explored as a potential therapy for many diseases [Tamm et al., 2001; Jansen and Zangemeister-Wittke, 2002]. However, only a small fraction of ASOs efficiently silence the target RNA. Among the many factors affecting activity, the binding efficiency of ASOs to the target RNA sequences plays a major role [Lehmann et al., 2000; Sczakiel, 2000; Matveeva et al., 2003]. It is therefore of great importance to understand what factors may affect the binding efficiency [Branch, 1998]. It has been suggested that the accessibility of the target RNA sequences [Scherr et al., 2000] and the formation and stability of the ASO-target duplexes [Matveeva et al., 2003] play pivotal roles in determining binding efficiency. Since precise prediction of antisense-target interaction is presently not possible, statistical analysis of a large number of antisense results is essential for understanding the various factors affecting ASO activity. Such analysis may facilitate high-throughput, genome-wide identification of RNA targets and the design of effective gene-silencing nucleic acid constructs [Kramer and Cohen, 2004; Ravichandran et al., 2004].

Several analyses have been done to reveal the correlation between activity and the sequence or structure of ASOs as well as target RNA. Much effort has been focused on prediction of single-stranded regions of target RNA, on the assumption that they may be more accessible to antisense oligonucleotides [Mathews et al., 1999; Patzel et al., 1999; Scherr et al., 2000; Vickers et al., 2000; Ding and Lawrence, 2001]. Interestingly, based on analysis of a large number of ASOs whose activities were experimentally determined, Matveeva and colleagues suggested that certain sequence motifs in ASOs may be correlated with activity [Matveeva et al., 2000]. Further analysis suggested that higher thermostability of ASO-target duplexes and lower stability of ASO self-structures are positively correlated with antisense activity [Matveeva et al., 2003; 2004]. These studies provided important details of the antisense mechanism from various points of view. Statistical analysis is a useful approach to determine important sequence and structural features of both antisense and target RNA.

Since ASO accessibility and silencing efficiency are complicated processes, more factors are expected to play roles in determining antisense activity. In this work, we conducted an analysis on a set of 348 oligonucleotide DNA sequences whose activities in targeting different parts of mRNA were previously determined [Giddings et al., 2000; Matveeva et al., 2000]. Our results provide further evidence that secondary structural features of target RNA are correlated with activities of antisense oligonucleotides.



Materials and methods


Database and secondary structure prediction

Oligonucleotide sequences and their target RNA sequences were downloaded from the database at the University of Utah, and are available as Supplementary material. The dataset contains 348 oligonucleotide sequences ranging in length from 10 to 22 nucleotides (nt) with an average of 19.5 nt, and 21 target RNA sequences ranging in length from 400 to 7135 nt with an average of 3938 nt. The number of target RNA sites is larger than the number of ASOs, since some ASOs bind to more than one target site, on the same or on different RNA sequences. Gene expression was measured as the ratio of the level of a particular mRNA or protein in cells treated with the experimental antisense oligonucleotide to the level in cells treated with a control oligonucleotide. A detailed description of the data was given previously [Giddings et al., 2000]. In this article, the antisense activity is expressed as the fractional reduction in the level of gene expression caused by the ASO. For instance, if the level of the target mRNA drops to 30% of the control level after ASO treatment, the antisense activity is 70%, or 0.7. A greater value of antisense activity means better gene silencing.
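The conversion from a measured expression ratio to antisense activity can be written as a one-line helper. The following is a minimal sketch, not code from the study; the function name `antisense_activity` is an assumption introduced here for illustration.

```python
def antisense_activity(expression_ratio: float) -> float:
    """Return antisense activity as the fractional reduction in expression.

    expression_ratio is the treated/control expression level, e.g. 0.30 when
    the target mRNA drops to 30% of the control level. (Hypothetical helper.)
    """
    if expression_ratio < 0:
        raise ValueError("expression ratio cannot be negative")
    return 1.0 - expression_ratio


# Worked example from the text: expression drops to 30% of control.
example = antisense_activity(0.30)
print(f"activity = {example:.2f}")
```

A ratio above 1 (expression higher than control) would give a negative value under this definition; how such cases were handled in the original data is not stated in the text.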

The program RNAfold in the Vienna RNA package (http://www.tbi.univie.ac.at/RNA/) was used to predict the secondary structures of antisense oligonucleotide and target RNA sequences. RNAfold implements two dynamic programming algorithms: the minimum free energy algorithm, which produces a single optimal structure [Zuker and Stiegler, 1981], and the partition function algorithm, which calculates base pair probabilities in the thermodynamic ensemble [McCaskill, 1990]. For each input sequence, the minimum free energy secondary structure and the corresponding free energy (in kcal/mol) predicted by RNAfold were recorded. For this study, local structural features were classified into four types according to the presence and orientation of stretches of residues predicted to form base pairs: stem-loops, turns at multi-branch loops, bulge loops, and linear structures. The classification was done by a simple program and then curated manually. A similar method, which exhaustively enumerates all local structural features for segments of size 3 (triplets), has recently been reported for classifying real and pseudo microRNA precursors [Xue et al., 2005].
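Recording the structure and free energy for each sequence amounts to parsing RNAfold's text output, in which each record is a sequence line followed by a dot-bracket structure line ending in the energy in parentheses. The sketch below illustrates one way to do this; it is not the authors' program, the names `FoldRecord` and `parse_rnafold` are hypothetical, and the sample text is made up for illustration rather than taken from a real fold.

```python
import re
from typing import List, NamedTuple


class FoldRecord(NamedTuple):
    name: str
    sequence: str
    structure: str   # dot-bracket notation
    energy: float    # minimum free energy in kcal/mol


# Structure line: dot-bracket string, whitespace, then "(energy)".
_STRUCT_LINE = re.compile(r"^([.()]+)\s+\(\s*(-?\d+(?:\.\d+)?)\)\s*$")


def parse_rnafold(text: str) -> List[FoldRecord]:
    """Parse FASTA-style RNAfold output into records (hypothetical helper)."""
    records = []
    name, seq = "", None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            continue
        match = _STRUCT_LINE.match(line)
        if match and seq is not None:
            records.append(
                FoldRecord(name, seq, match.group(1), float(match.group(2)))
            )
            name, seq = "", None
        else:
            seq = line
    return records


# Made-up sample in RNAfold's output layout (values are illustrative only).
SAMPLE = """>oligo_a
GGGGAAAACCCC
((((....)))) ( -1.20)
>oligo_b
AAAAAAAAAA
.......... (  0.00)
"""

for rec in parse_rnafold(SAMPLE):
    print(rec.name, rec.structure, rec.energy)
```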


Statistical analysis

Statistical analysis was conducted at different levels. First, we studied the correlation between activities and structures of target RNA at the binding sites as described in Tab. 1. Second, we extended our analysis to antisense oligonucleotides, considering their structural types and stability. Pearson correlation and t-test statistics [Press et al., 1992] were used to analyze the data. The correlation coefficient r for two series of data x_i and y_i is defined by the following formula:

$$ r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}} $$

where \bar{x} and \bar{y} are the respective means of the two series. The value of r lies in the range [−1, 1], with −1 for perfect negative correlation, 0 for no correlation, and +1 for perfect positive correlation. For each calculated r, its significance is estimated by a t-test statistic. To identify how activities (as x_i) are correlated with a structural feature type (as y_i), y_i is assigned the value 1 if the structural feature at the binding site of the target RNA sequence is of the given type, and 0 otherwise. It is worth noting that, with Pearson correlation, there is no need for an artificial cut-off value to classify oligonucleotides as active or inactive. The results are discussed in the next section.
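The correlation against a 0/1 structural indicator can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions rather than the software used in the study: the function names are hypothetical, the toy data are invented, and the t statistic uses the standard form t = r·sqrt((n − 2)/(1 − r²)) for testing a Pearson coefficient, which is assumed here to be the test meant in the text.

```python
import math
from typing import List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two series of equal length >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def indicator(types: Sequence[str], given_type: str) -> List[float]:
    """1.0 where the site has the given structural type, 0.0 otherwise."""
    return [1.0 if t == given_type else 0.0 for t in types]


def t_statistic(r: float, n: int) -> float:
    """t statistic for a Pearson r from n pairs (n - 2 degrees of freedom)."""
    if abs(r) >= 1.0:
        raise ValueError("t statistic undefined for |r| = 1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Invented toy data: four binding sites with activities and structure types.
activities = [0.9, 0.8, 0.2, 0.1]
site_types = ["I", "I", "IV", "IV"]

r_stem = pearson_r(activities, indicator(site_types, "I"))
print(f"r (type I) = {r_stem:.3f}, t = {t_statistic(r_stem, len(activities)):.2f}")
```

Turning the t statistic into a P-value requires the Student t distribution with n − 2 degrees of freedom, which is left out here to keep the sketch dependency-free.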

For all 348 oligonucleotides, activities were also plotted against structural stability (free energy release); the plot is presented in the Results and discussion section.



Results and discussion

We have conducted statistical analyses for correlations between the antisense activity and the different structural elements. We first looked at the ASOs and the RNA target sites separately, and then we further examined the correlations when the different structural features at ASOs and the RNA target sites are combined, to test our hypothesis that a "fit" between structural features of an ASO and its RNA target may be important in determining gene silencing activities.


Antisense oligonucleotides and target RNA binding sequences can be classified into categories based on their secondary structural properties

The program RNAfold from the Vienna RNA package was used to predict the secondary structure of the target RNA sequences. To do this, we first folded the entire RNA molecules, and then identified the structural features of the ASO binding sequence in each of the folded RNAs. Secondary structural features at ASO binding sites are classified into four categories as described in Materials and methods and are shown in Fig. 1 and Tab. 1. Note that in the data set, some ASOs bind to multiple target sites on the same or different mRNAs. Therefore, the total number of target sequences (434) is greater than the number of ASOs under study (348). Among the categories, bulged sites are the most abundant. Stem-loops, turns and linear regions are present at about the same frequency. Since the sequences were collected from tested ASOs that were not designed according to structural features, we do not anticipate any deliberate selection of ASOs. Rather, this may reflect that bulges are common features among the mRNAs tested.
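The two-step procedure (fold the whole RNA, then read off the structure at each binding site) can be sketched as follows. This is a minimal illustration and not the authors' program: it assumes perfect complementarity between the ASO and its site, the function names are hypothetical, and the toy structure string is an arbitrary placeholder rather than a real fold.

```python
from typing import List, Tuple

# DNA base -> complementary RNA base (A pairs with U, T with A, G with C).
_DNA_TO_COMPLEMENT_RNA = str.maketrans("ACGT", "UGCA")


def target_site_rna(aso_dna: str) -> str:
    """RNA sequence (5'->3') that a DNA ASO (5'->3') is complementary to."""
    return aso_dna.upper().translate(_DNA_TO_COMPLEMENT_RNA)[::-1]


def binding_sites(target_rna: str, structure: str, aso_dna: str) -> List[Tuple[int, str]]:
    """Return (start, local dot-bracket) for every exact ASO site in the target.

    `structure` is the dot-bracket fold of the ENTIRE target RNA, so each
    returned slice shows the local features in their whole-molecule context.
    """
    if len(target_rna) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    rna = target_rna.upper().replace("T", "U")
    site = target_site_rna(aso_dna)
    hits = []
    start = rna.find(site)
    while start != -1:  # an ASO may match more than one site
        hits.append((start, structure[start:start + len(site)]))
        start = rna.find(site, start + 1)
    return hits


# Toy input: the ASO GGCAA is complementary to UUGCC, which occurs twice.
toy_rna = "AAUUGCCAAUUGCC"
toy_fold = "..((......)).."   # arbitrary placeholder, not a predicted structure
print(binding_sites(toy_rna, toy_fold, "GGCAA"))
```

Collecting all hits rather than only the first mirrors the observation that one ASO can have several target sites, which is why the number of sites exceeds the number of ASOs.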



Figure 1: Schematic illustration of the structural features. Type I is stem-loop, type II is turns in multi-branched loop, type III is bulge loop, and type IV is unstructured linear strand. The thicker dark bars stand for Watson-Crick pairings, the thinner gray bars for phosphodiester bonds.


We also used RNAfold to predict the secondary structure (if any) of the antisense DNA oligonucleotides. We believe that the program, although designed for RNA folding, can give a reasonable prediction of the structure of single-stranded DNA oligonucleotides, as suggested in the documentation provided with the program. In addition to selecting DNA parameters, the wobble G-U base pair was allowed for DNA folding as G-T pairing. Base modification was not used in the study of ASO activity ([Giddings et al., 2000] and references cited therein), and therefore was not considered as a factor for folding. As would be expected for such short sequences, turns and bulges were not present. Therefore, only two categories of ASOs were found: stem-loop (type I, 198 occurrences) and linear (type IV, 150 occurrences). It should also be noted that the categories of RNA targets and ASOs are not directly comparable, since the local structures of RNA binding sites are predicted as part of the fold of the entire molecule. In fact, we expect that an ASO may adopt either the same structure as its target RNA sequence or a different one. We have analyzed antisense activity in relation to the features of the ASOs and target RNAs separately in the following sections.
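Because only two ASO categories occur, the assignment reduces to checking whether the predicted structure contains any base pair. The sketch below assumes that rule; the function name `classify_aso` and the example structures are hypothetical and invented for illustration.

```python
from collections import Counter


def classify_aso(structure: str) -> str:
    """Assign an ASO to type I (stem-loop) or type IV (linear).

    Assumed rule: any predicted base pair in the dot-bracket string means
    stem-loop; a string of dots only means linear.
    """
    if set(structure) - set(".()"):
        raise ValueError("expected dot-bracket notation")
    return "I" if "(" in structure else "IV"


# Invented example structures for three short oligonucleotides.
example_structures = ["((((....))))", "............", "..(((...)))."]
print(Counter(classify_aso(s) for s in example_structures))
```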


Table 1: Correlation between activities of antisense oligonucleotides and structural features of target RNA sequences at the binding sites.
| Type of structure | Description | Number of target RNA sites | Average activity | Correlation coefficient (r) | Significance (t-test P-value)a |
|---|---|---|---|---|---|
| I | Stem-loop | 36 | 0.50 | 0.204 | < 0.00001** |
| II | Turning at multi-branch loop | 50 | 0.46 | 0.127 | 0.008* |
| III | Bulge loop | 299 | 0.39 | −0.130 | 0.003* |
| IV | Linear | 49 | 0.34 | −0.324 | < 0.00001** |

a The t-test software does not report exact P-values smaller than 0.00001.
** Very significant.
*  Significant.


Correlation between antisense activity and target RNA structure

Relative antisense activities from 0 to 1 are averaged for each category of target RNA structure (Tab. 1). Stem-loop structures have the highest average activity, while linear regions have the lowest. Average activities for turns and bulges lie in between. The differences in activity among the structural groups, albeit relatively small, indicate that the structural features of the target RNA are an important factor in determining antisense function. Some structures seem to be superior to others for antisense activity.

In order to better understand the relationship, Pearson correlation was calculated between the antisense activities and the structural features of the target RNA sequences at the binding sites. The results are listed in Tab. 1. Interestingly, stem-loops and turns are positively correlated with the activities, whereas bulges and linear regions are negatively correlated with the activities. Stem-loop and linear regions have the strongest correlation. In addition, the correlations for stem-loop and linear types are very significant, having a P-value of < 0.00001. Turns and bulges have a significance value of between 0.001 and 0.01. This suggests that, according to the structural classification described in Tab. 1, stem-loops are more suitable than less-structured regions to support antisense action. Surprisingly, linear regions are not very good candidates for antisense action.


Correlation between antisense activities and structural features of antisense DNA oligonucleotides

All 348 oligonucleotides were analyzed using the program RNAfold, which determines the secondary structure and minimum free energy for a given sequence. As discussed above, ASOs adopt only stem-loop or linear structures. We calculated the correlation between activities and structural features. The correlation coefficients are 0.067 for stem-loop and −0.067 for linear, respectively. Based on a t-test (significance P-value = 0.212 for both types), we conclude that there is no significant correlation between antisense activity and the secondary structure of oligonucleotides.
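As a rough check on these numbers, the standard t statistic for a Pearson coefficient can be evaluated by hand. The formula below is the textbook form and is assumed, not confirmed, to be the one used by the authors' software; n = 348 is the number of ASOs.

```latex
t = r\,\sqrt{\frac{n-2}{1-r^{2}}}
  = 0.067\,\sqrt{\frac{348-2}{1-0.067^{2}}}
  \approx 0.067 \times 18.64
  \approx 1.25
```

With 346 degrees of freedom, a t value of about 1.25 corresponds to a two-sided P-value of roughly 0.21, in line with the reported 0.212.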

Yet, it is likely that stable stem-loops present in ASOs may inhibit their activity, as suggested by previous work [Matveeva et al., 2003; 2004]. Therefore, we further analyzed the relationship between activity and structural stability of ASOs. Structural stability was estimated from the free energy calculated by RNAfold. A plot of free energy and antisense activity of all 348 ASOs is shown in Fig. 2. All linear ASOs have a free energy of 0 and are plotted on the top line of the graph. Stem-loop ASOs are distributed across most of the plot. It appears that the stem-loop ASOs with higher activity are more concentrated in the region of lower stability (close to 0 free energy) than the ones with lower activity. There is a significant (t-test P-value = 0.013) but weak correlation (correlation coefficient r = 0.133) between antisense activity and the free energy for the stem-loop ASOs; the correlation coefficient r for linear ASOs is zero since they all have 0 free energy. This result suggests that more active stem-loop ASOs tend to have less stable secondary structures, and is consistent with previous observations [Matveeva et al., 2003; 2004].



Figure 2: Plot of antisense activities versus minimal free energy of antisense oligonucleotides. Data from a total of 348 DNA oligonucleotides are plotted. Open squares (□) stand for type I oligonucleotide structures (stem-loops), and filled triangles (▲) for type IV (linear). Only types I and IV are observed for ASOs.


Interaction of ASO and target RNA structures

Observations made in the above sections prompted us to hypothesize that a "fit" between structural features of an ASO and its RNA target may be important in determining gene silencing activities. To test this, we analyzed the antisense activity of ASOs, either stem-loop or linear, on each type of target RNA structure. The results are summarized in Tab. 2. When stem-loop ASOs were used on stem-loop target RNA, high average activity (0.58) was achieved with a positive correlation (r = 0.31). Stem-loop ASOs are also effective on linear RNA targets, with an average activity of 0.43 and a positive correlation (r = 0.29). The correlations of both matching pairs are significant based on a t-test, at the P < 0.10 and P < 0.05 levels, respectively. In contrast, linear ASOs are significantly less effective on stem-loop and linear target RNAs, with negative correlations (r = −0.44 and −0.34, respectively). Other combinations of ASO and target types show insignificant or no correlations. In conclusion, stem-loop ASOs are able to target type I and type IV RNA better than linear ASOs.


Table 2: Correlation between activities of linear or stem-loop antisense oligonucleotides and structural features of target RNA sequences at the binding sites
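| Type of ASO | Type of target RNA structure | Average activity of ASOs | Correlation coefficient (r) | Significance (t-test P-value) |
|---|---|---|---|---|
| I | I | 0.58 | 0.31 | 0.08* |
| I | II | 0.41 | −0.11 | 0.2 |
| I | III | 0.39 | 0.05 | 0.3 |
| I | IV | 0.43 | 0.29 | 0.04** |
| IV | I | 0.33 | −0.44 | 0.01** |
| IV | II | 0.53 | 0.17 | 0.5 |
| IV | III | 0.40 | 0.01 | 0.09* |
| IV | IV | 0.24 | −0.34 | 0.02** |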
** Significant at < 0.05 level.
*  Significant at < 0.10 level.
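The average-activity column of such a stratified table can be produced by grouping observations by the (ASO type, target type) pair. The sketch below covers only the averaging step, since the text does not spell out how the per-pair correlation coefficients were computed; the function name and the toy records are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def mean_activity_by_pair(
    records: Iterable[Tuple[str, str, float]]
) -> Dict[Tuple[str, str], float]:
    """Average activity for each (ASO type, target type) combination.

    Each record is (aso_type, target_type, activity); an ASO with several
    target sites contributes one record per site.
    """
    totals = defaultdict(lambda: [0.0, 0])  # pair -> [sum, count]
    for aso_type, target_type, activity in records:
        acc = totals[(aso_type, target_type)]
        acc[0] += activity
        acc[1] += 1
    return {pair: total / count for pair, (total, count) in totals.items()}


# Invented toy records, not values from the dataset.
toy_records = [
    ("I", "I", 0.6),
    ("I", "I", 0.8),
    ("IV", "I", 0.2),
    ("IV", "IV", 0.3),
]
for pair, mean in sorted(mean_activity_by_pair(toy_records).items()):
    print(pair, round(mean, 2))
```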


Our results suggest that a fit between structural features of ASO and target RNA is important for achieving good antisense activity. This is consistent with previous studies of the activity of naturally occurring antisense RNA species. Work with antisense RNA suggested that bulged regions in target RNA are important for the interaction with naturally occurring bulged antisense RNAs [Hjalt and Wagner, 1992]. Bulged residues in the antisense RNA play important roles in binding to the target RNA [Kolb et al., 2001], and loop sizes in the bulges have strong effects [Hjalt and Wagner, 1992; 1995]. Antisense/target structural matching may be a general mechanism for antisense activity, although there may be differences between the mechanisms of RNA-RNA and DNA-RNA interactions.

Antisense activity is determined by multiple factors [Branch, 1998]. While our analysis suggests that certain secondary structural features of targets and ASOs play a role in determining antisense activity, other important factors certainly exist. In fact, there are ASOs of varying levels of activity in each structural category analyzed, indicating the presence of other determinants not resolved here. In addition, some features may be more important than others under certain conditions. In this regard, our results provide a different but not conflicting point of view from some previous studies. It has been shown experimentally that the gene-silencing activity of ASOs is higher on single-stranded RNA than on more structured targets [Vickers et al., 2000]. ASO activity was also shown to be significantly correlated with predicted single-stranded structures in an mRNA [Ding and Lawrence, 2001]. These studies characterized activity determinants for a relatively small number of well-defined targets and ASOs without considering possible structural features of the ASOs. Our analysis, based on a larger number of observations, suggests that structural features other than linear regions can also be good antisense targets, and that ASO structure may play a role in potency. Further statistical analysis including additional features such as sequence motifs [Matveeva et al., 2000] and nucleotide content may further refine the model, which should be useful for automated design of antisense and other therapeutic nucleic acids. Such statistical analyses can also help explain why some predictive methods based on both sequence and secondary structural information, such as [Craig and Liao, 2006], have attained higher accuracy than methods using sequence information alone.

In a recent paper, RNAcofold, a standard dynamic programming method based on the partition function and base pairing probabilities, was proposed to predict the cofolding of interacting RNAs [Bernhart et al., 2006]. A similar method for predicting DNA-RNA interactions, should it exist, would be able to explain thermodynamically the correlations between stem-loop ASOs and stem-loop target sites. Yet, as discussed in [Bernhart et al., 2006], while it is relatively easy to adapt their method to DNA-DNA interactions, it is much more complicated to handle DNA-RNA interactions, because a complete set of DNA-RNA parameters is not available at present and because pure RNA and pure DNA loops are difficult to distinguish from mixed RNA-DNA loops. Because of these difficulties, developing a similar method for DNA-RNA interactions is beyond the scope of the current paper and should be pursued in future work.



Acknowledgements

The authors thank Dr. Olga V. Matveeva and colleagues for making the database of ASO activities and sequences available for the analysis in this work. We are grateful to Wanda Dominger for editing the manuscript. The authors are thankful to the anonymous reviewer for helpful comments.




References