When performing RNAi experiments using siRNA duplexes, the first critical challenge is the design of efficient siRNA.
Recent studies on the importance of mRNA accessibility for the efficiency of siRNA silencing have led MWG Biotech to create an on-line design tool that allows researchers to view the mRNA target site.
RNAi mechanism and history.
RNAi (RNA interference) is the gene silencing process caused when the presence of double stranded RNA induces cleavage of the mRNA.
It is used as a technology for analysing gene function and is a powerful tool for drug target validation.
There is also high expectation for siRNA (small interfering RNA) as a tool for in vivo investigation.
The RNAi mechanism starts with either an endogenous miRNA (microRNA) or an exogenous long dsRNA (double stranded RNA) that is processed by an RNase III family enzyme, termed Dicer, resulting in a small interfering RNA (siRNA), a 21-23 nucleotide RNA duplex.
It is composed of a 19mer sequence with symmetric 2-3nt 3' overhangs and 5' phosphate groups.
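The duplex layout described above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, the overhang is fixed at two nucleotides, and "UU" is an assumed default overhang sequence that the text does not specify.

```python
from typing import Tuple

# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return rna.translate(COMPLEMENT)[::-1]


def make_duplex(core_sense: str, overhang: str = "UU") -> Tuple[str, str]:
    """Build (sense, antisense) strands, both written 5'->3'.

    core_sense is the 19mer target sequence; DNA input (T) is converted to RNA (U).
    The overhang (an assumption, default "UU") is appended to the 3' end of each strand.
    """
    core = core_sense.upper().replace("T", "U")
    if len(core) != 19 or set(core) - set("ACGU"):
        raise ValueError("core must be a 19mer of A, C, G, U/T")
    sense = core + overhang
    antisense = reverse_complement(core) + overhang
    return sense, antisense
```

With the default overhang, both strands are 21 nucleotides long, and the 19 core positions of the antisense strand pair with the sense core.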
The siRNA associates with cellular proteins forming the RNA-induced silencing complex (RISC).
RISC contains a helicase that unwinds the siRNA duplex.
Afterwards, the siRNA antisense strand guides the RISC to the target mRNA resulting in endonucleolytic cleavage.
Introducing dsRNA longer than 30nt into mammalian cells provokes the antiviral/interferon pathway, so it was proposed to shortcut the Dicer process and introduce siRNA directly.
One of the challenges in applying small interfering RNAs is the identification of a potent siRNA on the corresponding mRNA.
Two hurdles that must be overcome are correct target identification and minimisation of potential off-target effects.
The in silico design process is therefore the first and most important step.
Over time specific rules for siRNA design have been developed.
Tuschl and colleagues searched systematically for effective target sites on mRNAs and applied a set of rules for detecting 21mer target sites, searching for distinct motifs on the target gene DNA coding sequence (cds).
Another step forward in the siRNA design was the work of Reynolds et al.
They screened 180 siRNAs targeting the mRNA of two different genes and derived a scoring system to evaluate the efficiency of in silico designed siRNAs.
They found that features such as a low GC content, a lack of internal repeats and an A/U rich 5' end enhance the silencing effect of siRNA.
Since then, various other siRNA design algorithms have been suggested, but a combination of Tuschl's motif search criteria on the cds to find potential target sites and Reynolds' scoring system for validating the efficiency of the potential siRNA has become the most accepted design method to find effective siRNAs.
Optimisation methods.
Several methods have been developed to improve the efficiency of siRNA molecules.
The use of pools, mixtures of siRNA sequences all targeting the same mRNA, has been proposed as being more potent in down regulating gene function, but the risk of off-target effects has not been fully clarified.
Recently, Siolas et al. found that 27mer RNA duplexes offer a potency advantage over the more traditional 21mer siRNA.
These 27mers take the basic 19mer siRNA design and add four additional bases to each end, creating a blunt duplex rather than one with overhangs.
These longer dsRNA molecules are hypothesised to load more efficiently into RISC, since they undergo natural cleavage by Dicer.
These 27mers have also been shown to silence mRNAs where 21mer duplexes have previously failed.
These results directed us to implement the 27mer siRNA as an ordering option.
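As a rough sketch of the 27mer construction, the snippet below extends a 19mer target site by four bases on each side (19 + 4 + 4 = 27). It assumes that the additional bases are taken from the flanking target sequence, which the text does not state explicitly. The function name and parameters are hypothetical.

```python
def extend_to_27mer(target: str, start: int, core_len: int = 19, flank: int = 4) -> str:
    """Return the core target site plus `flank` bases on each side.

    `start` is the 0-based index of the first base of the 19mer core in `target`.
    Assumption: the extra bases come from the target sequence itself.
    """
    lo = start - flank
    hi = start + core_len + flank
    if lo < 0 or hi > len(target):
        raise ValueError("not enough flanking sequence around the core site")
    return target[lo:hi]
```

Because the result is a contiguous stretch of the target, pairing it with its full reverse complement gives a blunt duplex without overhangs.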
Using bioinformatics software to reduce off-target effects has also become standard practice.
Typically, a Blast search on organism specific mRNA databases is used to identify these off-targets.
Recent publications analysing the relationship between local free energy and siRNA efficiency (4, 5, 6) have shown that the mRNA target site secondary structure is of great importance for siRNA efficiency.
For example, open regions with a high number of unpaired nucleotides, resulting in improved potency, have been correlated to low negative local free energy values.
The MWG online siRNA design tool.
MWG's web based online siRNA design tool has a fully transparent and flexible design process.
All design parameters can be adapted if necessary.
The DNA coding sequence (cds) of one or several genes is input either in Fasta format or as NCBI accession numbers.
If needed, the coding region of an input sequence can be manually defined.
The design algorithm then searches on the cds for potential target sites using all Tuschl motifs by default.
The search can be limited to a single motif.
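A motif search of this kind could look roughly like the sketch below. The default pattern AA(N19)TT is one commonly cited Tuschl-style motif and is used here only as an assumed example; the exact motif set used by the design tool is not listed in this text, and the function name is hypothetical.

```python
import re
from typing import List, Tuple


def find_target_sites(cds: str, motif: str = r"AA[ACGT]{19}TT") -> List[Tuple[int, str]]:
    """Return (0-based start, matched sequence) for every motif occurrence in the cds.

    A lookahead is used so that overlapping candidate sites are all reported.
    """
    pattern = re.compile(f"(?=({motif}))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(cds.upper())]
```

Limiting the search to a single motif then amounts to passing one pattern; searching with all motifs would repeat the call for each pattern and merge the hits.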
The next step in the design process is designing the siRNA to all corresponding target sites.
The siRNA design avoids stretches of more than three identical nucleotides and a U at the 3' end.
All these restrictions are active by default but can be switched off one by one.
Further restrictions influencing the siRNA design are the GC content (default is between 30% and 53%) and the distance of the siRNA from the start codon and the stop codon respectively.
The distance from start and stop codon is 100 nucleotides by default and is critical for the design especially for short cds.
If the design algorithm fails to find any siRNA, it is recommended to vary these distances first.
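The default restrictions described above can be expressed as a simple filter. This is a minimal sketch, not the tool's implementation. It assumes that the 3' end rule applies to the last base of the 19mer on the sense strand and that distances are counted from the first and last base of the cds. All names are hypothetical.

```python
import re


def gc_percent(seq: str) -> float:
    """GC content of a sequence in percent."""
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def passes_filters(cds: str, start: int, length: int = 19,
                   min_gc: float = 30.0, max_gc: float = 53.0,
                   min_dist: int = 100) -> bool:
    """Check one candidate site (0-based start in the cds) against the defaults."""
    site = cds[start:start + length].upper().replace("T", "U")
    if len(site) != length:
        return False
    # Reject runs of more than three identical nucleotides (four or more in a row).
    if re.search(r"(.)\1{3}", site):
        return False
    # Reject a U at the 3' end (assumed: last base of the sense-strand 19mer).
    if site.endswith("U"):
        return False
    if not min_gc <= gc_percent(site) <= max_gc:
        return False
    # Keep the site at least min_dist nucleotides away from both ends of the cds.
    if start < min_dist or len(cds) - (start + length) < min_dist:
        return False
    return True
```

For a short cds, lowering `min_dist` is the first parameter to relax, in line with the recommendation above.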
As an extra option, a Blast search can be added using databases of human, mouse or rat mRNAs.
The Blast search does not directly influence the siRNA design but detects possible off-target effects and can be reviewed on a separate Blast output page.
For each siRNA, all Blast hits are shown directly linked to the NCBI web page.
The hits are marked in green if the Blast search results in one perfect match.
Yellow indicates several hits with the same gene name referencing alternative splice forms, and red warns that several hits with different gene names were found.
The secondary structure of the mRNA target site plays an important role in siRNA efficiency.
For this reason we implemented a secondary structure view of the mRNA, with the target site coloured red, using the RNAfold program from the Vienna package 1.4.
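The colour coding can be summarised as a small decision function. The sketch below takes the gene names of the Blast hits for one siRNA. The function name is hypothetical, and returning None when there are no hits is an assumption, since the text does not describe that case.

```python
from typing import Iterable, Optional


def blast_colour(hit_gene_names: Iterable[str]) -> Optional[str]:
    """Map the Blast hits of one siRNA to the colour code described in the text."""
    names = list(hit_gene_names)
    if not names:
        return None          # assumption: the no-hit case is not covered by the text
    if len(names) == 1:
        return "green"       # exactly one match
    if len(set(names)) == 1:
        return "yellow"      # several hits, same gene name (alternative splice forms)
    return "red"             # several hits with different gene names
```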
As an additional useful parameter, we calculate the number of bound bases of the target site.
With these new options the user can quickly identify highly efficient siRNAs.
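The number of bound bases can be read directly from a folding result. The sketch below assumes the structure is available in the dot-bracket notation that RNAfold prints, where "(" and ")" mark paired bases and "." marks unpaired ones. The function name and the 0-based indexing are illustrative choices.

```python
def bound_bases(structure: str, start: int, length: int = 19) -> int:
    """Count paired positions within the target site of a dot-bracket structure."""
    site = structure[start:start + length]
    if len(site) != length:
        raise ValueError("target site lies outside the folded sequence")
    return sum(1 for symbol in site if symbol in "()")
```

A low count points to a target site that is mostly unpaired in the predicted structure.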
Folding with RNAfold is CPU- and memory-intensive, and the cost depends strongly on the sequence length.
To avoid overloading the machines, we set a maximum cutoff of 3000 nucleotides for folding input.
The mean length of human coding sequences is 1443 nucleotides (Unigene Release 146).
Another argument for setting a cut-off is the waiting time, which is a critical point in web based applications.
Finally, the reliability of the predicted fold decreases with growing sequence length.
Influence of the secondary structure.
For antisense oligodeoxyribonucleotide (ODN) applications it was shown that the secondary structure of the target site has an important impact on down regulation efficiency.
It was recommended for the in silico design of ODN to find target sites on the mRNA with >10nt consecutive sequence stretches not involved in base pairing.
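The ODN recommendation translates into a check for the longest run of unpaired bases. This is a minimal sketch assuming a dot-bracket structure string as input; the function names are hypothetical, and ">10nt" is read as at least 11 consecutive unpaired positions.

```python
import re


def longest_unpaired_run(structure: str) -> int:
    """Length of the longest stretch of '.' (unpaired bases) in a dot-bracket string."""
    return max((len(run) for run in re.findall(r"\.+", structure)), default=0)


def has_open_stretch(structure: str, min_len: int = 11) -> bool:
    """True if the structure contains at least min_len consecutive unpaired bases."""
    return longest_unpaired_run(structure) >= min_len
```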
In contrast to ODN, where the oligodeoxyribonucleotide directly binds to the mRNA, in the RNAi pathway, it is not fully understood how the siRNA-mRNA target hybrid is formed.
The importance of secondary structure is supported by studies showing that hairpin structures within the target region dramatically reduce the potency of gene silencing by siRNA.
There seems to be a linear correlation between siRNA silencing of the target gene and the local free energy of the target region.
A low negative local free energy, which correlates to an open region with a high number of unpaired nucleotides, results in improved potency.
Luo and Chang found the number of hydrogen bonds formed by the target region to be a useful parameter in siRNA design.
The presence of a large number of unpaired nucleotides seems to be a marker for a highly accessible target site.
Heale et al. used local structural features to calculate the secondary structure of the target site.
The accessible target site prediction is the calculated difference between the free energy of the secondary structure and the theoretical maximum stability of antisense binding based on a perfect helix formed between the target site and the antisense sequence.
This again results in a better accessibility for target sites with low negative local free energy.
Yiu and colleagues found target sites enclosed by neighbouring branches to be less effective.
In contrast to all other publications reviewed, their filtering algorithm removes target sites on or near to bigger loops ignoring the local free energy.
Interestingly, they found that applying their filtering algorithm to various design tools, including those with the Tuschl/Reynolds combination, decreases the output of ineffective siRNAs by up to 53%.
Simply adding this filter to the basic Tuschl/Reynolds combination identified effective siRNAs more reliably than other commercial design tools currently in use.
This shows the high potential of secondary structure for the in silico design of effective siRNAs.
Conclusion.
The mRNA target site secondary structure is an important factor in evaluating an efficient siRNA candidate.
The ability to visually check the target site and obtain the number of bound bases is a new feature in commercial siRNA design tools and gives the user the opportunity to design highly efficient siRNAs.
The user can combine the Tuschl principles, a high Reynolds score, a low number of Blast hits, and a highly accessible target site found through the secondary structure view to order the most efficient siRNA as a 19mer or 27mer.
Being completely integrated into MWG Biotech's on-line ordering system, the design tool allows the user to directly order the siRNA with a simple mouse click.