

Computational Analysis of the Interplay Between RNA Structure and Function

Author: Elan A. Shatoff
Publisher:
Total Pages: 0
Release: 2021
Genre: Molecular structure
ISBN:


RNA is ubiquitous in the cellular environment, and it can function in innumerable ways with a variety of interaction partners. An RNA molecule's structure, in particular the set of base pairing interactions between the nucleotides of the molecule known as secondary structure, can help determine its function. Since most proteins can only bind to either single-stranded or double-stranded RNA, RNA secondary structure can also help determine where and how RNA-protein binding interactions occur. In this work I investigate computational models for RNA-protein interactions in a variety of different contexts. In Chapter 2 I probe the effect of single nucleotide variations on RNA-protein binding as mediated by RNA secondary structure. Single nucleotide variations are single nucleotide changes in an organism's genome that can often cause disease, and may do so through a number of different mechanisms. In this work we propose that sequence changes can affect accessibility to protein binding sites through changes in secondary structure, even when these sequence changes occur tens of nucleotides outside of protein binding sites. We find that single nucleotide variations can have a many fold effect on the binding affinity of proteins for RNA, and characterize the genome-wide effect of single nucleotide variations on HuR binding. HuR is a single-stranded RNA binding protein that binds to AU-rich sequences, and has links to diseases such as cancer. We also find an asymmetry in this effect for HuR, indicating that this effect may be under selection. Following the previous work, which utilizes a model incorporating single-stranded RNA binding proteins into RNA secondary structure folding, I introduce a model for incorporating double-stranded RNA binding proteins (dsRBPs) into RNA secondary structure partition function calculations in Chapter 3. The dsRBPs are an important but understudied class of proteins that are involved in a wide range of processes. 
We implement our model in the ViennaRNA package, and validate it by calculating a number of experimental observables for transactivation response element RNA-binding protein. We find that RNA secondary structure can have a many fold effect on the effective binding affinity of dsRBPs, and show that calculated affinities for pre-miRNA-like constructs correlate with experimentally measured processing rates. Our model provides a novel method for interrogating the interplay between dsRBPs and RNA secondary structure. In Chapter 4 I study RNA-protein interactions in a different context, and investigate the role of Shine-Dalgarno (SD) sequences in translation in the Bacteroidetes. The Bacteroidetes are a phylum of bacteria known to rarely use SD sequences, but after performing a survey of SD usage in the phylum we find that certain ribosomal protein genes utilize them, particularly rpsU. A cryo-electron microscopy structure of the ribosome from Flavobacterium johnsoniae, a member of the Bacteroidetes, also shows that S21, which is encoded by the ribosomal open reading frame rpsU, sequesters the anti-Shine-Dalgarno (ASD) sequence. In our survey of SD sequences we also find covariation between the SD sequence of rpsU and the ASD sequence. These observations suggest an autoregulatory model for S21 in the Bacteroidetes.
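The accessibility argument in this abstract can be illustrated with a toy calculation. The sketch below is not the thesis model and does not use ViennaRNA. It assumes a deliberately simplified energy model in which every allowed base pair (AU, GC, GU) contributes the same hypothetical energy, and it estimates the probability that a binding site is fully unpaired as the ratio of a constrained partition function to the unconstrained one. The function names, the pair energy, the value of RT, and the example sequences are all illustrative assumptions.

```python
import math

# Canonical and wobble pairs allowed in this toy model.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def partition_function(seq, blocked=frozenset(), pair_energy=-2.0,
                       min_loop=3, rt=0.616):
    """Partition function over non-crossing secondary structures.

    Assumptions (hypothetical, for illustration only): every base pair
    contributes `pair_energy` kcal/mol, hairpin loops need at least
    `min_loop` unpaired bases, and `rt` is roughly RT at 37 degrees C.
    Positions listed in `blocked` are forced to stay unpaired.
    """
    n = len(seq)
    weight = math.exp(-pair_energy / rt)
    # z[i][j] is the partition function of the half-open slice seq[i:j].
    z = [[1.0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            last = j - 1
            total = z[i][j - 1]  # case: the last base is unpaired
            if last not in blocked:
                # case: the last base pairs with position k
                for k in range(i, last - min_loop):
                    if k not in blocked and (seq[k], seq[last]) in PAIRS:
                        total += z[i][k] * weight * z[k + 1][last]
            z[i][j] = total
    return z[0][n]


def accessibility(seq, start, end, **kwargs):
    """Probability that positions start..end-1 are all unpaired."""
    site = frozenset(range(start, end))
    constrained = partition_function(seq, blocked=site, **kwargs)
    return constrained / partition_function(seq, **kwargs)


if __name__ == "__main__":
    # Made-up sequences: the variant changes one base outside the site.
    wild_type = "GGGAAAACCCAUUUAUUUA"
    variant = "GGGAAAACACAUUUAUUUA"
    site = (11, 19)  # hypothetical AU-rich binding site
    print("wild type:", accessibility(wild_type, *site))
    print("variant:  ", accessibility(variant, *site))
```

Because the site is simply forbidden from pairing in the constrained calculation, a mutation elsewhere in the sequence can still change the ratio by changing which competing structures are available.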


Computational Analysis and Prediction of RNA-protein Interactions

Author: Michael Uhl
Publisher:
Total Pages: 0
Release: 2022*
Genre:
ISBN:


Abstract: This dissertation is about the computational analysis and prediction of RNA-protein interactions. Ribonucleic acids (RNAs) and proteins both are essential for the control of gene expression in our cells. Gene expression is the process by which a functional gene product, namely a protein or an RNA, is produced from a gene, starting from the gene region on the DNA with the transcription of an RNA. Once regarded primarily as a messenger to transmit the protein information, recent years have seen RNA moving further into the biomedical spotlight, thanks to its increasingly uncovered roles in regulating gene expression. In addition, RNA has showcased its therapeutic potential, as famously demonstrated by the groundbreaking success of RNA vaccines in the COVID-19 pandemic. However, RNAs rarely function on their own: In humans, more than 1,500 different RNA-binding proteins (RBPs) are involved in controlling the various stages of an RNA's life cycle, creating a highly complex regulatory interplay between RNAs and proteins. It is therefore of fundamental importance to study these RNA-protein interactions, in order to deepen our understanding of gene expression. Over the last decade, CLIP-seq has become the dominant experimental method to identify the set of cellular RNA binding sites for an RBP of interest. However, analysing the resulting CLIP-seq data can be challenging, as there are many analysis steps and CLIP-seq protocol variants available, each requiring specific adaptations to the analysis workflow. Consequently, there is a need for analysis guidelines, providing easy access to tools, as well as the constant improvement of tools and workflows to increase the accuracy of the analysis results. The first set of works included in this thesis (publications P1, P4, and P5) deals with these topics, by providing a review article on CLIP-seq data analysis, as well as two articles on how to further improve CLIP-seq data analysis. 
Publication P1 supplies readers with an overview of tools and protocols, as well as guidelines to conduct a successful analysis, drawing largely from our own experience with analysing CLIP-seq data. Publication P4 demonstrates the issues current binding site identification tools have with CLIP-seq data from RBPs that bind to processed RNAs, and that the integration of RNA processing information improves the resulting binding site quality. On top of this, publication P5 presents Peakhood, the first tool that utilizes RNA processing information in order to increase the quality of RBP binding sites identified from CLIP-seq data. A natural drawback of experimental methods is that a target RNA needs to be sufficiently expressed in the observed cells for an RNA-protein interaction to be detected. Hence, since gene expression is a dynamic process that differs between cell types, time points, and conditions, a CLIP-seq experiment cannot recover the complete set of cellular RBP binding sites. This creates a demand for computational methods which can learn the binding properties of an RBP from existing CLIP-seq data, in order to predict RBP binding sites on any given target RNA. Besides interacting with proteins, RNAs can also interact with other RNAs, further increasing the amount of possible regulatory interactions between RNAs and proteins. In this regard, long non-coding RNAs (lncRNAs), a large class of non-protein-coding RNAs whose functions are still vastly unexplored, have become especially important, as it has been shown that they can engage in RNA-RNA interactions, whose regulatory mechanisms also include RNA-protein interactions. As such mechanistic studies are typically slow and expensive, computational tools that combine RNA-protein and RNA-RNA interaction predictions to infer potential mechanisms could be of great help, e.g., by screening a set of target RNAs and proteins and suggesting plausible mechanisms for experimental validation. 
The second set of works included in this thesis (publications P2 and P3) thus deals with the computational prediction of RNA-protein interactions, RNA-RNA interactions and the functional mechanisms that can be inferred from these interactions. Publication P2 introduces MechRNA, the first tool to infer functional mechanisms of lncRNAs based on their predicted interactions with RBPs and other RNAs, as well as gene expression data. We demonstrated MechRNA's capability to identify formerly described lncRNA mechanisms and experimentally validated one prediction, underlining its value for functional lncRNA studies. Finally, publication P3 presents RNAProt, a flexible and performant RBP binding site prediction tool based on recurrent neural networks. Compared to other popular deep learning methods, RNAProt achieves state-of-the-art predictive performance, as well as superior runtime efficiency. In addition, it is more feature-rich than any other available method, including the support of user-defined predictive features. We further showed that its visualizations agree with known RBP binding preferences, and demonstrated that its additional predictive features can increase the specificity of predictions.


Computational Characterization of Protein-RNA Interactions and Implications for Phase Separation

Author: Alexandros Armaos
Publisher:
Total Pages: 110
Release: 2020
Genre:
ISBN:


Contrary to what was previously thought, the role of RNA is not only to carry the genetic information from DNA to proteins. Indeed, RNA has proven to be implicated in more complex cellular processes. Recent evidence suggests that transcripts have a regulatory role in gene expression and contribute to the spatial and temporal organization of the intracellular environment. They do so by interacting with RNA-binding proteins (RBPs) to form complex ribonucleoprotein (RNP) networks; however, the key determinants that govern the formation of these complexes are still not well understood. In this work, I will describe algorithms that I developed to estimate the ability of RNAs to interact with proteins. Additionally, I will illustrate applications of computational methods to propose an alternative model for the function of Xist lncRNA and its protein network. Finally, I will show how computational predictions can be integrated with high-throughput approaches to elucidate the relationship between the structure of the RNA and its ability to interact with proteins. I conclude by discussing open questions and future opportunities for computational analysis of the cell's regulatory network. Overall, the underlying goal of my work is to provide biologists with new insights into the functional association between RNAs and proteins as well as with sophisticated tools that will facilitate their investigation of the formation of RNP complexes.


Computational Analysis and Annotation of Structurally Functional RNAs

Author: Milad Miladi
Publisher:
Total Pages:
Release: 2021
Genre:
ISBN:


Abstract: This work is a dissertation about computational methodologies and analyses of ribonucleic acid (RNA) molecules based on their sequence and structure properties. RNA is an essential molecule of living cells that acts as the carrier of the genetic information for proteins and also as a regulatory functional element that contributes to cellular mechanisms. While less than 3% of the human genome encodes known proteins, more than 85% of the genome is transcribed into RNA. For the human genome alone, tens of thousands of non-coding RNA genes exist, bearing pervasive functions. Despite the important roles of RNAs, the functional and regulatory mechanisms of a large number of non-coding and protein-coding RNAs are either unknown or poorly understood. To address this challenge, computational methodologies are a vital asset for a scalable and systematic analysis and annotation of RNAs with unknown functions. RNAs are polymer molecules that fold into complex structures within the cells. For a functional RNA, its folded structure often plays an important role and is better conserved than the polymer sequence through evolution. Therefore, it is essential to consider both the sequence and structure information for the task of annotation and discovery of functional RNAs using computational approaches. Comparative methodologies utilise the evolutionary conservation information of both sequence and structure. They are pivotal assets for providing reliable structure prediction and annotation of functional RNAs. Over the past decade, millions of RNA sequences have been obtained using techniques such as genomic screens and high-throughput sequencing experiments. These techniques produce up to several thousands or even millions of sequences and can be applied over all the domains of life. 
Analysing these large collections of sequences, for the evaluation and annotation of functional RNAs, demands efficient optimisation algorithms with sufficiently accurate models. Additionally, since the cells rely on heterogeneous molecules and mechanisms to function, integrative analysis of biological data is commonly required nowadays. Therefore, computational approaches based on techniques such as machine learning are needed to provide comprehensive strategies with high efficiencies also at different levels of the data. This thesis addresses some substantial challenges for the evaluation and annotation of functional RNAs by presenting novel contributions using computational analysis, optimisation algorithms, comparative methodologies, and clustering approaches. The personal contributions are presented in the form of six works, corresponding to six publications, from three domains for the tasks of annotation, discovery, and analysis of functional RNAs. SPARSE and Pankov are two novel algorithms contributed for the problem of simultaneous alignment and folding (SA&F) of RNAs. SPARSE achieves a quadratic complexity without sequence-based heuristics by utilising a strong sparsification over the ensemble of possible secondary structure formations. The second SA&F algorithm, Pankov, enables a fast simultaneous alignment and folding of RNAs while adhering to the nearest-neighbour thermodynamics principle of the standard RNA folding model. Pankov provides the most accurate SA&F probabilistic energy model to date, by mapping the nearest-neighbour principle to a Markov scheme using conditional in-loop probabilities. RNAscClust and GraphClust2 are presented for scalable clustering of RNA sequences based on sequence and structure. The RNAscClust methodology enables a linear-time clustering of paralogous RNAs based on their sequence and structure. 
Both tools are machine learning approaches that utilise graph kernel and locality-sensitive hashing schemes to support the clustering of input entries in an asymptotically linear time. RNAscClust incorporates orthogonal structure conservation to enhance the clustering and annotation performance. GraphClust2 is an integrative approach for the accessible and scalable clustering of RNAs to identify structurally conserved non-coding RNAs and motifs. GraphClust2 outperforms its predecessor and importantly supports diverse sources of genomic and experimental data in an accessible fashion. GraphClust2 bridges the gap between high-throughput sequencing experiments and the structure-based methodologies for functional RNA discovery. The final topic covered by this thesis is the mutational analysis of RNA secondary structure and function. A large-scale compilation and statistical analysis of somatic cancer synonymous mutations is presented. The analysis and experiments reveal that the synonymous mutations, despite not changing the encoded protein sequence, can have substantial impacts on the gene expression levels and considerably disrupt the local secondary structure of mRNAs. Finally, MutaRNA is presented as an accessible web-based solution for evaluating the impact of a mutation on the RNA secondary structure and visualising the complex impacts of the mutation on the intra-molecular interaction potentials in an intuitive manner.


RNA 3D Structure Analysis and Prediction

Author: Neocles Leontis
Publisher: Springer Science & Business Media
Total Pages: 402
Release: 2012-06-05
Genre: Science
ISBN: 3642257402


With the dramatic increase in RNA 3D structure determination in recent years, we now know that RNA molecules are highly structured. Moreover, knowledge of RNA 3D structures has proven crucial for understanding in atomic detail how they carry out their biological functions. Because of the huge number of potentially important RNA molecules in biology, many more than can be studied experimentally, we need theoretical approaches for predicting 3D structures on the basis of sequences alone. This volume provides a comprehensive overview of current progress in the field by leading practitioners employing a variety of methods to model RNA 3D structures by homology, by fragment assembly, and by de novo energy and knowledge-based approaches.


Detection of RNA Structural Heterogeneity

Author: Sitara C. Persad
Publisher:
Total Pages: 78
Release: 2018
Genre:
ISBN:


Beyond its function as a messenger molecule in protein formation, the linear sequence of RNA is capable of folding into higher order structures which may interact with other molecules and play key functional roles in the cell. Current methods for characterizing RNA structure via experimental probing are limited to the population average, which obscures structural heterogeneity. This thesis addresses the problem of inferring structural heterogeneity from dimethyl sulphate (DMS) probing data. First, we analysed sequence data to uncover experimental biases and developed simulations for sample structures. We proposed and evaluated unsupervised machine learning methods to infer structural heterogeneity. Second, we designed and implemented runDMC, a web platform designed to facilitate the discovery of alternative RNA secondary structures, using in vivo chemical probing data and machine learning clustering methods. runDMC accepts experimental probing data and provides an intuitive, user-friendly interface for discovery of alternative structures. We anticipate that runDMC will facilitate the widespread use of DMS probing and analysis in the biological community, enabling the discovery of more RNA alternative structures.


Computational Genomics with R

Author: Altuna Akalin
Publisher: CRC Press
Total Pages: 462
Release: 2020-12-16
Genre: Mathematics
ISBN: 1498781861


Computational Genomics with R provides a starting point for beginners in genomic data analysis and also guides more advanced practitioners to sophisticated data analysis techniques in genomics. The book covers topics from R programming, to machine learning and statistics, to the latest genomic data analysis techniques. The text provides accessible information and explanations, always with the genomics context in the background. It also contains practical and well-documented examples in R so readers can analyze their data by simply reusing the code presented. As the field of computational genomics is interdisciplinary, it requires different starting points for people with different backgrounds. For example, a biologist might skip sections on basic genome biology and start with R programming, whereas a computer scientist might want to start with genome biology. After reading: You will have the basics of R and be able to dive right into specialized uses of R for computational genomics such as using Bioconductor packages. You will be familiar with statistics, supervised and unsupervised learning techniques that are important in data modeling, and exploratory analysis of high-dimensional data. You will understand genomic intervals and operations on them that are used for tasks such as aligned read counting and genomic feature annotation. You will know the basics of processing and quality checking high-throughput sequencing data. You will be able to do sequence analysis, such as calculating GC content for parts of a genome or finding transcription factor binding sites. You will know about visualization techniques used in genomics, such as heatmaps, meta-gene plots, and genomic track visualization. You will be familiar with analysis of different high-throughput sequencing data sets, such as RNA-seq, ChIP-seq, and BS-seq. You will know basic techniques for integrating and interpreting multi-omics datasets. 
Altuna Akalin is a group leader and head of the Bioinformatics and Omics Data Science Platform at the Berlin Institute of Medical Systems Biology, Max Delbrück Center, Berlin. He has been developing computational methods for analyzing and integrating large-scale genomics data sets since 2002. He has published an extensive body of work in this area. The framework for this book grew out of the yearly computational genomics courses he has been organizing and teaching since 2015.
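One of the tasks named in the description, calculating GC content for parts of a genome, is simple enough to sketch here. The book works in R; the snippet below uses plain Python purely for illustration and is not taken from the book. The function names, the window size, and the example sequence are hypothetical.

```python
def gc_content(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window, step):
    """GC content of each full-length window, as (start, fraction) pairs."""
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Made-up sequence; a real analysis would read it from a FASTA file.
    example = "AAGGCCTT"
    for start, fraction in windowed_gc(example, window=4, step=2):
        print(start, fraction)
```

Sliding windows are used here because "parts of a genome" usually means regions rather than the whole sequence; windows that would run past the end of the sequence are skipped rather than truncated.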


Physics-based Modeling for RNA Folding

Author: Xiaojun Xu
Publisher:
Total Pages:
Release: 2014
Genre: Electronic dissertations
ISBN:


RNA (ribonucleic acid) molecules play a variety of crucial roles in cellular functions at the level of transcription, translation and gene regulation. Many RNAs have also been found to play catalytic and regulatory roles. RNA functions are tied to structures and dynamics. To understand the quantitative relationship between RNA functions and RNA 3D structures, stabilities and kinetics, we need a predictive model for RNA folding. We focus on the development of physics-based models for RNA folding. RNA molecules often undergo multiple conformational switches in order to perform their cellular functions, suggesting that RNA-involved processes can be kinetically controlled. Therefore, a full characterization of RNA folding and function should consider not only the native structure but also the folding kinetics. The RNA hairpin is one of the most fundamental motifs in RNA structures. Predicting the physical mechanism of the folding and conformational switch for hairpins has a far-reaching impact on our understanding of more complex RNA folding problems. In the first project, we used the kinetic Monte Carlo method to explore the detailed kinetic mechanism for the conformational switches between bistable RNA hairpins. We found three types of conformational switch pathways for RNA hairpins: refolding after complete unfolding, folding through basepair-exchange pathways and through pseudoknot-assisted pathways, respectively. This project led to new insights about RNA folding: tertiary structure such as pseudoknots can help secondary structure folding by lowering the kinetic barrier. RNA structure prediction from the nucleotide sequence remains an unsolved problem for both 2D and 3D structures. One of the key issues in the current RNA structure prediction methods is the lack of knowledge about the tertiary interactions. Tertiary interactions are crucial for RNA functions. 
Therefore, a model that can predict RNA structures with tertiary interactions is highly needed. In the second project, we developed a statistical mechanical model for the folding of tertiary structures with loop-loop kissing interactions. Based on a coarse-grained RNA conformational model, we developed an algorithm to compute the entropy parameters for the formation of the kissing structures. The model enables computational prediction of the structure and thermodynamic stability of RNA tertiary folds with long-range loop-loop kissing contacts. The validity of the model is supported by the good agreement between theory and experiment for the native structures and thermodynamic stabilities of a series of RNA/RNA complex systems. For large RNAs, due to the increased complexity of the structures, the accuracy of computational predictions for RNA folding decreases. For such complex RNA structures, a novel structure determination strategy is to combine computational modeling with experimental data such as the reactivity profile of selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE), a method for chemical probing of RNA structure. Although the SHAPE technology has been widely applied to a broad range of RNA structure probing problems, the quantitative relationship between the structure/dynamics and SHAPE reactivity is not understood. In the third project, we developed a model to quantify the relationship between RNA three-dimensional structure and SHAPE reactivity. We developed an analytical function to rebuild SHAPE profiles from three-dimensional structures. The algorithm starts from RNA structures and combines nucleotide interaction strength and conformational propensity, ligand (SHAPE reagent) accessibility, and base-pairing pattern through a composite function to quantify the correlation between SHAPE reactivity and nucleotide conformational stability. 
Comparisons between predicted SHAPE profiles and experimental SHAPE data show high correlations, suggesting the validity of the extracted analytical function. The result supports our analysis of the key factors that determine the SHAPE reactivity from structure.
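The kinetic Monte Carlo approach mentioned for the hairpin-switching project can be sketched with a generic Gillespie-style simulation. This is only an illustration of the general technique under stated assumptions, not the model used in the dissertation: the three coarse states and every rate constant below are hypothetical placeholders, whereas a real study would enumerate conformations and derive rates from an energy model.

```python
import random

# Hypothetical transition rates (arbitrary units) between three coarse
# states: two alternative hairpins connected through an unfolded state.
RATES = {
    "hairpin_A": {"unfolded": 0.5},
    "unfolded": {"hairpin_A": 2.0, "hairpin_B": 1.0},
    "hairpin_B": {"unfolded": 0.2},
}


def simulate(rates, start, t_end, seed=0):
    """Run one kinetic Monte Carlo trajectory and return time spent per state."""
    rng = random.Random(seed)
    occupancy = {state: 0.0 for state in rates}
    state, t = start, 0.0
    while t < t_end:
        moves = rates[state]
        total = sum(moves.values())
        if total == 0.0:
            # Absorbing state: it is occupied for the rest of the run.
            occupancy[state] += t_end - t
            break
        # The waiting time is exponential with the total escape rate.
        dt = rng.expovariate(total)
        occupancy[state] += min(dt, t_end - t)
        t += dt
        if t < t_end:
            # Choose the next state with probability proportional to its rate.
            state = rng.choices(list(moves), weights=list(moves.values()))[0]
    return occupancy


if __name__ == "__main__":
    result = simulate(RATES, "hairpin_A", t_end=1000.0, seed=1)
    for name, time_spent in result.items():
        print(name, time_spent / 1000.0)
```

Tracking occupancy times rather than only the final state gives an estimate of how the population is distributed between the two hairpins over the simulated interval.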


Molecular Biology of The Cell

Author: Bruce Alberts
Publisher:
Total Pages: 0
Release: 2002
Genre: Cytology
ISBN: 9780815332183
