• Statistical modelling of spatio-temporal dependencies in NGS data
  • Ranciati, Saverio <1988>


  • SECS-S/01 Statistica


  • Next-generation sequencing (NGS) has rapidly become the current standard in genetic related analysis. This switch from microarray to NGS required new statistical strategies to address the research questions inherent to the considered phenomena. First and foremost, NGS dataset usually consist of discrete observations characterized by overdispersion - that is, discrepancy between expected and observed variability - and an abundance of zeros, measured across a huge number of regions of the genome. With respect to chromatin immunoprecipitation sequencing (ChIP-Seq), a class of NGS data, it is of primary focus to discover the underlying (unobserved) pattern of `enrichment': more particularly, there is interest in the interactions between genes (or broader regions of the genome) and proteins, as they describe the mechanism of regulation under different conditions such as healthy or damaged tissue. Another interesting research question involves the clustering of these observations into groups that have practical relevance and interpretability, considering in particular that a single unit could potentially be allocated into more than one of these clusters, as it is reasonable to assume that its participation is not exclusive to one and only biological function and/or mechanism. Many of these complex processes, indeed, could also be described by sets of ordinary differential equations (ODE's), which are mathematical representations of the changes of a system through time, following a dynamic that is governed by some parameters we are interested in. In this thesis, we address the aforementioned tasks and research questions employing different statistical strategies, such as model-based clustering, graphical models, penalized smoothing and regression. We propose extensions of the existing approaches to better fit the problem at hand and we elaborate the methodology in a Bayesian environment, with the focus on incorporating the structural dependencies - both spatial and temporal - of the data at our disposal.


  • 2016-06-10


  • Doctoral Thesis
  • PeerReviewed


  • application/pdf



Ranciati, Saverio (2016) Statistical modelling of spatio-temporal dependencies in NGS data, [Dissertation thesis], Alma Mater Studiorum Università di Bologna. Dottorato di ricerca in Scienze statistiche , 28 Ciclo. DOI 10.6092/unibo/amsdottorato/7680.