Bioinformatic characterization of the relationship between the molecular impact of pathogenic variants and the clinical phenotype

  1. Marín Sala, Oscar
Supervised by:
  1. Xavier de la Cruz Montserrat (Supervisor)

Defending university: Universitat Autònoma de Barcelona

Date of defense: 19 July 2017

Committee:
  1. Josep Lluis Gelpi Buchaca (Chair)
  2. Xavier Daura Ribera (Secretary)
  3. Juan Fernández Recio (Member)

Type: Thesis

Teseo: 489733

Abstract

The advent of Next Generation Sequencing (NGS) promises to change the paradigm of medicine, but sequencing data come with many technical and methodological challenges. These hurdles hinder the integration of NGS technologies into precision medicine. Machine learning is a possible solution to some of these problems, as it offers a powerful toolbox of algorithms capable of processing large and complex data. This thesis addresses key topics in the clinical application of NGS techniques using bioinformatics and machine learning methods. First, we study the molecular and evolutionary characteristics of variants known as compensated pathogenic deviations (CPDs), which are pathological variants that appear as the wild type in other organisms, and their associated phenotypic impact. Second, we apply neural network models to predict the phenotypic severity of pathological variants. We use physico-chemical and evolutionary attributes that describe the amino-acid change, using the proteins F8 and F9 as our models. We also analyze the characteristics of variants associated with mild and severe forms of disease. Last, we apply decision-tree-based methods to create a CPD prediction methodology from descriptors of the molecular change and the evolutionary relationship between positions in the protein sequence. We use those predictors to search for CPD variants in humans, studying the sequenced individuals from the 1000G project. We study the likelihood that those variants are a fraction of the incidentalome.