Application of computational docking to the characterization and modulation of protein-protein interactions of biomedical interest

ROSELL OLIVERAS, MIREIA

Supervised by:

Juan Fernández Recio Director

Defence university: Universitat de Barcelona

Defence date: 21 December 2020

Committee:

Javier Sancho Sanz Chair
Xavier Barril Alonso Secretary
Fabián Glaser Geller Committee member

Type: Thesis

Teseo: 716815 DIALNET

Abstract

The study of the 3D structural details of protein interactions is essential to understand biomolecular functions at the molecular level. In this context, the limited availability of experimental structures of protein-protein complexes at atomic resolution is driving the development of computational docking methods that aim to complement the current structural coverage of protein interactions. One of these docking approaches is pyDock, which uses van der Waals, electrostatics, and desolvation energy terms to score docking poses generated by a variety of sampling methods, typically FTDock or ZDOCK. The method has shown consistently good predictive performance in community-wide assessment experiments such as CAPRI and CASP, and has provided biological insights and helped to interpret experiments by modeling many biomolecular interactions of biomedical and biotechnological interest. Here, we describe our approach using pyDock for the structural modeling of protein assemblies and the application of its modules to different biomolecular recognition problems: modeling of the binding mode, interface and hot-spot prediction, use of restraints based on experimental data, inclusion of low-resolution structural data, binding affinity estimation, and modeling of homo- and hetero-oligomeric assemblies. The integration of template-based and ab initio docking approaches is emerging as the optimal strategy for modeling protein complexes and multi-molecular assemblies. We review the new methodological advances in ab initio docking and integrative modeling.
The seventh CAPRI edition posed new challenges for the modeling of protein-protein complexes, such as multimeric oligomerization, protein-peptide, and protein-oligosaccharide interactions. Many of the proposed targets required the efficient integration of rigid-body docking, template-based modeling, flexible optimization, multi-parametric scoring, and experimental restraints.
This was especially relevant for the multi-molecular assemblies proposed in the CASP13-CAPRI46 joint rounds. We present the results for the seventh CAPRI edition and CAPRI Round 46, the third joint CASP-CAPRI protein assembly prediction challenge.
One of the known potential effects of disease-causing amino acid substitutions in proteins is to modulate protein-protein interactions (PPIs). To interpret such variants at the molecular level and to obtain useful information for prediction purposes, it is important to determine whether they are located at protein-protein interfaces, which are composed of two main regions, core and rim, with different evolutionary conservation and physicochemical properties. Here we have performed a structural, energetic, and computational analysis of interactions between proteins hosting mutations related to diseases detected in newborn screening. Interface residues were classified as core or rim, showing that the core residues contribute the most to the binding free energy of the PPI. Disease-causing variants are more likely to occur at the interface core region than at the interface rim (p < 0.0001). In contrast, neutral variants are more often found at the interface rim or at the non-interacting surface than at the interface core region. We also found that arginine, tryptophan, and tyrosine are over-represented among mutated residues leading to disease. These results can enhance our understanding of disease at the molecular level and thus contribute towards personalized medicine by helping clinicians to provide appropriate diagnoses and treatments.
The phenotypic effects of non-synonymous genetic variations leading or predisposing to disease can be rationalized on the basis of their functional and structural impact on the mutated protein, including the perturbation of the interaction network and molecular pathways in which that protein is involved.
Therefore, understanding these effects at the molecular level is essential to build accurate disease models and to achieve higher precision in diagnosis and therapeutic intervention. In this context, we can computationally characterize the effect of pathological mutations on specific protein-protein interactions (the "edgetic" effect), based on the structure of the complex, if available, or on docking models. Protein-protein interactions that are clearly stabilized or destabilized by these mutations can be potential targets for therapeutic intervention. We have analyzed the predicted energetic effect of mutations on PPIs by applying a variety of computational methods to model the mutation and compute the change in binding affinity (FoldX, mCSM, and pyDock combined with SCWRL3). We validate the predicted energetic impact against the experimental mutations contained in SKEMPI 2.0 and then apply these approaches to pathological and neutral single amino acid variants (SAVs) from ClinVar/Humsavar and gnomAD. Based on this, we have identified pathological mutations that clearly affect the analyzed interactions by stabilizing or destabilizing them.
As discussed above, protein-protein interactions are important for biological processes and pathological situations and are attractive targets for drug discovery. However, rational drug design targeting protein-protein interactions is still highly challenging. Hot-spot residues are seen as the best option to target such interactions, but their identification requires detailed structural and energetic characterization, which is only available for a tiny fraction of protein interactions. This thesis covers a variety of computational methods that have been reported for the energetic analysis of protein-protein interfaces in search of hot-spots, and for the structural modeling of protein-protein complexes by docking. These methods can help to rationalize the discovery of small-molecule inhibitors of protein-protein interfaces of therapeutic interest.
Computational analysis and docking can help to locate the interface, molecular dynamics can be used to find suitable cavities, and hot-spot predictions can focus the search for inhibitors of protein-protein interactions. A major difficulty in applying rational drug design methods to protein-protein interactions is that, in the majority of cases, the structure of the complex is not available. Fortunately, computational docking can complement experimental data. An interesting aspect to explore in the future is the integration of these strategies for targeting PPIs with large-scale mutational analysis.
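The abstract describes computing the change in binding affinity caused by a mutation and identifying variants that stabilize or destabilize an interaction. The Python sketch below illustrates that arithmetic only; it is not the thesis's pipeline. It assumes that a more negative binding energy means stronger binding, so that a positive difference (mutant minus wild type) is destabilizing. The `Variant` class, the 1.0 kcal/mol cutoff, and all energy values are hypothetical and are not taken from the thesis, FoldX, mCSM, or pyDock.

```python
from dataclasses import dataclass

# Hypothetical cutoff in kcal/mol; an illustrative assumption, not a thesis value.
DDG_CUTOFF = 1.0


@dataclass(frozen=True)
class Variant:
    """A single amino acid variant with predicted binding energies (kcal/mol)."""
    name: str            # made-up label for illustration
    dg_wild_type: float  # predicted binding energy of the wild-type complex
    dg_mutant: float     # predicted binding energy of the mutated complex


def ddg(variant: Variant) -> float:
    """Change in binding energy: mutant minus wild type."""
    return variant.dg_mutant - variant.dg_wild_type


def classify(variant: Variant, cutoff: float = DDG_CUTOFF) -> str:
    """Label a variant by the sign and size of its binding energy change."""
    change = ddg(variant)
    if change >= cutoff:
        return "destabilizing"  # binding becomes weaker
    if change <= -cutoff:
        return "stabilizing"    # binding becomes stronger
    return "neutral"            # change is within the cutoff


# Made-up example values, used only to exercise the functions above.
variants = [
    Variant("variant_A", dg_wild_type=-10.0, dg_mutant=-7.5),
    Variant("variant_B", dg_wild_type=-10.0, dg_mutant=-12.0),
    Variant("variant_C", dg_wild_type=-10.0, dg_mutant=-9.6),
]

labels = {v.name: classify(v) for v in variants}

for v in variants:
    print(f"{v.name}: ddG = {ddg(v):+.1f} kcal/mol -> {labels[v.name]}")
```

In practice the cutoff would need to be calibrated against experimental data, such as the SKEMPI 2.0 mutations the abstract mentions for validation; the value used here is only a placeholder.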