Structural prediction of protein complexes by computational docking with pyDock: optimization and new developments for large-scale application

  1. PONS PÉREZ, CARLES
Supervised by:
  1. Juan Fernández Recio Director

Defense university: Universitat de Barcelona

Defense date: 7 July 2011

Jury:
  1. Alfonso Valencia Herrera President
  2. Francisco Javier Luque Garriga Secretary
  3. Alexandre M. J. J. Bonvin Rapporteur

Type: Thesis

Teseo: 311832 DIALNET

Abstract

After the sequencing of the human genome, the next big challenge in biology is to unravel the intricate protein-protein interaction networks that perform the majority of functions in cells. High-throughput experimental techniques and various computational methods have contributed to a partial description of the interactomes of several organisms. However, a complete understanding of these protein interactions and their mechanisms of association requires the atomic details of the protein complexes formed. In spite of the success of experimental techniques such as X-ray crystallography and NMR, there is a growing gap between the number of reported protein-protein interactions and the number for which structural information is available. New hybrid approaches for the characterization of protein complexes therefore need to be implemented, and computational methods such as protein docking are most needed in this context. However, the generally poor success rate of docking methods and their computational cost hinder their efficient large-scale application.

The work in this thesis has produced new developments and thorough analyses that facilitate the large-scale application of our docking approach, pyDock, with special focus on the quality of the docking predictions, the reliability of the results, and the acceleration of the docking protocol. We initially evaluated pyDock using standard protein-protein docking benchmarks and by participating in the CAPRI experiment. The performance achieved was in line with that of the best docking methods. From these analyses we identified conditions that can be used to estimate a priori the success rate of our protocol, such as the type of complex, the size of the proteins, average pyDock scores, flexibility predicted by normal mode analysis, and binding affinity. However, the most decisive factor for docking success was the extent of conformational change upon binding. Most cases with flexibility (i.e. the average unbound/bound RMSD of the proteins) below 0.5 Å could be correctly predicted by pyDock, whereas cases with flexibility above 1.5 Å were extremely difficult for our protocol.

In this thesis we show that the limitations of our rigid-body docking approach can be partially overcome by integrating low-resolution information. The inclusion of SAXS data, topological features of residue networks, or a combination of statistical potentials and desolvation values significantly increased docking success in cases with moderate conformational changes upon binding (0.5 Å < flexibility < 1.5 Å). Additionally, we show that cross-docking with precomputed conformational ensembles was a successful strategy for integrating flexibility into our rigid-body approach. This increased the quality of the docking results and provided mechanistic details of the association process in ubiquitin complexes.

From a computational point of view, sampling and scoring were accelerated by two orders of magnitude. FFT-based rigid-body sampling was optimized for the Cell BE processor, and scoring by pyDock was accelerated by combining it with SIPPER (statistical potentials and desolvation values), which was used as a filter of docking poses. Finally, we evaluated the capabilities of pyDock for the prediction of binding affinities in protein complexes. We show that pyDock accurately predicted the binding affinity in a group of protein complexes, and that it discriminated between high- and low-affinity interactions based solely on the unbound subunits of the proteins. These results show for the first time that the prediction of interactions at a proteomic scale may benefit from the evaluation of unbound docking results.
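
The flexibility thresholds quoted in the abstract (below 0.5 Å, between 0.5 Å and 1.5 Å, above 1.5 Å) can be read as a simple a priori triage of docking cases. The following is only an illustrative sketch, not code from pyDock or from the thesis: the function names, the category labels, and the choice to assign the exact boundary values 0.5 Å and 1.5 Å to the moderate class are all assumptions made for this example.

```python
from statistics import mean

# Thresholds in Å, taken from the abstract.
LOW_FLEXIBILITY_BELOW = 0.5
HIGH_FLEXIBILITY_ABOVE = 1.5


def average_flexibility(receptor_rmsd: float, ligand_rmsd: float) -> float:
    """Average unbound/bound RMSD (Å) of the two proteins (hypothetical helper)."""
    return mean([receptor_rmsd, ligand_rmsd])


def flexibility_class(flexibility: float) -> str:
    """Map a flexibility value (Å) to a hypothetical difficulty label.

    'low'      -> below 0.5 Å: most cases reported as correctly predicted
    'moderate' -> 0.5 to 1.5 Å: cases reported to benefit from low-resolution data
    'high'     -> above 1.5 Å: reported as extremely difficult for rigid-body docking
    Boundary values are placed in 'moderate' by assumption.
    """
    if flexibility < 0:
        raise ValueError("RMSD cannot be negative")
    if flexibility < LOW_FLEXIBILITY_BELOW:
        return "low"
    if flexibility > HIGH_FLEXIBILITY_ABOVE:
        return "high"
    return "moderate"
```

For example, a case whose receptor and ligand have unbound/bound RMSD values of 0.75 Å and 1.25 Å averages to 1.0 Å and would fall in the moderate class, the range in which the abstract reports gains from SAXS data, residue-network features, or statistical potentials.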