Computer Modeling of Protein-Peptide Interface Solvation
By
Yi-Hsun Chen
A Thesis Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
MASTER OF SCIENCE
(PHARMACEUTICAL SCIENCES)
December 2014
Copyright 2014 Yi-Hsun Chen
TABLE OF CONTENTS
ACKNOWLEDGEMENTS iii
LIST OF TABLES iv
LIST OF FIGURES v
ABSTRACT vi
I. CHAPTER 1: INTRODUCTION 1
II. CHAPTER 2: SOLVATION OF 1600 PROTEIN-PEPTIDE COMPLEXES 7
1. Introduction: Importance of water in protein-peptide complexes 7
2. Methods 7
3. Results 9
4. Discussion 21
III. CHAPTER 3: SOLVATION OF ABL-SH3 BINDING PROTEIN, 3EG1 24
1. Introduction: Role of SH3 domain and Abl-SH3 binding domain 24
2. Methods 25
3. Results 28
4. Discussion 38
IV. REFERENCES 42
ACKNOWLEDGEMENTS
I would like to express my appreciation to the people who supported me throughout my two-year
master's program at USC. Without their help and support, I would not have been able to finish
this thesis and earn my master's degree.
First of all, I would like to express my gratitude to my mentor, Dr. Ian S. Haworth, for his
patience and guidance in teaching me and directing my master's research project. Most
importantly, he always encouraged me and cheered me up when I ran into difficulty in my study,
which really means a lot to me. He also gave me a lot of advice and help when I was looking
for jobs. It has been a great pleasure to know him and to be part of Dr. Haworth’s research
team.
I also want to deeply thank my master's thesis committee members, Dr. Curtis Okamoto and
Dr. Wei-Chiang Shen, who provided valuable suggestions during the editing of my thesis.
Next, I would like to express my appreciation to my parents and my sisters for their
financial and emotional support. My family always supports me no matter what decision I
make or what path I choose. I am especially thankful to my second sister and my second
brother-in-law, and I feel lucky to have them in my life and as my family.
Lastly, I would like to thank my classmates at USC for their support and encouragement
during my two years of study. In addition, I would like to thank my friends in Taiwan:
Annie Lee, Peggy Lee, I-Pei Lee, Shih-Yi Chien, Claudia Tien, and my friends in the United
States: Phyllis Kao, Tracy Teng, Xinxin Chen and many more, for cheering me up when I felt
depressed.
LIST OF TABLES
Table 2-1: Average score of water match and water with no match in different structures of
the three complexes chosen 21
Table 3-1: Energies after solvation for original complex (p41) and four mutants (p40, p17,
p0, and p7); experimental data are also included 29
Table 3-2: Energy and ranking of the two important poses (pose 364 and pose 680) in
original complex (p41) and in each mutant (p40, p17, p7 and p0) 29
LIST OF FIGURES
Figure 2-1: Solvated water comparison of XX and XG in 3EG1 10
Figure 2-2: Solvated water comparison of XX and XG in 3OKI 10
Figure 2-3: Solvated water comparison of XX and XG in 3OKH 10
Figure 2-4: Number of water molecules and amino acids in 3EG1 with four structures (XX,
XG, GX, and GG) 11
Figure 2-5: Water molecules matches in structures 1 and 2 (XX and XG), and in structures 1
and 3 (XX and GX) of 3EG1 13
Figure 2-6: Water molecules matches in structure 1 and 4 (XX and GG) of 3EG1 13
Figure 2-7: Number of water molecules and amino acids in 3OKI with four structures (XX,
XG, GX, and GG) 14
Figure 2-8: Water molecules match in structure 1 and 2 (XX, and XG) of 3OKI 15
Figure 2-9: Water molecules match in structure 1 and 3 (XX and GX) of 3OKI 16
Figure 2-10: Water molecules match in structure 1 and 4 (XX and GG) of 3OKI 16
Figure 2-11: Number of water molecules and amino acids in 3OKH with four structures
(XX, XG, GX, and GG) 17
Figure 2-12: Water molecules match in structures 1 and 2 (XX and XG) of 3OKH 18
Figure 2-13: Water molecules match in structures 1 and 3 (XX and GX) of 3OKH 19
Figure 2-14: Water molecules match in structures 1 and 4 (XX and GG) of 3OKH 20
Figure 3-1: Movement of peptide p41 around the protein 28
Figure 3-2: Top view and side view of crystal structure (3EG1) and p41 (APSYSPPPPP)
water molecules comparison 30
Figure 3-3: Close view of one of the double water bridges between Gln 114 (N) and Pro 7
(O) in p41 (APSYSPPPPP) and 3EG1 31
Figure 3-4: Close view of the single water bridge between Glu 98 (OE1) and Tyr 4 (O) in p41
(APSYSPPPPP) and 3EG1 32
Figure 3-5: The top view and side view of crystal structure (3EG1) and p40 (APTYSPPPPP)
water molecules comparison 32
Figure 3-6: Close view of the double water bridge between Gln 114 (N) and Pro 7 (O) in p40
(APTYSPPPPP) and 3EG1 33
Figure 3-7: Close view of the single water bridge between Glu 98 (OE1) and Tyr 4 (O) in p40
(APTYSPPPPP) and 3EG1 33
Figure 3-8: The top view and side view of crystal structure (3EG1) and p17 (APTYSPPLPP)
water molecules comparison 34
Figure 3-9: Close view of the double water bridge between Gln 114 (N) and Pro 7 (O) in p17
(APTYSPPLPP) and 3EG1 34
Figure 3-10: Close view of the single water bridge between Glu 98 (OE1) and Tyr 4 (O) in
p17 (APTYSPPLPP) and 3EG1 35
Figure 3-11: The top and side view of the crystal structure (3EG1), p41 and p40 36
Figure 3-12: Comparison of water difference in p41 and p40 36
Figure 3-13: Close view of the crystal structure, p41 and p40 37
Figure 3-14: Top and side views of p40 and p17 38
Figure 3-15: Close view of p40 and p17 38
ABSTRACT
Water is important in the formation of biomolecular complexes through the hydrogen bonds it
forms with neighboring water molecules. The resulting hydrogen bond networks can contribute
to structural stability. In addition, water molecules at interfaces cause tighter packing of
atoms, mediate polar interactions, and contribute to the specificity of binding in
protein-protein or protein-peptide complexes. Therefore, inclusion of water molecules in a
theoretical model can increase the accuracy of structure prediction for protein-protein or
protein-peptide complexes. This study is separated into two chapters. In the first chapter,
since water mainly forms hydrogen bonds and interacts with the protein backbone, we ask
whether the water structure can be predicted based on a polyglycine backbone, so that the
technique can be applied when the sequence of the binding peptide is unknown. We want to
include water molecules at an early stage, before docking of the peptide to the protein. Our
hypothesis is that polyglycine backbones can be used for docking and solvation. In the second
chapter, we chose one specific Abl-SH3 domain binding complex, 3EG1, because the Abl-SH3
domain is a potential therapeutic target. We compared experimental and computational results
to see whether computational modeling can reproduce the actual binding position. The results
of this study demonstrate how solvation of a polyglycine backbone can be used to predict
binding positions in complexes, and hence could lead to applications in structure-based drug
design.
I. Chapter 1: Introduction
Introduction:
Water molecules are ubiquitous and are among the most important molecules in living
systems. They are small, and their charge distribution makes them polar. Moreover, water can
act as a hydrogen bond donor or a hydrogen bond acceptor. When water molecules bound to
proteins or ligands are released into the bulk solvent, there is a gain in entropy
(Barillari et al., 2007). Water plays an essential role in the formation of biomolecular
complexes. Water molecules bind to each other by forming hydrogen bonds, and the formation of
hydrogen bond networks can contribute to structural stability (Bui et al., 2007).
Furthermore, structural data from many protein-protein and protein-DNA complexes have shown
that water molecules are mainly found at interfaces (Bui et al., 2007). These water molecules
can tighten the packing of atoms, mediate polar interactions, and contribute to binding
specificity between the protein interfaces. Interfacial water molecules are also especially
important in major histocompatibility complex (MHC) molecules (Bui et al., 2007). These MHC
molecules are involved in the immune response, and they use water molecules to bridge gaps
between the protein and the peptide, which allows them to bind peptides with different
sequences (Bui et al., 2007).
Several theoretical methods, including molecular dynamics simulations, Monte Carlo
simulations, and energy minimization, have been used to predict water positions and
calculate hydration energies. The methods differ in detail, but they all give a similar
result: water molecules are more energetically favorable in a polar cavity, where the water
can form multiple hydrogen bonds (Bui et al., 2007). However, it is still unclear whether a
nonpolar cavity, which is less capable of forming multiple hydrogen bonds, will remain empty
or whether another network involving van der Waals interactions or water bridging between
polar residues will be formed (Bui et al., 2007).
An algorithm called WATGEN has been developed by our research group for structure-based
prediction of the water network at protein-protein or protein-peptide binding interfaces.
WATGEN predicts water networks in four sequential steps: distribution, scoring, selection and
optimization (Bui et al., 2007). In the distribution step, water molecules are distributed
around the hydrogen-bonding centers (hydrogen bond donors or acceptors). The water molecules
are then divided into groups according to whether their hydrogen-bonding centers belong to
the ligand or the receptor. Each water molecule is given a score based on the number of water
sites it interacts with and the number of polar atoms on the receptor or ligand. The “best”
water sites are selected based on this score. Lastly, hydrogen atoms are added to each oxygen
in a geometry that maximizes the number of hydrogen-bonding contacts.
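The scoring and selection steps described above can be illustrated with a toy sketch. This is not the WATGEN implementation: the `WaterSite` class, the additive score, the greedy selection, and the distance thresholds (3.2 and 2.4 Å) are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from math import dist
from typing import Tuple


@dataclass
class WaterSite:
    xyz: Tuple[float, float, float]
    polar_contacts: int  # polar receptor/ligand atoms nearby (assumed precomputed)


def score_sites(sites, hbond_cutoff=3.2):
    """Toy score: polar contacts plus the number of neighboring candidate sites."""
    scores = []
    for i, s in enumerate(sites):
        neighbors = sum(
            1 for j, t in enumerate(sites)
            if j != i and dist(s.xyz, t.xyz) <= hbond_cutoff
        )
        scores.append(s.polar_contacts + neighbors)
    return scores


def select_sites(sites, scores, min_separation=2.4):
    """Greedy selection: visit sites from best to worst score and keep a site
    only if it does not sit too close to a site that was already kept."""
    order = sorted(range(len(sites)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(dist(sites[i].xyz, sites[k].xyz) >= min_separation for k in kept):
            kept.append(i)
    return sorted(kept)
```

In this sketch a higher score is better; the sign convention and weighting of the real scoring function may differ.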
Water molecules are also involved in the recognition, specificity and affinity of protein
complexes with RNA, DNA and sugars (Li et al., 2011). Water is also important when proteins
interact with small ligands. X-ray data, molecular dynamics simulation, and grid-based
simulation can be used to predict the solvation of a protein surface (Li et al., 2011).
Structural details of protein-RNA interfaces are becoming increasingly available, and an
average protein-RNA interface is proposed to contain 32 water molecules (Li et al., 2011).
In addition, the inclusion of water has been shown to improve the accuracy of structure
prediction.
Water molecules at the interface do not all have the same properties. Some water
molecules exchange with the bulk water very rapidly, while others are tightly bound. Some
water molecules bind to the protein interface before ligand binding and stay there even after
the complex is formed. Accordingly, water molecules at protein interfaces are separated into
different categories in the WATGEN algorithm. The categorization is based on how a water
molecule interacts with the protein and the ligand, and how it interacts with the bulk water
(Li et al., 2011). When a water molecule forms hydrogen bonds with both the protein and the
ligand, it is considered to be a single-water bridge. Another category is a double-water
bridge, in which a water molecule hydrogen bonded to one of the molecules at the interface
(either protein or ligand) is also hydrogen bonded to a second water molecule, which in turn
forms a hydrogen bond to the other molecule (ligand or protein). Another type of water
molecule forms a single or double hydrophobic bridge with the protein or ligand (Li et al.,
2011). In other words, this kind of water molecule interacts with the protein or ligand
through a contact that is not a hydrogen bond. However, only a few of these water molecules
are present in the interface. In WATGEN, the depth of a water molecule with respect to bulk
water is determined by two criteria. First, if the water molecule has a hydrogen-bonded path
that can be traced back to the bulk solvent, the water is said to be exchangeable with the
bulk water, and its depth is defined according to its distance from the bulk water (Li et
al., 2011). Second, if a water molecule has no hydrogen-bonded path that can be traced back
to the bulk water, it is said to be buried (Li et al., 2011).
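The bridge categories and the exchangeable/buried distinction can be expressed as operations on a hydrogen-bond graph. The sketch below is an illustration under stated assumptions, not WATGEN code: the graph representation, the node labels, the single "BULK" node, and the use of the number of hydrogen-bond steps as the "distance" to bulk water are all hypothetical choices.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Build a symmetric hydrogen-bond adjacency map from (a, b) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def classify_bridge(w, hbonds, kind):
    """Label water w as a single-water bridge, double-water bridge, or neither."""
    sides = {kind[n] for n in hbonds[w] if kind[n] in ("protein", "ligand")}
    if sides == {"protein", "ligand"}:
        return "single-water bridge"
    for n in hbonds[w]:
        if kind[n] != "water":
            continue
        other = {kind[m] for m in hbonds[n] if kind[m] in ("protein", "ligand")}
        if sides and other - sides:  # the second water reaches the opposite side
            return "double-water bridge"
    return "no bridge"


def depth_to_bulk(w, hbonds, kind):
    """Breadth-first search through waters; returns the number of hydrogen-bond
    steps to bulk solvent, or None if no such path exists (buried water)."""
    seen, queue = {w}, deque([(w, 0)])
    while queue:
        node, d = queue.popleft()
        for n in hbonds[node]:
            if kind[n] == "bulk":
                return d + 1
            if kind[n] == "water" and n not in seen:
                seen.add(n)
                queue.append((n, d + 1))
    return None


# Hypothetical toy interface: one protein atom, one ligand atom, four waters.
kind = {"P1": "protein", "L1": "ligand", "BULK": "bulk",
        "W1": "water", "W2": "water", "W3": "water", "W4": "water"}
hbonds = build_graph([("W1", "P1"), ("W1", "L1"), ("W2", "P1"), ("W2", "W3"),
                      ("W3", "L1"), ("W3", "BULK"), ("W4", "P1")])
```

Here W1 is a single-water bridge with no path to bulk (buried), W2 and W3 together form a double-water bridge, and W4 contacts only the protein.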
Interfacial water also plays an important role in protein-DNA interactions. Water
molecules can screen for favorable DNA interaction sites and stabilize complex formation
(Dijk et al., 2013). Proteins that interact with DNA regulate important cellular processes,
including DNA replication, repair and gene expression (Dijk et al., 2013). Therefore,
structural information on protein-DNA complexes is of considerable research interest.
However, water molecules are mostly neglected in computational modeling of protein-DNA
complexes. Techniques such as X-ray crystallography, NMR spectroscopy, computational modeling
and simulation have been used to obtain structural information on protein-DNA complexes (Dijk
et al., 2013). Docking, a computational method for predicting the unknown structure of a
complex from its constituents, is also used to study protein-DNA complexes (Dijk et al.,
2013). Since the presence of water molecules might affect the resulting models, water should
be included at an early stage of the docking process. Indeed, including water molecules at an
early stage of docking could improve the quality of the protein-DNA models that are generated
(Dijk et al., 2013).
Water molecules can also provide van der Waals interactions by filling the space
between the protein and the ligand (Li and Lazaridis, 2006). Water molecules in interfaces
can also change the shape or flexibility of protein binding sites (Li and Lazaridis, 2006).
Water can mediate polar interactions and frequently forms hydrogen bonds in protein-protein
interactions. In the prediction of binding site structures and of ligand-based
pharmacophores, the inclusion of interfacial water molecules greatly improves ligand design
(Li and Lazaridis, 2006). Therefore, the inclusion of interfacial water molecules is becoming
increasingly important in de novo drug design and protein-ligand docking.
Water molecules that mediate protein interactions are a major determinant of chain
folding, catalysis, and conformational stability (Reichmann et al., 2008). Water molecules
interact directly with the backbone and side chains of a protein to form stable hydrogen
bonds. In addition, more water molecules are found in the interfaces of weak and highly
transient complexes, such as those of electron transfer proteins (Reichmann et al., 2008).
Protein crystal structures demonstrate that these hydrogen bonds are nonrandom and that many
of the water molecules are conserved in the same location in similar proteins. In addition,
water-mediated interactions are found to be connected to this water conservation, and
inclusion of water molecules in theoretical models can improve the accuracy of folding
prediction.
In some cases of protein-ligand binding, some of the well-ordered water molecules are
released into the bulk solvent after the ligand binds, which leads to a gain in entropy and
an increase in binding affinity (Barillari et al., 2007). For example, when HIV-1 protease
binds to DMP450, water molecules are displaced, and this water displacement is favorable for
binding. On the other hand, there are cases in which displacement of water molecules
decreases the binding affinity of ligands (Barillari et al., 2007). Two classes of water
molecules have been identified. The first class consists of conserved water molecules that
are not displaced when ligands bind. The second class consists of water molecules that are
easily displaced once ligands bind to the protein. A ligand that displaces water molecules
upon binding to the protein and mimics their structure is more likely to have a favorable
binding free energy (de Beer et al., 2010).
One of the strategies in structure-based drug design is to modify molecules in order to
improve or achieve a certain therapeutic effect. This method assumes that similar molecules
bind to the receptor in a similar fashion, and hence induce the same effect (de Beer et al.,
2010). However, the new compound might bind to the receptor in a different pose due to the
presence of internal water molecules (de Beer et al., 2010). Therefore, water molecules are
increasingly considered in drug design. It is very important to know which water molecules
mediate the protein-ligand interaction and which water molecules can be targeted for
displacement in drug design.
Since water molecules play many roles in living organisms, we would like to predict
the water network in complexes and determine whether the water structure can be used to
predict binding poses. The following work is divided into two chapters. The first will focus
on 1638 protein-peptide complexes obtained from the Protein Data Bank. The main purpose of
this chapter is to see whether we can predict the water structure based on interactions with
polyglycine backbones. In the second chapter, we choose a specific peptide complex with the
Abl-SH3 domain, which has biological importance. Computational and experimental results are
compared in the two chapters.
II. Chapter 2: SOLVATION OF 1600 PROTEIN-PEPTIDE COMPLEXES
Introduction:
Water plays a very important role in biomolecular complexes. At the protein-protein or
protein-peptide interface, water forms hydrogen bonding networks. These water networks can
stabilize complexes and facilitate their formation and dissociation (Li et al., 2011). Water
molecules in the biomolecular interface can also create specificity in the binding pocket,
and this specificity determines the type of ligand that binds to the protein (Li et al.,
2011). Furthermore, most water molecules form hydrogen bonds to protein backbones. For these
reasons, the question arises: “Can we predict the water structure of a complex based only on
inclusion of the polyglycine backbone?” This approach may allow inclusion of water molecules
at an early stage, before full docking of a peptide to a protein. Our hypothesis is that we
can do docking and solvation using polyglycine backbones only. If this is the case, then we
might be able to apply this technique to predict the binding position of a peptide of unknown
sequence.
Methods:
We used the Protein Data Bank search engine to find complexes for the calculations. The
first search criterion was that the entry must be of macromolecule type. The second criterion
was that the structure contains protein, but does not contain DNA, RNA, or DNA-RNA hybrid.
The number of chains in the biological assembly was set to two; in other words, each entry is
a binary complex. One of the chains in the structure must have a length of 10 to 30 amino
acids. The search engine was set to remove similar sequences at 90% identity. These criteria
gave 1639 search results. A screen shot of the search process is shown below.
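The search criteria above amount to a simple filter on the chains of each entry. The following sketch shows that logic for one entry described as a list of (polymer type, chain length) pairs. The function name and data layout are hypothetical, the length criterion is read here as "at least one chain of 10 to 30 residues", and the 90% sequence identity filter applied by the search engine is not reproduced.

```python
def is_candidate(chains, min_len=10, max_len=30):
    """Return True if an entry looks like a binary protein-peptide complex.

    chains: list of (polymer_type, length) tuples for the biological assembly.
    """
    if len(chains) != 2:                       # binary complex only
        return False
    if any(ptype != "protein" for ptype, _ in chains):
        return False                           # no DNA, RNA or hybrid chains
    # at least one chain must be peptide-sized
    return any(min_len <= length <= max_len for _, length in chains)


# Chain lengths taken from the complexes discussed in the Results section.
print(is_candidate([("protein", 58), ("protein", 10)]))    # 3EG1-like entry
print(is_candidate([("protein", 230), ("protein", 12)]))   # 3OKI-like entry
```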
FASTA files and PDB files of all complexes obtained from the search were downloaded. We
then identified the protein chain and the peptide chain in each FASTA file according to the
PDB file. After the protein chain and peptide chain were identified, we made three changes to
each complex. The original complex, with both chains unchanged, is denoted XX: the first X
represents the protein chain and the second X represents the peptide chain. The first change
was to make the peptide into a polyglycine chain (denoted XG). The second change was to make
the protein into a polyglycine chain while keeping the peptide chain as it is (denoted GX).
The last change was to make both chains into polyglycine chains (denoted GG). However,
cysteines in cysteine-cysteine bridges were not replaced with glycine. We made the
polyglycine changes to the protein or the peptide because we wanted to see whether the
backbone of the peptide can be used to predict its binding position on the protein. After the
polyglycine changes were made, the complexes were used for the calculations and water
comparison.
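The four variants (XX, XG, GX and GG) can be generated from the ATOM records of a PDB file by dropping side-chain atoms and renaming residues to GLY. The sketch below is an independent illustration and not the prpgly code used in this work. It assumes standard fixed-column PDB ATOM records, keeps only the heavy backbone atoms N, CA, C, O and OXT (so any hydrogens are dropped), and takes the disulfide-bonded cysteines as a user-supplied set of (chain, residue number) pairs.

```python
BACKBONE = {"N", "CA", "C", "O", "OXT"}


def to_polyglycine(pdb_lines, chain_id, keep=frozenset()):
    """Convert one chain to polyglycine; residues listed in keep are untouched."""
    out = []
    for line in pdb_lines:
        if line.startswith("ATOM") and line[21] == chain_id:
            atom_name = line[12:16].strip()
            res_seq = int(line[22:26])
            if (chain_id, res_seq) not in keep:
                if atom_name not in BACKBONE:
                    continue                            # drop side-chain atom
                line = line[:17] + "GLY" + line[20:]    # rename residue
        out.append(line)
    return out


def make_variants(pdb_lines, protein_chain, peptide_chain, keep=frozenset()):
    """Return the XX, XG, GX and GG structures as lists of PDB lines."""
    xg = to_polyglycine(pdb_lines, peptide_chain, keep)
    gx = to_polyglycine(pdb_lines, protein_chain, keep)
    gg = to_polyglycine(gx, peptide_chain, keep)
    return {"XX": list(pdb_lines), "XG": xg, "GX": gx, "GG": gg}
```

For 3EG1, for example, the call would be `make_variants(lines, "A", "C", keep=...)`, with strand A as the protein and strand C as the peptide.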
Since the number of complexes obtained from the Protein Data Bank is large and making the
polyglycine changes manually is very time-consuming, we created a program (prpgly.exe) that
generates the polyglycine changes automatically. This executable keeps only the first copy
of the protein-peptide complex in each file. The same FASTA file was also used for the
automatic preparation of the polyglycine calculations. The prpgly code picks the first
possible protein-peptide interaction and generates four input files (XX, XG, GX and GG).
Another program (addprot.exe) was used to add the side chains back to the backbone. Solvation
calculations were then carried out, followed by water analysis and water comparison. After
the calculations were done, an individual folder was created for each complex, and a text
file was produced listing the complexes that did not meet the criteria and were rejected. A
file called “pdb_modify” was used to change chain letters in cases of errors in the original
files. Specifically, in some cases the protein and peptide chosen by the program are too far
from each other; there is then no interaction between them and the solvation calculation
fails. Water matching was performed for each structure and the water network was analyzed in
each complex.
Results:
Figures 2-1, 2-2 and 2-3 compare the water networks of the original structure (XX) and the
structure with a polyglycine peptide (XG) in the 3EG1, 3OKI and 3OKH complexes. In each
figure, the panel on the left shows the water network in the original structure, and the
panel on the right shows the water network after the peptide is replaced with polyglycine.
Figure 2-1. Solvated water comparison of XX and XG in 3EG1: Left: water molecules in original complex (XX), right: water molecules in
complex with one polyglycine change on the peptide (XG)
Figure 2-2. Solvated water comparison of XX and XG in 3OKI: Left: water molecules in original complex (XX), right: water molecules in
complex with one polyglycine change on the peptide (XG)
Figure 2-3. Solvated water comparison of XX and XG in 3OKH: Left: water molecules in original complex (XX), right: water molecules
in complex with one polyglycine change on the peptide (XG)
Since the number of complexes used in this calculation is large, the results are
illustrated for only three complexes: 3EG1, 3OKI and 3OKH. Figure 2-4 shows the number of
water molecules and amino acids in 3EG1, which is an Abl-SH3 domain binding complex. In
3EG1, the protein is strand A and the peptide is strand C. Strand A has 58 amino acids and
strand C has 10 amino acids. The number of water molecules in the original complex is 46.
When the peptide is replaced with polyglycine, the number of water molecules becomes 55.
When the protein is replaced with polyglycine and the peptide is kept as it is, the number of
water molecules is 42. When both strands are replaced with polyglycine, the number of water
molecules becomes 39.
Figure 2-4. Number of water molecules and amino acids in 3EG1 with four structures (XX, XG, GX, and GG); strand A: protein, strand C:
peptide
Figures 2-5 and 2-6 show the water molecule matches in structures 1 and 2 (XX and XG) and
in structures 1 and 3 (XX and GX) (Figure 2-5), and in structures 1 and 4 (XX and GG)
(Figure 2-6). Column 1 is the water number in the original XX complex. Column 2 is the water
number in structure 2, 3 or 4 (XG, GX and GG, respectively); in other words, the water number
in the structure in which polyglycine is used for the protein, the peptide, or both chains.
If no match is listed in columns 1 and 2 and column 3 is zero, there was originally a match,
but another water molecule had a better match. If column 3 is 1, there is no match at all.
The fourth column in Figures 2-5 and 2-6 displays the number of interactions for the water
molecule in the first structure (XX). The fifth column shows the number of side chain
interactions, and the sixth column shows the number of backbone interactions. The last column
is the water score for the match between the original structure and the structure with a
polyglycine change.
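The meaning of the flag in column 3 can be illustrated with a small matching routine. The matching criterion used in this work is not spelled out in the text, so the sketch below assumes a one-to-one match on oxygen positions within a distance cutoff; the 1.5 Å cutoff, the function name, and the data layout are hypothetical.

```python
from math import dist


def match_waters(ref, other, cutoff=1.5):
    """Match waters of a reference structure to waters of a second structure.

    ref, other: dict mapping water number -> (x, y, z).
    Returns dict: ref water -> (matched water or None, flag), where flag is
    None for a match, 0 if a candidate existed but was taken by a better
    match, and 1 if there was no candidate at all.
    """
    pairs = sorted(
        (dist(p, q), i, j)
        for i, p in ref.items()
        for j, q in other.items()
        if dist(p, q) <= cutoff
    )
    had_candidate = {i for _, i, _ in pairs}
    matched, used = {}, set()
    for _, i, j in pairs:              # closest pairs are assigned first
        if i not in matched and j not in used:
            matched[i] = j
            used.add(j)
    result = {}
    for i in ref:
        if i in matched:
            result[i] = (matched[i], None)
        elif i in had_candidate:
            result[i] = (None, 0)
        else:
            result[i] = (None, 1)
    return result
```

With this scheme, two reference waters that both lie near the same water in the second structure produce one match and one row flagged 0, while a reference water with nothing nearby is flagged 1.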
Figure 2-5. Water molecules matches in structures 1 and 2 (XX and XG), and in structures 1 and 3 (XX and GX) of 3EG1
Figure 2-6. Water molecules matches in structures 1 and 4 (XX and GG) of 3EG1
The next complex that we chose has the PDB ID 3OKI. This is a human hormone receptor.
The protein in this complex has 230 amino acids and is strand A. The peptide has 12 amino
acids and is strand B. For the original complex, the calculated number of water molecules is
62. When the peptide is replaced with polyglycine, the number of water molecules becomes 43.
In structure 3 (GX), in which the protein chain is replaced with polyglycine, the calculated
number of water molecules is 44. When both strands are replaced with polyglycine, the number
of water molecules is 21.
Figure 2-7. Number of water molecules and amino acids in 3OKI with four structures (XX, XG, GX, and GG); strand A: protein, strand B: peptide
Figures 2-8, 2-9 and 2-10 show the water molecule matches between the original structure
and the structures with polyglycine replacement. Column 1 in Figures 2-8, 2-9 and 2-10 is the
water number in the original complex. Column 2 is the water number in structure 2, 3 or 4
(XG, GX and GG, respectively); in other words, the water number in the structure in which
polyglycine is used for the protein, the peptide, or both chains. The number in column 3
shows whether there is a match for the water molecule. If column 3 is zero, there was
originally a match, but another water molecule had a better match. If column 3 is 1, there is
no match at all. The fourth column displays the number of interactions for the water molecule
in the first structure (XX). The fifth column shows the number of side chain interactions,
and the sixth column shows the number of backbone interactions. The last column is the water
score for the match between the original structure and the structure with a polyglycine
chain.
Figure 2-8. Water molecules match in structures 1 and 2 (XX and XG) of 3OKI
Figure 2-9. Water molecules match in structure 1 and 3 (XX and GX) of 3OKI
Figure 2-10. Water molecules match in structure 1 and 4 (XX and GG) of 3OKI
The last complex that we chose to study in depth among the 1648 complexes is 3OKH.
Figure 2-11 shows the number of water molecules and amino acids of each strand (protein and
peptide) in 3OKH. The protein in 3OKH is labeled as strand A and the peptide as strand B.
Strand A has 227 amino acids, and strand B has 10 amino acids. In the original crystal
structure, the calculated number of water molecules is 63. When the peptide is changed to
polyglycine, the number of water molecules becomes 40. The number of water molecules in
structure 3 (GX) is 34. When both strands are replaced with polyglycine, the number of water
molecules becomes 22.
Figure 2-11. Number of water molecules and amino acids in 3OKH with four structures (XX, XG, GX, and GG); strand A: protein, strand B: peptide
Figures 2-12, 2-13 and 2-14 show how the water molecules in the original structure match
those in the structures with polyglycine replacement in the 3OKH complex. There are 63 water
molecules in the original structure. Column 2 is the water number in the structure in which
polyglycine is used for the protein, the peptide, or both chains. If column 3 is zero, there
was originally a match, but another water molecule had a better match. If column 3 is 1,
there is no match at all. The fourth column displays the number of interactions in the
original structure. The fifth column shows the number of side chain interactions, and the
sixth column shows the number of backbone interactions. The last column is the water score
for the match between the original structure and the structure with polyglycine chains.
Figure 2-12. Water molecules match in structures 1 and 2 (XX and XG) of 3OKH
Figure 2-13. Water molecules match in structures 1 and 3 (XX and GX) of 3OKH
Figure 2-14. Water molecules match in structures 1 and 4 (XX and GG) of 3OKH
The average scores of water matches in the 3OKI, 3EG1 and 3OKH complexes are shown in Table
2-1. In all but one comparison (structures 3 and 4 of 3EG1), the average score of waters with
a match is more negative than the average score of waters without a match.
Complex   Structures compared   Avg score, non-matching   Avg score, matching
3OKI      1 and 2                 -2.4                     -17.2
3OKI      1 and 3                 -8.8                     -14.0
3OKI      1 and 4                 -9.3                     -16.2
3OKI      2 and 3                 -9.1                     -40.1
3OKI      2 and 4                -10.7                     -41.3
3OKI      3 and 4                 -4.4                     -13.0
3EG1      1 and 2                  0.0                     -17.3
3EG1      1 and 3                 -9.8                     -19.4
3EG1      1 and 4                 -9.5                     -21.0
3EG1      2 and 3                 -9.9                     -25.3
3EG1      2 and 4                -11.4                     -21.0
3EG1      3 and 4                -14.2                      -8.7
3OKH      1 and 2                  2.1                     -11.2
3OKH      1 and 3                 -0.4                     -11.6
3OKH      1 and 4                 -2.8                     -12.9
3OKH      2 and 3                 -7.4                     -26.3
3OKH      2 and 4                 -4.3                     -31.5
3OKH      3 and 4                 -8.3                     -24.9
Table 2-1. Average score of water match and water with no match in different structures of the three complexes chosen
Discussion:
Since we used polyglycine chains in the complexes, the water network in each structure
is different. The original structure without any change would be expected to have the most
water molecules, with the number decreasing when a polyglycine chain is used. The structure
with both polyglycine chains has the lowest number of water molecules. However, in the 3EG1
complex, the number of water molecules in the second structure, with the polyglycine change
on the peptide, is higher than in the original structure. In the comparison of structures 1
and 2, the number of water molecules in interface 1 is 46 and the number in interface 2 is
55. The percentage match in interface 1 and interface 2 is 87% and 72.7%, respectively. For
structures 1 and 3, which are the original structure and the structure with a polyglycine
chain for the protein, the percentage match in interface 1 and interface 2 becomes 54.3% and
59.5%, respectively. For the comparison of the original structure and structure 4, with both
polyglycine chains, the percentage match in interfaces 1 and 2 becomes 47.8% and 56.4%,
respectively.
For the water comparison of structures 1 and 2 for 3OKH, the total number of water
molecules is 63 in interface 1 and 40 in interface 2. The percentage match in interfaces 1
and 2 is 58.7% and 92.5%, respectively. For structures 1 and 3, which are the original
structure and the structure with a polyglycine chain for the protein, the percentage matches
in interfaces 1 and 2 are 47.6% and 88.2%, respectively. For the water match in structures 1
and 4, the percentage match in interfaces 1 and 2 becomes 28.6% and 81.8%, respectively. The
percentage of water matches is highest when structure 1 is compared to structure 2.
The last complex that we chose to study is 3OKI. For the water comparison in structures 1
and 2 for 3OKI, the total number of water molecules in interfaces 1 and 2 is 62 and 43,
respectively. The percentage water match is 59.7% in interface 1 and 86% in interface 2. For
the water comparison in structures 1 and 3, the percentage water match is 46.8% in interface 1
and 65.9% in interface 2. In the comparison of structures 1 and 4, with both chains replaced
with polyglycine, the percentage water match in interfaces 1 and 2 is 27.4% and 81%,
respectively. The percentage water match is again highest when structure 1 is compared with
structure 2.
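The structure 1 versus structure 2 percentages quoted above are consistent with a single number
of matched water molecules divided by the total in each interface. The following is a minimal
sketch of that arithmetic. The interface totals are from the text, but the matched counts (40,
37 and 37) and the definition of percentage match as matched waters over interface total are
assumptions inferred from the reported percentages, not values stated in this thesis.

```python
# Sketch: reproduce the structure 1 vs. structure 2 percentage matches.
# Interface totals come from the text; the "matched" counts are inferred (assumption).
comparisons = {
    "3EG1": {"interface_1": 46, "interface_2": 55, "matched": 40},
    "3OKH": {"interface_1": 63, "interface_2": 40, "matched": 37},
    "3OKI": {"interface_1": 62, "interface_2": 43, "matched": 37},
}


def percent_match(matched, total):
    """Percentage of the waters in one interface that have a match in the other."""
    return round(100.0 * matched / total, 1)


for pdb_id, c in comparisons.items():
    p1 = percent_match(c["matched"], c["interface_1"])
    p2 = percent_match(c["matched"], c["interface_2"])
    print(f"{pdb_id}: interface 1 = {p1}%, interface 2 = {p2}%")
```

Under this reading, a smaller interface total raises the percentage for the same number of
matched waters, which is why the two interfaces of one comparison can give different values.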
A closer look at the water networks in the different structures shows that polyglycine
replacement affects both the number of water molecules and how they are distributed. The
average score for water molecule matching is more negative for water with a match than for
water without a match. When the polyglycine chain is used, the number of water molecules
decreases. In the 3EG1 complex, when the peptide is replaced with polyglycine, the number of
water molecules becomes smaller. In addition, a few hydrophobic waters are missing, and two
water molecules that were buried are also missing. This might be because the hydrophobic side
chains are also lost when the peptide becomes polyglycine in 3EG1, so there is no need for the
water molecules to create hydrophobic spaces for the peptide to bind to the protein. In the
case of 3OKI and 3OKH, the number of water molecules decreases dramatically when the peptide is
replaced with polyglycine. The number of hydrophobic waters also decreases. In conclusion, even
though polyglycine replacement changed the number and distribution of water molecules, docking
poses may not change. Therefore, we might be able to use polyglycine to predict the binding
position in a protein-peptide complex. In other words, docking and solvation could be performed
using polyglycine when the binding sequence of the peptide is unknown.
III. Chapter 3: SOLVATION OF ABL-SH3 BINDING PROTEIN, 3EG1
Introduction:
The formation of protein-protein and protein-peptide interactions is very important in
intracellular signaling (Ruano and Luque, 2012). A subset of these protein-protein
interactions is established by modular domains, which are able to recognize short sequence
motifs in their targets. Because of this property, these domains can help in the design of
small inhibitors or in the identification of relevant physiological targets. Proline-rich
motifs are widely distributed in a variety of genomes (Ruano and Luque, 2012). This type of
motif is usually recognized by polyproline-recognition domains (PRDs), which are one of the
most important classes of protein-interaction domains and are abundant in many proteomes.
Several families of PRDs have been reported, including Src-homology 3 (SH3) domains, EVH1
domains, GYF domains, UEV domains, and WW domains (Ruano and Luque, 2012). Even though the PRD
families differ in size and folding pattern and recognize different proline-rich motifs, they
share a similar mechanism of proline recognition. This mechanism involves formation of
hydrophobic interactions between the ligand and the domain (Ruano and Luque, 2012). In
addition, it is very important that the interaction of a PRD with its target is specific and
well regulated, because PRD interactions play an essential role in many cellular processes,
including apoptosis, differentiation, cell growth, and transcription (Ruano and Luque, 2012).
Among all the polyproline-recognition domains, SH3 domains are the most common in
vertebrates and have been the most widely studied (Ruano and Luque, 2012). Some SH3 domains
are involved in the regulation of the enzymatic activity of the protein that contains the
domain (Ruano and Luque, 2012). For this reason, SH3 domains are associated with
carcinogenesis, leukemia, Alzheimer's disease, and inflammatory processes, as well as some
viral infections, such as AIDS. SH3 domains fold into a five-stranded β-barrel structure that
is composed of two orthogonal anti-parallel β-sheets (Ruano and Luque, 2012). The five
β-strands are connected by a longer loop called the RT loop and two shorter loops called the
n-Src and distal loops (Ruano and Luque, 2012). SH3 domains also contain a short 3₁₀ helix
segment.
A highly regulated tyrosine kinase called c-Abl is involved in the development of chronic
myelogenous leukemia (Ruano and Luque, 2012). It is the cellular homolog of the oncogene
carried by the Abelson murine leukemia virus. When the SH3 domain is deleted or mutated, c-Abl
is activated (Ruano and Luque, 2012). Moreover, the inhibitory effect of the anti-tumor drug
Gleevec can be enhanced by displacement of the intramolecular interaction between the SH3
domain and the linker sequence that connects the SH2 and kinase domains (Ruano and Luque,
2012). Therefore, many studies have targeted the Abl-SH3 domain for potential therapeutic
applications and for the design of novel ligands with high affinity. In this chapter, we chose
a specific peptide complex with the Abl-SH3 domain for which experimental results are
available, and performed WATGEN calculations. The purpose of this chapter is to see if the
computational results can be related to the experimental results.
Methods:
The complex that we chose is PDB ID 3EG1, and the peptide is referred to as p41
(APSYSPPPPP). This is also one of the 1600 complexes that we worked with previously. In the
calculations in this chapter, we also performed mutations to create p41-related peptides. The
first mutation changed the serine at position 3 in p41 to threonine, which yields p40
(APTYSPPPPP). The second mutation changed the proline at position 8 in p40 to leucine,
giving p17 (APTYSPPLPP). The third mutation changed the serine at position 5 in p17 to
proline, yielding p0 (APTYPPPLPP). The last mutation changed the leucine at position 8 in p0
to proline, giving p7 (APTYPPPPPP). The WATGEN calculations were performed on the four mutants
and the original crystal structure. In the 'water.out' file, structure 1 is the protein and
structure 2 is the peptide. We recorded the number of interface water molecules; the interface
score; the total energy between protein and water, peptide and water, and protein and peptide;
and the internal water energy, entropy and enthalpy.
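The chain of single-residue substitutions described above can be sketched as follows. The
sequences and mutation positions are taken from the text; the `mutate` helper is purely
illustrative and is not part of WATGEN or of the code used in this work.

```python
# Sketch: derive the p41-related peptides by successive single-residue substitutions.
# Sequences and mutations are from the text; the helper function is illustrative only.

def mutate(sequence, position, old, new):
    """Replace the residue at a 1-based position after checking the expected residue."""
    index = position - 1
    if sequence[index] != old:
        raise ValueError(f"expected {old} at position {position}, found {sequence[index]}")
    return sequence[:index] + new + sequence[index + 1:]


p41 = "APSYSPPPPP"               # peptide in the 3EG1 crystal structure
p40 = mutate(p41, 3, "S", "T")   # Ser3 -> Thr
p17 = mutate(p40, 8, "P", "L")   # Pro8 -> Leu
p0 = mutate(p17, 5, "S", "P")    # Ser5 -> Pro
p7 = mutate(p0, 8, "L", "P")     # Leu8 -> Pro

for name, seq in [("p41", p41), ("p40", p40), ("p17", p17), ("p0", p0), ("p7", p7)]:
    print(name, seq)
```

Checking the expected original residue before each substitution guards against applying a
mutation to the wrong parent peptide.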
We also moved the peptide around the binding pocket of the protein. In other words, we
moved the peptide up or down along the axis of the real binding position or rotated torsion
angles to yield different binding poses. At first, we manually moved the peptide up or down by
1 Å from the real binding position and reran the calculation. We also rotated the torsion
angles and reran the calculation. This process was repeated several times to produce multiple
poses. We first moved the peptide manually to make sure that the movement really changes the
energy compared with the crystal structure. After that, we created code that automatically
moves the peptide around the protein and generates 729 poses. The input to the code that we
created to generate multiple poses is as follows:
In the input, number 2 represents the distance between the peptide and protein, number 1
represents the angle, and zero represents the torsion angle. Numbers 4, 5, and 6 are the
carbonyl carbon, the Cα, and the nitrogen of the serine of the peptide, respectively. For the
distance between the peptide and protein, the peptide was moved from 6.7 Å to 8.7 Å with an
increment of 1.0 Å. The angle started at 88° and ended at 108° with an increment of 10°. The
torsion angle started at 170° and ended at 190° with an increment of 10°. The clash distance
was set to 0.09999 Å. However, in some cases only 728 poses were generated, because one of the
poses was too close to the protein and hence created a clash. We ranked the binding poses
according to the binding energy and compared them to see if the binding pose that is similar to
the one in the crystal structure has the best energy. We also recorded the protein-peptide
energy, total energy and entropy. In addition, we used a molecular visualization system called
PyMOL to check the water molecules of each mutant after solvation, and compared them with the
original crystal structure to see if any water molecules were missing.
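The pose count can be illustrated with a small grid enumeration. The three sampled values for
the distance, angle and torsion angle are from the text. The layout of six sampled quantities
per pose (one distance, two angles and three torsion angles, giving 3^6 = 729 combinations) is
an assumption made here to be consistent with the reported 729 poses; the actual input of the
pose-generation code is not reproduced in this document.

```python
from itertools import product


def sample(start, stop, step):
    """Inclusive list of sampled values, e.g. sample(6.7, 8.7, 1.0) gives three values."""
    count = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 4) for i in range(count)]


distances = sample(6.7, 8.7, 1.0)      # Å, from the text
angles = sample(88.0, 108.0, 10.0)     # degrees, from the text
torsions = sample(170.0, 190.0, 10.0)  # degrees, from the text

# Assumption: one distance, two angles and three torsion angles are varied per pose.
grid = list(product(distances, angles, angles, torsions, torsions, torsions))
print(len(grid))
```

A clash filter applied to such a grid would remove individual poses, which would be consistent
with 728 rather than 729 poses being retained in some cases.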
We next checked the PDB file of each mutant and compared the water molecules in the
crystal structure with those in the mutant. We chose four particular water molecules and
studied them carefully. The water molecules that we chose are those that resemble water
molecules in the crystal structure. We labeled the peptide and water molecules in each complex
in different colors. The water molecules that are found in only one complex and the water
molecules that we focus on are labeled with different colors. Furthermore, we determined how
the water molecules are connected to each other through hydrogen bonds (that is, whether they
form a single-water bridge or a double-water bridge) and how they are connected to the peptide
and protein. The hydrogen-bond distances between them were also measured. We also measured the
distance between the original water molecules in the crystal structure and the corresponding
water molecules in the mutants to see how much the water molecules shifted.
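The shift of a calculated water molecule relative to the crystal structure is a simple
Euclidean distance between oxygen positions. The sketch below shows one way such a
measurement could be made. The coordinates and water names are hypothetical values chosen for
illustration and are not taken from 3EG1, and the nearest-neighbor assignment is an assumption
rather than the procedure used in this work, where the distances were measured in PyMOL.

```python
import math


def distance(a, b):
    """Euclidean distance in Å between two (x, y, z) coordinates."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def nearest_water(query, reference_waters):
    """Return (name, shift) for the reference water closest to the query position."""
    return min(
        ((name, distance(query, xyz)) for name, xyz in reference_waters.items()),
        key=lambda item: item[1],
    )


# Hypothetical coordinates for illustration only; not taken from 3EG1.
crystal_waters = {"wat_a": (10.0, 5.0, 2.0), "wat_b": (12.5, 6.0, 3.0)}
calculated_water = (10.6, 5.8, 2.0)

name, shift = nearest_water(calculated_water, crystal_waters)
print(name, round(shift, 2))
```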
Results:
Figure 3-1 shows the manual movement of peptide p41 around the protein. We first moved
the peptide manually a few angstroms up and down from the original peptide position. We also
rotated the torsion angles manually to create a few different binding poses, as shown on the
right in the figure.
Figure 3-1. Movement of peptide p41 around the protein; distances moved are in angstroms (Å)
Table 3-1 shows the number of interface water molecules, the internal energies, and the total
energy for the crystal structure peptide p41 and the four mutants. Because these values are
internal scores of the program, no units are given for the calculated quantities. Among the
five complexes, p40 has the highest number of interface water molecules, the highest entropy
value, and the most negative total energy.
Ligand                     p40            p41            p17            p0             p7
                           (APTYSPPPPP)   (APSYSPPPPP)   (APTYSPPLPP)   (APTYPPPLPP)   (APTYPPPPPP)
Interface water            56             50             52             47             44
Total (el+vdw)
  structure 1 & H2O        -265           -283.5         -169.2         -180.1         -142.6
  structure 2 & H2O        -231.7         -215.8         -243.6         185.1          -188
  structure 1 & 2          -142.8         -142           -143           -138.3         -138.1
Internal water energies    -250.5         -213.2         -230           -189           -185.8
Entropy                    315            280            285            270            245
Total energy               -890           -854.5         -785.8         -322.3         -654.5
∆Gexp (kcal/mol)           -8.72          -7.94          -7.23          N/A            -7.36
Table 3-1. Energies after solvation for original complex (p41) and four mutants (p40, p17, p0, and p7); experimental data are also
included
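As a quick arithmetic check on Table 3-1, the total energy for each ligand appears to equal the
sum of the three interaction terms and the internal water energy. The sketch below uses the
values as printed in the table; the interpretation of the total as this four-term sum is
inferred from the numbers and is not stated explicitly in the text. For p0, the sum reproduces
the reported total only with the positive structure 2 & H2O value (185.1) as printed.

```python
# Values transcribed from Table 3-1 (program-internal scores; no units).
# Structure 1 is the protein and structure 2 is the peptide.
# Order: protein-water, peptide-water, protein-peptide, internal water, reported total.
table_3_1 = {
    "p40": (-265.0, -231.7, -142.8, -250.5, -890.0),
    "p41": (-283.5, -215.8, -142.0, -213.2, -854.5),
    "p17": (-169.2, -243.6, -143.0, -230.0, -785.8),
    "p0": (-180.1, 185.1, -138.3, -189.0, -322.3),
    "p7": (-142.6, -188.0, -138.1, -185.8, -654.5),
}


def summed_total(row):
    """Sum the four energy terms (every entry except the reported total)."""
    return round(sum(row[:4]), 1)


for ligand, row in table_3_1.items():
    print(ligand, summed_total(row), row[4])
```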
Table 3-2 summarizes the energies and ranking of the two most important poses among the
729 poses generated automatically in each mutant and the original peptide (p41) in 3EG1.
Peptide   Rank   Pose   Total      Pro-pep   Entropy
p41       1      680    -1463.8    -138.1    370
          22     364    -993.9     -169.4    315
p40       1      680    -1394.8    -138.5    355
          5      364    -1112.7    -170.1    295
p17       1      680    -1257.6    -138.9    350
          7      364    -1065.9    -170.3    305
p7        1      680    -1103.1    -132.8    340
          13     364    -893.5     -164.1    290
p0        1      680    -1177.2    -133.2    315
          13     364    -897.7     -164.4    280
Table 3-2. Energy and ranking of the two important poses (pose 364 and pose 680) in original complex (p41) and in each mutant (p40, p17, p7 and p0)
For p41 (APSYSPPPPP), the peptide in 3EG1, the top-ranked pose is pose 680. Pose 680 has
the best total energy (-1463.8), whereas pose 364, which is ranked 22nd, has the best
protein-peptide energy (-169.4). For p40 (APTYSPPPPP), in which the serine at position 3 is
changed to threonine, pose 680 is ranked first and pose 364 is ranked fifth. The total energy
of pose 680 is -1394.8 and the protein-peptide energy of pose 364 is -170.1. For p17
(APTYSPPLPP), in which the proline at position 8 in p40 is changed to leucine, pose 680 is
ranked first and pose 364 is ranked seventh. Pose 680 has a total energy of -1257.6 and the
protein-peptide energy of pose 364 is -170.3. For p0 (APTYPPPLPP), in which the serine at
position 5 in p17 is changed to proline, pose 680 is ranked first and pose 364 is ranked
thirteenth. The total energy of pose 680 is -1177.2 and the protein-peptide energy of pose 364
is -164.4. For the last mutation, changing the leucine at position 8 in p0 to proline to yield
p7, pose 680 is still ranked first and pose 364 is ranked thirteenth. The total energy of pose
680 is -1103.1 and the protein-peptide energy of pose 364 is -164.1.
In Figure 3-2, the peptide is labeled in light blue, water molecules in the crystal structure
are labeled in dark purple, and pink water molecules represent the calculated water in
p41 (APSYSPPPPP). The water molecules that we chose to study are labeled in yellow.
Figure 3-2. Top view and side view of crystal structure (3EG1) and p41 (APSYSPPPPP) water molecules comparison; left: top view, right: side view
In Figure 3-3, the water molecules chosen for study are water 43 and water 42 in the
calculated p41 structure, which correspond to water 122 and water 128, respectively, in the
original crystal structure. The double-water bridge is between the nitrogen of Gln114 and the
oxygen of Pro7. The hydrogen-bond length between water 42 and Pro7(O) is 2.65 Å. The
distance between water 43 and water 42 is 2.77 Å, and the distance between water 43 and
Gln114(N) is 2.53 Å. Compared with water 122 in the crystal structure, water 43 is shifted by
0.94 Å, and the distance between water 42 in p41 and water 128 in 3EG1 is 0.67 Å.
Figure 3-3. Close view of one of the double water bridges between Gln114 (N) and Pro7 (O) in p41 (APSYSPPPPP) and 3EG1
In Figure 3-4, the water molecule that we chose to study is water 10 in p41, and the
corresponding water molecule in the crystal structure is water 14. This water molecule forms a
single-water bridge between OE1 of Glu98 and the oxygen of Tyr4. The distance between water 10
and Tyr4(O) is 3.27 Å and the hydrogen-bond length between water 10 and Glu98(OE1) is 2.37 Å.
Compared with the corresponding water 14 in the crystal structure, water 10 in p41 is shifted
by 1.23 Å.
Figure 3-4. Close view of the single water bridge between Glu98 (OE1) and Tyr4 (O) in p41 (APSYSPPPPP) and 3EG1
In Figure 3-5, the peptide is still labeled in light blue. Water molecules in p40 are labeled in
green. Orange was used to label the two types of water molecules that we chose to study. The
change of serine at position 3 to threonine is labeled in light pink.
Figure 3-5. The top view and side view of crystal structure (3EG1) and p40 (APTYSPPPPP) water molecules comparison; left: top view, right: side view
In Figure 3-6, the double water bridge that we chose is between nitrogen of Gln114 and
oxygen of Pro7. The two water molecules in this double water bridge are water 45 and water 46
in p40, and the corresponding waters are water 128 and water 122 in the crystal structure. The
distance between Pro7 (O) and water 45 in p40 is 2.65Å, and the H-bonding distance between
water 45 and water 46 is 2.77Å. The distance between water 46 and Gln114 (N) is 2.53Å.
Compared to the similar water molecule 122 in the crystal structure, water 46 in p40 was shifted
by 0.94Å, and the distance between water 45 in p40 and water 128 in 3EG1 is 0.67Å.
Figure 3-6. Close view of the double water bridge between Gln114 (N) and Pro 7(O) in p40 (APTYSPPPPP) and 3EG1
In Figure 3-7, the water that we chose to study in p40 is water 10. It forms a single-water
bridge between OE1 of Glu98 and the oxygen of Tyr4. The corresponding water in the crystal
structure is water 14. The hydrogen-bond length between water 10 and Glu98 (OE1) is 2.37 Å,
and the distance between Tyr4 (O) and water 10 is 3.27 Å. The distance between water 10 of p40
and water 14 of the crystal structure is 1.23 Å.
Figure 3-7. Close view of the single water bridge between Glu 98(OE1) and Tyr4 (O) in p40 (APTYSPPPPP) and 3EG1
In Figure 3-8, the peptide is labeled in light blue. Water molecules in p17 are labeled in blue.
Orange was used to label the two types of water molecules that were chosen for study. The
changes of serine at position 3 to threonine and proline at position 8 to leucine are labeled in
light pink.
Figure 3-8. The top view and side view of crystal structure (3EG1) and p17 (APTYSPPLPP) water molecules comparison; left: top view, right: side view
In Figure 3-9, the two water molecules that form a double water bridge are water 44 and
water 43 in p17. This double water bridge is between oxygen of Pro7 and nitrogen of Gln114.
The distance between Pro7 (O) and water 43 in p17 is 2.65Å, and the distance between water 43
and water 44 is 2.77Å. The H-bonding length between water 44 and Gln114 (N) of the protein is
2.53Å. Compared to the similar water molecule 128 and water 122 in the crystal structure, water
43 and water 44 in p17 were shifted by 0.67Å and 0.94Å, respectively.
Figure 3-9. Close view of the double water bridge between Gln 114 (N) and Pro 7(O) in p17 (APTYSPPLPP) and 3EG1
In Figure 3-10, the single water bridge is between oxygen of Tyr4 on the peptide, water 10
in p17, and OE1 of Glu98 on the protein. The H-bonding distance between water 10 and Tyr4 (O)
is 3.27Å and the distance between water 10 and Glu98 (OE1) is 2.37Å. Compared to the
corresponding water 14 in the crystal structure, water 10 in p17 was shifted by 1.23 Å.
Figure 3-10. Close view of the single water bridge between Glu 98(OE1) and Tyr 4 (O) in p17 (APTYSPPLPP) and 3EG1
The results described above are also summarized in the following table:
                        Gln114(N)-W-W-Pro7(O) length (Å)    Glu98(OE1)-W-Tyr4(O) length (Å)    Water difference 1 (Å)   Water difference 2 (Å)
Structure   Waters      Gln114(N)-W   W-W     W-Pro7(O)     Water   Glu98(OE1)-W   W-Tyr4(O)    wat122      wat128       wat14
3EG1        122, 128    2.96          2.97    2.70          14      3.19           3.02         0           0            0
p41         43, 42      2.53          2.77    2.65          10      2.37           3.27         0.94        0.67         1.23
p40         46, 45      2.53          2.77    2.65          10      2.37           3.27         0.94        0.67         1.23
p17         44, 43      2.53          2.77    2.65          10      2.37           3.27         0.94        0.67         1.23
p7          38, 37      2.53          2.77    2.65          20      2.65           3.59         0.94        0.67         1.46
p0          38, 37      2.53          2.77    2.65          20      2.65           3.59         0.94        0.67         1.46
In the Waters column, the first water is bonded to Gln114(N) and the second to Pro7(O). The water difference columns give the
distance between each water and the corresponding crystal water (wat122, wat128 or wat14).
In Figure 3-11, water molecules in the crystal structure are labeled in dark purple, water molecules in
p41 are labeled in pink, and water molecules in p40 are labeled in green. Red is used to label the water
molecules that are found only in p41, and deep bluish green is used for the water molecules that are found
only in p40. In Figure 3-12, p41 is shown on the left and the water molecules in p40 are shown on the right.
The water differences between p41 and p40 appear mainly near Pro6 and Pro9 of the peptide, and these are
hydrophobic waters.
Figure 3-11. The top and side view of the crystal structure (3EG1), p41 and p40; left: top view, right: side view
Figure 3-12. Comparison of water difference in p41 and p40; left: p41, right: p40
Figure 3-13. Close view of the crystal structure, p41 and p40
Figure 3-13 is a close view of the water differences among the crystal structure, p41 and
p40. Deep purple is used to label water molecules in the crystal structure, pink is used for
water molecules in p41, and green is used for water molecules in p40. Water molecules that are
found only in p41 are labeled in red, and those that are found only in p40 are labeled in deep
bluish green. In addition to the double-water bridge that we chose to study, there is one more
double-water bridge between Pro7 and Gln114, which is shown in orange. There are also a few
hydrogen-bonding interactions between the Gln114 side chain and Tyr70.
Figure 3-14 depicts the top and side view of water molecules in p40 and p17. The peptide is
labeled in light blue and the leucine at position 8 in p17 is labeled in yellow. Water molecules in
p17 are labeled in green, and red is used to label water molecules in p40. Water molecules that
are only found in p40 are labeled in light pink. Water molecules that are only found in p17 are
labeled in light blue. The two important water molecules that form the double water bridge are
labeled in orange.
Figure 3-14. Top and side views of p40 and p17; left: top view, right: side view
Figure 3-15 shows a closer view of the water differences between p40 (APTYSPPPPP) and
p17 (APTYSPPLPP). The leucine at position 8 in p17 is labeled in yellow, and this change to
leucine alters the water molecules in p17 compared with p40. Moreover, there are two
double-water bridges between Pro7 and Gln114. There are a few hydrogen-bonding
interactions between Gln114 and Tyr70 in both p40 and p17.
Figure 3-15. Close view of p40 and p17
Discussion:
According to the solvation calculation, p40 has the most interface water of all the
complexes. In other words, it traps more water, which allows more interactions. Moreover, p40
has a more negative binding enthalpy than p41 (the peptide in the crystal structure), which
indicates that p40 has better binding affinity. In the comparison between p40 and p17, because
leucine is more flexible than proline, the loss of entropy upon binding to the protein should
be larger in p17 and smaller in p40. However, additional factors appear to be involved in
releasing water in p17, so -T∆S is smaller. Moreover, compared with p40, p17 is entropically
favorable but enthalpically unfavorable. This matches the experimental results.
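One simple way to relate the calculated scores to experiment is to compare rank orders. The
sketch below ranks the four ligands that have an experimental ∆Gexp in Table 3-1 by calculated
total energy and by ∆Gexp, and computes a Spearman rank correlation. This comparison is an
illustrative addition and was not part of the analysis in this thesis. The calculated totals
are internal program scores rather than free energies, and with only four data points the
result is indicative at best.

```python
# Values from Table 3-1; p0 is omitted because no experimental value is listed.
total_energy = {"p40": -890.0, "p41": -854.5, "p17": -785.8, "p7": -654.5}
dg_exp = {"p40": -8.72, "p41": -7.94, "p17": -7.23, "p7": -7.36}


def ranks(values):
    """Rank ligands from most negative (rank 1) to least negative; assumes no ties."""
    ordered = sorted(values, key=values.get)
    return {name: i + 1 for i, name in enumerate(ordered)}


def spearman(x, y):
    """Spearman rank correlation for two dicts with identical keys and no ties."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    d2 = sum((rx[k] - ry[k]) ** 2 for k in rx)
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


print(ranks(total_energy))
print(ranks(dg_exp))
print(spearman(total_energy, dg_exp))
```

In both rankings p40 comes first and p41 second; the two orderings differ only in the
positions of p17 and p7.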
During the calculation process, multiple poses were generated automatically to see if the
WATGEN calculation could reproduce the pose that is most similar to the crystal structure
and has the best energy. Among the poses generated for each mutant and for the calculated p41
(729, or 728 when one pose was rejected because of a clash), pose 680 is always ranked first.
This ranking is based on the overall total energy: the top-ranked pose has the most negative
total energy. In addition, pose 364 has the best protein-peptide binding energy in the mutants
and in the calculated p41. Therefore, pose 364 most closely resembles the crystal structure.
We also wanted to see if the solvation calculation could reproduce the water molecules in
the same complex. Comparison of the water molecules in the crystal structure and the
calculated p41 shows that most of the water molecules are reproduced. However, the result is
not perfect, since some extra water molecules appear. The next step was to choose water
molecules that have important hydrogen bonds for careful study. We picked two types of water
structures: a double-water bridge and a single-water bridge. The double-water bridge forms
between the nitrogen of Gln114 and the oxygen of Pro7. The single-water bridge forms between
OE1 of the Glu98 side chain and the oxygen of Tyr4. For p41, p40 and p17, the water molecules
that form the double-water bridge and the single-water bridge between the protein and peptide
remain at the same location. The positions of these important water molecules differ from those
in the crystal structure by less than 1.5 Å. The water molecules that were picked for study in
p7 and p0 also appear at the same location in both mutants. The double-water bridge in p7 and
p0 is less than 1 Å away from the double-water bridge in the crystal structure. Moreover, the
single-water bridge in p7 and p0 is less than 1.5 Å away from the single-water bridge in the
crystal structure.
In the comparison of the crystal structure, p41 and p40, the only difference in the peptide
is at position 3, which is serine in p41 (the crystal structure) and threonine in p40. This
amino acid change creates only a small change in the water structure of the nearby
environment. However, there are two more water molecules (labeled in bluish green and circled
in yellow in Figure 3-12) near the double-water bridge that we focused on. There are also a
couple more water molecules near Gln114 and the C-terminus in p40 compared with p41. These
extra water molecules form a good network with the Gln114 side chain and Tyr70. In addition,
this water network above the tyrosine could hold the tyrosine in place, so that Pro10 at the
end of the peptide can sit on top of the tyrosine. Therefore, p40 seems to have better
protein-peptide binding than p41.
The next step is to examine the water structures in p40 and p17. These two mutants differ at
position 8: p40 has proline and p17 has leucine. Compared with the crystal structure, both p40
and p17 have the serine at position 3 changed to threonine. The change from proline to leucine
at position 8 in p17 makes the water structures different. The two extra water molecules
labeled in light pink and circled in yellow in Figure 3-14 are near the double-water bridge
(labeled in orange). These two water molecules are found only in p40 and not in p17. They are
probably bulk-like and can exchange constantly with the bulk water; hence, they might not be
that important. Furthermore, in addition to the double-water bridge that forms between the
nitrogen of Gln114 and the oxygen of Pro7, another double-water bridge forms between NE1 of the
Gln114 side chain and Pro7. The Gln114 side chain also forms a good water network with three
extra light pink waters that appear only in p40 in Figure 3-14. This water network connects to
Tyr70 and is able to hold Tyr70 in place. Keeping the tyrosine in place is important for
creating a good binding environment for the peptide. This is exactly the case here, because
Pro10 of the peptide sits on top of the tyrosine very well. In addition, both Tyr70 of the
protein and the neighboring Tyr115 are held in place in p40 but not in p17. Lastly, the water
network in p40 makes the carbonyl group of Asp71 of the protein point up. This carbonyl group
could secure the tyrosine at the end in place and also secure Phe72 on the other side.
Therefore, hydrophobic pockets can be created within the binding groove, which makes the
protein-peptide binding more specific.
IV. REFERENCES
Barillari, C., Taylor, J., Viner, R., and Essex, J.W. (2007). Classification of Water Molecules in
Protein Binding Sites. J. Am. Chem. Soc. 129, 2577-2587.
Biela, A., Betz, M., Heine, A., and Klebe, G. (2012). Water Makes the Difference:
Rearrangement of Water Solvation Layer Triggers Non-additivity of Functional Group
Contributions in Protein-Ligand Binding. ChemMedChem 7, 1423-1434.
Bui, H.H., Schiewe, A.J., and Haworth, I.S. (2007). WATGEN: an algorithm for modeling water
networks at protein-protein interfaces. J. Computational Chemistry 28, 2241-2251.
de Beer, S.B.A., Vermeulen, N.P.E., and Oostenbrink, C. (2010). The Role of Water Molecules in
Computational Drug Design. Current Topics in Medicinal Chemistry 10, 55-66.
Homans, S.W. (2007). Water, water everywhere-except where it matters. Drug Discovery Today
12, 534-539.
Hou, T., Chen, K., McLaughlin, W.A., Lu, B., and Wang, W. (2006). Computational Analysis
and Prediction of the Binding Motif and Protein Interacting Partners of the Abl-SH3
Domain. PLoS Computational Biology 2, 46-55.
Lazaridis, T. and Li, Z. (2007). Water at biomolecular binding interfaces. Phys. Chem. Chem.
Phys. 9, 573-581.
Li, Y., Sutch, B.T., Bui, H.H., Gallaher, T.K., and Haworth, I.S. (2011). Modeling of the Water
Network at Protein-RNA Interfaces. J. Chem. Inf. Model. 51, 1347-1352.
Reichmann, D., Phillip, Y., Carmi, A., and Schreiber, G. (2008). On the Contribution of Water-
Mediated Interactions to Protein-Complex Stability. Biochemistry 47, 1051-1060.
Ruano, A.Z., and Luque, I. (2012). Interfacial water molecules in SH3 interactions: Getting the
full picture on polyproline recognition by protein-protein interaction domains. FEBS Letters 586,
2619-2630.
van Dijk, A.D.J. and Bonvin, A.M.J.J. (2006). Solvated docking: introducing water into the
modeling of biomolecular complexes. Bioinformatics 22, 2340-2347.
van Dijk, M., Visscher, K.M., Kastritis, P.L., and Bonvin, A.M.J.J. (2013). Solvated protein-DNA
docking using HADDOCK. J. Biomol. NMR 56, 51-63.
Abstract
Water is important in the formation of biomolecular complexes because it forms hydrogen bonds with neighboring water molecules, and the resulting hydrogen-bond networks can stabilize the structure. In addition, water molecules at interfaces allow tighter packing of atoms, mediate polar interactions, and contribute to the specificity of binding in protein-protein or protein-peptide complexes. Therefore, inclusion of water molecules in a theoretical model can increase the accuracy of structure prediction for protein-protein or protein-peptide complexes. This study is separated into two chapters. In the first chapter, since water mainly forms hydrogen bonds and interacts with the protein backbone, we asked whether the water structure can be predicted based on a polyglycine backbone and whether this technique can be applied when the binding sequence is unknown. We want to include water molecules at an early stage, before docking of the peptide to the protein. Our hypothesis is that polyglycine backbones can be used for docking and solvation. In the second chapter, we chose one specific Abl-SH3 domain binding complex, 3EG1, because the Abl-SH3 domain is a potential therapeutic target. We compared experimental and computational results to see if computational modeling can reproduce the actual binding position. The results of this study demonstrate how solvation of a polyglycine backbone can be used to predict complex binding positions, and hence could lead to applications in structure-based drug design.
Asset Metadata
Creator
Chen, Yi-Hsun (author)
Core Title
Computer modeling of protein-peptide interface solvation
School
School of Pharmacy
Degree
Master of Science
Degree Program
Pharmaceutical Sciences
Publication Date
11/11/2014
Defense Date
11/11/2014
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
Abl-SH3 domain, computational calculation, computer modeling, OAI-PMH Harvest, protein-peptide complex, solvation, WATGEN
Format
application/pdf (imt)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Haworth, Ian S. (committee chair), Okamoto, Curtis Toshio (committee member), Shen, Wei-Chiang (committee member)
Creator Email
yihsun@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c3-515488
Unique identifier
UC11297576
Identifier
etd-ChenYiHsun-3079.pdf (filename), usctheses-c3-515488 (legacy record id)
Dmrecord
515488
Document Type
Thesis
Rights
Chen, Yi-Hsun
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA