The focus of the
lab has been the coupling of theoretical, computational, and experimental
approaches for the study of structural biology. In particular, we
have placed a major emphasis on developing quantitative methods
for protein design with the goal of developing a fully systematic
design strategy that we call "protein design automation." Our
design approach has been captured in a suite of software programs
called ORBIT (Optimization of Rotamers By Iterative Techniques)
and has been applied to a variety of problems ranging from protein
fold stabilization to enzyme design.
(Descriptions of current projects can be found on the lab members page.
Highlights of a few projects are included below.)
A prominent goal of protein design is the generation of proteins with
novel functions, including the kind of catalytic rate enhancement of
chemical reactions that natural enzymes achieve so efficiently. The ability
to design an enzyme to perform a given chemical reaction has considerable
practical application for industry and medicine. Significant progress
has been made at enhancing the catalytic properties of existing
enzymes; however, the design of proteins with novel catalytic properties
has met with relatively limited success. We have developed and implemented
a general computational approach for the design of enzyme-like proteins
with novel catalytic activities. In addition to the generation of
new catalysts, these methods will allow the exploration of the mechanistic
basis of enzymatic activity.
Recently we have been interested in creating a completely novel
catalyst for the Claisen rearrangement of chorismate to prephenate.
Naturally catalyzed by the chorismate mutases, this reaction offers
many desirable features as an early test of enzyme design methods.
The reaction, a first-order sigmatropic rearrangement of a single
substrate, has neither intermediate steps nor involvement of catalytic
groups such as general acids or bases. The reaction has been extensively
studied in many contexts: as a rare enzyme-catalyzed pericyclic process,
as an essential step in the biosynthesis of aromatic compounds, and
as an example of a reaction that occurs through the same mechanism
enzymatically and in solution. Our method of enzyme design involves
identifying amino acid sequences likely to bind to the transition-state
structure of the chorismate-prephenate rearrangement. As a part of
this process, we are testing the ability of our method to predict
mutations that enhance the activity of the naturally occurring E.
coli chorismate mutase. The computationally designed Ala32Ser mutation
results in an enzyme with measurably enhanced activity.
In protein evolution, genetic mutations are subject to selection
acting on the proteins encoded by the affected genes. Although
many different protein sequences map to each folded structure, the
mechanism by which natural selection generates these varied sequences
remains an open question. It is widely believed that one sequence
may evolve from another through a series of single amino acid mutations
that maintain the overall folded structure at every step. In this
way, each 3D structure is associated with a network of sequences
that are connected to each other by energetically neutral point
mutations. This neutral network hypothesis is backed by experimental
data for folded RNA structures, but direct evidence for the neutral
evolution of proteins remains elusive. A method to find neutral
pathways between sequences that fold to the same structure could
provide information about the evolutionary relationships between
proteins. Furthermore, the determination of neutral trajectories
through mutation space may shed light on the biophysical ramifications
of specific mutations, and might suggest potential improvements
to existing protein design strategies. We have developed a computational
procedure to find energetically favorable pathways between two proteins
that have similar structures and differ by a fixed set of amino acid mutations.
Our program randomly generates amino acid mutations that lead from
one sequence to the other, and evaluates the energies of the resulting
sequences using a fast side chain placement calculation and a physical
force field with continuum solvation. We are currently applying
this procedure to protein G and protein L, two immunoglobulin-binding
proteins displayed on the cell surfaces of certain infectious bacteria
to avoid immune recognition by host organisms. These proteins share
a common fold topology but less than 20% sequence identity. Our
program indicates that there are a large number of potential neutral
trajectories between these proteins. We are expressing several proteins
along one particular trajectory to assess the extent of agreement
between theory and experiment.
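
The pathway search described above can be illustrated with a minimal
Python sketch. This is not the lab's program: `toy_energy` is a
made-up placeholder standing in for the side chain placement and force
field calculation, and the names `random_path`, `path_barrier`, and
`best_path` are hypothetical. The sketch only shows the idea of applying
a fixed set of point mutations in random orders and keeping the ordering
whose worst intermediate is least unfavorable.

```python
import random

HYDROPHOBIC = set("AVILMFWC")

def differing_positions(start, end):
    """Indices at which two equal-length sequences differ."""
    if len(start) != len(end):
        raise ValueError("sequences must have equal length")
    return [i for i, (a, b) in enumerate(zip(start, end)) if a != b]

def random_path(start, end, rng):
    """Apply the fixed set of point mutations in a random order.

    Returns every sequence visited, from `start` to `end` inclusive.
    """
    order = differing_positions(start, end)
    rng.shuffle(order)
    seq = list(start)
    path = [start]
    for i in order:
        seq[i] = end[i]          # one point mutation per step
        path.append("".join(seq))
    return path

def toy_energy(seq):
    """ASSUMPTION: placeholder score, not a physical force field.

    Counts neighbouring residue pairs that mix hydrophobic and
    non-hydrophobic residues.
    """
    return sum((a in HYDROPHOBIC) != (b in HYDROPHOBIC)
               for a, b in zip(seq, seq[1:]))

def path_barrier(path, energy):
    """Highest energy along the path, relative to the first sequence."""
    e0 = energy(path[0])
    return max(energy(s) - e0 for s in path)

def best_path(start, end, energy, n_trials=200, seed=0):
    """Sample random mutation orders and keep the lowest-barrier path."""
    rng = random.Random(seed)
    paths = (random_path(start, end, rng) for _ in range(n_trials))
    return min(paths, key=lambda p: path_barrier(p, energy))
```

Calling `best_path("AVKE", "GVRD", toy_energy)` returns a list of four
sequences, each differing from the previous one by a single residue. A
path whose barrier is close to zero would correspond to an approximately
neutral trajectory under whatever energy function is supplied.
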
Proteins often carry out their functions by interacting with other
components of the cell, and protein-protein association plays a
central role. Proteins can bind directly to their targets, or they
can bind specifically to themselves, forming higher-order structures
that carry out a function. We are interested in learning how proteins utilize their
surface residues to interact with other proteins. We are also curious
about the influence protein backbone geometry has on complex formation.
Previous efforts in designing protein/protein-binding interfaces
have focused on altering binding specificities. These methods fall
short, however, when applied to the design of novel binding sites
due to difficulties in accurately modeling protein backbones. Our
short-term goal is to create novel dimers from monomeric proteins.
We developed a special docking algorithm that positions the member
protein subunits in plausible configurations with respect to each
other using parameters determined from the structures of known protein
complexes. The docking procedure treats the proteins as rigid bodies
and uses the Fourier correlation theorem and the fast Fourier transform
to efficiently search for dimers with the highest interfacial surface
complementarities. Using the docked structures as scaffolds for
protein design and employing hydrophobic surface residues to drive
dimer formation, we demonstrated two successful designs, one heterodimer
and one homodimer, using protein G and engrailed homeodomain, respectively,
as the starting monomeric proteins. The computationally designed
dimers were synthesized and characterized using circular dichroism,
nuclear magnetic resonance, analytical ultracentrifugation, and
X-ray crystallography methods. These results suggest that this strategy
can be used to address the protein recognition problem and is generally
applicable to creating novel binding sites with compatible binding
partners.
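
The Fourier correlation step in the docking procedure can be sketched
as follows. This is a minimal illustration under stated assumptions, not
the lab's docking algorithm: it handles only the translational search for
a single relative orientation, the grids are plain occupancy arrays
rather than a tuned surface-complementarity score, and the function name
`best_translation` is hypothetical. The point is that one forward FFT per
grid and one inverse FFT score every cyclic translation at once.

```python
import numpy as np

def best_translation(receptor, ligand):
    """Score all cyclic translations of `ligand` against `receptor`.

    Both arguments are real 3D grids of identical shape. The score for
    a shift t is sum_x receptor[x] * ligand[x - t], which by the
    correlation theorem equals IFFT(FFT(receptor) * conj(FFT(ligand))).
    Returns the best shift (as a tuple of ints) and its score.
    """
    if receptor.shape != ligand.shape:
        raise ValueError("grids must have the same shape")
    r_hat = np.fft.fftn(receptor)
    l_hat = np.fft.fftn(ligand)
    scores = np.fft.ifftn(r_hat * np.conj(l_hat)).real
    shift = np.unravel_index(np.argmax(scores), scores.shape)
    return tuple(int(s) for s in shift), float(scores[shift])
```

The returned shift is the amount by which the ligand grid would be
rolled (as with `np.roll`) to maximize overlap with the receptor grid.
A rigid-body search would presumably repeat this calculation for each
sampled rotation of one subunit and keep the best-scoring poses.
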
Interactions of the calcium (Ca2+)-binding protein calmodulin (CaM) with calmodulin-dependent
protein kinase II (CaMKII) are central to the Ca2+ signaling pathways
implicated in learning and memory. Ca2+ signals of different magnitude
and duration are sensed by CaM, which can bind up to four Ca2+ ions.
Ca2+ binding to CaM induces a conformational change within the protein
that is essential for recognition and activation of many CaM-regulated
proteins including CaMKII. CaMKII activated by Ca2+/CaM phosphorylates
a number of downstream protein targets in synapses. The binding
of all four Ca2+ ions to CaM is generally believed to be a prerequisite
for CaM-induced activation of CaMKII. However, the observed Ca2+
concentrations during the periods of Ca2+ influx into the postsynaptic
spine are too low to be consistent with this hypothesis. To investigate
whether CaM can activate CaMKII with only two bound Ca2+ ions, we
designed two CaM mutants: one that binds Ca2+ ions only at the C-terminal
domain (NMUTCWT), and one that binds Ca2+ only at the N-terminal
domain (NWTCMUT). In each CaM mutant, the inactivated domain was
designed by stabilizing it in the "closed" Ca2+-free conformation,
while the other domain was kept intact. Ionization mass spectrometry
confirmed the 2:1 Ca2+/CaM stoichiometry for the designed mutants.
NMUTCWT could activate CaMKII at the low Ca2+ concentrations believed
to occur in the postsynaptic density in spines. Our findings indicate
that differential activation of signaling enzymes by partially saturated
CaM may contribute to the sensitivity of synaptic plasticity to the timing
and magnitude of postsynaptic Ca2+ flux, and they suggest the need to
reevaluate the sensitivity of other postsynaptic signaling enzymes
to CaM containing fewer than four bound Ca2+ ions.
Protein design is an exceptionally difficult problem characterized
by unique complications. Necessary restrictions such as a fixed
protein backbone and discrete side-chain conformations (rotamers)
require different considerations of structure-energy relationships
than other fields of protein simulation. This structure-energy relationship
has been a long-standing focus of our research, which strives to
address issues including the identity of the forces that lead to
protein stability and the relative strengths of these forces. Until
now, damped Coulombic potentials as well as empirical surface area
and volume scaling functions have been used to include electrostatic
solvation energy in computational protein design calculations. These
methods have allowed for the successful design of stable proteins
but have been a limiting factor in the rational design of enzymatic
activity and molecular recognition, for which polar and charged
amino acids are key. To bring protein design energy functions up
to date with these challenges, we are investigating more sophisticated
continuum models for electrostatic solvation. Two related obstacles
to improving electrostatic solvation energy functions are the combinatorial
explosion in protein design, which requires energy scores for many
side-chains and pairs of side-chains and therefore a very fast energy
solver, and the need to calculate energies in one-body (single side-chain)
and two-body (pairs of side-chains) terms without any knowledge
of the rest of the structure. We are first investigating fast
perturbation methods for two-body terms, which would make the computationally
lengthy numerical solution of the Poisson-Boltzmann equation feasible for
a large number of side-chain pairs. We are also testing the speed
and accuracy of various analytical Generalized Born methods. Coupled
with strategies for approximating a molecular surface during the
design calculation, both of these approaches allow us to more accurately
describe the energy of a protein's charge distribution in the context
of its molecular geometry and surrounding solvent. Such improvements
in the electrostatic solvation energy model for protein design will
have a significant impact in the areas of enzyme design and molecular
recognition.
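
To make the one-body and two-body decomposition concrete, here is a
minimal Python sketch of an analytical Generalized Born polarization
energy split into self and pair terms. It uses the widely used
pairwise functional form with f_GB = sqrt(r^2 + Ri*Rj*exp(-r^2/(4*Ri*Rj))).
It is an illustration only, not the lab's energy function: the effective
Born radii are simply passed in as known inputs (an assumption, since
obtaining them without knowledge of the rest of the structure is part of
the difficulty described above), and the function names and default
dielectric constants are hypothetical choices.

```python
import math

# Converts e^2/Angstrom to kcal/mol.
COULOMB = 332.06

def f_gb(r, born_i, born_j):
    """Generalized Born effective distance for two charges."""
    rr = born_i * born_j
    return math.sqrt(r * r + rr * math.exp(-r * r / (4.0 * rr)))

def gb_self(q, born, eps_in=1.0, eps_out=80.0):
    """One-body term: Born self-polarization energy of one charge."""
    tau = 1.0 / eps_in - 1.0 / eps_out
    return -0.5 * COULOMB * tau * q * q / born

def gb_pair(qi, qj, r, born_i, born_j, eps_in=1.0, eps_out=80.0):
    """Two-body term: solvent screening of one distinct charge pair."""
    tau = 1.0 / eps_in - 1.0 / eps_out
    return -COULOMB * tau * qi * qj / f_gb(r, born_i, born_j)

def coulomb(qi, qj, r, eps_in=1.0):
    """Direct Coulomb interaction in the low-dielectric interior."""
    return COULOMB * qi * qj / (eps_in * r)
```

At large separation `f_gb` approaches `r`, so `coulomb + gb_pair`
approaches the Coulomb energy screened by the solvent dielectric; at
zero separation with equal radii the pair term reduces to twice the self
term. In a design calculation, terms like these would be tabulated per
side-chain rotamer and per rotamer pair.
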

© California Institute of Technology