The Yale Protein Morphing Server.

The morph server automatically generates 2D and 3D animations of a plausible or semi-plausible pathway between two static conformations of a protein subunit, such as those conventionally solved by X-ray crystallography. We believe these animations and the associated interpolated pathways will become a valuable research and educational tool, allowing the researcher or educator to quickly visualize the chemical transformation of a protein subunit from one conformation into another. With the server, it is easy to determine quickly whether a valid chemical pathway exists between two protein conformations, as in a protein such as calmodulin, or whether, as with diphtheria toxin, the two conformations have no clearly valid chemical pathway and therefore most likely arise from other processes, such as domain swapping.

 

 

Figure 1.

Diagram of our approach. The information flow from databases, through the server, and then back to the databases is broken down into its component steps. Experimental data in the PDB and other databases are converted into a motion entry in the Database of Macromolecular Motions, from which a morph movie is generated and statistics are collected. These results are subsequently stored in the Database of Macromolecular Motions.

 

 

Figures 2a (left), 2b (top right), and 2c (bottom right)

 

Figure 2a (left)

Here, the information flow may be visualized as a series of linked Web pages. Users submit new motions to the server either through the Server Submission Form or through a simplified interface on the Structural Alignment Server’s submission form. The query is processed by the server. If the morph operation is successful, the new morph is added to the Table of Morph Movies (which links off-site URLs as well). This table links to both the morph’s report form (from which the morph may be viewed) and the associated motion entry in the Database of Macromolecular Motions, if the motion has one. A link is also added to the motion’s entry in the Database of Macromolecular Motions, connecting the motion’s report to the report for the morph movie.

Figure 2b (top right)

 

This is an enlargement of the database's main page from Figure 2a. It shows the entry page of the Database of Macromolecular Motions, http://bioinfo.mbb.yale.edu/MolMovDB. Users may jump from this page to entries on specific motions, many of which link to morph movies, or to a table of morphs (Figure 2c).

 

Figure 2c (bottom right)

 

This is an enlargement of the On-line Table of Morphs from Figure 2a: a screen shot of the on-line table of morphs Web page at http://bioinfo.mbb.yale.edu/MolMovDB/morphs. In addition to linking to the Web report page for the morph, each entry links to the corresponding database motion entry (if applicable) and lists the PDB IDs used to generate the morph movie, along with information on the submitting user. This table also references off-site morph URLs, and thus functions as a comprehensive database of protein morphs available on the Internet.

 

 

 

 

Figure 3

 

Putative Hinge Movie

 

A frame from a "hinge movie" of ras protein (PDB ID 4Q21 to 6Q21 morph intermediate frame) showing the putative hinge regions as identified by the server. The server identifies 71:82 and 118:129 as putative hinge regions in the motion, here shown in black.

 

[Figure 4 panel labels: ADH, Recoverin, DNA Pol-β, GroEL, Diphtheria Toxin]

 

 

 

Figure 4

Sample morphs. An automatic morph of alcohol dehydrogenase (key "adh") as produced by our server is shown in the top panel. Alcohol dehydrogenase is a "trivial" case: the motions involved are relatively small but nevertheless dramatic when viewed as a movie. The two panels below ADH show recoverin (1iku -> 1jsa) and DNA polymerase beta, respectively, which are "easy" cases. GroEL (key "groel") is shown as an intermediate case, as its motions are much larger than those in alcohol dehydrogenase. The morph can still be reasonably handled by the server, and is especially dramatic on paper because of the large displacements involved. Diphtheria toxin (key "dt") is a hard or impossible case, because the rearrangement between the conformations does not involve a motion, but rather domain swapping in the crystalline state. The poor quality of the morph gives the researcher an immediate clue that the rearrangement pathway is unlikely to be a pure motion.

The default MultiGif (or Moving Gif) output is produced using a combination of software, including Rasmol [49], Molscript [48], Ghostscript, and a gif to multigif utility, all driven through a Perl script. Additional software renders the molecule into Quicktime and MPEG formats to ensure display in a number of Internet browser environments. A simple HTML and Adobe PDF rendering of the sequence alignment of the residues between conformations is also available. In addition to visual output, the interpolated coordinates can be downloaded either as a PDB NMR format archive or as an archive of PDB frames in the popular Unix Tape Archive (".tar" file) format.

 

 

 


Table 1

Column group labels: Easy, Typical, Large, Impossible

Statistic                                       [Code]                      ADH     Recoverin  DNA Pol-Beta  GroEL         Diphtheria Toxin

Input Structures
Motion ID                                       [ID]                        adh     recvin     polbeta       groel         dt
1st input frame                                 [inputframe0]               8ADH    1IKU       1BPD          1GRL          1DDT
2nd input frame                                 [inputframe1]               6ADH    1JSA       2BPF          1AON          1MDT
Size (Å) (in terms of window for rendering)     [max_x_or_y]                36      41         52            55            39
Number of atoms                                 [natoms]                    2887    1639       2697          3993          4110
Number of residues                              [nresidues]                 374     201        335           548           535

Overall Motion
Overall RMS between first and last frames       [RMSoverall]                2.0     13         8.6           16            20
Rotation (degrees)                              [kappa]                     4.9°    73°        9.9°          62°           62°
Overall translation of centroid (Å)             [translation]               2.1     13         6.1           47            66
X translation (Å)                               [TransX]                    1.1     -0.24      0.94          45            -45
Y translation (Å)                               [TransY]                    -0.95   -9.14      4.1           -2.1          -0.54
Z translation (Å)                               [TransZ]                    1.5     -9.78      -4.4          -10           48

1st Core
Number of Cα atoms in 1st core                  [AlignedCoreCAs]            187     95         160           259           262
RMS of 1st core (Å)                             [AlignedCoreRMS]            0.40    3.0        0.92          1.4           0.37
Max Cα displacement in 1st core (Å)             [MaxCoreDeviation]          0.66    7.6        1.7           4.2           0.60

2nd Core
Number of Cα atoms in 2nd core                  [2ndCoreCAs]                190     94         160           260           260
RMS of 2nd core (Å)                             [2ndCoreRMS]                2.9     18         12            23            29
Max Cα displacement in 2nd core (Å)             [Max2ndCoreDeviation]       7.1     38         28            49            60
RMS of 2nd core (Å) after fitting on 1st core   [2ndCoreRMSpostrefitting]   1.6     11         11            10            18

Hinge
Number of putative hinges detected              [NHinges]                   0       0          0             1             1
X position of 1st hinge (Å) rel. to centroid    [Hinge000X]                 -       -          -             -4.7          -7.2
Y position of 1st hinge (Å) rel. to centroid    [Hinge000Y]                 -       -          -             11            -0.91
Z position of 1st hinge (Å) rel. to centroid    [Hinge000Z]                 -       -          -             3.3           -3.0
1st hinge residue selection                     [Hinge000res]               -       -          -             380:403       352:375
Sequence of 1st putative hinge                  [Hinge000seq]               -       -          -             EVEMKEKKARVEDALHATRAAVEE  INLFQVVHNSYNRPAYSPGHKTQP

Screw Axis
Distance betw. screw axis (x0) & centroid (Å)   [x0ToCentroidDistance]      21      8.4        23            30            39
X displacement of centroid from screw axis (Å)  [x0X]                       -0.16   -0.5       -2.5          17            -20
Y displacement of centroid from screw axis (Å)  [x0Y]                       -5.0    -6.2       -5.2          -16           -24
Z displacement of centroid from screw axis (Å)  [x0Z]                       -20     5.7        -22           19            -24
Distance between screw axis and 1st hinge (Å)   [Hinge000x0dist]            -       -          -             26            45

Torsion Angles
Max phi change (max of abs. degrees, 0°-180°)   [MaxPhi]                    180°    180°       180°          180°          180°
Max psi change                                  [MaxPsi]                    180°    180°       180°          180°          170°
Max alpha change                                [MaxAlpha]                  150°    180°       180°          180°          170°

 

 

Comprehensive statistics for alcohol dehydrogenase, recoverin, DNA polymerase beta, GroEL, and diphtheria toxin as reported by the server. These statistics were automatically generated by the server in the course of morphing alcohol dehydrogenase, recoverin, DNA polymerase beta, the first chain of GroEL, and diphtheria toxin. They are reported here to two significant figures except where exact. A brief explanation of each statistic may be found above. More comprehensive explanations may be found on-line.
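As a quick consistency check on Table 1, the overall centroid translation appears to be the Euclidean length of its X, Y, and Z components, and the screw-axis distance appears to be the length of the (x0X, x0Y, x0Z) displacement. The Python sketch below checks this for the ADH and recoverin columns using the rounded values from the table. The helper `norm3` and the dictionary layout are illustrative assumptions, not part of the server.

```python
import math


def norm3(x, y, z):
    """Euclidean length of a 3-vector."""
    return math.sqrt(x * x + y * y + z * z)


# Values copied from Table 1. They are rounded there, so agreement
# with the reported totals can only be approximate.
table1 = {
    "adh": {
        "trans": (1.1, -0.95, 1.5), "translation": 2.1,
        "x0": (-0.16, -5.0, -20.0), "x0_dist": 21.0,
    },
    "recvin": {
        "trans": (-0.24, -9.14, -9.78), "translation": 13.0,
        "x0": (-0.5, -6.2, 5.7), "x0_dist": 8.4,
    },
}

for key, row in table1.items():
    t = norm3(*row["trans"])
    d = norm3(*row["x0"])
    print(f"{key}: |translation| = {t:.2f} (table: {row['translation']}), "
          f"|x0| = {d:.2f} (table: {row['x0_dist']})")
```

Within rounding, both lengths match the tabulated totals for these two proteins.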

 

Table 2

   

                              Hand-gathered statistics  Automatically collected motion statistics
Value                         Min    Max    Mean        Min    Max    Mean   Median  Stdev
Maximum Cα displacement (Å)   1.5    60     12          0.90   81     23     17      19
Maximum hinge rotation (°)    5      148    24          0.0    150    35     9.5     46

 

 

Comparison of automatically gathered (server-gathered) and manually gathered statistics for maximum Cα displacement and maximum hinge rotation. Despite the sparseness of the manually culled data, the statistics are roughly comparable. Maximum Cα displacement was calculated by first sieve-fitting the protein conformations. The 81 Å motion in the database is due to Oxo-Acid-Lyase (5CTS to 1AJ8 in the PDB). The 12 references reporting maximum rotation in the literature reported a mean maximum rotation of 24°, whereas the server found a mean maximum rotation of 35° over the 176 entries present at the time the table was generated. The mean is, however, skewed by some of the larger motions; the median is much smaller. The maximum value of 150° is due to Oxidoreductase (1FMC -> 1HDC in the PDB). To collect the manual data, we found eleven entries in the Database of Macromolecular Motions citing manually gathered Cα displacement statistics from the literature, and twelve entries giving manually gathered maximum hinge rotations. (Some researchers reported only Cα displacement while others reported only maximum hinge rotation, so these correspond to different sets of proteins.) Automatic collection used a sample of 184 motions for Cα displacement and 176 motions for maximum hinge rotation.
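The effect of a few large motions on the mean can be illustrated with a short Python sketch that computes the same five summary statistics as Table 2. The rotation values are hypothetical and are not server data. The function name `summarize` is an assumption, as is the use of the sample standard deviation, since the text does not say which form the server reports.

```python
import statistics


def summarize(values):
    """Return min, max, mean, median, and sample standard deviation."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample (n - 1) form, an assumption
    }


# Hypothetical maximum hinge rotations in degrees (NOT server data):
# mostly small rotations plus two very large ones.
rotations = [2.0, 4.0, 5.0, 8.0, 9.0, 10.0, 12.0, 90.0, 150.0]

summary = summarize(rotations)
for name, value in summary.items():
    print(f"{name:>6}: {value:.1f}")
```

In this toy sample the two large rotations pull the mean well above the median, the same pattern seen in the automatically collected hinge rotations.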

 

 

Table 3

Statistic on 65 observations          Mean     Minimum   Maximum
Number of residues aligned            250      5         780
Trimmed RMS (Å)                       2.4      0.24      16
Trimmed RMS p-value                   0.041    0.0       0.96
Sequence percent identity             55       7.9       100
Sequence identity p-value             0.23     0.0       1.00
Sequence Smith-Waterman Score         1400     -7400     15000
Structural Similarity Score           4400     97        15000
Structural Similarity Score p-value   0.015    0.0       1.00

 

Morphs in the database were processed to eliminate redundancy (several PDB pairs have multiple morph movies of varying characteristics) and then fed into the Yale Structural Alignment Server (URL: http://bioinfo.mbb.yale.edu/align), which is based on structure alignment [32]. The alignment server was able to structurally align 65 of the 78 non-redundant protein chain pairs. The results for these 65 observations are shown in the table above to two significant figures.

 

On average, successful protein chain comparisons in the database have a sequence identity of 55%, although the server was able to successfully morph proteins with sequence identities as low as 7.9% and as high as 100%. Morphed proteins have a mean trimmed RMS (the RMS after the worst-fitting half of the residues is eliminated) of 2.4 Å, ranging from 0.245 Å to 16.46 Å.
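The trimmed RMS described above can be sketched in Python as follows. This is a minimal illustration that assumes the per-residue deviations between the two superimposed conformations are already known. It simply drops the worst-fitting half and does not refit the structures after trimming, which the alignment server may well do. The function names and sample deviations are hypothetical.

```python
import math


def rms(deviations):
    """Root-mean-square of a list of per-residue deviations (Å)."""
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


def trimmed_rms(deviations, keep_fraction=0.5):
    """RMS over the best-fitting fraction of residues (default: best half)."""
    if not deviations:
        raise ValueError("need at least one deviation")
    n_keep = max(1, math.ceil(len(deviations) * keep_fraction))
    kept = sorted(deviations)[:n_keep]  # smallest deviations fit best
    return rms(kept)


# Hypothetical per-residue Cα deviations in Å (not server data):
# a well-fitting core plus a domain that has moved substantially.
devs = [0.3, 0.4, 0.5, 0.6, 6.0, 8.0, 10.0, 12.0]

print(f"full RMS    = {rms(devs):.2f} Å")
print(f"trimmed RMS = {trimmed_rms(devs):.2f} Å")
```

Because the moving domain is discarded, the trimmed value reflects how well the conserved core superimposes rather than the size of the motion.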

 

The server successfully morphed protein chains whose p-values for each of the three statistics (trimmed RMS, sequence percent identity, and Structural Similarity Score) were near one, suggesting that some protein chain pairs in the database are unlikely to be related either evolutionarily or structurally.

 

 

Table 4

 

Name                    Mean of max   Min of max    Max of max
Maximum Alpha Change    140°          16°           180°
Maximum Phi Change      180°          140°          180°
Maximum Psi Change      150°          23°           180°

 

 

Maximum torsion angle changes are another example of the statistics collected by the server. For this table, maximum alpha, phi, and psi torsion angle changes were computed for 134 protein chain pairs in the database and are reported here to two significant figures. The mean, minimum, and maximum of each statistic were computed for the table above. As expected, a motion can be found for each statistic with a torsion angle change of 180°, the maximum possible. Every motion involves at least one large phi angle change of at least 140°. However, a few morphs have only small psi and alpha torsion angle changes.
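Because torsion angles are periodic, the change between two angles is at most 180°, which is why that value is the ceiling in the table. The Python sketch below shows one way to compute a maximum torsion angle change under that convention. The function names and the sample phi angles are hypothetical, and the server's own implementation may differ.

```python
def torsion_change(a_deg, b_deg):
    """Absolute change between two torsion angles, folded into 0..180 degrees."""
    diff = abs(a_deg - b_deg) % 360.0
    return 360.0 - diff if diff > 180.0 else diff


def max_torsion_change(first, second):
    """Largest per-residue torsion change between two conformations."""
    return max(torsion_change(a, b) for a, b in zip(first, second))


# Hypothetical phi angles in degrees for three residues in each
# conformation (not server data).
phi_first = [-60.0, -120.0, 170.0]
phi_second = [-65.0, 40.0, -170.0]

print(max_torsion_change(phi_first, phi_second))
```

Note that 170° and -170° differ by only 20° once periodicity is taken into account, so the largest change in this example comes from the second residue.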