Summary: In this assignment, students manipulate protein structures through rotations of dihedral bonds. They get a first-hand look at how rotation by dihedrals causes large-scale motions in protein structures and can cause steric clashes in a protein chain.
Figure 1 provides examples of some aminoacids, from long ones such as arginine to bulky ones such as proline, and to the smallest one, glycine. Note that as we have mentioned, while different in their sidechain atoms, all these aminoacids share a common group of atoms that includes N, CA, C, and O. The chain that goes through this shared core is known as the backbone. For the purpose of this module, you will consider as the backbone the chain that goes through N, CA, and C for every aminoacid. Note that there are two backbone dihedrals per aminoacid, with the number of sidechain dihedrals varying depending on the aminoacid. To simplify the rotation by dihedrals, in this assignment you will only manipulate protein structures from which the sidechain atoms have been removed. This means that you will only deal with a backbone chain that goes through the N, CA, and C of every aminoacid.
You should be familiar with how aminoacids connect to each other through the peptide bond. Figure 2 illustrates the peptide bond formed between the C of an aminoacid and the N of the following aminoacid. Consecutive applications of the peptide bond create the backbone chain. Recall that due to its partial double-bond character, the peptide bond is planar (illustrated with the plane in Figure 2) and cannot be rotated. Therefore, in your manipulation of the backbone chain, you do not need to consider the peptide bond for rotation. You are left with two dihedral bonds per aminoacid.
Figure 3 shows four consecutive aminoacids in a polypeptide chain. Note the peptide bonds that connect the aminoacids. You need to understand that rotation of a backbone dihedral will affect the location of all atoms following it due to the accumulation of rotations along the backbone. Therefore, it is necessary for you to define an orientation for the backbone. The easiest orientation is that used for the sequence of a protein, where the backbone is defined as the chain starting at the N atom of the first aminoacid of the sequence and ending at the C atom of the last aminoacid of the sequence. For your convenience, the native conformation supplied to you later in this assignment will list the coordinates of the atoms in the order N, CA, C for every aminoacid. So, reading in order from this file will give you the orientation of the backbone.
Before you proceed to the next section, make sure that you can answer these questions:
Now it is time to actually rotate dihedral bonds. Recall from our discussion of Forward Kinematics that one can define the transformation matrix/matrices to account for a dihedral rotation, depending on the representation method you choose (global or local coordinates, as discussed below). You may also need to compute values such as bond lengths and bond angles. You can compute the bond length between two atoms as the Euclidean distance between them. You can compute the angle between two bonds, the bond angle, as the angle between the two normalized bond vectors, which amounts to taking the arccosine of their dot product. If you need to identify one atom as your anchor atom, with the default backbone orientation, your anchor atom is the N of the very first aminoacid (the very first atom of the chain).
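The bond length and bond angle computations described above can be sketched in Matlab as follows. This is only a minimal illustration: the three atom coordinates are made-up values, not taken from backbone_native.pdb, and the variable names are hypothetical.

```matlab
% Hypothetical coordinates (in Angstroms) for three consecutive backbone atoms.
atom_n  = [0.000 0.000 0.000];   % N
atom_ca = [1.458 0.000 0.000];   % CA
atom_c  = [2.009 1.420 0.000];   % C

% Bond length: Euclidean distance between two bonded atoms.
bond_n_ca = norm(atom_ca - atom_n);
bond_ca_c = norm(atom_c - atom_ca);

% Bond angle at CA: angle between the unit vectors from CA to N and from CA to C.
u = (atom_n - atom_ca) / norm(atom_n - atom_ca);
v = (atom_c - atom_ca) / norm(atom_c - atom_ca);
cos_angle = max(-1, min(1, dot(u, v)));      % clamp to [-1, 1] against round-off
bond_angle_deg = acos(cos_angle) * 180 / pi;
```

Clamping the dot product before calling acos guards against values that fall slightly outside [-1, 1] because of floating-point round-off.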
Now we will put our knowledge into practice. You will manipulate the structure in the pdb file backbone_native.pdb, which is the backbone of the 1COA structure of CI2. You may want to save this pdb file for later visualization in VMD.
Even though you are welcome to use any programming language to perform
rotations, here we provide you with a setup for Matlab. Matlab offers many
matrix operations that you would otherwise have to implement yourself in
languages such as C/C++. We provide some hints/directions for those of
you who choose to implement this assignment in Matlab. The very first step
is to read an ASCII file that contains the cartesian coordinates of the
chain you will manipulate and to save these coordinates in a data structure
that will represent your chain. The simplest data structure at
this point is an array, where positions 3*(i-1)+1, 3*(i-1)+2, and 3*(i-1)+3
contain the x, y, and z coordinates of atom i. Note that Matlab starts counting from 1.
Take a look at the backbone_native.pdb file. Find out what the number of atoms is.
You can read the cartesian coordinates from the backbone_native.crd file. Note that this .crd file does NOT have a dummy line at the beginning, so that it can be read easily with Matlab. You can do
so with the command: cartesian_coordinates =
textread(input_file, '%f'); where input_file is set to the location
where you have stored backbone_native.crd, for
example input_file = '/home/user_name/rest_of_the_path/backbone_native.crd';. You can check how
many coordinates you have read with the command
length(cartesian_coordinates).
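Recent Matlab documentation marks textread as not recommended, so the sketch below reads the same kind of file with fopen and fscanf instead. It is one possible alternative, not the required approach. To stay self-contained it first writes a tiny made-up coordinate file for two atoms; in the assignment you would point input_file at your copy of backbone_native.crd.

```matlab
% Write a tiny hypothetical coordinate file (2 atoms) so the example runs on its own.
input_file = [tempname() '.crd'];
fid = fopen(input_file, 'w');
fprintf(fid, '1.0 2.0 3.0 4.0 5.0 6.0\n');
fclose(fid);

% Read every number in the file into a single column vector.
fid = fopen(input_file, 'r');
if fid == -1
    error('Could not open %s', input_file);
end
cartesian_coordinates = fscanf(fid, '%f');
fclose(fid);

% Sanity check: each atom contributes exactly three values.
num_values = length(cartesian_coordinates);
if mod(num_values, 3) ~= 0
    error('Expected a multiple of 3 values, got %d', num_values);
end

delete(input_file);   % remove the temporary example file
```

Checking that the number of values is a multiple of 3 catches a file that was truncated or that still contains a non-numeric first line.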
The textread command stores all coordinates
read from the backbone_native.crd file in the
cartesian_coordinates array. You could manipulate this array with the
dihedral rotations you will define. However, it might be more convenient for
matrix operations and for clarity to store the cartesian coordinates in a
matrix with N rows and 3 columns. In this way, row i contains
the three cartesian coordinates of atom i. Define the number of atoms in the
backbone with the command N =
length(cartesian_coordinates)/3; . Now you need to declare a matrix
with the right dimensions. Since you actually need to work with homogeneous
coordinates, it might be convenient to declare a matrix with N rows and 4
columns, where the last column contains 1.
You can do so with the command backbone_chain = zeros(N, 4);
which creates a matrix with N rows and 4 columns initialized to all
zeros. You can set the fourth column to 1 with the command backbone_chain(:, 4) = 1;
Now you need to copy the cartesian coordinates from the array into
this matrix. You can do so with the for loop below:
    for i = 1:N
        for j = 1:3
            backbone_chain(i,j) = cartesian_coordinates((i-1) * 3 + j);
        end
    end
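As an alternative to the explicit loop, the same N-by-4 matrix can be built with reshape. The sketch below uses a made-up flat array for three atoms in place of the coordinates read from the file.

```matlab
% Hypothetical flat array holding x, y, z for 3 atoms.
cartesian_coordinates = [1 2 3 4 5 6 7 8 9]';
N = length(cartesian_coordinates) / 3;

% reshape fills column by column, so build a 3-by-N matrix and transpose it
% to obtain one atom per row.
xyz = reshape(cartesian_coordinates, 3, N)';

% Append the homogeneous coordinate (a column of ones).
backbone_chain = [xyz, ones(N, 1)];
```

Both approaches produce the same matrix. The loop makes the index arithmetic explicit, while reshape is shorter and avoids element-by-element assignment.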
The cartesian coordinates you have just read will serve as a basis for performing rotations, according to what method you choose to represent and manipulate the protein, as discussed in class. The following guidelines should help you get started with either method:
For global coordinates, you apply each rotation directly to the cartesian coordinates of the atoms that follow the rotated bond. You can do so by multiplying a transformation matrix trans_mat with backbone_chain(i, :) as in trans_mat * backbone_chain(i, :)'. Note that the colon is very convenient as it gives an entire row or an entire column and that backbone_chain(i, :)' gives you the transpose that is necessary for the multiplication to be carried out.

For local coordinates, you first compute the internal coordinates of the chain from the cartesian coordinates (bonds, angles, dihedrals). Once this is done, you have extracted a representation of the protein in its internal coordinates and can discard the cartesian coordinates, or keep the coordinates array to store the reconstructed protein after performing manipulations. The absolute position/orientation of the protein is not important and can be ignored by assuming that the anchor atom rests at the origin and that the first bond angle and dihedral angle are both zero. Remember that you can now perform rotations by simply adding to or subtracting from the dihedral angles, but to recover the cartesian coordinates once the rotations are done you need to build a chain of homogeneous transformations as discussed in class.

In any case, your transformation matrices need to have dimension 4x4 to work with homogeneous coordinates. You can initialize an all-zero 4x4 matrix by writing trans_mat = zeros(4,4);. Then you can set the elements of this matrix to what they should be for the particular method you use. You can evaluate all the cosines and sines that you need in Matlab. Matlab also offers built-in operations such as the dot product (dot) and the vector norm (norm). Make sure that you understand these operations before you apply them. Recall that you can rotate any bond except the peptide bond, which is rigid.
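As an illustration of the global-coordinates approach, the sketch below rotates all atoms that follow one bond by a given angle about that bond. It builds the 4x4 matrix with Rodrigues' rotation formula, which is one common construction and not necessarily the one presented in class. The five-atom chain, the bond index k, and the angle theta are made-up values.

```matlab
% Hypothetical 5-atom chain in homogeneous coordinates (one atom per row).
backbone_chain = [0.0 0.0 0.0 1;
                  1.5 0.0 0.0 1;
                  2.0 1.4 0.0 1;
                  3.5 1.4 0.5 1;
                  4.0 2.8 0.9 1];
N = size(backbone_chain, 1);

k = 2;              % rotate about the bond from atom k to atom k+1 (assumed)
theta = pi / 6;     % rotation angle in radians (assumed)

p = backbone_chain(k, 1:3)';                 % a point on the rotation axis
u = backbone_chain(k + 1, 1:3)' - p;
u = u / norm(u);                             % unit vector along the bond

% Rodrigues' formula: 3x3 rotation about the unit axis u by angle theta.
K = [    0  -u(3)   u(2);
      u(3)      0  -u(1);
     -u(2)   u(1)      0];
R = eye(3) + sin(theta) * K + (1 - cos(theta)) * (K * K);

% 4x4 homogeneous transformation for a rotation about the line through p:
% x -> R * (x - p) + p
trans_mat = eye(4);
trans_mat(1:3, 1:3) = R;
trans_mat(1:3, 4) = p - R * p;

% Apply the transformation to every atom after the rotated bond.
rotated_chain = backbone_chain;
for i = (k + 2):N
    rotated_chain(i, :) = (trans_mat * backbone_chain(i, :)')';
end
```

Atoms up to and including atom k+1 stay where they are, and every bond length and bond angle is preserved because the downstream atoms move as one rigid body. Checking these invariants is a useful test of your own implementation.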
If you wish to visualize the resulting conformations in VMD, you simply have to produce a file with the transformed coordinates. This file should be in ASCII (text) format and must have an empty/dummy first line, and all atom coordinates following (for example all in a single line, separated by spaces). You can save this file with ".crd" extension. Then, you can use the same "backbone_native.pdb" file, and use VMD's option "Load data into molecule" to load your new coordinates from the .crd file.
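A minimal sketch of writing such a file from Matlab is shown below. The chain, the output location, and the text of the dummy line are all placeholders; only the layout (a dummy first line followed by the coordinates separated by spaces) comes from the description above.

```matlab
% Hypothetical transformed chain: one atom per row, columns x, y, z, 1.
rotated_chain = [0.0 0.0 0.0 1;
                 1.5 0.0 0.0 1;
                 2.0 1.4 0.0 1];

output_file = [tempname() '.crd'];   % hypothetical output location
fid = fopen(output_file, 'w');
if fid == -1
    error('Could not open %s for writing', output_file);
end

% Dummy first line, as required for loading the file into VMD.
fprintf(fid, 'dummy first line\n');

% Transpose so that fprintf walks through x1 y1 z1 x2 y2 z2 ...
xyz = rotated_chain(:, 1:3)';
fprintf(fid, '%.3f ', xyz);
fprintf(fid, '\n');
fclose(fid);
```

fprintf reuses the format string for every element of the matrix in column order, which is why the coordinates are transposed before writing.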
Protein conformations are not only geometric objects but are also characterized by energetic stability. You have seen that there are many empirical ways to compute the potential energy of a conformation. Energy functions such as CHARMM are too involved for the purpose of this assignment. Therefore, let us not worry about all the energetic terms that arise from atomic interactions. Consider only unfavorable interactions due to collisions between atoms, also referred to as steric clashes. Think about a simple energy function that checks for steric clashes. Your function should report high energies for conformations with collisions and low energies for collision-free conformations. You can model each atom as a sphere with a certain radius known as the Van der Waals (VDW) radius. Even though different atoms have different VDW radii, you can assume that all atoms have the same radius of 1.7 Å. To assist you in devising a function that checks for collisions over all pairs of atoms, consider the following:
Again, as with dihedral rotations, you are encouraged to use Matlab. After you have the cartesian coordinates of the manipulated conformation, you can iterate over the atoms and check whether they are in collision with one another. You can do this with a double loop. Assume that the radius of every atom is the same, 1.7 Å. In order to avoid reporting collisions for bonded atoms, you can consider only pairs of atoms that are at least 4 positions apart. That is, check atom i against atom i+4 and the atoms after it for a possible steric clash.
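One possible shape for such a check is sketched below. It rests on two assumptions that you should adjust to your own design: two atoms are counted as clashing when their centers are closer than twice the VDW radius, and "4 positions apart" is read as "at least 4 positions apart". The six-atom chain is a made-up, unrealistic geometry chosen so that the last atom folds back onto the start of the chain.

```matlab
% Hypothetical 6-atom chain (x, y, z per row); the last atom folds back
% onto the beginning of the chain.
coords = [0.0 0.0 0.0;
          1.5 0.0 0.0;
          3.0 0.0 0.0;
          4.5 0.0 0.0;
          6.0 0.0 0.0;
          1.0 1.0 0.0];
N = size(coords, 1);

vdw_radius = 1.7;                 % same radius for every atom, in Angstroms
min_distance = 2 * vdw_radius;    % assumption: spheres clash when they overlap
min_separation = 4;               % skip atoms fewer than 4 positions apart

num_clashes = 0;
for i = 1:(N - min_separation)
    for j = (i + min_separation):N
        d = norm(coords(i, :) - coords(j, :));
        if d < min_distance
            num_clashes = num_clashes + 1;
        end
    end
end

% Simple energy: one unit of penalty per clashing pair.
energy = num_clashes;
```

Counting clashing pairs gives zero for a collision-free conformation and grows with the number of overlaps. A penalty that depends on how deeply two spheres overlap is another option if you want the energy to distinguish mild from severe clashes.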
NOTE: Remember that in order to open a .crd file in VMD (to attach it to a loaded structure), the file has to contain a dummy first line. If your implementation writes out ASCII coordinates, remember to add an empty/dummy line at the beginning before opening it with VMD.
Please follow this list of deliverables closely: