Using DSSP in the "bio3d" R package

By Jochen Voss, on 2010-08-04

The program DSSP determines the secondary structure of a protein, taking the three-dimensional coordinates of its atoms as input. Bio3d is a library for the statistical software package R which makes it easier to analyse protein structure; as part of this, bio3d contains an interface to DSSP (the function dssp). Since I had a bit of trouble using this interface, here are some hints.

1. Get the DSSP program from the DSSP webpage. For academic use, the program is free of charge, but you need to fill in a license agreement.
2. Rename the executable to dssp (mine was called DSSP_MAC.EXE after I had downloaded it).
3. Move the file to the directory where it will live. For my system (and for the examples below) I chose /usr/local/bin/. The name of this directory must not contain white space.
4. Call the dssp function with this directory as the exepath argument. The value of the exepath argument must end in / (on Linux/Unix/macOS) or \ (on Microsoft Windows). Also note that every \ in an R string constant needs to be doubled, e.g. on Microsoft Windows you will probably need to write something like exepath="C:\\path\\to\\file\\".
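The rules for the exepath value can be checked before dssp is called. The following is only a minimal sketch in base R: the helper name check_exepath is made up for this illustration and is not part of bio3d, and it assumes the executable has been renamed to dssp as described above.

```r
# Hypothetical helper (not part of bio3d): validate a directory name
# for use as the exepath argument and add the trailing separator.
# Use sep = "/" on Linux/Unix/macOS and sep = "\\" on Microsoft Windows.
check_exepath <- function(dir, sep = "/") {
  # The directory name must not contain white space.
  if (grepl("[[:space:]]", dir)) {
    stop("directory name contains white space: ", dir)
  }
  # Append the separator unless the name already ends in / or \.
  if (!grepl("[/\\\\]$", dir)) {
    dir <- paste0(dir, sep)
  }
  dir
}

exepath <- check_exepath("/usr/local/bin")
exepath                                 # "/usr/local/bin/"

# Assumption: the executable is called "dssp"; this reports whether
# a file of that name exists in the chosen directory.
file.exists(paste0(exepath, "dssp"))

# Doubled backslashes in an R string constant stand for single ones:
cat("C:\\path\\to\\file\\", "\n")
```

If file.exists returns FALSE, the executable is probably not where exepath points, which is worth ruling out before looking for other problems.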

Example. In R one can now use the following commands.

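library("bio3d")
# a four-character PDB identifier is fetched from the online database
pdb <- read.pdb("12as")
# exepath is the directory containing the dssp executable
x <- dssp(pdb, exepath="/usr/local/bin/")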

Then the secondary structure information of the protein 12AS can be accessed as follows:

> x$helix
$start
  1   2   3   4   5   6   7   8   9   1   2
  5  76 130 170 182 258 277 297 320 271 310

$end
  1   2   3   4   5   6   7   8   9   1   2
 27  83 155 176 193 268 283 305 325 275 312

$length
 1  2  3  4  5  6  7  8  9  1  2
23  8 26  7 12 11  7  9  6  5  3

$chain
 [1] "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A"

This tells us that the first helix reaches from residue 5 to residue 27 (both inclusive).
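That the end points are inclusive can be confirmed from the output itself: each length should equal end - start + 1. The short check below copies the numbers from the output shown above into plain vectors (the variable names are chosen for this illustration), so it runs without bio3d or DSSP.

```r
# Values copied from x$helix as printed above.
helix_start  <- c(5, 76, 130, 170, 182, 258, 277, 297, 320, 271, 310)
helix_end    <- c(27, 83, 155, 176, 193, 268, 283, 305, 325, 275, 312)
helix_length <- c(23, 8, 26, 7, 12, 11, 7, 9, 6, 5, 3)

# With inclusive end points, a helix covers end - start + 1 residues.
computed <- helix_end - helix_start + 1
computed[1]                    # 23, the first helix (residues 5 to 27)
all(computed == helix_length)  # TRUE
```

With the actual object, the same comparison would be x$helix$end - x$helix$start + 1 against x$helix$length.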