The 3DNA software package is a popular and versatile bioinformatics tool with capabilities to analyze, construct, and visualize three-dimensional nucleic acid structures. This article presents detailed protocols for a subset of new and popular features available in 3DNA, applicable to both individual structures and ensembles of related structures. Protocol 1 lists the instructions needed to download and install the software. This is followed, in Protocol 2, by the analysis of a nucleic acid structure, including the assignment of base pairs and the determination of rigid-body parameters that describe the structure and, in Protocol 3, by a description of the reconstruction of an atomic model of a structure from its rigid-body parameters. The most recent version of 3DNA, version 2.1, has new features for the analysis and manipulation of ensembles of structures, such as those deduced from nuclear magnetic resonance (NMR) measurements and molecular dynamics (MD) simulations; these features are presented in Protocols 4 and 5. In addition to the 3DNA stand-alone software package, the w3DNA web server, located at http://w3dna.rutgers.edu, provides a user-friendly interface to selected features of the software. Protocol 6 demonstrates a novel feature of the site for building models of long DNA molecules decorated with bound proteins at user-specified locations.
Understanding the three-dimensional structures of DNA, RNA, and their complexes with proteins, drugs, and other ligands, is crucial for deciphering their diverse biological functions, and for allowing the rational design of therapeutics. Exploration of such structures entails three separate, yet closely related components: analysis (to extract patterns in shapes and interactions), modeling (to assess energetics and molecular dynamics), and visualization. Structural analysis and model building are essentially two sides of the same coin, and visualization complements both of them.
The 3DNA suite of computer programs is an increasingly popular structural bioinformatics toolkit with capabilities to analyze, construct, and visualize three-dimensional nucleic acid structures. Earlier publications outlined the capabilities of the software1, provided recipes to perform selected tasks2, introduced the web-based interface to popular features of the software3, presented databases of structural features collected using 3DNA4, 5 and illustrated the utility of the software in the analysis of both DNA and RNA structures6, 7.
The goal of this article is to bring the 3DNA software kit to laboratory scientists and others with interests and/or needs to investigate DNA and RNA spatial organization with state-of-the-art computational tools. The protocols presented here include step-by-step instructions (i) to download and install the software on a Mac OS X system, (ii-iii) to analyze and modify DNA structures at the level of the constituent base-pair steps, (iv-v) to analyze and align sets of related DNA structures, and (vi) to construct models of protein-decorated DNA chains with the user-friendly w3DNA web interface. The software has the capability to analyze individual structures solved using X-ray crystallographic methods as well as large ensembles of structures determined with nuclear magnetic resonance (NMR) methods or generated by computer-simulation techniques.
The structures examined here include (i) the high-resolution crystal structure of DNA bound to the Hbb protein from Borrelia burgdorferi8 (the tick-borne bacterium that causes Lyme disease in humans9, 10), (ii) two large sets of sequentially related DNA molecules produced with molecular simulations11 – 4,500 snapshots of d(GGCAAAATTTTGCC)2 and d(CCGTTTTAAAACGG)2 collected at 100-psec increments during the calculations, and (iii) a small ensemble of NMR-based structures of the O3 DNA operator bound to the headpieces of the Escherichia coli Lac repressor protein12. The instructions below include information on how to access the files of atomic coordinates associated with each of these structures as well as how to use 3DNA (a copy of this file is found on the 3DNA forum at http://forum.x3dna.org/jove) to examine and modify these structures.
1. Installation of the Software Package
2. Analysis of a Crystal Structure
3. Construction of a DNA Structure from Rigid-body Parameters
4. Analysis of Multi-model Structure Files
5. Superposition of Multi-model Structures onto a Common Reference Frame
6. Construction of a Protein-decorated DNA Molecule
The 3DNA software tools are routinely used to analyze nucleic acid structures. For example, the identities of base pairs and the rigid-body parameters that characterize the arrangements of bases in double-helical fragments of DNA and RNA structures are automatically computed and stored for each new entry in the Nucleic Acid Database22, a worldwide repository of nucleic acid structural information. The values of the rigid-body parameters determined with Protocol 2 readily reveal distortions in three-dimensional structure, such as the two sites of extreme DNA bending into the major groove, with large positive roll angles (64.95° and 60.93°), found at AT·AT steps 13 and 22 in the crystal complex with the Borrelia burgdorferi Hbb protein8 (Figure 1).
The capability of the software to rebuild structures from these quantities with Protocol 3 makes it possible to determine how individual base and base-pair steps contribute to the overall molecular fold. As illustrated in Figure 2, the global bending of DNA induced by Hbb reflects more than the two extreme roll distortions noted above. That is, the DNA remains highly curved when reconstructed with these base-pair steps straightened, i.e. with null roll angles at the two sites. The same technique has previously revealed the contributions of specific base-pair steps and deformations to the superhelical pitch of the DNA wrapped on the surface of the nucleosome core particle6 and to the width of the minor groove of the DNA bound to the bacterial nucleoid-associated protein Fis23.
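The straightening experiment described above amounts to a small edit of the step-parameter table before the model is rebuilt. The following is a minimal Python sketch of that edit, assuming the six step parameters have already been read into memory as one dictionary per base-pair step. The `zero_roll` helper, the dictionary layout, the chain length, and every value other than the two roll angles quoted in the text are illustrative assumptions and are not part of 3DNA.

```python
from copy import deepcopy

STEP_KEYS = ("shift", "slide", "rise", "tilt", "roll", "twist")
N_STEPS = 30  # hypothetical chain length, chosen only so that step 22 exists


def zero_roll(steps, step_numbers):
    """Return a copy of `steps` with roll set to 0.0 at the given 1-based step numbers."""
    edited = deepcopy(steps)
    for n in step_numbers:
        edited[n - 1]["roll"] = 0.0
    return edited


# Placeholder table: generic, identical steps (hypothetical values), except for
# the two roll angles reported in the text for steps 13 and 22.
steps = [dict(zip(STEP_KEYS, (0.0, 0.0, 3.4, 0.0, 0.0, 36.0))) for _ in range(N_STEPS)]
steps[12]["roll"] = 64.95
steps[21]["roll"] = 60.93

# Straighten the two sites; the original table is left unchanged.
straightened = zero_roll(steps, [13, 22])
```

Working on a copy keeps the original parameters available, so that the native and modified models can be rebuilt and compared side by side, as in Figure 2.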
The new capability in 3DNA, described in Protocol 4, to examine large numbers of related structures makes it possible to extract both sequence- and time-dependent patterns in the spatial arrangements of simulated DNA and RNA molecules. For example, the (yellow) color-coding of the roll angles between successive base pairs in two large sets of simulated DNA structures11 reveals the preferential bending of these molecules at pyrimidine-purine base-pair steps (Figure 3). The higher values of roll, depicted in red, that persist for short periods at the ends of the DNA are suggestive of localized melting and reannealing of the double-helical structure. The variational patterns of other rigid-body parameters, such as the angles and distances between complementary bases, can help to decipher the precise structural distortions.
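The color-coding behind a mosaic image of this kind can be sketched in a few lines of Python. The example below assumes the roll angles have already been extracted into one row per snapshot and uses the [-5°, 20°] range given in the caption to Figure 3. The function names and the two-color linear ramp are assumptions made for illustration; the published figure uses a richer blue-to-red scale that passes through yellow.

```python
def roll_to_rgb(roll, lo=-5.0, hi=20.0):
    """Map a roll angle (degrees) to an (r, g, b) tuple: blue at `lo`, red at `hi`."""
    t = (roll - lo) / (hi - lo)
    t = min(1.0, max(0.0, t))  # clip values that fall outside [lo, hi]
    return (t, 0.0, 1.0 - t)


def mosaic(rolls_by_snapshot):
    """Convert rows of roll angles (one row per snapshot) into rows of RGB tuples."""
    return [[roll_to_rgb(value) for value in row] for row in rolls_by_snapshot]
```

Clipping matters here: an occasional very large roll at a fraying end would otherwise compress the color scale and hide the smaller sequence-dependent differences in the interior of the duplex.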
The capability of the 3DNA software, presented in Protocol 5, to reorient related molecules in a common reference frame reveals features of overall structure hidden in many of the files stored in the PDB. For example, the conventional alignment of related structures on the basis of a root-mean-square fit of corresponding atoms, here the ten NMR-based models of the O3 DNA operator bound to the headpieces of the Lac repressor protein12, produces a series of similar spatial pathways that roughly superimpose upon one another (Figure 4 left). The superposition of the same structures on a common coordinate frame on the 5′-terminal base pair of each duplex reveals sizable distortions of global structure, in which the molecules flex in appreciably different directions (Figure 4 right). The structural variability may influence the ease with which the Escherichia coli Lac repressor protein binds O3 and induces a loop between O3 and sequentially distant operators in the lac operon24.
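Placing a model in the reference frame of a chosen base pair is a rigid-body change of coordinates: subtract the frame origin, then project onto the frame axes. The Python sketch below shows only that transformation and assumes the origin and the three orthonormal axes of the base-pair frame are already known. The `to_frame` function is a hypothetical helper, not the 3DNA implementation, which also has to derive the frame from the atomic coordinates.

```python
def to_frame(points, origin, axes):
    """Express `points` in the frame defined by `origin` and orthonormal `axes`.

    `axes` holds the x-, y-, and z-axis unit vectors of the frame (one per row),
    written in the original coordinate system.
    """
    transformed = []
    for p in points:
        d = [p[i] - origin[i] for i in range(3)]  # shift to the frame origin
        # Project the shifted point onto each frame axis in turn.
        transformed.append(tuple(sum(a[i] * d[i] for i in range(3)) for a in axes))
    return transformed
```

Applying the same function to every model in an ensemble, each with the frame of its own 5′-terminal base pair, places all of the models on a shared origin and orientation, so that differences in the downstream paths of the duplexes become directly comparable.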
The steps outlined in Protocol 6, for building models of long DNA fragments decorated at arbitrary sites with proteins and other ligands, add a new perspective on the organization of large macromolecular assemblies. Such models help to understand how the multi-molecular complexes interact during biological processing. As illustrated in Figure 6, the precise placement of an architectural protein like Hbb can have a dramatic effect on the overall folding of DNA. If two copies of the known high-resolution structure8 are separated by 43 base pairs, an 81-base-pair DNA fragment folds into a tight, nearly closed configuration. If the two proteins are separated by an additional five base pairs, the DNA follows an open, meandering pathway. The very different arrangements of the protein-decorated duplex show how the spacing of architectural proteins can affect the cyclization or looping of DNA25, 26.
Figure 1. Variation of the roll angle between successive base pairs (see inset for visual depiction) along the DNA chain bound to the Hbb protein from Borrelia burgdorferi8. Values obtained using the ‘analyze’ command of 3DNA and the structural data described in Protocol 2. Note the extreme values of roll at the AT steps that bracket the central third of the structure.
Figure 2. Approximate atomic models of DNA constructed using the ‘rebuild’ function of 3DNA and rendered in PyMOL with color-coded atoms (C-cyan; N-blue; O-red; P-gold). Models based on (left) the rigid-body step parameters of the DNA bound to the Hbb protein and (right) a modified set of step parameters, where the two largest values of roll have been set to zero. See Protocol 3 for step-by-step instructions. Note the unfolding of DNA induced by the imposed changes in roll.
Figure 3. Mosaic images of the roll angles along the DNA in two sets of simulated structures11. Values of roll, extracted using Protocol 4, are color-coded from blue to red over the range [-5°, 20°]. Note the reverse order of bases and large values of roll, highlighted by yellow/red columns, which occur at pyrimidine-purine steps in the two 14 base pair self-complementary sequences.
Figure 4. Cartoon images of the DNA models found in the NMR-derived structures of the O3 DNA operator with the Lac repressor protein headpieces12, illustrative of the capabilities of the ‘x3dna_ensemble reorient’ command. Images rendered in PyMOL (backbones shown as gold tubes and bases as blue sticks) and aligned using (left) the coordinates in the PDB entry (2kek) and (right) the structural superposition presented in Protocol 5. Note the large differences among the structures when placed in a common reference frame on the 5′-terminal base pair.
Figure 5. Screen shot from the w3DNA web server illustrating specifications of the DNA sequence (Label 1), the helical form of unbound DNA (Label 2), the positions and identities of proteins (Label 3), and the preview image check box (Label 4) described in Protocol 6.
Figure 6. Approximate atomic models of two Hbb proteins8 bound to a long DNA fragment. Structures created with the w3DNA web server as described in Protocol 6 and rendered in PyMOL. The protein chains are shown as violet ribbons while the DNA is color-coded by atom type (C-cyan; N-blue; O-red; P-gold). The central base pair of each protein-binding site is set at positions (left) 20 and 62 along the 81 base-pair DNA chain and (right) 20 and 67 along the 86 base-pair DNA chain. Note the major change in the folding of the structures associated with the increased (five base-pair) displacement of the two proteins.
The protocols presented in this article touch upon only a few of the capabilities of the 3DNA suite of programs. The tools can be applied to RNA structures to identify non-canonical base pairs, to determine the secondary structural contexts in which such pairing occurs, to quantify the spatial disposition of helical fragments, to measure the overlap of bases along the chain backbone, etc. The rebuild command allows the user to construct simple and informative block representations of the bases and base pairs like that shown in the inset to Figure 1. The building tools also include features to ‘thread’ different sequences on a given structural template, to generate models of numerous double-, triple-, and four-stranded DNA structures, to orient models in a specific direction, etc. Finally, 3DNA has played a significant role in a number of other projects, such as: the SwS solvation web service for nucleic acids27; the ARTS web server for aligning RNA tertiary structures28; the MDDNA web-based tool for the analysis of molecular dynamics results and structure prediction29; the HADDOCK information-driven protein-DNA docking method30; the SARA server for function annotation of RNA structures31; the 3D-DART DNA structure modeling server32; the 3D-footprint database for the structural analysis of protein-DNA complexes33; the RNA FRABASE 2.0 database for the identification of three-dimensional fragments within RNA structures34; and the SETTER web server for the pairwise comparison of RNA structures35. To the best of our knowledge, there is currently no other nucleic-acid structure software package with a comparably broad combination of features and the robust performance record of 3DNA.
The authors have nothing to disclose.
We are grateful to Jiří Šponer for sharing the coordinates of DNA double helices generated in molecular dynamics simulations. We also acknowledge Nada Spackova for assistance in downloading these structures. Support of this work through USPHS Research Grants GM34809 and GM096889 is gratefully acknowledged.