2007 PDB News
Earlier news is available and is archived in the RCSB PDB newsletters.
25-December-2007
Happy Holidays from the RCSB PDB
The RCSB PDB staff wish to extend our best wishes to the community for a happy holiday season and a wonderful new year!
Position Available for Senior Scientist/Scientific Software Developer
Available Positions with the RCSB PDB include an opening at the University of California, San Diego for a Senior Scientist/Scientific Software Developer.
18-December-2007
ADITBeta Available for Testing
A new version of ADIT, which will ensure that data in the PDB are even more accurate and consistent, is available for testing at http://deposit-beta.rcsb.org/adit/.
The RCSB PDB asks that depositors use ADITBeta to deposit their structures and provide us with any feedback at [email protected].
The following features have been added in this version:
- Format checking
  - ADITBeta will indicate any format errors and provide suggestions for solving them
- Geometry and stereochemistry checking
  - Deposited structures will be automatically validated
- Sequence information
  - ADITBeta will check for consistency between sequence and coordinates
  - The organization of sequence information (e.g., expression tags, mutations) has been improved
- Author and Title information
  - Entering author, title, and citation information is easier in ADITBeta
This version of the tool will become the default version of ADIT early in 2008.
Position Available for Lead Web Architect
Available Positions with the RCSB PDB include an opening at the University of California, San Diego for a Lead Web Architect.
11-December-2007
RCSB PDB Poster Prize Awarded at AsCA
Thanks to everyone who participated in the RCSB PDB Poster Prize competition at the 8th Conference of the Asian Crystallographic Association (AsCA) from November 4-7, 2007 (Taipei, Taiwan).
The RCSB PDB Poster Prize is awarded to the best student poster related to macromolecular crystallography. At AsCA, the judges interviewed the finalists for the prize, and considered the engagement of the student in the work and their understanding of it; the clarity of the presentation in terms of the hypothesis being tested; the appropriateness of the approach; and the justification of the conclusions drawn in terms of the data presented.
The award went to Serah Kimani for "Why do nitrilases need to form helices to be active?" (Trevor Sewell, Serah Kimani (University of Cape Town), and Muhammed Sayed (University of the Western Cape, South Africa)).
Judges: Mitchell Guss (University of Sydney), Sine Larsen (European Synchrotron Radiation Facility), and Mike Lawrence (Walter and Eliza Hall Institute of Medical Research).
Poster Prize Chairman: Jill Trewhella (University of Sydney)
Special thanks to the AsCA organizers and the Program Committee Chairman Se Won Suh for their assistance with organizing the prize.
Congratulations to all of the award winners in 2007.
04-December-2007
Announcement: Experimental Data Will Be Required for Depositions Starting February 1, 2008
Effective February 1, 2008, structure factor amplitudes/intensities (for crystal structures) and restraints (for NMR structures) will be mandatory for PDB deposition.
These data must be deposited at a member site of the Worldwide Protein Data Bank (www.wwpdb.org): RCSB PDB (www.pdb.org), MSD-EBI (www.ebi.ac.uk/msd), PDBj (www.pdbj.org), or BMRB (www.bmrb.wisc.edu).
Data can be released as soon as they have been processed and approved. There is a one-year limit on the length of time a structure and its experimental data can be put on hold, including structures that are on hold until the associated paper is published (HPUB).
This policy was developed as a result of comments and recommendations from the PDB user community, including the Commission on Biological Macromolecules of the International Union of Crystallography and the NMR Task Force, and has been endorsed by the wwPDB Advisory Committee.
Questions relating to depositions should be sent to [email protected].
27-November-2007
RCSB PDB Focus: Sorting Search Results
Following a search that produces multiple entries, the result set can be sorted by choosing 'Sort Results' from the menu on the left-hand side of the page.
For most searches, the sorting options include: PDB ID, Release Date, Residue Count, Resolution and Rank (useful with keyword searches).
An Advanced Search by sequence (Advanced Search>>Sequence Features>>Sequence (Blast/Fasta)) allows the user to sort results by PDB ID, formula weight and E value.
20-November-2007
RCSB PDB Flyers Available in Print and Online
The News & Publications page offers links to various RCSB PDB publications, including newsletters, annual reports, and brochures.
These informational brochures describe different RCSB PDB features, including the Sea of Genes exhibit at the Birch Aquarium (Scripps) that explores proteins related to underwater creatures.
5 Easy Steps for Structure Deposition describes the tools that facilitate NMR and X-ray crystal structure deposition and validation for use by the authors of the structures.
A General Information trifold provides an overview of the RCSB PDB project, and includes information about data deposition, data query and reporting, Molecule of the Month, structural genomics, wwPDB, and outreach and education resources.
All of these materials can be downloaded from the RCSB PDB site. To receive printed copies of these flyers, please send your postal address and brochure request to [email protected]. Requests can be made for multiple copies.
If you are interested in receiving the upcoming 2007 Annual Report, or would like to subscribe to our quarterly newsletter, please send your postal address to [email protected].
13-November-2007
Web Survey: RCSB PDB Educational Resources
The RCSB PDB is looking for feedback about the educational resources available from our website. We would also like to know the types of educational activities and resources that are of interest to our users.
We've created a short online survey that should only take a few minutes to answer.
We greatly appreciate your participation in this survey. As a token of appreciation, we'll send temporary tattoos of tRNA to survey respondents who send their postal address to [email protected].
06-November-2007
New Release of pdb_extract Deposition Tool
pdb_extract is a program that minimizes errors and saves time during the deposition process by extracting key details from the output files produced by many X-ray crystallographic and NMR applications for use in the deposition process. The program merges these data into macromolecular Crystallographic Information File (mmCIF) data files that can be used with ADIT to perform validation and to add any additional information for PDB deposition.
Version V3.004 of pdb_extract has been released. It provides:
- Support for several new programs, for a total of 34 programs/packages with hundreds of different formats
- Improved usability, with added functions and additional error and warning messages
- Data files that follow the PDB Exchange Dictionary (PDBx) v1.045 and the Protein Data Bank Contents Guide Version 3.1.
Complete details are available in the release notes at http://sw-tools.rcsb.org/apps/PDB_EXTRACT/latestrelease-v3.004.html.
pdb_extract can be used through a web interface or as a downloadable workstation version, both available from pdb-extract.rcsb.org.
30-October-2007
Southern California Wildfires and PDB Disaster Preparedness
As many of you know, extensive wildfires caused widespread destruction throughout much of Southern California. More than 500,000 people were evacuated in San Diego County, including many RCSB PDB staff members. The RCSB PDB website (www.pdb.org) and the PDB FTP archive (ftp.wwpdb.org) are hosted at the San Diego Supercomputer Center (www.sdsc.edu) on the University of California, San Diego (www.ucsd.edu) campus. Essential PDB operations continued uninterrupted at the height of the firestorms, including the weekly update on Tuesday, October 23, thanks to the RCSB PDB staff, the dedication of many people at UCSD, and the power of the Internet.
Failover sites are maintained at the Skaggs School of Pharmacy and Pharmaceutical Sciences (pharmacy.ucsd.edu), also on the UCSD campus; and at Rutgers, the State University of New Jersey (www.rutgers.edu). Availability of services at the primary site, and, if necessary, transfer of services to the failover sites, are automatically monitored and enacted by an outside DNS service. The PDB archive is also mirrored by wwPDB members MSD-EBI and PDBj.
The RCSB PDB continues to review and enhance its plans for disaster preparedness and recovery. Questions and comments may be sent to [email protected].
Fall 2007 RCSB PDB Newsletter Published
The latest RCSB PDB Newsletter has been published in HTML and PDF formats.
This issue's "Message from the RCSB PDB" looks back at the many different RCSB PDB- and wwPDB-related meetings held this past September. The newsletter also describes the "5 Easy Steps for Data Deposition with ADIT" that were presented at this past summer's ACA meeting.
The new features and enhancements added to the RCSB PDB website and database are reviewed.
This quarter's Education Corner by Melissa Kosinski-Collins (Brandeis University) explores the Java program StarBiochem and how it can be used independently by students to view the structure and function of proteins and nucleic acids.
The Community Focus interview speaks with RCSB PDB Co-Director Phil Bourne about his involvement with the computational, systems biology, and educational communities.
If you would like to receive a printed version of the RCSB PDB quarterly newsletter, please send your postal address to [email protected].
Subscription information for the plain text electronic version is also available.
23-October-2007
Interview with Helen M. Berman: RCSB PDB Paper Cited More Than 5,000 Times
According to Essential Science Indicators℠, the primary reference for the RCSB PDB is ranked #4 in the top cited Biology and Biochemistry papers of the past ten years. "The Protein Data Bank", published in the 2000 Database Issue of Nucleic Acids Research, has been cited more than 5,000 times.
Director Helen M. Berman discusses this paper in an interview with in-cites magazine at www.in-cites.com/papers/HelenBerman.html.
16-October-2007
Automated Downloads of PDB Data from ftp://ftp.wwpdb.org
The PDB archive at ftp://ftp.wwpdb.org provides coordinate data (in PDB, mmCIF, and PDBML/XML formats) and experimental data. New features and improved query functionality on the RCSB PDB website reflect the data enhancements that resulted from the wwPDB Remediation Project.
A web interface offers a way to download multiple data files from the archive.
Scripts are also available to assist in the automated download of data from the ftp site:
- ftp://snapshots.rcsb.org/rsyncSnapshots.sh
  Makes a local copy of an annual snapshot or sections of the snapshot. This script is annotated to assist in downloading only sections of the archive.
- ftp://ftp.wwpdb.org/pub/pdb/software/rsyncPDB.sh
  Copies the current contents of the entire archive.
Additional information is available from ftp://ftp.wwpdb.org/pub/pdb/README about:
- Downloading a single file via ftp
- Downloading the entire archive via rsync
- Downloading all files in a given format (PDB, CIF, XML) via rsync
- Downloading the entire archive via ftp
- Downloading all files in a given format (PDB, CIF, XML) via ftp using tar balls
Questions and comments about downloading data should be sent to [email protected].
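For users scripting single-file downloads, the following minimal Python sketch builds the FTP URL for one entry. It assumes the archive's "divided" directory layout, in which files are grouped by the middle two characters of the PDB ID. That layout, the file name templates, and the helper name entry_url are assumptions not stated in this news item, so please check them against the README before relying on them.

```python
# Assumed layout (verify against ftp://ftp.wwpdb.org/pub/pdb/README):
#   <root>/<format directory>/<middle two characters of the ID>/<file name>
ARCHIVE_ROOT = "ftp://ftp.wwpdb.org/pub/pdb/data/structures/divided"

# Format name -> (directory, file name template). These are assumptions
# made for illustration, not values taken from the news item.
FORMATS = {
    "pdb": ("pdb", "pdb{id}.ent.gz"),
    "cif": ("mmCIF", "{id}.cif.gz"),
    "xml": ("XML", "{id}.xml.gz"),
}


def entry_url(pdb_id: str, fmt: str = "pdb") -> str:
    """Return the assumed FTP URL of one entry in the requested format."""
    pdb_id = pdb_id.strip().lower()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt!r}")
    directory, template = FORMATS[fmt]
    # The subdirectory is named after characters 2 and 3 of the ID.
    return f"{ARCHIVE_ROOT}/{directory}/{pdb_id[1:3]}/{template.format(id=pdb_id)}"


if __name__ == "__main__":
    print(entry_url("4HHB"))
    print(entry_url("4HHB", "cif"))
```

The resulting URL can be passed to any FTP-capable download tool. For bulk copies, the rsync scripts listed above remain the better starting point.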
09-October-2007
PDB archive at ftp://ftp.wwpdb.org
As previously announced, the PDB archive has been moved to ftp://ftp.wwpdb.org. Updated weekly, this location serves the files from the wwPDB Remediation Project and all newly released files. In September, approximately 14.7 million files were downloaded from ftp.wwpdb.org.
During the same period, 2.7 million files were downloaded from the snapshot of unremediated data at ftp.rcsb.org. This site is no longer being updated.
Users are strongly encouraged to update any automatic scripts or bookmarks to ftp://ftp.wwpdb.org.
The transition to ftp://ftp.wwpdb.org was designed so that users' private copies of the ftp archive would not be overwritten. Reminders about this transition have been sent to depositors and help desk correspondents. In some cases, users performing bulk downloads from the old site were contacted individually.
The RCSB Protein Data Bank would like to thank its users for their attention and cooperation during this transition. Questions and comments should be sent to [email protected].
02-October-2007
Structure Deposition Checklist
It is recommended to have the following items on hand when depositing a structure:
- Contact author's name (including PI), e-mail address, postal address, phone and fax numbers
- Title of the deposited structure and any relevant keywords
- Citation information: author names, title, and journal details if these are available
- Macromolecule names
- Biological assembly information
- Ligand names and chemical diagrams
- Sequence and chain ID for each macromolecule, including His tags or cloning artifacts that were not cleaved and any residues not visible due to disorder
- Source information: scientific names for source organisms, expression systems, or details about synthetically produced molecules
More detailed checklists specific to X-ray, NMR, and electron microscopy (EM) depositions are also available.
Questions about this news item should be sent to [email protected].
25-September-2007
Art of Science exhibits in Dallas, Texas
Images from the RCSB PDB's
Questions about this news item should be sent to [email protected].
RCSB PDB Poster Prize Awarded at ECM
Thanks to everyone who participated in the recent RCSB PDB Poster Prize competition for best student poster related to macromolecular crystallography at the 24th European Crystallographic Meeting held in Marrakech August 22-27, 2007.
The award went to Humberto Couto Fernandes for "Yellow lupine pathogenesis-related protein as a reservoir for cytokinins" (Humberto Fernandes, Anna Bujacz, Oliwia Pasternak, Grzegorz Bujacz, Michal Sikorski, Mariusz Jaskólski, Center for Biocrystallographic Research, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland).
Judges: Alexander Wlodawer (National Cancer Institute at Frederick), Wolfram Saenger (Freie Universitaet Berlin), Vilmos Fulop (University of Warwick), Tomitake Tsukihara (Osaka University)
Poster Prize Chairman: Anders Liljas (Lund University)
Special thanks to John Helliwell and Petra Bombicz for their help with organizing this prize.
18-September-2007
PDB Data Summaries
Various summaries of current data in the PDB archive are available through the /pub/pdb/derived_data directory on the FTP site at ftp://ftp.wwpdb.org as well as the summaries page on the RCSB PDB web site. Summaries include:
- pdb_seqres.txt: A listing of all PDB sequences in FASTA format (also available as a compressed file).
- compound.idx: A listing of all PDB ID codes and compound names
- clusters95.txt: Protein chains in the PDB are clustered weekly using cd-hit at 50%, 70%, 90%, and 95% sequence identity.
Questions about these files should be sent to [email protected].
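Because pdb_seqres.txt is distributed in FASTA format, it can be read with a few lines of code. The following is a minimal Python sketch of a generic FASTA reader, not an official RCSB PDB tool. The function name parse_fasta and the sample records are invented for illustration, and the sketch assumes that the first whitespace-delimited token on each header line identifies the chain.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each record identifier to its sequence.

    The identifier is taken to be the first token after ">" on the header
    line (an assumption about how the file labels its chains).
    """
    sequences: Dict[str, str] = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            tokens = line[1:].split()
            current_id = tokens[0] if tokens else ""
            sequences[current_id] = ""
        elif current_id is not None:
            # Sequence data may span several lines; join them.
            sequences[current_id] += line
    return sequences


if __name__ == "__main__":
    # Invented sample records, for illustration only.
    sample = [
        ">1abc_A mol:protein length:5  EXAMPLE PROTEIN",
        "MKVLA",
        ">1abc_B mol:na length:4  EXAMPLE RNA",
        "ACGU",
    ]
    for chain_id, sequence in parse_fasta(sample).items():
        print(chain_id, len(sequence), sequence)
```

To process the real file, pass an open file handle instead of the sample list, for example `parse_fasta(open("pdb_seqres.txt"))`.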
11-September-2007
RCSB PDB Poster Prize Awarded, Art of Science shown at ISMB Meeting
Thanks to everyone who participated in the recent RCSB PDB Poster Prize competition for best student poster related to macromolecular crystallography at the 15th Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) & 6th European Conference on Computational Biology (ECCB) held in Vienna, Austria July 21-25, 2007.
The award went to Keren Lasker for "Determining the configuration of macromolecular assembly components based on cryoEM density fitting and pairwise geometric complementarity" (Keren Lasker, Tel Aviv University and University of California, San Francisco; Maya Topf, Birkbeck College, University of London; Andrej Sali, University of California, San Francisco; Haim Wolfson, Tel Aviv University).
Judges: Yanay Ofran (Columbia University), Predrag Radivojac (Indiana University), Alejandro Giorgetti (University of Rome), Riccardo Percudani (Università di Parma), Michael Tress (Spanish National Cancer Research Centre), and Sean Mooney (Indiana University School of Medicine).
Poster Prize Chairman: Marco Punta (University of Georgia)
04-September-2007
New Query and Reporting Capabilities and Features: Access to Remediation and Pre-remediation Data
All data in the PDB archive (ftp://ftp.wwpdb.org) reflect the new features incorporated as part of the wwPDB Remediation Project, including standardized IUPAC nomenclature for chemical components. These data have been incorporated into the RCSB PDB website and database to provide improved searching and reporting capabilities. Access to the unremediated data is possible for individual structures and for the entire archive.
The left menu of each Structure Summary page provides download options for either remediated or unremediated data in a variety of formats. The Remediation Tab will appear on this page to describe any changes to chain and residue naming conventions made to make the archive more consistent. An example description would be "This structure's single unnamed chain was assigned chain id A".
A snapshot of the entire unremediated PDB archive (as of July 31, 2007) is available at ftp.rcsb.org.
28-August-2007
New Query and Reporting Capabilities and Features: Advanced Search
The data in the PDB archive offers a wealth of valuable metadata. Advanced Search is a powerful and easy-to-use interface to the underlying search architecture and remediated data. Complex queries are constructed by combining simple "subqueries" chosen from a drop-down list. Users get a feel for the likely success of their search strategy while constructing the search by checking the number of results for each subquery.
A broad range of subqueries is available including sequence searches; GO assignments; SCOP and CATH domain assignments; and author name searches.
These subqueries may be combined into a complex query by searching "all" or "any" of the user-specified subqueries.
Advanced Search is accessible from the Search Tab in the left menu or from the search bar at the top of this page.
A Flash tutorial on how to use the Advanced Search tool is also available.
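The "all" and "any" options described above behave like set intersection and set union over the results of the individual subqueries. The following Python sketch illustrates that idea only. It is not the website's implementation, and the function name combine_subqueries and the sample IDs are hypothetical.

```python
from typing import Iterable, Set


def combine_subqueries(results: Iterable[Set[str]], mode: str = "all") -> Set[str]:
    """Combine per-subquery result sets of PDB IDs.

    mode="all" keeps IDs matched by every subquery (intersection);
    mode="any" keeps IDs matched by at least one subquery (union).
    """
    result_sets = [set(r) for r in results]
    if not result_sets:
        return set()
    if mode == "all":
        return set.intersection(*result_sets)
    if mode == "any":
        return set.union(*result_sets)
    raise ValueError(f"mode must be 'all' or 'any', got {mode!r}")


if __name__ == "__main__":
    # Hypothetical result sets for two subqueries.
    by_author = {"1abc", "2def", "3ghi"}
    by_fold = {"2def", "3ghi", "4jkl"}
    print(sorted(combine_subqueries([by_author, by_fold], "all")))
    print(sorted(combine_subqueries([by_author, by_fold], "any")))
```

Checking the size of each subquery's result set before combining corresponds to the per-subquery result counts that the interface reports while a search is being built.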
21-August-2007
New Query and Reporting Capabilities and Features: Improved Sequence Tabs
The Sequence Details tab offers a customizable report that displays polymer chain sequences annotated with properties such as domain and secondary structure. This feature utilizes data from the Remediation Project to provide an exact mapping of the structure sequence to the UniProt [1] sequence. Annotations from CATH [2], DSSP [3], PDP [4], and the author-approved secondary structure can be applied to either the sequence in UniProt or in the PDB entry's SEQRES information.
The size of the report can be customized for use in presentations.
[1] (2007) The Universal Protein Resource (UniProt). Nucleic Acids Res. 35(Database issue): D193-7.
[2] C.A. Orengo, A.D. Michie, S. Jones, D.T. Jones, M.B. Swindells, and J.M. Thornton (1997) CATH - a hierarchic classification of protein domain structures. Structure. 5: 1093-1108.
[3] W. Kabsch and C. Sander (1983) Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers. 22: 2577-2637.
[4] N. Alexandrov and I. Shindyalov (2003) PDP: protein domain parser. Bioinformatics. 19(3): 429-30.
14-August-2007
New Query and Reporting Capabilities and Features: Search Result Tabs
Since the RCSB PDB website utilizes the data from the wwPDB Remediation Project, queries will now return a more accurate set of results.
Keyword or Advanced Searches will also return different ways of exploring the search results list. Options available from the tabs shown above the default results list include:
- Citations: The primary citations for all structures have been verified as part of the Remediation Project. This improved mapping between structure and associated reference is reflected in the database. The Citations Tab provides a PubMed-like list of the primary citations for the structures that match a query.
- Ligand Hits: This tab lists the ligands known to interact with the structures matching the query. For example, a keyword search for "protein kinase" will return all ligands known to bind protein kinases. Linked images, names, IDs, and formulas appear for each ligand.
- Web Page Hits: Any of the more than 900 curated web pages found at the RCSB PDB website, including Molecule of the Month features, that contain a requested keyword are found on this tab.
- GO, SCOP, CATH Hits: These tabs provide the hits that map to the Gene Ontology (GO)*, SCOP and CATH. Entries are returned in a tree browser, which indicates where these structures reside in the respective hierarchies. The SCOP tab, for example, indicates which hits belong to which class of proteins.
* The Gene Ontology Consortium (2000) Nature Genetics 25:25-29; Conte, L., Bart, A., Hubbard, T., Brenner, S., Murzin, A. & Chothia, C. (2000) Nucleic Acids Res. 28: 257-259; Orengo, C. A., Michie, A. D., Jones, S., Jones, D. T., Swindells, M. B. & Thornton, J. M. (1997) Structure 5:1093-1108.
RCSB PDB Newsletter Summer 2007 Published
The latest RCSB PDB Newsletter has been published in HTML and PDF formats.
This newsletter describes the recent release of the remediated PDB archive, tools for depositing structures, and new features of the RCSB PDB website.
This quarter's Education Corner by Alisa Zapp Machalek describes educational resources available from the NIGMS.
In the Community Focus interview, Alex Wlodawer of the National Cancer Institute discusses his recent thoughts on the deposition of experimental data files and on reviewing macromolecular structure papers for publication in journals.
If you would like to receive a printed version of the RCSB PDB quarterly newsletter, please send your postal address to [email protected].
Subscription information for the plain text electronic version is also available.
07-August-2007
The PDB Archive at ftp.wwpdb.org
Data annotated and released by members of the wwPDB are available for download from ftp.wwpdb.org. This site is updated on a weekly basis.
All data in the PDB archive reflect the new features incorporated as part of the wwPDB Remediation Project, including standardized IUPAC nomenclature for chemical components. Users may have to download new software to view the files with the new nomenclature (e.g., RasMol, Chimera) or update their scripts for automatic downloads. Please see remediation.wwpdb.org for details.
A snapshot of the unremediated PDB archive (as of July 31, 2007) is available at ftp://ftp.rcsb.org. This site has been frozen, and will not be updated.
RCSB PDB Poster Prize Awarded at ACA Meeting
Thanks to everyone who participated in the recent RCSB PDB Poster Prize competition for best student poster related to macromolecular crystallography at the ACA.
The award at the American Crystallographic Association's Annual Meeting (July 21-26; Salt Lake City, UT) went to Hasan Demirci for "Structure Based Protein Engineering of Ribosomal Protein Trimethyltransferase" (Hasan Demirci, Steven T. Gregory, Albert E. Dahlberg, Gerwald Jogl; Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University).
Judges: Mitchell J. Guss (University of Sydney), Peter Horanyi (University of Virginia), Thomas Koetzle (Argonne National Laboratory), James Phillips (Duke University Medical Center), Bernard Santarsiero (University of Illinois at Chicago), and Timothy Umland (Hauptman-Woodward Medical Research Institute)
Poster Prize Chairman: John Rose (University of Georgia)
31-July-2007
wwPDB Remediated PDB Archive Released
The PDB archive that has been remediated by the wwPDB is available from ftp.wwpdb.org. Searches and reports performed on the RCSB PDB website will now utilize these data.
The FAQ below answers general questions about the remediation project and technical questions about downloading the data files.
wwPDB Remediation Project: Frequently Asked Questions
General Questions
- Why was the archive remediated? In the past, query across the complete PDB archive has been limited by missing, erroneous and inconsistently reported data, nomenclature, and other annotations. The evolution of experimental methods and the techniques used to process these data has introduced various inconsistencies into the PDB archive. The wwPDB has remediated the archive to create a more uniform and consistent data resource.
- What has changed? The result of the two-year wwPDB Remediation Project can be found in the remediated coordinate files, which now have updated sequence database references and taxonomies, validated citations, improved representation of virus structures, improved labeling of nucleic acids, more consistent identification of synchrotron beamlines, more detailed chemical description of non-polymer and monomer chemical components, and removal of redundant ligands and standardization of atom nomenclature. An overview of this project and a document detailing the types of changes made and not made is available at www.wwpdb.org/docs.html.
- What has not changed? While some data files have been changed to improve standardization and consistency, changes have NOT been made to the originally submitted experimental data values. For instance, atom nomenclature may have changed in the interest of standardization, but the underlying atom positions are not altered. An overview of this project and a document detailing the types of changes made and not made is available at www.wwpdb.org/docs.html.
- Will I need to submit deposition files in a new format? No, any changes in nomenclature required to support the remediated chemical components dictionary can be made automatically by the wwPDB during the annotation process.
- How will the remediated information improve my ability to search the wwPDB archive? Standardization of nomenclature and improved chemical representation enables more detailed mining of molecular interactions and improved integration with chemical and pharmaceutical data resources. Similarly, improved consistency and representation of molecular sequence promotes better integration with a wide range of biological resources.
Technical Questions
- Will my structure visualization software work with the remediated data? We have approached 60 software providers with a preview of both the new ligand chemical component dictionary and a set of representative clean PDB entries for review. The latest versions of key packages such as Chimera and RasMol support the remediation changes. Links to software resources can be found at remediation.wwpdb.org/software.html.
- Has the remediated ftp site been reorganized? No, the organization and file naming conventions used in the remediated site follow conventions in the current ftp site.
- How will I find the new remediated ftp after August 1, 2007? Once the transition is complete, the remediated archive will become the production resource at ftp.wwpdb.org. The unremediated or "original" archive will remain available as a frozen archive at the existing ftp address (ftp.rcsb.org).
- How long will it take me to download the new archive? The wwPDB ftp site contains well over 600,000 data files and requires over 60 Gigabytes of storage.
  All users who wish to maintain a full copy of the wwPDB FTP archive need to perform an initial download of the new archive, which can then be updated on a weekly basis. The full archive consists of over 600,000 files (~60 GB). The following methods for downloading all or parts of the archive are available: downloading the entire archive or all files in a given format via rsync (the preferred method); downloading single files or the entire archive via ftp; or downloading all files in a given format via ftp using tar balls.
  A full ftp download will require a substantial amount of time. Downloading the archive using the rsync protocol is approximately 50% faster than using the ftp protocol. See ftp://ftp.wwpdb.org/pub/pdb/README for more information.
- What if I only want the files in PDB (or XML or mmCIF) or some other logical grouping, do I need to download the full archive? The download instructions describe how to copy subsets of the archive. We have also prepared a starter kit that provides the user with several data download options focused on specific data formats or groupings. See ftp://ftp.wwpdb.org/pub/pdb/README for more information.
- What compression format does the remediated ftp use? Based on user input we have standardized the new ftp archive compression using gzip (GNU zip).
This document is also available as a PDF from www.wwpdb.org/docs.html.
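Because files in the archive are compressed with gzip, they can be read directly without first unpacking them. The following minimal Python sketch uses the standard library's gzip module. The function name read_gzipped_lines and the example file name are hypothetical, and the sketch assumes the compressed file contains plain text.

```python
import gzip
from itertools import islice
from typing import List


def read_gzipped_lines(path: str, limit: int = 5) -> List[str]:
    """Return up to `limit` lines from the start of a gzip-compressed text file."""
    # "rt" opens the compressed stream in text mode; undecodable bytes
    # are replaced rather than raising an error.
    with gzip.open(path, "rt", encoding="ascii", errors="replace") as handle:
        return [line.rstrip("\n") for line in islice(handle, limit)]


if __name__ == "__main__":
    import sys

    # Usage: python read_gz.py some_downloaded_file.gz
    # (the file name is whatever was downloaded; none is assumed here)
    if len(sys.argv) > 1:
        for line in read_gzipped_lines(sys.argv[1]):
            print(line)
```

Reading in this way avoids keeping both compressed and uncompressed copies of a large local mirror.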
24-July-2007
Remediated PDB Archive To Be Released on August 1, 2007
The PDB archive has been remediated and will be available starting August 1, 2007 from ftp.wwpdb.org. All data in the PDB archive will reflect the new features incorporated as part of this wwPDB project, including standardized IUPAC nomenclature for chemical components.
Users may have to download new software to view the files with the new nomenclature (e.g., RasMol, Chimera). Please see http://remediation.wwpdb.org/software.html for details.
A snapshot of the unremediated PDB archive (as of July 31, 2007) will be available at ftp://ftp.rcsb.org.
An FAQ about this project and transition is available at www.wwpdb.org/docs.html.
Questions about this transition should be sent to [email protected].
Announcement: Data Processing Procedures
Starting August 1, 2007, files processed and released into the archive by the wwPDB sites will reflect the new features incorporated as part of the remediation project. These files will follow the PDB Exchange Dictionary (PDBx) v1.045 and the Protein Data Bank Contents Guide Version 3.1.
There is no change to how depositors submit their files. Any required changes in nomenclature can be made automatically by the wwPDB during the annotation process.
For more information, please see www.wwpdb.org.
Questions about these procedures should be sent to [email protected].
17-July-2007
Use the RCSB PDB Beta Website to Search and Report on Remediated Data
The RCSB PDB website that utilizes the data from the wwPDB Remediation Project is available at betastaging.rcsb.org. It will become the production site at www.pdb.org after a period of testing.
This site offers:
- Improved searching and reporting capabilities
- Updated sequence references
- Updated primary citation information and links
- Better representations for complex assemblies (such as viruses)
- Access to remediation and pre-remediation data
- Advanced access to ligand information
- Enhanced sequence details page for each structure
Questions about the beta website should be sent to [email protected].
ACA's Annual Meeting: Remediation Poster, Young Scientists Lecture, Exhibit Booth Demonstrations, Informatics in Structural Biology session, and Poster Prize
The RCSB PDB will be actively involved with the Annual National Meeting of the American Crystallographic Association (ACA) July 21-26, 2007 in Salt Lake City, UT.
- The session "Informatics in Structural Biology", organized by John Westbrook (RCSB PDB) and Kim Henrick (MSD-EBI), will focus on the applications of structural informatics (Tuesday July 24).
- Annotator Jasmine Young will describe the "5 Easy Steps for Structure Deposition" as part of the Fun Lectures for Young Scientists symposium on Tuesday, July 24.
- The poster "Remediation of the PDB Archive" will be presented on Tuesday, July 24.
- Exhibit Booth Demonstrations: The RCSB PDB will be on-hand in the exhibit hall to provide online demonstrations of the RCSB PDB website and the remediated data. Please stop by and say hello!
- Poster Prize: A poster will be selected for the RCSB PDB Poster Prize. The award will be a subscription to Science and a related book.
Questions about any of these events should be sent to [email protected].
10-July-2007
Upcoming Meetings: Demonstrations, Art of Science Exhibit, and Poster Prize at ISMB
The RCSB PDB will be involved in several activities at the 15th Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) & 6th European Conference on Computational Biology (ECCB) that will be held in Vienna, Austria from July 21-25, 2007.
- Poster Prize: The meeting's Posters Committee will select a student poster for the RCSB PDB Poster Prize. The award will be a subscription to Science and a related book.
- Art of Science Exhibit: This exhibit, which features images available from the RCSB PDB website and the Molecule of the Month, will be on display in the exhibit hall.
- Demonstrations: The RCSB PDB will be on-hand in the exhibit hall to provide online demonstrations of the RCSB PDB website and the remediated data. Please stop by and say hello.
Questions about any of these events should be sent to [email protected]
03-July-2007
Structural Genomics News: PSI Highlighted Structures, Technical Advances, and Assessment
The RCSB PDB Structural Genomics Information Portal offers online tools, summary reports, and target information related to structural genomics. This site also links to new features from the Protein Structure Initiative (PSI). The PSI Structures of the Month highlights recent structures solved by the current centers, while the PSI Technical Highlights describes methods developed by researchers to streamline the structure determination process. These features include links to related information, such as PDB structure summaries, published papers, and center Web sites.
To date, the overall PSI effort has resulted in nearly 2,500 structures of which 70 percent share less than 30 percent of their sequence with known proteins. Methods and tools developed during the first phase of the PSI have been incorporated into the centers' structural genomics pipelines and adopted by structural biology labs worldwide.
The National Institute of General Medical Sciences, which supports the PSI, currently is engaged in an assessment of the project and invites scientists, scientific organizations, and other interested parties to participate in this assessment by July 20, 2007. A formal Request for Information has been posted in the NIH Guide for Grants and Contracts.
26-June-2007
Help Desks Answer Questions about Remediated Data, the RCSB PDB Website, Deposition, and more
Electronic help desks are available to support users exploring PDB data.
[email protected] is available to address questions regarding the remediated PDB archive. The wwPDB appreciates the feedback from users who have examined the Chemical Component Dictionary and files in the remediated archive. Further information about this project is available at www.wwpdb.org.
[email protected] answers questions about the deposition and annotation process at the RCSB PDB.
Support pages at deposit.pdb.org include a file deposition and release FAQ, an overview of software tools, and tutorials for using ADIT, pdb_extract, the Validation Server, and Ligand Depot.
[email protected] responds to requests relating to the navigation of the RCSB PDB website. Questions about searching, reporting, and using all of the resources available from the RCSB PDB should be sent to this address.
The RCSB PDB help system launches a separate browser window to allow users to access the help information and the website at the same time. It offers detailed topics (including Getting Started, Download Files, Search/Browse the Database, and Results), an index, glossary, and search engine.
19-June-2007
Exploring Ligands in the RCSB PDB Database
A ligand name can be entered in the keyword text search at the top of any page from the RCSB PDB website. The Advanced Search query engine can also be used to search for structures based upon a ligand's name, ID code, or SMILES string.
In addition to reviewing the structures that match the given query constraints, users can select the 'Ligand Hits' tab, which lists the ligands known to interact with the structures matching the query. The Ligand Hits tab also offers a gallery view of ligand images.
Selecting one of the ligands from this tab returns a summary page with chemical and structural details. The page offers interactive and static views of the ligand. Users may also download 'model' coordinates (the experimental coordinates from the first deposition of the ligand) and 'ideal' coordinates (generated from the model coordinates and their connectivity) in a variety of formats including CIF, XML, SDF and PDB.
The PDB chemical components dictionary (formerly the HET dictionary) containing these ligands has been remediated to better describe the components that interact with macromolecular structures. Please go to wwpdb.org to learn more about this remediation project, and the release of the archive of remediated data files.
More information about ligand searching and viewing is available.
12-June-2007
Depositing NMR Structures with ADIT-NMR
Users can now deposit NMR structure and experimental data using one tool: ADIT-NMR. Available from batfish.bmrb.wisc.edu/bmrb-adit, ADIT-NMR can be used to precheck, validate, and deposit NMR structures. Coordinates and constraint data will be processed and released by the RCSB PDB, while other NMR spectral data (such as chemical shifts, coupling constants, and relaxation parameters) will be processed and archived by BMRB.
All new NMR depositions at RCSB PDB will be submitted using ADIT-NMR. The assignment of PDB/BMRB IDs and the movement of data files between sites are fully automated. More than 100 joint depositions have already been processed through this new system. Any unfinished NMR deposition sessions that were started using ADIT before May 16, 2007 will continue to be available at that site.
Other tools for NMR depositions include:
- pdb_extract minimizes errors and saves time during the deposition process since fewer data items have to be manually entered. The program extracts key details from the output files produced by many X-ray crystallographic and NMR applications for use in the deposition process. The program merges these data into macromolecular Crystallographic Information File (mmCIF) data files that can be used with ADIT-NMR to perform validation and to add any additional information for PDB deposition. pdb_extract can be used through a web interface or as a downloadable workstation version, both available from pdb-extract.rcsb.org.
- The Validation Server lets users check the format consistency of coordinates (PRECHECK) and create validation reports about a structure before deposition (VALIDATION). These checks can be done independently by the user. The Validation Server can be used at the RCSB PDB (http://deposit.pdb.org/validate/) and PDBj (http://pdbdep.protein.osaka-u.ac.jp/validate/) sites.
NMR structures may also be deposited using ADIT at PDBj and AutoDep at MSD-EBI.
05-June-2007
Remediated File Formats: mmCIF, PDBML-XML, and PDB
The wwPDB has collaborated on a project to remediate the PDB archive and create a new set of corrected files. Before they become the main PDB archive, the remediated data files are available for testing in three formats:
- mmCIF. All remediation work was done using the PDB Exchange Dictionary (PDBx) that follows the mmCIF syntax.
- PDBML-XML. Remediated data files are also available in PDBML-XML format, in a direct translation from the files in mmCIF format.
- PDB File Format. The remediated files will be released in PDB File Format version 3.0. This version of the file format incorporates standardized atom nomenclature, and distinguishes deoxyribonucleic acid from ribonucleic acid.
The entire PDB archive has been reviewed and remediated by the wwPDB with the objectives of improving the detailed chemical description of non-polymer and monomer chemical components; standardizing atom nomenclature; updating sequence database references and taxonomies; resolving any remaining differences between chemical and macromolecular sequences; improving the representation of viruses; and verifying primary citation assignments. In addition, the atom nomenclature for amino acids and nucleotides now conforms with IUPAC standards.
Questions and comments about the files should be sent to [email protected]. Major announcements will be made at the wwPDB website as well as on the individual member websites.
29-May-2007
Using Simple Viewer to visualize functional biological units
When crystallographic structures are deposited in the PDB, the primary coordinate file generally contains one asymmetric unit - a concept that has applicability only to crystallography. For many of these structures, the asymmetric unit represents the functional biological molecule. In other cases, the biological unit can be generated from the asymmetric unit.
In these cases, Protein Workshop can be used to display the asymmetric unit and Simple Viewer can be used to explore the functional biological unit of a structure. Simple Viewer can rotate a structure, zoom in and out, and then save the view of the biological unit as an image file.
Simple Viewer can be launched from the "Display Options" found on an entry's Structure Summary page. Simple Viewer requires Java version 1.4 or greater.
Citation information is available for images created using Simple Viewer.
An Introduction to Biological Units and the PDB Archive is available to describe asymmetric and biological units in more detail.
22-May-2007
Searching the Remediated Chemical Component Dictionary
The Chemical Component Dictionary has been remediated to address the inconsistencies in older dictionary entries that resulted in valence problems, missing model coordinates, and redundant ligands.
The features of the new dictionary include:
- Standardized nomenclature
- IUPAC nomenclature for standard amino acids and nucleotides (Pure & Applied Chem., 70, 117-142, 1998)
- All atom labels begin with the element type symbol
- Retention of all prior names as an alternate identifier
- Model coordinates have been corrected, redundant chemical components obsoleted, and additional definitions for protonated forms are provided.
- Stereochemical assignments, aromatic bond assignments, idealized coordinates, chemical descriptors (SMILES & InChI), and systematic chemical names have been added.
The full Chemical Component Dictionary and the companion Amino Acid Variants Dictionary can be downloaded from http://remediation.wwpdb.org/downloads.html.
Users can also search for individual chemical components, either by entering the component ID in the form provided, or by browsing by ID. The variant dictionary can also be browsed.
This dictionary was remediated as part of the wwPDB's Remediation Project. In addition to the improvements made to the Chemical Component Dictionary, this project reviewed the PDB archive and updated sequence database references and taxonomies; resolved any remaining differences between chemical and macromolecular sequences; improved the representation of viruses; and verified primary citation assignments.
The remediated data files are currently available for testing before they become the main PDB archive. Questions and comments about the files should be sent to [email protected]. Major announcements will be made at the wwPDB website as well as on the individual member websites.
15-May-2007
Testing Remediated PDB Files
The wwPDB has collaborated on a project to remediate the PDB archive and create a new set of corrected files.
These files are available for community testing in a variety of ways, including:
- Sets of Example Files. Sets of structures are available for download in PDB, mmCIF and PDBML formats. These sets each include ten structures that illustrate the nomenclature changes typical of the revised dictionary. Three sets are currently available: proteins, nucleic acids, and viruses.
- Chemical Component Dictionary. To review the remediated Chemical Component Dictionary, users can search or browse the dictionary by ID. Each chemical component has a summary page that provides diagrams, information on the physical and chemical features of the ligand, status information, and links to the component definition in CIF and PDBML/XML formats along with model coordinates, idealized coordinates, and chemical diagrams. The entire dictionary can also be downloaded.
- Search by PDB ID. To download a specific entry, users can either navigate the FTP site, or can enter the PDB ID at http://www.pdbj.org/remediation/. Instructions for accessing the entire archive are at http://www.wwpdb.org/remediation-downloads.html. New options for accessing these data will be announced on this site and the wwPDB site in the near future.
The entire PDB archive has been reviewed and remediated with the objectives of improving the detailed chemical description of non-polymer and monomer chemical components; standardizing atom nomenclature; updating sequence database references and taxonomies; resolving any remaining differences between chemical and macromolecular sequences; improving the representation of viruses; and verifying primary citation assignments. In addition, the atom nomenclature for amino acids and nucleotides now conforms with IUPAC standards.
Questions or comments about the remediation project should be sent to [email protected].
08-May-2007
PDB Focus: First Time Depositors...
There are a few steps a depositor can take to make the process of depositing a structure to the PDB quick, easy, and accurate! This is an iterative process. If you encounter problems at a particular step, please make the correction(s) and go through the steps again.
- Use the pdb_extract Program Suite to extract information needed for deposition from output files produced by many structure determination applications.
- Check your structure with the Validation Suite and Server to ensure that the data being deposited are accurate and reflect what you intend to submit.
- Run BLAST (at NCBI) to compare your sequence to sequence database references. Any necessary corrections can then be made to your sequence and coordinates.
- Use Ligand Depot to find the proper codes for existing ligands, to link to other entries with a particular ligand, and to search for substructures. If a ligand related to a deposition is not in Ligand Depot, please email the chemical diagram, name, and formula to [email protected].
- Deposit your structure using ADIT, using its editor to add any missing information to the deposition.
For a detailed packet of information about first-time deposition, including reprints about validation and Ligand Depot, please send your postal address to [email protected] with the subject line "first time depositor packet".
01-May-2007
RCSB PDB Newsletter Spring 2007 Published
The latest RCSB PDB Newsletter has been published in HTML and PDF formats.
This newsletter describes upcoming meeting participation, deposition statistics (including experimental data) for the first quarter of 2007, and new additions to the BioSync resource.
The issue also presents new features of the RCSB PDB website, including improved access to ligand data, a tool for viewing protein-ligand interactions, new advanced search options, and more.
This quarter's Education Corner looks at the New Jersey Science Olympiad competition and the winners of the protein modeling trial event that was sponsored and judged by the RCSB PDB.
In the Community Focus interview, the RCSB PDB speaks with Angela Gronenborn, an NMR spectroscopist who is the Rosalind Franklin Professor and Chair of the recently established Department of Structural Biology at the University of Pittsburgh.
If you would like to receive a printed version of the RCSB PDB quarterly newsletter, please send your postal address to [email protected].
Subscription information for the plain text electronic version is also available.
24-April-2007
Announcement: Release of Remediated PDB Data
The wwPDB has collaborated on a project to remediate the PDB archive and create a new set of corrected files.
A new FTP server containing the remediated data has been set up for testing. The access details for this site are provided at http://www.wwpdb.org/remediation-downloads.html. The new FTP site will be updated weekly in concert with the current production site at ftp://ftp.rcsb.org. Both sites share the same organizational structure.
The entire archive has been reviewed and remediated with the objectives of improving the detailed chemical description of non-polymer and monomer chemical components; standardizing atom nomenclature; updating sequence database references and taxonomies; resolving any remaining differences between chemical and macromolecular sequences; improving the representation of viruses; and verifying primary citation assignments. In addition, the atom nomenclature for amino acids and nucleotides now conforms with IUPAC standards.
Your input is very important to us. PDB users are encouraged to test the remediated data files between April and July 2007. The details of the final transition will be announced on this website.
Detailed information about this project can be found at http://remediation.wwpdb.org.
Comments about the files should be sent to [email protected]. Major announcements will be made at the wwPDB website ( http://www.wwpdb.org) as well as on the individual member websites.
17-April-2007
Education Focus: DNA Day
"National DNA Day" will be celebrated on April 25, commemorating the completion of the Human Genome Project in April 2003 and the discovery of DNA's double helix.
DNA Day encourages teachers and students to celebrate these historic achievements. Online resources relating to DNA Day include the following:
- The National Human Genome Research Institute will be sponsoring an online webcast and chatroom on DNA Day, in addition to providing a variety of teaching materials.
- The Nature Publishing Group has compiled the original articles, historical perspectives, and examinations of DNA in medicine, society, and as a biological molecule in "Double Helix: 50 years of DNA".
- BBC News has compiled resources and images describing the story behind the discovery.
- The Nucleic Acid Database is a repository of three-dimensional structural information about nucleic acids. The NDB has a searchable database and a browsable Atlas that provides summary information and images for each structure in the database.
- The RCSB PDB has many educational resources related to nucleic acid structure, including Molecule of the Month features on DNA, Transfer RNA, and Self-splicing RNA.
Dr. Judith McGonigal's 8th grade class created a 3D model of DNA out of swimming pool "noodles". Haddonfield, NJ middle schooler Ethan Quanci (left) liked the scale of the model, because "as we all know whenever you need to comprehend the process of something small and complex ... make it BIG." Alex Romash (middle) felt that "During the process of creating our molecule I felt just like I was part of Crick and Watson's team and that we were creating history just like they did." Sam Silver is on the right.
To build your own DNA model, try Science in School's lesson on Modelling the DNA double helix using recycled materials.
10-April-2007
Using PubMed Abstracts to Search the PDB
PubMed abstracts are accessible from a published entry's Structure Summary page. The "Abstract" link returns a page with the article title, abstract, keywords, authors, organizational affiliation, journal, and PubMed identifier. The PubMed abstract at NCBI can also be viewed by clicking on the icon next to "Abstract".
The text box at the bottom of the Abstract page can be used to search for related structures in the PDB using any word in the abstract or keyword fields. Terms can be entered into the text box either by typing the word manually or by clicking the mouse over any word in the abstract or the keyword fields.
03-April-2007
RCSB PDB Focus: Viewing Secondary Structure in Plain Text
For any released structure, the 'Sequence Details' tab offers summary information related to sequence. On this page, the sequence and secondary structure table uses graphics to illustrate secondary structure and domain information.
A textual display of secondary structure is linked from the little 'page' icon at the top right (in the dark gray bar).
The sequence and its corresponding secondary structure information are shown in paired lines. The first line in a pair provides the amino acid sequence, colored red for helices, blue for beta strands and bridges, and green for turns and bends. The line underneath indicates the secondary structure using the abbreviations from Kabsch and Sander † shown below.
Letter: Secondary Structure
- G: 3-10 helix
- H: Alpha helix
- I: Pi helix
- T: H-bonded turn (short segment of helix)
- E: Extended strand
- B: Beta bridge (short segment of strand)
- S: Bend
† Reference:
Kabsch, W. and Sander, C. (1983) Biopolymers 22:2577-2637.
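As a rough illustration of how the second line of each pair could be read programmatically, the sketch below maps the Kabsch and Sander letters to their names and collapses a per-residue string into runs. The function name and the treatment of blanks (or any other character) as "no assignment" are assumptions made for this example; the exact layout of the website's text display is not specified here.

```python
from itertools import groupby

# One-letter secondary structure codes from Kabsch and Sander (1983).
DSSP_CODES = {
    "G": "3-10 helix",
    "H": "Alpha helix",
    "I": "Pi helix",
    "T": "H-bonded turn",
    "E": "Extended strand",
    "B": "Beta bridge",
    "S": "Bend",
}


def secondary_structure_runs(ss_line):
    """Collapse a per-residue code string into (start, end, name) runs.

    Positions are 1-based offsets into the string. Characters that are not
    one of the seven codes (assumed here to mean "no assignment") are skipped.
    """
    runs = []
    position = 1
    for code, group in groupby(ss_line):
        length = len(list(group))
        if code in DSSP_CODES:
            runs.append((position, position + length - 1, DSSP_CODES[code]))
        position += length
    return runs


if __name__ == "__main__":
    # Hypothetical secondary structure line, not taken from a real entry.
    for start, end, name in secondary_structure_runs("  HHHH TT EE"):
        print(f"{start}-{end}: {name}")
```

Grouping consecutive identical letters keeps the output short for long chains, at the cost of hiding which individual residues carry each code.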
27-March-2007
New Information and Statistics Available at BioSync
The BioSync website now contains updated beamline descriptions for operational US synchrotron beamlines as well as some basic information for almost all operational international beamlines.
PDB deposition statistics, grouped by site and beamline, can be found at biosync.rcsb.org. Galleries of structures, also grouped by site and beamline, are cross-linked to structure summary pages in the RCSB PDB. Tables of primary citations and some general information (phasing software, resolution, R-factors, etc.) are also provided. Most recently, similar tables and galleries have been added for structural genomics structures solved from synchrotron data.
BioSync (Structural Biology Synchrotron Users Organization) was formed in 1990 as a grassroots organization intended to promote access to synchrotron radiation. The BioSync resource, originally designed and hosted by UCSD/SDSC, has been updated and is now being maintained by the RCSB PDB.
Updates to beamline descriptions from local personnel as well as general comments and suggestions are most welcome at [email protected].
20-March-2007
Princeton High School Wins New Jersey Science Olympiad Protein Modeling State Finals
At the NJ Science Olympiad (NJSO) State Finals, teams from all over the Garden State presented their hand-built 3D models of a major histocompatibility complex (MHC) protein, along with an abstract, to be judged by annotators from the RCSB PDB. The teams also took a written exam about MHC and protein structure and function. The highest-ranking teams were Princeton High School (First Place), Montgomery High School (Second), and The Lawrenceville School (Third).
Congratulations to all of the teams who participated in this trial event -- there were many great models, abstracts, and responses to the exam. Questions about the NJ Science Olympiad Protein Modeling trial event should be sent to [email protected].
Special thanks to our judges from the RCSB PDB (Shuchismita Dutta, Irina Persikova, Monica Sekharan, and Christine Zardecki (Event Supervisor)), the NJ Science Olympiad organizers, and to the MSOE Center for BioMolecular Modeling for the design of this event.
13-March-2007
RCSB PDB Focus: Saving Protein Workshop "States" for Future Visualization Sessions
Protein Workshop is a molecular viewer accessible from every PDB entry's Structure Summary page. Its simple interface lets users quickly and easily select structural elements and change the coloring, labeling, and representation style (ribbons, cylinders, and more). Users can also color specific structural features such as conformation type and hydrophobicity.
Protein Workshop is an excellent tool for generating high-resolution images in JPG, BMP, TIFF, WBMP, and PNG formats. A tutorial for creating these images is available.
This tool offers a way to save the "state" of a Protein Workshop session. Users can rotate and zoom a structure to a particular orientation and then capture this view for later use. To save a state, enter a title next to "Capture current viewer state" in the Options menu, and then select the adjacent button. The name of this state will be listed in the box below. The view of the molecule can then be changed, but users can always go back to saved states by clicking on the state's name.
These states can be saved in an XML file for later use by selecting the state and clicking the "Export selected state" button. States can be restored from a file by clicking the "Import state" button.
This tool uses the Molecular Biology Toolkit (mbt) and JOGL technology, and requires no installation other than the most recent version of Java. A tutorial is provided to guide users in using Protein Workshop.
Figures created using Protein Workshop should cite the RCSB PDB and the mbt.
06-March-2007
RCSB PDB Focus: Restarting ADIT depositions
A structure can be deposited in more than one Internet session by using ADIT's "Session Restart ID" feature. This identifier appears in red in the center of the browser window when ADIT's "deposit" step is first started. It is also seen in the title of the browser throughout the deposition session.
The case-sensitive restart ID should be entered in the space provided on the ADIT home page to return to the undeposited entry. Any data entered in a category are stored every time the user selects the SAVE button. All entered data associated with a particular entry can be accessed using the restart ID until the "DEPOSIT NOW" button is selected, for up to six months after the session has been last updated.
ADIT is available at the RCSB PDB and PDBj. ADIT-NMR can be used to deposit data to the PDB and BMRB in one session.
A tutorial guide to using ADIT is available in English and Japanese. Example "in progress" deposition sessions are available for practicing with ADIT at http://rcsb-deposit-demo-1.rutgers.edu.
27-February-2007
RCSB PDB Exhibit at the Biophysical Society Meeting
The RCSB PDB will participate in the exhibition at the 51st Annual Meeting of the Biophysical Society (March 3-7 in Baltimore, Maryland). Staff will be available at booth #639 to answer questions and to demonstrate the deposition and query features available from www.pdb.org. We hope to see you there!
20-February-2007
Depositing and Releasing Experimental Data
The RCSB PDB strongly encourages depositors to follow the guidelines set by the IUCr, the NIH, and the journals regarding the submission and release of coordinate and experimental data.
Deposition of experimental data (structure factor and/or NMR constraint files) is required by many journals, including Acta Crystallographica, Biochemistry, Cell, Nature, and Science. These files can be uploaded during the ADIT deposition process.
Depending upon the hold status selected by the depositor, data release can occur when a depositor gives approval (REL), the hold date has expired (HOLD), or the journal article has been published (HPUB). There is a one-year limit on the length of a hold period, including HPUBs. If the citation for a structure is not published within the one-year period, depositors will be given the option to either release or withdraw the deposition.
Detailed deposition and release information is available from http://deposit.pdb.org/.
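The hold rules described above can be summarized as a small decision function. This is only a sketch, not the RCSB PDB's actual processing logic: the function and parameter names are hypothetical, the one-year limit is assumed to be counted from the deposition date, and a HOLD entry that reaches the limit is assumed to be released.

```python
from datetime import date
from enum import Enum


class HoldStatus(Enum):
    REL = "REL"    # release when the depositor gives approval
    HOLD = "HOLD"  # release when the hold date has expired
    HPUB = "HPUB"  # release when the journal article is published


def one_year_after(day):
    """Return the same calendar day one year later (Feb 29 maps to Feb 28)."""
    try:
        return day.replace(year=day.year + 1)
    except ValueError:
        return day.replace(year=day.year + 1, day=28)


def release_action(status, deposited, today,
                   approved=False, hold_until=None, published=False):
    """Return 'release', 'wait', or 'release_or_withdraw' for an entry.

    Assumption: the one-year limit on hold periods starts at `deposited`.
    """
    limit = one_year_after(deposited)
    if status is HoldStatus.REL:
        return "release" if approved else "wait"
    if status is HoldStatus.HOLD:
        # A requested hold date cannot extend past the one-year limit.
        expiry = min(hold_until, limit) if hold_until else limit
        return "release" if today >= expiry else "wait"
    # HPUB: published citations trigger release; otherwise the depositor
    # is asked to release or withdraw once the one-year period has passed.
    if published:
        return "release"
    return "release_or_withdraw" if today >= limit else "wait"


if __name__ == "__main__":
    print(release_action(HoldStatus.HPUB, date(2007, 1, 15), date(2008, 1, 15)))
```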
13-February-2007
Citing Structures in the PDB: IDs, citations, and DOIs
The contents of the PDB are in the public domain. Structures can be cited using their PDB ID and the published citation related to the structure.
- Structures may also be referenced using their Digital Object Identifier (DOI). The DOIs for PDB structures all have the same format - 10.2210/pdbXXXX/pdb - where XXXX should be replaced with the desired PDB ID. For example, the DOI for PDB entry 4HHB is "10.2210/pdb4hhb/pdb".
This DOI can then be used as part of a URL to obtain the entry's compressed data file in PDB format (http://dx.doi.org/10.2210/pdb4hhb/pdb), or can be entered in a DOI resolver (such as http://www.crossref.org/) to automatically link to pdb4hhb.ent.Z in the main PDB ftp archive (ftp://ftp.rcsb.org).
- The journal reference for the Research Collaboratory for Structural Bioinformatics PDB is:
H.M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T.N. Bhat, H. Weissig, I.N. Shindyalov, P.E. Bourne
The Protein Data Bank
Nucleic Acids Research, 28 pp. 235-242 (2000)
- The RCSB PDB is a member of the worldwide PDB (wwPDB). The journal reference for the wwPDB is:
H.M. Berman, K. Henrick, H. Nakamura (2003): Announcing the worldwide Protein Data Bank. Nature Structural Biology 10 (12), p. 980
Detailed information for citing the use of data, structures (with examples), and images is available.
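The DOI pattern described above is simple enough to build in a few lines. The sketch below follows the 10.2210/pdbXXXX/pdb format and the dx.doi.org URL given in the text; the function names are hypothetical, and the validation rule (four alphanumeric characters beginning with a digit) is an assumption about PDB IDs rather than something stated here.

```python
import re

# Assumed shape of a PDB ID: a digit followed by three letters or digits.
_PDB_ID_PATTERN = re.compile(r"[0-9][a-z0-9]{3}")


def pdb_doi(pdb_id):
    """Return the DOI for a PDB entry, e.g. '4HHB' -> '10.2210/pdb4hhb/pdb'."""
    normalized = pdb_id.strip().lower()
    if not _PDB_ID_PATTERN.fullmatch(normalized):
        raise ValueError(f"not a recognizable PDB ID: {pdb_id!r}")
    return f"10.2210/pdb{normalized}/pdb"


def pdb_doi_url(pdb_id):
    """Return a resolver URL for the entry's DOI."""
    return "http://dx.doi.org/" + pdb_doi(pdb_id)


if __name__ == "__main__":
    print(pdb_doi("4HHB"))
    print(pdb_doi_url("4HHB"))
```

Lower-casing the ID mirrors the 4HHB example in the text, where the entry is written in upper case but its DOI uses "pdb4hhb".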
06-February-2007
East Brunswick High School and Bergen County Academy Win New Jersey Science Olympiad Protein Modeling Regionals
Several high school teams competed in the protein modeling trial events at the New Jersey State Science Olympiads held January 9 (Central Regional) and 11 (Northern Regional). Each team created a three-dimensional model of an insulin structure, accompanied by a written description, using resources available from the RCSB PDB. At the event, teams also answered multiple-choice and short-answer questions focusing on the structure and function of insulin. The three-dimensional protein models were built using Mini-Toober kits provided by the RCSB PDB.
At the Central Regional, East Brunswick High School (First Place and the 2006 State Champions in this event), West Windsor-Plainsboro South HS (Second), and West Windsor-Plainsboro North HS (Third) created very strong models.
At the Northern Regional, Bergen County Academy (First Place), Westfield HS (Second), and New Providence HS (Third) exhibited very strong skills.
Special thanks to our judges from the RCSB PDB (Shuchismita Dutta, Irina Persikova, Massy Rajabzadeh, Monica Sekharan, Jasmine Young, Muhammed Yousufuddin, and Christine Zardecki (Event Supervisor)), the NJ Science Olympiad organizers, and to the MSOE Center for BioMolecular Modeling for the design of this event. We look forward to seeing everyone at the state competition in March!
Questions about the NJ Science Olympiad Protein Modeling trial event should be sent to [email protected].
New Web Site Features: Advanced Search and Help Pages
In addition to new features such as improved access to ligand, SNP, and Pfam information, new options have also been added to the Advanced Search and help features.
-
New Advanced Search Options
Simple searches of the RCSB PDB website can be performed using the keyword box at the top of each page. The "Advanced Search" feature makes more specific and complex searching possible.
New possible queries include:
-
Keyword Searching: These options can be used to search with
keywords, phrases or a series of keywords.
Advanced Keyword Search: This option can be used to search for keywords in the full text or in the author name. If you enter a phrase, you must place it in quotes otherwise it will be interpreted as a series of keywords. Advanced keyword search supports the lucene syntax for sophisticated string searching.
PubMed: Searches PubMed titles and abstracts for an entry's primary citation (if it exists).
Medical Subject Headings (MeSH): Searches for structures associated with particular MeSH terms from the National Library of Medicine (NLM). This option launches the MeSH Browser, which lets users either browse through the MeSH hierarchical tree or search the tree with keywords.
-
mmCIF items: At the bottom of the Advanced Search pulldown menu is
the option to build queries using mmCIF Category and Item names. For
example, users can search information about the details of the
biological assembly in the category _struct_biol.details.
- Author Assigned: Looks for structures based upon keywords used by the depositor.
Advanced Search Tutorial: http://www.pdb.org/pdb/tutorials/advancedSearch.html
Advanced Search Help: http://www.pdb.org/robohelp/advancedsearch/intro.htm
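As an illustration of the quoting rule described for the Advanced Keyword Search, here is a minimal Python sketch that wraps multi-word phrases in double quotes and joins terms with the AND operator of the classic Lucene query syntax. The helper names (lucene_term, lucene_and) are hypothetical and are not part of the RCSB PDB site; which Lucene operators the site accepts is an assumption.

```python
def lucene_term(text: str) -> str:
    """Return a single Lucene term, quoting it if it is a multi-word phrase."""
    text = text.strip()
    if len(text.split()) > 1:
        # Escape embedded double quotes so the phrase stays one quoted unit.
        escaped = text.replace('"', '\\"')
        return f'"{escaped}"'
    return text


def lucene_and(*terms: str) -> str:
    """Join terms with AND (hypothetical helper; assumes AND is supported)."""
    return " AND ".join(lucene_term(t) for t in terms)


if __name__ == "__main__":
    # Without the quotes, "protein kinase" would be read as two keywords.
    print(lucene_and("protein kinase", "inhibitor"))
```

The resulting string can be pasted into the keyword field; unquoted multi-word input would be interpreted as a series of keywords instead of a phrase.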
- New Help Features
A new set of Flash Tutorials is available, modeled on the popular guides on how to use the RCSB PDB site overall and how to use the Advanced Search.
Tutorials are now available for the MeSH Browser, Protein Workshop, KiNG, Jmol, and general Navigation. They are accessible from the left-hand menu under "Site Tutorials".
Quick Tips are another resource for learning new ways of exploring the RCSB PDB website. These tips offer hints and quick links to get started. To view them, click on "Show Quick Tips" in the left-hand menu. Clicking on the arrow button scrolls through the hints, and clicking on the X closes the box.
Please write to [email protected] with any questions or comments about these new features.
30-January-2007
New Web Site Features For Viewing Ligand, SNP, and Pfam Data
The RCSB PDB website now offers improved access to ligand, Single Nucleotide Polymorphism (SNP), and Pfam information.
- Improved Access to Ligand Data
The PDB chemical components dictionary (formerly the HET dictionary) has been remediated to better describe the components that interact with macromolecular structures. This new dictionary has been incorporated into the RCSB PDB database.
The options available after a search now include a tab called 'Ligand Hits'. This page lists the ligands known to interact with the structures that match the query.
For example, a search for 'protein kinase' returns 2051 structures and 678 ligands. From the 'Ligand Hits' page, users can find all of the structures that contain a given ligand or access information from its 'Ligand Summary' page. The summary page offers summary information, downloads (definitions and coordinates), and interactive and static views.
- Ligand Explorer Tool for Viewing Protein-Ligand Interactions
Ligand Explorer is a Java-based program accessible from each Structure Summary page. Features include the ability to highlight ligand interactions based on conventional and user-defined thresholds and a 'contact map' that shows the details of each interaction.
- Access to Single Nucleotide Polymorphism (SNP), Pfam, and more
SNP information is now accessible from the structure summary pages. Over 4000 PDB structures are linked to SNP information from the SNP database. This information is accessible from each entry's 'Biology and Chemistry Report' tab.
The Pfam database contains multiple alignments of protein domains. With each release of the Pfam data, files mapping Pfam domains to PDB structures are made available on the Pfam FTP site. This mapping is loaded into our database and the Pfam domain information for a protein structure is displayed on an entry's Structure Summary page and Biology and Chemistry Report, when available.
The 'External Links' option provides further information about the structure under study, such as biochemical pathway information, stereochemistry and ligand binding data. When looking at an entry's structure summary, the external links page is accessible from the left-hand menu.
SNP: http://www.ncbi.nlm.nih.gov/projects/SNP/
Pfam: http://www.sanger.ac.uk/Software/Pfam/
Please write to [email protected] with any questions or comments about these new features.
RCSB PDB Newsletter Winter 2007
The latest RCSB PDB Newsletter has been published in HTML and PDF formats.
This newsletter describes new developments in data deposition and
processing, including an article on how DOIs are available for released
entries in the PDB archive.
The issue also looks at how search results and tabular reports can be
sorted, different methods for exploring protein structure domains, and
how to search for sequence variants.
In this quarter's Education Corner, Gary L. Gilliland reports on the
X-Ray Methods in Structural Biology Course held at Cold Spring Harbor
Laboratory.
In the Community Focus interview, the RCSB PDB speaks with Julian
Voss-Andreae, a sculptor of protein structures.
If you would like to receive a printed version of the RCSB PDB
quarterly newsletter, please send your postal address to
[email protected].
Subscription information for the
plain text electronic version is also available.
23-January-2007
Time-stamped Copies of PDB Archive Available via FTP
A time-stamped snapshot of the PDB archive as of January 2, 2007 has been added alongside time-stamped copies of the archive from January 2006 and 2005 at ftp://snapshots.rcsb.org/. It is hoped that these snapshots will provide readily identifiable data sets for research on the PDB archive.
The directory 20070102 includes the 40,933 experimentally-determined coordinate files that were current (i.e., not obsolete) as of January 2, 2007. Coordinate data are available in PDB, mmCIF, and XML formats. The date and time stamp of each file indicates the last time the file was modified.
Scripts are available to automatically download data:
- ftp://snapshots.rcsb.org/rsyncSnapshots.sh
  Makes a local copy of an annual snapshot or sections of the snapshot. Downloading the entire archive can be lengthy (18+ hours), but the time required to download data in a single format should be much less. Depending upon network speed, our tests show that all of the coordinate files in PDB format from a snapshot can be downloaded in about 2 1/2 hours.
- ftp://ftp.rcsb.org/pub/pdb/software/rsyncPDB.sh
  Copies the current contents of the entire archive.
- ftp://ftp.rcsb.org/pub/pdb/software/getPdbStructures.pl
  Copies portions of the current archive.
- ftp://ftp.rcsb.org/pub/pdb/software/getPdbUpdate.pl
  Copies the data from the weekly updates.
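The snapshot directory named above (20070102 for January 2, 2007) suggests a YYYYMMDD naming pattern. The following Python sketch builds a directory name and URL from a date under that assumption. The function names are hypothetical, the placement of the directory at the top level of the FTP host is assumed, and the script does not check that a snapshot exists for a given date.

```python
from datetime import date

SNAPSHOT_HOST = "ftp://snapshots.rcsb.org"


def snapshot_dir(day: date) -> str:
    """Format a date as YYYYMMDD (pattern inferred from '20070102')."""
    return day.strftime("%Y%m%d")


def snapshot_url(day: date) -> str:
    """Build a snapshot URL, assuming the directory sits at the host root."""
    return f"{SNAPSHOT_HOST}/{snapshot_dir(day)}/"


if __name__ == "__main__":
    print(snapshot_url(date(2007, 1, 2)))
```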
Entries in the PDB archive have been processed by the three members of the wwPDB (RCSB PDB, MSD-EBI, and PDBj).
16-January-2007
PDB File Formats, Annotation Procedures, and Remediation
wwPDB members work to annotate all data deposited to the PDB archive. Information about data file formats, annotation procedures, and remediation efforts is described below.
Documentation for the different file formats for PDB data is available at http://www.wwpdb.org/docs.html
Entries in PDB format comply with the PDB Contents Guide v2.3 (July 1998).
Entries in mmCIF format comply with the PDB Exchange Dictionary v1.037 (January 2007).
Entries in XML format comply with the PDBML Schema v1.037 (January 2007).
Annotation procedures and policies are described at http://www.wwpdb.org/docs.html
There are some data items for which the processing procedures are ambiguous. Over the course of the last 12 months, the annotation teams have worked to formalize many aspects of PDB annotation policies and procedures. As a result, a consistent set of annotation procedures is being defined.
Remediation project information is available at http://remediation.wwpdb.org/.
All existing entries have been reviewed and errors have been corrected where possible. One major change is that the atomic names will conform to IUPAC standards. In addition, the chemical component dictionary has been updated and extended to include more information about the chemical structures of each component. The wwPDB Advisory Committee has reviewed and approved this effort.
Please consult this site to review test data files and the new dictionary. The full new data set in PDB, mmCIF and XML formats will become available for review in April 2007.
Questions about these projects should be sent to [email protected].
09-January-2007
Browsing the PDB Using Medical Subject Headings (MeSH)
The RCSB PDB's "Browse Database" resources allow users to explore the
PDB archive using different hierarchical trees. The Medical Subject
Headings (MeSH) Browser searches the PDB using an index of
biomedical-related publications from the National Library of Medicine
(NLM).
The primary citations for structures in the PDB are used to retrieve
the MeSH terms associated with their respective PubMed IDs. This
mapping of the PDB IDs and the MeSH terms is then loaded into the
database. The MeSH tree is constructed using the hierarchy obtained
from the MeSH site.
Nodes of the tree are populated with structures whose primary citations
are associated with the MeSH term for each node. Clicking on the folder
icon beside each node further opens up the sub folders for that node.
Selecting the term itself will retrieve all associated structures. The
text box available at the top of the page can be used to search the
tree using a MeSH index number.
For example, several more folders are revealed when the folder for the
MeSH node 'C02: Virus Diseases' is opened. The folder for Tumor
Virus Infections can then be opened to find all structures indexed as
relating to Epstein-Barr Virus Infections (MeSH Number C02.929.313).
Top folder 'C11: Eye Diseases' can be opened to find structures indexed
as relating to eye diseases, specifically cataracts (MeSH Number
C11.510.245).
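The dotted MeSH numbers in these examples encode the path through the tree: each additional segment is one level deeper. The Python sketch below lists the chain of folders implied by a number and checks whether one number sits under another. The function names are hypothetical and only the MeSH numbers quoted above are used; this is not how the MeSH Browser itself is implemented.

```python
from typing import List


def mesh_ancestors(tree_number: str) -> List[str]:
    """Return the path from the top folder down to the given MeSH number."""
    parts = tree_number.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


def is_under(tree_number: str, ancestor: str) -> bool:
    """True if tree_number equals ancestor or lies below it in the tree."""
    return tree_number == ancestor or tree_number.startswith(ancestor + ".")


if __name__ == "__main__":
    # Folders opened on the way to Epstein-Barr Virus Infections.
    print(mesh_ancestors("C02.929.313"))
    # Cataracts are filed under the top folder C11, not under C02.
    print(is_under("C11.510.245", "C11"), is_under("C11.510.245", "C02"))
```

Comparing on whole segments (the trailing "." in is_under) avoids treating a number such as C11 as a parent of an unrelated number that merely shares its first characters.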
Other browsers can be used to navigate structures based upon
hierarchical trees relating to biological process, cellular component,
molecular function, enzyme classification, source organism, genome
location, SCOP, and CATH. These browsers are available from the "Search"
tab in the left-hand menu on the RCSB PDB home page.
02-January-2007
PDB Focus: Weekly Deadlines for Release/Modify Entry Requests
PDB entries are processed by three members of the wwPDB (RCSB PDB, MSD-EBI, and PDBj) and are released immediately (REL), when the corresponding paper is published (HPUB), or on a particular date (HOLD).
Each week, all files scheduled for release or modification are checked and validated one final time. Authors may be contacted to resolve any issues that may arise while preparing the entries for release.
When the release of HPUB structures is requested, the wwPDB routinely confirms the primary citation. If this is not accomplished within that release cycle, the entry may be scheduled to be released in a later update.
To be included in the next weekly update, any required author correspondence should be sent to the appropriate wwPDB member by the following times:
- RCSB PDB ([email protected]): 15:00 EST Friday
- MSD-EBI ([email protected]): 15:00 GMT Thursday (10:00 EST Thursday)
- PDBj ([email protected]): 13:00 JST Thursday (23:00 EST Wednesday)
All entries due for release are transferred to the RCSB PDB for final packaging into the master PDB ftp archive. These files are then released by 4:00 EST each Wednesday.
Requests received after these cutoff times will be processed during the next update cycle.
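The EST equivalents quoted for the three cutoff times can be checked with a short Python sketch. It uses fixed UTC offsets (EST = UTC-5, GMT = UTC+0, JST = UTC+9), which is a simplification that holds in winter but ignores daylight saving time. The example week (3 to 5 January 2007) and the function names are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets: a simplification that ignores daylight saving time.
EST = timezone(timedelta(hours=-5), "EST")
GMT = timezone(timedelta(0), "GMT")
JST = timezone(timedelta(hours=9), "JST")

# Example week (an assumption): Thursday 4 and Friday 5 January 2007.
CUTOFFS = {
    "RCSB PDB": datetime(2007, 1, 5, 15, 0, tzinfo=EST),
    "MSD-EBI": datetime(2007, 1, 4, 15, 0, tzinfo=GMT),
    "PDBj": datetime(2007, 1, 4, 13, 0, tzinfo=JST),
}


def cutoff_in_est(site: str) -> datetime:
    """Express a site's cutoff in EST for side-by-side comparison."""
    return CUTOFFS[site].astimezone(EST)


def makes_this_update(site: str, sent_at: datetime) -> bool:
    """True if correspondence sent at sent_at arrives by the site's cutoff."""
    return sent_at <= CUTOFFS[site]


if __name__ == "__main__":
    for site in CUTOFFS:
        print(site, cutoff_in_est(site).strftime("%A %H:%M %Z"))
```

Because the datetimes are timezone-aware, a message time given in any of the three zones can be compared directly against a cutoff without manual conversion.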