The crystallographic information file (CIF) was developed by the IUCr in the early 1990s as a standardized format to document single-crystal structure determinations and to exchange the results between laboratories. The data items used in CIF are described in a series of data dictionaries. The syntax used in these data dictionaries is described in a dictionary definition language (DDL), of which there are two versions: DDL 1.4 and DDL 2.1. The data dictionary for the original single-crystal information (now called the core) and the powder diffraction (pdCIF) dictionary use a somewhat updated version of the original syntax, DDL 1.4, while the macromolecular CIF (mmCIF) dictionary uses the more complex DDL 2.1.
The CIFEDIT program, described here, was created to view and edit CIFs, with particular emphasis on files that contain multiple blocks, as these are quite important in pdCIF. For DDL 1.4 version dictionaries, CIFEDIT can use information from the dictionaries to validate CIF data items, as well as display the definitions for these items.
The CIFEDIT program is typically started on Unix computers by typing
cifedit [file.cif]
where the name of a file to be read, file.cif, may optionally be given on the command line. On Windows, several mechanisms can be used to start the program, but the most convenient will likely be clicking on the CIFEDIT icon in the Start menu or on the desktop, depending on the software installation options. The CIFEDIT program will then open a file browser to select a file to open. One can also configure the software so that either double-clicking or left-clicking on a CIF file will open the file in CIFEDIT.
While the CIF is being read, a message such as the one to the right will be seen. After the read is complete, the main CIFEDIT screen (shown below) displays the CIF data block(s) and data items in a hierarchical format.
The minus sign (-) to the left of block name data_NISI_publ can be toggled using the mouse to hide all of the data items included in the block, as is shown to the left.
As is shown to the left, the "folder" icon indicates a data block, a single page indicates a CIF data item, and an icon with a pair of pages indicates a data loop. The contents of loops are not shown by default, but clicking on the plus sign (+) to the left of the loop causes the entry to be expanded to show the contents. As an example, contrast loop_0 and loop_2 to the left.
As is shown above, clicking on a CIF data item causes the value associated with that data item to be displayed in the right-hand side window. In the example shown above, the data name _cell_length_a has been selected. This is defined in the CIF dictionary as a number between 0 and infinity with units of Angstroms. These units (A) are displayed adjacent to the entry box.
If the mode is changed from "browse" to "edit" (see control in lower right), the value can be edited, as is seen below. When appropriate, input is validated to require that only valid numbers in the allowed range are entered and that standard uncertainties (esd's) are entered only where allowed. Likewise, if the CIF dictionary defines an enumerated list of values for a data name, a menu button is offered in place of an entry box. In this way, only a valid entry from the list can be selected.
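The numeric check described above can be sketched as follows. This is a minimal illustration in Python, not CIFEDIT's own code (CIFEDIT is written in Tcl); the function name check_number and its parameters are hypothetical, and the sketch assumes the standard uncertainty is written in parentheses after the number, as in 5.4307(2).

```python
import re

# Hypothetical helper for illustration; CIFEDIT may validate differently.
_NUM_RE = re.compile(
    r"^([+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)"  # the number itself
    r"(?:\((\d+)\))?$"                                # optional esd in parentheses
)

def check_number(text, low=None, high=None, esd_allowed=True):
    """Return (ok, message) for a CIF numeric value such as '5.4307(2)'."""
    if text in (".", "?"):
        # The special CIF values are accepted without further checks.
        return True, "placeholder"
    m = _NUM_RE.match(text)
    if m is None:
        return False, "not a valid number"
    value, esd = float(m.group(1)), m.group(2)
    if esd is not None and not esd_allowed:
        return False, "standard uncertainty not allowed"
    if low is not None and value < low:
        return False, "below allowed range"
    if high is not None and value > high:
        return False, "above allowed range"
    return True, "ok"
```

For a data name such as _cell_length_a, which the dictionary limits to values of zero or greater, a call would look like check_number(text, low=0).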
CIF loops
CIF loops allow multiple values to be associated with one or more data items, in effect defining a table of data. Clicking on the entry for a loop causes all the data names in the loop to be displayed, as is shown below.
When a loop is displayed, extra controls appear, as are defined below. Note that only the first entry listed will be available in "Browse" mode. It is also possible to click on the data name for an item inside a loop; in this case, all entries for that data item in the loop are displayed (a column). This mode is not available for very large loops, as it would require too much memory to display all the entries. The maximum number of entries is controlled by the variable CIF(maxRows), which can be customized.
- Loop element #
- The "Loop element #" spinbox is used to select which "row" from the loop is displayed. The up arrow advances to the next row, while the down arrow goes back by one entry. Numbers can also be typed into the entry box; the number is accepted when Enter is pressed. The keyboard up and down arrows can also be used to move between entries. Other keys such as Page Up, Home, etc. move in larger increments.
- Add to loop
- In edit mode, a new row can be added to the end of a loop using the "Add to loop" button. The value for each new entry is initialized as "?" (meaning value unknown or unspecified).
- Delete loop entry
- In edit mode, this deletes the current row from the loop. First select the row to delete with the "Loop element #" spinbox. Note that the values are displayed for confirmation before the delete operation is performed. It is not possible to delete all entries from a loop, so this button is disabled when a loop has only a single row defined.
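The row and column views of a loop, and the "Add to loop" and "Delete loop entry" operations, can be sketched as below. This is only an illustration in Python under stated assumptions: the loop is held as a list of data names plus a flat list of values in file order, the function names are hypothetical, and the atom labels and coordinates are invented sample data.

```python
# Hypothetical in-memory form of a loop: data names plus a flat list of
# values in the order they would follow "loop_" in a CIF (invented data).
names = ["_atom_site_label", "_atom_site_fract_x", "_atom_site_fract_y"]
values = ["Ni1", "0.0", "0.0",
          "Si1", "0.25", "0.25"]

def loop_row(names, values, n):
    """Return row n (1-based, like the spinbox) as a name -> value dict."""
    width = len(names)
    start = (n - 1) * width
    return dict(zip(names, values[start:start + width]))

def loop_column(names, values, name):
    """Return every value for one data name (the 'column' view)."""
    return values[names.index(name)::len(names)]

def add_row(names, values):
    """Append a new row whose entries are all '?' (value unknown)."""
    values.extend(["?"] * len(names))

def delete_row(names, values, n):
    """Delete row n (1-based); refuse to remove the last remaining row."""
    width = len(names)
    if len(values) <= width:
        raise ValueError("a loop must keep at least one row")
    del values[(n - 1) * width:n * width]
```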
CIF errors
Parse Errors
If errors are noted as a CIF is parsed, a special entry is listed immediately after the block name, labeled "Parse-errors", as is shown below. Note that the "go to line" button in the Show (Hide) CIF contents window can be very convenient for locating and repairing these errors.
Validation Errors
Many other types of errors can be detected by comparing data values against the definitions found in the appropriate CIF dictionary. For example, the core dictionary specifies that the only valid values for _diffrn_radiation_probe are x-ray, neutron, electron, and gamma. If a CIF has this value specified as proton, it will be flagged as an error. Likewise, _atom_type_number_in_cell is specified as a number, zero or greater. An error condition will exist if the value for this is specified as a negative number or as a string that is not a valid number (the special CIF values . and ? are valid, however). Pressing the "Validate CIF" button at the bottom of the main window causes the CIF to be scanned for errors in data values. If errors are located, a window will be displayed. Also, an entry with "Validation-errors" is added to the browser.
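The two examples above can be sketched as a small dictionary-driven check. This is an illustration in Python only; the RULES table and the function names are hypothetical stand-ins for the information CIFEDIT takes from the CIF dictionaries, and only the two data names mentioned in the text are covered.

```python
# Illustrative rules only; in CIFEDIT these come from the CIF dictionaries.
RULES = {
    "_diffrn_radiation_probe": {"enum": {"x-ray", "neutron", "electron", "gamma"}},
    "_atom_type_number_in_cell": {"min": 0.0},
}

def validate_item(name, value):
    """Return an error string, or None if the value passes."""
    if value in (".", "?"):
        return None  # the special CIF values are always valid
    rule = RULES.get(name)
    if rule is None:
        return None  # no definition available, so nothing to check
    if "enum" in rule:
        if value not in rule["enum"]:
            return f"{name}: '{value}' is not an allowed value"
        return None
    try:
        number = float(value)
    except ValueError:
        return f"{name}: '{value}' is not a valid number"
    if number < rule.get("min", float("-inf")):
        return f"{name}: {value} is below the allowed minimum"
    return None

def validate_block(items):
    """Collect the errors for one data block given as a name -> value dict."""
    errors = (validate_item(n, v) for n, v in items.items())
    return [e for e in errors if e is not None]
```

A block with _diffrn_radiation_probe set to proton would produce one entry in the list returned by validate_block, which corresponds to the "Validation-errors" entry shown in the browser.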
The buttons and controls on the bottom of the main window have the following functions:
- Close
- The Close button causes the program to exit. If there are unsaved changes, the user is offered the chance to save the edits to disk.
- Show (Hide) CIF contents
- The "Show CIF contents" button causes a window to be displayed that shows the text of the CIF, as shown below. As CIF data items are selected by clicking on data names or through use of the other buttons, the window is scrolled forward or backward to show the appropriate section. Note that it is possible to make editing changes directly to the CIF using this window, but this will clear the information in the Undo buffer. Also, if the "Open for Editing" button is pressed, it is assumed that the CIF has been changed.
After the "Show CIF contents" button is pressed, the label changes to "Hide CIF contents"; pressing the button again causes the window to be hidden.
- Show (Hide) CIF definitions
- As CIF data names are selected, their definitions are shown in the CIF Definitions window, as shown to the right. After the "Show CIF definitions" button is pressed, the label changes to "Hide CIF definitions"; pressing the button again causes the window to be hidden.
- Validate CIF
- Pressing the "Validate CIF" button causes the CIF to be scanned for errors in data values. If errors are located, a window will be displayed. Also, an entry with "Validation-errors" is added to the browser for each block with errors.
- Undo
- As changes are made to the CIF template, they are recorded and can be reversed using the "Undo" button. There is no limit to the number of changes that are recorded. However, changes cannot be undone after the CIF has been saved to disk.
- Redo
- If changes have been reversed with the "Undo" button, the changes can be reapplied using the "Redo" button. The list of changes available for "Redo" is cleared when a new edit is made or when the CIF is saved.
- Edit Mode
- The CIFEDIT program works in two modes:
- browse
- In browse mode, the CIF can be examined, but no changes can be made.
- edit
- In edit mode, the contents of the CIF can be changed.
- Save
- Changes made to the CIF are not saved to the disk file automatically. When changes have been made, but have not been saved to disk, this button is made active.
Maximum CIF Size
The CIFEDIT program reads the entire CIF into memory. This means that the amount of memory and the number of programs running can limit the maximum CIF size that can be handled before the operating system starts to gasp for breath (what really happens is called churning: the computer starts page-faulting and gets nothing done). To prevent this from happening, the variable CIF(maxvalues) limits the number of CIF data items (each value in a loop is counted as an item). By default, this value is 100,000, which allows CIFs as large as 0.5 MB to be read. This limit is fine for average computers. If you have a lot of memory and have big CIFs, you can edit cifedit.tcl to raise this value or even set it to 0, which disables the size limit completely.
The pdCIFplot and CIFEDIT programs have been combined into a single distribution, CIFTOOLS. The executable code is the same on all platforms; for Windows, however, other platform-specific files are included in a self-installing program file. Please refer to the CIFTOOLS installation instructions for download and installation details.
The script indexCIFdict.tcl is used to create an index to the CIF dictionary or dictionaries that will be used. The index file created by indexCIFdict.tcl, CIF_index, provides a line for each CIF data name in the selected CIF dictionaries. The entry for each data name includes a reference to the name of the dictionary file and the location where the data name is found, as well as the data type, units, and validation ranges or lists of allowed values. Byte offsets are used to define definition locations, so if the dictionary files are changed in any way, this indexing script must be rerun or definitions will appear incorrectly.
On Unix computers, indexCIFdict.tcl is typically run using a command such as:
wish indexCIFdict.tcl [file list]
where optionally one or more dictionary files to be indexed, "file list", are included on the command line. Files can also be added to the list to be indexed once the program has been started, using the usual sort of file open window. On Windows computers, the script can be run with an appropriately configured Windows shortcut, or by typing a command such as
c:\cifedit\tcl823\bin\wish82.exe c:\cifedit\indexCIFdict.tcl
in a DOS window or into the Start/Run command box.
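Why a byte-offset index goes stale when a dictionary file changes can be seen in a short sketch. This is an illustration in Python, not the logic of indexCIFdict.tcl: the function names, the SAMPLE text, and the file name cif_core.dic are assumptions, and the sketch assumes each definition begins at a line starting with data_.

```python
import io

# Invented fragment standing in for a dictionary file (assumed layout).
SAMPLE = (b"#comment\n"
          b"data_cell_length_a\n"
          b" _name '_cell_length_a'\n"
          b"data_cell_volume\n"
          b" _name '_cell_volume'\n")

def build_index(stream, dict_name):
    """Map each 'data_' header line to (dictionary name, byte offset)."""
    index = {}
    offset = stream.tell()
    for line in iter(stream.readline, b""):
        if line.lower().startswith(b"data_"):
            index[line.strip().decode("ascii")] = (dict_name, offset)
        offset = stream.tell()  # start of the next line
    return index

def read_header(stream, offset):
    """Seek to a stored offset and return the line found there."""
    stream.seek(offset)
    return stream.readline().strip().decode("ascii")

index = build_index(io.BytesIO(SAMPLE), "cif_core.dic")
_, where = index["data_cell_length_a"]

# With the unchanged file, the offset lands on the definition header.
unchanged = read_header(io.BytesIO(SAMPLE), where)

# After any edit that shifts the bytes, the same offset points elsewhere.
edited = b"# extra line\n" + SAMPLE
stale = read_header(io.BytesIO(edited), where)
```

Here unchanged is the expected header line, while stale is a fragment of unrelated text, which is why the index must be rebuilt after a dictionary file is modified.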
This program has benefitted from comments of Brian McMahon of the IUCr. Richard L. Harlow first got me interested in the problem of a universal file format for powder diffraction data, leading eventually to my involvement with CIF and then this programming effort. I may someday forgive him.
The author of CIFEDIT is a U.S. Government employee, which means that CIFEDIT is not subject to copyright. Have fun with it. Modify it. Please add new features and make them available to the rest of the world.
Neither the U.S. Government nor the author makes any warranty, expressed or implied, or assumes any liability or responsibility for the use of this information or the software described here. Brand names cited herein are used for identification purposes and do not constitute an endorsement by NIST.
Comments, corrections or questions: crystal@NIST.gov
$Revision: 1.4 $ $Date: 2003/08/28 15:38:48 $