Readers: After the Atlanta ACA Meeting (1994) I posted this article to sci.techniques.xtallography. I finally got around to creating a hyper-text version. I hope you find it useful.

Linux and Crystallographic Computing

At the recent American Crystallographic Association meeting in Atlanta, I met a number of crystallographers who were interested in using linux as an operating system for crystallographic computing. I thought I would write up a post of my experiences of compiling and running crystallographic software under linux.

Although I am posting this to sci.techniques.xtallography to encourage other crystallographers to try linux, I have decided to cross-post it to comp.os.linux.misc to facilitate communication between crystallographers and other linux users.

I will not pretend that I am the first or the most knowledgeable person to use linux for crystallographic computing. I have just found linux to be a good operating system choice. For those of you who don't know, linux is a free UNIX-clone operating system for 386/486/Pentium CPUs. I have found that using linux allows me to make much better use of my available CPU resources. Linux is impressive.

To start, I have to say that before I got involved with linux, I knew no UNIX, and I had never done anything more complicated than reformat a hard drive on a DOS PC. I could do some FORTRAN programming and no C programming. Having convinced you that I am anything but a computer genius, I want to say that if I could do it, you can too!

Reasons for switching to LINUX

  1. Previously I had done my computing using DOS as my operating system. I found this unsatisfactory for several reasons:

  2. From speaking to some other crystallographers at the Atlanta meeting, it seemed that many of them had VAXes or micro-VAXes which were slower than a 486DX. Going to linux would enable them to get better performance at a lower price without sacrificing operating system functionality by having to use DOS.

  3. Linux is a good OS choice for service crystallographers in particular because of the virtual console (VC) capability included in the linux kernel. This enables a person sitting at one physical console (keyboard and screen) to "hotkey" to different VC's with a couple of keystrokes. I can be logged on as user A and start a least squares and then switch VC's, login as user B and start doing graphics, or reading mail, editing source code etc. On linux there are usually 6 different VC's available.

  4. Linux supports X11, the UNIX graphical user interface. The graphics-capable programs that I use (NRCVAX, SIR92, PLUTON and PLATON) all make calls to the X11 libraries, and compile without much difficulty under linux.

  5. Many crystallographic programs that already have implementations for other UNIX systems compile without much difficulty on linux machines.

  6. The linux community is a very helpful bunch of people. It is easy to post questions to the comp.os.linux.* newsgroups and get prompt answers that are (for the most part) easy to understand.

  7. Linux is a fully featured UNIX-type OS. There is a lot of good software available in either source or binary form which runs under linux. This includes networking code, a full complement of UNIX utilities and cool things like Mosaic and other net-surfing tools.

I am sure I have left out a lot of other desirable reasons to go to linux, but these are the ones which come to my mind now.

Compiling Crystallographic Software on Linux

Traditionally, crystallographers use FORTRAN as their programming language of choice. The first problem one encounters is how to compile FORTRAN code with no native FORTRAN 77 compiler. Linux uses the f2c (FORTRAN to C) translator available from netlib.att.com and the Free Software Foundation's GNU gcc (C compiler). The linux distributions I have used (SLS and Slackware) come with a shell script called f77 which simulates the way that a "real" FORTRAN compiler would work. The f2c translator comes with its own libraries which are used when compiling and linking programs.

From what I have read on the net, a free FORTRAN 77 compiler (GNU g77) is going to be coming out "any day now". I would anticipate that someone is going to port this to linux. This should make bringing crystallographic code to linux even easier, but I have to admit that the f2c+gcc combination isn't bad.

Using f77 (f2c + gcc combination), I have compiled the following with almost no trouble: SHELXL-93, SHELXS-86, PBDINS, CIFTAB, PATSEE, DIFABS, DIRDIF92, SIR92, THMA11, PLATON, PLUTON and the NRCVAX program package. I will give a brief summary of what changes in the source code were needed to get the program running under linux.

SHELXL-93:
For the unix version, I added two little C programs to get TIME and DATE and elapsed CPU time. Otherwise, the code compiled very cleanly.
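As an illustration only (this is not the code actually added to SHELXL-93), such helper routines might look like the sketch below. It assumes the usual f2c conventions: external names are lower case with a trailing underscore, arguments are passed by reference, a SUBROUTINE becomes an int function returning 0, and FORTRAN INTEGER maps to C long int as in the classic f2c.h. The names DATTIM and CPUSEC are hypothetical.

```c
#include <time.h>

/* Hypothetical FORTRAN side:   INTEGER IDT(6)
 *                              CALL DATTIM(IDT)
 * Fills IDT with year, month, day, hour, minute, second.
 * Assumes FORTRAN INTEGER is C long int (classic f2c.h). */
int dattim_(long *idt)
{
    time_t now = time(NULL);
    struct tm *t = localtime(&now);
    int i;

    if (t == NULL) {                 /* clock unavailable: return zeros */
        for (i = 0; i < 6; i++)
            idt[i] = 0;
        return 0;
    }
    idt[0] = t->tm_year + 1900;      /* tm_year counts from 1900 */
    idt[1] = t->tm_mon + 1;          /* tm_mon is 0..11 */
    idt[2] = t->tm_mday;
    idt[3] = t->tm_hour;
    idt[4] = t->tm_min;
    idt[5] = t->tm_sec;
    return 0;
}

/* Hypothetical FORTRAN side:   REAL SECS
 *                              CALL CPUSEC(SECS)
 * Returns the CPU time used so far by the process, in seconds. */
int cpusec_(float *secs)
{
    clock_t c = clock();

    *secs = (c == (clock_t)-1) ? 0.0f : (float)c / (float)CLOCKS_PER_SEC;
    return 0;
}
```

Both routines are written as subroutines rather than functions so that the sketch does not depend on how f2c returns REAL function values. The C file would be compiled with gcc and linked together with the f2c-translated program.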

PBDINS & CIFTAB:
No real changes to the programs. For CIFTAB I had to add the unix path for my local implementation.

SHELXS, PATSEE:
These codes were generic sources without I/O (I think the codes were originally SHELXS.FTN and PATSEE.FTN). I just added simple I/O. I wrote a little C program to get the job name from the user; this could have been done in FORTRAN, but I wanted to practice my C.
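A job-name routine of this kind could be sketched roughly as follows. This is not the original program; the names GETJOB and fill_fortran_string are hypothetical. The sketch assumes the f2c convention for CHARACTER arguments: the C function receives a char pointer, plus a hidden length argument (long int in the classic f2c.h) appended to the end of the argument list, and the FORTRAN side expects the string blank-padded with no terminating NUL.

```c
#include <stdio.h>
#include <string.h>

/* Copy a C string into a FORTRAN CHARACTER variable of length dstlen:
 * truncate if too long, pad with blanks if too short, no NUL terminator. */
void fill_fortran_string(char *dst, long dstlen, const char *src)
{
    long n = (long)strlen(src);

    if (n > dstlen)
        n = dstlen;
    memcpy(dst, src, (size_t)n);
    memset(dst + n, ' ', (size_t)(dstlen - n));
}

/* Hypothetical FORTRAN side:   CHARACTER*40 JOB
 *                              CALL GETJOB(JOB)
 * job_len is the hidden length argument that f2c appends (assumed long). */
int getjob_(char *job, long job_len)
{
    char line[256];

    printf("Job name: ");
    fflush(stdout);
    if (fgets(line, sizeof line, stdin) == NULL)
        line[0] = '\0';                    /* end of input: empty name */
    line[strcspn(line, "\r\n")] = '\0';    /* strip the line ending */
    fill_fortran_string(job, job_len, line);
    return 0;
}
```

Keeping the blank-padding in a separate helper makes it easy to check on its own, independently of keyboard input.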

DIFABS:
I had previously modified this code to run under DOS. This DOS version compiled and ran without a problem under linux.

DIRDIF92:
This compiled easily under linux. The TIME and DATE routines looked a little more complicated than some of the other programs I have worked with, so I just dummied them out. Someday, I'll go back and fix them.

SIR92:
To get this to work, I added my standard TIME/DATE C program and had to add an underscore to the name of the C function which does the graphics.

THMA11:
I had previously modified this code to run under DOS. This DOS version compiled and ran without a problem under linux. Again, I added my standard C TIME/DATE routine.

PLATON, PLUTON:
Ton Spek distributes a linux version of these programs. Actually it is a DEC version with a program called add.c compiled in to take care of the time/date and of FORTRAN functions such as IOR and IAND, which are not translated by f2c.

For the most part, these are the sorts of changes that are made anyway when implementing a program in one's local computing environment.

I have also implemented the NRCVAX program package to run under linux. This was a little more problematic due to some read/write errors involving direct access files. I wrote some sample code which reproduced the error and sent it to the f2c maintainer. Although he informed me that the FORTRAN was "buggy", he was kind enough to modify the f2c library source code to accommodate the behaviour which the original source code needed (see the 10 March 1994 entry in the f2c Change Log file). Not being a computer genius, I can't say whether the code had a bug or not; I am just grateful for the help I received making the program work properly.
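The article does not spell out what the needed behaviour was. The sketch below assumes, based on my reading of the Pad_UDread option (which should be checked against the libI77 README), that it concerns an unformatted direct-access read asking for more bytes than were written to the last record, with the missing bytes delivered as zeros instead of an end-of-file error. The function name read_record_padded is hypothetical and this is not libI77 code.

```c
#include <stdio.h>
#include <string.h>

/* Read direct-access record number recno (1-based, like FORTRAN REC=)
 * of fixed length reclen bytes into buf.
 * Bytes that the file does not contain are returned as zeros.
 * Returns 0 on success, -1 if the record cannot be located. */
int read_record_padded(FILE *fp, long recno, size_t reclen, unsigned char *buf)
{
    size_t got;

    if (recno < 1 || fseek(fp, (recno - 1) * (long)reclen, SEEK_SET) != 0)
        return -1;
    got = fread(buf, 1, reclen, fp);
    if (got < reclen)
        memset(buf + got, 0, reclen - got);   /* zero-fill the short record */
    return 0;
}
```

A stricter reader would return an error when got is less than reclen, which is the behaviour that helps find programs reading more than they wrote.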

With the NRCVAX program package I had to write some short C functions to take care of the IAND, IOR and IEOR FORTRAN functions which are not translated by f2c. This was accomplished by running the nior.f, niand.f and nieor.f routines through f2c (with no gcc) and editing the resulting C source files to include the C bitwise operators: & (for IAND), | (for IOR) and ^ (for IEOR). I used f2c because I didn't know any C at the time I did this and it seemed the simplest way of doing things. Were I to do it again, I would rewrite them in a more "normal" C by myself, but it works now, so why fix something that's not broken ;-).
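Written directly in C, the three routines could look something like the sketch below. The function names are guessed from the file names above and are an assumption, as is the mapping of FORTRAN INTEGER to C long int (taken from the classic f2c.h); arguments are passed by reference, as f2c-translated callers expect.

```c
/* Assumption: FORTRAN INTEGER corresponds to C long int (classic f2c.h). */
typedef long int integer;

/* NIAND(I, J): bitwise AND of two integers */
integer niand_(integer *i, integer *j)
{
    return *i & *j;
}

/* NIOR(I, J): bitwise inclusive OR of two integers */
integer nior_(integer *i, integer *j)
{
    return *i | *j;
}

/* NIEOR(I, J): bitwise exclusive OR of two integers */
integer nieor_(integer *i, integer *j)
{
    return *i ^ *j;
}
```

For example, with I = 12 (binary 1100) and J = 10 (binary 1010), the three functions give 8, 14 and 6 respectively.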

The last thing that needed adjustment to get NRCVAX working properly was a slight modification in the graphics routine, cxdraw.c. I contacted the author and maintainer of the fvwm window manager (which comes with the Slackware distribution of linux). He suggested a very simple change. Find the line:
wm_hints.initial_state = ZoomState;
change that line to:
wm_hints.initial_state = NormalState;

Everything worked well after that, and I had a functioning NRCVAX package running on my linux machine!

The execution times are comparable to those of DOS implementations of the same programs compiled with a 32 bit memory manager, so there is no performance hit taken on going from DOS to linux. I recently realized that the f77 script does not pass the compiler optimization switches to gcc, so I have been comparing unoptimized gcc code with presumably optimized 32 bit DOS code. With optimization, the linux based code should improve even more. The only thing I have had a problem with is that the backspace key doesn't appear to do its job with f2c/gcc compiled code. If I mistype something in an input dialog, I can't seem to erase it properly. I don't know if this is a problem with f2c or my ignorance of UNIX in setting up my keyboard. I would guess the latter, but my typing isn't so bad that I have needed to investigate it further.

Hardware Setup

At various linux ftp sites, there are files available on hardware compatibility. I suggest you look there to find out detailed information. I can say what my hardware is, and that is about it. I have a 486DX-33 with 16 MB RAM; ISA bus; 2 IDE hard drives (Maxtor 130 MB and Seagate 107 MB); one 3.5" floppy; Orchid Fahrenheit 1280+ graphics adapter and a ViewSonic 15 monitor; and an SMC 16 Elite Ethernet adapter. Given the limited amount of money available to me, it took me about 2 years to piece this setup together. It is not what a synthetic chemist would call a "rational synthesis". It is not the sort of hardware I would buy now for a linux system. I would get a bigger SCSI disk, tape backup, 17" monitor and a PCI or VESA bus motherboard. The ISA bus gets sluggish when using graphics programs (especially when rotating ORTEP drawings). Because crystallography is a graphics-intensive field, you are much better off using a bus which has a higher throughput than an ISA bus. There are hardware compatibility lists available at linux ftp sites. Read them before investing, especially if $$ is tight.

Getting Linux

Linux is available on a variety of media, from diskettes to CD-ROMs, and by anonymous ftp from a number of sites as well as BBSs. I have always used anonymous ftp. Being a citizen of North America, I have used the following sites:

Site			IP Address		Directory
sunsite.unc.edu		152.2.22.81		pub/Linux
tsx-11.mit.edu		18.172.1.2		pub/linux

Lists of other ftp sites in Europe and Australia are available.

If you are going to do anonymous ftp, the hardest thing about the whole process is having the patience (and the wherewithal) to download the 30 or so diskettes of compressed files, assuming 1.44 MB 3.5" diskettes. The Slackware distribution I used was 28 diskettes (series a, ap, d, n, x, xap, xd plus a bootdisk and a rootdisk for installation).

Installing Linux

The best place to begin is by reading the Installation HOWTO. The steps involved in getting linux onto a PC are very clearly explained in this guide. Briefly, it entails getting linux, booting your PC from a linux floppy, formatting your hard drive, making your filesystem(s), and installing linux. I have found the Slackware distribution easy to install, and I recommend it to anyone who is new to linux. I would also recommend using a DOS program called 'fips', which non-destructively alters the size of DOS partitions. I have used it several times as I wanted more of my hard drives to be dedicated to linux rather than DOS. Using fips allows one to avoid reformatting and reloading DOS partitions.

There are also many "HOWTO" guides for the various aspects of setting up a linux system, as well as a FAQ. I have found them very useful. Reading them has made what could be an opaque, esoteric process more understandable for me. The HOWTO guides are available from sunsite.unc.edu.

I found that the f77 script included with the Slackware 1.1.2 distribution didn't work when compiling NRCVAX programs, but an older f77 script from the SLS 1.03 distribution did work. I saved (fortunately!) the older script and have recently modified it to pass optimization flags to the C compiler. I also rebuilt the f2c libraries to get the behaviour that I needed (see NRCVAX section above). The libraries compile without a problem under linux, but you need to include the following in the makefile for libI77:

CFLAGS = -O -DNON_UNIX_STDIO -DPosix_SOURCE -DPad_UDread

It probably wouldn't hurt to get one or two UNIX books as reference material. I use A Practical Guide to the UNIX System, 2nd edition, by Mark G. Sobell. I have also found a book on C useful. (I use Using C by Lee Atkinson and Mark Atkinson. I find it not such an easy book, but I have usually found what I am looking for.)

Acknowledgements

There are many people who helped me in my effort to turn my DOS PC into a linux workstation. I want to thank all of the members of the linux community who made this all possible and who helped me with my questions. I would also like to thank David Gay, the maintainer of the f2c translator, for accommodating my direct-access read/write problems, and Robert Nation (author of the fvwm window manager) for helping me get some of the graphics routines working. Thanks also to Peter White, who originally suggested using linux and who has been very helpful with getting NRCVAX going under linux; to Gianluca Cascarano for advice and suggestions concerning SIR92; and to Frank Warmerdam (from somewhere on the net), who gave me a C program to pass time and date info to a FORTRAN program. Thanks go also to the programmers of the various crystallographic software I use for writing such easily portable code, which made my job so much easier than it could have been.


boyle@laue.chem.ncsu.edu
Last Updated 5 May 1995