Ray Tracing News

"Light Makes Right"

January 2, 1990

Volume 3, Number 1

Compiled by Eric Haines erich@acm.org . Opinions expressed are mine.

All contents are copyright (c) 1989,1990, all rights reserved by the individual authors

Archive locations: anonymous FTP at ftp://ftp-graphics.stanford.edu/pub/Graphics/RTNews/,
wuarchive.wustl.edu:/graphics/graphics/RTNews, and many others.

You may also want to check out the Ray Tracing News issue guide and the Mother of all Ray Tracing Pages.


========Net News Cullings========


This issue's focus is on timings from various people using a wide selection of hardware and software. Much of the delay in getting out this issue was my desire to finish up my own timing tests of MTV's, Kolb's, and my own ray tracers on the same machine. I hope some of you will find this information of use.

Another feature of this issue is a pair of book reviews. One of the purposes of the RT News is to provide people with sources of information relevant to ray tracing. So, I would like to see more reviews, or even just brief descriptions of articles and books that you come across. Keeping up to date in this field is going to take more time as the years go by, so please do pass on any good finds you may have. Also, if you're an author, please feel free to send a copy of the abstract here for publication. This service is already provided to a certain extent by SIGGRAPH for thesis work. However, even a duplication of their efforts is worthwhile, since an electronic version is much easier to search and manipulate.

back to contents

New People

# Kory Hamzeh
# 6640 Nevada Ave.
# Canoga Park, Ca 91303
# Email: UUCP:     avatar!kory or ..!uunet!psivax!quad1!avatar!kory
# INTERNET: avatar!kory@quad.com
alias	kory_hamzeh	quad.com!avatar!kory

I'm not professionally involved in ray tracing. Just personally fascinated by it. I have written a couple of ray tracers (who hasn't yet?) and I'm in the midst of designing a 24 bit frame buffer. Since I don't do this on a professional level, I lack some of the resources required to develop real products. I maintain a archive site with a lot of graphics related items (including Ray Tracing News). If you need to access the archive (anonymous uucp only) please send me mail.


# Steve Lamont, sciViGuy - parallelism
# Box 12732,
# Research Triangle Park, NC 27709
alias	steve_lamont	cornell!spl%mcnc.org


# Bob Kozdemba - novice tracer, futures, also radiosity
# Hewlett-Packard Co.
# 7641 Henry Clay Blvd.
# Liverpool, NY 14450
# (315-451-1820 x265)
alias   bob_kozdemba    hpfcla!hpfcse!hpurvmc!koz

I work for HP in Syracuse, NY as a systems engineer. I will be attending SU starting in Jan. `89 working toward my BS with a focus in computer graphics. My job responsibilities are to provide technical support to customers and sales in the areas of Starbase graphics and X Windows. Lately I have been experimenting with HP's SBRR product [radiosity and ray tracing part of the HP graphics package] and trying to keep abreast of futures in graphics. I have written an extremely primitive ray tracer and I am looking for ideas on how to implement reflections and transparency.


# Robert Goldberg
# Queens College of CUNY
# Comp. Sci. Dep't
# 65-34 Kissena Blvd.
# Flushing, N.Y. 11367-0904
# Work : 3d Modeling algorithms, with appl. to graphics and image processing
# Phone: Work -  (718) 520-5100
alias	robert_goldberg	rrg@acf8.nyu.edu


# John Olsen - refraction, radiosity, antialiasing, stereo images.
# Hewlett-Packard, Mail Stop 73
# 3404 E. Harmony Road
# Ft Collins, CO 80525
# (303) 229-6746
# email:  olsen@hq.HP.COM, hplabs!hpfcdq!olsen
alias	john_olsen	hpfcdq.hp.com!olsen

Currently, I've been spending some time tinkering with the DBWrender ray tracer making it produce 24 bit/pixel QRT-format images. I like the QRT output format, but I like some of the features of DBWrender, such as antialiasing and fading to a background color.

I've thought about writing my own ray tracer with all the features I want, but so far I've resisted this evil temptation, and only looked for fancier ones already done by others who could not resist the temptation. :^)

I've just installed an alias for a local ray tracing news distribution. You can send it to raylist@hpfcjo.HP.COM (or if you can't reach, try something like hplabs!hpfcdq!hpfcjo!raylist).


# Andrew Hunt, andrew@erm.oz
# Earth Resource Mapping, 130 Hay St, Subiaco, Western Australia 6008
# Phone: +61 9 388 2900   Fax: +61 9 388 2901
# ACSnet: andrew@erm.oz
alias	andrew_hunt	uunet!munnari!erm.erm.oz.au!andrew

In 1987 I implemented a "Three Dimensional Digital Differential Analyser" (3D-DDA) algorithm, along the lines of Fujimoto and Iwata`s, and used it to speed up a raytracing system under development at the Computer Science Department at Curtin University of Technology.

Recently I have got a bit busy developing commercial image processing software to devote much time to Ray Tracing.

Sometime during 1990 I plan to try to port our Ray Tracing system to a Transputer based platform.


# Nick Beadman - Distributed Ray Tracing, Efficiency
# School of Information Systems
# University of East Anglia
# Norwich
# Norfolk
# United Kingdom
alias	nick_beadman	cmp7112@sys.uea.ac.uk

At the moment I'm trying to implement a distributed ray tracer on 8 t800s on a meiko computing surface using C, with little success I should add. It all part of a big third year computing project worth a sixth of my degree.


# Peter Miller - algorithms, realism
# 18 Daintree Cr
# Kaleen ACT 2617
# CONTACT LIST ONLY: subscription through melbourne
#Phone:  +61-62-514611 (W)
#	 +61-62-415117 (H)
#        UUCP    {uunet,mcvax,ukc}!munnari!neccan.oz!pmiller
#        ARPA    pmiller%neccan.oz@uunet.uu.net
#        CSNET   pmiller%neccan.oz@australia
#        ACSnet  pmiller@neccanm.oz
alias	peter_miller	cornell!uunet!munnari!neccan.necisa.oz.au!peter

I have been interested in ray tracing since 1984, when I wrote a ray tracer before I knew it was called ray tracing! Since then I have been reading journals and tinkering with my ray tracer.

The last 3 years were spent marooned off the net, with very poor graphics output devices; so while some work was done on the ray tracer, I now have a lot of catching up to do.


# Mike J. McNeill
# VLSI and Graphics Research Group,
# University of Sussex,
# Falmer, BRIGHTON, East Sussex, BN1 9Qt
# TEL: +44 273 606755 x 2617
# EMAIL: (JANET) mikejm@syma.sussex.ac.uk | (UUCP) ...mcsun!ukc!syma!mikejm
alias	mike_mcneill	mike@syma.sussex.ac.uk

I am currently researching into parallelising the RT algorithm on a multiprocessor architecture. I'm using object space subdivision using a fast octree traversal algorithm. At the moment I'm simulating the architecture using C via TCP/IP protocols on a distributed Apollo network. Interested in all aspects of speed-up, parallel architectures, coherency and radiosity - dare I mention animation and Linda? Also does anyone know of a modelling package for use on the Apollos?

back to contents

Archive Site for Ray Tracing News, by Kory Hamzeh

Avatar is an archive site which archives Ray Tracing News, comp.source.unix, comp.sources.misc, and other goodies. Archived files can be access by any one using anonymous uucp file transfers. A list of all files in the archive can be found in avatar!/usr/archive/index. Note that this file is quite large (about 40K). If you are only interested in the graphics related archives, snarf the file avatar!/usr/archive/comp/graphics/gfx_index. All Ray Tracing News Archives (except for the first few) are stored in /usr/archive/comp/graphics directory. They are named in the following fashion:


Where XX is the volume number, YY is the issue number. Note that all files (except index and gfx_index) are compressed.

Before trying to access avatar, your L.sys file (or Systems file, depending on which flavor of UUCP you have) must be edited to contain the following info:

# L.sys entry for avatar.
# NOTE: 'BREAK' tells uucico to send a break on the line. This keyword
# may vary from one flavor of UUCP to another. Contact your system
# administrator if your not sure.
# 1200 baud
avatar Any ACU 1200 18188847503 "" BREAK "" BREAK ogin: anonuucp
# 2400 baud
avatar Any ACU 2400 18188847503 "" BREAK ogin: anonuucp
# 19200 baud (PEP mode) for Telebit Trailblazers only
avatar Any ACU 19200 18188847503 ogin:-\n-ogin: anonuucp

After the previous lines are entered in you L.sys file, you may use the following command to get the index file:

    uucp avatar!/usr/archive/index /usr/spool/uucppublic/index

This command will grab the file from avatar and put it in your /usr/spool/uucppublic directory using the name index. For example, to get Ray Tracing News Volume 2, Issue 5, execute the following command:

   uucp avatar!/usr/archive/comp/graphics/RTNewsv02i05.Z /tmp/RTnewsv02i05.Z

NOTE that some shells (like csh) use the "!" as an escape character, so use a "\" before the "!".

If you experience problems getting to any of the files, I can be reached via e-mail at:

    UUCP: avatar!kory or ..!uunet!psivax!quad1!avatar!kory
    INTERNET: avatar!kory@quad.com soon to be kory@avatar.avatar.com

Kory Hamzeh

back to contents

Ks + T > 1, by Craig Kolb and Eric Haines

From: Craig Kolb (craig@weedeater.math.yale.edu)

While adding adaptive tree pruning to rayshade (and discovering a bug), I came across a question I've had regarding the SPD for a while.

Some objects have Ks + T > 1. How can this be? For example, the spheres in "mountain" have Ks = .9 and T = .9. Unless I'm completely out to lunch (which is possible), ths means that subsequent specularly reflected rays are weighted by .9, and that transmitted rays are also weighted by .9. This leads to "glowing" objects pretty quickly.

What's wrong in the above description? Be warned that in "nff2shade.awk", I set the "specular reflectance" to be min(Ks, 1. - T).


My reply:

Actually, true enough, Ks + T > 1 does occur in the SPD stuff. I use T in a funny way, since I was trying to make databases that would display both using hidden surface and ray tracing algorithms. Hmmm, how to explain? Well, imagine you have a glass ball: under the hidden surface system (without transparency), you'd like the ball to appear opaque, and so a high Kd is in order. Now if the thing's transparent, you don't want Kd to be high. So what I do is lower Kd and Ks by ( 1 - T ). An admittedly weird shading model, which I've now changed a bit (i.e. reflectance and transmittance are now entirely separate). So, your solution of turning down the reflectance is fine. I should add that I didn't really explain all this as it's irrelevant for timings (all we care about is what rays get generated, not the final color), but I agree, it would be nicer to get a good resulting picture as a check. I'll change that in the next update of SPD, actually... thanks for pointing it out!


Craig's reply:

Ah, I get it.

I brought it up because it does make a difference timing-wise if you're using adaptive tree pruning. Although you say in the SPD stuff that pruning shouldn't be used, Mark's raytracer currently has no option to turn it off. I was comparing pruned vs. pruned, and noticed that I had many fewer reflected rays, since my reflectance for the transparent gears was .05 rather than .95.

back to contents

Quartic Roots, and "Intro to RT" Errata, by Larry Gritz (vax5.cit.cornell.edu!cfwy)

I was trying to find the solution to a quartic equation to solve a ray-torus intersection test. I've gotten lots of replies, but generally fall into one of two categories: either solve the quartic equation (I forget the reference now, but I'll send you either a reference or the formula if you want [It's in the _CRC Standard Math Tables_ book - EAH]), or by some iterative method. Everybody says (and I have confirmed this experimentally) that solving the equation is very numerically unstable. I have chosen to use the Laguerre method from Press, et. al. _Numerical Recipes in C_, which is slow, but seems to work, and finds all roots without needing to specify brackets for the roots. (An advantage since although I can bracket all possible real roots with the bounding box test that I already do, I'm not really sure how many roots lie within the brackets.)

What actually turns out to be a bigger problem is that I got the quartic coefficients from the SIGGRAPH '88 Ray Tracing Course Notes (on page 3-14 of Pat Hanrahan's section).

[Larry and I thrashed this out over a number of passes (boy, I wish I had access to Mathematica or somesuch), and came out with the following corrected equation set for those on page 93 of _An Introduction to Ray Tracing_:

	a4 & a3 - Pat's are OK.
	a2 = 2(x1^2+y1^2+z1^2)((x0^2+y0^2+z0^2)-(a^2+b^2))
	      + 4 * (x0x1+y0y1+z0z1)^2 + 4a^2z1^2
	a1 = 4 * (x0x1+y0y1+z0z1)((x0^2+y0^2+z0^2)-(a^2+b^2))
		+ 8a^2 * z0 * z1
	a0 = ((x0^2+y0^2+z0^2)-(a^2+b^2))^2 - 4a^2(b^2-z^2)
[actually, WRONG, see RTNv3n2 for the right answer]

- EAH]

back to contents

More on Quartics, by Larry Spence From: larry@csccat.UUCP

Newsgroups: comp.graphics
Subject: Re: Solution to quartic eqn?
>I didn't ask the question, but thanks for your input.  However, Ferrari's
>theorem yields a fast and very accurate answer.
Are you sure about this?  If it's the same closed-form solution that you find
in the CRC books, etc., doesn't it use trig functions and cube roots?   Seems
to me there was a paper by Kajiya in the early '80s on numerical ray tracing,
and there have been several in the last few years.  My advice would be to go
look at SIGGRAPH proceedings from 1981 on.  Certainly, a closed form solution
like the one suggested wouldn't take advantage of any coherence in the problem,
unless you wrote all the trig stuff yourself.

[Comments, anyone? I never saw any replies. - EAH]

back to contents

Question: Kay and Kajiya Slabs for Arbitrary Quadrics? by Thomas C. Palmer (palmer@mcnc.org)

Here's a question regarding slabs for arbitrary quadrics. Kay & Kajiya '86 discusses computing slabs for polygons and implicit surfaces. The method for implicit surfaces using Lagrange multipliers and gives an example using spheres. This is easy and works quite nicely for canonical (i.e. centered and axis aligned) quadrics. K&K handle object rotations and translations during the slab computation. What about quadrics that have been transformed by some arbitrary matrix prior to input and look like this:

ax^2 + 2bxy + 2cxz + 2dxw + ey^2 + 2fyz + 2gyw + hz^2 + 2izw + jw^2 = 0 ?

The xy, xz, and yz terms prevent a simple solution via Lagrange multipliers. Has anyone done this? How do you handle quadric bounding planes? Note that K&K cheated for the tree branchs in the tree models. Each cylinder has endpoint capping spheres so the cylinder's extent is just the combined extent of the two spheres.

Thanks for your help -


Thomas C. Palmer		North Carolina Supercomputing Center
Cray Research, Inc.		Phone: (919) 248-1117
PO Box 12732			Arpanet: palmer@ncsc.org
3021 Cornwallis Road
Research Triangle Park, NC

back to contents

Ambient Term, by Pierre Poulin (poulin@dgp.toronto.edu)

I just read your Tracing tricks in the latest Ray Tracing News. Thanks for passing that on to us, this was very interesting.

One trick you mentioned was to put a light source at the eye position in order to eliminate the ambient term. This is a simple trick I did not know. However, you noted that highlights appears as artifacts.

Since you know that this light does not need any shadow rays, you could use only the diffuse intensity created by this light to approximate the ambient term, hence creating no undesirable highlights.

I know, this is very easy and everyone probably knows that. But just in case you would not have thought about it, I wanted to indicate it to you. I just hope I am not the 10,000th to tell you :^)


My reply:

I'm glad to hear that you enjoyed the "Tracing Tricks" article - sometimes I worry that I'm just publishing ideas that everyone already knows. I've tried having the ambient light have no highlight, and it's sort of interesting, but the lack of highlight can look a little strange for those objects where there really would be a highlight (it sort of makes them look less shiny, though it also depends upon the other lights in the scene). Nonetheless, turning it off is definitely worth exploring. You're the first person to comment on this, actually. Thanks for taking the time to write, and do keep in touch.

back to contents

Book Reviews on Hierarchical Data Structures of Hanan Samet, by A. T. Campbell, III (atc@cs.utexas.edu)

"The Design and Analysis of Spatial Data Structures", and "Applications of Spatial Data Structures", by Hanan Samet, Addison-Wesley, 1990.

Reviewed by A. T. Campbell, III

This two-volume series of books is one man's effort to provide a guide to the study of hierarchical data structures. The topic has extensive influence on many fields, particularly computer graphics, image processing, and computational geometry. Hanan Samet is a well-established expert in the field, with literally dozens of publications. As a computer graphics researcher, I eagerly anticipated the books' publication. A close examination of both volumes leads to one conclusion: the books are extremely worthwhile.

The integration of diverse material is remarkable. Comprehensive research results throughout the spectrum of science are drawn together seamlessly. Samet has really done a thorough job of pulling together literature from a vast collection of conferences and journals, both major and minor.

The level of explanation is good. Samet has obviously read all of his references thoroughly. The descriptions of algorithms reflect an understanding of what is really going on. Even algorithms mentioned briefly are given a good essential description.

Numerous topics are covered. Algorithms for such problems as proximity-searching, efficient storage of image and volume data, constructive solid geometry, data structure conversion, hidden surface removal, and ray-tracing fill the books. Pseudo-ALGOL code examples present detailed explanations of how to build and traverse many of the data structures. Ray-tracing enthusiasts in particular will enjoy a detailed description of how to trace a path through an octree.

There are, however, a few problems with the presentation. Despite the ambitious titles of the volumes, there is nowhere near as much theory or practical advice as one might expect. The emphasis is instead on explaining literally everything at an understandable level. While this makes the books a wonderful introduction to all sorts of stuff, the reader still needs guidance in choosing what techniques to actually use.

The title of the first volume, "The Design and Analysis of Spatial Data Structures", obviously invokes comparison with the classic text "The Design and Analysis of Computer Algorithms", by Aho, Hopcroft, and Ullman. However, Samet's approach differs greatly from that of Aho et al. While the data structures are described and discussed in detail, the analysis is not very formal. Theorems and proofs, as well as detailed algorithm analysis, are not much in evidence. A more appropriate title might simply be "An Introduction to Spatial Data Structures".

The second book, "Applications of Spatial Data Structures", covers basically every research result in hierarchical algorithms, major or minor. It is exceptionally good at explaining techniques succinctly. The depth is not sufficient enough to implement the techniques without referring to the original papers, however. Additionally, the reader is given no good feel for which results should actually be used. If a technique is commonly used in industry or never used because of impracticality, Samet never says so. The reader who expects a "cookbook" solution to his problem will be disappointed.

The books are primarily of use for two purposes. First, they provide a good introduction to those aspects of computational geometry and image processing which are most likely to be of interest to a person working in graphics. Second, they provide a very complete guidebook to the literature. I would suggest that researchers and practitioners have these volumes on their reference shelves. Due to the sheer volume of material presented, I would not recommend them for use as course textbooks.

back to contents

Comparison of Kolb, Haines, and MTV Ray Tracers, Part I, by Eric Haines

I decided to compare the MTV ray tracer, the Kolb "rayshade" software, and my own modified "SBRR" ray tracing package to see the efficiency of each. My goal was to see what sort of performance was obtained by each ray tracer and note the strengths and weaknesses of each. This first section of the report marks the state of current results, consisting of timings from the "gprof" command for each package, using the Standard Procedural Database (SPD) package. All three packages were run on an HP 350 workstation with a floating point accelerator board. The compiler options were "-g -G +ffpa" (debug, profile, with special floating-point only compile), using HP-UX 6.5.

The three ray tracers were selected from all the existing packages by having the following properties: (1) Each could handle all the primitives in the SPD package, and (2) each had some automatic efficiency scheme. Other packages do not support all the primitives (e.g. DBW does not have cylinders, cones, or n-sided polygons), or do not have automatic efficiency generation (e.g. QRT lets you explicitly create bounding boxes, but has no way for this to happen automatically).

The MTV ray tracer was created by Mark VandeWettering, and uses the Kay/Kajiya hierarchy approach (i.e. sorting objects along X/Y/Z and splitting each group, recursively). To make it conform to the requirements of the SPD tests, "minweight" was set to 0.0 in order to disable tree pruning a la Hall's method.

Craig Kolb's "rayshade" ray tracer uses a 22x22x22 grid on all scenes. Because of the use of grids (i.e. 3DDDA), it was found to be sensitive to the background polygons used in the SPD package. In four of the SPD scenes (balls, gears, rings, and tree) there is a "ground" polygon. The "rayshade" software allows some user intervention in how the database is structured. It was found that faster timings (sometimes strikingly quicker) could be obtained by leaving this background polygon out of the grid structure. Changing the database in this manner is forbidden by the SPD tests, but both sets of results are presented because of the difference in timings.

The SBRR package is not public domain, but rather is part of the graphics software in all HP workstations using the Turbo-SRX graphics accelerator. It uses the Goldsmith and Salmon automatic bounding volume hierarchy method. It should also be noted that this package is full featured, which has a corresponding slowdown effect when intersection computations are performed. For example, polygons can be single or double sided, with different materials, colors per vertex, normals per vertex, and other combinations available. Since the package has a "hardware assist" by using the graphics engine as an item buffer (see Weghorst and Hooper), timings are given both with and without this assist. The times without are the fairer of the two for comparison.

Without further ado, here are the timings:

MTV	   Basic
-----      -----
balls	   12604
gears	   38123
mount	    9307
rings	   24286
tetra	    1081
tree	    8056

Kolb	   Basic	Modified
-----      -----        --------
balls	   14871	    3224
gears	   12601	   11449
mount	    2989	    2989 (i.e. same - no modification needed)
rings	    8348	    8103
tetra	     836	     836 (i.e. same - no modification needed)
tree	   18957	    2505

SBRR	   Basic	Item Bfr
-----      -----        --------
balls	    5027	    4126
gears	   13561	   12776
mount	    5440	    3749
rings	   11044	   10446
tetra	    1187	     457
tree	    3229	    2719

So, considering the MTV ray tracer as 1.00, here are the relative performance times of each tracer - (MTV time / RT time) - i.e. higher is faster, and can be thought of as how many times faster it is:

SPD	MTV-Base	Kolb-Bas	Kolb-Mod	SBRR-Bas	SBRR-Bfr
-----   --------	--------	--------	--------        --------
balls	  1.00		  0.85		  3.91		  1.40		  3.05
gears	  1.00		  3.02		  3.32		  2.81		  3.33
mount	  1.00		  3.11		<--same		  1.71		  2.48
rings	  1.00		  2.91		  3.00		  2.20		  2.32
tetra	  1.00		  1.29		<--same		  0.91		  2.37
tree	  1.00		  0.42		  3.22		  2.50		  2.96

Some interesting phenomena are revealed by the statistics. The "tetra" database is pretty much the same absolute speed for everyone. However, given the performance for other scenes, it is noteworthy that MTV performs relatively faster on this than others. I've noticed this, too, when trying Kay/Kajiya myself - this scheme just soars on tetra, though I am not sure why. Perhaps it is the smallness and regularity of the objects, which would go well with Kay/Kajiya's assumption that using the centroid of these is reasonable. For other databases one can imagine better hierarchies that those constructed by Kay/Kajiya. For example, with mount, the four spheres above the fractal mountain should be in their own cluster just off the top of the hierarchy tree.

The "tetra" scene is a strange test in that most (around 81%) of the scene is background, so what we tend to test here is affected more by how fast one can traverse a scene, set up rays, shade, and store values in a file. It will take further analysis to see where the time is spent.

The Kolb ray tracer is interesting in how much its efficiency scheme is affected by the geometry of the scene. The "teapot in a football stadium" effect I've written about previously hits grid efficiency schemes with a vengeance. For example, moving the ground polygon from the grid subdivision to outside of it makes rayshade perform 4.6 times faster for balls, and 7.7 times faster for tree! The point is that grids perform best when the scene is relatively "compact". The large ground polygons in these scenes cause the entire grid to get larger in two directions, and so make many more objects fall inside but a few grid squares, thereby ruining much of the efficiency of the grid.

Comparing Kolb to MTV, we see that overall Kolb is faster. Kolb does worse on balls and tree using the unmodified database, but otherwise outperforms MTV, being about twice as fast. When the ground polygon is taken out of the grid subdivision, Kolb is more than 3 times faster for all cases except tetra.

Comparing SBRR and MTV shows SBRR to be faster for most cases, with MTV being slightly faster for tetra. Overall SBRR is almost twice as fast with its basic performance, and about 2.75 times faster when the item buffer is used.

Comparing SBRR and Kolb is a bit tricky, since there are two tests of each. Taking the basic tests in each, Kolb and SBRR are comparable: Kolb outperforms SBRR in four cases, and SBRR outperforms Kolb in two (and for one of those, tree, it is almost 6 times faster). SBRR has some things to learn from Kolb (which is why I'm doing all this, anyway), as Kolb's modified database results point towards the fact that faster performance is possible.

So, on the basis of pure raw timings, Kolb and SBRR without modifications or accelerators are of comparable speed. With user intervention into the database structure, Kolb can become noticeably faster. It should also be noted that Kolb uses a default of 22x22x22 grid cells, which is under user control and so could be tuned to further improve performance.

Actually, this is an interesting open question: what heuristics can be used to determine a reasonable grid size for a scene? Also, is there a reasonable way to determine if such performance destroyers such as ground polygons are present automatically, and so remove them from the grid subdivision itself? David Jevans' and Brian Wyvill's work on nested grid subdivisions ("Adaptive Voxel Subdivision for Ray Tracing", Proceedings of Graphics Interface '89, p. 164-172) might lead to a less variable performance and to greater overall speed. Craig and I have discussed this, but unfortunately he has no time to try this out - perhaps someone out there can experiment and compare results.

As mentioned, this is research in progress. My next step will be to analyze the statistics generated by each program and see where time is spent.

back to contents

Raytracer Performance of MTV, by Steve Lamont

> Could you pass on your timings on ray tracer performance on various
> machines and any thoughts or experiences you want to share about the subject?

The parallelization was done in a brute force manner, forking processes and dividing the work by the number of processes. The parent process remains behind and reads the scanlines in a round robin fashion from pipes. There is no communication from the parent to the child processes once the forking has been done; the ray tracing routines simply march down the scan lines. This approach works well on a homogeneous architecture where all processes run at approximately the same speed and no process "dries up" or runs out of work to do.

This works well for single frames. However, my approach for a large animation is to simply parcel out work on a frame per processor basis.

I've built the MTV raytracer on the Cray, the IRIS, and the Ardent Titan... and here are some preliminary results on a 128 x 128 test image (the balls image with reflections but no refractions, 3 light sources, 11 primitives (16 with bounding volumes)

		      (CPU seconds)
	Machine		1	4	Notes

	Y-MP/8-432	4.0	1.0	-hopt,intrinsics,vector
	IRIS (4D/240)*  8.2	2.1	-O2 (MIPS R3000/3010)
	Titan (P2)     30.0     7.7	-O3 (MIPS R2000 + prop. vector/fp unit)
	Titan (P3)	7.2	---	Run by vendor. (MIPS R3000/3010 +
					proprietary vector unit)

Wall clock times improved by a factor of 2.5 to 3, which squares pretty well with Amdahl's law as extended for small parallel architectures.

These are *preliminary* results with respect to the Titan -- we've only had it for a couple of weeks. On none of the machines did MTV vectorize in any way to speak of. In fact, turning off vectorization improves performance for several "short vector" loops; e.g., loops of vector length 3.

Timings were done on a fairly heavily loaded Cray and an empty IRIS and Titan.

The Cray is a Y-MP with four processors (upgradable to 8, hence the 8-4) and 32 mWords of central memory. There is also a 128 mWord SSD (Solid-state storage device). We also have 40 gBytes of rotating storage (a combination of DS-40s and DD-49s).

[*] Actually, this machine is a CDC Cyber 910-624 but the only difference between it and a "genuine" Silicon Graphics IRIS 4D/240 is the color of its box and the binding on the manuals.

[Disclaimer: These comments are solely the responsibility of the author and no endorsement by the North Carolina Supercomputing Center should be inferred]

Steve Lamont, sciViGuy			EMail:	spl@ncsc.org
NCSC, Box 12732, Research Triangle Park, NC 27709

back to contents

BRL-CAD Ray Tracer Timings, by Gavin Bell (gavin@krypton.sgi.com)

These results are third-hand; I can vouch for the accuracy of our machine's results, but the BRL people may have more recent results to share. The only experience I have with ray-tracing timing on our machines is with a simple, interactive ( :-> ) ray-tracer demo called 'flyray'. I modified it to be fully parallelized; it uses one CPU to display a real-time display of the scene being ray-traced (complete with rays shooting into and bouncing around the scene), and the other N-1 CPUs to compute ray-object intersections (these results are shown in a separate window). As for timing... runs REAL fast on an 8 CPU system.

What follows is a form letter I've been sending to people who asked for ray-tracing timings.


This is a response to all of those people who asked me for the BRL-CAD ray-tracing benchmark results. I'm surprised at how many of you there are!

First, a little bit about myself. I work in Technical Marketing, the 'Demos and Benchmarks' group here at Silicon Graphics. I'm not usually involved in benchmarks; I work mainly on our demos.

The rest of this text comes directly from a 'SGI Benchmark Summary' prepared by one of our Marketing people. The numbers are communicated to him by the software engineer who did the benchmark. These benchmark summaries are communicated to salespeople in a weekly newsletter as soon as the results come in. Other summaries done include:

Dhrystones, Digital Review Labs, Linpack, Livermore Fortran Kernals, MUSBUS, Whetstones.

LES50 (computational fluid dynamics), Moldflow (finite element analysis), Molecular Dynamics, Nord, UFLA, GROMOS (all computational chemistry).

If you want more information on any of these benchmarks, please see a SGI sales rep-- I can't keep typing in all of these numbers!! Also, please remember that these benchmarks were done for a specific customer, who was interested in a specific machine, so most of them were not benchmarked on our whole product line (the 'Industry Standard' benchmarks, however, are usually run on all of our products).

-----------  --------------  --------
Rendering    BRL-CAD 3.7     US Army Ballistic Research Lab

--------     ------------
C            9/5/89

The BRL-CAD benchmark is a part of the US Army Ballistic Research Laboratory's BRL-CAD package. The core of the BRL-CAD benchmark is a ray-tracing program which consists of about 150,000 lines of C code. Computations are performed in double precision floating point.

Five separate data bases are input to the ray tracing program, resulting in six performance ratings (one for each, plus a total which is the mean of the other five) . When rendered, each data base will produce a 512x512x24 bit color shaded image. The images are of increasing complexity, and are identified as 'Moss', 'World', 'Star', 'Bldg' and 'M35'.

The result of this benchmark is reported as rays traced per second, and referred to as Ray Tracing Figure of Merit (RTFM). Higher numbers indicate better performance.

The code was parallelized by inserting user directives to create multiple processes to trace rays.

    Note:  The actual report has nice graphs here.
Machine        RTFM
1x16 (4D/80)    714
2x16 (4D/120)  1358
4x25 (4D/240)  5034
8x25 (4D/280)  7434
    Note:  NxMM numbers refer to the number of processors in the
	machine (N) running at MM MHZ.

Machine   # Processors  RTFM  Relative Performance
Vax 780        1          77     1.0
Sun3           1          88     1.1
Convex C120    1         163     3.6
Sun4           1         435     5.6
SGI 4D/120S    2        1358    17.5
Alliant FX/80  8        2783    33.6
SGI 4D/240S    4        4456    70.4
Cray X-MP/48   4        7217   116.1
SGI 4D/280     8        7434   119.7

The BRL-CAD benchmark exhibits excellent speedup as processors are added. This is due to the coarse granularity inherent in the ray tracing problem being solved. Each ray is processed independently, with no data dependencies among the rays. This means that multiple processors can each work on separate rays simultaneously, with minimal need for synchronization among processors.

While the code is highly parallelizable, it is not efficiently vectorizable because of short vector lengths. The combination of these two characteristics explain the phenomenal performance of Silicon Graphics machines relative to vector machines like Cray and Alliant.

The characteristics of this benchmark that lead to high performance by the Silicon Graphics machines are common to all ray tracing applications.


Here is another note from Gavin:

My only experience with the BRL ray-tracer came when I was at Princeton University - I installed it on Silicon Graphics machines there for the Graphics Lab. That was two years ago; as far as I could tell, it didn't use octrees or any other space-partitioning algorithm. I used a ray-tracer written at Princeton (the precursor of what is now Craig Kolb's rayshade program; Craig and I had the same thesis advisor) which did do octrees; it was infinitely faster than the BRL beast.

back to contents

BRL-CAD Benchmarking and Parallelism, by Mike Muuss (mike@BRL.MIL)

I'm sort of on vacation right now, so I'm going to cop out and just send you the TROFF input for several things that I have handy about how we benchmark ray-tracing in the BRL-CAD Package.

The first one is our benchmark summary paper.

The second one is a portion of a paper that I wrote called ``Workstations, Networking, Distributed Graphics, and Parallel Processing''.

You may publish and/or redistribute both documents as you wish. Note that the United States Government holds the "copyright", ie, it is not permissible to copyright this material.

[These papers are rather lengthy, so I won't include them in this issue. If you would like copies, look at the archive sites for Muuss.benchmrk and Muuss.parallel, or write me. - EAH]

back to contents ======== USENET cullings follow ====================

Rayshade Patches Available, by Craig Kolb

	From: craig@weedeater.math.yale.edu
	Newsgroups: comp.graphics
	Organization: Math Department, Yale University

Patches 1-3 for rayshade v3.0 are available via anonymous ftp from weedeater.math.yale.edu (new address: in directory pub/rayshade.3.0/patches. The patches fix several minor bugs, clean up the documentation, and provide new features.

Several people have expressed an interest in 'trading' rayshade input files. If you have an interesting input file that you'd like to share, feel free to deposit it in the "incoming" directory on weedeater or send it to me via email. I will make these files available in the pub/Rayinput directory on weedeater.

Rayshade is supposedly "on the verge" of appearing in comp.sources.unix, patches and all.

back to contents

Research and Timings from IRISA, by Didier Badouel From: badouel@irisa.irisa.fr">badouel@irisa.irisa.fr">badouel@irisa.irisa.fr Newsgroups: comp.graphics Organization: IRISA, Rennes (Fr)

We have a parallel raytracer (called PRay) at IRISA which as MTV uses NFF description databases. This raytracer has been implemented on an iPSC/2, on a SEQUENT BALANCE and also on serial computers (SUN3, GOULD NP1) to make better comparisons.

I give you the various synthesis times for the well known 'Teapot' database. The image has been rendering with a 512X512 resolution with 3 light sources. Results are as follows :

			#PEs    Time (in sec.)
	SUN3:                   8877 (~ 2h27mn)
	GOULD NP1:              1642 (~ 27mn)
	SEQUENT BALANCE 1       37121 (~ 10h18mn)
			2       18567
			3       12381
			4       9285
			5       7431
			6       6197
			7       5311
			8       4656
			9       4138 (~ 1h9mn)
	iPSC/2          1       6294 (~ 1h45mn)
			2       3332
			4       1700
			8       860
			16      440
			32      224
			64      119 (~ 2mn)

The code running on the iPSC/2 emulates a virtual shared memory over the local PEs. The database is not duplicated but all the local memories are used. The remaining memory after loading code and data is used as a cache to speed up low global accesses.


In order to benchmark our parallel raytracer running on an iPSC/2, which use Eric Haines' NFF file format as input, we would like to know if other people have a parallel raytracer using these databases in order to make some comparisons.

One of our goals is to render the largest possible database. For the moment, we have rendered the 'tetra10' database: - the database contains more than 1 million polygons (1048576 polygons) - the size of the database with its 'object access structure' (a grid) is 140 MO. - the synthesis time is 526 seconds on the iPSC/2 with 64 nodes and 4 MO node memory. - however, on account of using NFF text file format, reading the input is very slow (more than 3 hours for 'tetra10') when using YACC and LEX. Furthermore, our iPSC/2 configuration does not have I/O node system.

________________________________________________________________ Didier BADOUEL badouel@irisa.fr INRIA / IRISA Phone : +33 99 36 20 00 Campus Universitaire de Beaulieu Fax : 99 38 38 32 35042 RENNES CEDEX - FRANCE Telex : UNIRISA 950 473F ________________________________________________________________

back to contents

Concerning Smart Pixels, by John S. Watson From: watson@ames.arc.nasa.gov (John S. Watson) Newsgroups: comp.graphics Organization: NASA - Ames Research Center

In article (207400033@s.cs.uiuc.edu) mccaugh@s.cs.uiuc.edu writes:
> Does anyone know of a system with "smart" pixels?

Once upon a time I wrote a ray tracer in which the pixels used heuristics to determine their sampling rate. Since the reason for doing it was to speed things up, the heuristic had to be simpler than casting a ray( s, if sub-sampling). I used difference in previous pixels values, with a little randomness tossed in. So pixels changing quickly were sampled every frames, while pixels that were hardly ever changing were sampled only once every 10 frames. The results: much faster, but with a graininess on the edges of moving objects. I needed to make the pixels more aware of what was happening with its neighbors. Never got around to doing that.

Another problem is big pictures have lots of pixels ... 512x512 = 0.25 million. To be smart you must have a memory.

To save memory, I combined the above with an Area-of-Interest/Variable Acuity (AOIVA) Ray Tracer.

Hope this helps, John S. Watson, Civil Servant from Hell ARPA: watson@ames.arc.nasa.gov NASA Ames Research Center UUCP: ...!ames!watson Any opinions expressed herein are, like, solely the responsibility of the, like, author and do not, like, represent the opinions of NASA or the U.S. Government.

back to contents

Input Files for DBW Render, by Tad Guy From: tadguy@cs.odu.edu (Tad Guy)

Newsgroups: comp.graphics
Organization: Old Dominion University, Norfolk, VA
In article (6475@pt.cs.cmu.edu) te07@edrc.cmu.edu (Thomas Epperly) writes: I was wondering if anyone had any neat input files for DBW_Render available for anonymous ftp.

xanth.cs.odu.edu as /amiga/dbw.zoo. If you have a network of, say, sun workstations, you might as well get /amiga/distpro.zoo, which will allow to distribute the computations over many machines.

back to contents

Intersection with Rotated Cubic Curve Reference, by Richard Bartels

From: rhbartels@watcgl.waterloo.edu
Newsgroups: comp.graphics
Organization: U. of Waterloo, Ontario
In article (1445@tukki.jyu.fi) toivanen@tukki.jyu.fi (Jari Toivanen) writes: : :I would like to know is there any simple and effective solution to :following problem: : : [intersecting a ray with a rotated cubic curve] : :Jari Toivanen Segments are for worms ;-) :University of Jyvaskyla, Finland Internet: toivanen@tukki.jyu.fi

Look at the article:

	Ray Tracing Objects Defined By Sweeping Planar Cubic Splines
	Jarke J. van Wijk
	ACM Transactions on Graphics
	Vol. 3, No. 3, July, 1984, pp. 223-237

I believe that the author subsequently wrote a whole book on the subject.

[Incidentally, this article also has a method for quickly intersecting a tight fitting bounding volume around such curves. I've found this test useful for use as a torus bounding volume. Also, does anyone know of the existence and the name of the book Richard mentions? - EAH]

back to contents

Needed: Quartz surface characteristics, by Mike Macgirvin

From: mike@relgyro.stanford.edu
Newsgroups: comp.graphics
Organization: Stanford Relativity Gyro Experiment (GP-B)

I am in need of the surface characteristics for fused quartz. Ambient, diffuse and specular color characteristics, Phong coefficent, reflectance, and transparency. I have the index of refraction (Well, I have to average it, c'est la vie).

I have attempted to derive these experimentally, but find the resulting traced image lacking in many ways, and a simulation visualization I am working on requires accuracy.

I am using Kolb's 'rayshade' if it affects your responses.

Please respond via e-mail if possible.

back to contents

Solution to Smallest Sphere Enclosing a Set of Points, by Tom Shermer

From: shermer@cs.sfu.ca
Newsgroups: comp.graphics
Organization: School of Computing Science, SFU, Burnaby, B.C. Canada
>I need the solution for the following problem:
>find the smallest sphere that encloses a set of given points, in both
>2-D and 3-D (or even n-D, if you like).

This problem can be solved in linear time (in any fixed dimension) by the technique of prune-and-search (sometimes called ``Megiddo's Technique''), either directly or by first converting the problem to a linear program. The most relevant reference (for 2d & 3d) is:

Linear-time Algorithms for Linear Programming in R^3 and Related Problems, Nimrod Megiddo, SIAM J. Comput, v. 12, No. 4, Nov 1983, pp. 759-776.

Other related references:

Linear time algorithms for two- and three- variable linear programs, M.E. Dyer, SIAM J. Comput, v. 13, 1984, 31-45.

On a multidimensional search technique and its application to the Euclidean one-center problem, M. E. Dyer, Dept. Math and Stats TR, Teesside Polytechnic, Middlesbrough, Cleveland TS1 3BA, UK, 1984.

Linear programming in linear time when the dimension is fixed, N. Megiddo, JACM 31, 1984, 114-127

The weighted Euclidean 1-center problem, N. Megiddo, Mathematics of Operations Research 8, 1983, 498-504

On the Ball Spanned by Balls
	N. Megiddo, manuscript (this may have appeared in the literature
	by now)

back to contents

True Integration of Linear/Area Lights, by Kevin Picott

From: kpicott@alias.UUCP
Newsgroups: comp.graphics
Organization: Alias Research Inc.

Has anyone seen any work done on evaluating the diffuse and specular illumination produced by linear and/or area lights? I have checked all the standard sources and all information I find gets to the point where the integration is set up and then a little hand waving is performed accompanied by the magical words "numerically integrated". This works but is too slow for my purposes. Does anyone know of any work done in different directions (ie faster evaluation) ?


Thanks to all who replied to my query about linear and area lights.

In the area of linear lights, two papers on analytical solutions were found. The first, by John Amanatides and Pierre Poulin has been submitted to Eurographics '90 and I'll hopefully get a look at that soon.

The second, "Shading Models for Point and Linear Sources", ACM TOGS, 4(2), April 1985, pp. 124-146. by T. Nishita, I. Okamura, E. Nakamae, proposes an analytic solution to the diffuse component, but only under certain circumstances.

The latter unfortunately reduces to numerical integration in the majority of cases where spline surfaces are involved, although a method of optimization is given that reduces computation time for the numerical integration. This method would seem to be suited to lighting parallel and perpendicular to the illuminated surfaces.

There was also a paper entitled "A Comprehensive Light-Source Description for Computer Graphics", IEEE CG&A, July 1984, by Channing P. Verbeck and Donald P. Greenberg that approximates both linear and area light sources as a series of point sources. This is a compromise to numerical integration, but is still computationally expensive.

In summary, the analytical solution to linear sources exists and is calculable, at least for the diffuse component. The specular component exists, but direct calculation is almost expensive as numerical integration.

As far as area light sources are concerned... no analytical solutions were found. In fact, from the work examined I was left with the impression that even if the solution existed it would not be very useful from a light illumination point of view (ie non-radiosity). (Comments?)

-- Kevin Picott aka Socrates aka kpicott%alias@csri.toronto.edu Alias Research Inc. R+D Toronto, Ontario... like, downtown

back to contents

Eric Haines / erich@acm.org