
30 May 1998 • The Hewlett-Packard Journal Article 4 • 1998 Hewlett Packard Company
Occlusion Culling
The HP fast-break program (page 8) enabled us to understand
customer requirements by analyzing what is important in
OpenGL graphics today. As a result, we developed a technology
called occlusion culling as an extension to OpenGL and
implemented it in the VISUALIZE fx graphics hardware.
We found that the data sets many graphics workstation cus-
tomers are trying to visualize are very complex. These data
sets have large numbers of small, complex components that
are not always visible in the final images. For instance, when
rendering an airplane, all of the MCAD parts are present in the
data set, represented by potentially millions of polygons that
must be processed. However, when this airplane is viewed
from the outside, only the outer surfaces are visible, not the fan
blades of the engine or the seats or bulkheads in the interior.
In a traditional 3D z-buffered graphics system, all polygons in
a scene must be processed by the graphics pipeline because it
is not known a priori which polygons will be visible and which
ones will be occluded (not visible). The notion of occlusion
culling, or removal of occluded objects, has been talked about
in the research community for several years. However,
implementations tend to be in software, where the performance is
not at a satisfactory level.
In the VISUALIZE fx series of graphics devices, HP developed
a very efficient algorithm that tests objects for visibility.
An application program can very quickly use the occlusion
culling visibility test to determine if a simple bounding box
representation of a more complex part is visible. Since a
bounding box, or more generally a bounding volume, com-
pletely encloses the more complex part, it is possible to know
a priori that if the bounding volume is not visible then the
complex part it encloses is not visible. Thus, the part that is
not visible does not need to be processed through the graphics
pipeline. The real benefit of occlusion culling comes when a
very complex part consisting of many vertices can be rejected,
avoiding the expenditure of valuable time to process it.
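The bounding-box test described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the VISUALIZE fx hardware algorithm or the actual OpenGL extension interface: the depth buffer is a plain 2D list, bounding boxes are already projected to screen-space rectangles with a single nearest depth, and every name (Box, Part, box_is_visible, render_with_culling) is hypothetical. Parts are assumed to be submitted roughly front to back so that earlier parts can act as occluders.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    """Screen-space rectangle with one depth value (smaller z is closer)."""
    x0: int
    y0: int
    x1: int  # exclusive
    y1: int  # exclusive
    z: float


@dataclass
class Part:
    """A complex part: a cheap bounding box plus its (expensive) surfaces."""
    name: str
    bounds: Box
    surfaces: List[Box]


def make_depth_buffer(width, height, far=1.0):
    """Depth buffer cleared to the far plane."""
    return [[far] * width for _ in range(height)]


def box_is_visible(depth, box):
    """Visibility test: would any pixel of the box pass the z test?
    The buffer is only read, never written."""
    return any(
        box.z < depth[y][x]
        for y in range(box.y0, box.y1)
        for x in range(box.x0, box.x1)
    )


def draw_surface(depth, surf):
    """Stand-in for sending real polygons through the pipeline."""
    for y in range(surf.y0, surf.y1):
        for x in range(surf.x0, surf.x1):
            if surf.z < depth[y][x]:
                depth[y][x] = surf.z


def render_with_culling(depth, parts):
    """Draw only parts whose bounding box is visible; return their names."""
    drawn = []
    for part in parts:
        if not box_is_visible(depth, part.bounds):
            continue  # bounding box hidden, so the enclosed part is hidden too
        for surf in part.surfaces:
            draw_surface(depth, surf)
        drawn.append(part.name)
    return drawn


# Hypothetical scene: an outer hull, a seat behind it, an antenna in front.
depth = make_depth_buffer(8, 8)
hull = Part("hull", Box(0, 0, 8, 8, 0.2), [Box(0, 0, 8, 8, 0.2)])
seat = Part("seat", Box(2, 2, 4, 4, 0.5), [Box(2, 2, 4, 4, 0.6)])
antenna = Part("antenna", Box(3, 3, 5, 5, 0.1), [Box(3, 3, 4, 4, 0.1)])
print(render_with_culling(depth, [hull, seat, antenna]))  # ['hull', 'antenna']
```

The seat is rejected after testing one small rectangle, which is the point of the technique: the cost of the test depends on the bounding volume, not on the number of vertices in the part it encloses.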
For very complex data sets, such as the airplane mentioned
above or an automobile, a tremendous performance increase
can be realized by using the HP occlusion culling technology.
To date, several ISVs have begun using occlusion culling in
their applications and are seeing a 25 to 100 percent increase
in graphics performance. This magnitude of performance benefit
typically costs a customer several thousand dollars for the
extra computational horsepower. HP includes this technology
as standard in all VISUALIZE fx series graphics accelerators,
giving even better price and performance results to our
customers.
The future of 3D graphics will continue toward visualizing ever
more complex objects and environments. Occlusion culling
and HP’s DirectModel technology (page 19) are
well positioned to be industry leaders in providing the
technology for 3D modeling applications.
The primary responsibility of the interface chip is to sepa-
rate the streams of data that arrive from the host SPU into
three paths and arbitrate access among those paths.
3D Path. Typically data from the host CPU looks very
much like the OpenGL API functions themselves. Data
following this first path is routed to the geometry chips.
The geometry chips process the data and return the re-
sults to the interface chip. These results are then sent on
to the texture chips or directly to the raster chips if the
texture mapping subsystem is not installed. In either case
the data is transmitted to and through all the texture and
raster chips in the system.
Unbuffered Path. This path passes data directly through
the interface chip to the texture and raster chips. This
provides a bypass method that allows traffic to get around
other pending operations. An example would be a texture
cache download that is required to complete a primitive
that is currently being rasterized, a situation that would
lead to deadlock without the unbuffered path.
2D Path. This path runs directly through the interface chip
to the texture and raster chips. The 2D path differs from
the unbuffered path in the way its priority is handled. The
interface chip manages priority among the three paths as
they all converge on the same set of wires between the
interface chip and the first texture chip. The unbuffered
path goes directly through the interface chip to those
wires and has priority over the other two paths. Data
targeting the 2D path is held off until all preceding 3D
work in the geometry chip has been flushed through to
the first texture chip.
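The priority rules among the three paths can be sketched as a small queue model. This is an illustrative simulation under stated assumptions, not a description of the actual interface chip logic: geometry processing is treated as instantaneous, each call to next_output stands for one transfer on the shared wires, and the class and method names (InterfaceArbiter, submit, next_output) are hypothetical. One further assumption not stated in the text: once a 2D item is eligible, it is sent ahead of 3D work that arrived after it.

```python
from collections import deque


class InterfaceArbiter:
    """Toy model of three paths converging on one set of wires."""

    def __init__(self):
        self.unbuffered = deque()     # bypass traffic, highest priority
        self.geometry_out = deque()   # 3D results returned by the geometry chips
        self.two_d = deque()          # (payload, 3D count that must be flushed first)
        self.submitted_3d = 0         # 3D items handed to the geometry chips
        self.flushed_3d = 0           # 3D items already sent to the texture chip

    def submit(self, path, payload):
        """Accept one item from the host on the named path."""
        if path == "unbuffered":
            self.unbuffered.append(payload)
        elif path == "3d":
            self.submitted_3d += 1
            self.geometry_out.append(payload)  # assumption: no geometry latency
        elif path == "2d":
            # Remember how much 3D work preceded this 2D item.
            self.two_d.append((payload, self.submitted_3d))
        else:
            raise ValueError("unknown path: " + path)

    def next_output(self):
        """Pick the next item for the shared wires, or None if idle."""
        if self.unbuffered:
            return ("unbuffered", self.unbuffered.popleft())
        # 2D is held off until all preceding 3D work has been flushed.
        if self.two_d and self.two_d[0][1] <= self.flushed_3d:
            return ("2d", self.two_d.popleft()[0])
        if self.geometry_out:
            self.flushed_3d += 1
            return ("3d", self.geometry_out.popleft())
        return None


def drain(arbiter):
    """Run the arbiter until every queue is empty."""
    sent = []
    item = arbiter.next_output()
    while item is not None:
        sent.append(item)
        item = arbiter.next_output()
    return sent


arb = InterfaceArbiter()
arb.submit("3d", "triangle A")
arb.submit("2d", "window blit")
arb.submit("3d", "triangle B")
arb.submit("unbuffered", "texture download")
for path, payload in drain(arb):
    print(path, payload)
```

In this run the texture download goes first even though it arrived last, and the window blit waits for triangle A but not for triangle B, which was submitted after it.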