Behind the Graphics2D: The OpenGL-based Pipeline

by Chris Campbell
11/12/2004


Contents
Introduction
General Comments
Shape Rendering
   Non-antialiased Rendering
   Antialiased Rendering
Text Rendering
Image Rendering
   System Memory Surface --> OpenGL Surface
   Managed Image (OpenGL Texture) --> OpenGL Surface
   VolatileImage (OpenGL Pbuffer) --> OpenGL Surface
Miscellaneous
Conclusion

Introduction

Ever since the new OpenGL-based Java 2D pipeline became available in J2SE 5.0, developers have been asking the same question: "Which rendering operations are accelerated by OpenGL?" While I've tried my best to answer that question clearly, I know that my answers never tell the whole story. There is no simple way to answer it in a few sentences, or with a "matrix of supported operations" or anything like that. Even my colleagues will tell you that I usually resort to wild handwaving and whiteboard diagramming (that verges on interpretive dance at times) when I try to explain this stuff in the office.

Therefore, I compiled this document to help answer the hot question and explain all the caveats that developers might encounter when they run their application with the OpenGL-based pipeline enabled. Even this one (long!) document is probably not sufficient. There are at least two more topics that I would like to cover in the near future: a performance comparison of the OpenGL-based pipeline, and a roadmap describing some of the features and performance improvements we would like to implement for it in the future.

[This document describes the current state of the OpenGL-based pipeline as of J2SE 5.0. Keep in mind that this story may change a bit in future releases as we find ways to accelerate more operations using OpenGL.]

General Comments

Shape Rendering

Operations in this category include drawLine(), fillRect(), draw(Shape), etc. The way that each operation is handled largely depends on whether the ANTIALIASING RenderingHint is turned on, in addition to the other relevant Graphics2D state.
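To make the distinction concrete, here is a minimal sketch of switching the antialiasing hint between two shape operations. The class name and the setAntialiasing() helper are made up for illustration, and the destination is a plain BufferedImage so the code runs anywhere; it only shows the Graphics2D state that selects one path or the other, not the OpenGL rendering itself.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.geom.Ellipse2D;
import java.awt.image.BufferedImage;

public class AntialiasHintSketch {
    // Hypothetical helper (not part of Java 2D): toggles the antialiasing hint.
    public static void setAntialiasing(Graphics2D g, boolean on) {
        g.setRenderingHint(RenderingHints.KEY_ANTIALIASING,
                on ? RenderingHints.VALUE_ANTIALIAS_ON
                   : RenderingHints.VALUE_ANTIALIAS_OFF);
    }

    public static void main(String[] args) {
        // A BufferedImage destination keeps the sketch runnable without a display.
        BufferedImage img = new BufferedImage(100, 100, BufferedImage.TYPE_INT_ARGB_PRE);
        Graphics2D g = img.createGraphics();
        try {
            setAntialiasing(g, false);
            g.drawLine(0, 0, 99, 99);                     // non-antialiased shape operation

            setAntialiasing(g, true);
            g.fill(new Ellipse2D.Double(10, 10, 80, 80)); // antialiased shape operation
        } finally {
            g.dispose();
        }
    }
}
```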

Non-antialiased Rendering (ANTIALIAS_DEFAULT/OFF)

Some basic operations can be rendered directly by OpenGL simply by passing down the coordinates of the operation. Specifically, these basic operations include drawLine(), drawRect(), drawPolygon(), drawPolyline(), and fillRect(). More complex operations, such as drawArc() and fill(Shape), are first converted to easily digestible spans, which are then rendered by OpenGL. The Graphics2D state determines how the operation is handled by OpenGL:

Paint

Composite
All 12 Porter-Duff rules defined by the AlphaComposite class can be accelerated by OpenGL. Likewise, if XOR mode is set, then we will use OpenGL's XOR logic operation to accelerate XOR rendering. For custom Composite implementations, we will fall back on our software pipelines to complete the operation.
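As a rough illustration of the three cases, the sketch below installs an AlphaComposite, then switches to XOR mode, and uses a made-up isPorterDuff() helper to tell AlphaComposite instances apart from custom Composite implementations. The helper is only an approximation of the rule described above, not the pipeline's actual check, and the BufferedImage destination is there just to keep the example runnable.

```java
import java.awt.AlphaComposite;
import java.awt.Color;
import java.awt.Composite;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

public class CompositeSketch {
    // Hypothetical helper: true for any of the AlphaComposite (Porter-Duff) rules,
    // false for custom Composite implementations.
    public static boolean isPorterDuff(Composite c) {
        return c instanceof AlphaComposite;
    }

    public static void main(String[] args) {
        BufferedImage img = new BufferedImage(64, 64, BufferedImage.TYPE_INT_ARGB_PRE);
        Graphics2D g = img.createGraphics();
        try {
            // Case 1: one of the AlphaComposite rules, here SrcOver with 50% extra alpha.
            g.setComposite(AlphaComposite.getInstance(AlphaComposite.SRC_OVER, 0.5f));
            g.setColor(Color.RED);
            g.fillRect(0, 0, 32, 32);
            System.out.println(isPorterDuff(g.getComposite())); // prints "true"

            // Case 2: XOR mode, set through its own Graphics method.
            g.setXORMode(Color.WHITE);
            g.fillRect(16, 16, 32, 32);
        } finally {
            g.dispose();
        }
    }
}
```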

Stroke
For simple draw operations (such as drawLine()), the geometry can be sent directly to OpenGL only when there is a thin stroke (i.e. a default BasicStroke with width=1.0) installed on the Graphics2D object. If the stroke state is any more complex, then the shape will be sent to the software rasterizer and converted into spans, which will then be rendered by OpenGL as a list of simple quads. (The composite and paint operations will still be accelerated by OpenGL as described above when rendering the spans.)
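If you want a quick way to tell which side of this line your own stroke falls on, something like the following sketch may help. The isDefaultThinStroke() helper is made up for this example and simply compares against a default BasicStroke; the test the pipeline actually performs internally may well be more lenient, so treat a "false" here as "possibly goes through the span path" rather than a guarantee.

```java
import java.awt.BasicStroke;
import java.awt.Stroke;

public class StrokeSketch {
    // A default BasicStroke: width 1.0, square caps, miter joins, no dashing.
    private static final BasicStroke DEFAULT_STROKE = new BasicStroke();

    // Hypothetical helper: true only when the stroke equals the default BasicStroke.
    public static boolean isDefaultThinStroke(Stroke s) {
        return DEFAULT_STROKE.equals(s);
    }

    public static void main(String[] args) {
        float[] dash = {4f, 4f};
        Stroke dashed = new BasicStroke(1f, BasicStroke.CAP_SQUARE,
                BasicStroke.JOIN_MITER, 10f, dash, 0f);

        System.out.println(isDefaultThinStroke(new BasicStroke()));     // prints "true"
        System.out.println(isDefaultThinStroke(new BasicStroke(3.0f))); // prints "false"
        System.out.println(isDefaultThinStroke(dashed));                // prints "false"
    }
}
```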

Transform
If the current AffineTransform represents a simple translation (no scale, shear, or rotation), then the translation factors will be applied to the parameters of the operation and the operation will be performed by OpenGL. If the current AffineTransform is more complex, then the shape will be sent to the software rasterizer and converted into spans, which will then be rendered by OpenGL as a list of simple quads. (The composite and paint operations will still be accelerated by OpenGL as described above when rendering the spans.)
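The "simple translation" condition can be approximated from application code using AffineTransform.getType(). The sketch below is my own illustration (the class and the isTranslationOnly() helper are hypothetical names), and it treats the identity transform as a trivial translation.

```java
import java.awt.geom.AffineTransform;

public class TransformSketch {
    // Hypothetical helper: true when the transform is the identity or a pure
    // translation, i.e. no scale, shear, flip, or rotation bits are set.
    public static boolean isTranslationOnly(AffineTransform at) {
        return (at.getType() & ~AffineTransform.TYPE_TRANSLATION) == 0;
    }

    public static void main(String[] args) {
        System.out.println(isTranslationOnly(new AffineTransform()));                        // prints "true"
        System.out.println(isTranslationOnly(AffineTransform.getTranslateInstance(5, 7)));   // prints "true"
        System.out.println(isTranslationOnly(AffineTransform.getScaleInstance(2, 2)));       // prints "false"
        System.out.println(isTranslationOnly(AffineTransform.getRotateInstance(Math.PI / 6))); // prints "false"
    }
}
```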

Antialiased Rendering (ANTIALIAS_ON)

The built-in antialiasing facilities in OpenGL are not of sufficient quality and consistency to be used in the OpenGL-based Java 2D pipeline. Therefore, when antialiasing is enabled, shape rendering operations go through the software geometry rasterizer, which knows how to optimally apply the current transform, stroke, and clip state in order to produce something easily digestible by OpenGL. Specifically, the geometry is converted into a series of alpha mask tiles. (There are actually a ton of things going on here, but for the sake of simplicity I'll just talk about this process from the perspective of the OpenGL-based pipeline, which only knows how to take these alpha tiles and turn them into something visible on the screen.)

Even though the software rasterizer is heavily involved when antialiasing is enabled, I would still argue that the operation can be considered "accelerated", since OpenGL can be used to apply the mask to the current Paint and composite the result to the destination OpenGL surface.

Due to the way the operation is defined, OpenGL will only accelerate the alpha mask operation if:

If the above restrictions are not met (e.g. a GradientPaint is installed), we will use a slower path, but rest assured that we will use OpenGL whenever possible to render the antialiased shape to the destination surface.

Text Rendering

Operations in this category include drawString(), drawGlyphVector(), etc.

Rendering of text, both antialiased and non-antialiased, is accelerated by the OpenGL-based pipeline. We maintain an OpenGL texture that acts as a hardware glyph cache, so commonly used glyphs can simply be texture mapped to the destination surface, taking advantage of the hardware accelerated compositing offered by OpenGL. The heuristics used by the OpenGL glyph cache are subject to change, but in J2SE 5.0, we attempt to cache a glyph if its width and height are each less than or equal to 16 pixels. If the glyph cannot fit in the OpenGL glyph cache (which can hold approximately 1024 16x16 glyphs), we render each glyph individually using a process very similar to that described above in "Antialiased Rendering" (including the same restrictions on the current Paint and Composite).
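The size rule is simple enough to write down. The following is only a sketch of the J2SE 5.0 heuristic as stated above, with made-up names (GlyphCacheSketch, isCacheCandidate); it says nothing about how the cache is actually laid out or managed.

```java
public class GlyphCacheSketch {
    // Size limit taken from the J2SE 5.0 heuristic described in the text.
    public static final int MAX_CACHED_GLYPH_SIZE = 16;

    // Hypothetical helper: a glyph is a cache candidate when both its width
    // and its height (in pixels) are at most 16.
    public static boolean isCacheCandidate(int glyphWidth, int glyphHeight) {
        return glyphWidth <= MAX_CACHED_GLYPH_SIZE
            && glyphHeight <= MAX_CACHED_GLYPH_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(isCacheCandidate(12, 14)); // prints "true"
        System.out.println(isCacheCandidate(16, 16)); // prints "true"
        System.out.println(isCacheCandidate(17, 10)); // prints "false"
    }
}
```

In practice this means that text drawn at typical UI point sizes is a likely candidate for the cache, while large display text is more likely to take the per-glyph path.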

Image Rendering

Operations in this category include all the drawImage() variants. If you are unfamiliar with the concepts of VolatileImages and "managed images", I highly suggest you read through Chet's blogs on those subjects.

Imaging operations are usually accelerated in hardware by OpenGL, even if one of the 12 AlphaComposite rules is installed on the Graphics2D. Generally speaking, the OpenGL-based pipeline will accelerate all drawImage() variants, including:

Exactly how the image data is rendered to an OpenGL surface depends on the types of images involved. Each type of imaging operation is described below.

System Memory Surface --> OpenGL Surface

System memory surfaces (e.g. a BufferedImage that has not yet been cached in an OpenGL texture) of the following types can be rendered directly by OpenGL:

If an image is not of one of the above types, we can still use OpenGL to render the image, but we will first convert the image into an intermediate type that OpenGL can handle, such as IntArgbPre.

The glDrawPixels() operation can handle simple copies and simple scales (in conjunction with glPixelZoom()), so these operations should be relatively performant. However, glDrawPixels() is known to be somewhat slow, especially on graphics hardware in the x86 world, so this is not the optimal path.

There is no direct way in OpenGL for transforming system memory surfaces (barring the "pixel transform" extension, which is either not available or not performant on most graphics hardware). Therefore, the OpenGL-based pipeline will use a special tiled approach that uses an intermediate OpenGL texture object to transform the system memory surface:

sysmem --> texture --> OpenGL surface

This approach is reasonably fast since the intermediate texture operations are handled in hardware, but note that it is currently defined only for NEAREST_NEIGHBOR interpolation. (We have an RFE open that would make this work for BILINEAR as well, but for now BILINEAR and BICUBIC hints are handled by our software transform loops in this case.)
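From the application side, the operation in question looks something like the sketch below: a BufferedImage drawn through an AffineTransform with an explicit interpolation hint. The drawTransformed() helper and class name are hypothetical, and the destination is a BufferedImage purely so the example runs anywhere; the tiled texture path described above only comes into play when the destination is an OpenGL surface.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.geom.AffineTransform;
import java.awt.image.BufferedImage;

public class TransformedImageSketch {
    // Hypothetical helper: draws src through the given transform into a new
    // w x h image, using the requested KEY_INTERPOLATION hint value.
    public static BufferedImage drawTransformed(BufferedImage src, AffineTransform at,
                                                Object interpolation, int w, int h) {
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = dst.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION, interpolation);
            g.drawImage(src, at, null);
        } finally {
            g.dispose();
        }
        return dst;
    }

    public static void main(String[] args) {
        BufferedImage src = new BufferedImage(32, 32, BufferedImage.TYPE_INT_RGB);
        // Rotate 45 degrees around the point (32, 32) in destination space.
        AffineTransform rotate = AffineTransform.getRotateInstance(Math.PI / 4, 32, 32);
        BufferedImage out = drawTransformed(src, rotate,
                RenderingHints.VALUE_INTERPOLATION_NEAREST_NEIGHBOR, 64, 64);
        System.out.println(out.getWidth() + "x" + out.getHeight());
    }
}
```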

Managed Image (OpenGL Texture) --> OpenGL Surface

Managed images of all types can be cached in an OpenGL texture. There are direct loops defined that can upload the system memory types listed above into an OpenGL texture. If an image is not one of those types, we will first convert its system memory surface into an intermediate format (such as IntArgb) that we can then upload into an OpenGL texture. Once an image has been cached in an OpenGL texture object, that image can be rendered to an OpenGL surface by mapping the texture to an OpenGL quad. The texture-mapped quad will respect the current AffineTransform state, and will therefore be transformed.

For example, if there is a rotation transform set on the Graphics2D object, the texture will be rotated by the graphics hardware. Likewise, the variants of drawImage() that take scaling parameters will scale the texture-mapped quad before rendering to the destination OpenGL surface. Transforming a managed image with either NEAREST_NEIGHBOR or BILINEAR interpolation RenderingHints will be accelerated by OpenGL in hardware. Unfortunately, OpenGL does not support BICUBIC interpolation for textures, so we fall back on our software transform loops for the BICUBIC case.
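Since the interpolation rules differ between system memory surfaces and managed images, it may help to see them side by side. The sketch below just encodes the J2SE 5.0 behavior as described in this article; the enum, class, and method names are mine, and none of this mirrors actual pipeline code.

```java
import java.awt.RenderingHints;

public class InterpolationSketch {
    // Hypothetical labels for the two source types discussed so far.
    public enum Source { SYSTEM_MEMORY, MANAGED_TEXTURE }

    // Returns true when, per this article, a transformed drawImage() from the
    // given source type is handled by OpenGL for the given interpolation hint.
    public static boolean transformAccelerated(Source src, Object hint) {
        if (RenderingHints.VALUE_INTERPOLATION_NEAREST_NEIGHBOR.equals(hint)) {
            return true;                          // both source types
        }
        if (RenderingHints.VALUE_INTERPOLATION_BILINEAR.equals(hint)) {
            return src == Source.MANAGED_TEXTURE; // textures only
        }
        return false;                             // BICUBIC: software transform loops
    }

    public static void main(String[] args) {
        Object bilinear = RenderingHints.VALUE_INTERPOLATION_BILINEAR;
        System.out.println(transformAccelerated(Source.SYSTEM_MEMORY, bilinear));   // prints "false"
        System.out.println(transformAccelerated(Source.MANAGED_TEXTURE, bilinear)); // prints "true"
    }
}
```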

VolatileImage (OpenGL Pbuffer) --> OpenGL Surface

Simple copies and scaled copies from a pbuffer-backed VolatileImage to an OpenGL surface will be accelerated by the VRAM->VRAM glCopyPixels() operation, and should be relatively performant. There is no easy, direct way in OpenGL for transforming pbuffers, however. On Windows, we use the render-to-texture approach, which allows us to treat the pbuffer as a texture object that can be transformed to the destination OpenGL surface. On Solaris and Linux, we can transform a pbuffer-backed VolatileImage using a tiled approach similar to that described above in System Memory Surface --> OpenGL Surface:

pbuffer --> texture --> OpenGL surface

This approach is reasonably fast since the intermediate texture operations are handled in hardware, but note that it is currently defined only for NEAREST_NEIGHBOR interpolation, as mentioned above in "System Memory Surface --> OpenGL Surface".

Miscellaneous

Conclusion

I hope this article answers most of the questions developers have been asking for the past few months. If you see any glaring omissions, something you would like clarified, or topics for a future "Behind the Graphics2D" article, please post a comment. I'll try to incorporate your suggestions into this document so that it can be the "definitive source" for this topic (if that's possible).

Chris Campbell is an engineer on the Java 2D Team at Sun Microsystems, working on OpenGL hardware acceleration and imaging related issues.

