The previous five posts, starting with Drawing Circles With OpenGL and finishing with Adding a Moving Triangle, created an increasingly complex program that displayed two circles and a triangle moving about the drawing canvas. A number of design and coding decisions were made when the program was written. This post discusses those decisions, and in doing so describes some of the efficiencies and inefficiencies that can result when using OpenGL.
OpenGL C++ Libraries?
The CirclesAndRotators program uses the OpenGL C API directly (through GLEW). There are C++ wrapper libraries available, but I did not use them: I am just starting to learn OpenGL, and introducing wrapper libraries at this stage would confuse the discussion, and probably reduce the meager readership that I currently have.
There are at least three C++ wrapper libraries:

- OGLplus
- OOGL
- glbinding

All of these are open source with source code on GitHub. OGLplus is actively maintained. OOGL has not had any updates for two years, and has bugs that have been outstanding for four years. glbinding uses C++11 features such as enum classes, lambdas and variadic templates rather than macros; it is actively maintained, and commercial support is available if needed. I leave it to you to decide whether to use any of these, but before doing so, read this discussion from StackOverflow.
How A Circle is Defined
It is possible to define a circle in terms of many triangles with one vertex at the centre of the circle and the other two vertices on the edge of the circle. The more triangles you use, the more closely the drawn object looks like a circle rather than a polygon.
I chose instead to simply define the smallest square that completely encloses the circle, and, in the fragment shader, to discard pixels outside the circle. This method works for circles of any size; it also illustrates some of what it is possible to do in shaders.
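A fragment shader using this technique might look like the sketch below. The uniform and variable names are assumptions for illustration; the real CirclesAndRotators shader may differ.

```glsl
#version 330 core

uniform mat4 transform;   // composite transform for this circle (assumed name)
uniform float radius2;    // squared radius of the circle (assumed name)
in vec2 canvasPosition;   // fragment position, interpolated from the vertices
out vec4 outColor;

void main()
{
    // Transform the circle's centre (the origin in model space) to canvas
    // space, then discard any fragment farther from it than the radius.
    vec2 centre = (transform * vec4(0.0, 0.0, 0.0, 1.0)).xy;
    vec2 offset = canvasPosition - centre;
    if (dot(offset, offset) > radius2) {
        discard;
    }
    outColor = vec4(1.0, 1.0, 1.0, 1.0);
}
```

The square's four vertices are all the geometry the GPU receives; every pixel inside the circle survives the discard test, so the drawn shape is exact at any size.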
In the CirclesAndRotators program's OnPaint method, glUseProgram is called to set the circle shader program, and the two circles are painted. glUseProgram is then called again to set the triangle shader program before the triangle is painted.

An alternative would be to place the glUseProgram calls in each object's Paint method (for example, the glEquilateralTriangle Paint method). However, glUseProgram is expensive in terms of the work that the GPU must perform to switch shader programs, so the fewer program switches, the better your program will perform. In the CirclesAndRotators program you will not perceive any difference between placing the glUseProgram calls in OnPaint and placing them in each object's Paint method, but once a program draws many thousands of objects, the difference in performance becomes noticeable. In general, call glUseProgram once, paint all of the objects that use that shader program, then call glUseProgram for the next set of objects, and so forth.
Perform Transform in CPU or GPU?
Most graphics programs are designed such that objects are created to be centred on the origin of the display canvas, and then transformed to their final location; this is especially true for objects that are moved about the canvas whenever a frame is painted.
There are two places these transformations can be performed:
- By the CPU before moving the vertex data to the GPU; and,
- By the GPU.
The second is preferred for a number of reasons:
- CPUs typically have between 2 and 16 logical cores. You could create multiple threads to calculate the final position of each vertex and pass the vertices to the GPU in vertex buffers each time the frame is painted, but how quickly you can transform the vertices is limited by the number of logical cores in the CPU. Alternatively, you could pass the initial vertices in a vertex buffer once, and pass the transform via a uniform each time the frame is painted. Modern GPUs are optimized for parallel processing and may contain hundreds or thousands of cores. For complex objects that contain many vertices, performing the transformation in the GPU is much more efficient.
- Transforming the vertices in the CPU performs work that could be done in a vertex shader, but what happens if you need access to the transform elsewhere in the graphics pipeline? For example, in the circle fragment shader in the CirclesAndRotators program, the transform is required so that each fragment can be transformed to determine whether it is inside or outside the circle. That information is not available to the fragment shader if the vertices are transformed by the CPU.
- For programs that display a large number of objects, each containing a large number of vertices, the vertex buffers that are transferred would be very large in comparison to the 16 values that are passed in a transformation matrix. Moving data to the GPU is a relatively time-consuming operation whose time is determined to some extent by the size of the data being moved. Transporting a large amount of data every time a frame is painted by the GPU takes more time than transporting a large amount of data once and smaller amounts of data each frame.
For the reasons given above, it is generally better to perform transformations in shaders (by the GPU) rather than in the program (by the CPU).
Where To Apply Multiple Transforms
In the CirclesAndRotators program, two transforms are created for each circle and multiplied together to form a composite transform. For the triangle, four transforms are created and multiplied together. The composite transforms are passed to the shaders, where they are applied to the vertices in the vertex shader, and also to the fragments in the circle fragment shader.
An alternative would be to pass all of the individual transforms to the shaders and multiply them together there. However, the result of multiplying the individual transforms together is the same for every vertex (and, in the case of the circles, every fragment) within a frame, so repeating the multiplication per vertex and per fragment performs a large amount of unnecessary work. Therefore, only the composite transforms should be passed as uniforms to the shaders.
Note also that in the circle fragment shader, the centre of the circle is calculated for every fragment, but this result is always the same when a circle is painted. Therefore, this calculation should have been done in the CPU, with the result being passed as a uniform.
This post has discussed some of the design and coding decisions that I made when writing the CirclesAndRotators program. A number of options were considered, and the choices that are better for drawing efficiency were noted.