
Screen Space transformation in OpenGL · 2009-12-13 16:10 by Black

Transforming from World Space to Screen Space in OpenGL is useful for selecting and picking with a mouse or a similar pointing device. OpenGL itself provides the functions gluProject and gluUnProject for this. This class replicates that functionality. Download the full header and source for a commented version. Below, a simplified header shows the API.

screenproject.h

  class ScreenProject
  {
  public: // ScreenSpace
    void calculateMVP(GLint * vp, double * mv, double * p);
    void calculateMVP();
    Point transformToScreenSpace(Point _p, float _f = 1) const;
    Point transformToClipSpace(Point _p, float _f = 1) const;
    Point transformToWorldSpace(Point _p) const;
  private:
    double mvp_[16];    /**< Product of Modelview and Projection Matrix */
    double mvpInv_[16]; /**< Inverse of mvp_ */
    long vp_[4];        /**< Viewport */
  };

Even once the class is instantiated, it cannot be used until one of the calculateMVP overloads is called, either to pass custom matrices or to read the matrices from the current OpenGL context. After that, the matrix and its inverse are stored internally, and the calculations work without accessing any outside data.

The various transform functions transform the passed point with the internal state. Going from screen space to world space is very useful in picking, selecting or generally interacting with the scene with a pointing device. Transforming to clip or screen space is used when calculating anchor points for labels on screen.

The Point type is a three-element vector. OpenGL works with homogeneous coordinates, so a fourth value is needed. Because it is almost always 1, that was chosen as the default value. There is one important exception: when transforming normals, the last coordinate is zero, since normals are not influenced by translations.
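As a sketch of typical use (assuming an active OpenGL context; the coordinate values are made up):

  ScreenProject sp;
  sp.calculateMVP();  // read matrices and viewport from the current context

  // Picking: bring a window position (depth read e.g. via glReadPixels
  // from the depth buffer) back into world space.
  Point world = sp.transformToWorldSpace(Point(320, 240, 0.5f));

  // Labels: project a world space anchor point onto the screen.
  Point anchor = sp.transformToScreenSpace(Point(1, 2, 3));

  // Normals: pass 0 as the homogeneous coordinate, so translations
  // have no effect on the result.
  Point n = sp.transformToClipSpace(Point(0, 0, 1), 0);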

The source file contains the code for the transformation with the matrices, the viewport transformation and the matrix inversion. The code for the inversion was taken from Mesa; everything else is my own. OpenGL’s matrices are column-major, so the four numbers of the first column are mapped to the first four slots of the 16-slot array. The transformation matrix is built from the modelview matrix M and the projection matrix P as P*M; a point p is then projected as P*M*p.
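As a sketch of that layout, a column-major 4×4 multiply (a hypothetical helper, not part of the simplified header above) could look like this:

  // Hypothetical helper: c = a * b for column-major 4x4 matrices,
  // where element (row, col) lives at index col * 4 + row.
  void multiplyMatrices(const double * a, const double * b, double * c)
  {
    for (int col = 0; col < 4; col++)
    {
      for (int row = 0; row < 4; row++)
      {
        double sum = 0;
        for (int k = 0; k < 4; k++)
          sum += a[k * 4 + row] * b[col * 4 + k];
        c[col * 4 + row] = sum;
      }
    }
  }

calculateMVP would then compute something like multiplyMatrices(p, mv, mvp_), i.e. P*M, so that a point is projected as P*M*p.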

screenproject.cpp [6.51 kB]

  Point ScreenProject::transformToClipSpace(Point _p, float _f) const
  {
    Point pT = transformPointWithMatrix(_p, mvp_, _f);
    return Point((pT[0] + 1)/2, (pT[1] + 1)/2, (pT[2] + 1)/2);
  }

  Point ScreenProject::transformToWorldSpace(Point _p) const
  {
    // Transform to normalized coordinates
    _p[0] = (_p[0] - vp_[0]) * 2 / vp_[2] - 1.0f;
    _p[1] = (_p[1] - vp_[1]) * 2 / vp_[3] - 1.0f;
    _p[2] = 2 * _p[2] - 1.0;

    // Transform
    return transformPointWithMatrix(_p, mvpInv_);
  }

  Point ScreenProject::transformPointWithMatrix(Point _p, const double * _m, float _f) const
  {
    float xp = _m[0] * _p[0] + _m[4] * _p[1] + _m[8] * _p[2] + _f * _m[12];
    float yp = _m[1] * _p[0] + _m[5] * _p[1] + _m[9] * _p[2] + _f * _m[13];
    float zp = _m[2] * _p[0] + _m[6] * _p[1] + _m[10] * _p[2] + _f * _m[14];
    float tp = _m[3] * _p[0] + _m[7] * _p[1] + _m[11] * _p[2] + _f * _m[15];
    if (tp == 0)
      return Point(xp, yp, zp);
    else
      return Point(xp / tp, yp / tp, zp / tp);
  }


Simple vector class in C++ · 2009-12-12 14:44 by Black

Mathematical vectors are often used in C++, but no classes for them exist in the std, the standard library. C++’s std::vector, despite the name, is a heavyweight container that offers a rich function set but is unsuitable for mathematics.

For ExaminationRoom I created a set of small classes to easily pass vectors around in my code and perform simple operations on them. The implementation uses templates and operator overloading to offer an easy interface for users while staying flexible. It is not intended as a competitor to Boost’s uBLAS classes or similar rich mathematics libraries.

vec.h [7.47 kB]

  /**
  A small helper object, that is a 2 element vector. It can be treated as point
  (with x, y accessors) or an array (with operator[] accessor).
  */
  template <typename T>
  union Vec2
  {
    enum { dim = 2 };

    struct
    {
      T x;
      T y;
    };

    T vec[dim];

    Vec2() { x = y = 0; }
    Vec2(const Vec2<T>& v)
    {
      x = v.x;
      y = v.y;
    }
    Vec2(T a, T b)
    {
      x = a;
      y = b;
    }

The code above is the start of the declaration of the Vec2 type. There are three base types, Vec2, Vec3 and Vec4, which are intentionally incompatible; conversion with a constructor is only possible when no data is lost. An interesting detail is that the type is declared not as a normal class but as a union, so all members share the same location in memory. That way, vector values can be accessed either by name or by index.
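A short usage sketch of what the union layout buys (the values are made up):

  Vec2<float> v(1.0f, 2.0f);
  float a = v.x;      // access by name
  float b = v.vec[0]; // access by index; same memory as v.x
  v.y = 3.0f;         // a write through one view is visible through the other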

vec.h [7.47 kB]

  template <typename T>
  inline Vec2<T> & operator/=(Vec2<T> &v1, const T s1)
  {
    v1.x /= s1;
    v1.y /= s1;
    return v1;
  }

  template <typename T>
  inline const Vec2<T> operator+(const Vec2<T> &v1, const Vec2<T> &v2)
  {
    Vec2<T> v = v1;
    return v += v2;
  }

The operators are defined globally as inline functions since they are rather simple. Whenever possible, the definition of an operator builds on a previously defined one to minimize code duplication.
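For example, operator+ above builds on operator+=, which is not part of the excerpt; it presumably looks much like operator/=:

  template <typename T>
  inline Vec2<T> & operator+=(Vec2<T> &v1, const Vec2<T> &v2)
  {
    v1.x += v2.x;
    v1.y += v2.y;
    return v1;
  }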
Some special methods were defined for normalization of the vectors, conversion to homogeneous vectors, as well as cross products of 3-element vectors. Frequently used types are defined with descriptive names. These are the typedefs I used in ExaminationRoom:

vec.h [7.47 kB]

  typedef Vec3<float> Vec3f;
  typedef Vec4<float> Vec4f;
  typedef Vec3f Point;
  typedef Vec3f Vector;
  typedef Vec3f Color3;
  typedef Vec4f Color4;

Lua Interface

An implementation of marshaling to and from Lua, in the form of a simple table, was also written for use with luabridge:

lua interfacing

  template <typename V>
  inline void pushVector(lua_State *L, V v)
  {
    const int n = V::dim;
    lua_createtable(L, n, 0);
    for (int i = 0; i < n; i++)
    {
      lua_pushnumber(L, i+1);
      lua_pushnumber(L, v[i]);
      lua_settable(L, -3);
    }
  }

  template <typename V>
  inline V toVector(lua_State *L, int idx)
  {
    const int n = V::dim;
    V v;
    luaL_checktype(L, idx, LUA_TTABLE);
    // Note: idx should be an absolute (positive) index, since a key is
    // pushed onto the stack before each lua_gettable call.
    for (int i = 0; i < n; i++)
    {
      lua_pushnumber(L, i+1);
      lua_gettable(L, idx);
      v[i] = lua_tonumber(L, -1);
      lua_pop(L, 1);
    }
    return v;
  }

  template <>
  struct tdstack <Tool::Vec2f>
  {
    static void push (lua_State *L, const Tool::Vec2f & data)
    {
      pushVector<Tool::Vec2f>(L, data);
    }
    static Tool::Vec2f get (lua_State *L, int index)
    {
      return toVector<Tool::Vec2f>(L, index);
    }
  };

From C++ to Lua, a table is created and all elements are put into it in order. This requires the element type of the vector to be compatible with Lua’s number representation, which is usually a floating point number. In the other direction, tables that contain numbers are converted back to a vector type of the suitable size. Missing or wrong table contents lead to Lua errors that get caught with lua_pcall; all other table contents are ignored. (Here the reason for the existence of the enum dim becomes apparent: it is a constant that can be evaluated at compile time and, unlike a static const member variable, takes up no storage, so it can be both declared and defined in a header.)
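A round trip from the C++ side might look like this (a sketch; L is assumed to be a valid lua_State):

  Tool::Vec2f v(1.0f, 2.0f);
  tdstack<Tool::Vec2f>::push(L, v);  // leaves the table {1, 2} on the stack
  int t = lua_gettop(L);             // absolute index of that table
  Tool::Vec2f w = tdstack<Tool::Vec2f>::get(L, t);
  lua_pop(L, 1);                     // w is now equal to v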


Mayan with GLSL · 2009-12-09 00:00 by Black

The first test implementation of Mayan was a Photoshop file containing the picture in various states of desaturation and blending. The second implementation was a DirectShow filter for the group’s stereo movie player. The third and latest implementation is an OpenGL Shading Language shader for ExaminationRoom.

ExaminationRoom was extended to support shader-assisted merging of the two viewpoints. This is done by rendering both the left and the right camera’s view into framebuffer objects, which are then drawn while the given fragment shader is active. The shader can calculate how to modify each side’s fragments. The blend function is GL_ONE during this pass, so the two sides are summed without further modification.
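A rough sketch of that merge pass, assuming the two views have already been rendered into the textures leftTex and rightTex, and that drawFullscreenQuad draws a screen-filling textured quad (all names hypothetical):

  glEnable(GL_BLEND);
  glBlendFunc(GL_ONE, GL_ONE);  // sum both passes without modification
  glUseProgram(mayanShader);
  GLint sideLoc = glGetUniformLocation(mayanShader, "side");

  glBindTexture(GL_TEXTURE_2D, leftTex);
  glUniform1f(sideLoc, 0.0f);   // left eye
  drawFullscreenQuad();

  glBindTexture(GL_TEXTURE_2D, rightTex);
  glUniform1f(sideLoc, 1.0f);   // right eye
  drawFullscreenQuad();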

mayan.fs [526.00 B]

  uniform sampler2D tex;
  uniform float side;

  // Factor that determines how much of the other
  // color channels is mixed into the primary channel
  // of that side. This is the same lambda as in the
  // Mayan paper.
  uniform float lambda;

  void main()
  {
    float facR = 1.0-side;
    float facG = side;
    float mixFactor = (1.0-lambda)*0.5;

    vec4 c = texture2D(tex, gl_TexCoord[0].xy);
    gl_FragColor = vec4(
      facR*(c.r*lambda + (c.g+c.b)*mixFactor), // Red
      facG*(c.g*lambda + (c.r+c.b)*mixFactor), // Green
      c.b*0.5, // Blue
      0.5); // Alpha
  }

The fragment shader gets a uniform variable, side, that defines which side the currently drawn texture belongs to. lambda is a factor that influences the desaturation of the colors for a better 3D impression.

Using shaders for the mixing allows for maximal adaptability at hardware-accelerated speed. Unlike the original anaglyph renderer, it can mix different colors and handle shared channels like Mayan’s blue.


ExaminationRoom · 2009-12-08 13:05 by Black

As previously mentioned, ExaminationRoom is the result of my master’s thesis. From the project page:

Viewing stereoscopic movies or images is unnatural. The focus and vergence of the eyes have to be decoupled. Artefacts and inconsistencies of a stereoscopic image with the real world cause confusion and decrease viewing pleasure.

ExaminationRoom is a tool that enables exploring and quantifying those problems by providing a flexible and extensible framework for user testing. Challenges include understanding the needs in this relatively new field of research, as well as the commonly used methods in user testing.

The project began with a simple program that generated random dot stereograms from a 1-bit depth image. While this code was later rewritten completely, it proved that the general idea of the test worked.

The real ExaminationRoom design started out as a bunch of boxes on a notepad. It was fairly simple: a scene graph containing the objects that are displayed, a scripting core that executes user-provided code to move the scene, and a rendering engine that renders the scene graph.

After some searching I decided to write my own scene graph code; the preexisting libraries had too many limitations when it came to simulating depth cues. The script core is a Lua state that acts directly on the scene graph. During my WoW addon writing career I got to like this language; its simplicity makes it easy to learn and to integrate into other applications.

The whole application had to run on both Mac OS X and Windows. Qt was the most comfortable way to achieve this goal: it abstracts many platform-dependent features such as window and input handling. Easy handling of pictures for textures was an added benefit.

The rendering of the scene is OpenGL based. Each object in the scene graph can draw itself into the scene; containers can modify the state before and after drawing their contents to achieve interesting effects. The rendering of the stereoscopic representation is controlled by a group of classes titled Renderer, which are responsible for mixing the views of the left and right cameras appropriately.
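In code, the pattern looks roughly like this (a hypothetical sketch; the actual class names in ExaminationRoom may differ):

  #include <vector>

  class SceneNode
  {
  public:
    virtual ~SceneNode() {}
    virtual void draw() const = 0; // each object draws itself
  };

  class Container : public SceneNode
  {
  public:
    virtual void draw() const
    {
      setUpState();    // e.g. push matrices or enable a shader
      for (unsigned int i = 0; i < children_.size(); i++)
        children_[i]->draw();
      tearDownState(); // restore the previous state
    }
  protected:
    virtual void setUpState() const {}
    virtual void tearDownState() const {}
  private:
    std::vector<SceneNode *> children_;
  };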

ExaminationRoom Screenshot

The screenshot shows a simple scene with custom depth ordering drawn with the line interlacing renderer.
Read more on this topic in my thesis, but be warned: It’s long! :)


Mayan Anaglyph · 2009-12-06 13:26 by Black

My semester thesis in Software Engineering was titled Mayan. The name by itself is as non-descriptive as it can get, but it basically is an improved method for anaglyph stereo. The improvement is better color perception and preservation while achieving superior fusion.

Stereoscopic Images?

Stereoscopic images are pictures that can be seen in 3D; they contain data from both eyes’ viewpoints. Various technologies exist to produce, store, display and perceive such images. In this article I am writing about anaglyph, color-encoded stereo.

How does it work?

Almost all color display technology these days works by mixing three color channels: red, green and blue. Due to the limited sensory equipment humans possess, this is enough to imitate a wide range of the visible color spectrum. Traditional anaglyph exploits the existence of three channels: it displays the image for one eye in red only, and for the other eye in green and blue, which appears cyan. (There are many methods to mix the images that preserve different aspects of the image.)

Mayan displays the left image in the magenta plane (red and blue) and the right image in the cyan plane (green and blue); the blue channel is therefore shared. Having two channels for each eye makes color perception much better. Of course the visibility of both sides’ blue causes crosstalk, but due to shortcomings in human physiology, blue cannot be perceived as sharply as other colors, so the impact is lessened.

Mayan example image

Fusion vs. Rivalry

Fusion is achieved when both the left and the right image are perceived by the correct eye, and the brain considers them actual data seen from different viewpoints and fuses them into a single 3D impression. It is easy to fuse images that contain fine structure that exists in both viewpoints.

Rivalry occurs when the images cannot be fused and the brain alternates between perceiving only the left and only the right image. It can happen when the two viewpoints differ too much, or when there is not enough structure to fuse on.

The Mayan algorithm has a tuning parameter that trades easier fusion against better color perception. It influences how much the pictures are desaturated and mixed into the respective side’s channel.
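In formulas, matching the GLSL shader from the Mayan with GLSL post above (λ is the tuning parameter; R, G, B are the input channels):

  R_left  = λ·R + (1−λ)/2 · (G + B)
  G_right = λ·G + (1−λ)/2 · (R + B)

The blue channel is contributed half by each side. λ = 1 keeps each primary channel pure for the best color reproduction; smaller values desaturate the images and make fusion easier.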

And after all that talk, here’s my thesis paper. It contains a more detailed description of the mixing and some analyses of the crosstalk and possible mitigation strategies.


ArtPad Vector alpha · 2008-01-13 05:12 by Black

I’ve written about it before, and also implemented it some time ago: drawing in ArtPad is now vector based. The downside of the new version is that the eraser does not work yet. But it uses much less memory, and drawing is also faster. Not to mention that it looks great.

How does it work?

A line is drawn as a rectangular textured quad. The texture consists of a single line in a square file (32×32 pixels). To draw lines at a given angle, the texture is rotated; this is done by transforming the texture coordinates with the inverse rotation matrix. Because texture mapping repeats the border pixels infinitely for all coordinates outside the texture area, the texture can also be scaled down for long lines.
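In fixed-function OpenGL, the texture coordinate transformation might look like this (a sketch; angleDegrees and drawLineQuad are assumed helpers, not ArtPad’s actual code):

  // Rotate the texture coordinates by the inverse (negative) angle of
  // the line, so the line texture ends up aligned with the quad.
  glMatrixMode(GL_TEXTURE);
  glLoadIdentity();
  glTranslatef(0.5f, 0.5f, 0.0f);   // rotate around the texture center
  glRotatef(-angleDegrees, 0.0f, 0.0f, 1.0f);
  glTranslatef(-0.5f, -0.5f, 0.0f);
  glMatrixMode(GL_MODELVIEW);

  drawLineQuad();                   // draw the textured rectangle for the line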

The newest version can be downloaded here.
