
SVG embedding in XHTML with TextPattern · 2010-02-10 19:20 by Black

SVG is a standardised format for storing and transporting scale-independent graphics. It is supported to some degree by most browsers, with the exception of Microsoft’s product. It would be nice if it could be used like the usual raster image formats, in an <img /> tag. While Safari and other WebKit browsers support this usage, Firefox does not. Instead, SVG can be embedded directly into the source code of an XHTML file.

[Inline SVG image: 45° A B C]

The above image is part of the source code of this page. It is included by a TextPattern plugin and only slightly processed. Processing is needed to remove the <?xml ?> declaration and to insert a viewBox attribute to allow scaling of the image with CSS. With that done, Firefox displays and scales the image nicely. Safari, on the other hand, causes trouble: it does not correctly infer the viewport height from the width and the aspect ratio.

To be allowed to embed SVG data, the MIME type of the document has to be application/xhtml+xml or similar. For TextPattern, this has to be changed by editing the header() call in publish.php. The plugin code itself is rather simple. Download a version ready to be pasted into TextPattern (licensed under the MIT license). The source code is shown below.

svg_inline.php [1.74 kB]

  function svg_inline($atts)
  {
    extract(lAtts(array(
      'src'  => '',
    ),$atts));

    if ($src)
    {
      if ($src[0] == '/')
      {
        // Relative to document root
        $src = $_SERVER['DOCUMENT_ROOT'].$src;
      }
      $svg = file_get_contents($src, FILE_TEXT);
      if ($svg)
      {
        // Add this to publish.php:
        //header("Content-type: application/xhtml+xml; charset=utf-8");
        $svg = preg_replace('/<\?xml [^>]*>/', '', $svg, 1);
        $svg = preg_replace('/(<svg[^>]*)width="([^"]*)"([^>]*)height="([^"]*)"([^>]*)>/',
          '$1$3$5 viewBox="0 0 $2 $4">', $svg, 1);
        return '<div class="svg">'.$svg.'</div>';
      }
      else
      {
        return 'Read error src='.$src;
      }
    }
    else
    {
      return 'Missing src';
    }
  }

TextPattern plugins are PHP functions that take two arguments: an array containing the tag attributes, and the contents of the tag element. All this plugin function does is read the specified SVG file and return the filtered source. A simple <txp:svg_inline src="imagepath" /> results in a nicely embedded SVG.


Stereoscopic Camera for OpenGL · 2010-02-10 02:15 by Black

Even digital stereoscopic content is created with cameras. But unlike with real cameras, a lot of freedom and control lies in the hands of the user: position and projection can be chosen freely, without regard for things like the size of the camera body, manufacturing tolerances of the optics, or their weight.

For further reading, see Paul Bourke’s Page on stereo pair creation.

Camera Theory

Cameras in OpenGL are defined by filling the modelview matrix and the projection matrix with values. The modelview matrix defines the position of the camera relative to the origin of the object space; the projection matrix defines how coordinates in space are mapped to the screen.

The projection matrix can be chosen freely, but normally two basic types of cameras are used: orthographic and perspective. Perspective cameras create projections very similar to how the human eye sees the world: objects appear smaller the further they are from the camera. Orthographic cameras project objects preserving parallel lines and their proportions; they are mostly used in technical drawings.

Stereo Pairs

A simple method to create stereoscopic footage with perspective cameras is to converge their viewing axes. With hardware cameras, this is often used for macro recordings or recordings in closed rooms. The advantage is that the parallax plane is determined when recording, so post-processing needs are low. In addition, the cameras do not have to be as close together as in the next method. The biggest drawbacks are that the left and right sides of the image do not overlap and have to be cut away or ignored, and that the divergence behind the parallax plane is very strong and can easily lead to unfusable content. This method should be avoided whenever possible.

[Figure: converged cameras — zero parallax plane, cutoff regions, strong divergence]

A better method is to use perspective cameras with parallel axes. It requires the cameras to be relatively close together and well aligned, both of which are easy to achieve in software. Unlike with converged cameras, the maximal divergence at infinity is fixed, so even recordings containing far objects can work. The zero parallax plane lies at infinity. It can be moved by creating asymmetric view frustums, effectively moving both images horizontally.

[Figure: parallel cameras — zero parallax plane set by asymmetric frustums]

For special visualizations, parallel cameras with converged axes can be used. As with perspective converged cameras, extreme caution has to be taken not to create strongly diverging images. This method should only be used to show objects that are very close to the parallax plane.


Implementation with OpenGL

As part of ExaminationRoom, I implemented a flexible camera class. The source and header can be downloaded and used relatively freely; like all of my code on this page, they are licensed under the GPL and MIT licenses. This class is not meant to be used directly in another project, since a lot of the code is specific to ER, but I am sure the core can be of use as an example.

In my implementation, cameras are defined by their position, their viewing direction, their up-vector and their separation (the distance between the two cameras). The projection is influenced by the field-of-view, the distance to the zero parallax plane (the plane where the separation of corresponding points is zero) and, of course, the type of the projection.

camera.h [6.01 kB]

  private:
    Tool::Point   pos_;
    Tool::Vector  dir_;
    Tool::Vector  up_;
    float     sep_;
    float     fov_;
    float     ppd_;
    Tool::ScreenProject * spL_;
    Tool::ScreenProject * spR_;
    Camera::Type  type_;

The core of the class is the creation of the matrices. The call to glFrustum sets the projection matrix; the modelview matrix is created with the utility method gluLookAt. The separation between the cameras has to be considered for both. The camera uses a vertical field-of-view, so that the height of the image does not change between standard and widescreen viewport aspect ratios.

camera.cpp [9.43 kB]

  void Camera::loadMatrix(float offsetCamera)
  {
    GlErrorTool::getErrors("Camera::loadMatrix:1");
    GLint viewport[4];
    glGetIntegerv(GL_VIEWPORT, viewport);
    float aspect = (float)viewport[2]/viewport[3];
    float fovTan = tanf((fov_/2)/180*M_PI);
    if (type() == Camera::Perspective)
    {
      // http://local.wasp.uwa.edu.au/~pbourke/projection/stereorender/
      float fTop, fBottom, fLeft, fRight, fNear, fFar;
      // Calculate fNear and fFar based on parallax plane distance and hardcoded factors
      fNear = ppd_*nearFactor;
      fFar = ppd_*farFactor;
      // Calculate fTop and fBottom based on vertical field-of-view and distance
      fTop = fovTan*fNear;
      fBottom = -fTop;
      // Calculate fLeft and fRight based on aspect ratio
      fLeft = fBottom*aspect;
      fRight = fTop*aspect;

      glMatrixMode(GL_PROJECTION);
      // Projection matrix is a frustum, of which fLeft and fRight are not symmetric
      // to set the zero parallax plane. The cameras are parallel.
      glPushMatrix();
      glLoadIdentity();
      glFrustum(fLeft+offsetCamera, fRight+offsetCamera, fBottom, fTop, fNear, fFar);
      glMatrixMode(GL_MODELVIEW);
      glPushMatrix();
      glLoadIdentity();
      // Rotation of camera and adjusting eye position
      Vector sepVec = cross(dir_, up_); // sepVec is normalized because dir and up are normalized
      sepVec *= offsetCamera/nearFactor;
      // Set camera position, direction and orientation
      gluLookAt(pos_.x - sepVec.x, pos_.y - sepVec.y, pos_.z - sepVec.z,
            pos_.x - sepVec.x + dir_.x, pos_.y - sepVec.y + dir_.y, pos_.z - sepVec.z + dir_.z,
            up_.x, up_.y, up_.z);
      GlErrorTool::getErrors("Camera::loadMatrix:2");
    }

The perspective projection is used in most places. For ExaminationRoom, one of the feature requests was the ability to disable selected depth cues. A very strong cue is size relative to the environment. To disable this cue, the parallel projection with converged cameras described above is used instead. The values for the projection matrix were chosen so that objects at the zero parallax plane do not change their size when switching between the projection types. The projection matrix is derived from the normal orthographic projection created by OpenGL’s glOrtho by shearing it.

camera.cpp [9.43 kB]

    else if (type() == Camera::Parallel)
    {
      float fTop, fBottom, fLeft, fRight, fNear, fFar;
      // Calculate fNear and fFar based on parallax plane distance and a hardcoded factor
      // Note: the zero parallax plane is exactly in between near and far
      fFar = ppd_*farFactor;
      fNear = 2*ppd_ - fFar; // = ppd_ - (fFar-ppd_);
      // Set fTop and fBottom based on field-of-view and parallax plane distance
      // This is done to make the scaling of the image at the parallax plane the same
      // as in perspective mode
      fTop = fovTan*ppd_;
      fBottom = -fTop;
      // Set left and right based on aspect ratio
      fLeft = fBottom*aspect;
      fRight = fTop*aspect;

      glMatrixMode(GL_PROJECTION);
      glPushMatrix();
      glLoadIdentity();
      // http://wireframe.doublemv.com/2006/08/11/projections-and-opengl/
      // Note: The code there is wrong, see below for correct code
      // Create oblique projection matrix by shearing an orthographic
      // projection matrix. Those cameras are converged.
      const float shearMatrix[] = {
        1, 0, 0, 0,
        0, 1, 0, 0,
        -offsetCamera/nearFactor, 0, 1, 0,
        0, 0, 0, 1
      };
      glMultMatrixf(shearMatrix);
      glOrtho(fLeft, fRight, fBottom, fTop, fNear, fFar);
      glMatrixMode(GL_MODELVIEW);
      glPushMatrix();
      glLoadIdentity();
      // Rotation of camera
      // Note: The position of both left and right camera is at the same place
      //  because the offset is already calculated by the shearing, which also sets
      //  the zero parallax plane.
      gluLookAt(pos_.x, pos_.y, pos_.z,
            pos_.x + dir_.x, pos_.y + dir_.y, pos_.z + dir_.z,
            up_.x, up_.y, up_.z);
      GlErrorTool::getErrors("Camera::loadMatrix:3");
    }

Hopefully this is useful to someone :)


Screen Space transformation in OpenGL · 2009-12-13 16:10 by Black

Transforming from world space to screen space in OpenGL is useful for selecting and picking with a mouse or similar. The OpenGL utility library provides the functions gluProject and gluUnProject to do this. This class replicates that functionality. Download the full header and source for a commented version. Below, a simplified header shows the API.


  class ScreenProject
  {
  public: // ScreenSpace
    void calculateMVP(GLint * vp, double * mv, double * p);
    void calculateMVP();
    Point transformToScreenSpace(Point _p, float _f = 1) const;
    Point transformToClipSpace(Point _p, float _f = 1) const;
    Point transformToWorldSpace(Point _p) const;
  private:
    double mvp_[16];    /**< Product of Modelview and Projection Matrix */
    double mvpInv_[16]; /**< Inverse of mvp_ */
    long vp_[4];        /**< Viewport */
  };

Even after the class is instantiated, it can not be used until one of the calculateMVP overloads is called, either to pass custom matrices or to read the matrices from the current OpenGL context. After that, the matrix and its inverse are stored internally, and the calculations work without accessing any outside data.

The various transform functions transform the passed point with the internal state. Going from screen space to world space is very useful in picking, selecting or generally interacting with the scene with a pointing device. Transforming to clip or screen space is used when calculating anchor points for labels on screen.

The Point type is a three-element vector. OpenGL works with homogeneous coordinates, so a fourth value is needed. Because this is almost always 1, it was chosen as the default value. There is one important exception: when transforming normals, the last coordinate is zero, since normals are not influenced by translations.

The source file contains the code for the transformation with the matrices, the viewport transformation and the matrix inversion. The code for the inversion was taken from Mesa; everything else is my own. OpenGL’s matrices are column-major, so the four numbers of the first column are mapped to the first four slots of the 16-slot array. The transformation matrix is built from the modelview matrix M and the projection matrix P as P*M; a point p is then projected as P*M*p.

screenproject.cpp [6.51 kB]

  Point ScreenProject::transformToClipSpace(Point _p, float _f) const
  {
    Point pT = transformPointWithMatrix(_p, mvp_, _f);
    return Point((pT[0] + 1)/2, (pT[1] + 1)/2, (pT[2] + 1)/2);
  }

  Point ScreenProject::transformToWorldSpace(Point _p) const
  {
    // Transform to normalized coordinates
    _p[0] = (_p[0] - vp_[0]) * 2 / vp_[2] - 1.0f;
    _p[1] = (_p[1] - vp_[1]) * 2 / vp_[3] - 1.0f;
    _p[2] = 2 * _p[2] - 1.0f;

    // Transform
    return transformPointWithMatrix(_p, mvpInv_);
  }

  Point ScreenProject::transformPointWithMatrix(Point _p, const double * _m, float _f) const
  {
    float xp = _m[0] * _p[0] + _m[4] * _p[1] + _m[8] * _p[2] + _f * _m[12];
    float yp = _m[1] * _p[0] + _m[5] * _p[1] + _m[9] * _p[2] + _f * _m[13];
    float zp = _m[2] * _p[0] + _m[6] * _p[1] + _m[10] * _p[2] + _f * _m[14];
    float tp = _m[3] * _p[0] + _m[7] * _p[1] + _m[11] * _p[2] + _f * _m[15];
    if (tp == 0)
      return Point(xp, yp, zp);
    else
      return Point(xp / tp, yp / tp, zp / tp);
  }


Mayan with GLSL · 2009-12-09 00:00 by Black

The first test implementation of Mayan was a Photoshop file containing the picture in various states of desaturation and blending. The second implementation was a DirectShow filter for the group’s stereo movie player. The third and latest implementation is an OpenGL Shading Language shader for ExaminationRoom.

ExaminationRoom was extended to support shader-assisted merging of the two viewpoints. This was done by rendering both the left and the right camera’s view to framebuffer objects, which then get drawn while the given fragment shader is active. The shader can calculate how to modify each side’s fragments. The blend func is GL_ONE during this time, so no further modification is performed.

mayan.fs [526.00 B]

  uniform sampler2D tex;
  uniform float side;

  // Factor that determines how much of the other
  // colors is mixed into the primary channel of that
  // side. This is the same lambda as in the mayan paper.
  uniform float lambda;

  void main()
  {
    float facR = 1.0-side;
    float facG = side;
    float mixFactor = (1.0-lambda)*0.5;

    vec4 c = texture2D(tex, gl_TexCoord[0].xy);
    gl_FragColor = vec4(
      facR*(c.r*lambda + (c.g+c.b)*mixFactor), // Red
      facG*(c.g*lambda + (c.r+c.b)*mixFactor), // Green
      c.b*0.5, // Blue
      0.5); // Alpha
  }

The fragment shader gets a uniform variable that defines which side the currently drawn texture belongs to. Lambda is a factor that influences the desaturation of the colors for a better 3D impression.

Using shaders for mixing allows for maximal adaptability at hardware-accelerated speed. Unlike the original anaglyph renderer, it can mix different colors and handle shared channels like Mayan’s blue.


Mayan Anaglyph · 2009-12-06 13:26 by Black

My semester thesis in Software Engineering was titled Mayan. This by itself is as non-descriptive as it can get, but it basically is an improved method for anaglyph stereo. The improvement was to allow for better color perception and preservation while achieving superior fusion.

Stereoscopic Images?

Stereoscopic images are pictures that can be seen in 3D; they contain data from both eyes’ viewpoints. Various technologies exist to produce, store, display and perceive such images. In this article I am writing about anaglyph, color-encoded stereo.

How does it work?

Almost all color display technology these days works by mixing three color channels: red, green and blue. Due to the limited sensory equipment humans possess, this is enough to imitate a wide range of the visible color spectrum. Traditional anaglyph makes use of these three channels: it displays the image for one eye in red only, and for the other eye in green and blue, which appears cyan. (There are many methods to mix the images, each preserving different aspects of the image.)

Mayan displays the left image in the magenta plane (red and blue), and the right image in the cyan plane (green and blue); the blue channel is therefore shared. Having two channels for each eye makes color perception much better. Of course, the visibility of both sides’ blue causes crosstalk, but due to shortcomings in human physiology, blue can not be perceived as sharply as other colors, so the impact is lessened.

[Figure: Mayan example image]

Fusion vs. Rivalry

Fusion is achieved when both the left and right image are perceived by the correct eye, and the brain considers them actual data seen from different viewpoints and fuses them into a single 3D impression. It is easy to achieve fusion on images that contain fine structure present in both viewpoints.

Rivalry occurs when the images can not be fused and the brain alternates between perceiving only the left and only the right image. It can happen when the two viewpoints differ too much, or when there is not enough structure to fuse on.

The Mayan algorithm has a tuning parameter that trades easier fusion against better color perception. It influences how much the pictures are desaturated and mixed into the respective side’s channel.

And after all that talk, here’s my thesis paper. It contains a more detailed description of the mixing and some analyses of the crosstalk and possible mitigation strategies.