
Time flies · 2019-03-24 11:59 by Black in

After finishing my master’s thesis and completing civil service, I’ve been working for Disney Research since early 2010. And I’ve obviously not had much to say on this blog. Most of my early work was focused on software engineering, mostly tech transfer using C++. I’ve written the basis of a big code base that is used in many projector / camera systems in Disney parks (and, considering the resistance to updating anything, will likely be used for decades). Qt was popular for GUI work back then, and I’ve used it in several projects targeted at normal humans (but since those were internal research projects, not many actually used them). I’ve worked on projects targeting mobile platforms, in both Objective-C and Unity (I’m even named on a patent for one of them). I’ve written plugins for Nuke, ToonBoom, AfterEffects and many others, and code running on Arduinos or on servers with dozens of cores and multiple GPUs.

But lately, I’ve shifted to projects that focus on machine learning. I’m not a researcher, so I don’t focus on developing models and graphs, but over the past few years I’ve debugged, ported and improved a lot of the deep learning research code created here. My favorite framework is TensorFlow, and I’ve spent the most time with it, but I’ve also used PyTorch. TensorFlow’s graph-based structure forces more discipline, and makes reasoning much easier than the Python spaghetti code I’ve seen from torch users. The biggest NN project I’ve participated in so far was Denoising with Kernel Prediction. Implemented in TensorFlow, and using custom CUDA code for better performance, this denoiser surpasses anything that came before it.

I still like C++; it is one of the most flexible and pragmatic languages in wide use. But for deep learning, Python is the standard, and inertia ensures this will remain the case for many years. That is unfortunate: the lack of type safety and static checking is a big annoyance, especially in non-trivial code bases such as the ones I work in.
Using C++ for TensorFlow is possible, but it is only done for a very specific subset of tasks, such as embedding into other programs, or writing custom Ops for CPU or CUDA.


SVG embedding in XHTML with TextPattern · 2010-02-10 19:20 by Black in

SVG is a standardised format for storing and transporting scale-independent graphics. Most browsers support it to some degree, with the exception of Microsoft’s product. It would be nice if it could be used like the usual raster image formats in an <img /> tag. While Safari and other WebKit browsers support this usage, Firefox does not. Instead, SVG can be embedded directly into the source code of an XHTML file.

[Figure: inline SVG example with the labels 45°, A, B and C]

The above image is part of the source code of this page. It is included by a TextPattern plugin, and only slightly processed. Processing is needed to remove the <?xml /> header and to insert a viewBox attribute to allow scaling of the image with CSS. With that done, Firefox displays and scales the image nicely. Safari, on the other hand, causes trouble: it does not correctly infer the viewport height from the width and the aspect ratio.

To be allowed to embed SVG data, the MIME type of the document has to be application/xhtml+xml or similar. For TextPattern, this has to be changed by editing the header() call in publish.php. The plugin code itself is rather simple. Download a version ready to be pasted into TextPattern (licensed under the MIT license). The source code follows.

svg_inline.php [1.74 kB]

    function svg_inline($atts)
    {
      extract(lAtts(array(
        'src'  => '',
      ),$atts));

      if ($src)
      {
        if ($src[0] == '/')
        {
          // Relative to Document Root
          $src = $_SERVER['DOCUMENT_ROOT'].$src;
        }
        $svg = file_get_contents($src, FILE_TEXT);
        if ($svg)
        {
          // Add this to publish.php
          //header("Content-type: application/xhtml+xml; charset=utf-8");
          $svg = preg_replace('/<\?xml [^>]*>/', '', $svg, 1);
          $svg = preg_replace('/(<svg[^>]*)width="([^"]*)"([^>]*)height="([^"]*)"([^>]*)>/',
            '$1$3$5 viewBox="0 0 $2 $4">', $svg, 1);
          return '<div class="svg">'.$svg.'</div>';
        }
        else
        {
          return 'Read error src='.$src;
        }
      }
      else
      {
        return 'Missing src';
      }
    }

TextPattern plugins are PHP functions that take two arguments: an array containing the tag attributes, and the contents of the tag element. All this plugin function does is read the specified SVG file and return the filtered source. A simple <txp:svg_inline src="imagepath" /> results in a nicely embedded SVG.
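To make the two substitutions easier to follow outside of TextPattern, here is a minimal sketch of the same filtering in C++ with std::regex. The function name inlineSvg is made up for this example, and the patterns are copied from the PHP listing above. Like the original, it assumes that width comes before height in the <svg> tag and that both are plain numbers; values with units such as "100px" would end up in the viewBox unchanged.

```cpp
#include <regex>
#include <string>

// Mirrors the two preg_replace() calls of the plugin.
// inlineSvg is a hypothetical name used only for this sketch.
std::string inlineSvg(const std::string& svg)
{
    using std::regex_constants::format_first_only;

    // 1. Drop the <?xml ... ?> declaration (first occurrence only).
    static const std::regex xmlDecl(R"(<\?xml [^>]*>)");
    std::string out = std::regex_replace(svg, xmlDecl, "", format_first_only);

    // 2. Remove width/height from the <svg> tag and reuse their values
    //    as the viewBox. This expects width to appear before height.
    static const std::regex svgTag(
        R"re((<svg[^>]*)width="([^"]*)"([^>]*)height="([^"]*)"([^>]*)>)re");
    out = std::regex_replace(out, svgTag,
                             R"($1$3$5 viewBox="0 0 $2 $4">)",
                             format_first_only);

    // Wrap the result the same way the plugin does.
    return "<div class=\"svg\">" + out + "</div>";
}
```

Calling inlineSvg on a file's contents returns markup that can be pasted into an XHTML page served as application/xhtml+xml.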


Stereoscopic Camera for OpenGL · 2010-02-10 02:15 by Black in

Even when stereoscopic content is created digitally, cameras are used. Compared to real cameras, though, virtual ones leave much more freedom and control in the hands of the user. The position and projection can be chosen freely, without having to respect things like the size or weight of the cameras or inexactness in the manufacturing of their optics.

For further reading, see Paul Bourke’s Page on stereo pair creation.

Camera Theory

Cameras in OpenGL are defined by filling the modelview matrix and the projection matrix with values. The modelview matrix defines the position of the camera relative to the origin of the object space, while the projection matrix defines how coordinates in space are mapped to the screen.

The projection matrix can be chosen freely, but normally two basic types of cameras are used: orthographic and perspective. Perspective cameras create projections very similar to how the human eye sees the world: objects appear smaller the further they are from the camera. Orthographic cameras project objects while preserving parallel lines and their proportions. They are mostly used in technical drawings.

Stereo Pairs

A simple method to create stereoscopic footage with perspective cameras is to converge their viewing axes. With hardware cameras, this is often used for macro recordings or recordings in closed rooms. The advantage is that the parallax plane is determined when recording, so post-processing needs are low. In addition, the cameras do not have to be as close together as in the next method. The biggest drawbacks are that the left and right edges of the two images do not overlap and have to be cut away or ignored, and that the divergence behind the parallax plane is very strong and can easily lead to unfusable content. This method should be avoided whenever possible.

[Figure: converged cameras; labels: zero parallax, cutoff, strong divergence]

A better method is to use perspective cameras with parallel axes. It requires the cameras to be relatively close together and well aligned, both of which are no problem in software. Unlike with converged cameras, the maximal divergence at infinity is fixed, so even recordings containing distant objects can work. By default, the zero-parallax plane lies at infinity. It can be moved closer by creating asymmetric view frustums, which effectively shifts both images horizontally.

[Figure: parallel cameras; labels: zero parallax, asymmetric frustum]
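How far the frustum has to be shifted follows from similar triangles: an eye moved sideways by e needs its frustum shifted by e · near / d on the near plane, where d is the distance to the desired zero-parallax plane. The sketch below computes such a frustum and the resulting horizontal screen coordinate. All names (Frustum, stereoFrustum, ndcX, eyeShift) are made up for this example, and the sign convention (eye at x = -eyeShift, frustum shifted by a positive amount) is an assumption chosen to match the camera listing further down.

```cpp
#include <cmath>

// Frustum extents on the near plane, in the order glFrustum expects them.
struct Frustum { double left, right, bottom, top, zNear, zFar; };

// Asymmetric frustum for an eye that sits at x = -eyeShift relative to the
// centre camera. zeroParallaxDist is the distance at which both images align.
Frustum stereoFrustum(double fovYDeg, double aspect, double zNear, double zFar,
                      double zeroParallaxDist, double eyeShift)
{
    const double pi = 3.14159265358979323846;
    const double top = std::tan(fovYDeg * 0.5 * pi / 180.0) * zNear;
    const double halfWidth = top * aspect;
    // Similar triangles: shift on the near plane = eyeShift * zNear / distance.
    const double shift = eyeShift * zNear / zeroParallaxDist;
    return { -halfWidth + shift, halfWidth + shift, -top, top, zNear, zFar };
}

// Horizontal normalized device coordinate (-1..1) of a point given in
// centre-camera coordinates: x to the right, depth > 0 in viewing direction.
double ndcX(const Frustum& f, double eyeShift, double x, double depth)
{
    const double xEye = x + eyeShift;              // eye is at x = -eyeShift
    const double xNear = xEye * f.zNear / depth;   // project onto near plane
    return (2.0 * xNear - (f.right + f.left)) / (f.right - f.left);
}
```

Building one frustum with +e and one with -e gives the left and right views. A point at depth zeroParallaxDist lands on the same screen position in both; points nearer or further away get a horizontal offset between the two images.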

For special visualizations, parallel-projection cameras with converged axes can be used. As with converged perspective cameras, extreme caution has to be taken not to create strongly diverging images. This method should only be used to show objects that are very close to the parallax plane.

[Figure: parallel-projection cameras with converged axes; label: zero parallax]

Implementation with OpenGL

As part of ExaminationRoom, I implemented a flexible camera class. The source and header can be downloaded and used relatively freely. Like all of my code on this page, they are licensed under the GPL and MIT licenses. This class is not meant to be used directly in another project since a lot of the code is specific to ER, but I am sure the core can be of use as an example.

In my implementation, cameras are defined by their position, their viewing direction, their up-vector and their separation (the distance between the two cameras). The projection is influenced by the field-of-view, the distance to the zero-parallax plane (the plane where the separation of corresponding points is zero) and of course the type of the projection.

camera.h [6.01 kB]

    private:
      Tool::Point   pos_;
      Tool::Vector  dir_;
      Tool::Vector  up_;
      float     sep_;
      float     fov_;
      float     ppd_;
      Tool::ScreenProject * spL_;
      Tool::ScreenProject * spR_;
      Camera::Type  type_;

The core of the class is the creation of the matrices. The call to glFrustum sets the projection matrix, while the modelview matrix is created with the utility function gluLookAt. The separation between the cameras has to be considered for both. The camera uses a vertical field-of-view, so that the height of the image does not change between standard and widescreen viewport aspect ratios.

camera.cpp [9.43 kB]

    void Camera::loadMatrix(float offsetCamera)
    {
      GlErrorTool::getErrors("Camera::loadMatrix:1");
      GLint viewport[4];
      glGetIntegerv(GL_VIEWPORT, viewport);
      float aspect = (float)viewport[2]/viewport[3];
      float fovTan = tanf((fov_/2)/180*M_PI);
      if (type() == Camera::Perspective)
      {
        // http://local.wasp.uwa.edu.au/~pbourke/projection/stereorender/

        float fTop, fBottom, fLeft, fRight, fNear, fFar;
        // Calculate fNear and fFar based on parallax plane distance and hardcoded factors
        fNear = ppd_*nearFactor;
        fFar = ppd_*farFactor;
        // Calculate fTop and fBottom based on vertical field-of-view and distance
        fTop = fovTan*fNear;
        fBottom = -fTop;
        // Calculate fLeft and fRight based on aspect ratio
        fLeft = fBottom*aspect;
        fRight = fTop*aspect;

        glMatrixMode(GL_PROJECTION);
        // Projection matrix is a frustum, of which fLeft and fRight are not symmetric
        // to set the zero parallax plane. The cameras are parallel.
        glPushMatrix();
        glLoadIdentity();
        glFrustum(fLeft+offsetCamera, fRight+offsetCamera, fBottom, fTop, fNear, fFar);
        glMatrixMode(GL_MODELVIEW);
        glPushMatrix();
        glLoadIdentity();
        // Rotation of camera and adjusting eye position
        Vector sepVec = cross(dir_, up_); // sepVec is normalized because dir and up are normalized
        sepVec *= offsetCamera/nearFactor;
        // Set camera position, direction and orientation
        gluLookAt(pos_.x - sepVec.x, pos_.y - sepVec.y, pos_.z - sepVec.z,
              pos_.x - sepVec.x + dir_.x, pos_.y - sepVec.y + dir_.y, pos_.z - sepVec.z + dir_.z,
              up_.x, up_.y, up_.z);
        GlErrorTool::getErrors("Camera::loadMatrix:2");
      }

The perspective projection is used in most places. For ExaminationRoom, one of the feature requests was the ability to disable selected depth cues. A very strong cue is size relative to the environment. To disable this cue, parallel projection with converged cameras as described above is used instead. The values for the projection matrix were chosen so that the objects at the zero-parallax plane would not change their size when switching between the projection types. The projection matrix is derived from the normal orthographic projection created by OpenGL’s glOrtho by shearing it.

camera.cpp [9.43 kB]

      else if (type() == Camera::Parallel)
      {
        float fTop, fBottom, fLeft, fRight, fNear, fFar;
        // Calculate fNear and fFar based on parallax plane distance and a hardcoded factor
        // Note: the zero parallax plane is exactly in between near and far
        fFar = ppd_*farFactor;
        fNear = 2*ppd_ - fFar; // = ppd_ - (fFar-ppd_);
        // Set fTop and fBottom based on field-of-view and parallax plane distance
        // This is done to make the scaling of the image at the parallax plane the same
        // as in perspective mode
        fTop = fovTan*ppd_;
        fBottom = -fTop;
        // Set left and right based on aspect ratio
        fLeft = fBottom*aspect;
        fRight = fTop*aspect;

        glMatrixMode(GL_PROJECTION);
        glPushMatrix();
        glLoadIdentity();
        // http://wireframe.doublemv.com/2006/08/11/projections-and-opengl/
        // Note: The code there is wrong, see below for correct code
        // Create oblique projection matrix by shearing an orthographic
        // projection matrix. Those cameras are converged.
        const float shearMatrix[] = {
          1, 0, 0, 0,
          0, 1, 0, 0,
          -offsetCamera/nearFactor, 0, 1, 0,
          0, 0, 0, 1
        };
        glMultMatrixf(shearMatrix);
        glOrtho(fLeft, fRight, fBottom, fTop, fNear, fFar);
        glMatrixMode(GL_MODELVIEW);
        glPushMatrix();
        glLoadIdentity();
        // Rotation of camera
        // Note: The position of both left and right camera is at the same place
        //  because the offset is already calculated by the shearing, which also sets
        //  the zero parallax plane.
        gluLookAt(pos_.x, pos_.y, pos_.z,
              pos_.x + dir_.x, pos_.y + dir_.y, pos_.z + dir_.z,
              up_.x, up_.y, up_.z);
        GlErrorTool::getErrors("Camera::loadMatrix:3");
      }

Hopefully this is useful to someone :)


Lua String Writer · 2010-02-04 16:35 by Black in

Lua strings are opaque byte streams. They are constant, and can only be manipulated by using the string API to create new strings. This can be expensive, especially when creating a string by appending new values at the end. While Lua contains optimizations for direct concatenation, successive appending has a high overhead.

This StringWriter class reduces the overhead by aggregating string concatenations in a table and executing them when requested. It was originally designed to serve as an efficient drop-in replacement for files as created by io.open, but it can also be used standalone.

The class itself is built with a protected shared metatable and state inside a table. The state itself is not protected (it would be possible by using individual metatables or an internal database in a weak table, but this is more elegant). The metatable contains entries to redirect reads to the method table, redirect new writes to nothing and prevent changing or reading the metatable. The concatenation operator is also overloaded, but since it has value semantics and is not allowed to change the object itself, the implementation is less efficient than StringWriter:write(). Converting a StringWriter with tostring() gives the contained string, equivalent to StringWriter:get().

stringwriter.lua [4.77 kB]

    -- MetaTable for string writers
    local StringWriter_Meta = {
      ["__index"] = StringWriter_Methods;
      ["__newindex"] = function ()
          -- Don't allow setting values
        end;
      ["__metatable"] = StringWriter_ID;
      ["__tostring"] = StringWriter_Methods.get;
      ["__concat"] = function (this, str)
          str = tostring(str);
          local sw = StringWriter();
          sw.string_ = {};
          for _, v in ipairs(this.string_) do
            table.insert(sw.string_, v);
          end
          table.insert(sw.string_, str);
          sw.len_ = this.len_ + #str;
          sw.pos_ = sw.len_;
          return sw;
        end;
    }

The method table itself contains all methods the StringWriter supports. It was modeled after the file class, so many methods are placeholders that do nothing. The methods that are supported are seeking and writing. Seeking simply sets an internal position value. Writing in the context of files means overwriting and extending. When the position is at the end, the contents that are to be written can simply be appended to the contents table. Otherwise, the string has to be baked, split, and recomposed.

stringwriter.lua [4.77 kB]

    -- Methods for string writers
    -- Note: voidFunc, StringWriter_Check and math.clamp are helpers from the
    -- rest of the file; math.clamp is not part of the Lua standard library.
    local StringWriter_Methods = {
      ["close"] = voidFunc;
      ["flush"] = voidFunc;
      ["lines"] = voidFunc;
      ["read"] = voidFunc;
      ["seek"] = function (this, base, offset)
          -- Only act on StringWriters
          if not StringWriter_Check(this) then
            return nil, "Invalid StringWriter";
          end;
          -- Default offset
          if type(base) == "number" then
            offset = base; -- Not done in file, but reasonable
          else
            offset = offset or 0;
          end
          -- Set position and return it
          if base == "set" then
            this.pos_ = math.clamp(offset,0, this.len_);
          elseif base == "end" then
            -- Relative to the end of the string: len_ is the total length,
            -- #this.string_ would only count the fragments
            this.pos_ = math.clamp(this.len_+offset,0, this.len_);
          else -- "cur"
            this.pos_ = math.clamp(this.pos_+offset,0, this.len_);
          end
          return this.pos_;
        end;
      ["setvbuf"] = voidFunc;
      ["write"] = function (this, ...)
          -- Only act on StringWriters
          if not StringWriter_Check(this) then return end;
          -- Concat all arguments (assuming they are valid)
          local s = table.concat({...});
          -- Concat argument string with current string
          if this.pos_ == -1 or this.pos_ == this.len_ then
            -- Just append
            table.insert(this.string_, s);
          else
            -- Insert, merge into a string
            local sFull = table.concat(this.string_);
            -- Split it up
            local sLeft = string.sub(sFull, 1, this.pos_);
            local sRight = string.sub(sFull, this.pos_+1+#s, -1)
            -- And put it back in
            this.string_ = {sLeft, s, sRight};
          end
          -- Update position
          this.pos_ = this.pos_ + #s;
          if this.pos_ > this.len_ then
            this.len_ = this.pos_;
          end;
        end;
      ["get"] = function (this)
          if not StringWriter_Check(this) then
            return nil, "Invalid StringWriter";
          else
            this.string_ = {table.concat(this.string_)};
            return this.string_[1];
          end;
        end;
    }
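The append-or-recompose logic of write() is independent of Lua, so here is a rough C++ analogue for readers who want to step through it in a debugger. It is only a sketch of the same idea, not a port: the class name FragmentWriter and its members are made up, seek() only supports absolute positions, and C++ strings are mutable, so the performance argument from above does not carry over.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Collects fragments and joins them only when the full string is needed.
// Hypothetical analogue of the Lua StringWriter, for illustration only.
class FragmentWriter
{
public:
    // Move the write position, clamped to [0, length].
    void seek(std::size_t pos) { pos_ = pos < len_ ? pos : len_; }

    void write(std::string s)
    {
        if (pos_ == len_) {
            // At the end: remember the fragment, old data is not touched.
            parts_.push_back(s);
        } else {
            // In the middle: bake, split around the overwritten range, recompose.
            const std::string full = get();
            const std::string left = full.substr(0, pos_);
            const std::string right = pos_ + s.size() < full.size()
                                          ? full.substr(pos_ + s.size())
                                          : std::string();
            parts_ = { left, s, right };
        }
        // Update position and, if the string grew, its length.
        pos_ += s.size();
        if (pos_ > len_) len_ = pos_;
    }

    // Join all fragments once and keep the result as a single fragment.
    const std::string& get()
    {
        std::string joined;
        joined.reserve(len_);
        for (const std::string& p : parts_) joined += p;
        parts_.assign(1, joined);
        return parts_.front();
    }

private:
    std::vector<std::string> parts_{ std::string() };
    std::size_t len_ = 0;
    std::size_t pos_ = 0;
};
```

Writing at the end only appends a fragment; writing after a seek overwrites existing characters and extends the string if the new data reaches past the old end, just as a file opened for writing would.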

StringWriter instances are created by a factory method. It initializes the state and sets the metatable.

stringwriter.lua [4.77 kB]

    -- StringWriter factory
    StringWriter = function ()
      local sw = {
        string_ = {""};
        len_  = 0;
        pos_  = 0;
      }
      setmetatable(sw, StringWriter_Meta);
      return sw;
    end

I hope this code is useful to someone. Use it as you wish; it is licensed under the MIT license.


Lua Table Persistence · 2010-01-27 14:56 by Black in

Lua is a very flexible scripting language for embedding into programs. Its standard API is very slim and offers only basic functions. Adding more is easy though.

The persistence code here requires nothing but Lua’s standard io.open for reading and writing files. It can handle loops, multiple references to the same table in both keys and values, and most standard value types.
Not supported are userdata, threads and many types of functions. Exporting simple Lua functions works, but the exported byte code is not portable. The result of the export is itself Lua code: it can be executed and returns data structures equivalent to those that were exported.

The core for the export is a simple recursion with a dispatcher method and writers for all types. When unsupported types are encountered, nil is written. This can cause problems on import when those unsupported values are used as table keys, but in most cases it is more desirable than to fail the export.

persistence.lua [5.50 kB]

    -- Format items for the purpose of restoring
    writers = {
      ["nil"] = function (file, item)
          file:write("nil");
        end;
      ["number"] = function (file, item)
          file:write(tostring(item));
        end;
      ["string"] = function (file, item)
          file:write(string.format("%q", item));
        end;
      ["boolean"] = function (file, item)
          if item then
            file:write("true");
          else
            file:write("false");
          end
        end;
      ["table"] = function (file, item, level, objRefNames)
          local refIdx = objRefNames[item];
          if refIdx then
            -- Table with multiple references
            file:write("multiRefObjects["..refIdx.."]");
          else
            -- Single use table
            file:write("{\n");
            for k, v in pairs(item) do
              writeIndent(file, level+1);
              file:write("[");
              write(file, k, level+1, objRefNames);
              file:write("] = ");
              write(file, v, level+1, objRefNames);
              file:write(";\n");
            end
            writeIndent(file, level);
            file:write("}");
          end;
        end;
      ["function"] = function (file, item)
          -- Does only work for "normal" functions, not those
          -- with upvalues or c functions
          local dInfo = debug.getinfo(item, "uS");
          if dInfo.nups > 0 then
            file:write("nil --[[functions with upvalue not supported]]");
          elseif dInfo.what ~= "Lua" then
            file:write("nil --[[non-lua function not supported]]");
          else
            local r, s = pcall(string.dump,item);
            if r then
              file:write(string.format("loadstring(%q)", s));
            else
              file:write("nil --[[function could not be dumped]]");
            end
          end
        end;
      ["thread"] = function (file, item)
          file:write("nil --[[thread]]\n");
        end;
      ["userdata"] = function (file, item)
          file:write("nil --[[userdata]]\n");
        end;
    }

To be able to export tables that are referenced several times (be it a cycle in the data structure, or just a table that is inserted several times), the structures that are to be written are examined first, and the number of references to each table is counted.

All tables that have multiple references to them are created at the start in the export file before they are filled with content. This is required, since they could contain themselves or other multi-ref tables.

After all those temporary tables are created, they are filled with content. The writer for tables uses a lookup table for multi-ref tables: instead of emitting a table constructor for them, it emits a reference to the table created at the start. Last but not least, the passed arguments themselves are written in the same way.

persistence.lua [5.50 kB]

      store = function (path, ...)
        local file, e;
        if type(path) == "string" then
          -- Path, open a file
          file, e = io.open(path, "w");
          if not file then
            return error(e);
          end
        else
          -- Just treat it as file
          file = path;
        end
        local n = select("#", ...);
        -- Count references
        local objRefCount = {}; -- Stores reference that will be exported
        for i = 1, n do
          refCount(objRefCount, (select(i,...)));
        end;
        -- Export Objects with more than one ref and assign name
        -- First, create empty tables for each
        local objRefNames = {};
        local objRefIdx = 0;
        file:write("-- Persistent Data\n");
        file:write("local multiRefObjects = {\n");
        for obj, count in pairs(objRefCount) do
          if count > 1 then
            objRefIdx = objRefIdx + 1;
            objRefNames[obj] = objRefIdx;
            file:write("{};"); -- table objRefIdx
          end;
        end;
        file:write("\n} -- multiRefObjects\n");
        -- Then fill them (this requires all empty multiRefObjects to exist)
        for obj, idx in pairs(objRefNames) do
          for k, v in pairs(obj) do
            file:write("multiRefObjects["..idx.."][");
            write(file, k, 0, objRefNames);
            file:write("] = ");
            write(file, v, 0, objRefNames);
            file:write(";\n");
          end;
        end;
        -- Create the remaining objects
        for i = 1, n do
          file:write("local ".."obj"..i.." = ");
          write(file, (select(i,...)), 0, objRefNames);
          file:write("\n");
        end
        -- Return them
        if n > 0 then
          file:write("return obj1");
          for i = 2, n do
            file:write(" ,obj"..i);
          end;
          file:write("\n");
        else
          file:write("return\n");
        end;
        file:close();
      end;
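The refCount helper called at the top of store is not part of the excerpt. The following sketch shows one way such a counting pass could work; it is an assumption about the approach, not the code from persistence.lua. Tables are modelled as a C++ struct that holds only references to other tables, and the names Table, RefCounts and countRefs are made up for the example.

```cpp
#include <unordered_map>
#include <vector>

// Stand-in for a Lua table: only its references to other tables matter here.
struct Table { std::vector<const Table*> refs; };

using RefCounts = std::unordered_map<const Table*, int>;

// Count how often each table is reached from the root. The contents of a
// table are only descended into on the first visit, so cycles terminate.
void countRefs(RefCounts& counts, const Table* t)
{
    if (++counts[t] > 1) {
        return; // seen before: count the reference, but do not recurse again
    }
    for (const Table* child : t->refs) {
        countRefs(counts, child);
    }
}
```

Every table that ends up with a count above one would then be emitted as an empty multiRefObjects entry first and filled in afterwards, which is what makes cycles representable in the exported file.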

Loading the exported data is simple, but the provided method performs some error checking.

persistence.lua [5.50 kB]

      load = function (path)
        local f, e = loadfile(path);
        if f then
          return f();
        else
          return nil, e;
        end;
      end;

I hope this code is useful to someone. Use it as you wish; it is licensed under the MIT license.


Source Code Management with Git · 2009-12-29 15:22 by Black in

Git is a distributed SCM designed by Linus Torvalds to manage the development of the Linux Kernel. Since it’s licensed under the GPL, it can be used freely by anyone.

Just like backups are a necessity for anyone who uses a computer (or should be…), source code management is a necessity for serious developers. Not only does it track the past state of the project (which allows tracking the introduction of bugs), but it also allows the management of separate branches. That way, development can continue to add new experimental features while production uses only stable and tested code.

Git is a distributed SCM tool. Unlike CVS and Subversion, it does not require a central server, and by design there is no central authoritative repository. Every repository contains the full history. Every file is hashed and added to a database. Every commit contains a tree of file hashes, a commit message and a pointer to the ancestor commits. All that is hashed and added to the database, so a commit’s hash can be used to cryptographically verify the integrity of the complete previous history. For a more technical perspective on git’s inner workings, read Git for Computer Scientists (it really is quite cool in its simplicity). Here’s a one-sided comparison of git with some alternatives.
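The reason one hash can vouch for the whole history is that each commit’s hash covers the hash of its parent. The toy model below illustrates only that chaining. It is not how git stores objects: git uses a cryptographic hash (SHA-1 at the time of writing) over a specific object format, while this sketch uses the non-cryptographic FNV-1a and an invented Commit struct with made-up field names.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// FNV-1a: a simple, NON-cryptographic hash, used here only as a stand-in.
std::uint64_t fnv1a(const std::string& data)
{
    std::uint64_t h = 14695981039346656037ULL; // FNV offset basis
    for (unsigned char c : data) {
        h ^= c;
        h *= 1099511628211ULL;                 // FNV prime
    }
    return h;
}

// Toy commit: a tree id, a message, and the id of the parent commit.
struct Commit
{
    std::string tree;
    std::string message;
    std::uint64_t parent;
};

// The commit id covers the parent id, and through it all earlier commits.
std::uint64_t commitId(const Commit& c)
{
    return fnv1a(c.tree + '\n' + c.message + '\n' + std::to_string(c.parent));
}

// Id of the newest commit of a linear history given as (tree, message)
// pairs, oldest first. The first commit uses 0 as its parent id.
std::uint64_t tipId(const std::vector<std::pair<std::string, std::string>>& history)
{
    std::uint64_t parent = 0;
    for (const auto& entry : history) {
        parent = commitId(Commit{ entry.first, entry.second, parent });
    }
    return parent;
}
```

Changing anything in an old commit changes its id, which changes the id of every commit built on top of it. Comparing the tip ids of two repositories is therefore enough to tell whether their histories are identical.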

I started using git at the beginning of 2008 for my work on ExaminationRoom, and while the start was a bit hairy, having a history of my code development as well as my comments has helped me a lot, even as the only developer. I worked on three computers, so keeping the code synchronized was critical. That too was easy thanks to the SCM, even without a reachable central server (one of the computers had no internet access; it was only used to drive two projectors for the experiments).

I still use git these days, and can’t recommend it enough. Although most other projects are World of Warcraft addons… All my public code can be cloned from my repositories.
