Understanding and interpreting the homogeneous transformation matrix in 3D space

In 3D computer graphics it is common practice to describe positions and directions using homogeneous coordinate vectors, and affine transformations (scaling, rotating, translating, shearing, or a combination of them) using 4×4 transformation matrices. While explaining certain aspects of these transformation matrices to a colleague, I noticed that although many articles on the web explain how to set up these matrices, most of them do not point out some very important high-level relations between the numbers in the matrix and the actual transformation they represent in 3D space. After reading this article, you should understand these relations and be able to visualize an affine transformation simply by looking at the raw numbers of the matrix.

This article is a high-level discussion rather than a general mathematical treatment, and it will not explain what matrices are or how to perform computations with them. Before reading on you should:
  1. know what vectors and matrices are
  2. understand matrix*matrix and matrix*vector multiplications by performing a few of them on paper
  3. understand what a homogeneous coordinate vector is
  4. check your fridge (just to be sure)

The setup of a 3D transformation matrix describing an affine transformation

Most articles about 3D transformation matrices will show you something like this:

[Figure: Affine Homogeneous Transformation Matrix]

While this is accurate, the formula above shows how to set up the rotational part of the transformation matrix using Euler angles. Not only are Euler angles evil (avoid them whenever possible), the relatively complex-looking formula also obscures some information that is much easier to comprehend, much more useful, and can be read directly from the matrix, so don’t even try to memorize it.

Generally, an affine transformation in 3D space describes how to map the coordinates of any point from reference space A to reference space B, and we represent this transformation using the 4×4 matrix T. If T is, for example, a “localToWorld” matrix, reference space A would be the local coordinate system (in which, for example, the vertex coordinates of a 3D mesh are defined) and reference space B would be the world coordinate system. Multiplying any local vertex coordinates with the localToWorld matrix yields the coordinates of that vertex in world space, and as such T describes exactly how the 3D mesh is positioned, oriented and scaled in world space. The transformation is said to span A as a subspace of B, and it does so using the three basis vectors ex, ey, ez:

[Figure: Transformation Matrix Basis Vectors]

And here’s the catch: you can read these three basis vectors, as well as the translation component, directly from the four columns of the transformation matrix. They are expressed in terms of the target space the matrix maps to (so, in world space for the case of a localToWorld matrix).

Here is a sample application to illustrate this (you need to have the Unity3D Webplayer installed). Use the mouse to rotate, translate and scale the 3D model and observe how the values in the transformation matrix change. You can also hover the mouse over the transformation matrix to get some tooltips.

[Interactive sample application: Unity3D Webplayer]
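Independently of the interactive sample, the column layout described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the matrix values, the helper names `mat_vec` and `column`, and the variable `local_to_world` are all made up for this example, and the matrix is stored as a list of rows while following the column vector convention.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) with a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def column(m, c):
    """Return column c of a 4x4 matrix stored as a list of rows."""
    return [m[r][c] for r in range(4)]


# Hypothetical localToWorld matrix (values chosen only for illustration).
# Columns, left to right: ex, ey, ez, translation.
local_to_world = [
    [0.0, -1.0, 0.0, 2.0],
    [1.0,  0.0, 0.0, 3.0],
    [0.0,  0.0, 1.0, 4.0],
    [0.0,  0.0, 0.0, 1.0],
]

# Read the basis vectors and the translation straight from the columns.
ex = column(local_to_world, 0)[:3]           # local X axis in world space
ey = column(local_to_world, 1)[:3]           # local Y axis in world space
ez = column(local_to_world, 2)[:3]           # local Z axis in world space
translation = column(local_to_world, 3)[:3]  # local origin in world space

# A position has w = 1, so the translation column is applied to it.
local_point = [1.0, 0.0, 0.0, 1.0]
world_point = mat_vec(local_to_world, local_point)

# A direction has w = 0, so the translation column is ignored.
local_direction = [1.0, 0.0, 0.0, 0.0]
world_direction = mat_vec(local_to_world, local_direction)

print("ex:", ex, "ey:", ey, "ez:", ez, "translation:", translation)
print("world point:", world_point)
print("world direction:", world_direction)
```

The local point (1, 0, 0) ends up at the translation plus one step along ex, which is just the column reading of the matrix applied by hand.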

In the start configuration the transformation of the 3D mesh is aligned with the world coordinate system, which is why the three basis vectors of the transform equal the basis vectors of the world coordinate system. Rotate the model by about 45° around the Z axis by dragging the blue circle and you may see something like this:

[Figure: 3D Transformation Matrix Explained]

So, visualizing this transformation by looking at the numbers no longer requires much imagination. The first column is your thumb, oriented in world space direction (0.7, 0.7, 0.0). The second column is your index finger, oriented in world space direction (-0.7, 0.7, 0.0). The third column is your middle finger, oriented in world space direction (0.0, 0.0, 1.0). The fourth column is the offset from the world origin to the base of your hand. Since this example shows a left-handed coordinate system, use your left hand to perform this exercise.

You can also scale the model by dragging one of the small circles. Observing the numbers in the matrix, you can see that scaling simply means changing the magnitude (length) of the basis vectors. The three numbers below the matrix show these magnitudes:

[Figure: Transformation Matrix Local Scale Explained]

Finally, this article covered a left-handed 3D space that follows a column vector convention (which is the case, for example, in Unity3D). If your 3D package assumes row vectors (which is the case, for example, in XNA), the matrix needs to be transposed. So instead of columns, you would read the rows of the matrix to find the basis vectors. More on 3D space orientation, column vs. row vectors and column-major vs. row-major matrices will be posted soon.

If you found this article helpful and want to ask or add something, please do so in the comments!

Matthias Bühlmann
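As an appendix to the rotation, scaling and row vector remarks above, here is a minimal sketch in plain Python. It is not taken from the sample application: the helper names (`rotation_z`, `basis_vector`, `length`, `transpose`) are hypothetical, and whether a positive angle produces exactly these signs depends on the rotation convention of your 3D package.

```python
import math


def rotation_z(angle_deg):
    """4x4 rotation about the Z axis, column vector convention (list of rows)."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [
        [c,   -s,  0.0, 0.0],
        [s,    c,  0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def basis_vector(m, c):
    """Upper three entries of column c, i.e. one basis vector."""
    return [m[r][c] for r in range(3)]


def length(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def transpose(m):
    """Swap rows and columns (column vector form <-> row vector form)."""
    return [list(row) for row in zip(*m)]


rot = rotation_z(45.0)
thumb = basis_vector(rot, 0)    # roughly (0.7, 0.7, 0.0)
index = basis_vector(rot, 1)    # roughly (-0.7, 0.7, 0.0)
middle = basis_vector(rot, 2)   # (0.0, 0.0, 1.0)

# Scaling the local X axis by 2 only stretches the first basis vector.
scaled = [row[:] for row in rot]
for r in range(3):
    scaled[r][0] *= 2.0

# The local scale is the length of each basis vector: roughly (2, 1, 1).
local_scale = [length(basis_vector(scaled, c)) for c in range(3)]

# In a row vector convention the same basis vectors sit in the rows.
row_vector_form = transpose(scaled)
first_row = row_vector_form[0][:3]

print("thumb:", thumb, "index:", index, "middle:", middle)
print("local scale:", local_scale)
print("first row after transpose:", first_row)
```

Taking the lengths of the columns is one way to recover numbers like the three shown below the matrix in the sample; it assumes the matrix contains no shear.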


1 Comment

  1. B 08/04/2015

    Thanks for the great explanation 🙂
    when i studied this, they told us how to calculate this matrix but not how to visualize it.
