
One- and two-dimensional arrays in HLSL are value types and do not need to be declared and instantiated separately as in most object-oriented languages. HLSL also has many features specifically meant to help with arrays, since they’re the most common data structure in graphics programming.

Matrix types are basically 2D arrays. They are mostly used for semantic mapping, to receive the matrices that are supplied by the application or the framework, like the World-View-Projection matrix. The available datatypes are named “float*x*x*y*”, where *x* and *y* are between 1 and 4. That means that the following statements will compile :

    float1x4 someMatrix;
    float2x3 someOtherMatrix;

You can also define them using the “matrix” keyword, although it’s a lot less common :

    matrix <float, 2, 2> someMatrix;
    matrix <float, 1, 4> someOtherMatrix;

Bool, int, double and half (16-bit float) matrices can be defined as well, but again this is neither recommended nor very common.

As implied by the type name, a matrix is an array of floats of *x* rows by *y* columns. When writing floating-point literals, you should use the “f” suffix; but you are not allowed to put the “f” suffix on an integer literal :

    float a = 1f;   // Will NOT compile
    float b = 1.0;  // Will compile
    float c = 1.0f; // Will compile
    float d = 1;    // Will compile as well!

The last one works because integers are implicitly cast to float when needed.

The matrix “cells”, or the array members, can be accessed in three ways :

- Using the ._m*xy* zero-based notation
- Using the ._*xy* one-based notation
- Using the [*x*][*y*] array-style zero-based notation

That means that :

    float3x3 someMatrix;
    someMatrix._m01 = 1.05f;   // First row, second column
    someMatrix._23 = 2.4f;     // Second row, third column
    someMatrix[2][0] = 6.39f;  // Third row, first column

I like to think that _m*xy* is faster since it uses a constant, keyword-like notation instead of taking numbers in parameters like the other notations. But I might be wrong in thinking that too...

You can construct matrix variables in two ways :

- Enumerating the array members in curly brackets
- Defining an array-of-arrays

So, both of those are valid :

    float2x2 matA = {1, 2, 3, 4};     // Enumeration
    float2x2 matB = {{1, 2}, {3, 4}}; // Array-of-arrays

Vector types are named in the same fashion as matrix types :

    float2 a; // A 2-dimension vector
    float3 b; // A 3-dimension vector
    float4 c; // A 4-dimension vector

And the cryptic <> notation still exists, now with the “vector” keyword :

    vector <float, 4> d;
    vector <float, 2> e;

Their constructors are a little different though :

    float2 a = float2(1.5f, 2);          // Is valid
    float4 b = float4(3, 4, 5.1f, 6.9f); // Is valid as well
    float2 c = {1, 15};                  // Works too
    float3 d;                            // Works, but...
    d.x = 1.5f;                          // ...takes...
    d.y = 2.1f;                          // ...time...
    d.z = 3.4f;                          // ...to write

So each vector type has a constructor function that takes the components of the vector and returns the vector itself.

They can be accessed using the array notation “var[*x*]” or by using two sets :

- Either the .x, .y, .z and .w properties,
- Or the .r, .g, .b and .a ones.

This is because vector types are contextually either vectors, points or colors in HLSL programming. This makes the code very natural and readable: we know what we are fetching, rather than reading a bare array index. Sadly there is no .u/.v/.w set (for texture coordinates), probably because that would be confusing for the .w property. Here are some examples :

    float2 vec2d = float2(1, 2);
    vec2d.r = vec2d.g; // OK
    vec2d.y = vec2d.x; // OK
    vec2d.r = vec2d.y; // OK too!
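The array notation was not part of the examples above, so here is a minimal sketch of the three access styles side by side. The function name and its parameter are made up for illustration; all three reads fetch the same component.

```hlsl
// Hypothetical helper, only here to give the statements a home.
float SameComponent(float4 color)
{
    float viaIndex = color[2]; // Array notation, zero-based: third component
    float viaXyzw  = color.z;  // Same component through the xyzw set
    float viaRgba  = color.b;  // Same component through the rgba set

    // All three variables hold the same value.
    return viaIndex + viaXyzw + viaRgba;
}
```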

One important aspect of HLSL is swizzling and masking of vector and matrix types. It greatly simplifies vector access and construction while keeping the code logical and readable.

Look at the following code :

    float3 someVector = float3(1, 2, 3);
    float3 someOtherVector = someVector.zyx;
    float3 yetAnother = float3(someOtherVector.xz, 1);
    float3 lotsOfVectors = yetAnother.xxx;
    float4 someColor = float4(1, 2, 3, 4);
    float3 plainRgb = someColor.rgb;
    float2 redAndBlue = plainRgb.rb;

Those are all possible because of a feature called swizzling.

That means that you can (in SM 2.0) select any combination of vector members, even with repetitions, and the selection will return a vector of length *x*, where *x* is the number of selected members. And as shown on the third code line, any constructor that accepts an enumerated list of elements may be fed “parts” of that list as smaller vectors, mixed with plain values.

The only thing you aren’t allowed to do, for clarity’s sake, is to mix “rgba” notation with “xyzw” notation during swizzling, like so :

    float2 vec = float2(1, 2);
    float2 vec2 = vec.ry; // Not valid!

You can even do that with matrices, to a certain extent :

    float2x2 someMatrix = {{1, 2}, {3, 4}};
    float2 someVector = someMatrix[0];                    // Whole first row
    float4 someOtherVector = someMatrix._11_21_22_12;     // One-based cell names
    float2 yetAnother = someMatrix._m00_m11;              // Zero-based cell names

Swizzling is not available for the array notation, but you can select whole row-vectors out of it.
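Once a row has been selected with the array notation, it is an ordinary vector and can be swizzled like any other. A minimal sketch, where the function name is hypothetical:

```hlsl
// Hypothetical helper, only here to give the statements a home.
float2 RowThenSwizzle(float2x2 someMatrix)
{
    float2 firstRow = someMatrix[0];    // (_m00, _m01)
    float2 reversed = someMatrix[1].yx; // (_m11, _m10): swizzle applied to the row vector
    return firstRow + reversed;
}
```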

And finally, this can be done on floats too!

    float3 vec = 0.5f.xxx;
    float4 color = float4(1.0f.rr, 2.0f.rr);

This can be a nice time-saver, especially when debugging pixel shaders, e.g. to output a single component and fill the rest with 0’s or 1’s.

Similarly to swizzling, HLSL has another feature called write-masking. This consists of storing information on swizzled parts of a vector/matrix, like so :

    float3 someVector = float3(1, 2, 3);
    float3 someOther;
    someOther.xzy = someVector;
    float4 yetAnother;
    yetAnother.wyx = someOther;

And it may be mixed with swizzling in the selection (right-hand) part if needed. Obviously, repetition is not allowed in write-masking, since you can’t assign two values to the same member.
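Here is a small sketch of a write-mask on the left combined with a swizzle on the right. The function and variable names are invented for the example.

```hlsl
// Hypothetical helper, only here to give the statements a home.
float4 MaskWithSwizzle(float3 source)
{
    float4 result = float4(0, 0, 0, 1);

    result.xy = source.zx; // result is now (source.z, source.x, 0, 1)
    result.zw = source.yy; // Repetition is fine on the right-hand (swizzle) side

    // result.xx = source.xy; // Would NOT compile: x would be written twice

    return result;
}
```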

You can explicitly down-cast a matrix or vector type to a smaller one, like so :

    float3x3 someMatrix;
    float2x2 reducedMatrix = (float2x2)someMatrix; // Keeps the upper-left 2×2 block

This is useful, since swizzling with a constructor would be a little long with a full 3×3 matrix. Keep it simple! :) The same goes for vector types: you can down-cast any vector type, as long as the smaller container exists. Up-casting is another story: as far as I know, the HLSL compiler rejects a cast to a larger type, such as (float4x4)someMatrix, with a type-conversion error, so build the larger value with a constructor instead.
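To make the vector case concrete, here is a minimal sketch; the function and variable names are hypothetical. It shows an explicit down-cast, a scalar promoted to a vector by a cast, and a larger vector built with a constructor, which works whether or not your compiler accepts a widening cast.

```hlsl
// Hypothetical helper, only here to give the statements a home.
float4 CastExamples(float3 position)
{
    float2 flat = (float2)position;              // Down-cast: keeps .xy, drops .z
    float3 ones = (float3)1.0f;                  // Scalar cast: replicated to (1, 1, 1)
    float4 homogeneous = float4(position, 1.0f); // Widening through the constructor

    return homogeneous + float4(flat, ones.xy);
}
```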

For 1D and 2D arrays of sizes bigger than 4, you can always use fixed-size standard arrays :

    float fiveLengthArray[5] = {1, 2, 3, 4, 5};
    // An identifier can't start with a digit, hence "twoDArray"
    float twoDArray[5][2] = {{1, 2}, {3, 4}, {5, 6}, {7, 8}, {9, 10}};

They are constructed with curly brackets, but you can’t use vector selectors or swizzling on them.

If you have the choice between a vector/matrix and an array/2D array, *always* prefer the vector-based type. Even if the use of vectors makes the implementation less intuitive, remember that a GPU uses parallel computing to process whole vectors at once; thus using vectors is faster, and consumes fewer arithmetic instructions.

For example, take the following code :

    float data[4];
    float sum = data[0] + data[1] + data[2] + data[3];

This code compiles to 3 assembly instructions. Now this version :

    float4 data;
    float sum = dot(data, 1.0f.xxxx);

This does the exact same thing, and compiles to **1** instruction, so intuitively three times faster. The dot() intrinsic function will be explained in the next section, but it’s a dot product operation. Since I do a dot product with the [1, 1, 1, 1] vector, the result is the same as computing (1 * data.x + 1 * data.y + 1 * data.z + 1 * data.w).
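The same trick extends to any weighted sum: put the weights in the second vector. The sketch below is only an illustration with made-up function names; the luminance weights are the common Rec. 601 values, and any other weight vector works the same way.

```hlsl
// Average of four values: each component is weighted by 0.25.
float Average(float4 data)
{
    return dot(data, 0.25f.xxxx);
}

// Perceived brightness of a color as a weighted sum of its channels.
float Luminance(float3 color)
{
    return dot(color, float3(0.299f, 0.587f, 0.114f));
}
```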

And this applies to matrices as well... Matrix multiplication becomes a powerful tool to perform a complex list of operations. For example :

    float3 lightColor[4]; // RGB color for 4 lights
    float lightIntensity[4];
    float3 weightedAverage = float3(0, 0, 0);
    for (int i = 0; i < 4; i++)
        weightedAverage += lightColor[i] * lightIntensity[i];

In SM2.0, this compiles to 5 assembly instructions. Now the matrix-based version :

    float4x3 lightColor; // One RGB color per row, 4 lights
    float4 lightIntensity;
    float3 weightedAverage = mul(lightIntensity, lightColor);

This compiles to **4** instructions, so one instruction saved. But also cleaner code!

That may look like an insignificant gain, but trust me, when you’re locked to a 64-instruction limit in ps_2_0, you start counting them. :)