Deferred Rendering

Written by Geoff Wilson (AriusEso).

What is deferred rendering?

From Wikipedia, the free encyclopedia.

" In computer graphics, deferred shading is a three dimensional shading technique in which the result of a shading algorithm is calculated by dividing it into smaller parts that are written to intermediate buffer storage to be combined later, instead of immediately writing the shader result to the color framebuffer. Implementations on modern hardware tend to use multiple render targets (MRT) to avoid redundant vertex transformations. Usually once all the needed buffers are built they are then read (usually as input textures) from a shading algorithm (for example a lighting equation) and combined to produce the final result. In this way the computation and memory bandwidth required to shade a scene is reduced to those visible portions, thereby reducing the shaded depth complexity. "

More succinctly, this means that we render all of our geometric and pixel shading sources into a buffer, then run our shading calculations on the collected buffer data. The buffers take the form of dynamic, viewport-sized render targets (CTVRenderSurface). This buffer is referred to by a few names: gBuffer, Geometric Buffer and Fat Buffer. I prefer “Geometric Buffer” as it is the most descriptive.

Standard rendering is often referred to as “Forward Rendering”.

Why do this over forward rendering?

As with anything like this, there are pros and cons. There are also certain situations where deferred is preferable to forward, and vice versa.


Pros:

  • Reduced shader complexity.
  • Reduced CPU load.
  • Ties in neatly with screen-space effects: HDRR, SSAO and so on.
  • Breaks the 8-light limit; deferred can render many, many more.

Cons:

  • Increased VRAM load.
  • Does not support semi-transparency.

Forward shading engines with more traditional shading architectures pay a higher price by repeating shading work multiple times per frame.

Sources: Wikipedia, NVIDIA Corporation

What TV3D classes should I familiarise myself with for this example?

  • CTVScene
  • CTVRenderSurface (in particular the multi render surface)
  • CTVShader
  • CTVMesh
  • CTVMiniMesh

Geometric Buffer

As I said previously, the geometric buffer is made up of dynamic textures. For this we are going to use render targets. We are going to store six sets of data.

  • Colour texture map.
  • Vertex position.
  • Normals texture map.
  • Specular texture map.
  • View vector.
  • Depth.

Now, as textures, each render target has four channels: red, green, blue and alpha. Not all of these channels will be filled with data. The following shows how I will be using each target.

  • RT0: RGB colour, A unused.
  • RT1: RGBA position.
  • RT2: RGB normals, A specular map.
  • RT3: RGB view vector, A depth.

Obviously, leaving channels blank is a waste of video RAM; usually you would fill these with other kinds of data. This is the flexibility of the deferred system: store whatever data you need for your lighting/shading model.

Many implementations do not have a positional buffer; they instead reconstruct vertex positions from the depth data. For the purposes of this article I am storing the positional data in the buffer to keep things simple.

In order to fill the buffers with data we need to render the scene into each one, outputting the required data. With four buffers this would mean rendering the scene four times, which, as you can imagine, is expensive. Fortunately the solution is MRT (multiple render targets), which allows us to render all of our data in one single pass. This is highly important to deferred rendering as it significantly reduces the rendering load. You can create these buffers in any texture format, but they must all be of the same bit depth.

Sources: MSDN


We essentially render all of the geometry into a pre-pass buffer. The only thing we render in the main pass is our lights. This might at first seem odd; usually lights are more of an abstraction, a structure with a few set values. In deferred rendering, lights are actual geometry. This article is going to focus on point lighting. Point lights, once visualised in the mind, are simply spheres, with the surface of the sphere marking the light's range.

Up to this point I haven’t mentioned shaders. For deferred rendering we need at minimum two: one renders the geometry to our MRT; the other renders the light spheres based on the MRT data we give it. This allows us to separate the process into two key parts.

  1. Render geometry to buffer.
  2. Render lights using buffer.

The reason this is so important is that in the second phase we run shading calculations on only the visible pixels. The buffer data is projected onto the light spheres and the lighting calculation is run. This means you can have as many lights on screen as you like, and technically any light shape you like.

Making lights efficient

So, every light we have on screen must be drawn. This could become quite expensive, with multiple lights requiring multiple draw calls. But there are things we can do to reduce this load. As point lights are simply spheres, they are not complex, which means we can employ instancing techniques to render them. Instancing allows us to send a batch of geometry to the GPU and render it in one single draw call. Truevision3D has an implementation of constants-based instancing in the form of its CTVMiniMesh class. We are going to use this to render the lights, which means we can render up to 52 lights in one single call. This is highly useful, although it has some flaws.

  • We are limited in the amount of data a light can hold.
  • We can only use shader semantics; we cannot send arbitrary per-light properties.

Despite these limitations, we can force per-light properties through by thinking smartly. The most important one is light range; we certainly don’t want all of the lights to have the same range. To get around this we can use scale: if we create a base sphere that is 1 unit in size, we can scale it to whatever range we want, then in the lighting shader pull this value back out of the world matrix. The only issue with this hack is rotation. If you rotate any of the lights, it will break. But these are spheres, so we don’t require rotation, right?

Some information on instancing: MSDN

Practical Example

The following code is a basic example of deferred rendering in Truevision3D. I have attempted to keep it as simple as possible. Although it is C++, the function names and calls are the same in any supported language. I will start with the application (or host) code, then move on to the two shaders. I am only writing the function calls that are important; for the full source code, please download the RAR archive at the bottom of this page.


	rtMRT       = m_Scene->CreateMultiRenderSurface(4, 800, 600, true, true, 1, cTV_TEXTUREFORMAT_HDR_FLOAT16, cTV_TEXTUREFORMAT_HDR_FLOAT16, cTV_TEXTUREFORMAT_HDR_FLOAT16, cTV_TEXTUREFORMAT_HDR_FLOAT16, "mrt");
	shadeBuffer = m_Scene->CreateShader("buffer");

This is how we create our MRT buffer. We make it the same size as the viewport, in this case 800×600. Some of the data we want to store requires floating-point precision, so to keep things simple I have made each render target 16-bit floating point. Then we create/load the buffer shader. The buffer shader is loaded onto all the geometry, in this example just a floor mesh.

	shadeLight  = m_Scene->CreateShader("light");
	shadeLight->SetEffectParamFloat("iw", 1.0f / 800.0f);
	shadeLight->SetEffectParamFloat("ih", 1.0f / 600.0f);
	shadeLight->SetEffectParamTexture("colour", rtMRT->GetTextureEx(0));
	shadeLight->SetEffectParamTexture("position", rtMRT->GetTextureEx(1));
	shadeLight->SetEffectParamTexture("normals", rtMRT->GetTextureEx(2));

We load our lighting shader and set a few values. The first two are the inverted width and inverted height of the viewport (note the floating-point literals; plain `1 / 800` would be integer division and yield 0). The last three are our buffer textures.

	CTVMesh *gLightTemp;
	gLightTemp  = m_Scene->CreateMeshBuilder("light");
	gLightTemp->CreateSphere(1, 12, 12);
	gLights     = m_Scene->CreateMiniMesh(52, "lights");
	gLights->CreateFromMesh(gLightTemp, true);

We create a base for our instanced lights. Then we create the light “cache” and set the lighting shader.

	for(int i = 0; i < 52; i++)
	{
		gLights->SetPosition(float(RandomRange(-100, 100)), 10, float(RandomRange(-100, 100)), i);
		//Light range, set via the scale hack.
		gLights->SetScale(35, 35, 35, i);
		float r = float(RandomRange(0, 255)) / 255;
		float g = float(RandomRange(0, 255)) / 255;
		float b = float(RandomRange(0, 255)) / 255;
		gLights->SetColor(RGBA(r, g, b, 1), i);
		gLights->EnableMiniMesh(i, true);
	}

We run through the light cache, setting each light’s position randomly. We set the range to 35 via the scale hack. We then set random colours and enable each light.

void Render3D()

Before rendering the main frame, we must fill our buffer with data. We do this by rendering all geometry onto the buffer; the buffer shader takes care of putting the data in the right place.


We update the camera and then the buffer. Finally, we render our lights to the screen. The lighting shader reads in the buffer data and calculates each pixel that is to be rendered.


Just like with the C++ code, I am only writing the important parts here. For the full HLSL source code download the RAR archive at the bottom of this page.

struct fmesh //MRT output structure.
{
	float4 C0: COLOR0; //Colour buffer.
	float4 C1: COLOR1; //Position buffer.
	float4 C2: COLOR2; //Normal(RGB) and spec map(A) buffer.
	float4 C3: COLOR3; //View vector(RGB) and depth(A) buffer.
};

This is our pixel shader output structure. As you can see it is our MRT.

fmesh fpmesh(in v2fmesh v2fin)
{
	fmesh    p;
	//If you want to clip pixels whose alpha falls below 0.3f in the colour map's alpha channel:
	//clip(tex2D(csample, v2fin.uv).a - 0.3f);
	float3x3 tbn  = mul(float3x3(v2fin.t, v2fin.b, v2fin.n), (float4x3)w); //TBN matrix in world space.
	float4   ns   = tex2D(bsample, v2fin.uv);                              //Normal map pixel.
	float3   n    = 2 * ns.rgb - 1;                                        //Expand RGB only; A holds the spec map and stays in [0, 1].
	float3   vp   = mul(tbn, (v -;                             //View vector in TBN space.
	//Buffer outputs.
	p.C0 = float4(tex2D(csample, v2fin.uv).rgb, 1);
	p.C1 =;
	p.C2 = float4(mul(n, tbn), ns.a);
	p.C3 = float4(vp, v2fin.z);
	return p;
}

The commenting there should explain.

float4 fp(in v2f v2fin): COLOR0
{
	float4 vp = tex2Dproj(vs, v2fin.suv);                                              //View vector sample.
	float3 vv = normalize(vp.rgb);                                                     //Normalized view vector.
	float4 cp = tex2Dproj(cs, v2fin.suv);                                              //Colour sample.
	float4 pp = float4(tex2Dproj(ps, v2fin.suv).xyz, 1);                               //Position sample.
	float4 np = tex2Dproj(ns, v2fin.suv);                                              //Normal sample.
	float3 lv = (v2fin.lp - pp);                                                       //Light vector.
	float3 ld = normalize(lv);                                                         //Normalized light direction.
	float  at = distance(pp, v2fin.lp);                                                //Distance to the light.
	float  d  = saturate(dot(np, ld) * 0.5f + 0.5f) * clamp(1 - at /, 0, 1); //Half-Lambert diffuse * linear attenuation.
	float  s  = calculatelyon(vv, ld, np) * np.a;                                      //Specular * spec map.
	return d * ((cp * + s);                                                   //Output pixel.
}

Again, the comments are fairly self-explanatory. Both shaders are heavily commented in the full source.

Source Code

tutorialsarticlesandexamples/deferred_rendering.txt · Last modified: 2013/11/22 13:32