[13 Dec 2012] Improvements for shadow mapping in OpenGL and GLSL
This tutorial shows how to improve the results of the shadow mapping method: it lists the disadvantages of basic shadow mapping and possible solutions to those problems. The following improvements are considered: Variance Shadow Mapping (VSM), Exponential Shadow Maps (ESM), Percentage Closer Filtering (PCF), Stratified Poisson Sampling, Rotated Poisson Sampling and others. You can find more info about shadow mapping and shadow mapping for point light sources in the previous tutorials.
Problems of shadow mapping
It is easy to implement shadow mapping, but it requires a lot of time and tweaking to achieve good quality, and the parameters of shadow mapping depend on the scene. The main problems of shadow mapping are:
- erroneous self-shadowing of lit surfaces (z-fighting, also known as shadow acne);
- hard, aliased shadow boundaries without smooth transitions between light and shadow;
- Peter Panning, where shadows appear detached from the objects that cast them.
The following sections show how to solve these problems and how to improve the quality of shadow mapping.
Surfaces that aren't oriented to light are in shadow
One of the simplest optimizations for shadow mapping is automatic shadowing of fragments whose normals point away from the light source. You can perform this check with the dot product of the normal (N) and the direction from the fragment to the light source (L). If the value is less than zero, the fragment is in shadow. This optimization creates better shadows on surfaces that are parallel to the light rays, and improves performance. Normals should be valid and well defined for this optimization to work correctly. It can decrease the quality of shadows during percentage closer filtering, and it partially removes Peter Panning artifacts. The following code snippet shows how to implement the orientation check:

// N - normal at the fragment, L - direction from the fragment to the light
if(dot(N, L) < 0.0)
{
   // surface is oriented away from the light - always in shadow
   shadowFactor = 0.0;
}
Additional offset to depth values in shadow map
Additional offset to the depths in the shadow map is used to minimize z-fighting. Let's consider the following image. The surface of an object is depicted as a green line, the shadow map's projection plane as a thick black line, red lines mark borders between texels in the shadow map, and black dots are the depths stored in the shadow map. The whole surface on the left should be lit, but as shown (with the black line from the camera to the surface), part of the fragments that project to a single texel of the shadow map is in shadow, because the distances from these fragments to the light source are greater than the depth stored in the shadow map. The other part of the fragments that corresponds to the same texel is lit. The situation is similar for the next texels: half of the corresponding fragments are in shadow, the other half are lit. This is the reason for the z-fighting shown in one of the previous images. The image on the right shows the result after an additional offset is applied to the depths in the shadow map. Black dots show the actual depths stored in the shadow map. Now all fragments of the green surface are lit.
The problem is to determine the optimal offset for each depth in the shadow map. If you apply too small an offset, z-fighting will still be present. If you apply a very large offset, Peter Panning will become noticeable. The offset should depend on the precision of the shadow map and on the slope of the surface relative to the direction to the light source.
The following code snippet shows how to add a constant offset to the distance from the fragment to the light source:
float distToLight = distance(fragmentWorldPosition, lightPosition) + epsilon;
This offset doesn't take the slope of the surface into account. Let's consider the following image. The figure on the left shows distances in the shadow map without any additional offset; this shadow map will produce z-fighting. The figure on the right shows the initial depths and the depths after an additional offset. As you can see, the offset for each depth is different. If the surface is perpendicular to the light rays, the offset should be minimal (as for the leftmost texel). If the surface is parallel to the light rays, use the maximum offset (as for the rightmost texel). You can determine the amount of additional offset from the normal at the fragment and the direction to the light source. Check the following code:
// dot product returns cosine between N and L in [-1, 1] range
// then map the value to [0, 1], invert and use as offset
float offsetMod = 1.0 - clamp(dot(N, L), 0.0, 1.0);
float offset = minOffset + maxSlopeOffset * offsetMod;
// another method to calculate offset
// gives very large offset for surfaces parallel to light rays
float offsetMod2 = tan(acos(dot(N, L)));
float offset2 = minOffset + clamp(offsetMod2, 0.0, maxSlopeOffset);
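If you want to sanity-check these formulas outside of the shader, here is a small CPU-side sketch in Python of the same two variants. The minOffset and maxSlopeOffset values here are just illustrative placeholders, not recommendations:

```python
import math

def slope_scaled_offset(n_dot_l, min_offset=0.0005, max_slope_offset=0.005):
    """Variant 1: linear remap of N.L - invert the clamped cosine."""
    offset_mod = 1.0 - max(0.0, min(1.0, n_dot_l))
    return min_offset + max_slope_offset * offset_mod

def slope_scaled_offset_tan(n_dot_l, min_offset=0.0005, max_slope_offset=0.005):
    """Variant 2: tan(acos(N.L)) grows very fast for grazing angles,
    so it is clamped to max_slope_offset."""
    angle = math.acos(max(-1.0, min(1.0, n_dot_l)))
    return min_offset + max(0.0, min(max_slope_offset, math.tan(angle)))

# surface facing the light: minimal offset in both variants
print(slope_scaled_offset(1.0), slope_scaled_offset_tan(1.0))
# surface almost parallel to the light rays: both saturate
print(slope_scaled_offset(0.0), slope_scaled_offset_tan(0.01))
```

Both variants agree at the extremes; they differ only in how quickly the offset grows between them.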
OpenGL can automatically compute and add an offset to the values that are stored in the depth buffer. You can set up the offset with the glPolygonOffset function. Two parameters are available: a multiplier for an offset that depends on the slope of the surface, and a value that determines how many of the smallest possible depth offsets to add (this depends on the format of the shadow map):
glEnable(GL_POLYGON_OFFSET_FILL);
glPolygonOffset(1.0f, 1.0f);
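Per the OpenGL specification, the applied offset is o = m * factor + r * units, where m is the maximum depth slope of the polygon and r is the smallest resolvable depth difference of the depth buffer. Here is a tiny Python sketch of that formula; the value of r below assumes a 24-bit depth buffer and is only illustrative:

```python
def polygon_offset(max_depth_slope, factor=1.0, units=1.0, r=1.0 / (1 << 24)):
    """OpenGL polygon offset formula: o = m * factor + r * units.
    r (smallest resolvable depth difference) depends on the depth
    buffer format; 2^-24 is an assumption for a 24-bit buffer."""
    return factor * max_depth_slope + units * r

# flat polygon gets only the constant part, sloped polygons get more
print(polygon_offset(0.0))
print(polygon_offset(2.0) > polygon_offset(1.0))
```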
Linear filtering for shadow map
OpenGL has an option that allows the fragment shader to sample the shadow map with linear filtering. Normally, linear filtering returns an interpolated value based on the four texels closest to the sampling location instead of the value of the closest texel. But shadow maps have special built-in logic: the linear filter doesn't interpolate between four depths; instead, it automatically compares the current distance from the light to the fragment with the four values in the shadow map. Each passed comparison adds 0.25 to the result, and a failed comparison doesn't modify the result. So sampling can return one of five possible values: 0.0, 0.25, 0.5, 0.75, 1.0.


If none of the comparisons passed, 0 is returned. If all comparisons passed, 1 is returned. So this value can be interpreted as a shadowing factor. This is not linear filtering of depths but linear filtering of depth comparisons. To enable automatic comparisons you have to use a shadow sampler type (sampler2DShadow) in the fragment shader. The following code snippet shows how to initialize a texture for automatic depth comparisons with linear filtering:
// bind shadow map
glBindTexture(GL_TEXTURE_2D, shadowMap);
// set linear filter
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
// set automatic comparison mode
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_COMPARE_MODE, GL_COMPARE_REF_TO_TEXTURE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_COMPARE_FUNC, GL_LEQUAL);
A shadowing factor with 5 possible values makes the transition from light to shadow smoother and decreases visible aliasing. Also, it may be much faster to take a single sample with linear filtering than to take four separate samples and compare them manually with the current distance from the fragment to the light source.
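The compare-then-filter behaviour is easy to emulate on the CPU. This Python sketch assumes the simplest case, where the sample point lies exactly between four texels, so every comparison contributes equally (0.25):

```python
def shadow_compare_linear(four_depths, ref_depth):
    """Emulates GL_COMPARE_REF_TO_TEXTURE with GL_LINEAR at a texel corner:
    each of the four depths is compared first (LEQUAL), then the
    binary results are averaged - not the depths themselves."""
    passed = [1.0 if ref_depth <= d else 0.0 for d in four_depths]
    return sum(passed) / len(passed)

# one of four comparisons fails -> 0.75, a partial shadowing factor
print(shadow_compare_linear([0.4, 0.6, 0.6, 0.6], 0.5))
```

Note that averaging the depths first and comparing once would give a hard 0/1 result instead; the whole point is that comparison happens before filtering.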
Percentage closer filtering
Percentage closer filtering (PCF) is the simplest method to create soft shadows with shadow mapping. Instead of taking one sample from the shadow map, it takes multiple samples around the location of the fragment. The shadowing factor is calculated as the ratio of passed depth comparisons to the total number of depth comparisons. Moreover, PCF can use linear filtering to get the results of 4 comparisons at once. So if PCF takes 4 samples, then in combination with linear filtering it performs 16 depth comparisons, giving the same number of intensity steps in the transition between light and shadow. The samples should be taken in such a way that all comparisons are done with unique texels of the shadow map; that is, linear filtering shouldn't sample the same texels twice or more.


The quality of soft shadows in percentage closer filtering directly depends on the number of samples from the shadow map. A small number of samples combined with a large PCF kernel radius leads to visible bands along the transition between light and shadow. The more unique comparisons are performed, the better the quality of the soft shadows, but more samples also lead to worse performance. A large radius of the PCF filter requires a larger offset to the depth values in the shadow map, or the lit surfaces may become partially shadowed. The following code snippet shows how to implement percentage closer filtering:


// number of samples
const int numSamplingPositions = 4;
// offsets for rectangular PCF sampling
vec2 kernel[4] = vec2[]
(
vec2(-1.0, 1.0), vec2(1.0, 1.0), vec2(-1.0, -1.0), vec2(1.0, -1.0)
);
// performs sampling and accumulates shadowing factor
void sample(in vec3 coords, in vec2 offset,
inout float factor, inout float numSamplesUsed)
{
factor += texture(
u_textureShadowMap,
vec3(coords.xy + offset, coords.z)
);
numSamplesUsed += 1;
}
void main()
{
// ...
// sample each PCF texel around current fragment
float shadowFactor = 0.0;
float numSamplesUsed = 0.0;
float PCFRadius = 1;
for(int i=0; i<numSamplingPositions; i++)
{
sample(projectedCoords,
kernel[i] * shadowMapStep * PCFRadius,
shadowFactor,
numSamplesUsed
);
}
// normalize shadowing factor
shadowFactor = shadowFactor/numSamplesUsed;
}
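The core of PCF, the ratio of passed comparisons, can be illustrated with a CPU-side Python sketch. The offsets mirror the rectangular kernel above; the tiny 3x3 depth map is made up for the example:

```python
def pcf_shadow_factor(shadow_map, ref_depth, x, y, offsets):
    """Shadowing factor = passed depth comparisons / total comparisons."""
    passed = 0
    for dx, dy in offsets:
        # clamp sampling coordinates to the map borders
        tx = min(max(x + dx, 0), len(shadow_map[0]) - 1)
        ty = min(max(y + dy, 0), len(shadow_map) - 1)
        if ref_depth <= shadow_map[ty][tx]:
            passed += 1
    return passed / len(offsets)

kernel = [(-1, 1), (1, 1), (-1, -1), (1, -1)]
# made-up 3x3 map: left column occluded at depth 0.2, the rest is far away
depth_map = [[0.2, 1.0, 1.0] for _ in range(3)]
print(pcf_shadow_factor(depth_map, 0.5, 1, 1, kernel))  # penumbra: 0.5
```

Two of the four comparisons hit the occluded column, so the fragment ends up half lit, which is exactly the smooth transition PCF is after.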
Poisson sampling kernel
A Poisson disk of offsets allows PCF to sample with more irregular offsets than the standard PCF grid. The offsets in a Poisson disk are uniformly distributed over the unit circle, and the distance between any two offsets is not less than some defined minimal distance. This allows you to reduce banding a bit and use fewer samples. To generate a Poisson disk of offsets you can use a Poisson Disk Generator app; you won't get a valid Poisson disk with simple random generation.
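To verify that a set of offsets really forms a Poisson disk, you can check the two properties mentioned above: all offsets lie inside the unit circle, and no two offsets are closer than the minimal distance. A Python sketch of such a check:

```python
import math

def is_poisson_disk(points, min_dist):
    """True if all offsets lie in the unit circle and no two of them
    are closer than min_dist; simple random points usually fail this."""
    for i, (x1, y1) in enumerate(points):
        if math.hypot(x1, y1) > 1.0:
            return False
        for (x2, y2) in points[i + 1:]:
            if math.hypot(x1 - x2, y1 - y2) < min_dist:
                return False
    return True
```

A generator app essentially runs rejection sampling against exactly this predicate.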
Compared with a regular PCF grid, sampling with a Poisson kernel gives noticeably better quality with a smaller number of required samples. It reduces banding and preserves soft transitions between light and shadow. The following images show the results of Poisson sampling and an example of a Poisson sampling kernel.


Rotated Poisson kernel
If you use a constant grid of sampling positions for percentage closer filtering, you will have to deal with insufficient softness of the transition, banding and performance problems. Good quality of soft shadows and absence of banding require a lot of samples, and a huge number of samples is always the main performance bottleneck. The banding can be visible even with a large number of samples because the sample offsets are the same for all fragments, so they create patterns of intensity steps along the boundaries of shadows.
You could try to randomly generate a sampling kernel for each fragment, but, as noted previously, random non-uniform samples may read the same texels of the shadow map multiple times. You would have to generate a Poisson disk for each fragment, and all that math would decrease the performance of the fragment shader even more than a large number of samples. Instead, you can define one Poisson kernel in the fragment shader and then randomly rotate the kernel in 2D for each fragment.
This leads to another problem: generating a random angle in the shader. Shaders usually don't have built-in generation of random numbers, so you have to implement this functionality manually. In order to generate the same random number for a given fragment every frame, the pseudo-random number generator should depend on the position of the fragment in the scene; otherwise the generator would produce different numbers each frame and the final image would flicker. You can write the pseudo-random number generator as some high-frequency function in the shader, or provide a texture with previously generated random numbers and sample a different location for each fragment (using the position of the fragment in the scene as texture coordinates). The following code snippet shows an example of a pseudo-random number generator:


// generates pseudorandom number in [0, 1)
// seed - world space position of a fragment
// freq - modifier for the seed. The bigger it is, the faster
// the pseudorandom numbers change with world space position
float random(in vec3 seed, in float freq)
{
// project seed on random constant vector
float dt = dot(floor(seed * freq), vec3(53.1215, 21.1352, 9.1322));
// return only fractional part
return fract(sin(dt) * 2105.2354);
}
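The same generator can be ported to the CPU to check its key properties: it is deterministic for a fixed position (so the image won't flicker) and its results stay in the [0, 1) range. A Python sketch of this port:

```python
import math

def fract(x):
    """GLSL-style fract: x - floor(x), always in [0, 1)."""
    return x - math.floor(x)

def pseudo_random(seed, freq):
    """CPU port of the shader generator above: the seed (a 3D position)
    is quantized by freq, projected on a constant vector, and the
    fractional part of a scaled sine is returned."""
    fs = [math.floor(c * freq) for c in seed]
    dt = fs[0] * 53.1215 + fs[1] * 21.1352 + fs[2] * 9.1322
    return fract(math.sin(dt) * 2105.2354)
```

Because the seed is quantized with floor, all fragments inside one cell of size 1/freq get the same number; larger freq means smaller cells and higher-frequency noise.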
As the Poisson disk has to be rotated by random angles, you can store cos(theta) and sin(theta) in a texture, and these values can be used directly in the 2D rotation.
The following code snippet shows the implementation of rotated Poisson kernel sampling with the pseudo-random number generator. The image on the right shows the results of Poisson sampling with an increased radius of the Poisson kernel but with the same number of samples. As you can see, random rotation of the sampling kernel almost totally removes banding and increases the possible kernel radius, but with a low number of samples this method creates visible noise. The noise becomes less visible when the shadows are combined with diffuse maps, lighting and other effects.


// default Poisson disk (offsets)
const int numSamplingPositions = 9;
uniform vec2 kernel[9] = vec2[]
(
vec2(-0.95581, 0.18159), vec2(0.50147, -0.35807), vec2(-0.69607, -0.35559),
vec2(0.0036825, 0.59150), vec2(0.15930, 0.089750), vec2(0.65031, 0.058189),
vec2(0.11915, -0.78449), vec2(-0.34296, 0.51575), vec2(0.60380, 0.41527)
);
// returns random angle
float randomAngle(in vec3 seed, in float freq)
{
return random(seed, freq) * 6.283185;
}
void main()
{
// ...
// generate random rotation angle for each fragment
float angle = randomAngle(o_worldPos, 15);
float s = sin(angle);
float c = cos(angle);
float PCFRadius = 20;
for(int i=0; i < numSamplingPositions; i++)
{
// rotate offset
vec2 rotatedOffset = vec2(kernel[i].x * c - kernel[i].y * s,
kernel[i].x * s + kernel[i].y * c);
sample(projectedCoords, rotatedOffset * shadowMapStep * PCFRadius,
shadowFactor, numSamplesUsed);
}
shadowFactor = shadowFactor/numSamplesUsed;
}
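Random rotation doesn't break the Poisson disk property, because a 2D rotation preserves the length of every offset and the distances between offsets. A quick Python check of the same rotation math used in the shader:

```python
import math

def rotate(offset, angle):
    """2D rotation: x' = x*cos - y*sin, y' = x*sin + y*cos."""
    c, s = math.cos(angle), math.sin(angle)
    x, y = offset
    return (x * c - y * s, x * s + y * c)

# the rotated offset keeps its distance from the kernel center,
# so the minimal distance between Poisson samples is preserved too
p = (0.95581, 0.18159)
q = rotate(p, 1.234)
print(math.hypot(*p), math.hypot(*q))
```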
Stratified Poisson kernel
Instead of randomly rotating the Poisson disk in the shader, you can define more offsets in the Poisson kernel and, for each fragment, use a randomly selected subset of those offsets. This leads to results very similar to the rotated Poisson kernel modification: it decreases banding artifacts and increases the possible radius of the transition. But this modification requires more samples than rotated Poisson sampling in order to reduce noise, as it may sample the same locations multiple times.


// Poisson disc for Stratified Poisson sampling
const int numSamplingPositions = 9;
uniform vec2 kernel[9] = vec2[]
(
vec2(-0.95581, 0.18159), vec2(0.50147, -0.35807), vec2(-0.69607, -0.35559),
vec2(0.003682, 0.5915), vec2(0.1593, 0.08975), vec2(0.6503, 0.05818),
vec2(0.11915, -0.78449), vec2(-0.34296, 0.51575), vec2(0.6038, 0.41527)
);
// returns pseudorandom integer in [0, maxInt) range
int randomInt(in vec3 seed, in float freq, in int maxInt)
{
return int(float(maxInt) * random(seed, freq)) % maxInt;
}
void main()
{
// ...
// four random samples
float radius = 5;
for(int i=1; i <= 4; i++)
{
// random index that depends on position of the fragment
int randomIndex = randomInt(o_worldPos * i, 100, numSamplingPositions);
// get offset through random index
sample(projectedCoords, kernel[randomIndex] * shadowMapStep * radius,
shadowFactor, numSamplesUsed);
}
shadowFactor = shadowFactor/numSamplesUsed;
}
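A CPU-side Python port of randomInt shows that the returned index always stays inside the kernel and is stable for a fixed fragment position:

```python
import math

def random_int(seed, freq, max_int):
    """CPU port of randomInt: pseudorandom integer index in [0, max_int).
    Reuses the same generator math as the random() function above."""
    fs = [math.floor(c * freq) for c in seed]
    dt = fs[0] * 53.1215 + fs[1] * 21.1352 + fs[2] * 9.1322
    r = math.sin(dt) * 2105.2354
    rnd = r - math.floor(r)            # fract -> [0, 1)
    return int(max_int * rnd) % max_int
```

The final `% max_int` guards the edge where rnd is so close to 1 that `int(max_int * rnd)` rounds up to max_int itself.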
Early bailing
There is one optimization that can increase speed and reduce the number of required samples from the shadow map. It is applicable to scenes where large parts of the scene are fully lit or fully in shadow, and which don't contain high-frequency changes between light and shadow (unlike, for example, shadows from trees with a lot of branches). The first step of this modification takes a small number of samples at distant locations in the shadow map. If all comparisons passed, assume that the fragment is lit and stop sampling. Similarly, if all comparisons failed, assume that the fragment is in shadow and stop sampling.


Otherwise, you have to take the usual number of samples from the shadow map to calculate a smooth transition between light and shadow. If the distance between the samples used for the early checks is too large, it is possible that small lit areas become shadowed and small shadowed areas become lit. You can see these artifacts in the image.
// Poisson disk with offsets for early bailing
const int numSamplingPositions = 13;
uniform vec2 kernel[13] = vec2[]
(
vec2(-0.9328896, 0.03145855), // left check offset
vec2(0.8162807, -0.05964844), // right check offset
vec2(0.184551, 0.9722522), // top check offset
vec2(0.04031969, -0.8589798), // bottom check offset
vec2(-0.54316, 0.21186), vec2(-0.039245, -0.34345), vec2(0.076953, 0.40667),
vec2(-0.66378, -0.54068), vec2(0.54130, -0.66730), vec2(0.69301, 0.46990),
vec2(0.37228, 0.038106), vec2(0.28597, 0.80228), vec2(-0.44801, -0.43844)
);
void main()
{
// ...
float PCFRadius = 5;
// take samples required for early bailing
for(int i=0; i < 4; i++)
{
sample(projectedCoords, kernel[i] * shadowMapStep * PCFRadius,
shadowFactor, numSamplesUsed);
}
// early bailing check
if(shadowFactor>0.1 && shadowFactor<3.9)
{
// additional sampling to get smooth transition
float angle = randomAngle(o_worldPos, 15);
float s = sin(angle);
float c = cos(angle);
// don't take early bailing offsets again
for(int i=4; i < numSamplingPositions; i++)
{
vec2 rotatedOffset = vec2(kernel[i].x * c + kernel[i].y * s,
kernel[i].x * s + kernel[i].y * c);
sample(projectedCoords, rotatedOffset * shadowMapStep * PCFRadius,
shadowFactor, numSamplesUsed);
}
}
}
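The early bailing decision itself can be sketched on the CPU. In this Python example the comparison results are passed in as lists of 0.0/1.0 values, and the thresholds mirror the 0.1 and 3.9 checks in the shader above:

```python
def early_bail_factor(first_four, refine):
    """Sketch of early bailing: 4 spread-out comparisons first; the
    remaining, more expensive samples are taken only if the first
    ones disagree (i.e. the fragment is in the penumbra).
    first_four / refine hold 0.0 or 1.0 per comparison."""
    factor = sum(first_four)
    used = float(len(first_four))
    # mixed results -> penumbra -> take the full set of samples
    if 0.1 < factor < len(first_four) - 0.1:
        factor += sum(refine)
        used += len(refine)
    return factor / used

print(early_bail_factor([1.0] * 4, [0.0] * 9))  # bailed out: fully lit
print(early_bail_factor([1.0, 1.0, 0.0, 0.0], [1.0] * 9))  # refined
```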
Variance shadow mapping (VSM)
The main problem of the PCF method and its modifications is that they require multiple samples from the shadow map and comparisons with the current fragment's depth in order to smooth transitions between light and shadow. It would be good to blur the shadow map, use automatic linear filtering, take only one sample and get a smooth transition from light to shadow. But standard shadow mapping can't handle this, as sampling from the shadow map followed by a comparison can't be treated as a linear operation. The shadows would be wrong and there would be no smooth transition, as only one sample is taken and the comparison has only two possible results (full light or full shadow).
The Variance Shadow Mapping (VSM) method calculates soft shadows and requires only one sample from the shadow map. It modifies the logic of shadow mapping: instead of a simple depth comparison, VSM uses the average depth and the average square of depths around the texel. These values are then used in Chebyshev's inequality, whose inputs are the difference between the depth of the current fragment and the average depth, and the standard deviation of the depths around the texel. Chebyshev's inequality creates soft shadows with a probabilistic approach. The following steps are required to implement Variance Shadow Mapping:
The first rendering pass is similar to the first pass of standard shadow mapping. Calculate the normalized distance from the fragment to the light source in [0, 1] range. Additionally, you have to save the square of that depth, so you have to set up a render target with two channels. The following code snippet shows a shader that saves the two required depth values:
float worldSpaceDistance = distance(u_lightPosition, o_worldSpacePosition.xyz);
float dist = (worldSpaceDistance - u_nearFarPlane.x) /
(u_nearFarPlane.y - u_nearFarPlane.x) + u_depthOffset;
resultingColor.r = clamp(dist, 0, 1);
resultingColor.g = resultingColor.r * resultingColor.r;
Now you have to blur the shadow map that contains the linear depths and the squares of linear depths. You can blur with a two-pass Gaussian filter with a 5x5 sampling kernel (one horizontal pass and one vertical pass). After blurring you get the average depth and the average square of depths around each texel. You can also enable linear filtering and anisotropic filtering to increase the blur and the quality of the soft shadows.
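A two-pass (separable) Gaussian blur can be sketched in Python. The 5-tap binomial kernel below is one common choice for a 5x5 Gaussian, and edges are handled with clamp-to-edge addressing; both are assumptions for illustration:

```python
def gaussian_blur_1d(values, kernel):
    """One blur pass along a row or a column, clamp-to-edge addressing."""
    n, r = len(values), len(kernel) // 2
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - r, 0), n - 1)
            acc += values[j] * w
        out.append(acc)
    return out

def blur_two_pass(img, kernel=(0.0625, 0.25, 0.375, 0.25, 0.0625)):
    """Separable Gaussian: one horizontal and one vertical pass gives
    the same result as a full 5x5 kernel, at a fraction of the cost."""
    rows = [gaussian_blur_1d(row, kernel) for row in img]
    h, w = len(rows), len(rows[0])
    cols = [gaussian_blur_1d([rows[y][x] for y in range(h)], kernel)
            for x in range(w)]
    return [[cols[x][y] for x in range(w)] for y in range(h)]
```

Separability is what makes the blur cheap: 5+5 taps per texel instead of 25.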
The render target with the local average depths is passed to the last rendering pass of VSM. The vertex shader of the last pass is similar to the vertex shader in the last pass of simple shadow mapping, and the fragment shader of the last pass of VSM similarly calculates texture coordinates to access the shadow map.
Then you have to calculate the standard deviation and apply Chebyshev's inequality. Take a sample from the shadow map. Let the average depth be M1 and the average square of depths be M2. Then the square of the standard deviation sigma (the variance) is equal to:
sigma^2 = M2 - M1^2
From Chebyshev's inequality the shadowing factor can be calculated as:
p_max = sigma^2 / (sigma^2 + (d - M1)^2)
where d is the normalized distance from the current fragment to the light source. The intensity of lighting is maximal when the difference between the current depth and the average depth is minimal. The standard deviation also weights the significance of that difference: if the standard deviation is large, it increases the intensity of lighting in places where the difference is big.
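The whole per-fragment VSM computation, Chebyshev's inequality plus the light bleeding limit discussed below, fits in a few lines. Here is a CPU-side Python sketch; min_variance and bleed_limit are illustrative placeholder values for the u_minVariance and u_lightBleedingLimit uniforms:

```python
def vsm_light_intensity(d, m1, m2, min_variance=1e-5, bleed_limit=0.2):
    """Chebyshev upper bound on the light intensity, followed by the
    light bleeding reduction (rescale [bleed_limit, 1] to [0, 1])."""
    if d <= m1:
        return 1.0                      # in front of the average occluder
    variance = max(m2 - m1 * m1, min_variance)
    p_max = variance / (variance + (d - m1) ** 2)
    return min(max((p_max - bleed_limit) / (1.0 - bleed_limit), 0.0), 1.0)

print(vsm_light_intensity(0.3, 0.5, 0.26))     # fully lit
print(vsm_light_intensity(0.7, 0.5, 0.2501))   # deep shadow
```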
But Variance Shadow Mapping has its own disadvantages, among them light bleeding. Light bleeding means that some surfaces become lit when they should be in shadow. This artifact may appear in places with high-frequency changes of light and shadow. Light bleeding can be decreased by setting a lower limit on the final intensity calculated through Chebyshev's inequality. You can also automatically shadow fragments whose normals point away from the direction of the light rays.


Following fragment shader shows how to implement last pass of Variance Shadow Mapping:
#version 330
in vec4 o_shadowCoord;
in vec3 o_normal;
in vec3 o_worldPos;
layout(location = 0) uniform sampler2D u_textureShadowMap;
uniform vec3 u_lightPosition;
uniform float u_minVariance;
uniform float u_lightBleedingLimit;
uniform vec2 u_nearFarPlane;
layout(location = 0) out vec4 resultingColor;
////////////////////////////////////////////////////////
float reduceLightBleeding(float p_max, float amount)
{
return clamp((p_max - amount) / (1.0 - amount), 0.0, 1.0);
}
void main(void)
{
/////////////////////////////////////////////////////////
// current distance to light
vec3 fromLightToFragment = u_lightPosition - o_worldPos.xyz;
float worldSpaceDistance = length(fromLightToFragment);
float currentDistanceToLight = clamp((worldSpaceDistance - u_nearFarPlane.x)
/ (u_nearFarPlane.y - u_nearFarPlane.x), 0, 1);
/////////////////////////////////////////////////////////
// get blured and blured squared distance to light
vec3 projectedCoords = o_shadowCoord.xyz / o_shadowCoord.w;
vec2 depths = texture(u_textureShadowMap, projectedCoords.xy).xy;
float M1 = depths.x;
float M2 = depths.y;
float M12 = M1 * M1;
float p = 0.0;
float lightIntensity = 1.0;
if(currentDistanceToLight >= M1)
{
// squared standard deviation (variance)
float sigma2 = M2 - M12;
// when standard deviation is smaller than epsilon
if(sigma2 < u_minVariance)
{
sigma2 = u_minVariance;
}
// Chebyshev's inequality - upper bound on the
// probability that the fragment is lit
float intensity = sigma2 / (sigma2 + pow(currentDistanceToLight - M1, 2));
// reduce light bleeding
lightIntensity = reduceLightBleeding(intensity, u_lightBleedingLimit);
}
/////////////////////////////////////////////////////////
resultingColor.rgb = vec3(lightIntensity);
resultingColor.a = 1.0;
}
Exponential Shadow Maps (ESM)
Variance Shadow Mapping uses two channels of the shadow map, and as a result it takes twice as much memory; also, light bleeding might be too severe. Instead of Variance Shadow Mapping you can use Exponential Shadow Maps (ESM). The Exponential Shadow Maps method uses only one channel of the shadow map and decreases the light bleeding effect.
The Exponential Shadow Maps method is based on the fact that the difference between the distance from the current fragment to the light source (d) and the depth stored in the shadow map (z) has to be greater than or equal to zero, since the stored depth is the distance to the closest occluder. When the fragment is lit, the difference (in the ideal case!) is 0, and when the fragment is in shadow it is greater than zero. The difference may become negative due to floating point errors or due to some manual manipulations of the input values (like blur).
The shadowing factor can be approximated with an exponential function, exp(c * (z - d)). The value of this function decreases very quickly towards zero as the difference (d - z) grows. When the depths are equal, exp(c * (z - d)) = exp(0) = 1.
You can adjust the softness of the transition between light and shadow with the constant c. The greater the value, the sharper the shadows. For 32-bit shadow maps you can use c = 80 and lower values, and for 16-bit shadow maps, c = 30 and lower. With too low values the shadows become too bright near the shadow caster.
As you can see from the following image, the function returns 1.0 for fragments where the difference between the depths is zero. When the difference (d - z) is negative, the function returns a value greater than 1.0; in that case use 1.0 as the shadowing factor. If the difference (d - z) is positive, the function returns a shadowing factor between 1.0 and 0.0 and creates a smooth transition.
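Here is a CPU-side Python sketch of this math, with exp(c * z) playing the role of the value read from the (blurred) shadow map:

```python
import math

def esm_shadow_factor(d, stored_exp_cz, c=80.0):
    """ESM: exp(c*z) comes from the shadow map; multiplying it by
    exp(-c*d) gives exp(c*(z - d)), clamped to 1 for z >= d."""
    return min(stored_exp_cz * math.exp(-c * d), 1.0)

depth = 0.5
print(esm_shadow_factor(depth, math.exp(80.0 * depth)))   # equal depths: lit
print(esm_shadow_factor(0.6, math.exp(80.0 * 0.5)))       # occluded: near 0
```

With c = 80 a depth difference of only 0.1 already drives the factor to exp(-8), i.e. deep shadow; this is why c controls the sharpness of the transition.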
Now let's look at how to implement Exponential Shadow Maps. In the first pass, store exp(c * linearDepth) instead of the linear depth. Note that depth values should be in [0, 1] range:
resDepth = exp(c * depth);
As with VSM, you can blur the ESM shadow map with a Gaussian filter. The greater the blur, the softer the shadows, but too large a blur will lead to artifacts.
In the last rendering pass of ESM, calculate the shadowing factor as exp(c * z) * exp(-c * d). The following code snippet shows the implementation of the Exponential Shadow Maps shader:
#version 330
in vec4 o_shadowCoord;
in vec3 o_normal;
in vec3 o_worldPos;
layout(location = 0) uniform sampler2D u_textureShadowMap;
uniform vec3 u_lightPosition;
uniform float u_depthMultiplier;
uniform vec2 u_nearFarPlane;
uniform float u_epsilon;
layout(location = 0) out vec4 resultingColor;
/////////////////////////////////////////////////
void main(void)
{
/////////////////////////////////////////////////
// current distance to light
vec3 fromLightToFragment = u_lightPosition - o_worldPos.xyz;
float worldSpaceDistance = length(fromLightToFragment);
float currentDistanceToLight = clamp((worldSpaceDistance - u_nearFarPlane.x)
/ (u_nearFarPlane.y - u_nearFarPlane.x), 0, 1);
/////////////////////////////////////////////////
// get blured exp of depth
vec3 projectedCoords = o_shadowCoord.xyz / o_shadowCoord.w;
float depthCExpBlured = texture(u_textureShadowMap, projectedCoords.xy).r;
// current exp of depth
float depthCExpActual = exp(-(u_depthMultiplier * currentDistanceToLight));
float expFactor = depthCExpBlured * depthCExpActual;
// Threshold classification for high frequency artifacts
if(expFactor > 1.0 + u_epsilon)
{
expFactor = 1.0;
}
/////////////////////////////////////////////////
resultingColor.rgb = vec3(expFactor);
resultingColor.a = 1.0;
}
ESM may improperly handle scenes with high-frequency details. In such cases, instead of one sample you can take multiple samples and average them, similarly to PCF. To determine which fragments require additional sampling, you can check whether the calculated shadowing factor is greater than 1.0 + epsilon.
Exponential Shadow Maps have their own disadvantages. The first is that ESM is a hack: it gives nice shadows and good performance, but it doesn't have any physical background. Because of this, light bleeding is present, although it is much weaker than in VSM. Also, shadows near objects can be too bright and darker at farther distances, while it should be the other way around. Nevertheless, ESM with correct parameters is a fine and fast method to get real-time shadows.


Sun and Black Cat, Igor Dykhta © 2007-2014