ShaderFlex - Overview



ShaderFlex is a powerful stand-alone GPU shader authoring environment for DirectX 11, similar to legacy tools like NVIDIA FX Composer and AMD RenderMonkey.


Whether you're a beginner looking to learn about shaders, a graphics hobbyist wanting to sandbox and share your latest creations, or a seasoned graphics engineer prototyping killer special effects for an upcoming AAA game, ShaderFlex is the tool for you.


Scroll down for an overview of ShaderFlex or check out the Features page for more in-depth information.


NOTE: ShaderFlex is still in development and a beta program is planned for later this year.

Above is an image of ShaderFlex rendering over 500,000 dynamic blades of grass, with depth of field and HDR blooming.


Main Features
  • DirectX 11 support ( extensive control of the API through ShaderFlex's material definitions )
  • Shader models ( 4_0_level_9_1, 4_0_level_9_3, 4_0, 4_1 and 5_0 )
  • Shader types ( compute, vertex, geometry, hull, domain, pixel )
  • Complex non-linear multi-pass support ( for rendering effects like smooth particle hydrodynamics or fluid simulations which require many scattered passes )
  • Uses proprietary 100% GPU accelerated user interface ( runs smoothly on 4K+ displays, dockable, supports HiDPI scaling, abstracted from Windows )
  • NodeFlex support ( use NodeFlex and our customizable DX11 graph system to auto-generate your shaders and create your own reusable code generating logic )
  • Oculus Rift DK1 and DK2 support ( using Direct HMD Access and latest SDK )
  • 3DConnexion SpaceMouse support ( and related products )
  • Debugging capabilities ( live preview, resources, ASM, timing, passes, shader feedback )
  • Scene scripting ( setup render targets, feed values to shaders, animated objects, and control the scene and rendering per frame )
  • Material definitions ( easy to use C++ style material definition for setting up resources, views, shaders, states, passes, etc. )
  • RenderFlex ( export your scenes in binary format to share with others using RenderFlex )
  • Simulation and Animation ( ability to run simulation passes at different frequencies than other rendering or compute passes )
  • Texture support ( .jpg, .png, .dds, .tga, .ppm, .bmp, .dib, .pfm, .hdr, .tif, .gif )
  • ShaderFlex Material API ( lightweight API to access all data from a material )
  • ShaderFlex Rendering API ( integration API to load and render ShaderFlex materials in your rendering pipeline )
  • Autodesk FBX support ( .fbx, .obj, .3ds, animation and CPU skinning )
  • Stream-output support ( render geometry to buffers for later use )
  • Indirect draw/compute ( keep the GPU from stalling by initiating draw/dispatch calls using values from a DX Buffer )
  • Data Caching ( generate data from a compute shader in an INIT pass and save the data to file for future quick loading )
  • Pass iterations ( simply tell a pass how many times it should execute and access the iteration count and iteration index from your shaders )
  • Compute generated vertex and index buffers ( generate and render geometry entirely on the GPU )
  • Access mesh data from compute shaders ( get the position, normal and other data for processing in a compute shader, for example generating static ambient occlusion )
  • Flexible viewport ( go full screen, full window, choose from a selection of standard resolutions or choose your own )
  • Full access to mouse/touch input data from shaders
  • Auto-draw ( tell a rendering pass how many fake vertices to render, and use the SV_VertexID to dynamically generate geometry like grass blades for example )
  • Geometry instancing
  • Tweaks and presets ( shader variable overrides and per-material presets )
  • Supports WARP software rendering
  • Code Editing ( IntelliSense-style code editing, auto-completion tips and other helpers, code preview )
  • Multiple projects ( open multiple projects at the same time )
  • Scene editor ( build hierarchical scenes using frame/camera/light/mesh entities and post fx filters, each entity can have unlimited materials attached )
  • Auto generate mip map levels on the GPU
  • Volume Texture Targets ( render into volume slices or use a compute shader to do the same )
  • Cube Texture Targets ( render into cube faces or use a compute shader to do the same )
  • 2D/3D/Volume texture arrays ( use them as shader resources, unordered access buffers or render targets )
  • Resource Views ( over 31 different types of shader resource, unordered access and render target views )
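The "auto-draw" feature listed above (issuing a draw call with a number of fake vertices and deriving geometry purely from SV_VertexID) can be sketched in plain Python. All names and numbers below are illustrative, not ShaderFlex's actual API:

```python
# Sketch of the "auto-draw" idea: a draw call is issued with N "fake"
# vertices and no vertex buffer; the shader reconstructs geometry
# purely from the vertex index (SV_VertexID in HLSL).

VERTS_PER_BLADE = 6  # two triangles forming one grass-blade quad

def blade_vertex(vertex_id, blade_spacing=1.0, blade_height=0.5):
    """Derive a grass-blade quad corner from a bare vertex index."""
    blade_index = vertex_id // VERTS_PER_BLADE   # which blade
    corner      = vertex_id %  VERTS_PER_BLADE   # which corner of its quad
    # Corner offsets for the quad's two triangles (x across, y up).
    offsets = [(0, 0), (1, 0), (1, 1), (0, 0), (1, 1), (0, 1)]
    ox, oy = offsets[corner]
    x = blade_index * blade_spacing + ox * 0.1   # thin blade
    y = oy * blade_height
    return (x, y)

# "Drawing" 2 blades = 12 auto-generated vertices, no vertex buffer needed:
verts = [blade_vertex(i) for i in range(2 * VERTS_PER_BLADE)]
```

On the GPU the same mapping runs in the vertex shader, so the grass geometry never has to exist in CPU memory at all.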

One of the most important components of real-time 3D graphics today is the "shader". Yes, those little mathematical programs which get executed hundreds of millions of times a second on the GPU and are responsible for every pixel that gets blasted to our screens. Not only do they deal with pixels; more and more rendering logic and general computation is now being offloaded from game engines into compute shaders for things like physics, collision, particle systems, lens flare systems, sorting, skinning, water, and the list goes on.


In most systems, many of the crucial core shaders are hardwired into the engine and require minimal changes, while other ( project specific ) specialty shaders are beginning to be considered more of an art asset, requiring plenty of iteration, custom functionality and interaction between graphics engineers and artists. With the increased power and flexibility of today's GPUs and the continued evolution of shader languages like HLSL and GLSL, these shaders and rendering effects are becoming increasingly complex and are beginning to require development environments to help prototype, iterate and debug, so that developers and artists can continue to be creative and productive.



Rendering effects these days require a substantial amount of special code to set up things like render/blend/sampler/rasterizer states, resources, render targets, buffers, views, render/compute passes, shaders, render commands, etc. When iterating on any effect, having to add and remove hundreds of lines of setup code over and over becomes extremely time consuming, redundant and error prone, which can stifle iteration and creativity, even when using a development environment like ShaderFlex.


This is where NodeFlex comes into play by helping you hide all of the complexity and redundancy so that you can concentrate on being productive and creative.


NodeFlex is a powerful node-based code generator which can be used to help visualize, automate and manage your shader development process. However, don't be fooled by the visual nature of the tool: NodeFlex was designed specifically to empower developers, not lock them into a closed "black-box" graph system which leads to unmanageable "spaghetti graphs". Quickly edit existing nodes, add your own, edit the way the final shader is generated, and even debug by stepping line by line through the template code as it executes and generates your shader. The template system lets you create, organize and reuse clever code generating logic in individual nodes using a familiar C/C++ style interpreter language. Change some template/shader code or edit an option, and the software executes a function which traverses the graph, accumulates shader code and other data, and spits out your shader to your exact specification.


ShaderFlex will come with a fully customizable DX11-based graph system specifically designed to help ShaderFlex users generate flawless shader code.


ShaderFlex was developed to be as flexible as possible so that users could easily set up and render any type of non-standard multi-stage rendering effect thrown its way. To make sure this would be possible, I took some time during its development to implement some of the most complex rendering techniques I could think of and put ShaderFlex to the test. Each technique required some special functionality and helped shape some of the features now present in ShaderFlex.


Smooth Particle Hydrodynamics ( with multi-substance interactions )



This is a smoothed particle hydrodynamics based water simulation effect that runs entirely on the GPU using compute, vertex, geometry and pixel shaders. It can simulate and render up to 1 million particles, and handles a cool 256k particles at full real-world speed, requiring at least 180 simulation steps per second to maintain stability/accuracy while rendering at a capped rate of 75 fps on a GTX 780 Ti.
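Running the simulation at a fixed 180 steps per second while rendering at 75 fps implies a decoupled, fixed-timestep update loop. A minimal sketch of that idea (plain Python, not ShaderFlex code; the function name and structure are assumptions for illustration):

```python
# Decoupled simulation/render stepping: the simulation advances in
# fixed 1/180 s steps for stability, regardless of the render rate.

SIM_DT = 1.0 / 180.0   # fixed simulation step

def advance(sim_time, frame_dt, accumulator):
    """Consume one frame's worth of wall time in fixed simulation steps."""
    accumulator += frame_dt
    steps = 0
    while accumulator >= SIM_DT:
        accumulator -= SIM_DT
        sim_time += SIM_DT
        steps += 1          # one simulation pass would dispatch here
    return sim_time, accumulator, steps

# A single 75 fps frame consumes 2-3 fixed simulation steps:
t, acc, steps = advance(0.0, 1.0 / 75.0, 0.0)
```

The leftover accumulator carries into the next frame, so no simulation time is ever lost or double-counted.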


It is one of the most complex effects I've thrown together so far and consists of 1100+ lines of shader code and 600+ lines of material definition code to set up resources, views, samplers, shaders, passes, render commands, etc.


click for more info...


The effect was based on a 2D fluid sample from the DirectX SDK which I was using as a quick means to prototype the simulation and animation throttling system that is now part of ShaderFlex. I had a bit too much fun with this shader and once I got it working in ShaderFlex I started adding all kinds of controls, and was even able to implement support for the interaction of multiple types of substances that behave like blood, milk, sand, cotton, goo, snow, foam, etc.


Implementing this type of complex multi-pass effect was one of the biggest tests for ShaderFlex's rendering and simulation system, since it required tons of special buffer resources, shaders, and 20+ rendering passes for initializing data, ping-ponging resources, calculating a dynamic bounding volume for the bucket grid, GPU sorting of particles, particle emitting/killing, simulation, pressure conservation, forces, etc.

Physically Based Rendering / Image Based Lighting



Another good test for ShaderFlex was to see if I could add support for Physically Based Rendering and Image Based Lighting without adding any special functionality for it in the application. I was able to add some initialization-only compute passes that generate BRDF and separated GGX lookups, then process the current environment map using an importance sampling method into each mip level, where each level corresponds to a roughness amount. These mip levels are then sampled in real time during the lighting calculations based on the roughness and other material and lighting attributes.
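The two core ideas above, mapping roughness to a prefiltered mip level and GGX importance sampling, can be sketched as follows. This is a hedged illustration (after Karis' split-sum approach), not the actual ShaderFlex shaders; MIP_COUNT and the linear mip mapping are assumptions:

```python
import math

# Each environment-map mip level corresponds to a roughness value, and
# specular lookups select a mip from roughness at run time.

MIP_COUNT = 8

def roughness_to_mip(roughness):
    """Linear roughness -> mip mapping; mip 0 is the sharpest level."""
    return roughness * (MIP_COUNT - 1)

def ggx_sample_theta(xi, roughness):
    """Polar angle of a GGX importance-sampled half-vector for a
    uniform random xi in [0, 1): low roughness concentrates samples
    tightly around the surface normal."""
    a = roughness * roughness
    cos_theta = math.sqrt((1.0 - xi) / (1.0 + (a * a - 1.0) * xi))
    return math.acos(cos_theta)

# Smooth surfaces sample a much tighter lobe than rough ones:
tight = ggx_sample_theta(0.5, 0.1)
wide  = ggx_sample_theta(0.5, 1.0)
```

During prefiltering, many such directions are generated per texel and the environment map is averaged over them, once per mip/roughness level, in the INIT compute passes.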


click for more info...


Volumetric Fluid Dynamics



More details to come.


click for more info...


Oculus and 3DConnexion Support



One of the most exciting features of ShaderFlex is its support for the Oculus DK1 and DK2 using Direct HMD Access. See the 3D rendering as you would in the viewport, and also directly in your head-mounted display. Each camera can modify its own Oculus control settings.


Complex simulations also run great in 3D because any simulation-based rendering passes are only executed once per frame in the simulation stage before the scene is rendered twice for both eyes.


ShaderFlex also supports all 3DConnexion 6-degree-of-freedom motion controls, which work great together with the Oculus HMDs. Imagine using the SpaceMouse to fly around your 3D world as if you were in a zero-gravity spaceship while using the DK2 to control your head and body from within the cockpit. Makes for some amazing fly-throughs!

Fur Simulation



This is a compute-based fur simulation effect using 12,000 strands of hair with 8 joints each, subdivided further using a Catmull-Rom curve. It reacts naturally to rotation, movement, gravity, wind and other forces. It was one of the first shaders I developed to test out ShaderFlex's compute shader support.
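The Catmull-Rom subdivision step mentioned above can be sketched in a few lines. This is an illustrative scalar version (the function names and segment count are assumptions, not the fur shader itself); the spline passes exactly through each simulated joint, which is why it suits hair:

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Interpolate between p1 and p2 (scalar positions) at t in [0, 1]."""
    return 0.5 * ((2.0 * p1)
                  + (p2 - p0) * t
                  + (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t * t
                  + (3.0 * p1 - p0 - 3.0 * p2 + p3) * t * t * t)

def subdivide(joints, segments=4):
    """Insert `segments` sub-points between each pair of joints,
    clamping the neighbor lookups at the strand's endpoints."""
    out = []
    for i in range(len(joints) - 1):
        p0 = joints[max(i - 1, 0)]
        p1 = joints[i]
        p2 = joints[i + 1]
        p3 = joints[min(i + 2, len(joints) - 1)]
        for s in range(segments):
            out.append(catmull_rom(p0, p1, p2, p3, s / segments))
    out.append(joints[-1])
    return out

# An 8-joint strand becomes a smooth curve of many sub-points;
# here, a 4-joint strand for brevity:
curve = subdivide([0.0, 1.0, 3.0, 4.0])
```

In the real effect the same evaluation happens per axis on the GPU, turning 8 simulated joints into a smooth rendered strand.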


It has a slew of settings and can even clump hairs together, which can give some interesting results. The effect was developed and prototyped from NodeFlex.


click for more info...


More details to come.

Water Simulation ( FFT based )



This is a Fast Fourier Transform ( FFT ) based water simulation I developed, based on a few samples I found online. It supports both a dynamically displaced, horizon-aligned screen-space projected grid and a local world-aligned grid array.
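The core of FFT water is that the height field is the inverse Fourier transform of a wave spectrum, so the whole surface animates by rotating each frequency's phase over time. A hand-rolled 1D sketch (not the actual shaders; N and the spectrum values are made up for brevity, and real implementations use a 2D GPU FFT at 256+ resolution):

```python
import cmath
import math

N = 8  # tiny grid resolution for illustration

def heights(spectrum, t):
    """Inverse DFT of the spectrum, with each wave advanced in time by
    its deep-water dispersion frequency omega = sqrt(g * k)."""
    out = []
    for x in range(N):
        h = 0.0
        for k, amp in enumerate(spectrum):
            omega = math.sqrt(9.81 * (k + 1))            # dispersion
            phase = 2.0 * math.pi * k * x / N - omega * t
            h += (amp * cmath.exp(1j * phase)).real      # sum the waves
        out.append(h)
    return out

# Low frequencies dominate, as in ocean spectra such as Phillips':
spectrum = [0.5, 0.25, 0.1, 0.05, 0.0, 0.0, 0.0, 0.0]
frame0 = heights(spectrum, 0.0)
frame1 = heights(spectrum, 0.1)   # same spectrum, advanced in time
```

Because only the per-frequency phases change each frame, the expensive part is the inverse transform, which is exactly what the GPU FFT passes compute.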


I also added a realistic multi-depth foam parallax effect using the waves' direction and a feedback technique, as well as a fake refracted sea bottom and a pre-scattered cube map from a compute shader for some of the lighting. You'll also notice I used this water in my atmospheric scattering sample below.


The water here is being rendered in a maximized ShaderFlex viewport and controlled using the NodeFlex parameter editor where the effect was developed and iterated on.


click for more info...


The main features stressed by this technique were the use of simulation passes, plenty of compute shaders, some resource ping-ponging, generating dynamic vertex/index buffers for the grid via compute shader, and the ability to have compute shaders run only when the shader loads, compiles or when a parameter is changed ( for example, the Fresnel shader runs only in the INIT render stage, but will also run when you change the water's refraction index ).


Overall this effect is quite complex, and it was a bit confusing trying to make sense of some of the run-time code, as I couldn't get either sample to compile and run. In the end I was able to get my own variation working. It required 4 samplers, 5 unordered access buffers for the compute shaders to fill ( H0, Omega, Spectrum, DXYZ, RadixTemp ), three render targets ( Displacement, Gradient, Foam ), two dynamically precomputed buffers ( Fresnel, Perlin ), a vertex and index buffer for the water grid, 23 shaders ( which I won't have room to list ), 19 individual render passes, and finally 19 executed render passes.

GPU Particle system with GPU depth sorting



After I got the GPU radix sort working for the SPH water simulation, I wanted to try it for sorting particles based on depth, and I was also interested in testing the viability of a 100% GPU-based particle system, where particles are emitted using a compute shader and managed by a quick and clever single dynamic list of active/inactive particle indices.
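The single active/inactive index list can be sketched as a free-list pool. This Python version is illustrative only (class and method names are assumptions; on the GPU the list lives in a buffer updated by a compute shader): dead indices sit behind a split point in one shared array, so emitting pops a free index and killing swaps one back, with no per-frame compaction or allocation.

```python
class ParticlePool:
    """One index array partitioned as [active... | inactive...]."""

    def __init__(self, capacity):
        self.indices = list(range(capacity))
        self.alive = 0                       # split point

    def emit(self):
        """Activate a dead particle; returns its index, or None if full."""
        if self.alive == len(self.indices):
            return None
        idx = self.indices[self.alive]
        self.alive += 1
        return idx

    def kill(self, slot):
        """Deactivate the particle in active slot `slot` by swapping its
        index behind the split point, where it awaits reuse."""
        self.alive -= 1
        self.indices[slot], self.indices[self.alive] = \
            self.indices[self.alive], self.indices[slot]

pool = ParticlePool(4)
a = pool.emit()
b = pool.emit()
pool.kill(0)          # the first particle dies...
c = pool.emit()       # ...and its index is immediately recycled
```

Keeping both states in one list is what makes the scheme cheap: emit and kill are O(1) swaps that a compute shader can perform with atomic counters.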


Another reason for this shader was to test out some per-pixel order-independent transparency functionality I had been working on.


On the left you can see NodeFlex being used to generate the effect. Most of the code, logic and options are built into the Particle System node, but the plan is to abstract things like forces and other attributes and behaviors into their own nodes which can be linked together to quickly come up with cool new particle systems.




click for more info...


The effect uses 9 compute passes ( 4 for sorting ), 1 render pass, 12 shaders ( 4 for sorting ), 6 resource buffers ( 2 for sorting ) and 26 render commands ( 20 for sorting ).


Each particle generates its own random float4 seed when spawned, which is the basis for any number of subsequent random values it may need. The particle system has min/max/falloff settings for things like life, drag, weight, rotation, size, color and other particle characteristics. Each characteristic is dynamically generated using the particle's random seed to give it a truly organic feel.


Some of the particles' behaviors are interpolated while others, like velocity and gravity, are simulated. The main compute shaders are run in the simulation stage at a user-defined rate and the final effect is rendered at a user-defined target frame rate.


More details to come.

Atmospheric Scattering



More details to come.


click for more info...





Grass Rendering
The grass was one of my first more involved shader effects, used to test out geometry, hull and domain shader support as well as ShaderFlex's connection to NodeFlex. Here you see hundreds of thousands of dynamically generated grass blades from a compute/geometry shader, high quality depth of field, bloom streaks, motion blur, etc.


The grass has multiple layers of detail: the main grass, which is always present; some additional cheap wireframe grass to add more detail; and a patch of grass that moves with the player and smoothly blends in when the camera is close to the ground, giving you the illusion that there are always 2-3x more grass blades being rendered.


click for more info...


Lens Flares



More details to come.


click for more info...


HDR Post FXs



More details to come.


click for more info...





Liquify Tool
I developed a clone of Photoshop's Liquify tool to test out ShaderFlex's multi-pass rendering, render target formats, dynamically sized textures, feedback technique and mouse input support.


Just like in Photoshop, you can switch between Smudge, Bulge, Pinch, Heal, Freeze and Thaw modes, and change the pointer's pressure, strength, falloff and radius, as well as the grid color, radius and mask.
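A Liquify-style brush like Bulge boils down to a displacement field that the final pass samples the source image through. A hedged sketch of the brush math only (the function name, falloff curve and constants are assumptions, not the actual shader):

```python
import math

def bulge_offset(px, py, cx, cy, radius, strength):
    """Displacement applied at pixel (px, py) for a bulge brush centered
    at (cx, cy): push radially outward, fading smoothly to zero at the
    brush radius."""
    dx, dy = px - cx, py - cy
    d = math.hypot(dx, dy)
    if d >= radius or d == 0.0:
        return (0.0, 0.0)
    falloff = (1.0 - d / radius) ** 2   # smooth edge
    scale = strength * falloff
    return (dx * scale, dy * scale)

# Pixels inside the brush move outward; pixels outside are untouched:
near = bulge_offset(1.0, 0.0, 0.0, 0.0, 10.0, 0.5)
far  = bulge_offset(20.0, 0.0, 0.0, 0.0, 10.0, 0.5)
```

In the GPU version these offsets accumulate into a displacement texture each frame (the feedback technique mentioned above), which is why strokes persist and compose like Photoshop's.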


The effect is written in fewer than 700 lines of shader and setup code, and uses 2 target resources, 3 texture views, 6 shaders, 4 rendering passes and 4 render pass commands.


click here for the 4K YouTube video


click for more info...