From Evolution to Expansion and Multi-Threading: The Mile High Overview

The November DirectX SDK update was the first to include some DirectX 11 features for developers to try out. Of course, there is no DX11 hardware yet, but what is included will run on current DX10 hardware under Vista and the Windows 7 beta. Together with Khronos finishing the OpenCL specification last month, this marks two major developments on the path to more general purpose computing on the GPU. DX11 is geared more toward realtime 3D, while OpenCL targets true general purpose data parallel programming (across multiple CPUs and GPUs) distinct from graphics, but both APIs are major milestones in the future history of computing.

There is more than just the compute shader included in DX11, and since our first real briefing about it at this year's NVISION, we've had the chance to do a little more research, reading slides and listening to presentations from SIGGRAPH and GameFest 2008 (from which we've included slides to help illustrate this article). The most interesting things to us are more subtle than just the inclusion of a tessellator or the addition of the Compute Shader, and the introduction of DX11 will also bring benefits to owners of current DX10 and DX10.1 hardware, provided AMD and NVIDIA keep up with appropriate driver support anyway.

Many of the new aspects of DirectX 11 indicate to us that the landscape is ripe for fairly quick adoption, especially if Microsoft brings Windows 7 out sooner rather than later. Adjustments to HLSL (the High Level Shader Language) should make it much more attractive to developers, the fact that DX10 is a subset of DX11 has good transitional implications, and changes that make parallel programming much easier should help as well; together these should go a long way toward helping developers pick up the API quickly. DirectX 11 will be available for Vista, so there won't be as many complications from a lack of users upgrading, and Windows 7 may also inspire Windows XP gamers to upgrade, meaning a larger install base for developers to target.

The bottom line is that while DirectX 10 promised features that could bring a revolution in visual fidelity and rendering techniques, DirectX 11 may actually deliver the goods while helping developers make the API transition faster than we've seen in the past. We might not see techniques that take advantage of the exclusive DirectX 11 features right off the bat, but adoption of the new version of the API itself will go a long way to inspiring amazing advances in realtime 3D graphics.


From DirectX 6 through DirectX 9, Microsoft steadily evolved its graphics programming API from a fixed function vehicle for setting state and moving data structures around to a rich, programmable environment enabling deep control of graphics hardware. The step from DX9 to DX10 was the final break with the old ways, opening up and expanding on the programmability in DX9 to add more depth and flexibility enabled by newer hardware. Microsoft also forced a shift in the driver model with the DX10 transition, both to leave the rest of the legacy behind and to help increase stability and flexibility when using DX10 hardware. But DirectX 11 is different.

Rather than throwing out old constructs in order to move towards more programmability, Microsoft has built DirectX 11 as a strict superset of DirectX 10/10.1, which enables some curious possibilities. Essentially, DX10 code will be DX11 code that chooses not to implement some of the advanced features. On the flipside, DX11 will be able to run on down level hardware. Of course, all of the features of DX11 will not be available, but it does mean that developers can stick with DX11 and target both DX10 and DX11 hardware without the need for two completely separate implementations: they're both the same but one targets a subset of functionality. Different code paths will be necessary if something DX11 only (like the tessellator or compute shader) is used, but this will still definitely be a benefit in transitioning to DX11 from DX10.
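To make the superset idea a little more concrete, here is a minimal sketch of one code base targeting both DX10 and DX11 hardware. It is written in Python purely for brevity and is not DirectX code: `HardwareLevel`, `pick_level`, and `enabled_features` are hypothetical names invented for illustration. The application asks for the best hardware level available, falls back to down level hardware when it has to, and only takes the separate DX11-only path when those features are present.

```python
from enum import IntEnum


class HardwareLevel(IntEnum):
    """Hypothetical hardware capability tiers, ordered low to high."""
    DX10 = 100
    DX10_1 = 101
    DX11 = 110


def pick_level(supported_by_gpu, preferred):
    """Return the first level in `preferred` that the GPU supports.

    `preferred` is ordered from most to least desirable, so the
    application naturally falls back to down level hardware.
    """
    for level in preferred:
        if level in supported_by_gpu:
            return level
    raise RuntimeError("no usable hardware level")


def enabled_features(level):
    """List the features a single code base would switch on at `level`."""
    features = ["vertex shader", "geometry shader", "pixel shader"]
    if level >= HardwareLevel.DX11:
        # DX11-only features still need their own code path.
        features[1:1] = ["hull shader", "tessellator", "domain shader"]
        features.append("compute shader")
    return features
```

The shared part of the renderer stays identical in both cases; only the branch guarded by the level check differs, which is the sense in which DX10 code is DX11 code that leaves the advanced features alone.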

Running on lower spec'd hardware will be important, and this could make the transition from DX10 to DX11 one of the fastest we have ever seen. In fact, with lethargic movement away from DX9 (both by developers and consumers), the rush to bring out Windows 7, and slow adoption of Vista, we could end up looking back at DX10 as merely a transitional API rather than the revolutionary paradigm shift it could have been. Of course, Microsoft continues to push that the fastest route to DX11 is to start developing DX10.1 code today. With DX11 as a superset of DX10, this is certainly true, but developer time will very likely be better spent putting the bulk of their effort into a high quality DX9 path with minimal DX10 bells and whistles while saving the truly fundamental shifts in technique made possible by DX10 for games targeted at the DX11 hardware and timeframe.

We are especially hopeful about a faster shift to DX11 because of the added advantages it will bring even to DX10 hardware. The major benefit I'm talking about here is multi-threading. Yes, eventually everything will need to be drawn, rasterized, and displayed (linearly and synchronously), but DX11 adds multi-threading support that allows applications to simultaneously create resources or manage state and issue draw commands, all from an arbitrary number of threads. This may not significantly speed up the graphics subsystem (especially if we are already very GPU limited), but this does increase the ability to more easily explicitly massively thread a game and take advantage of the increasing number of CPU cores on the desktop.

With 8 and 16 logical processor systems coming soon to a system near you, we need developers to push beyond the very coarse grained and heavy threads they are currently using that run well on two core systems. The cost/benefit of developing a game that is significantly assisted by the availability of more than two cores is very poor at this point. It is too difficult to extract enough parallelism to matter on quad core and beyond in most video games. But enabling simple parallel creation of resources and display lists by multiple threads could really open up opportunities for parallelizing game code that would otherwise have remained single threaded. Rather than one thread to handle all the DX state change and draw calls (or very well behaved and heavily synchronized threads sharing the responsibility), developers can more naturally create threads to manage types or groups of objects or parts of a world, opening up the path to the future where every object or entity can be managed by its own thread (which would be necessary to extract performance when we eventually expand into hundreds of logical cores).
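The threading pattern described above can be sketched roughly as follows. This is a toy model in Python, not DirectX code, and every name in it (`record_commands`, `render_frame`, the command strings) is a hypothetical stand-in: each worker thread records the state changes and draws for one group of objects into its own private list, and only the final submission step runs serially.

```python
from concurrent.futures import ThreadPoolExecutor


def record_commands(group_name, objects):
    """Record (not execute) the state changes and draws for one group.

    Each call builds its own private list, so worker threads need no
    locking while they record.
    """
    commands = [f"set_state:{group_name}"]
    commands += [f"draw:{obj}" for obj in objects]
    return commands


def render_frame(groups, max_workers=4):
    """Record every group in parallel, then submit serially in order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, whichever thread finishes first.
        recorded = list(pool.map(lambda item: record_commands(*item), groups.items()))

    submitted = []
    for command_list in recorded:
        # Only this final step has to be linear and synchronous.
        submitted.extend(command_list)
    return submitted
```

Calling `render_frame({"terrain": ["hill", "river"], "actors": ["player"]})` returns the terrain commands followed by the actor commands, regardless of which worker finished recording first. The design point is that the threads never share a command list, so the heavy synchronization the paragraph above complains about is confined to one ordered hand-off at the end.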

The fact that Microsoft has planned multi-threading support for DX11 games running on DX10 hardware is a major bonus. The only caveat here is that AMD and NVIDIA will need to do a little driver work for their existing DX10 hardware to make this work to its fullest extent (it will "work" even without a driver change, just not as well). Of course, we expect that NVIDIA and especially AMD (as they are also a multi-core CPU company) will be very interested in making this happen. And, again, this provides major incentives for game developers to target DX11 even before DX11 hardware is widely available or deployed.

All this is stacking up to make DX11 look like the go-to technology. The additions to and expansions of DX10, the timing, and the ability to run on down level hardware could create a perfect storm for a relatively quick uptake. By relatively quick, we are still looking at years for pervasive use of DX11, but we expect that the attractiveness of the new features and benefit to the existing install base will provide a bigger motivation for game developers to transition than we've seen before.

If only Microsoft would (and could) back-port DX11 to Windows XP, there would be no reason for game developers to maintain legacy code paths. I know, I know, that'll never (and can't by design) happen. While we wholeheartedly applaud the idea of imposing strict minimum requirements on hardware for a new operating system, unnecessarily cutting off an older OS at the knees is not the way to garner support. If Windows 7 ends up being a more expensive Vista in a shiny package, we may still have some pull towards DX9, especially for very mainstream or casual games that tend to lag a bit anyway (and, as some readers have pointed out, because consoles will still be DX9 for the next few years). It's in these incredibly simple but popular games and console games that the true value of amazing realtime 3D graphics could be brought to the general computing populace, but craptacular low end hardware and limiting API accessibility on popular operating systems further contribute to the retardation of graphics in the mainstream.

But that's the overview. Let's take some time to drill down a bit further into some of the technology.

  • Hrel - Sunday, February 1, 2009

    This is one of the most poorly written articles I've ever read on anandtech. It's like the author couldn't organize his thoughts properly. Also, the speculation was riddled with subjective assumptions. I'm not sure if the author just doesn't know this topic very well or if he hadn't slept in 3 days, but this could have been done much better. Great topic though, and interesting subject matter.
  • GourdFreeMan - Sunday, February 1, 2009

    Derek, the DX10 geometry shader was never really intended to do tessellation, and really should not be thought of as a generalized tessellator. It was designed to offer a generalized hardware implementation of vertex effects such as skinning, vertex blending and tweening (see the dolphin demo in the DX SDK for what I am referring to here).

    If it becomes desirable at some point in the future to offer fully programmable tessellation, then vertex shader, hull shader, tessellator, domain shader and geometry shader could all be merged into another compute shader earlier in the pipeline to do generalized vertex manipulation.

    Of course, it is also possible that the existing tessellator will prove more efficient as fixed function hardware, and only minor functionality improvements will be added.
  • eXistenZ - Sunday, February 1, 2009

    I just wanted to add that ATi graphics cards have included a tessellator since the Radeon 8500, but I could be wrong...
    I remember the "TruForm" technology, which worked in Serious Sam, Return to Castle Wolfenstein, and Counter-Strike 1.6 (it no longer works in Counter-Strike).

    I want to know whether the author of this article forgot about it, or whether I'm wrong about this technology.

    Sorry for my English, I'm from Slovakia :)
  • haukionkannel - Sunday, February 1, 2009

    There has been a tessellation unit in ATI cards for some time. It's not the same as what is required in DX11, but quite near. I think that it was mentioned in the article...

    From what I know, DX10 has been slow because in most games it's just DX9 with some DX10 features glued on top. With a pure DX10 code path it would have been faster, but that would have left all those XP customers out, and would not have been sound economically...
    The author hopes that Win7 will encourage the move away from XP, so there will be a larger number of DX10 and DX11 platforms. So it would become economically possible to make DX11 based games (just leaving out some pure DX11 features so that older DX10 cards could handle the games). So when DX11 games come out, they would actually be the first to make use of all the DX10 features...
    Well, there are so many DX9 machines in the world that even that will take time. So we will see poor DX10 and DX11 performance until the XP customers are no longer needed by game companies, and even then there are those pure console ports without any optimization, like GTA...
    I hope that "Shattered Horizon" from Futuremark shows what DX10 can really do. It is going to be a pure DX10 game, so it can use the advantages that DX10 can offer. On the other hand, it could be the next Crysis that looks really good, but makes your hardware moan for more power. We will see...
  • yyrkoon - Sunday, February 1, 2009

    "On the flipside, DX11 will be able to run on down level hardware."

    Um . . . Eh ? English ?

    "This may not significantly speed up the graphics subsystem (especially if we are already very GPU limited), but this does increase the ability to more easily explicitly massively thread a game and take advantage of the increasing number of CPU cores on the desktop. "

    ... and significantly slow things down even further.

    " These code resources are huge and can be hard to manage without OOP (Object Oriented Programming) constructs. But there are some differences to how things work in other OOP languages. "

    I think you would find many experienced programmers who would say that OOP is a way of programming, not necessarily a language type, and I would have to agree with them. Now if you mean languages that *support* OOP, then sure, I can live with that.

    Also, one other minor thing that kind of bothers me. You speak of DirectX 6, but was DirectX 6 an actual redistributable? I definitely do not remember it, but I *do* remember DirectX 5, DirectX 7, 8, . . . and even that thing MS claims never existed . . . WinG.
  • DerekWilson - Friday, February 6, 2009

    down level hardware == hardware that meets a lower DX spec (like DX10 hardware).

    allowing games to be more multithreaded using a fine grained synchronization scheme a la DX11 should not slow things down if developers take advantage of it correctly (which will be much easier than doing your own management here).

    yes i did mean languages that support the OOP model.

    DX6 was a Win98 thing ... it existed and actually was (iirc) the first version of DX to be hardware accelerated ... at least that's how I remember it.

    DX4, on the other hand, never existed -- MS skipped from DX3 to DX5.
  • frozentundra123456 - Sunday, February 1, 2009

    I was initially unhappy with both Vista and DX10. However, I have come to accept Vista, though I don't know if it is that much improved over WinXP. I only have Vista because I bought a new computer with that OS installed. I don't really know of anything I do with Vista that could not be done with XP. The only advantage to Vista is that it is supposedly more secure than XP, but I never had any major security problems with XP, nor have I had any with Vista.
    DX10 is still more of a disappointment to me. It requires too many resources and does not seem to offer corresponding improvements in visual quality. Nearly every game I have that is DX10 compliant, I run in DX9 mode because the performance improvement in DX9 more than makes up for the slight visual improvement with DX10. (Yes, I know I need a better graphics card.) I have an HD2600 Pro, which was supposedly a "mid range" DX10 card when it came out, and it is virtually worthless for trying to play in DX10 mode, as I stated above.
    I wonder if DX9 will still be supported when DX11 comes out. If not, they had better make DX11 run better on low to midrange hardware than DX10, or there will be a lot of unhappy users.
  • epyon96 - Saturday, January 31, 2009

    Since Derek claims that DirectX 11 is simply a superset of DX10, why doesn't Microsoft release it simply as 10.2 instead? I am curious what makes a DirectX version and what determines an incremental move forward.
  • ltcommanderdata - Saturday, January 31, 2009

    I'd like to know that too. Since to me DX9.0c (SM3.0) seems to have been a pretty major step forward from DX9.0 (SM2.0), even a whole new shader model, yet it was only given a letter subscript. It should have at least been DX9.1.

    My cynical view? It's all marketing and Microsoft appeasing hardware vendors for their own benefit. For example DX8.1 was supposed to be a decent step forward, going from SM1.1 to SM1.4 with longer shaders and other features. Yet nVidia refused to support SM1.4 and managed to convince Microsoft to call SM1.3 DX8.1 compliant even though it's closer to SM1.1 than SM1.4. My suspicion is that Microsoft agreed with nVidia, because at that time nVidia was making the GPU for the XBox and Microsoft needed them.

    A similar situation occurred with DX9.0c and SM3.0. This time ATI wasn't going to offer immediate support for SM3.0 in their GPUs. So in order for ATI's X8xx generation to not look so far behind, SM3.0 was only marketed as DX9.0c instead of DX9.1 or something more major. Why would Microsoft appease ATI? Conveniently, ATI was making the GPU for Microsoft's next-gen XBox 360, so Microsoft needed them.

    This might not actually be true, but it's interesting that the swings in XBox GPU choice correspond with Microsoft's degree of emphasis on DirectX capability.

    In the case of DX11, I think there are sufficient new capabilities with Tessellation and Compute Shaders to justify a major number increase. I believe what Derek means is that DX11 is a superset of DX10 in the same way DX9 is a superset of DX8. They both offer backwards compatibility. In contrast, DX10 is not compatible with DX9, and Vista actually has separate DX10 and DX9 APIs (and a third, Vista-specific DX9.0L), while DX8, DX7, etc. can run on the DX9 API.
  • GourdFreeMan - Sunday, February 1, 2009

    Microsoft originally had some soft guidelines in this respect. Letter releases were to represent minor changes in the API such as the range and precision allowed for constants, max number of loop iterations in pixel and vertex shaders, etc. Point releases would permit added functionality to stages of the rendering pipeline. Version releases could include changes to the rendering pipeline itself. In practice, point and letter releases have been to support vendor-specific functionality, and version releases have set a baseline for all vendors.

    Microsoft's guidelines fit for all DirectX changes except 9.0c, which was really a vendor-specific change to fit the nVIDIA 6000 series hardware. (ATi did not have SM3.0 cards until its next hardware generation).

