Increased Thread Limit for Compile Tools
The maximum number of threads used by both VVIS and VRAD is limited to 16 because both tools share the same threading code, threads.cpp. This limitation has been part of the Source SDK since 2006 and reflects the processors common in that era. If you run an older version of Source (2013 or older), you can compile these tools with a new limit yourself, as the source code is available. For a newer version, you will need to patch the binaries. Warning: Spawning more threads will not necessarily decrease your compile time. These DLLs are meant for machines that have more than 8 physical cores!
Patched DLLs for CS:GO SDK
If you require more threads for compiling CS:GO maps, you can download the patched DLLs below. Please take note of the disclaimer and back up your original vvis_dll.dll and vrad_dll.dll before you replace them.
Replace your vvis_dll.dll and vrad_dll.dll in SteamApps\common\Counter-Strike Global Offensive\bin. The inner workings are explained in the section below for those who are interested.
Patching the DLLs
The maximum number of threads used by both VVIS and VRAD is limited to 16 using preprocessor directives.
#define MAX_TOOL_THREADS 16
#include "threads.h"
#define MAX_THREADS 16
CRunThreadsData g_RunThreadsData[MAX_THREADS];
HANDLE g_ThreadHandles[MAX_THREADS];
if ( numthreads > MAX_TOOL_THREADS )
    numthreads = MAX_TOOL_THREADS;
This means we will need to change three things:
- Patch the if statement so that it checks for > 32 instead of > 16 (easy).
- Increase the size of the .data memory segment to make room for a bigger g_RunThreadsData array.
- Redirect all references to both arrays so that they point to the new memory addresses.
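For comparison, here is a minimal sketch of what the same change looks like when the tools can be rebuilt from source, as in the Source 2013 case mentioned earlier. The constant names and the clamp come from the snippet above. The wrapper function ClampThreadCount is hypothetical and exists only to make the clamp testable in isolation; it is not part of the SDK.

```cpp
#include <cassert>

// Raised from the stock value of 16. Both constants have to grow together,
// otherwise the clamp would allow more threads than the arrays can hold.
constexpr int MAX_TOOL_THREADS = 32;
constexpr int MAX_THREADS = 32;

static_assert(MAX_THREADS >= MAX_TOOL_THREADS,
              "per-thread arrays must cover every thread the clamp allows");

// Hypothetical helper mirroring the clamp from the snippet:
//   if ( numthreads > MAX_TOOL_THREADS ) numthreads = MAX_TOOL_THREADS;
int ClampThreadCount(int numthreads)
{
    assert(numthreads > 0);
    if (numthreads > MAX_TOOL_THREADS)
        numthreads = MAX_TOOL_THREADS;
    return numthreads;
}
```

Requesting 64 threads would therefore yield 32, while any request at or below 32 passes through unchanged. The binary patch described below achieves the same effect without recompiling.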
We can reuse the previous memory address of g_RunThreadsData (16 * (12 + 4) bytes) for the enlarged g_ThreadHandles array. Start off by increasing the .data segment size with CFF Explorer (luckily there is some space between .data and _RDATA).
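The size bookkeeping behind this step can be checked with a few constants. This is a sketch under an assumption: it reads the "12 + 4" above as 12 bytes per CRunThreadsData entry and 4 bytes per HANDLE, which is what a 32-bit build would give. All names below are hypothetical.

```cpp
#include <cstddef>

// Assumed 32-bit element sizes, taken from the "12 + 4" in the text.
constexpr std::size_t kRunThreadsDataEntry = 12; // one CRunThreadsData element
constexpr std::size_t kHandleEntry = 4;          // one HANDLE

constexpr std::size_t ArrayBytes(std::size_t entries, std::size_t entrySize)
{
    return entries * entrySize;
}

constexpr std::size_t kOldRunData = ArrayBytes(16, kRunThreadsDataEntry); // 192 bytes
constexpr std::size_t kOldHandles = ArrayBytes(16, kHandleEntry);         // 64 bytes
constexpr std::size_t kNewRunData = ArrayBytes(32, kRunThreadsDataEntry); // 384 bytes
constexpr std::size_t kNewHandles = ArrayBytes(32, kHandleEntry);         // 128 bytes

// Both original arrays together occupy 16 * (12 + 4) = 256 bytes.
static_assert(kOldRunData + kOldHandles == 16 * (12 + 4), "old footprint");

// The enlarged handle array fits into the slot vacated by g_RunThreadsData.
static_assert(kNewHandles <= kOldRunData, "handles fit in the old slot");
```

Under these assumptions the .data segment has to grow by at least 384 bytes to hold the relocated g_RunThreadsData, while the 128-byte handle array fits comfortably into the 192 bytes that g_RunThreadsData used to occupy.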
Now launch IDA Free/Pro and find all cross-references to 'CreateThread'. Search for the subroutine that looks like Figure 1 (g_RunThreadsData & g_ThreadHandles will not be named that way).
Leave the graph mode (Space) and enable opcodes. The line marked in Figure 2 loads the register used for the if statement from the C++ code. Using your favorite hex editor, find BE 10 00 00 00 (Warning: NOT unique) and replace it with BE 20 00 00 00 (0x20 = 32 decimal). Congrats, we have now completed our first task.
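Because the byte sequence is not unique, a blind "replace all" in the hex editor would corrupt unrelated instructions. The following sketch lists every occurrence first and then patches only the offset you have confirmed in IDA. The helper names FindAll and PatchAt are hypothetical, and the byte buffer is assumed to hold the DLL contents already read into memory.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Returns the offset of every occurrence of `pattern` inside `data`.
std::vector<std::size_t> FindAll(const Bytes& data, const Bytes& pattern)
{
    std::vector<std::size_t> hits;
    if (pattern.empty())
        return hits;

    auto it = data.begin();
    while ((it = std::search(it, data.end(), pattern.begin(), pattern.end())) != data.end())
    {
        hits.push_back(static_cast<std::size_t>(it - data.begin()));
        ++it; // continue searching one byte further
    }
    return hits;
}

// Overwrites `from` with `to` at `offset`, but only if the bytes found there
// still equal `from`. Returns false and leaves `data` untouched otherwise.
bool PatchAt(Bytes& data, std::size_t offset, const Bytes& from, const Bytes& to)
{
    if (from.size() != to.size() || offset > data.size() || from.size() > data.size() - offset)
        return false;

    const auto pos = data.begin() + static_cast<std::ptrdiff_t>(offset);
    if (!std::equal(from.begin(), from.end(), pos))
        return false;

    std::copy(to.begin(), to.end(), pos);
    return true;
}
```

A typical use would be to call FindAll with { 0xBE, 0x10, 0x00, 0x00, 0x00 }, compare the reported offsets against the file offset of the marked instruction, and then call PatchAt for that single offset with { 0xBE, 0x20, 0x00, 0x00, 0x00 } as the replacement.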
Now jump to the memory address of g_RunThreadsData (double-click unk_XXXXXXX), then copy and save that address (for example .data:12461E50). Go to the end of the old .data memory segment (you should have increased the size of .data with CFF Explorer some steps before) and copy that memory address as well. Find all cross-references to 'g_RunThreadsData' (unk_XXXXXXX) and, using the hex editor and the opcodes, change each reference so that it points to the new memory address.
Then replace each reference to 'g_ThreadHandles' that you can find via cross-references with the old memory address of g_RunThreadsData. After this you are done.
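When retyping addresses in the hex editor, keep in mind that x86 stores 32-bit values in little-endian order, so the bytes appear reversed compared to how IDA prints the address. The sketch below shows the conversion, assuming the instruction operand holds the absolute 32-bit address. The helper name ToLittleEndian is hypothetical.

```cpp
#include <array>
#include <cstdint>

// Splits a 32-bit address into the byte order used inside x86 instructions
// (least significant byte first).
std::array<std::uint8_t, 4> ToLittleEndian(std::uint32_t address)
{
    return {{
        static_cast<std::uint8_t>(address & 0xFF),
        static_cast<std::uint8_t>((address >> 8) & 0xFF),
        static_cast<std::uint8_t>((address >> 16) & 0xFF),
        static_cast<std::uint8_t>((address >> 24) & 0xFF),
    }};
}
```

For the example address 12461E50, the bytes to look for in the opcode view are 50 1E 46 12.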
32 vCores (1.8 GHz), 224 GB RAM, 500 GB SSD, CoreOS (32 threads, DigitalOcean)
4 physical cores (3.40 GHz), 32 GB RAM, 3 TB HDD, Debian (8 threads, Hetzner)
The tests showed that raising the thread count above 16 lowers the average CPU usage to about 70% (Figure 3) and shortens the overall compile time by only about 4.5%. The reason is that towards the end only a couple of vCores are still working on difficult work units (WU) while the others sit idle (Figure 3). Other possible influences on performance: Wine (used to run VVIS on Linux) and the VPS itself (multiple vCores sharing the same physical core).
- 32 vCores (32 threads, ~70% Usage): 43 minutes, 10 seconds elapsed
- 32 vCores (32 threads, 16 vCores @ ~100% Usage (unpatched VVIS)): 45 minutes, 23 seconds elapsed
- 4 physical Cores (8 threads, ~100% Usage): 52 minutes, 53 seconds elapsed
- 4 physical cores (3.4GHz) (8 threads, ~100% Usage) + 4x4 vCores (VMPI): 47 minutes, 1 second elapsed
- 16 vCores (High Memory) (16 threads, ~90% Usage): 55 minutes, 11 seconds elapsed
- 16 vCores (Standard) (16 threads, ~80% Usage): 1 hour, 9 minutes, 42 seconds elapsed
- 10 x 4 vCores (Standard) (VMPI, ~100% Usage): 1 hour, 12 minutes, 3 seconds elapsed
- 8 vCores (Standard) (8 threads, ~90% Usage): 1 hour, 45 minutes, 19 seconds elapsed
- 4 vCores (Standard) (4 threads, ~100% Usage): 1 hour, 52 minutes, 10 seconds elapsed
Throwing more cores at the problem does not help. Optimize your maps before compiling!
These DLLs are based on their respective DLLs from the CS:GO SDK (vvis_dll.dll & vrad_dll.dll), which we all know and trust not to destroy our PCs. That said, use them at your own risk. No responsibility is accepted for damage to your computer or games that may occur while using these patched DLLs.