I often hear about NUMA and its performance implications. I even talk about it when people ask me about best practices. Until recently, though, I never had any numbers or proof that it was in fact a major performance issue. Some of you may know that my main PC is a VM running on an ESXi host with PCI passthrough of a GPU and a USB card. I have used this setup for many years. In 2013/2014 I built my first VM PC using a hardware-modded NVIDIA GTX640, following in the footsteps of someone far smarter than I am with this guide. At my first VMworld in 2014 I was in the HOL testing group, and after my testing I logged into my View instance and loaded my desktop with hardware-accelerated 3D graphics over PCoIP. I streamed Tribes: Ascend from my desktop in Atlanta to San Francisco over my Comcast cable uplink of about 20 Mbps. It had a horrible frame rate but was playable. At home I was using a Tera2 client for access, running at about 50 fps in most games. After a while I switched to an SVGA cable and then HDMI for video, plus a passthrough USB card for keyboard, mouse, and other USB devices. I've gone through a few iterations of this across different hosts, video cards, and USB cards. Today I'm using an NVIDIA 1080 for video and have my HTC Vive hooked up for VR.
Long intro, but the point is this: when I wanted to try iRacing, it suggested a minimum requirement of 32 GB of RAM. This was fine until I changed to a new host. I moved from Westmere-EX to Haswell, and the new host came with only 64 GB of RAM, half of what my old host had. This was fine for a short time, and I ended up getting another 64 GB of RAM last week. During the time that I had only 64 GB total, I noticed something odd: the performance of my VM seemed to have dropped. I really noticed it playing Factorio. I have an FPS/UPS display, and it had dropped to the 40s/50s. I thought this was odd, as my old host ran at about the same. This game is generally CPU bound, my new CPUs were faster, and I am running less on them. This was annoying, but I just moved along.
When my new memory finally showed up, I installed it and got everything back up and running. All of a sudden my Factorio FPS/UPS was back at the max of 60. Nothing had changed except the memory configuration. The only difference I can think of is NUMA locality: with twice the memory in each node, my VM fit entirely into one node and performance increased. This was an unexpected benefit, and it has solidified my belief in the things I have been telling people about large VM performance. Long-winded, but I thought it was interesting.
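For anyone who wants to sanity-check their own setup, here is a rough back-of-the-envelope sketch of the "does the VM fit in one node" arithmetic. It is not how ESXi actually makes placement decisions. The assumptions are mine: a two-socket host with memory split evenly across two NUMA nodes, a VM configured with 32 GB, and a made-up `reserved_gb_per_node` value standing in for whatever the hypervisor and other workloads take from each node.

```python
def fits_in_one_node(vm_gb, host_gb, nodes=2, reserved_gb_per_node=1.0):
    """Return True if a VM's memory could sit entirely inside one NUMA node.

    Assumes host memory is split evenly across nodes (an assumption, not
    something read from the host). reserved_gb_per_node is a hypothetical
    placeholder for hypervisor and other per-node overhead.
    """
    per_node_gb = host_gb / nodes
    usable_gb = per_node_gb - reserved_gb_per_node
    return vm_gb <= usable_gb


# Compare the two host memory configurations from the story above,
# assuming a 32 GB VM.
for host_gb in (64, 128):
    verdict = "fits in one node" if fits_in_one_node(32, host_gb) else "spans nodes"
    print(f"{host_gb} GB host: 32 GB VM {verdict}")
```

With those assumed numbers, a 32 GB VM cannot fit inside a 32 GB node once anything else takes a slice of it, so some of its memory ends up on the remote node. Doubling the host to 128 GB gives each node 64 GB, and the whole VM can stay local.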