
Add new GamingVM post

master
sn0w 2 weeks ago
parent
commit
132a283013
Signed by: sn0w <sn0w@posteo.de> GPG Key ID: DDEDFB9D3FA15727
2 changed files with 254 additions and 0 deletions
  1. build.sh (+1, -0)
  2. content/blog/2019-06-03_GamingVM-Revisited.md (+253, -0)

+ 1
- 0
build.sh

@@ -88,6 +88,7 @@ fi
 
 if [[ "$1" == "--push" || "$1" == "-p" ]]; then
     echo "#> Pushing..."
+    source .env
     neocities push ./_public
     exit 0
 fi

+ 253
- 0
content/blog/2019-06-03_GamingVM-Revisited.md

@@ -0,0 +1,253 @@
1
+
2
+## Intro
3
+
4
+Hey there, it's time to talk about virtualization again.
5
+Almost 3 years after my initial attempts at squashing my vidya into a VM,
6
+I figured that a follow-up post is more than overdue.<br>
7
+
8
+Enjoy this little writeup of last week's hardware woes and software headaches c:
9
+
10
+## Resources
11
+
12
+Before we start: This is not a tutorial.<br>
13
+If you want to build a VM yourself, here are some helpful links.
14
+
15
+- https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF
16
+- https://wiki.installgentoo.com/index.php/PCI_passthrough
17
+- https://vfio.blogspot.com/
18
+- https://www.reddit.com/r/VFIO
19
+- https://www.youtube.com/watch?v=aLeWg11ZBn0&t=35
20
+
21
+## Hardware
22
+
23
+My setup changed quite a bit since my last post.<br>
24
+Here's what I currently use, screenfetch-style.
25
+
26
+```txt
+OS:  Artix Linux, Kernel 5.1.4
+CPU: AMD Ryzen 7 2700X (8c/16t) @ 4GHz
+GPU: AMD Radeon RX Vega 64 (Guest)
+GPU: NVIDIA GeForce GTX 1060 (Host)
+RAM: 32GB (16GB Guest, 16GB Host)
+```
33
+
34
+My current mainboard (Asus PRIME B350 Plus) is a relic of old ideas.<br>
35
+It was recycled from a cheap NAS that I built in 2018.<br>
36
+Boy was that a bad idea.
37
+
38
+As it turns out, this board is the worst choice you could possibly make for GPU passthrough.
39
+It has two "x16" PCIe slots, but the second one only has enough pins to reach x8.
40
+I'm not kidding you.
41
+They literally didn't solder the other pins on that slot, but still label and sell it as x16.
42
+<br>
43
+
44
+<a href="https://i.imgur.com/kESB8iC.png"><img style="max-width:512px;" src="https://i.imgur.com/kESB8iC.png"/></a>
45
+
46
+## Bootstrapping
47
+
48
+Well whatever, bad mainboards won't stop us, right?
49
+
50
+Turns out if you disable "CSM" and boot once with cables only connected to the *second* GPU,
51
+it will remember that for the following boots. Nice, one step closer.
52
+
53
+From here on I did the usual steps.
54
+
55
+Bound my Vega to vfio-pci:
56
+```txt
+# /etc/modprobe.d/vfio.conf
+options vfio-pci ids=1002:687f,1002:aaf8
+```
60
+
61
+Enabled AVIC and nested virtualization:
62
+```txt
+# /etc/modprobe.d/kvm.conf
+options kvm_amd nested=1 avic=1
+```
66
+
67
+And finally added some new kernel params:
68
+```txt
+amd_iommu=on iommu=pt nvidia-drm.modeset=1
+```
71
+
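With `amd_iommu=on` in place, it can help to confirm that the IOMMU is actually active and to see which devices share a group with the GPU. The following is a minimal sketch and not part of the original setup: the function name `list_iommu_groups` and its optional root argument are made up for illustration, and it only assumes the standard `/sys/kernel/iommu_groups` sysfs layout.

```sh
# Print "group <n>: <pci address>" for every device in every IOMMU group.
# The optional first argument (an alternative sysfs root) is hypothetical
# and only exists so the function can be tried against a fake directory.
list_iommu_groups() {
    local root="${1:-/sys/kernel/iommu_groups}"
    local dev group
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue      # nothing matched: IOMMU is likely off
        group="${dev#"$root"/}"        # strip the root prefix
        group="${group%%/*}"           # keep only the group number
        printf 'group %s: %s\n' "$group" "${dev##*/}"
    done
}
```

Piping the output through `grep` for the GPU's PCI address shows whether anything else lives in the same group. No output at all usually means the IOMMU is not enabled.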
72
+## The VM
73
+
74
+This time I decided to use a more "manual" approach.<br>
75
+I ditched libvirt and wrote a simple script that launches QEMU.
76
+
77
+You can see the whole thing at [glitch.sh](https://glitch.sh/sn0w/GamingVM-Reloaded),
78
+but I'll also walk you through some of the more important content here.
79
+
80
+In order of the file:
81
+
82
+### `echo performance | sudo tee /sys/devices/system/cpu/cpu*/...`
83
+
84
+The VM can't control the CPU clock, so we need to ensure<br>
85
+that Linux doesn't underestimate our workload.
86
+
87
+### `echo 1 | sudo tee /sys/bus/pci/...`
88
+
89
+This is a workaround for Vega. More on that later.
90
+
91
+### `-daemonize`
92
+
93
+Allows closing the terminal without killing the VM.
94
+
95
+### `-machine pc-q35-4.0,accel=kvm,...`
96
+
97
+The guest uses an AMD GPU so we don't need to lie to Windows about the VM.<br>
98
+Yay team red!
99
+
100
+### `-name ...,debug-threads=on`
101
+
102
+`debug-threads` is important because we'll need it later for manual CPU pinning.
103
+
104
+### `-cpu ...`
105
+
106
+`+topoext` is required to use SMT/HT on AMD.
107
+
108
+`host-cache-info=on` will pass the CPU's cache topology instead of emulating something.
109
+
110
+`hv_*` are the usual ["HyperV Enlightenments"](https://blog.wikichoon.com/2014/07/enabling-hyper-v-enlightenments-with-kvm.html).
111
+
112
+### `-smp 8,sockets=1,cores=4,threads=2`
113
+
114
+This sets the CPU core topology.<br>
115
+The Ryzen has 8 cores and 16 threads.<br>
116
+I decided to pass half of it to Windows.
117
+
118
+Usually people use more `cores` with `threads=1`, but here's the thing with Ryzen:<br>
119
+The R7 consists of two "CCX" units, each of which houses 4 cores and has its own cache.<br>
120
+They are glued together with "Infinity Fabric" and can exchange data at roughly 40GB/s.<br>
121
+So it makes sense to pass one complete CCX and expose its SMT topology to increase cache locality.
122
+
123
+Or in a simple picture (Blue Host, Green VM):
124
+
125
+<a href="https://i.imgur.com/Ki7br3M.png"><img style="max-width:425px;" src="https://i.imgur.com/Ki7br3M.png"/></a>
126
+
127
+### `-m 16G ... -mem-path /hugepages/...`
128
+
129
+Memory allocations are an important topic to think about.
130
+This ensures that QEMU utilizes memory that was allocated in contiguous 1GB blocks at boot time,
131
+instead of falling back to the default (possibly fragmented) 4KB pages.
132
+
133
+The kernel params for this are:
134
+```txt
+default_hugepagesz=1G hugepagesz=1G hugepages=16
+```
137
+
138
+Then mount them with:
139
+```
+hugetlbfs /hugepages hugetlbfs defaults 0 0
+```
142
+
143
+This permanently locks 16G away, but the remaining 16G are more than enough for the host.
144
+
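To check that the hugepage reservation actually happened at boot, the counters in `/proc/meminfo` are the place to look. Below is a minimal sketch, not part of the original script: the function name `hugepage_status` and its optional file argument are hypothetical, and it only relies on the `HugePages_Total`, `HugePages_Free` and `Hugepagesize` fields.

```sh
# Print "<total> <free> <size in kB>" for the default hugepage size.
# The optional argument (path to a meminfo-style file) is hypothetical
# and only exists to make the function easy to try on sample data.
hugepage_status() {
    local meminfo="${1:-/proc/meminfo}"
    awk '
        /^HugePages_Total:/ { total = $2 }
        /^HugePages_Free:/  { free  = $2 }
        /^Hugepagesize:/    { size  = $2 }
        END { print total, free, size }
    ' "$meminfo"
}
```

With the kernel params above, the first and third values should read 16 and 1048576 before the VM starts. The free count should drop once QEMU has mapped its memory from `/hugepages`.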
145
+### `-audiodev id=pa,driver=pa,server=...`
146
+
147
+This is simply the modern replacement for `QEMU_AUDIO_DRV` and `QEMU_PA_SERVER`.
148
+
149
+### `-device ioh3420,id=root,...`
150
+
151
+This adds a PCI "root port" that the GPU attaches to.
152
+Otherwise it will seem to Windows like the GPU was connected directly to the root bus,
153
+which will cause QEMU to change the emulated configuration to "Integrated Endpoint".
154
+
155
+This means it looks to Windows like the GPU was physically inside the PCIe controller,
156
+and (more importantly) in this mode [QEMU will omit any link speed configuration](https://github.com/qemu/qemu/blob/a2596aee6c8274daff2357f4e1c406de763cf832/hw/vfio/pci.c#L1859-L1887).
157
+
158
+So TLDR, without this your GPU will likely run much slower.<br>
159
+Not just "slightly slow". We're talking PCIe x1 vs x16.
160
+
161
+### `-object input-linux,...,evdev=`
162
+
163
+Passes keyboard and mouse via PS/2 using evdev.<br>
164
+This allows switching between Guest/Host on the fly by pressing LCtrl-RCtrl.
165
+
166
+### `-device virtio-{mouse,keyboard}-pci,...`
167
+
168
+Passes the keyboard and mouse using VirtIO.<br>
169
+Automatically takes priority over PS/2 in the guest.
170
+
171
+This still uses evdev events but omits a lot of emulation overhead,<br>
172
+and - subjectively - works a lot better and perfectly stutter-free in games.
173
+
174
+YMMV
175
+
176
+## CPU Pinning
177
+
178
+Ok so with the VM up and running, let's talk about CPU pinning.<br>
179
+In libvirt that's rather easy, but it's also doable manually.
180
+
181
+Remember `debug-threads`?<br>
182
+That flag adds some pretty useful information to QEMU's `comm`,<br>
183
+which makes spotting the virtualized CPUs very easy:
184
+
185
+```sh
+# Walk every thread of the QEMU process and pick out the vCPU threads.
+for p in $(pstree -pa "$(pidof qemu-system-x86_64)" | awk -F',' '{print $2}' | awk '{print $1}'); do
+    vcpu="$(cat "/proc/$p/comm")"
+
+    # With debug-threads=on, vCPU threads are named "CPU <n>/KVM".
+    if [[ "$vcpu" != CPU*/KVM ]]; then
+        continue
+    fi
+
+    # pin $p here
+done
+```
196
+
197
+In that loop we can use `taskset` and/or cgroups to assign the "cpu process" to a fixed CPU.
198
+
199
+Make sure to pin the CPUs in the correct order.<br>
200
+For a Ryzen 7 this means:
201
+```
+0=>4, 1=>12, 2=>5, 3=>13, 4=>6, 5=>14, 6=>7, 7=>15 (guest=>host)
+```
204
+Not doing this will mix the two CCXs and cause a lot of lag and generally degraded performance.
205
+
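The pinning step itself could look roughly like this. It is a sketch under assumptions, not the original script: the helper names `host_cpu_for_vcpu`, `vcpu_from_comm` and `pin_vcpu_thread` are invented here, the table is copied from the guest=>host mapping above, and the pinning is done with plain `taskset -cp` rather than cgroups.

```sh
# Guest vCPU -> host CPU table from the mapping above (position = vCPU number).
VCPU_MAP="4 12 5 13 6 14 7 15"

# Print the host CPU that a given guest vCPU should be pinned to.
host_cpu_for_vcpu() {
    local vcpu="$1"
    case "$vcpu" in
        ''|*[!0-9]*) return 1 ;;       # reject anything that is not a number
    esac
    set -- $VCPU_MAP                   # intentional word splitting
    [ "$vcpu" -lt "$#" ] || return 1   # vCPU index outside the table
    shift "$vcpu"
    printf '%s\n' "$1"
}

# Extract the vCPU number from a thread name such as "CPU 3/KVM".
vcpu_from_comm() {
    local comm="$1"
    case "$comm" in
        "CPU "*/KVM) ;;
        *) return 1 ;;                 # not a vCPU thread
    esac
    comm="${comm#CPU }"
    printf '%s\n' "${comm%/KVM}"
}

# Pin a single thread (by TID) if it turns out to be a vCPU thread.
pin_vcpu_thread() {
    local tid="$1" comm vcpu cpu
    comm="$(cat "/proc/$tid/comm")" || return 1
    vcpu="$(vcpu_from_comm "$comm")" || return 0   # skip non-vCPU threads
    cpu="$(host_cpu_for_vcpu "$vcpu")" || return 1
    taskset -cp "$cpu" "$tid"
}
```

Calling `pin_vcpu_thread "$p"` in place of the `# pin $p here` comment in the loop above would apply the mapping. The table needs adjusting for any CPU whose sibling threads are numbered differently.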
206
+Additionally, I isolated the VM CPUs from the rest of the system using kernel params.<br>
207
+This ensures that Linux doesn't consider putting any tasks on these cores, to reduce context switches.
208
+
209
+```txt
+isolcpus=4-7,12-15 nohz_full=4-7,12-15 rcu_nocbs=4-7,12-15
+```
212
+
213
+You could theoretically also move any and all host-pids into a "host" cgroup which only has access to the other 8 cores.
214
+This would allow you to utilize all 16 threads when the VM is off, but it's (imo) a lot more complicated,
215
+and I don't really need more than 4c/8t on Linux anyway.
216
+
217
+Tip: To diagnose context switch problems use
218
+```
+perf record -e 'sched:sched_switch' -C 4-7,12-15
+```
221
+
222
+(Obviously, adapt the `-C` param to your system).
223
+
224
+## VEGA Sadness
225
+
226
+VEGA has something called the "reset issue" where it cannot be used anymore
227
+after the VM shuts down or reboots, until the host power-cycles or goes into standby.
228
+
229
+One workaround that works for me is to only pass through the GPU "function",
230
+and leave the sound device unmapped. That will print some QEMU warnings during startup
231
+but generally lasts for at least 6-10 VM resets without any noticeable side effects.
232
+
233
+This however required the `/remove` and `/rescan` "patch" on my system.<br>
234
+Otherwise the host would eventually lock up.<br>
235
+Don't ask me why, I don't have an answer yet.
236
+
237
+I'll do a follow-up post if I ever find out how to get this working cleanly.
238
+
239
+## Result
240
+
241
+It works well. Pretty well.
242
+
243
+Time Spy reports a graphics score of 7151. (https://www.3dmark.com/3dm/36575667)<br>
244
+Guru3D scored 7.5k with the exact same GPU model (not overclocked),<br>
245
+which means the VM is running at roughly 96% bare-metal GPU performance.
246
+
247
+The CPU score reaches 4336.<br>
248
+Other Ryzen 7 benchmarks usually score ~90% higher, which is expected<br>
249
+when you consider that only half the cores are passed.
250
+
251
+To finish up, here's a final pic of Linux running Windows running CoD Zombies :)
252
+
253
+[![](https://i.imgur.com/gRMa4G4.jpg)](https://i.imgur.com/gRMa4G4.jpg)
