Fix GPU watchdog crashes on Windows when loading a large AI model by running the included PowerShell registry script.
Recover hidden VRAM lost to WSL2 overhead by switching to a native Windows model server and connecting Linux tools over the local network.
Follow the checklist to reliably serve a large language model on a Windows 11 PC with an NVIDIA GPU that also drives the display.
Requires Windows 11 with an NVIDIA GPU that also drives the display. A reversible registry change must be applied via PowerShell before loading large models.
This repository is a practical guide for running large AI language models locally on a Windows PC, aimed at people who have been told to use Mac or Linux instead. The authors argue that the old advice against Windows was based on two specific problems that now have straightforward fixes.

The first problem is desktop crashes when loading a large model. Windows has a built-in watchdog (Timeout Detection and Recovery, or TDR) that monitors the graphics card and resets it if any single operation takes longer than two seconds. Loading a 20-gigabyte model can easily exceed that, causing the screen to go black or the machine to reboot. The fix is a small registry change that raises the timeout to 20 seconds, giving the GPU time to finish loading without Windows mistaking the load for a hang. The change is reversible, and the guide includes a PowerShell script that backs up the relevant settings before changing anything.

The second problem affects people who run Linux tools inside Windows through the WSL2 compatibility layer. On newer NVIDIA graphics cards, WSL2 consumes roughly 16 gigabytes of video memory for internal overhead that does not show up in standard memory monitoring tools. As a result, software runs out of memory while loading a model even though the card appears to have plenty of free space. The fix is to run the AI model directly in Windows rather than through the Linux layer, and to connect to it over the local network from any Linux tools that need it.

The guide covers how to serve a model after applying the fixes, how much video memory to reserve for the desktop, a troubleshooting table for common errors, and a closing checklist that summarizes every step. It is written specifically for Windows 11 with NVIDIA graphics cards that also drive the display. The authors note that they run large models daily on this same kind of setup, with no instability after applying these changes.
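To make the watchdog fix concrete, here is a minimal sketch of the kind of registry change described above. It is not the repository's PowerShell script. It assumes the timeout in question is the `TdrDelay` value under the `GraphicsDrivers` key, as documented by Microsoft, and it only builds `reg.exe` command lines without running them. The function names and the backup file name `tdr-backup.reg` are hypothetical.

```python
# Sketch only: builds reg.exe commands as argument lists; nothing is executed.
# Assumption: the watchdog timeout is the TdrDelay value (REG_DWORD, seconds)
# documented by Microsoft. The repository's own script may set other values too.

GRAPHICS_DRIVERS_KEY = r"HKLM\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"


def build_tdr_commands(delay_seconds=20, backup_path="tdr-backup.reg"):
    """Return two commands: export a backup of the key, then set TdrDelay."""
    if not isinstance(delay_seconds, int) or delay_seconds <= 0:
        raise ValueError("delay_seconds must be a positive integer")
    # Back up the whole key first so the previous state is on disk.
    backup = ["reg", "export", GRAPHICS_DRIVERS_KEY, backup_path, "/y"]
    # Raise the GPU timeout from the 2-second default to delay_seconds.
    set_delay = [
        "reg", "add", GRAPHICS_DRIVERS_KEY,
        "/v", "TdrDelay", "/t", "REG_DWORD",
        "/d", str(delay_seconds), "/f",
    ]
    return [backup, set_delay]


def build_revert_command():
    """Return a command that deletes TdrDelay, restoring the Windows default."""
    return ["reg", "delete", GRAPHICS_DRIVERS_KEY, "/v", "TdrDelay", "/f"]


if __name__ == "__main__":
    # Print the commands for review instead of running them.
    for args in build_tdr_commands() + [build_revert_command()]:
        print(" ".join(args))
```

The printed commands would need an elevated prompt, and Windows generally needs a restart before it picks up changes to these values. Deleting `TdrDelay` is used for the revert because importing a backup does not remove a value that was absent when the backup was taken.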
Verify against the repo before relying on details.