onlyterp/windows-is-fine-for-llms

★ 12
PowerShell
Audience · developer
Complexity · 2/5
Setup · moderate

Mindmap

mindmap
  root((windows-is-fine-for-llms))
    Problems Solved
      GPU watchdog crashes
      WSL2 VRAM overhead
    Fixes
      Registry timeout tweak
      Run model natively
    Guide Contents
      Serving a model
      VRAM reservation
      Troubleshooting table
    Requirements
      Windows 11
      NVIDIA GPU
      PowerShell

Things people build with this

USE CASE 1

Fix GPU watchdog crashes on Windows when loading a large AI model by running the included PowerShell registry script.

USE CASE 2

Recover hidden VRAM lost to WSL2 overhead by switching to a native Windows model server and connecting Linux tools over the local network.

USE CASE 3

Follow the checklist to reliably serve a large language model on a Windows 11 PC with an NVIDIA GPU that also drives the display.

Tech stack

PowerShell

Getting it running

Difficulty · moderate
Time to first run · 30 min

Requires Windows 11 with an NVIDIA GPU that drives the display. A reversible registry change must be applied via PowerShell before loading large models.

No license information was mentioned in the explanation.

In plain English

This repository is a practical guide for running large AI language models locally on a Windows PC, aimed at people who have been told to use Mac or Linux instead. The authors argue that the old advice against Windows was based on two specific problems that now have straightforward fixes.

The first problem is desktop crashes when loading a large model. Windows has a built-in watchdog that monitors the graphics card and resets it if any single operation takes longer than two seconds. Loading a 20 gigabyte model can easily exceed that, causing the screen to go black or the machine to reboot. The fix is a small registry change that raises the timeout to 20 seconds, giving the GPU time to finish loading without Windows mistakenly treating it as a hang. The change is reversible, and the guide includes a PowerShell script that backs up the relevant settings before making any changes.

The second problem affects people running Linux tools inside Windows using a compatibility layer called WSL2. On newer NVIDIA graphics cards, that compatibility layer consumes roughly 16 gigabytes of video memory for internal overhead that does not show up in standard memory monitoring tools. This causes software to run out of memory while trying to load a model, even when the card appears to have plenty of space free. The fix is to run the AI model directly in Windows rather than through the Linux layer, and simply connect to it over the local network from any Linux tools that need it.

The guide covers how to serve a model after applying the fixes, how much video memory to reserve for the desktop, a troubleshooting table for common errors, and a checklist at the end summarizing all steps. It is written specifically for Windows 11 with NVIDIA graphics cards that also drive the display, and the authors note that the same graphics card runs large models daily in their own setup with no instability after applying these changes.
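
The two fixes described above can be sketched in Python. This is an illustration under stated assumptions, not the repository's own PowerShell script. It assumes the watchdog timeout in question is the standard Windows `TdrDelay` value under `HKLM\SYSTEM\CurrentControlSet\Control\GraphicsDrivers` (the summary does not name the registry value), and the helper names `read_tdr_delay`, `set_tdr_delay`, `usable_vram_gb`, and `model_fits` are hypothetical. The registry helpers only work on Windows, writing requires an elevated prompt, and Windows picks up the new value after a reboot.

```python
# Assumption: the timeout the guide raises is the standard TdrDelay value.
TDR_KEY = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
TDR_VALUE = "TdrDelay"       # seconds before Windows resets the GPU
DEFAULT_TDR_SECONDS = 2      # Windows default when the value is absent


def read_tdr_delay():
    """Return the configured TdrDelay in seconds, or the default if unset."""
    import winreg  # Windows-only stdlib module, imported lazily
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, TDR_KEY) as key:
        try:
            value, _value_type = winreg.QueryValueEx(key, TDR_VALUE)
            return int(value)
        except FileNotFoundError:
            return DEFAULT_TDR_SECONDS


def set_tdr_delay(seconds=20):
    """Write TdrDelay as a DWORD. Needs admin rights and a reboot."""
    import winreg
    if not isinstance(seconds, int) or seconds <= 0:
        raise ValueError("seconds must be a positive integer")
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, TDR_KEY, 0,
                        winreg.KEY_SET_VALUE) as key:
        winreg.SetValueEx(key, TDR_VALUE, 0, winreg.REG_DWORD, seconds)


def usable_vram_gb(total_gb, hidden_overhead_gb=0.0, desktop_reserve_gb=0.0):
    """VRAM left for a model after hidden overhead and a desktop reserve."""
    return max(total_gb - hidden_overhead_gb - desktop_reserve_gb, 0.0)


def model_fits(model_gb, total_gb, hidden_overhead_gb=0.0,
               desktop_reserve_gb=0.0):
    """True if the model's weights fit in the usable VRAM (weights only)."""
    usable = usable_vram_gb(total_gb, hidden_overhead_gb, desktop_reserve_gb)
    return model_gb <= usable
```

As a worked example on a hypothetical 32 GB card, `model_fits(20, 32, hidden_overhead_gb=16)` is false, because only 16 GB remains after the WSL2 overhead the guide describes, while `model_fits(20, 32, desktop_reserve_gb=2)` is true when the model runs natively with an assumed 2 GB desktop reserve. Calling `read_tdr_delay()` and noting the result before `set_tdr_delay(20)` gives a simple way to restore the old value later.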

Copy-paste prompts

Prompt 1

Run the windows-is-fine-for-llms PowerShell script to back up my GPU registry settings and raise the watchdog timeout to 20 seconds, then verify the change was applied.

Prompt 2

I have an NVIDIA RTX 4090 on Windows 11 and my screen goes black when loading a 20GB model. Walk me through the TDR timeout fix from windows-is-fine-for-llms.

Prompt 3

Using the windows-is-fine-for-llms guide, set up a native Windows model server and connect it to my WSL2 environment over the local network so I can use Linux-based tools with it.

Prompt 4

How much VRAM should I reserve for the desktop display according to the windows-is-fine-for-llms guide, and how do I set that limit in my model serving software?

Verify against the repo before relying on details.