Web Codegen Scorer is a tool from the Angular team at Google for measuring the quality of web code generated by AI models. Many developers choose among AI coding tools, models, and frameworks based on informal impressions rather than consistent measurement; this tool gives them a repeatable way to run those comparisons with actual metrics.

You give it a set of prompts describing web apps to build. It sends each prompt to an AI model, collects the generated code, and runs a series of checks on the result. The built-in checks cover whether the code builds without errors, whether it throws runtime errors in a browser, accessibility problems, security issues, adherence to coding best practices, and a rating from a second AI model that reviews the code. If problems are detected during generation, the tool can automatically attempt a fix and regenerate.

The tool works with any web framework, not just Angular, and supports models from Google, OpenAI, Anthropic, and xAI. You configure which model to use, which framework to target, and which system instructions to include. Results are saved and can be opened in a report viewer that compares runs side by side. Practical uses include testing whether a change to your system prompt actually improves output quality, comparing which model produces better results for your specific use case, and tracking whether quality changes as models are updated over time.

It installs as a global npm package. You supply API keys as environment variables for whichever model providers you want to use. The Angular example bundled with the tool is a convenient starting point. The project is written in TypeScript, uses pnpm for development, and is open source, with contributions welcome.
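The generate, check, and repair loop described above can be pictured with a small TypeScript sketch. This is not the tool's actual API: every name here (`evaluatePrompt`, `Generate`, `Check`, `fakeModel`, `imgHasAlt`) is hypothetical. The "model" is a stand-in function, and the single check is a toy accessibility rule. A real model call would be asynchronous; it is kept synchronous here for brevity.

```typescript
// Hypothetical types for illustration; none of these names come from Web Codegen Scorer.
type CheckResult = { name: string; passed: boolean; detail?: string };
type Check = (code: string) => CheckResult;
type Generate = (prompt: string, feedback: string[]) => string;

interface EvalResult {
  code: string;
  attempts: number;
  results: CheckResult[];
}

// Generate code, run every check, and regenerate with the failure details
// as feedback until all checks pass or the attempt budget is spent.
function evaluatePrompt(
  prompt: string,
  generate: Generate,
  checks: Check[],
  maxAttempts = 3,
): EvalResult {
  let feedback: string[] = [];
  let code = "";
  let results: CheckResult[] = [];
  let attempts = 0;
  while (attempts < maxAttempts) {
    attempts++;
    code = generate(prompt, feedback);
    results = checks.map((check) => check(code));
    const failures = results.filter((r) => !r.passed);
    if (failures.length === 0) break;
    feedback = failures.map((f) => `${f.name}: ${f.detail ?? "failed"}`);
  }
  return { code, attempts, results };
}

// Stand-in "model": omits the alt attribute until it receives feedback.
const fakeModel: Generate = (_prompt, feedback) =>
  feedback.length === 0
    ? '<img src="logo.png">'
    : '<img src="logo.png" alt="Company logo">';

// Stand-in accessibility check: every <img> tag must carry an alt attribute.
const imgHasAlt: Check = (code) => {
  const imgs = code.match(/<img\b[^>]*>/g) ?? [];
  const passed = imgs.every((tag) => /\balt=/.test(tag));
  return passed
    ? { name: "img-alt", passed }
    : { name: "img-alt", passed, detail: "<img> is missing alt text" };
};

const outcome = evaluatePrompt("Build a landing page", fakeModel, [imgHasAlt]);
console.log(outcome.attempts, outcome.results.every((r) => r.passed));
```

In this sketch the first attempt fails the check, the failure message is passed back as feedback, and the second attempt passes. Capping the number of attempts keeps a run bounded when the model never produces passing code.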
Verify against the repo before relying on details.