Automate repetitive Windows desktop tasks by describing the task in plain English and letting the agent click through apps on your behalf
Run multi-device workflows where agents on Windows, Linux, and Android collaborate on a single task without human involvement
Build automated UI testing flows that interact with desktop applications through their real visual interface, not just APIs
Requires a Windows machine and an LLM API key. Multi-device Galaxy mode additionally needs Linux or Android machines configured as agents.
UFO is a Microsoft research project that lets a computer use its own software the way a person would. Instead of a human clicking through menus and typing into apps, UFO reads the screen, understands what it sees, and takes actions to complete a task you describe in plain English. It works on Windows and is built on top of large language model technology.

The project has gone through three major versions. The original UFO was a single-device agent for Windows released in early 2024. UFO2, called Desktop AgentOS, deepened the integration with Windows so the agent could interact with apps both through the visual interface and through underlying APIs when available. UFO3, the current version, introduces a framework called Galaxy that lets multiple agents on different devices (Windows, Linux, Android) work together on the same task at the same time.

The Galaxy system breaks a user request into a graph of smaller tasks, where some tasks can run at the same time and others must wait for earlier ones to finish. A planning component called the ConstellationAgent figures out which device is best suited for each piece of the task, assigns work to the right machines, and adjusts the plan if something goes wrong mid-run. The devices communicate over a secure connection so they can share results and coordinate without human involvement.

For someone who wants to automate a single Windows computer, UFO2 is described as stable and straightforward to set up. For workflows that span multiple machines or operating systems, Galaxy handles the coordination. The two modes are compatible: a UFO2 installation can act as one of the device agents inside a Galaxy setup, so existing users can move to the newer system gradually.

The repository includes documentation, quick-start guides, and video demos. The full README is longer than what was shown.
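The task graph that Galaxy is described as building can be pictured with a small sketch. This is not UFO's code or API: the task names, device names, and the helper functions `plan_waves` and `assign` are all hypothetical, and the scheduling uses Python's standard `graphlib` module as a stand-in for whatever the ConstellationAgent actually does. It only illustrates the idea that tasks with no unmet dependencies can run together, while later tasks wait.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each task maps to the set of tasks it must wait for.
# All names here are invented for illustration; none come from the UFO codebase.
dependencies = {
    "collect_logs": set(),
    "export_report": set(),
    "merge_results": {"collect_logs", "export_report"},
    "send_summary": {"merge_results"},
}

# Hypothetical: the platform each task needs, and one agent per platform.
required_platform = {
    "collect_logs": "linux",
    "export_report": "windows",
    "merge_results": "windows",
    "send_summary": "android",
}
devices = {
    "linux": "linux-agent-1",
    "windows": "windows-agent-1",
    "android": "android-agent-1",
}


def plan_waves(deps):
    """Group tasks into waves. Tasks in the same wave have no unmet
    dependencies, so they could run at the same time."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)  # marking these done unlocks their dependents
    return waves


def assign(waves, platforms, agents):
    """Pair every task with the agent for the platform it requires."""
    return [[(task, agents[platforms[task]]) for task in wave] for wave in waves]


waves = plan_waves(dependencies)
plan = assign(waves, required_platform, devices)

for number, wave in enumerate(plan, start=1):
    print(f"wave {number}: {wave}")
```

In this sketch the first wave holds the two independent tasks, and each later wave holds a single task that depends on the one before it. A real planner would also need to re-plan when a task fails, which this example does not attempt.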
Verify against the repo before relying on details.