According to monitoring by 动察 Beating, the open-source computer-use infrastructure project Cua has released cua-driver, a native macOS driver that lets any agent control Mac applications in the background. While the agent clicks, types, or takes screenshots, the user's cursor stays put, keyboard focus does not change, and macOS does not switch Spaces.
The core technique relies on reverse-engineering SkyLight, a private Apple framework. Ordinary synthetic events posted with CGEventPost enter the HID event stream and therefore move the cursor; `CGEvent.postToPid` delivers events directly to the target process, but Chromium's renderer process filters them out. cua-driver instead uses SkyLight's SLEventPostToPid, which sends events through WindowServer's trusted channel and bypasses HID, so Chromium accepts them. Window activation follows the approach of the window manager yabai: SLPSPostEventRecordTo toggles only the target application's AppKit activation state without raising its window, which avoids triggering a Spaces switch. For Electron apps (such as Slack, VS Code, and Discord), the undocumented function _AXObserverAddNotificationAndCheckRemote keeps the accessibility tree up to date even when the window is covered.
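The trade-off among the three event-posting paths can be restated as a small lookup table. The Python sketch below is only a model of the behavior described above, not a call into any macOS API: the `PostingPath` type and the `background_paths` helper are hypothetical names, and two flags are assumptions rather than statements from the report (that `CGEvent.postToPid` leaves the cursor alone, and that ordinary HID events reach Chromium).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PostingPath:
    """One way of delivering a synthetic input event (illustrative model only)."""
    api: str
    moves_cursor: bool      # does the user's visible cursor move?
    reaches_chromium: bool  # does Chromium web content receive the event?


# Flags restate the description above; entries marked "assumed" are not
# stated explicitly in the report.
PATHS = [
    PostingPath("CGEventPost", moves_cursor=True, reaches_chromium=True),         # reaches_chromium assumed
    PostingPath("CGEvent.postToPid", moves_cursor=False, reaches_chromium=False),  # moves_cursor assumed
    PostingPath("SLEventPostToPid", moves_cursor=False, reaches_chromium=True),
]


def background_paths(target_is_chromium: bool) -> list[str]:
    """Return the APIs that can drive the target without moving the cursor."""
    return [
        p.api
        for p in PATHS
        if not p.moves_cursor and (p.reaches_chromium or not target_is_chromium)
    ]


if __name__ == "__main__":
    print(background_paths(target_is_chromium=True))
    print(background_paths(target_is_chromium=False))
```

Under these assumptions, only the SkyLight path remains once the target is a Chromium-based app, which matches the reason the report gives for going through a private framework.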
cua-driver offers three capture modes: ax returns only the accessibility tree and needs no screen recording permission; vision returns only screenshots; som (the default) returns both, so agents can click by element index or by pixel coordinates. The driver speaks MCP (Model Context Protocol), so it integrates with clients such as Claude Code and Cursor, and it can also be invoked from the command line. Two limitations are known: right-clicks on Chromium web content do not work, and canvas-based applications (Blender, Unity, game engines) still need to be brought to the foreground briefly.
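To make the capture modes and the MCP integration concrete, here is a minimal Python sketch of how a client might build a click request. MCP tool invocations are JSON-RPC 2.0 messages with the method `tools/call`; everything specific to the driver is hypothetical here, including the tool name `click` and the argument names `element_index`, `x`, and `y`, since the report does not list cua-driver's actual tool schema. The rule that an index click needs the accessibility tree and a pixel click needs a screenshot is likewise an assumption drawn from the mode descriptions.

```python
import json
from typing import Optional, Tuple

# What each capture mode returns, per the description above.
CAPTURE_MODES = {
    "ax": {"ax_tree": True, "screenshot": False},
    "vision": {"ax_tree": False, "screenshot": True},
    "som": {"ax_tree": True, "screenshot": True},  # default mode
}


def click_request(
    request_id: int,
    mode: str = "som",
    element_index: Optional[int] = None,
    point: Optional[Tuple[int, int]] = None,
) -> str:
    """Build a JSON-RPC 2.0 `tools/call` message for a hypothetical `click` tool."""
    returned = CAPTURE_MODES[mode]
    if (element_index is None) == (point is None):
        raise ValueError("pass exactly one of element_index or point")
    # Assumption: a target can only be named with data the mode actually returned.
    if element_index is not None and not returned["ax_tree"]:
        raise ValueError(f"mode {mode!r} returns no accessibility tree to index into")
    if point is not None and not returned["screenshot"]:
        raise ValueError(f"mode {mode!r} returns no screenshot to take pixels from")

    if element_index is not None:
        arguments = {"element_index": element_index}
    else:
        arguments = {"x": point[0], "y": point[1]}

    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "click", "arguments": arguments},  # tool name is hypothetical
    })


if __name__ == "__main__":
    print(click_request(1, element_index=7))
    print(click_request(2, mode="vision", point=(640, 360)))
```

In som mode either form of target is accepted, which mirrors the report's point that agents can mix element indices and pixel coordinates.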
After OpenAI acquired Sky, the team behind Apple Shortcuts, Codex became the first product to offer background computer use, but OpenAI did not open-source the capability. Cua's Francesco Bonacci said that a background computer-use driver should be general-purpose infrastructure rather than a feature exclusive to a single product.
