According to Watch Beating's monitoring, NVIDIA chip architect Max Lv has open-sourced a prototype called `mcp-cli`. It targets code agents like Codex and Claude Code, which often need to repeatedly run small commands such as `cat`, `rg`, and `git status` while working locally. The approach of `mcp-cli` is to have a single persistent process take over file reading, code searching, and Git status checks, cutting down on this repetitive command execution.
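The cost being targeted is process startup itself: spawning `cat` for every file read means one fork and `execve` per read, while a persistent process can do plain file I/O. A minimal sketch of the contrast (illustrative only, not mcp-cli's actual code):

```python
import subprocess

def read_via_cat(path: str) -> str:
    # One fork + execve per read -- the per-command overhead mcp-cli avoids.
    return subprocess.run(["cat", path], capture_output=True, text=True).stdout

def read_in_process(path: str) -> str:
    # What a persistent server can do instead: plain file I/O, no new process.
    with open(path) as f:
        return f.read()
```

Both return the same content; the difference is only in how many processes get launched along the way.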
The author ran a benchmark using Codex CLI 0.120.0 against the rust-v0.121.0 tag of the openai/codex repository. Total `execve` calls fell from 103 to 22, a 79% decrease; the input token count fell from 2.1596 million to 1.7556 million, a 19% reduction. The author also added a `--prefer-mcp` flag so that Codex prioritizes the MCP tools rather than defaulting to Bash.
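The two percentages follow directly from the reported counts, which can be checked with a line of arithmetic:

```python
# Sanity-check the reported reductions (numbers from the benchmark above).
execve_before, execve_after = 103, 22
tokens_before, tokens_after = 2_159_600, 1_755_600

execve_drop = (execve_before - execve_after) / execve_before
token_drop = (tokens_before - tokens_after) / tokens_before

print(f"execve calls: -{execve_drop:.0%}")   # -79%
print(f"input tokens: -{token_drop:.0%}")    # -19%
```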
This data illustrates one thing first: part of a code agent's overhead lies not in the model itself but in repeatedly launching local commands and handling long text. It does not yet show that Codex gets faster overall, however. In the same benchmark, total runtime increased by 45%, because tasks that a single Bash pipeline could finish in one shot were often broken into multiple calls when routed through MCP. In other words, `mcp-cli` demonstrates that "reducing process startups" is a path worth exploring, but one more step remains before it becomes truly practical: tools tailored to the agent's workflow, such as batch file reading and search-result aggregation.
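The batch-reading idea can be sketched as follows. `read_files` here is a hypothetical tool, not part of mcp-cli's published API: instead of paying one round trip (and one tool call's worth of tokens) per file, the agent asks for several files in a single call.

```python
from pathlib import Path

def read_files(paths: list[str]) -> dict[str, str]:
    """Hypothetical batch tool: return the contents of many files in one
    call, so the agent pays one round trip instead of one per file."""
    results: dict[str, str] = {}
    for p in paths:
        try:
            results[p] = Path(p).read_text()
        except OSError as e:
            # Report per-file errors instead of failing the whole batch.
            results[p] = f"<error: {e}>"
    return results
```

Aggregating results this way also addresses the runtime regression: the agent makes fewer, larger calls rather than many small ones.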
