From Token to Machine Labor: AI is Transitioning from a Tool to a "Worker"

The core of market pricing will shift from model invocation to verifiable, provable, and settleable work outcomes

Original Title: A Market for Machine Labor

Original Author: @__sishir

Translation: Peggy

Editor's Note: As AI begins to write code, handle customer service tickets, and review legal documents, a more fundamental question is emerging: what are enterprises actually buying? Tokens, GPU hours, or completed work?

This article presents an insightful framework: AI commercialization should be understood not only as a "compute market" or a "model invocation market" but as a move toward a new "machine labor market." In this market, tokens are merely a unit of measurement, GPUs are inputs, models are tools of production, and what is actually priced and traded is economically productive labor performed directly by software.

The key argument of the article is that AI pricing will evolve from raw tokens to standardized model capabilities, then to industry-specific labor, and finally to a programmable outcome market. In other words, enterprises may eventually stop caring which model or GPU performs a task and care only whether it delivered a result that meets the standard within specified latency, accuracy, reliability, and cost parameters.

This also implies that the impact of AI on the human labor market may not be a simple substitution. As machines take on more standardized, verifiable work, the role of humans may shift towards review, accountability, context management, and final judgment. In certain scenarios, the final 1% of human judgment may actually become more valuable as it can unleash large-scale automation of the 99%.

From this perspective, the next stage of competition in the AI market may no longer be just about model capabilities or a compute price war, but about who can first standardize, verify, and price machine labor, and ultimately turn it into a new factor of production that can be procured, settled, and traded.

The following is the original text:

Waves of productivity have always come from building tools and software that help humans optimize how work gets done. Spreadsheets aid accountants and analysts, conveyor belts increase throughput, hammers amplify human leverage. But the actual labor has always come from humans.

Now AI produces work products end to end, performing the labor itself. It can write code, handle customer service tickets, and review legal documents. The whole tech stack is compressing: the old stack supported labor, and the new stack is starting to produce it.

If you've followed recent discussions about AI financialization, you've probably heard Jensen and others say that LLM tokens and/or GPU hours are becoming the new commodities. The intuition is understandable: tokens are measurable, billable, and easy to chart, and billions of dollars have flowed into GPU hours. However, tokens are still just a measure, and GPU hours are just an input. No one buys them for the sake of owning them. What people really want is to get the work done. AI is turning the technology stack itself into a source of labor.

Machine labor: work performed by software that is economically useful and sold into the production process.

The market has been moving in this direction. Benchmark's Sarah Tavel tends to view this opportunity through the lens of the outsourced labor market rather than the software category. If a repetitive task was originally done by a dedicated offshore team or professional services firm, it is often a good fit for work delivered by AI. a16z's Alex Rampell calls it "software eating labor": the next stage of software is doing the actual work. Sequoia's Julien Bek describes the same shift from a different angle: services are turning into software, where copilots sell tools and autopilots sell work.

Related Read: https://sequoiacap.com/article/services-the-new-software/

The Missing Market Behind Outcome Pricing

Seat pricing charges based on access, token pricing charges based on usage. Outcome pricing, on the other hand, charges when the work is completed. Outcome pricing takes us a step forward, but it still doesn't answer one question: who decides the price?

If machine labor can be purchased directly, the price should come from competition between suppliers. Those suppliers must all be able to meet the same standard for a given class of tasks or for completed work, which requires standardization across industries and tasks.

The current practice is to use LLM tokens, but the raw token is only the bottom layer. A barrel of oil is just a unit of measurement; what is actually traded is a specific grade of oil with a defined quality, delivery terms, and market price. A barrel of Brent Crude and a barrel of heavy sour crude are not the same commodity. The same goes for LLM tokens. Tokens are just units of measurement; what matters is the intelligence behind them: model quality, benchmark floor, latency, context window, reliability, and delivery assurance. A million tokens from a cutting-edge model and a million tokens from a low-grade general model are not the same commodity. The market needs standardized inference grades just as the energy market needs standardized oil grades.

Anjali Shriva has pointed out that a token is not a fixed unit of cost. Its economics vary with context length, task structure, input/output ratio, retries, tool invocations, and Agent workflows. A token in a short prompt is not the same economic entity as a token buried in a long Agent loop.
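To make this concrete, the following is a minimal sketch of how the same list price can translate into very different costs per delivered output token. The prices, token counts, the `Call` structure, and the `request_cost` helper are all illustrative assumptions, not figures from the article or from any provider. It also assumes that each retry re-bills the full call.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """One model call; all values used below are hypothetical."""
    input_tokens: int
    output_tokens: int
    retries: int = 0  # extra attempts that are billed but discarded


def request_cost(calls, in_price_per_m, out_price_per_m):
    """Total USD cost of a workflow made of one or more model calls.

    Prices are per million tokens. Each retry is assumed to re-bill
    the full input and output of the call.
    """
    total = 0.0
    for c in calls:
        attempts = 1 + c.retries
        billed = c.input_tokens * in_price_per_m + c.output_tokens * out_price_per_m
        total += attempts * billed / 1_000_000
    return total


# Assumed prices: $3 per million input tokens, $15 per million output tokens.
IN_PRICE, OUT_PRICE = 3.0, 15.0

# A short prompt: 500 tokens in, 500 tokens out, no retries.
short_prompt = [Call(input_tokens=500, output_tokens=500)]

# An Agent loop: ten steps, each re-sending a growing context,
# with one retry on the final step.
agent_loop = [Call(input_tokens=2_000 * (i + 1), output_tokens=500) for i in range(9)]
agent_loop.append(Call(input_tokens=20_000, output_tokens=500, retries=1))

short_cost = request_cost(short_prompt, IN_PRICE, OUT_PRICE)
loop_cost = request_cost(agent_loop, IN_PRICE, OUT_PRICE)

# Delivered output excludes the discarded retry attempts.
short_out = sum(c.output_tokens for c in short_prompt)
loop_out = sum(c.output_tokens for c in agent_loop)

print(f"short prompt: ${short_cost / short_out:.6f} per delivered output token")
print(f"agent loop:   ${loop_cost / loop_out:.6f} per delivered output token")
```

Under these assumed numbers, the Agent loop pays several times more per delivered output token than the short prompt, even though both use the same per-token price list.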

We've been doing this in the human labor market for a long time. No one would hire a radiologist as a generalized 'human-hour' unit. People look at training background, licensing, specialization, years of practice, availability, reputation, and responsibility. Different human contract specifications correspond to different minimum standards and grade expectations.

The human labor market has always operated on these specifications, but they are often mixed, qualitative, and laden with proxy metrics. Machine labor will make them more explicit and quantifiable.

For an LLM or Agent, attributes like skill, experience, speed, and reliability can be written directly into contracts: benchmark scores, latency, throughput, context window, maximum output length, tool-use accuracy, uptime, error rate. We can procure labor based on quantifiable expectations and outcomes.

TheGrid.ai's contract specifications are essentially a qualification filter, coupled with price competition for LLM outputs. As long as suppliers meet the specifications, they can enter the competition:

Intelligence Benchmark ≥ Minimum

Latency ≤ Maximum

Throughput ≥ Minimum

Uptime ≥ Minimum

Error Rate ≤ Maximum

Once suppliers all meet the same minimum threshold, they start competing on price. The buyer's question is: Which supplier can deliver the required labor at the best price?
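The two-step logic described here, a qualification filter followed by price competition, can be sketched in a few lines. This is a minimal illustration only: the `Spec` and `Supplier` types, their field names, and all numbers are hypothetical and do not describe TheGrid.ai's actual contract format or API.

```python
from dataclasses import dataclass


@dataclass
class Spec:
    """Minimum thresholds a supplier must meet (hypothetical field names)."""
    min_benchmark: float
    max_latency_ms: float
    min_throughput_tps: float
    min_uptime: float
    max_error_rate: float


@dataclass
class Supplier:
    name: str
    benchmark: float
    latency_ms: float
    throughput_tps: float
    uptime: float
    error_rate: float
    price_usd_per_m_tokens: float


def qualifies(s: Supplier, spec: Spec) -> bool:
    """Qualification filter: every threshold in the spec must hold."""
    return (
        s.benchmark >= spec.min_benchmark
        and s.latency_ms <= spec.max_latency_ms
        and s.throughput_tps >= spec.min_throughput_tps
        and s.uptime >= spec.min_uptime
        and s.error_rate <= spec.max_error_rate
    )


def best_offer(suppliers, spec):
    """Among qualifying suppliers, return the cheapest one, or None if none qualify."""
    eligible = [s for s in suppliers if qualifies(s, spec)]
    return min(eligible, key=lambda s: s.price_usd_per_m_tokens, default=None)


spec = Spec(min_benchmark=80, max_latency_ms=800, min_throughput_tps=50,
            min_uptime=0.999, max_error_rate=0.01)

suppliers = [
    Supplier("alpha", 85, 600, 70, 0.9995, 0.005, 4.00),
    Supplier("beta", 90, 500, 90, 0.9999, 0.002, 6.50),
    # Cheapest offer, but below the benchmark floor, so it never competes on price.
    Supplier("gamma", 70, 300, 120, 0.9999, 0.001, 1.00),
]

winner = best_offer(suppliers, spec)
print(winner.name if winner else "no qualifying supplier")
```

The design choice worth noting is that quality is a gate rather than a score: a supplier that exceeds the thresholds gains no advantage beyond eligibility, so all remaining competition happens on price.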

In the LLM context, recruiting a radiologist becomes a measurable problem: which LLMs can read X-rays with high proficiency and complete the task within explicit latency, context window, and other outcome-based contract specifications?

The outcome is how the buyer measures success; labor is the economic activity being supplied; and the token is the fuel the machine consumes while completing the work.

TheGrid is the machine labor market.

From Token to Machine Labor Market

Existing markets can price the inputs to a tech stack, but pricing the output requires a machine labor market. Buyers are not concerned with GPU hours. Model endpoints themselves are also unstable: they get renamed, deprecated, repackaged, or retired outright.

Users and liquidity both dislike frequent changes. GPUs and models will continue to evolve, but the stable unit is the work itself.

I believe that the market will evolve along the following path. With each step up, the purchased item becomes more abstract, more valuable, but also harder to verify. The Grid should gradually climb this ladder:

Raw Token → Commodified LLM Capability Market → Commodified Labor Market → Programmable Output Market

Phase One: Raw Token

Claude 4.7, GPT 5.5, Kimi 2.6, DeepSeek V4, GLM 5, etc.

Today, buyers purchase raw model outputs from inference providers. They send their prompts, receive the inference results, and pay based on usage. This is easy to verify, but it is still just raw material. What buyers really want is not the token but useful intelligence at the best price.

Phase Two: Commodified LLM Capability Market

For example, text/usd, code/usd, agent/usd, etc.

Buyers no longer select a specific model but rather the category of intelligence they need. Buyers still control the workflow, prompts, data, and application logic. The Grid simply routes each request to the lowest-priced model that meets the specification.

Note: This is the first true abstraction layer above the raw token and is where TheGrid.ai currently stands.

Phase Three: Commodified Labor Market

For example, accounting/usd, support_agent/usd, legal/usd, healthcare/usd, radiology/usd, etc.

As models become more specialized, the capability market can further evolve into industry-specific markets. This is akin to the division of labor in different human labor markets.

At this layer, we are selling inference capabilities tailored to the workflows of specific labor verticals. As industry-specific models become more prevalent, this market will expand rapidly. Examples include Cursor's Composer for coding, Harvey for legal work, and OpenEvidence for healthcare.

Phase Four: Agent-Centric Programmable RFQ and Outcomes Market

For example, support_ticket_resolved/usd, pr_merged/usd, claim_processed/usd, etc.

The final layer is where The Grid transitions from an Inference Market to a Machine Labor Market.

This layer requires mechanisms such as RFQs, escrow accounts, delayed settlement, buyer attestation, supplier reputation, clawbacks, and dispute resolution. It is likely to start with RFQs rather than a direct order book. Buyers define the scope of work, constraints, acceptance criteria, and settlement terms, and Agents bid to complete the task. The Grid then assists with routing, pricing, validation, and settlement of the work.
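The RFQ flow described above can be sketched as a small state machine: bids come in, the buyer's constraints filter them, the winning price is held in escrow, and escrow is released or refunded once the buyer attests to the result. Everything in this sketch, including the `RFQ` and `Bid` types, the reputation score, and the rule of awarding to the cheapest eligible bid, is a hypothetical simplification. It omits delayed settlement, clawbacks, and dispute resolution.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Bid:
    agent: str
    price_usd: float
    reputation: float  # hypothetical score between 0.0 and 1.0


@dataclass
class RFQ:
    """A request for quote; all fields are hypothetical."""
    scope: str
    max_price_usd: float
    min_reputation: float
    bids: List[Bid] = field(default_factory=list)
    escrow_usd: float = 0.0
    winner: Optional[Bid] = None

    def award(self) -> Optional[Bid]:
        """Pick the cheapest bid that meets the buyer's constraints and fund escrow."""
        eligible = [
            b for b in self.bids
            if b.price_usd <= self.max_price_usd and b.reputation >= self.min_reputation
        ]
        if not eligible:
            return None
        self.winner = min(eligible, key=lambda b: b.price_usd)
        self.escrow_usd = self.winner.price_usd  # funds are held, not yet paid out
        return self.winner

    def settle(self, accepted: bool):
        """Release escrow to the Agent if the buyer accepts, else refund the buyer.

        Returns a tuple (paid_to_agent, refunded_to_buyer).
        """
        amount, self.escrow_usd = self.escrow_usd, 0.0
        return (amount, 0.0) if accepted else (0.0, amount)


rfq = RFQ(scope="resolve support ticket", max_price_usd=2.00, min_reputation=0.9)
rfq.bids += [
    Bid("agent_a", 1.50, 0.95),
    Bid("agent_b", 0.80, 0.60),  # cheapest, but below the reputation floor
    Bid("agent_c", 1.20, 0.92),
]

winner = rfq.award()
paid, refunded = rfq.settle(accepted=True)
print(winner.agent, paid, refunded)
```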

This is the most valuable layer but also the hardest to validate, because outcomes can be delayed, subjective, and easy to game. A customer service ticket may be reopened; a PR may pass its tests and still introduce poor architecture.

Total Price = Cost of Work + Cost of Risk
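One possible way to read this formula is to treat the cost of risk as an expected loss: the probability that an outcome is later rejected, multiplied by what that rejection costs. That model and every number below are assumptions made for illustration; the article does not specify how risk should be priced.

```python
def total_price(cost_of_work, failure_probability, loss_if_failed):
    """Total price = cost of work + cost of risk.

    The cost of risk is modeled here as an expected loss, which is an
    assumption of this sketch rather than a definition from the article.
    """
    if not 0.0 <= failure_probability <= 1.0:
        raise ValueError("failure_probability must be between 0 and 1")
    cost_of_risk = failure_probability * loss_if_failed
    return cost_of_work + cost_of_risk


# Hypothetical numbers: resolving a ticket costs $0.50, 10% of tickets are
# reopened, and each reopened ticket costs $4.00 to rework or refund.
price = total_price(cost_of_work=0.50, failure_probability=0.10, loss_if_failed=4.00)
print(f"${price:.2f}")
```

Under this reading, two suppliers with the same cost of work can quote very different prices if one of them has a lower rejection rate, which is why reputation and validation feed directly into pricing at this layer.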

A workflow does not automatically become a market just because intelligence has a price or gets cheaper. Some work depends heavily on proprietary context, such as customer history or internal policy. The more context-dependent the work, the less likely it is to settle cleanly in an open market. [@hypersoren https://hypersoren.xyz/posts/cybernetic-arbitrage/]

The market needs to reveal which labor categories will expand and which will contract.

Machine Labor vs Human Labor, or Machine Labor & Human Labor

In her mechanism design draft, Anjali Shriva points out that AI is too often framed as a story of substitution. In reality, it is more of a coordination problem: when both humans and machines are engaged in production, how will work, attribution, incentives, and value be reorganized?

Today, much AI usage inside enterprises remains trapped: employees use AI in silos, workflows are still tied to individuals, and companies can neither price these productivity gains nor scale the benefits.

Most automatable work is likely to be offloaded to machines. Some human work will shift to oversight, accountability, training, and context management. In some scenarios, the final 1% of human judgment becomes even more valuable because it unlocks the other 99% of automated work at scale.

In "Brave New World of AI Markets," Rachel Su Park argues that AI's TAM should not be modeled simply as a replacement for current spending on human labor, because AI changes both price and quantity at once. As labor costs fall, unit prices may drop, but the quantity consumed may grow: existing work is consumed more often, and entirely new work that was previously uneconomical becomes viable. The article summarizes this as:

P × Q: Market Size = Unit Work Price × Amount of Work Consumed

If AI makes customer service interactions cheaper, companies can offer round-the-clock service. This market will not just be a cheaper version of the old customer service labor market; it may become a larger customer interaction market.
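The P × Q argument can be worked through numerically. The sketch below assumes a constant-elasticity demand curve, which is a standard textbook simplification and not something the article specifies. The baseline price, baseline volume, and elasticity values are likewise hypothetical.

```python
def market_size(unit_price, quantity):
    """Market size = unit work price x amount of work consumed."""
    return unit_price * quantity


def quantity_after_price_change(q0, p0, p1, elasticity):
    """Constant-elasticity demand: Q1 = Q0 * (P1 / P0) ** (-elasticity).

    Both the functional form and the elasticity values used below are
    assumptions made for illustration.
    """
    return q0 * (p1 / p0) ** (-elasticity)


# Hypothetical baseline: 1,000,000 interactions per year at $5.00 each.
p0, q0 = 5.00, 1_000_000
p1 = 0.50  # assumed unit price after automation

for elasticity in (0.5, 1.0, 1.5):
    q1 = quantity_after_price_change(q0, p0, p1, elasticity)
    print(
        f"elasticity {elasticity}: "
        f"before ${market_size(p0, q0):,.0f}, after ${market_size(p1, q1):,.0f}"
    )
```

Under this assumed demand curve, the market grows in dollar terms only when elasticity is above 1, that is, when the quantity of work consumed rises faster than the price falls. Below that threshold, the same price drop shrinks the market even though more work is being done.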

AI is an expansive market because when labor costs drop, demand does not stay the same.

Labor Layer

The machine labor market should start with jobs that can be clearly defined. GPU hours are too far upstream: they are an input and tell you little about what work they supported. Pricing full outcomes is too complex and depends too heavily on context. As validation, reputation, and risk/insurance pricing are increasingly handled by machines, the market will keep moving toward a pure outcome layer.

Machine labor can become tradable because buyers will care less about which model or GPU ran the job and more about whether the work itself achieved the minimum standards and grades in the contract specification at the right price. Agents will be even less concerned about these underlying sources.

Machines can now directly execute economically useful work, which can be defined, measured, priced, procured, and eventually traded. Power, compute, models, and tokens are, of course, still crucial, but they all remain upstream.

The downstream is where work truly gets done, and the market is moving towards a simpler entity: machine labor.
