OpenInfer × ClawMetry

Run AI anywhere.
Watch every turn.

Edge, on-prem, or cloud. Any hardware your agents can reach. See every CPU and GPU each turn ran on, what it cost, and which sub-agent caused the spend.

210k+ installs 343 stars 123+ countries E2E encrypted · local-first
The Promise

What if AI was:

The three pillars OpenInfer ships against. ClawMetry adds the fourth so you can prove it.

Low
Cost
Maximize ROI
Sovereign
Your control
Reliable
Always on
Observable
Every turn, every silicon
And what if the Edge could be everywhere?
HARDWARE
Agnostic
EASY
To deploy
RESOURCE
Unbound
"We make it possible for the 90% of agentic workloads that are routine and always on, to run on leaner and often under-utilized compute topologies."
Behnam Bastani · CEO, OpenInfer
The Substrate

Meet OpenInfer

An inference OS that treats compute as a scheduling problem. Latency-critical turns run on GPU. The other 90 percent run on CPU and Graviton. Sessions migrate across processors without re-paying the prefill cost.

Learn more about OpenInfer
Distributed inference for heterogeneous compute +
A unified inference layer that schedules across every chip you have. GPUs handle the latency-critical work. CPUs absorb the long-tail. Sessions migrate at prefill and decode boundaries.
Built with deployment in mind +
Drop-in OpenAI-compatible endpoint. No agent code changes. A single config.json points your OpenClaw workspace at OpenInfer and the OS layer takes over.
Always-on collaborative AI +
Background agents that never sleep cost a fortune on premium silicon. OpenInfer keeps them alive on the cheap pool so always-on becomes economical.
Data center-grade inference where data lives +
Run at the edge, on-prem, or in cloud. The same OS layer schedules across whatever hardware is sitting in front of your data, so the data never has to leave.
Performance

Proven at the Edge

Real benchmarks on commodity hardware. +50% capacity on a single AWS g6e.16xlarge by recruiting otherwise-idle CPUs into the inference fabric.

Learn more
The Eyes

Meet ClawMetry

OpenInfer makes the substrate schedule smartly. ClawMetry shows you what actually ran. Every turn, every chip, every dollar, every sub-agent. Local-first and end-to-end encrypted, so the substrate stops being a black box.

Read the joint launch post
Per-turn silicon attribution +
Every LLM turn annotated with the chip it ran on. Scroll a Telegram chat replay and see "this turn cost $0.0008, ran on EPYC, 1.4s end-to-end" right next to the user's message and the agent's reply.
Cost split by route +
Two new lines in the Tokens tab: GPU pool, CPU pool. "86 percent of our token spend went through the other-90 percent pool this week" becomes a number you can show finance.
Sub-agent kill switch +
A runaway that fans out 17 sub-agents shows up as a tree, each leaf with its own cost. End the runaway in-place without waiting for the quota wall.
Local-first, E2E encrypted +
An HTTP interceptor on the OpenClaw process. No new instrumentation. Data never leaves your machine in plaintext. pip install clawmetry.
Together

The cost curve of heterogeneous compute. The explainability of a single-vendor stack.

Two ten-minute integrations. No code changes. No new dashboard to learn.

"OpenInfer turned the inference substrate from a fixed cost into a scheduling problem. ClawMetry's job is to make sure that, when the substrate gets smart, the operator does not get blind."
Vivek Chand · founder, ClawMetry
Get started

Run AI anywhere. Watch every turn.

Both halves are free to try. Neither requires changes to your agent code.

# One command. That's it. 🦞
$curl -fsSL https://clawmetry.com/install.sh | bash
# Python users: pip install + guided onboarding.
$pip install clawmetry && clawmetry onboard

macOS, Linux, Windows. The installer figures it out.

Install ClawMetry Start OpenInfer beta

Contact: hello@clawmetry.com · contact@openinfer.io