Open source · Runtime Intelligence Platform · v0.1 available on PyPI

Runtime Intelligence
for AI Agents

Critiqor evaluates observable runtime behaviour rather than relying on agent self-reporting. Capture runtime evidence. Generate explainable diagnoses. Improve agent reliability.

terminal
$ pip install critiqor
Evidence-backed · Runtime-observed · Developer-first · Apache 2.0 licensed
live observation pipeline
run_004
  • Developer
    CLI
  • AI Agent
    OpenClaw
  • Runtime Events
    observed
  • Evidence Collection
    tool calls · outputs
  • Diagnosis Engine
    explainable
  • Interactive Dashboard
    verdict · timeline
healthy · verdict: Ready For Runtime
trust 100
Why Critiqor

Traditional evals judge the answer.
Critiqor watches the work.

Most evaluation frameworks score the final response. Critiqor instead observes the agent during execution — recording tool calls, tool outputs, runtime events, reasoning flow, execution efficiency and evidence utilisation.

Every diagnosis is backed by observable execution evidence — not a model's opinion about itself.
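
To make the contrast concrete, here is a minimal Python sketch of an evidence-utilisation check. The event shape (dicts with `type`, `tool`, `output`) and the function name `unused_tool_outputs` are assumptions made for illustration; they are not Critiqor's actual schema or API.

```python
from typing import Any

# Hypothetical event records; Critiqor's real event format may differ.
Event = dict[str, Any]


def unused_tool_outputs(events: list[Event], final_answer: str) -> list[int]:
    """Return indices of tool-output events whose text never appears in the answer."""
    unused = []
    for i, event in enumerate(events):
        if event.get("type") != "tool_output":
            continue
        output = str(event.get("output", "")).strip()
        # Naive substring match; a real check would need something more robust.
        if output and output.lower() not in final_answer.lower():
            unused.append(i)
    return unused


events = [
    {"type": "tool_call", "tool": "get_price", "args": {"sku": "A1"}},
    {"type": "tool_output", "tool": "get_price", "output": "19.99"},
    {"type": "tool_call", "tool": "get_stock", "args": {"sku": "A1"}},
    {"type": "tool_output", "tool": "get_stock", "output": "out of stock"},
]
answer = "The item costs 19.99 and ships tomorrow."

print(unused_tool_outputs(events, answer))  # [3]
```

An answer-only scorer could rate this reply as fluent and plausible. The runtime record shows that the stock lookup at index 3 was never used.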

  • 7+ evidence types
  • 100s of observed events / run
  • 0% self-report
  • 100% explainability
Traditional Evaluation
legacy

Answer-only scoring

  • Final response only
  • Self-reported reasoning
  • Limited explainability
  • No runtime visibility
Critiqor
runtime

Evidence-backed diagnosis

  • Runtime evidence
  • Observable execution
  • Explainable diagnosis
  • Root cause analysis
  • Historical intelligence
Platform

Everything you need to trust your agents.

Six capabilities working together — from the moment your agent boots to the final diagnosis.

Runtime Observation

Observe agents during execution — not after. Critiqor attaches before launch and follows every event end-to-end.

Evidence Collection

Capture runtime events, tool calls, tool outputs, provider requests and rich execution metadata into structured artifacts.

Diagnosis Engine

Convert raw runtime evidence into explainable reports — verdicts, confidence, trust score and reasoning summaries.

Root Cause Analysis

Identify failures, ignored outputs, loops and retrieval gaps. Each issue links back to the underlying evidence.
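
As a rough illustration of how a loop could be flagged from recorded tool calls, the sketch below groups identical calls and reports the event indices behind each finding. The event fields, the `find_repeated_calls` helper and the threshold of three are all hypothetical; Critiqor's own detection logic is not documented here.

```python
import json
from collections import defaultdict


def find_repeated_calls(events, threshold=3):
    """Flag identical tool calls seen at least `threshold` times, with evidence indices."""
    seen = defaultdict(list)
    for i, event in enumerate(events):
        if event.get("type") == "tool_call":
            # Canonical key: tool name plus sorted JSON of its arguments.
            key = (event.get("tool"), json.dumps(event.get("args", {}), sort_keys=True))
            seen[key].append(i)
    return [
        {"issue": "possible_loop", "tool": tool, "evidence": indices}
        for (tool, _args), indices in seen.items()
        if len(indices) >= threshold
    ]


# Assumed event shape, for illustration only.
events = [
    {"type": "tool_call", "tool": "search", "args": {"q": "refund policy"}},
    {"type": "tool_call", "tool": "search", "args": {"q": "refund policy"}},
    {"type": "tool_call", "tool": "fetch", "args": {"url": "https://example.com"}},
    {"type": "tool_call", "tool": "search", "args": {"q": "refund policy"}},
]

for issue in find_repeated_calls(events):
    print(issue)
```

Keeping the `evidence` indices on every finding is the point of the design: a reader can jump from the issue straight to the events that triggered it.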

Interactive Dashboard

Executive summaries, runtime timelines, causal graphs and historical evaluations — local-first, no data leaves your machine.

Coming Soon

Benchmarking & Leaderboards

Compare agent reliability with private, anonymous and public visibility. Opt-in benchmarks for teams and communities.

Architecture

How Critiqor works.

A nine-stage pipeline turning raw agent execution into evidence, diagnosis and recommendations — fully local by default.

  • 01 User
    developer
  • 02 OpenClaw
    AI agent
  • 03 Critiqor Plugin
    openclaw integration
  • 04 Runtime Events
    observed signal
  • 05 Session File
    session.json
  • 06 Diagnosis Engine
    causal analysis
  • 07 Diagnosis File
    diagnosis.json
  • 08 Dashboard
    local-first
  • 09 Recommendations
    actionable
  • Local-first
    Runs entirely on your machine
  • Artifact-based
    session.json + diagnosis.json
  • Explainable
    Every claim references evidence
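
The artifact flow can be pictured with a small Python sketch: read a session file, derive a diagnosis, and write it next to the session. Only the file names session.json and diagnosis.json come from the pipeline above; the field names (`run_id`, `events`, `status`, `claims`) and the `diagnose` function are assumptions, since the real schemas are not shown here.

```python
import json
import tempfile
from pathlib import Path


def diagnose(session: dict) -> dict:
    """Build a toy diagnosis where every claim cites the event ids behind it."""
    events = session.get("events", [])
    errors = [e["id"] for e in events if e.get("status") == "error"]
    claims = []
    if errors:
        claims.append({"claim": "tool errors observed", "evidence": errors})
    return {
        "run_id": session.get("run_id"),
        "verdict": "needs_attention" if errors else "healthy",
        "claims": claims,
    }


# Assumed session shape; the real session.json schema may differ.
session = {
    "run_id": "run_demo",
    "events": [
        {"id": "evt_1", "type": "tool_call", "status": "ok"},
        {"id": "evt_2", "type": "tool_call", "status": "error"},
    ],
}

with tempfile.TemporaryDirectory() as tmp:
    session_path = Path(tmp) / "session.json"
    diagnosis_path = Path(tmp) / "diagnosis.json"
    session_path.write_text(json.dumps(session))

    # Read the session artifact, derive the diagnosis, write it alongside.
    diagnosis = diagnose(json.loads(session_path.read_text()))
    diagnosis_path.write_text(json.dumps(diagnosis, indent=2))
    print(diagnosis_path.read_text())
```

Because both stages communicate through plain files, each one can be inspected, diffed or re-run on its own without any running service.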
Workflow

Developer Workflow.

Four commands from install to insight. Local. Reproducible. Friction-free.

step 01

Install Critiqor

One pip install. Zero infrastructure. No accounts, keys or cloud services required.

terminal
$ pip install critiqor
step 02

Launch under observation

Critiqor launches OpenClaw and immediately begins observing runtime activity.

terminal
$ critiqor monitor openclaw
step 03

Use OpenClaw normally

Work as you always do. Critiqor observes silently in the background — zero changes to your agent code.

terminal
$ openclaw run …  # business as usual
step 04

Finalize the session

Critiqor finalizes the observation session, generates the diagnosis and automatically opens the local dashboard.

terminal
$ critiqor finalize
Dashboard

Read the evidence. Trust the verdict.

A local-first interface designed for engineers — fast, dense, explainable.

localhost:5173 / dashboard

Dashboard · local diagnosis

Live reliability intelligence for OpenClaw agents.

healthy · run_004 · openclaw_agent

Runtime evidence captured

No OpenClaw failure mode was detected from runtime evidence.

Executive Summary
  • trust: 100/100
  • confidence: 95% (Critiqor certainty)
  • verdict: Ready For Runtime
Runtime Timeline
  • events: 7
  • duration: 70.4s
  • tools: 0
Diagnosis Artifact
  • diagnosis.json: run_004
  • session.json: captured
  • events: 7
Dashboard
  • active runs: 4
  • diagnoses: 0 critical
  • trust impact: -0 pts
Recent runs
  • openclaw_agent runtime run
    run_004 · openclaw
    passed
  • openclaw_agent runtime run
    run_003 · openclaw
    passed
  • openclaw_agent runtime run
    run_002 · openclaw
    passed
  • openclaw_agent runtime run
    run_001 · openclaw
    passed
Primary diagnoses
No failure causes detected yet. Diagnoses appear after runtime evidence is finalized.
Get started in 30 seconds

Install Critiqor. Observe everything.

One pip install, three commands. No accounts. No cloud. Just runtime evidence.

install
$ pip install critiqor
observe
$ critiqor monitor openclaw
diagnose
$ critiqor finalize
Roadmap

Built in the open, with you in the loop.

Transparent roadmap, public issues, and fast iteration on real agent failures.

Shipped
  • Runtime Observation
  • Interactive Dashboard
  • OpenClaw Integration
  • Diagnosis Engine
  • Root Cause Analysis
In Progress
  • Improved benchmarking: compare agents across runs and environments.
  • Dashboard enhancements: denser evidence views for faster debugging.
Planned
  • Additional agent frameworks: extend beyond OpenClaw while keeping local-first.
  • Community leaderboards: opt-in reliability benchmarks for teams and OSS.
  • Enterprise dashboard: multi-tenant observability for AI ops teams.

Track progress on GitHub Issues, propose a feature on GitHub, or follow the full Roadmap.

Community

Stay in the loop.

Docs, source, plugins and the conversation around runtime intelligence.

FAQ

Questions, answered.

Everything developers ask before adopting Critiqor.