Open Experiment Standard
Draft v0.1.0

A vendor-neutral standard for documenting, exchanging, archiving, and presenting online experiments, with documents that stay portable between vendors and products.

Why a standard?

Experiment data lives in many tools — GrowthBook, Optimizely, Statsig, internal platforms — but most teams want the same things from it: an archive of past decisions, a way to present results to executives, the ability to migrate between platforms, and a learning repository that outlives any single vendor.

Existing standards such as OpenFeature cover how feature flags are evaluated. OES covers how experiments are documented and exchanged: design, metrics, results, decisions, and the trust checks that justify them.

Top-level envelope

A short, predictable structure makes documents safe to parse, even when fields are extended.

{
  "schemaVersion": "0.1.0",
  "objectType": "experiment",
  "experiment": {},
  "design": {},
  "variants": [],
  "metrics": [],
  "analysis": {},
  "results": {},
  "scorecard": {},
  "decision": {},
  "qualityChecks": [],
  "artifacts": [],
  "provenance": {},
  "extensions": {}
}
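
As a rough illustration of why a fixed envelope helps, the sketch below checks the top-level keys of a document before any section is read. The key names are taken from the envelope above. The function name check_envelope, and the choice to treat every listed key as expected, are assumptions made for this example and not requirements stated by the draft.

```python
import json

# Top-level keys from the envelope above, mapped to the container type each
# one uses there. Treating all of them as mandatory is an assumption of this
# sketch, not something the draft states.
ENVELOPE_KEYS = {
    "schemaVersion": str,
    "objectType": str,
    "experiment": dict,
    "design": dict,
    "variants": list,
    "metrics": list,
    "analysis": dict,
    "results": dict,
    "scorecard": dict,
    "decision": dict,
    "qualityChecks": list,
    "artifacts": list,
    "provenance": dict,
    "extensions": dict,
}


def check_envelope(text: str) -> list[str]:
    """Return a list of problems; an empty list means the envelope looks well-formed."""
    doc = json.loads(text)
    if not isinstance(doc, dict):
        return ["document root must be a JSON object"]
    problems = []
    for key, expected in ENVELOPE_KEYS.items():
        if key not in doc:
            problems.append(f"missing key: {key}")
        elif not isinstance(doc[key], expected):
            problems.append(f"{key} should be of type {expected.__name__}")
    return problems
```

Keys that are not in the table are deliberately left alone, so a document that carries additional fields still passes this check.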

Design principles

Separate planning from outcomes

The standard distinguishes what was planned, what happened, how it was analyzed, what was concluded, and what should be shown to humans.

Snapshot, don't reference

Metric definitions, code versions, and warehouse queries are copied into the document at analysis time, because the originals may have changed by the time the document is reread.
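
One way to picture the snapshot principle is a helper that embeds a full copy of a metric definition, plus a timestamp and a content hash, instead of a link to the live definition. This is a minimal sketch: the function name snapshot_metric and the field names definition, capturedAt, and sha256 are hypothetical and are not taken from the draft.

```python
import hashlib
import json
from datetime import datetime, timezone


def snapshot_metric(metric_id: str, definition: dict) -> dict:
    """Embed a full copy of a metric definition rather than a reference to it."""
    # Canonical serialization so the same definition always hashes the same way.
    canonical = json.dumps(definition, sort_keys=True, separators=(",", ":"))
    return {
        "id": metric_id,
        # Re-parsing gives a deep copy that later edits to the source cannot touch.
        "definition": json.loads(canonical),
        "capturedAt": datetime.now(timezone.utc).isoformat(),  # hypothetical field
        "sha256": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),  # hypothetical field
    }
```

If the live definition is edited after the snapshot is taken, the copy stored in the document keeps the wording that was actually used for the analysis, and the hash gives a cheap way to detect drift.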

Extensible by namespace

Vendor-specific fields live under extensions.*. Importers must safely ignore unknown extensions instead of rejecting documents.
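
The ignore-unknown-extensions rule might look like this on the importer side. This is a sketch under assumptions: the namespace "acme" and the function name read_extensions are invented for illustration, and the draft does not prescribe how an importer stores what it does not understand.

```python
# Namespaces this hypothetical importer understands. Everything else is
# carried along untouched instead of causing a rejection.
KNOWN_NAMESPACES = {"acme"}


def read_extensions(doc: dict) -> tuple[dict, dict]:
    """Split extensions into (understood, passthrough) without raising on unknown names."""
    extensions = doc.get("extensions") or {}
    understood = {}
    passthrough = {}
    for namespace, payload in extensions.items():
        if namespace in KNOWN_NAMESPACES:
            understood[namespace] = payload
        else:
            passthrough[namespace] = payload
    return understood, passthrough
```

Keeping the passthrough part, rather than dropping it, lets the importer write the document back out without losing another vendor's fields.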

Trust is first-class

Sample ratio mismatch, exposure health, invariant metrics, and other quality checks are part of the standard, not an afterthought.
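
To make the first of these checks concrete, here is a minimal sketch of a sample ratio mismatch test for a two-variant experiment, using a chi-square goodness-of-fit test. The output field names (check, statistic, pValue, status) and the 0.001 threshold are assumptions for illustration; the draft does not fix either.

```python
import math


def srm_check(observed: list[int], expected_ratios: list[float], alpha: float = 0.001) -> dict:
    """Chi-square goodness-of-fit test for sample ratio mismatch, two variants only."""
    if len(observed) != 2 or len(expected_ratios) != 2:
        raise ValueError("this sketch handles exactly two variants")
    total = sum(observed)
    if total == 0 or min(expected_ratios) <= 0:
        raise ValueError("need a non-empty sample and positive expected ratios")
    ratio_sum = sum(expected_ratios)
    expected = [total * r / ratio_sum for r in expected_ratios]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With two variants there is one degree of freedom, where the chi-square
    # survival function has the closed form erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return {
        "check": "srm",  # hypothetical field names throughout this dict
        "statistic": chi2,
        "pValue": p_value,
        "status": "fail" if p_value < alpha else "pass",
    }
```

For example, srm_check([5000, 5500], [1, 1]) reports a failure for an intended 50/50 split, while srm_check([5000, 5000], [1, 1]) passes. Experiments with more than two variants need the general chi-square distribution, which this sketch leaves out to stay dependency-free.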

Decisions, not just results

Experiments fail as institutional memory when the result exists but the decision does not. OES makes the decision a first-class object.
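
A small check can make this failure mode visible. The sketch below flags documents that carry results but no usable decision. The four decision values are the ones listed in the MVP summary of this document; the field names outcome and rationale, and the function name decision_gaps, are hypothetical.

```python
# Decision values listed in the MVP summary of this document.
DECISION_VALUES = {"ship", "do-not-ship", "iterate", "rerun"}


def decision_gaps(doc: dict) -> list[str]:
    """List reasons a document would be weak as institutional memory."""
    gaps = []
    decision = doc.get("decision") or {}
    if doc.get("results") and not decision:
        gaps.append("results recorded without a decision")
    if decision:
        # "outcome" and "rationale" are hypothetical field names.
        if decision.get("outcome") not in DECISION_VALUES:
            gaps.append("decision outcome is missing or not a recognized value")
        if not decision.get("rationale"):
            gaps.append("decision has no rationale")
    return gaps
```

An archive tool could run a check like this on import and surface the gaps to the experiment owner while the context is still fresh.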

Bundle, don't fragment

JSON is the canonical manifest. Charts, CSVs, SQL, notebooks, and HTML reports travel alongside it as a research object.
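
One possible packaging of such a bundle is a single zip archive with the manifest at its root. The draft does not say which container to use, so the zip format, the file name manifest.json, the path field, and the function name build_bundle are all assumptions made for this sketch.

```python
import io
import json
import zipfile


def build_bundle(manifest: dict, files: dict[str, bytes]) -> bytes:
    """Pack the JSON manifest and its companion files into one zip archive."""
    # List each companion file in the manifest so the archive describes itself.
    # This replaces any existing "artifacts" entry; "path" is a hypothetical field.
    manifest = {**manifest, "artifacts": [{"path": name} for name in sorted(files)]}
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.json", json.dumps(manifest, indent=2))
        for name in sorted(files):
            archive.writestr(name, files[name])
    return buffer.getvalue()
```

A call such as build_bundle(doc, {"query.sql": b"SELECT 1"}) returns bytes that can be written to disk or uploaded as one object, so the manifest and its supporting files cannot drift apart.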

What's in the spec

Twelve sections, ordered from envelope to artifacts. The MVP covers the fields needed for the 80% of online A/B tests teams run today.

MVP at a glance

For v0.1, we don't try to model every possible statistical method. We start with the fields needed for the 80% case:

  • Experiment identity: ID, title, status, owner, hypothesis, dates
  • Design: type, randomization unit, population, allocation, variants
  • Metrics: definitions with role, direction, window, data source
  • Results: sample sizes, deltas, intervals, p-values or Bayesian probabilities
  • Scorecard: primary outcome, guardrails, overall result, recommended action
  • Decision: ship / do-not-ship / iterate / rerun, with rationale
  • Quality checks: SRM, exposure health, invariants, data freshness
  • Artifacts and provenance: links to charts, SQL, dashboards, commits, source system
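
To give a feel for the size of an MVP document, the sketch below fills in the envelope with one entry per item in the list above. Only the top-level keys come from the envelope shown earlier. Every nested field name and every value is a made-up placeholder, not part of the draft and not real experiment data.

```python
import json

# Top-level keys follow the envelope; all nested names and values are
# illustrative placeholders.
mvp_document = {
    "schemaVersion": "0.1.0",
    "objectType": "experiment",
    "experiment": {
        "id": "exp-001",
        "title": "Placeholder checkout test",
        "status": "completed",
        "owner": "example-team",
        "hypothesis": "Placeholder hypothesis text",
    },
    "design": {
        "type": "ab",
        "randomizationUnit": "user",
        "allocation": {"control": 0.5, "treatment": 0.5},
    },
    "variants": [{"key": "control"}, {"key": "treatment"}],
    "metrics": [
        {"id": "conversion", "role": "primary", "direction": "increase", "window": "7d"}
    ],
    "analysis": {},  # left empty: the MVP list above does not itemize it
    "results": {
        "conversion": {
            "sampleSizes": {"control": 1000, "treatment": 1000},
            "delta": 0.0,
            "interval": [-0.01, 0.01],
        }
    },
    "scorecard": {"primaryOutcome": "neutral", "recommendedAction": "iterate"},
    "decision": {"outcome": "iterate", "rationale": "Placeholder rationale"},
    "qualityChecks": [{"check": "srm", "status": "pass"}],
    "artifacts": [],
    "provenance": {"sourceSystem": "example-platform"},
    "extensions": {},
}

# The canonical form is plain JSON, so serialization needs no custom encoder.
serialized = json.dumps(mvp_document, indent=2)
```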

How it relates to OpenFeature

OpenFeature standardizes how applications evaluate feature flags and associate those evaluations with downstream outcomes. OES standardizes how experiment plans, metrics, results, scorecards, and decisions are exchanged during or after analysis. The two are complementary, not competing.