Open Experiment Standard
A vendor-neutral standard for documenting, exchanging, archiving, and presenting online experiments. Portable between vendors and products.
Why a standard?
Experiment data lives in many tools — GrowthBook, Optimizely, Statsig, internal platforms — but most teams want the same things from it: an archive of past decisions, a way to present results to executives, the ability to migrate between platforms, and a learning repository that outlives any single vendor.
Existing tools standardize how flags are evaluated. OES standardizes how experiments are documented and exchanged — design, metrics, results, decisions, and the trust checks that justify them.
Top-level envelope
A short, predictable structure makes documents safe to parse, even when fields are extended.
{
  "schemaVersion": "0.1.0",
  "objectType": "experiment",
  "experiment": {},
  "design": {},
  "variants": [],
  "metrics": [],
  "analysis": {},
  "results": {},
  "scorecard": {},
  "decision": {},
  "qualityChecks": [],
  "artifacts": [],
  "provenance": {},
  "extensions": {}
}
Design principles
The standard distinguishes what was planned, what happened, how it was analyzed, what was concluded, and what should be shown to humans.
Metric definitions, code versions, and warehouse queries are captured in the document at analysis time, because the live versions may have changed by the time you reread it.
Vendor-specific fields live under extensions.*. Importers must safely ignore unknown extensions instead of rejecting documents.
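As a rough illustration of the "ignore, don't reject" rule, here is a minimal Python sketch of a tolerant importer. The function name load_document, the SUPPORTED_EXTENSIONS set, and the extension namespaces in the usage note are assumptions made for this example; they are not defined by the spec.

```python
import json

# Hypothetical: the extension namespaces this particular importer understands.
SUPPORTED_EXTENSIONS = {"growthbook"}


def load_document(raw: str) -> tuple[dict, list[str]]:
    """Parse an OES document and report, rather than reject, unknown extensions.

    Returns the parsed document unchanged plus the sorted list of extension
    namespaces this importer does not understand.
    """
    doc = json.loads(raw)
    if not isinstance(doc, dict):
        raise ValueError("an OES document must be a JSON object")

    extensions = doc.get("extensions", {})
    # Unknown namespaces are left in place so the document can be
    # re-exported without losing vendor data.
    ignored = sorted(ns for ns in extensions if ns not in SUPPORTED_EXTENSIONS)
    return doc, ignored
```

With a document whose extensions object holds both "growthbook" and an unfamiliar "acme" namespace, load_document returns the full document and reports ["acme"] as ignored. Keeping the unknown block instead of dropping it is a design choice of this sketch: it lets a document round-trip through a tool that does not understand every vendor.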
Sample ratio mismatch, exposure health, invariant metrics, and other quality checks are part of the standard, not an afterthought.
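To make one of these checks concrete, the sketch below runs a two-arm sample ratio mismatch test (a chi-square goodness-of-fit test) using only the Python standard library. The output field names (check, status, statistic, pValue) and the 0.001 threshold are assumptions for illustration, not fields or values defined by the spec.

```python
import math


def srm_check(control_n: int, treatment_n: int,
              expected_control_share: float = 0.5,
              alpha: float = 0.001) -> dict:
    """Two-arm chi-square goodness-of-fit test for sample ratio mismatch."""
    total = control_n + treatment_n
    if total <= 0 or not 0 < expected_control_share < 1:
        raise ValueError("need a positive sample and a share strictly between 0 and 1")

    expected_control = total * expected_control_share
    expected_treatment = total - expected_control
    chi2 = ((control_n - expected_control) ** 2 / expected_control
            + (treatment_n - expected_treatment) ** 2 / expected_treatment)

    # Two arms give one degree of freedom, so the chi-square survival
    # function reduces to erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))

    # Hypothetical shape for an entry in "qualityChecks".
    return {
        "check": "sampleRatioMismatch",
        "status": "fail" if p_value < alpha else "pass",
        "statistic": chi2,
        "pValue": p_value,
    }
```

A perfectly balanced 5,000 / 5,000 split passes, while 5,000 / 5,500 against an intended 50/50 allocation fails at this threshold. Experiments with more than two arms need a general chi-square survival function, which this sketch deliberately leaves out.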
Experiments fail as institutional memory when the result exists but the decision does not. OES makes the decision a first-class object.
JSON is the canonical manifest. Charts, CSVs, SQL, notebooks, and HTML reports travel alongside it as a research object.
What's in the spec
Twelve sections, ordered from envelope to artifacts. The MVP covers the fields needed for the 80% of online A/B tests teams run today.
MVP at a glance
For v0.1, we don't try to model every possible statistical method. We start with the fields needed for the 80% case:
- Experiment identity: ID, title, status, owner, hypothesis, dates
- Design: type, randomization unit, population, allocation, variants
- Metrics: definitions with role, direction, window, data source
- Results: sample sizes, deltas, intervals, p-values or Bayesian probabilities
- Scorecard: primary outcome, guardrails, overall result, recommended action
- Decision: ship / do-not-ship / iterate / rerun, with rationale
- Quality checks: SRM, exposure health, invariants, data freshness
- Artifacts and provenance: links to charts, SQL, dashboards, commits, source system
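The decision entry in the list above could be modeled along these lines. This is a sketch only: the four outcome values come from the list, but the Decision class and the outcome and rationale field names are hypothetical and may not match the spec's actual schema.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(str, Enum):
    # The four outcomes named in the MVP list.
    SHIP = "ship"
    DO_NOT_SHIP = "do-not-ship"
    ITERATE = "iterate"
    RERUN = "rerun"


@dataclass
class Decision:
    outcome: Outcome
    rationale: str

    def __post_init__(self) -> None:
        # Accept plain strings too; Outcome(...) raises ValueError on
        # anything outside the four known values.
        self.outcome = Outcome(self.outcome)
        if not self.rationale.strip():
            raise ValueError("a decision needs a rationale")

    def to_json_dict(self) -> dict:
        """Return a JSON-serializable dict (field names are hypothetical)."""
        return {"outcome": self.outcome.value, "rationale": self.rationale}
```

Requiring a non-empty rationale is a choice made in this sketch, reflecting the idea that a result without a recorded decision and its reasoning is of little use as institutional memory.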
How it relates to OpenFeature
OpenFeature standardizes how applications evaluate feature flags and associate those evaluations with downstream outcomes. OES standardizes how experiment plans, metrics, results, scorecards, and decisions are exchanged during or after analysis. The two are complementary, not competing.