The atomic data platform for modern teams.

Define each pipeline as a single .msh file — ingest, transform, tests, deploy, and lineage in one place.

msh runs each pipeline end-to-end and gives you a local AI assistant that can explain, review, and generate safe changes. msh Cloud for teams is coming soon.

What msh is

msh is a modern data platform built around atomic assets — one .msh file per data asset.

Each file defines ingest, transform, tests, deploy, and lineage end-to-end — replacing scattered scripts, YAML, and DAGs with a single source of truth.

Ingest

API, SQL, files

Transform

SQL or Python

Deploy

Blue/green safety

Tests

Built-in validation

Lineage

Full dependency graph

One asset

One state. One place.

One asset. One file. One place to reason about your pipeline.

No multi-tool sprawl. No hidden orchestration logic. Just atomic assets the platform can run and understand.
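
To make the "one asset, one file" idea concrete, here is a minimal Python sketch of a single record that carries every lifecycle section. It is an illustration only: the class name AtomicAsset, its fields, and the sample values are assumptions made for this example, and it does not show the actual .msh syntax (see the docs for that).

```python
from dataclasses import dataclass, field


@dataclass
class AtomicAsset:
    """Hypothetical in-memory model of one data asset (NOT the .msh format)."""

    name: str
    ingest: str                                         # where the raw data comes from
    transform: str                                      # SQL or Python that shapes it
    tests: list[str] = field(default_factory=list)      # validation rules
    deploy: str = "blue_green"                          # deploy strategy label
    upstream: list[str] = field(default_factory=list)   # lineage: parent assets

    def sections(self) -> list[str]:
        """List the lifecycle sections that are actually filled in."""
        keys = ("ingest", "transform", "tests", "deploy", "upstream")
        return [key for key in keys if getattr(self, key)]


# Sample values are invented for the example.
orders = AtomicAsset(
    name="orders_daily",
    ingest="api:orders",
    transform="SELECT order_id, amount FROM raw_orders",
    tests=["order_id is not null"],
    upstream=["raw_orders"],
)
print(orders.sections())
```

Because everything sits on one object, a tool (or a reviewer) can answer "where does this come from, how is it checked, what depends on it" without opening a second file.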

Why atomic assets

Traditional stacks scatter ingest, transforms, tests, deploys, and lineage across different tools and files.

msh flips this by putting the entire lifecycle of a data asset into a single .msh file.

That makes pipelines easier to reason about, safer to deploy, and easier for AI to understand, because everything about the asset sits together:

ingest
transform
deploy
tests
lineage
glossary terms
schemas

Deterministic deploys, simple rollbacks, and built-in lineage.

Not a plugin or wrapper — an atomic architecture designed so AI and humans can safely reason about it.
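
The deploy and rollback claims above follow the general blue/green pattern, which can be sketched in a few lines of Python. This is a generic illustration of the pattern, not msh's implementation: the class BlueGreenTable, its methods, and the not_empty test are hypothetical names chosen for the example.

```python
class BlueGreenTable:
    """Two slots for one asset; readers only ever see the active slot."""

    def __init__(self, initial_rows):
        self.slots = {"blue": list(initial_rows), "green": []}
        self.active = "blue"

    def _inactive(self):
        return "green" if self.active == "blue" else "blue"

    def read(self):
        return self.slots[self.active]

    def deploy(self, new_rows, tests):
        """Validate the candidate first; swap only if every test passes."""
        candidate = list(new_rows)
        if not all(test(candidate) for test in tests):
            return False              # both slots untouched, readers unaffected
        target = self._inactive()
        self.slots[target] = candidate
        self.active = target          # the swap is a single pointer change
        return True

    def rollback(self):
        """Point readers back at the previous slot."""
        self.active = self._inactive()


def not_empty(rows):
    return len(rows) > 0


table = BlueGreenTable([{"id": 1}])
first = table.deploy([{"id": 1}, {"id": 2}], [not_empty])   # passes, swap happens
second = table.deploy([], [not_empty])                      # fails, nothing changes
rows_after_failed_deploy = table.read()
table.rollback()
rows_after_rollback = table.read()
```

The rollback is deterministic in this sketch because the previous version is never modified by a deploy: going back is only a matter of flipping which slot is active.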

AI for your atomic assets

Because each asset lives in a single structured .msh file, AI can actually understand your pipeline end-to-end.

From the CLI, AI works as a helper on top of the platform — not a chatbot on the side:

Explain any asset

msh ai explain

Generate new assets

msh ai new

Fix broken logic

msh ai fix

Improve tests

msh ai tests

Review for risks

msh ai review

Suggest improvements

msh ai suggest

AI here knows your entire pipeline — ingest, transforms, tests, deploys, lineage, and glossary — because it is all defined in one place.

The Semantic Layer (Cloud)

Teach your AI how your business works.

Define business terms, metrics, dimensions, and policies, and link them to your assets and columns.

Business terms

"Customer", "MRR", etc.

Metrics

Defined and linked

Dimensions

"Region", "Product", etc.

Policies

PII rules, access control

This powers natural-language SQL, smarter AI insights, and safer pipeline changes.

(Coming soon in msh Cloud)
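
As a rough picture of what "linking terms to assets and columns" could mean, the Python sketch below maps business terms to columns and filters out PII before anything reaches a query. Every name here (Term, GLOSSARY, resolve, allowed_columns) and every sample value is an assumption for illustration; the Semantic Layer is not released, and its real model may look different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    name: str           # business term, e.g. "MRR"
    asset: str          # asset the term is linked to
    column: str         # column the term is linked to
    pii: bool = False   # simple stand-in for a policy flag


# Hypothetical glossary entries.
GLOSSARY = [
    Term("Customer", "customers", "customer_id"),
    Term("MRR", "subscriptions", "monthly_recurring_revenue"),
    Term("Email", "customers", "email", pii=True),
]


def resolve(question, glossary=GLOSSARY):
    """Find glossary terms mentioned in a question (naive substring match)."""
    text = question.lower()
    return [term for term in glossary if term.name.lower() in text]


def allowed_columns(terms):
    """Drop columns flagged as PII before they reach a query."""
    return [f"{t.asset}.{t.column}" for t in terms if not t.pii]


hits = resolve("What is MRR per customer, with email?")
print(allowed_columns(hits))
# prints ['customers.customer_id', 'subscriptions.monthly_recurring_revenue']
```

The substring match is deliberately simplistic. The point is only that a shared glossary gives natural-language questions a governed path to concrete columns.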

msh Cloud · Coming soon

The team workspace for msh — a shared view of your atomic assets, lineage, and metrics.

Hosted manifests, lineage, and a Semantic Layer so teams share one governed view of how data is produced and used.

Hosted manifest & lineage

Semantic Layer: glossary & metrics

AI sidekick in the UI

NL→SQL query console

Pre-deploy safety checks

Versioned deploys

Join the beta list to get early access.

msh CLI · Open Source

The developer entry point to the platform — a local-first engine for atomic assets.

Define, run, test, and deploy .msh files without needing any cloud account. Use the CLI today, then plug those atomic pipelines into msh Cloud (coming soon) for team workflows.

Local-first runs

Execute atomic assets on your laptop or in CI.

No cloud required

Great DX without any hosted dependency.

AI-powered workflows

Explain, review, generate, and test from the CLI.

AI commands
msh ai explain
msh ai review
msh ai new
msh ai fix
msh ai tests

Use OpenAI, Anthropic, or local models — msh provides the structured context from your atomic assets.
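
"Structured context" can be pictured as serializing one asset's sections into a single prompt that any model provider can consume. The Python sketch below shows that idea under stated assumptions: the function build_context, the key names, and the sample asset are hypothetical and do not describe how msh actually assembles its prompts.

```python
import json


def build_context(asset, question):
    """Serialize one asset's sections into a single prompt string."""
    ordered_keys = ["name", "ingest", "transform", "tests", "deploy", "lineage"]
    context = {key: asset[key] for key in ordered_keys if key in asset}
    return (
        "You are reviewing one data asset.\n"
        f"Asset definition:\n{json.dumps(context, indent=2)}\n"
        f"Task: {question}"
    )


# Sample asset with invented values.
asset = {
    "name": "orders_daily",
    "ingest": "api:orders",
    "transform": "SELECT order_id, amount FROM raw_orders",
    "tests": ["order_id is not null"],
    "lineage": ["raw_orders"],
}
prompt = build_context(asset, "Explain what this asset produces.")
print(prompt)
```

The resulting string is provider-neutral, so it could be passed to a hosted model or a local one without changes.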

Features

Open Source (CLI)

Atomic assets: one .msh file per data asset
Local-first runs (laptop and CI)
Integrated data tests and contracts
Lineage-aware execution
Blue/green deploy strategies
Instant, deterministic rollbacks
AI commands: explain, review, new, fix, tests
Works with OpenAI, Anthropic, or local models
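
Lineage-aware execution, from the list above, amounts to running assets in dependency order and knowing what a change touches. A minimal Python sketch using the standard library's graphlib follows. The asset names and the helper functions are hypothetical, and this is not a description of the msh engine.

```python
from graphlib import TopologicalSorter

# Hypothetical lineage: each asset maps to the assets it reads from.
lineage = {
    "raw_orders": set(),
    "raw_customers": set(),
    "orders_clean": {"raw_orders"},
    "revenue_by_customer": {"orders_clean", "raw_customers"},
}


def execution_order(graph):
    """Return asset names so every upstream runs before its downstream."""
    return list(TopologicalSorter(graph).static_order())


def downstream_of(graph, asset):
    """Assets affected, directly or transitively, when `asset` changes."""
    affected = set()
    frontier = [asset]
    while frontier:
        current = frontier.pop()
        for name, parents in graph.items():
            if current in parents and name not in affected:
                affected.add(name)
                frontier.append(name)
    return affected


order = execution_order(lineage)
print(order)
print(sorted(downstream_of(lineage, "raw_orders")))
```

The same graph answers two questions: what to run first, and which assets a pre-deploy check should look at when an upstream asset changes.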

Cloud (Optional)

Hosted manifest and lineage graph
Semantic Layer: glossary, metrics, dimensions, policies
NL→SQL console for governed access
AI sidekick in the UI
Pre-deploy safety and impact checks
Versioned environments and deploys
Team ownership, approvals, and audit trails
Optional add-on that runs on top of the CLI

Who it's for

Data engineers

Less glue code, fewer DAGs, one engine for every asset.

Analytics engineers

Models with clear tests, contracts, and lineage in one file.

Founders & CTOs

A serious data platform without assembling a multi-tool stack.

Data & product leaders

Safer changes, consistent metrics, and clear ownership.

Community & docs

Explore the CLI on GitHub, learn the .msh format in the docs, and follow along as the Slack community and example galleries launch.