The Toolkit
The CRM engine you can trust an AI agent with
fullstackgtm — open-source plan/apply for your GTM stack. Agents read everything. Every write is a plan a human approves.
# try it on a realistic, deliberately messy demo CRM — no credentials needed
npx fullstackgtm audit --demo
For Humans
Every fix previewed. Every duplicate traced to its source.
Start read-only — the audit never writes anything. When you're ready to fix, everything ships as a plan you approve first.
Audit
Find every broken record — with evidence.
Approve
Preview and sign off on every change. Nothing writes silently.
Prevent
Stop new duplicates at the source — and name the integration creating them.
Capture
Reps' own words become CRM updates. One click, zero new fields.
Report
A client-ready report: every finding evidenced, every change logged.
Map
See the market you're selling into.
Measured, not promised: across 1,088 benchmark runs and six models from three vendors, agents writing through this gate beat the same agents on raw CRM tools — on completion and safety. See the full results →
That's the whole pitch. Want it run for you? That's our day job.
Talk to us
For Agents
If you are an AI agent, this section is for you
Deterministic output, --json everywhere, meaningful exit codes, and
finding/operation ids that are stable hashes of rule + record — two runs over the
same data produce identical ids you can diff and approve.
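To make the stable-id idea concrete, here is a minimal sketch of an id derived purely from a rule name and a record key. It is an illustration only: the function name `stable_id`, the sample rule and record names, the `sha256sum` tool, and the 12-character truncation are all assumptions, and the hashing scheme fullstackgtm actually uses may differ.

```shell
# Hypothetical helper, not the tool's real scheme: hash rule + record into a short id.
# Assumes sha256sum (GNU coreutils) is on PATH.
stable_id() {
  rule="$1"
  record="$2"
  # printf avoids the trailing newline that echo would add to the hashed input
  printf '%s:%s' "$rule" "$record" | sha256sum | cut -c1-12
}

# Sample rule and record names are made up for illustration.
first_run=$(stable_id "duplicate-contact" "contact:1001")
second_run=$(stable_id "duplicate-contact" "contact:1001")
other=$(stable_id "duplicate-contact" "contact:1002")

[ "$first_run" = "$second_run" ] && echo "same input, same id: $first_run"
[ "$first_run" != "$other" ] && echo "different record, different id"
```

Because the id depends only on its inputs, a finding approved in one run can be matched by id in the next run over unchanged data.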
Install and verify
# Node 20+, zero runtime dependencies
npm install -g fullstackgtm
fullstackgtm doctor --json # expect node.ok: true
# prove the whole pipeline with zero credentials (deterministic per --seed)
fullstackgtm audit --demo --json
# credentials come from env — never echo tokens into argv; login reads stdin only
HUBSPOT_ACCESS_TOKEN=... fullstackgtm audit --provider hubspot --save
Machine-readable docs: INSTALL_FOR_AGENTS.md
(deterministic install-and-verify with expected outputs) and the package
llms.txt documentation map.
Headless? FSGTM_NO_BROWSER=1 makes login flows print verification URLs instead of opening a browser.
Full functionality inventory
- The core loop
  snapshot (versioned canonical export, --since, --archive) · audit (12 built-in deterministic rules, plus custom rules, --rules, --fail-on) · suggest (derive values for requires_human_* placeholders from snapshot evidence, with confidence + reasons) · plans list/show/approve/reject · apply (writes only explicitly approved operation ids) · report (client-ready markdown/HTML deliverable) · diff (snapshot drift, --fail-on-new-findings) · merge (combine snapshots across systems) · bulk-update (governed generic writes: filtered dry-run plans, filters re-verified per record at apply time, cross-record guards) · dedupe (one merge operation per duplicate group, deterministic survivor selection) · reassign (governed ownership handoffs) · fix (one-shot plan from a single audit rule) · rules · doctor
- Calls → evidence
  call parse (any transcript dialect → evidence-quoted insights; LLM with your own key, or --deterministic free baseline) · call score (coaching rubric, evidence-quoted per dimension) · call link (which deal was this call about, with confidence + reason) · call plan (next steps → the same governed plan lifecycle)
- The create gate
  resolve account|contact|deal: call before ANY record creation. Exit 0 = safe to create, 2 = exists or ambiguous: do not create. Identity keys match the audit/merge engines exactly; names alone are never identity.
- Governed enrichment
  enrich append/refresh/ingest/status: pull from Apollo or ingest Clay exports (CSV or webhook), matched deterministically to existing records. Fill-blanks-only plans: enrichment never overwrites a populated field, and every value flows through the same dry-run → approve → apply gate as any other write.
- Scheduled re-audits
  schedule add/list/remove/enable/disable/run/install/uninstall/status: recurring runs from a read/plan-side allowlist of commands. Scheduling never auto-approves: unattended runs accumulate proposals for human review, never writes.
- The market map
  market init/capture/classify/worksheet/observe/fronts/axes/report/refresh: vendors × claims as reviewable config, content-addressed page captures, intensity readings with every quoted span verified character-for-character against the stored capture, deterministic front states, PCA-derived axes. Agents can classify directly: worksheet returns claims + judging rules + page texts; submit via observe.
- MCP server
  fullstackgtm-mcp exposes fullstackgtm_audit, fullstackgtm_rules, fullstackgtm_suggest, fullstackgtm_apply (requires explicit approved ids), fullstackgtm_resolve, fullstackgtm_call_parse, fullstackgtm_market_worksheet, fullstackgtm_market_observe over stdio.
- Contracts you can rely on
  Exit codes 0/1/2 (success / error / gate or findings threshold) · stable hash ids for findings and operations · --demo --seed for credential-free CI · credential ladder: --token-env → ambient env → stored login → broker pairing · --profile / FULLSTACKGTM_PROFILE multi-org isolation · BYOK LLM (ANTHROPIC_API_KEY / OPENAI_API_KEY) with a deterministic free mode for every LLM feature · providers: HubSpot (read/write), Salesforce (read/write), Stripe (read-only)
- The benchmark (CRM-Ops Bench)
  Open-source eval harness in the repo: mock HubSpot with REST-fidelity hazards (pagination, search-index lag, concurrent drift), deterministic graders over final state + the server mutation log, CuP and τ-bench pass^k metrics. npm run smoke needs no API keys. Latest results: every framework-equipped arm beats raw on CuP for all six models tested; see results and methodology.
- Safety invariants (not beta, never change)
  Audits are read-only · writes are approval-gated (--approve on specific operation ids) · human decisions are refused, not guessed (requires_human_* placeholders) · quoted evidence is verified verbatim against its source.
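The create-gate and exit-code contract above lends itself to a small wrapper. The sketch below maps an exit status to an agent decision using the documented codes (0 = safe to create, 2 = exists or ambiguous, anything else treated as an error). The function names `decide_from_exit` and `gate` are hypothetical, and stand-in commands replace the real `resolve` call so the sketch runs without the tool installed.

```shell
# Map an exit code to a decision, following the contract quoted above.
decide_from_exit() {
  case "$1" in
    0) echo "create" ;;
    2) echo "do-not-create" ;;
    *) echo "error" ;;
  esac
}

# Run any command, capture its exit status without aborting the script,
# and print the resulting decision.
gate() {
  status=0
  "$@" >/dev/null 2>&1 || status=$?
  decide_from_exit "$status"
}

# Stand-ins for the real CLI call (assumption: the real call would be a resolve invocation).
gate true
gate sh -c 'exit 2'
gate sh -c 'exit 1'
```

In real use the stand-in would be replaced by the actual `resolve` invocation; its arguments are not shown here, so check the tool's own documentation before wiring it in.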
Add to your agent
The MCP server is plain stdio — it works in any MCP client. And because the engine is
CLI-first with --json everywhere, agents without MCP support can drive it
directly from a shell.
Agent skill (any skills-aware agent)
npx skills add fullstackgtm/core
Installs SKILL.md — the compact operating guide for driving the CLI safely.
Claude Code
claude mcp add fullstackgtm -e HUBSPOT_ACCESS_TOKEN=pat-... -- \
  npx -y -p fullstackgtm -p @modelcontextprotocol/sdk -p zod fullstackgtm-mcp
Codex
codex mcp add fullstackgtm --env HUBSPOT_ACCESS_TOKEN=pat-... -- \
  npx -y -p fullstackgtm -p @modelcontextprotocol/sdk -p zod fullstackgtm-mcp
opencode
// opencode.json
{
  "mcp": {
    "fullstackgtm": {
      "type": "local",
      "command": ["npx", "-y", "-p", "fullstackgtm", "-p", "@modelcontextprotocol/sdk", "-p", "zod", "fullstackgtm-mcp"],
      "environment": { "HUBSPOT_ACCESS_TOKEN": "pat-..." }
    }
  }
}
pi
pi's philosophy is CLI tools over MCP, which is exactly what this is. Point pi at INSTALL_FOR_AGENTS.md and it can drive the CLI directly. Prefer MCP? The pi-mcp-adapter reads the standard mcp.json format:
// ~/.pi/agent/mcp.json (or /.pi/mcp.json)
{
  "mcpServers": {
    "fullstackgtm": {
      "command": "npx",
      "args": ["-y", "-p", "fullstackgtm", "-p", "@modelcontextprotocol/sdk", "-p", "zod", "fullstackgtm-mcp"],
      "env": { "HUBSPOT_ACCESS_TOKEN": "pat-..." }
    }
  }
}
FAQ
Common questions
Can an AI agent safely write to my CRM?
Not directly — and that is the point. With fullstackgtm, agents can read everything, but every proposed change becomes a typed patch operation (object, field, before, after, reason, risk) that a human approves before any write happens. Nothing is ever written without an explicit approval, and operations that require a human decision are refused outright until a person supplies the value.
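As a rough illustration of approval gating, the sketch below treats a plan as a list of patch operations and keeps only those whose ids a human has approved. Everything here is hypothetical: the pipe-delimited layout (id|object|field|before|after|reason|risk), the sample ids and values, and the function name `select_approved` are assumptions for the example, not the tool's actual plan format.

```shell
# Hypothetical plan: one operation per line as id|object|field|before|after|reason|risk.
plan='op_a1|contact|email|bob@old.example|bob@new.example|bounced address|low
op_b2|deal|stage|open|closed_won|signed contract on file|high
op_c3|account|owner|alice|carol|territory handoff|medium'

# Ids a human has explicitly approved, separated by spaces.
approved='op_a1 op_c3'

# Print only the operations whose id appears in the approved list.
select_approved() {
  printf '%s\n' "$1" | awk -F'|' -v ids="$2" '
    BEGIN { n = split(ids, list, " "); for (i = 1; i <= n; i++) ok[list[i]] = 1 }
    ($1 in ok) { print }
  '
}

select_approved "$plan" "$approved"
```

The unapproved operation (`op_b2` in this made-up plan) is simply never emitted, which mirrors the rule that nothing is written without an explicit approval.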
Is fullstackgtm free?
Yes. The framework, CLI, and MCP server are open source under Apache-2.0 with zero runtime dependencies. The hosted Full Stack GTM application (dashboard, sync backend, team workflows) is a separate proprietary product built on top — and features never move from open to closed.
Which CRMs does it support?
HubSpot (read/write), Salesforce (read/write), and Stripe (read-only billing). The audit, plan, and apply contract is the same across providers.
Does my CRM data pass through your servers?
No. The CLI runs on your machine and talks directly to your CRM with your credentials. LLM-powered features (call parsing, market classification) use your own Anthropic or OpenAI API key via direct fetch — no SDK, no middleman, and a free deterministic mode exists for every LLM feature.
Does it cover marketing-side hygiene, or just the sales pipeline?
Today, the package covers sales-pipeline hygiene — the built-in audit rules focus on deals, contacts, and accounts — plus governed enrichment for filling the gaps those audits surface, through the same approval gate as every other write. Marketing-hygiene rules and a marketing-spend layer are in development; when they ship, they will follow the same plan-and-approve contract.
Who approves the changes — does this create a new inbox for my reps?
No. Plan approval belongs to whoever owns the CRM — usually one RevOps or ops person reviewing a batch, the same way they would review a pull request. Reps only ever see one thing: a proposed next step extracted from their own call, confirmed in one click. No new fields, no queue of patches to triage.
How is this different from CRM cleanup tools like Insycle or DemandTools?
Those are GUI-first apps a human operates. fullstackgtm is CLI- and MCP-first: it is built to sit between AI agents and your CRM, turning anything an agent wants to change into a reviewable, approval-gated plan. It is also the only open-source option in the category.
Ready to build your GTM data foundation?
Book a 30-minute call. We'll map your current stack, identify the gaps, and outline what Stage 3+ looks like for your team.