Salesforce Data Cleanup: A Practical Playbook for RevOps

Ryan Iyengar, CEO, Full Stack GTM

Salesforce orgs don’t get dirty randomly. They get dirty in predictable places: duplicates that slipped past (or predate) your duplicate rules, opportunities with close dates from two quarters ago, records still owned by people who left last year. The general method is the same one we use everywhere — snapshot, audit, reviewed fixes, prevention — written up in the step-by-step CRM cleanup process. This guide is the Salesforce-specific version: which native features do what, where they fall short, and the order of operations that keeps a cleanup from becoming an incident.

Step 1: Snapshot before you touch anything

Before any cleanup work starts, export the objects you’re going to touch — at minimum Account, Contact, and Opportunity, plus whatever custom objects matter to your pipeline.

Two practical options:

  • Data Export Service (Setup → Data Export) gives you scheduled full exports as zipped CSVs. Turn on the weekly schedule if it isn’t already running (some editions only offer a monthly export, in which case schedule that); a standing scheduled export is the cheapest insurance an org can have.
  • Data Loader extracts give you targeted, on-demand exports. Before a cleanup, pull a full extract of each object you’ll modify, including record IDs and every field you might change.

The snapshot does three jobs: it is a rollback path if a merge or mass update goes wrong, a baseline to diff future audits against, and an honest accounting when someone asks why the pipeline number moved. Skipping this step is how cleanups become postmortems.
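To make the "baseline to diff against" job concrete, here is a minimal Python sketch that compares two exports of the same object keyed on record ID. It assumes each export is a CSV with an Id column, as Data Loader extracts normally include; the function names and the sample field list are illustrative, not part of any Salesforce tool.

```python
import csv
import io


def load_snapshot(csv_text, key="Id"):
    """Index exported rows by record ID so two snapshots can be compared."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(csv_text))}


def diff_snapshots(before, after, fields):
    """List (record_id, field, old, new) for each change between snapshots.

    Records present in `before` but absent from `after` (merged or deleted)
    are reported with field=None.
    """
    changes = []
    for record_id, old_row in before.items():
        new_row = after.get(record_id)
        if new_row is None:
            changes.append((record_id, None, "present", "missing"))
            continue
        for field in fields:
            if old_row.get(field) != new_row.get(field):
                changes.append(
                    (record_id, field, old_row.get(field), new_row.get(field))
                )
    return changes


# Hypothetical sample data standing in for two Opportunity extracts.
before_csv = "Id,StageName,Amount\n006A,Prospecting,5000\n006B,Negotiation,12000\n"
after_csv = "Id,StageName,Amount\n006A,Closed Lost,5000\n"

for change in diff_snapshots(
    load_snapshot(before_csv), load_snapshot(after_csv), ["StageName", "Amount"]
):
    print(change)
```

The output is the "why did the pipeline number moved" answer in list form: one line per record and field, with the value before and after.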

Step 2: Audit with reports, not scrolling

Salesforce’s report builder can run most of the audit natively. Build these as saved reports — the second run is where you learn whether the cleanup stuck. (The full rule set is in the CRM audit checklist; these are the Salesforce-native versions of the highest-value checks.)

  • Opportunities with close dates in the past. Open opportunities, Close Date less than today. Every record on this report is distorting your forecast right now.
  • Opportunities with no next step. Open opportunities where Next Step is blank. If your org uses a custom field for this instead of the standard one, filter on that.
  • Opportunities with no recent activity. Filter on Last Activity older than your threshold; 30 days is a reasonable default for sales cycles under 90 days, so tune it to yours. Last Activity reflects logged tasks and events, so this report is only as honest as your activity capture. If reps work outside Salesforce and nothing logs for them, fix that first or this report will flag your best-run deals.
  • Records owned by inactive users. For each object, filter on the owner’s Active flag being false (in SOQL terms, Owner.IsActive = false). Run it for Accounts, Contacts, Opportunities, and Cases. More on why this report should be permanent in Step 5.
  • Records effectively without owners. Salesforce requires an owner on records, so true ownerless records are rare. But Leads and Cases sitting in queues nobody works are functionally ownerless, and accounts owned by an integration user or a long-departed admin are the Account-object equivalent. List them.

Each report is a finding queue with evidence attached — which record, which field, which value. That’s what makes the cleanup reviewable instead of arguable.
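The same checks can be run against an exported CSV when you want the findings as data rather than as a report. The sketch below is one way to do that in Python, under stated assumptions: the rows come from an Opportunity extract with the standard API field names (IsClosed, CloseDate, NextStep, LastActivityDate), dates are in YYYY-MM-DD form, booleans arrive as the strings "true"/"false", and the owner flag was exported as a column named Owner.IsActive. Treating a blank Last Activity as stale is a choice made here, not a Salesforce rule.

```python
from datetime import date, timedelta


def parse_date(value):
    """Parse a YYYY-MM-DD string; blank or missing values become None."""
    return date.fromisoformat(value) if value else None


def audit_opportunity(row, today, stale_after_days=30):
    """Return (field, value, finding) tuples for one open opportunity row."""
    findings = []
    if row.get("IsClosed", "").lower() == "true":
        return findings  # closed deals are out of scope for these checks

    close_date = parse_date(row.get("CloseDate"))
    if close_date and close_date < today:
        findings.append(("CloseDate", row["CloseDate"], "close date in the past"))

    if not row.get("NextStep", "").strip():
        findings.append(("NextStep", "", "no next step"))

    # Assumption: no logged activity at all counts as stale.
    last_activity = parse_date(row.get("LastActivityDate"))
    cutoff = today - timedelta(days=stale_after_days)
    if last_activity is None or last_activity < cutoff:
        findings.append(
            ("LastActivityDate", row.get("LastActivityDate", ""), "no recent activity")
        )

    if row.get("Owner.IsActive", "").lower() == "false":
        findings.append(("Owner.IsActive", "false", "owned by inactive user"))
    return findings


# Hypothetical row for illustration.
row = {
    "Id": "006A", "IsClosed": "false", "CloseDate": "2024-01-15",
    "NextStep": "", "LastActivityDate": "2024-03-01", "Owner.IsActive": "false",
}
for finding in audit_opportunity(row, today=date(2024, 6, 1)):
    print(row["Id"], finding)
```

Each tuple carries the record, the field, and the value, which is the evidence trail the reports give you.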

Step 3: Duplicates — what matching rules do and don’t solve

Salesforce splits duplicate management into two pieces, and the distinction matters:

  • Matching rules define what counts as a match. Salesforce ships standard rules — fuzzy matching on account name, contact matching that leans on email and name — and you can build custom rules with exact or fuzzy logic per field. The matching rule is pure comparison logic; it takes no action by itself.
  • Duplicate rules define what happens when a match is found. Alert the user and let them save anyway, block the save outright, or allow it silently while reporting the match. A duplicate rule references one or more matching rules and applies the action on create or edit.

This design is good at the point of entry. A sensible baseline — alert on contact email matches, alert or block on fuzzy account-name matches — stops most new duplicates at the door.

Where it’s weak is retroactive cleanup at scale. Duplicate rules fire when records are created or edited, so duplicates already sitting in the org mostly stay invisible to them. For the backlog, you have two routes:

  1. Reports plus careful merges. Find duplicate sets via reports (exact email matches, accounts sharing a website domain, similar names) or duplicate record sets where your rules have flagged them, then merge deliberately. The native merge handles up to three records at a time and makes you choose surviving field values — the right amount of friction for small batches, impractical for large ones.
  2. Tooling for the backlog. Past a few hundred duplicate sets, you want something that proposes merges in bulk with survivorship rules you review before applying. Whatever you use, the rule stands: no silent bulk merges. Our general approach — match keys, survivorship, review gates — is in the deduplication guide.

Sequence matters: fix the duplicate sources (integrations, list imports) and turn on point-of-entry rules before burning weeks on the backlog, or the backlog refills while you work.
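For the report-driven route, candidate duplicate sets can be proposed from an export before anyone opens the merge screen. This is a minimal sketch assuming Contact rows with an Email column and Account rows with a Website column; the normalization rules (lowercasing, stripping a leading "www.") are illustrative match keys, not Salesforce's matching-rule logic, and the output is a list to review, not a merge.

```python
from collections import defaultdict
from urllib.parse import urlparse


def email_key(contact):
    """Match key for contacts: trimmed, lowercased email (None if blank)."""
    email = contact.get("Email", "").strip().lower()
    return email or None


def domain_key(account):
    """Match key for accounts: website host without scheme, path, or 'www.'."""
    website = account.get("Website", "").strip().lower()
    if not website:
        return None
    if "//" not in website:
        website = "//" + website  # lets urlparse treat a bare domain as a host
    host = urlparse(website).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return host or None


def duplicate_sets(rows, key_fn):
    """Group record IDs by match key and keep only groups with 2+ records."""
    groups = defaultdict(list)
    for row in rows:
        key = key_fn(row)
        if key:
            groups[key].append(row["Id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Hypothetical sample rows.
accounts = [
    {"Id": "001A", "Website": "https://www.acme.com"},
    {"Id": "001B", "Website": "acme.com/about"},
    {"Id": "001C", "Website": ""},
]
print(duplicate_sets(accounts, domain_key))
```

Blank keys are skipped on purpose: grouping every account with no website into one "duplicate set" is exactly the kind of silent bulk merge to avoid.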

Step 4: Validation rules and stage gates — prevention, with a tradeoff

The cheapest record to clean is the one that was never created dirty. Salesforce gives you two prevention levers:

  • Validation rules that enforce conditions on save — for example, an opportunity can’t move past qualification without an Amount, or can’t be marked Closed Won without the fields finance needs.
  • Required fields at stage gates, so each pipeline stage demands the data that stage actually depends on.

Used well, these encode your process into the system instead of into tribal knowledge. But there’s a tradeoff we see in almost every mature org: when gates are too strict, reps don’t supply better data — they supply workaround data. “TBD” in required text fields, $1 amounts, close dates pushed to the same arbitrary Friday. The validation rule reports green while the data quietly gets worse, which is more dangerous than visibly missing data.

The discipline: gate only on fields someone downstream genuinely consumes, write error messages specific enough that the rep knows what to enter and why, and audit for workaround patterns (clusters of identical values in required fields). A gate nobody games is a gate that asks for something reasonable.
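Auditing for workaround patterns can be as simple as counting repeated values in a gated field. The sketch below assumes rows from an export and flags any value that accounts for a large share of the non-blank entries; the thresholds (at least 3 occurrences and 20% of values) are arbitrary starting points to tune, and the check only makes sense for free-text, amount, or date fields where heavy repetition is suspicious, not for picklists.

```python
from collections import Counter


def workaround_clusters(rows, field, min_share=0.2, min_count=3):
    """Return (value, count) pairs that dominate a field's non-blank values."""
    values = [row.get(field, "").strip().lower() for row in rows]
    values = [value for value in values if value]
    if not values:
        return []
    counts = Counter(values)
    return [
        (value, count)
        for value, count in counts.most_common()
        if count >= min_count and count / len(values) >= min_share
    ]


# Hypothetical sample: four $1 placeholder amounts among ten opportunities.
rows = [{"Amount": "1"}] * 4 + [{"Amount": str(5000 + i)} for i in range(6)]
print(workaround_clusters(rows, "Amount"))
```

A cluster is a prompt to look at the gate, not proof of gaming: a shared close date can also be a genuine quarter end.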

Step 5: Ownership cleanup after departures

Here’s the Salesforce behavior that surprises people: deactivating a user does not reassign their records. Accounts, Contacts, Opportunities, and open activities keep the inactive user as owner until someone explicitly transfers them. Assignment rules and territory logic generally won’t touch existing records either — they act on new or re-routed ones. So every departure without a reassignment step leaves a slice of the database frozen: deals nobody is working, accounts nobody is renewing, leads aging in a ghost’s name.

The cleanup is mechanical — mass transfer tools or a Data Loader update on OwnerId, moving records to the right successor, territory owner, or round-robin pool. The harder part is making it permanent:

  • Put record reassignment on the offboarding checklist, owned by RevOps, executed at deactivation.
  • Where you have territories or round-robin assignment, route the departed rep’s book through the same logic new records get, rather than dumping it on one manager.
  • Keep the “owned by inactive users” report as a standing report, reviewed monthly. It’s a smoke detector: if it ever shows records again, your offboarding process has a hole, and you found it in weeks instead of at pipeline review.
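The mechanical part of the transfer can be sketched as building a reassignment plan from an export. This assumes rows that include Id, OwnerId, and an Owner.IsActive column exported as "true"/"false", plus a list of successor user IDs you supply; plain round-robin stands in here for whatever territory or assignment logic your org actually uses.

```python
from itertools import cycle


def build_reassignment(rows, successor_ids):
    """Propose a new owner for every record owned by an inactive user.

    Returns rows with Id, OldOwnerId, and OwnerId. Only Id and OwnerId would
    be mapped in the update; OldOwnerId is kept so the plan doubles as a
    record of what the owner was before.
    """
    if not successor_ids:
        raise ValueError("at least one successor is required")
    pool = cycle(successor_ids)
    plan = []
    for row in rows:
        if row.get("Owner.IsActive", "").lower() != "false":
            continue  # leave records with active owners alone
        plan.append(
            {"Id": row["Id"], "OldOwnerId": row["OwnerId"], "OwnerId": next(pool)}
        )
    return plan


# Hypothetical IDs for illustration.
rows = [
    {"Id": "001A", "OwnerId": "005GONE", "Owner.IsActive": "false"},
    {"Id": "001B", "OwnerId": "005HERE", "Owner.IsActive": "true"},
    {"Id": "001C", "OwnerId": "005GONE", "Owner.IsActive": "false"},
]
for line in build_reassignment(rows, ["005X", "005Y"]):
    print(line)
```

The plan is a proposal: review it, then load it in batches like any other mass update.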

Step 6: Mass updates without incidents

Every cleanup ends in mass changes — reassignments, field corrections, stage fixes, archive flags. The procedure that keeps them boring:

  1. Export a backup CSV first, every time, containing record IDs and the current values of every field you’re about to change. It’s the only rollback you have.
  2. Make the change from a CSV keyed on record ID using Data Loader or Dataloader.io, not by editing filtered list views by hand.
  3. Work in batches, smallest first. Run a pilot batch, spot-check the results, then proceed. Batches turn an org-wide mistake into a contained one.
  4. Keep the operation log. Data Loader writes success and error files for every run — keep them with the backup CSV as the complete record of what changed, what failed, and what the values were before.

This is the same propose-review-apply rhythm from the general cleanup process, expressed in CSVs: the backup is the snapshot, the load file is the proposed change, and the spot-checked pilot batch is the review.
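Two small helpers cover the batching and rollback halves of that procedure. This is a sketch under assumptions: the load file and backup are already in memory as lists of dicts keyed on Id, and the pilot and batch sizes are arbitrary placeholders rather than Data Loader settings.

```python
def make_batches(rows, pilot_size=10, batch_size=500):
    """Split a load file into a small pilot batch followed by regular batches."""
    pilot, rest = rows[:pilot_size], rows[pilot_size:]
    batches = [pilot] if pilot else []
    batches += [rest[i:i + batch_size] for i in range(0, len(rest), batch_size)]
    return batches


def rollback_rows(backup_rows, applied_ids, fields):
    """Build a restore file: original values for the records that were updated."""
    applied = set(applied_ids)
    return [
        {"Id": row["Id"], **{field: row[field] for field in fields}}
        for row in backup_rows
        if row["Id"] in applied
    ]


# Hypothetical load file of 25 owner changes.
load_file = [{"Id": f"001{i:03d}", "OwnerId": "005NEW"} for i in range(25)]
print([len(batch) for batch in make_batches(load_file, pilot_size=5, batch_size=10)])
```

If the pilot batch looks wrong, `rollback_rows` applied to the backup and the IDs from that batch's success file gives you a load file that puts the old values back.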

Keeping it clean

Re-run the audit reports on a schedule — weekly for pipeline checks, monthly for ownership and duplicates — and watch the trend per report rather than a single health score. A report that was empty and isn’t anymore is pointing at a broken process, and that’s the thing worth fixing.
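Watching the trend per report can be as light as logging each report's row count per run and flagging the ones that were empty and no longer are. A minimal sketch, assuming you record the counts yourself (the report names and numbers below are made up):

```python
def regressions(history):
    """Return report names whose latest count is non-zero after an empty run.

    `history` maps a report name to its row counts, oldest first.
    """
    flagged = []
    for name, counts in history.items():
        if len(counts) >= 2 and counts[-2] == 0 and counts[-1] > 0:
            flagged.append(name)
    return flagged


# Hypothetical run history, one count per scheduled run.
history = {
    "Owned by inactive users": [42, 0, 0, 7],
    "Open opps with past close date": [130, 85, 40],
    "Open opps with no next step": [12, 0, 0],
}
print(regressions(history))
```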

If you’d rather not run this by hand, the open-source fullstackgtm CLI connects to Salesforce with read and write access, runs these checks as deterministic audit rules, and turns the fixes into patch plans you approve before anything is applied — the same snapshot, audit, and review-gate discipline described above, automated.

Frequently asked questions

What's the difference between Salesforce duplicate rules and matching rules?

Matching rules define what counts as a match — the fields and logic (exact or fuzzy) used to compare records, like the standard fuzzy account-name rule or exact contact email. Duplicate rules define what happens when a match is found: alert the user, block the save, or just report it. A duplicate rule always references one or more matching rules; they only work together.

How do I find Salesforce records owned by inactive users?

Build a report on the object you care about and filter on the owner's active flag — in report filters this surfaces as a field like Owner: Active equals False, or in SOQL as Owner.IsActive = false. We recommend saving this as a standing report per object, not a one-time check, because every departure quietly creates more of these records.

What happens to a user's records when they're deactivated in Salesforce?

Nothing, by default. Deactivating a user does not reassign their records — Accounts, Contacts, Opportunities, and open activities keep the inactive user as owner until someone explicitly transfers them. That's why orphaned records accumulate silently after departures and why ownership reassignment belongs on the offboarding checklist.

What's the best way to mass-update records in Salesforce?

Use Data Loader (or a hosted equivalent like Dataloader.io) and always export a backup of the affected records first, including the record IDs and the fields you're changing. Run the update from a CSV keyed on ID, work in batches rather than one giant job, and keep the success and error log files. The backup CSV is your rollback path if the update goes wrong.

Can Salesforce duplicate rules clean up existing duplicates?

Not really. Duplicate rules fire at point of entry — when a record is created or edited — so they prevent new duplicates but do little for the ones already in the database. Retroactive cleanup means finding duplicates via reports or duplicate record sets and merging them deliberately; the native merge tool handles up to three records at a time, so large backlogs need batching or dedicated tooling.
