How to Remove Duplicates in SQL
Use duplicate-removal patterns when multiple rows represent the same logical record and only one should remain. The safest workflow is identify duplicates first, preview what will be deleted, then remove extras deterministically.
What you’ll build or solve
You’ll learn how to remove duplicates in SQL by keeping one row per duplicate group and deleting the rest. You’ll also know how to preview duplicates safely before deletion.
Learn SQL on Mimo
When this approach works best
This approach is the right choice when duplicate rows are accidental and one canonical row should stay.
Common real-world scenarios include:
- Repeated email signups
- Duplicate imported products
- Multiple identical log rows
- CRM merge cleanup
- Migration overlap records
This is a bad idea when repeated rows are legitimate events, such as multiple orders from the same customer.
Prerequisites
You only need:
- A SQL table that contains duplicates
- A stable unique key such as
id - Basic
SELECT,GROUP BY, andDELETEknowledge
Step-by-step instructions
Step 1: Preview duplicate groups first
Start by identifying duplicate groups.
SQL
SELECT email, COUNT(*) AS duplicate_count
FROM users
GROUP BY email
HAVING COUNT(*) > 1;
This shows which values repeat.
Next, preview the rows you plan to keep versus remove.
SQL
SELECT *
FROM users
WHERE email = 'alex@example.com'
ORDER BY created_at;
A safe delete pattern is keeping the lowest id per duplicate group.
SQL
DELETE FROM users
WHERE id NOT IN (
SELECT MIN(id)
FROM users
GROUP BY email
);
This keeps one row per email and removes the rest.
What to look for:
- Always preview duplicates first
- Keep one deterministic row
MIN(id)is a safe keep rule- Great for import cleanup
- Add unique constraints afterward to prevent repeats
Examples you can copy
Remove duplicate emails
SQL
DELETE FROM users
WHERE id NOT IN (
SELECT MIN(id)
FROM users
GROUP BY email
);
Duplicate SKUs
SQL
DELETE FROM products
WHERE id NOT IN (
SELECT MIN(id)
FROM products
GROUP BY sku
);
Duplicate session tokens
SQL
DELETE FROM sessions
WHERE id NOT IN (
SELECT MIN(id)
FROM sessions
GROUP BY token
);
Common mistakes and how to fix them
Mistake 1: Deleting before previewing
What the reader might do:
SQL
DELETE FROM users ...
Why it breaks: a wrong duplicate key can remove valid distinct rows.
Corrected approach:
Run the GROUP BY ... HAVING preview first.
Mistake 2: Keeping arbitrary rows without a rule
What the reader might do:
SQL
DELETE FROM users
WHERE email IN (...)
Why it breaks: the canonical record may be deleted.
Corrected approach:
Use MIN(id), MAX(created_at), or another clear keep rule.
Mistake 3: Not preventing future duplicates
What the reader might do:
Clean duplicates once and stop there.
Why it breaks: duplicates can reappear after the next import.
Corrected approach:
Add a unique index or constraint after cleanup.
Troubleshooting
If too many rows are deleted, verify the duplicate grouping column.
If the wrong row remains, change the keep rule.
If duplicates return, add a unique constraint.
If performance is slow, index the duplicate key column.
Quick recap
- Find duplicates with
GROUP BY ... HAVING - Preview rows before deleting
- Keep one deterministic row
- Delete the rest safely
- Add unique constraints afterward
Join 35M+ people learning for free on Mimo
4.8 out of 5 across 1M+ reviews
Check us out on Apple AppStore, Google Play Store, and Trustpilot