TIL: Resolving a PostgreSQL Deadlock with Concurrent
February 12, 2023
Recently, I encountered a PostgreSQL deadlock in a production environment while performing concurrent inserts using the INSERT INTO ... ON CONFLICT
clause. At first glance, this was surprising because one of the key reasons for using the ON CONFLICT
syntax is to avoid conflicts and handle concurrency gracefully.
Here's what happened, why it occurred, and how I resolved it.
Understanding the Deadlock
My scenario was fairly straightforward:
- Multiple parallel jobs inserting records into the same
accounts
table. - Each job used the
INSERT INTO ... ON CONFLICT
syntax to either create new accounts with a zero balance or increment existing balances by one dollar based on the accounts' unique IDs.
The conflicting SQL queries looked something like this:
Transaction A:
INSERT INTO accounts (account_id, amount)
VALUES (1, 0), (2, 0)
ON CONFLICT (account_id) DO UPDATE SET amount = accounts.amount + 1;
Transaction B:
INSERT INTO accounts (account_id, amount)
VALUES (2, 0), (1, 0)
ON CONFLICT (account_id) DO UPDATE SET amount = accounts.amount + 1;
Why Did the Deadlock Occur?
At first glance, PostgreSQL's ON CONFLICT
seems safe from deadlocks, but it’s not foolproof. Under heavy concurrent load, two transactions might concurrently detect conflicts, attempt to update, and then wait on each other's locks, resulting in a classic deadlock situation:
- Transaction A attempts to insert/update
account_id = 1
first and then waits to update the conflicting rowaccount_id = 2
, which is locked by Transaction B. - Transaction B attempts to insert/update
account_id = 2
first and then waits for Transaction A’s lock onaccount_id = 1
.
This cyclic waiting creates a deadlock.
Detecting the Issue
Postgres automatically detects and terminates one of the deadlocked transactions, logging an error similar to:
ERROR: deadlock detected
DETAIL: Process 1234 waits for ShareLock on transaction 5678; blocked by process 4321.
Reviewing the logs immediately clarified the issue.
Resolving the Deadlock
To fix this, I needed to adjust my approach to handling concurrency:
- Batch operations: Group multiple inserts/updates into a single transaction carefully, reducing the locking window.
- Ordered inserts: Ensure consistent ordering (e.g., always ascending by
account_id
), reducing cyclic waiting scenarios. - Retry logic: Implement application-level retry logic after catching deadlock errors.
Final Thoughts
Even powerful and well-designed features like PostgreSQL's ON CONFLICT
have their caveats, particularly around concurrency. Always monitor your database carefully and have strategies ready for handling these edge cases. Understanding the underlying mechanics helps in creating robust, resilient systems.