PC

TIL: Resolving a PostgreSQL Deadlock with Concurrent

February 12, 2023

Recently, I encountered a PostgreSQL deadlock in a production environment while performing concurrent inserts using the INSERT INTO ... ON CONFLICT clause. At first glance, this was surprising because one of the key reasons for using the ON CONFLICT syntax is to avoid conflicts and handle concurrency gracefully.

Here's what happened, why it occurred, and how I resolved it.


Understanding the Deadlock

My scenario was fairly straightforward:

  • Multiple parallel jobs inserting records into the same accounts table.
  • Each job used the INSERT INTO ... ON CONFLICT syntax to either create new accounts with a zero balance or increment existing balances by one dollar based on the accounts' unique IDs.

The conflicting SQL queries looked something like this:

Transaction A:

INSERT INTO accounts (account_id, amount)
VALUES (1, 0), (2, 0)
ON CONFLICT (account_id) DO UPDATE SET amount = accounts.amount + 1;

Transaction B:

INSERT INTO accounts (account_id, amount)
VALUES (2, 0), (1, 0)
ON CONFLICT (account_id) DO UPDATE SET amount = accounts.amount + 1;

Why Did the Deadlock Occur?

At first glance, PostgreSQL's ON CONFLICT seems safe from deadlocks, but it’s not foolproof. Under heavy concurrent load, two transactions might concurrently detect conflicts, attempt to update, and then wait on each other's locks, resulting in a classic deadlock situation:

  1. Transaction A attempts to insert/update account_id = 1 first and then waits to update the conflicting row account_id = 2, which is locked by Transaction B.
  2. Transaction B attempts to insert/update account_id = 2 first and then waits for Transaction A’s lock on account_id = 1.

This cyclic waiting creates a deadlock.


Detecting the Issue

Postgres automatically detects and terminates one of the deadlocked transactions, logging an error similar to:

ERROR:  deadlock detected
DETAIL:  Process 1234 waits for ShareLock on transaction 5678; blocked by process 4321.

Reviewing the logs immediately clarified the issue.


Resolving the Deadlock

To fix this, I needed to adjust my approach to handling concurrency:

  • Batch operations: Group multiple inserts/updates into a single transaction carefully, reducing the locking window.
  • Ordered inserts: Ensure consistent ordering (e.g., always ascending by account_id), reducing cyclic waiting scenarios.
  • Retry logic: Implement application-level retry logic after catching deadlock errors.

Final Thoughts

Even powerful and well-designed features like PostgreSQL's ON CONFLICT have their caveats, particularly around concurrency. Always monitor your database carefully and have strategies ready for handling these edge cases. Understanding the underlying mechanics helps in creating robust, resilient systems.