Pessimistic or Optimistic Concurrency Control? Lessons Learned from Real-World Customer Scenarios
Selecting the right concurrency control strategy, Pessimistic Concurrency Control (PCC) or Optimistic Concurrency Control (OCC), is one of the toughest decisions you’ll make when designing a transactional database. In theory, OCC can provide high concurrency with minimal locking overhead, while PCC offers predictable performance by acquiring locks before modifying data. But real-world conditions rarely match neat theoretical models.
In the early versions of TiDB, we relied solely on OCC because we believed it would suffice for our customers. However, as we engaged more deeply with customers across diverse industries and use cases, it became clear that our assumptions did not always hold up in production. Ultimately, these insights led us to shift from OCC to PCC in TiDB (starting with version 3.0.8), offering customers a more stable and predictable concurrency experience.
In what follows, I’ll share with you the lessons we’ve learned from our customers and why we offer PCC support.
Quick Introduction to PCC vs. OCC
Before we start, here is a brief introduction to OCC and PCC, along with their pros and cons.
Optimistic Concurrency Control (OCC):
- How It Works: Transactions run freely and only check for conflicts at commit time. Conflicts trigger rollbacks and retries.
- Pros: High concurrency in low-conflict scenarios; no need to acquire locks upfront.
- Cons: Unpredictable rollbacks under contention; complex retry logic; wasted work if conflicts emerge late in the transaction.
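To make the commit-time check described above concrete, here is a minimal single-threaded sketch of optimistic validation. It is a toy model and not how TiDB implements OCC: `OccStore`, `OccTxn`, and `ConflictError` are hypothetical names invented for this illustration, and the store validates only the keys a transaction has read.

```python
class ConflictError(Exception):
    """Raised when commit-time validation finds that a read key has changed."""


class OccStore:
    """Toy versioned key-value store (hypothetical, for illustration only)."""

    def __init__(self):
        self._data = {}  # key -> (value, version)

    def begin(self):
        return OccTxn(self)


class OccTxn:
    def __init__(self, store):
        self._store = store
        self._read_versions = {}  # key -> version seen at first read
        self._writes = {}         # buffered writes, applied only at commit

    def get(self, key):
        if key in self._writes:
            return self._writes[key]
        value, version = self._store._data.get(key, (None, 0))
        self._read_versions.setdefault(key, version)
        return value

    def put(self, key, value):
        # No lock is taken here; the write is only buffered.
        self._writes[key] = value

    def commit(self):
        # Validation: every key we read must still be at the version we saw.
        for key, seen in self._read_versions.items():
            _, current = self._store._data.get(key, (None, 0))
            if current != seen:
                raise ConflictError(key)
        # Validation passed: apply buffered writes and bump versions.
        for key, value in self._writes.items():
            _, current = self._store._data.get(key, (None, 0))
            self._store._data[key] = (value, current + 1)


store = OccStore()
seed = store.begin()
seed.put("stock", 10)
seed.commit()

t1, t2 = store.begin(), store.begin()
t1.put("stock", t1.get("stock") - 1)
t2.put("stock", t2.get("stock") - 1)
t1.commit()  # succeeds
try:
    t2.commit()  # all of t2's work is discarded only now, at commit time
except ConflictError:
    pass  # the application has to notice this and retry
```

Note that `t2` did all of its work before learning that it had lost the race, which is the "wasted work" listed under the cons.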
Pessimistic Concurrency Control (PCC):
- How It Works: Transactions acquire locks before making changes, preventing others from altering the same data simultaneously.
- Pros: Predictable performance; fewer late-stage rollbacks; simpler code paths for handling concurrency.
- Cons: Potential lock contention and overhead; may reduce raw concurrency in read-heavy, low-contention workloads.
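For contrast, the lock-before-write behavior described above can be sketched with per-key locks. Again, this is a toy under stated assumptions rather than TiDB's implementation: `PccStore`, `PccTxn`, and `LockTimeout` are hypothetical names, and there is no deadlock detection, only a wait timeout.

```python
import threading


class LockTimeout(Exception):
    """Raised when a key lock cannot be acquired within the timeout."""


class PccStore:
    """Toy key-value store with one lock per key (hypothetical)."""

    def __init__(self):
        self._data = {}
        self._locks = {}
        self._guard = threading.Lock()  # protects the lock table itself

    def _lock_for(self, key):
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def begin(self):
        return PccTxn(self)


class PccTxn:
    def __init__(self, store):
        self._store = store
        self._held = {}    # key -> lock held by this transaction
        self._writes = {}  # buffered writes, applied at commit

    def _lock(self, key, timeout):
        if key in self._held:
            return
        lock = self._store._lock_for(key)
        # Contention shows up here, as waiting, not as a failure at commit.
        if not lock.acquire(timeout=timeout):
            raise LockTimeout(key)
        self._held[key] = lock

    def get_for_update(self, key, timeout=1.0):
        self._lock(key, timeout)
        return self._writes.get(key, self._store._data.get(key))

    def put(self, key, value, timeout=1.0):
        self._lock(key, timeout)
        self._writes[key] = value

    def commit(self):
        self._store._data.update(self._writes)
        self._release()

    def rollback(self):
        self._writes.clear()
        self._release()

    def _release(self):
        for lock in self._held.values():
            lock.release()
        self._held.clear()
```

In this sketch a second transaction touching the same key waits (or times out) at `put`, before doing any further work, instead of discovering the conflict at `commit`.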
What We Learned from Customers
We made many assumptions about our customers when we first started developing TiDB. Several of them turned out to be wrong, including, but not limited to, the following:
Don’t Assume the Customer Knows Their Workloads:
Many customers believe they have “low contention,” but as their business grows or usage patterns evolve, hot spots and contention become common. OCC’s rollback storms and retries can quickly surface when assumptions about workload behavior don’t pan out.
Don’t Assume the Customer Knows How to Write Retry Logic:
With OCC, every conflict potentially triggers a retry. How many retries are enough? Will retries cause a “retry storm” that amplifies latency under load? Customers may not have the expertise — or patience — to implement complex, backoff-based retry strategies.
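As a rough illustration of what "backoff-based retry" asks of application developers, the sketch below wraps a transaction function in a bounded retry loop with exponential backoff and random jitter. The names `run_with_retry` and `TxnConflict`, and all default values, are assumptions made for this example; a real application would catch whatever conflict error its database driver raises.

```python
import random
import time


class TxnConflict(Exception):
    """Stand-in for the conflict error a real driver would raise."""


def run_with_retry(txn_fn, max_attempts=5, base_delay=0.01, max_delay=0.5,
                   sleep=time.sleep, rng=random.random):
    """Call txn_fn until it succeeds or max_attempts is exhausted.

    Between attempts, sleep for a random fraction of an exponentially
    growing cap, so that many clients retrying together spread out
    instead of colliding again in lockstep.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_fn()
        except TxnConflict:
            if attempt == max_attempts:
                raise  # give up and surface the conflict to the caller
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(rng() * cap)
```

Even this small helper leaves open questions the application must answer: how many attempts are acceptable, whether `txn_fn` is safe to run more than once, and what to do when retries are exhausted.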
Don’t Assume the Customer Can Easily Change Their Codebase:
Migrating from one concurrency model to another or integrating complex retry logic into an existing codebase is no small feat. Customers migrating from older databases may not have the engineering bandwidth to rewrite their application logic to handle OCC conflicts gracefully.
Don’t Assume the Customer Only Has Short, Small Transactions:
Some customers run long, complex transactions or multi-step business processes. OCC rollbacks are especially painful for these scenarios, as a lot of work gets invalidated at the commit stage. PCC’s early locking approach can prevent wasted computation.
Don’t Assume the Customer Understands Which Keys Are Prone to Conflict:
Identifying hot keys and designing around them requires deep insight into workload patterns. Without this knowledge, OCC may result in frequent surprise failures that are hard to diagnose and fix. PCC, by contrast, makes contention more visible and predictable upfront.
What Do Customers Really Want?
Our assumptions failed, so what do customers really need most? From our perspective, they need:
- Predictable Performance: They want stable, consistent response times, rather than sudden spikes caused by frequent retries.
- Controllable Workloads: They want resource contention to be signaled clearly by blocking transactions, rather than letting them run and fail later.
- Operational Simplicity: They need better observability and the ability to tune or reason about behavior under contention.
- Easy Ecosystem Integration: Most of them rely on existing tools, ORMs, and patterns that work smoothly with a predictable locking model.
So we decided to provide PCC to our customers, and now almost all of our customers use PCC by default in all scenarios, even in some scenarios with very high performance requirements.
Conclusion
Finally, I would like to emphasize that PCC is not perfect; both PCC and OCC have their place. In fact, many databases, including TiDB, offer both and let customers choose. However, if your database offers only OCC, you may lose customers with more demanding requirements, especially those who want to use your database in core scenarios. As we discovered through our journey with TiDB, theoretical benefits don’t always stand up in the real world. Offering PCC gave our customers a better experience and opened the door to more mature workloads, simpler application code, and fewer operational surprises.
If there’s one lesson we’ve learned, it’s this: Always evaluate concurrency control strategies in the context of real-world conditions, not just theoretical ideals.