Large JSONB Value Performance in Postgres
Introduction
Recently, our team encountered performance issues with our APIs. Typically, such issues arise from inefficient querying strategies. However, we stumbled upon a particularly interesting case: the API in question was solely responsible for update operations. In this post, we'll share our experience with this case and delve into JSONB performance during update operations within PostgreSQL.
Background
Technology Stack
- .NET 6 Web API
- Entity Framework Core for communicating with the database
- PostgreSQL
API Functionality
- Update data within the domain model.
- Upon a successful update, preserve the original state of the domain model.
- Serialize the entire original model into JSON.
- Persist the serialized JSON in the database using the JSONB data type.
Storing the original model is a business requirement, and using JSONB is a straightforward way to meet it. However, it also affects performance: in some cases, the original model can be quite large (~5 KB when serialized).
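The flow above can be sketched with EF Core. This is a minimal illustration, not our actual code; the entity, property, and table names are hypothetical, and the jsonb mapping assumes the Npgsql EF Core provider:

```csharp
using System.Text.Json;
using Microsoft.EntityFrameworkCore;

// Hypothetical entity: the original model is archived as a JSONB column.
public class OrderArchive
{
    public int Id { get; set; }
    public string OriginalModelJson { get; set; } = "";
}

public class AppDbContext : DbContext
{
    public DbSet<OrderArchive> OrderArchives => Set<OrderArchive>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Map the string property to PostgreSQL's jsonb type.
        modelBuilder.Entity<OrderArchive>()
            .Property(a => a.OriginalModelJson)
            .HasColumnType("jsonb");
    }
}

// During an update: serialize the original state and persist it
// alongside the change, e.g.
//
// var archive = new OrderArchive
// {
//     OriginalModelJson = JsonSerializer.Serialize(originalModel)
// };
// dbContext.OrderArchives.Add(archive);
// await dbContext.SaveChangesAsync();
```

Because the whole model is serialized into a single column, every archived update writes one large JSONB value, which is what brings TOAST into play below.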
PostgreSQL Toast Issue
PostgreSQL's storage mechanism organizes data into fixed-size blocks known as "pages," typically 8KB each. Each page is designed to accommodate at least 4 tuples (rows), so when a row exceeds roughly 2KB, PostgreSQL triggers the "TOAST" (The Oversized-Attribute Storage Technique) mechanism to manage large data values. TOAST stores oversized values separately from the main data page, compressing them and splitting them into roughly 2KB chunks, which prevents them from bloating individual rows. However, this mechanism introduces additional complexity and overhead that can significantly impact performance, particularly during updates: writing a large JSONB value forces the entire value to be re-compressed and re-written to the TOAST table, even when the logical change is small. For a more detailed exploration of PostgreSQL's JSONB and TOAST mechanisms and their influence on performance, refer to the article Postgres performance cliffs with large JSONB values and TOAST.
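To see TOAST in action, you can inspect the stored size of a value and the TOAST relation backing a table. The table and column names below are hypothetical placeholders:

```sql
-- On-disk (possibly compressed) size of the JSONB value for each row
SELECT id, pg_column_size(original_model_json) AS stored_bytes
FROM order_archives;

-- Locate the TOAST table backing order_archives and its total size
SELECT reltoastrelid::regclass AS toast_table,
       pg_size_pretty(pg_total_relation_size(reltoastrelid)) AS toast_size
FROM pg_class
WHERE relname = 'order_archives';
```

If the second query returns a non-trivial toast_size, your JSONB values are being stored out of line and updates to them pay the TOAST overhead described above.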
Conclusion
Despite the performance challenges posed by storing the entire domain model, it remains a business requirement, and the frequency of its usage is currently manageable. As a result, our team has no immediate plans to modify this design. However, if performance optimization becomes necessary, we would prefer a deferred storage approach: moving the archival step into a message queue rather than persisting the data immediately.
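A rough sketch of that deferred approach, assuming a generic queue abstraction (the IMessageQueue interface, ArchivePublisher class, and queue name are all hypothetical, standing in for whatever broker is actually used):

```csharp
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical queue abstraction; in practice this could be backed by
// RabbitMQ, Azure Service Bus, SQS, or similar.
public interface IMessageQueue
{
    Task PublishAsync(string queueName, string payload);
}

public class ArchivePublisher
{
    private readonly IMessageQueue _queue;

    public ArchivePublisher(IMessageQueue queue) => _queue = queue;

    // Instead of writing the large JSONB archive in the same request as the
    // update, publish the serialized original model; a background consumer
    // persists it to the database later, off the API's hot path.
    public Task ArchiveAsync<T>(T originalModel) =>
        _queue.PublishAsync("model-archive", JsonSerializer.Serialize(originalModel));
}
```

The trade-off is eventual consistency for the archived copy: the API responds without paying the TOAST write cost, at the price of the archive landing in the database slightly later.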
While JSONB provides convenience, it's essential to recognize that it comes with performance considerations. Through this experience, we've gained valuable insights into the trade-offs involved in using JSONB for storing large data values.