• Fri. Nov 15th, 2024

JSONB in PostgreSQL today and tomorrow

Byadmin

Mar 1, 2022


We know that data drives innovation and powers better customer experiences. And we know that companies today tend to rely on a variety of on-premises and cloud-based databases and services to make the best use of their data. In this confusing environment, the adoption of one particular open source database, PostgreSQL, is growing rapidly based on its ability to handle both structured and unstructured data.A relational, or SQL, database system with more than 30 years of active development history, PostgreSQL is known for reliability, a strong feature set, and great performance. JSON is a file format and data interchange format that enables applications to store and analyze text-based documents. As such, JSON powers NoSQL databases. JSONB, a data type exclusive to PostgreSQL, is an evolution of JSON that stores the text-based data in binary form, making the applications that access the unstructured data both faster and more efficient.JSONB is now the most popular supported data type driving adoption of Postgres. Among its benefits are a flexible schema to help businesses ramp up new projects quickly and compression to reduce overhead and increase performance. Such features enable businesses to accomplish more tasks using a single database solution, introducing more innovation more quickly while conserving development resources and putting less pressure on teams to manage security for multiple applications.A brief history of bringing JSON and JSONB into PostgreSQLIn 2000, while working on Rambler.ru, a major Russian portal, I and others began implementing text arrays and full-text search within PostgreSQL. In 2003, we implemented HStore to store key-value pairs in a binary format. This provided schema flexibility, which quickly became popular among users. In 2006, these features were officially committed in PostgreSQL as an extension.In 2012, limited features of JSON were enabled in Postgres, including storage and organization. However, no operations were enabled because at that time, the developers did not believe this was necessary. So in 2013, we began working on what we called Native HStore with array and nesting support, and in 2014, we started work on adding a binary storage native data structure and an application HStore structure. Since the name JSON was already used for features in Postgres, we decided to use JSONB, for “JSON Binary,” as the name.Once JSONB was made available in PostgreSQL v9.4 in 2014, the popularity of the database grew significantly. While the ability to query both structured and unstructured data is the biggest draw of JSONB in PostgreSQL, the availability of JSONB also coincided with the surge of interest in microservices architecture. In this architecture, every microservice exists to answer a known question (e.g., display some product with all its metadata, etc.), which makes it essential to have fast access to known aggregated data. This is where JSONB shines because it stores the pre-computed result of data aggregation, which enables fast access.Current capabilities of PostgreSQL with JSONBThe key capabilities of JSONB include simpler, flexible schema design, faster queries, and support for indexing. Let’s take those one by one. Simpler, flexible schema design. Flexible schemas enable developers to launch new projects faster. And because columns can be queried, indexed, and joined, JSONB provides a significant performance improvement over JSON. Fast application development drives demand for continued correction in data structures. Each scrum sprint may change the number of attributes and their types. Moreover, for some types of records, a number of attributes (e.g., color, price per unit, size) may need to be stored while others may not (weight, price per pound, date of production, etc). If you are using only tables and need to add more attributes, you need to change the data structure, determine which attributes are mandatory, determine if old records meet new requirements, and potentially modify them accordingly. JSONB provides flexibility in the definition of the data schema, allowing developers to create applications faster and better adapt them to customer requirements.Speed. Because JSONB stores data in a binary format, queries process significantly faster. Storing data in binary form allows Postgres to access a particular JSON key-value pair without reading the entire JSON record. The reduced disk load speeds up overall performance.Support for indexing. The indexing feature enables faster access to JSON key-value pairs that meet particular criteria. Postgres now knows more about the data inside JSONB records, selecting only the key-value pairs needed to perform SQL queries.Common JSONB use casesAlthough anonymized, the following use case descriptions are based on actual PostgreSQL deployments. FintechFintech apps are constantly evolving. Typically, different development teams need to be able to add new features and make releases simultaneously. JSONB provides the required flexibility to change data structures. One fintech service provider is using JSONB as a unified format that enables information transfers between front-end and back-end systems and the DBMS. Thanks to the use of JSONB, the stable data structure allocated inside the database can be changed or extended seamlessly.Voting systemsA regional governmental body implemented a multi-purpose voting system that supports voting for a variety of local issues (construction, land management, transportation, etc.) by more than 10 million people. The voting rules allow the citizens to change their minds about their vote and revote multiple times. Every vote is recorded both in the database and in a blockchain for immutability. The final vote is saved in the database, and the entire history of each person’s voting for each issue is recorded in a JSONB field. This approach helped speed up system operations tenfold, and the final results are now delivered much faster.BankingIn banking, applications and services are constantly being developed, resulting in frequent API format changes. One bank is using an integration bus for messaging between back-end systems, and all messages are logged. Every step of data exchange is logged to detect fraud and errors as they occur. Because storing data in pure JSONB in a NoSQL database cannot provide the necessary performance for searching and analyzing data, some of the data is stored in PostgreSQL’s relational structures, and only the changing attributes go to the JSONB fields. Once again, Postgres offers the best of both worlds.HealthcareMedical tests are typically prescribed based on a patient’s overall health and current symptoms. While there may be a standard set of indicators for particular tests, a doctor may want to run additional tests or check for other issues. This makes the number of tests a lab runs for a specific set of indicators subject to constant change. One healthcare system is storing test results in a uniform data structure. Thanks to JSONB, adding a new type of test can be done without any data structure modification. JSONB indexing enables fast search of the test results by any combination of any indicators and their value ranges. Current limitations and the future of JSONB in PostgreSQLThe development of JSONB is not finished. While it is an extremely useful feature of PostgreSQL, it doesn’t solve every database challenge. It’s not a silver bullet. For example, it does not handle updates as effectively as possible. To update just a single key in JSONB, PostgreSQL updates the entire field, which can be very large. There is currently no partial update. PostgreSQL is also not capable of in-place updates, so the cost of updating large JSONB fields is very high. This is the main pitfall for JSONs, at least for large deployments.Of course, workarounds exist. However, users need straightforward solutions, so making JSONB update-friendly is one of the key goals of future development. Among other tasks being worked on are faster key search inside a JSONB field and making internal JSON storage technology transparent for users. Currently, from the user’s viewpoint, JSON and JSONB are different data types. A better way is to present them as two storage technologies for one structured data type.JSON and SQL standardsPostgreSQL was the first relational DBMS to introduce JSON support, and the JSONB index search capability is unique. So for the last seven years, PostgreSQL has been developing ahead of existing standards. However, the SQL:2016 standard adopted in December 2016 declared new syntax features for working with JSON data in DBMSs. Roughly half of these have been implemented in PostgreSQL. Other DBMSs also implement this standard only partially. But PostgreSQL’s flexible data type system with operation polymorphism provides a very organic way to deal with JSON data, which is now possible both in SQL standard style and in initial PostgreSQL style.It’s not a surprise that the popularity of PostgreSQL continues to soar. This open source database offers the best of both worlds, allowing businesses to handle both structured and unstructured data. Further, the rise of online businesses has led to huge numbers of JavaScript developers coming into the industry with no understanding of database internals. JSONB provides them with the ability to work more or less uniformly in their code and database, smoothing out the contradictions between the code-centric and data-centric worlds. So the popularity of JSONB – and PostgreSQL – is also rising as low-code and no-code technologies become mainstream.Oleg Bartunov is the co-founder and CEO of Postgres Professional. A major contributor to PostgreSQL, Bartunov has been using PostgreSQL since 1995 and has been developing and promoting PostgreSQL since 1996. His PostgreSQL contributions include the locale support, GiST, GIN, and SP-GiST extensibility infrastructures, full text search, KNN, NoSQL features (HStore and JSONB), and several extensions including fuzzy search, support for tree-like structures, and arrays. In addition to his work with Postgres Professional, Bartunov is a research scientist at Lomonosov Moscow State University and a professional astronomer.—New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to newtechforum@infoworld.com.

Copyright © 2022 IDG Communications, Inc.



Source link