Which 🐘PostgreSQL problems are solved with 🚀YugabyteDB

#1: Disastrous XID Wraparound

🐘PostgreSQL uses a 32-bit transaction ID and the database freezes when it is exhausted. Table *vacuum * must run frequently to avoid this.

#2: Failover Will Probably Lose Data

🐘PostgreSQL doesn’t provide ACID across multiple servers. An async replica can be setup to offload reads, with an acceptable lag, and to provide Disaster Recovery with a Recovery Time Objective in minutes, and a Recovery Point Objective with some data loss. A quorum of sync replicas can reduce the risk of data loss at the price of higher latency for write transactions. This must be carefully managed to avoid data corruption in case of failover.

#3: Inefficient Replication That Spreads Corruption

🐘PostgreSQL streaming replication is based on physical replication though the WAL, at page level. I do not agree with the “Spread Corruption” part. The author of the “10 things” mentions “hardware-induced data corruption” which would need more explanations, because WAL streaming replication is a lot safer, in this area, than storage-based replication. I guess that what the author experienced was a non-full-page WAL record corrupting an existing block in the standby.

#4: MVCC Garbage Frequently Painful

🐘PostgreSQL accumulates the states by writing entire new rows. Even updating one byte of a single column will copy the whole row. This is known as table bloat, has also some performances on index maintenance especially if you use the default 100% fillfactor, and requires frequent vacuum.

#5: Process-Per-Connection = Pain at Scale

🐘PostgreSQL is not multi-threaded, and forks a new process per connection. Each connection takes memory, and memory is limited on a server. Especially with double buffering, less available memory means more I/O. Connection pools can help to limit this, but, with microservices, this is still a lot of connections. Pgbouncer can help, but that’s another component to install and manage.

#6: Primary Key Index is a Space Hog

🐘PostgreSQL stores rows in heap tables like Oracle, and the primary key is an additional secondary index. This takes more space and more reads (even the Index Only Access) has to read the table.

#7: Major Version Upgrades Can Require Downtime

🐘PostgreSQL upgrades are done with a planned outage. The application must be stopped, not only for the duration of the upgrade but also for the statistic gathering that must be done after it (ANALYZE). This makes the downtime window depending on the size of the database.

#8: Somewhat Cumbersome Replication Setup

🐘PostgreSQL replication is not easy to setup. From a recent discussion with a friend concerned by it, here is a recent example showing how troubleshooting can be complex. Note that it also shows the amazing help you can have from an open-source community.

#9: Ridiculous No-Planner-Hints Dogma

🐘PostgreSQL core team has always been against the usage of query planner hints. Even when it is accepted that giving directives to the optimizer is a good thing, like materializing a CTE or not, it has been introduced in the SQL syntax, breaking the compatibility with previous versions.

#10: No Block Compression

🐘PostgreSQL doesn’t provide compression and, generally speaking, counts on the operating system to handle anything file-related. I’ve seen people using ZFS for the many features it provides, but getting acceptable and predictable performance for a database on ZFS is not easy.

All That Said…

The author, in conclusion, still recommends: “In general I’d recommend starting with PostgreSQL and then trying to figure out why it won’t work for your use case”. He mentions that some of those issues can be reduced by using a managed service. Because, solutions, or workarounds are well known. You just need some DBAs to managed them pro-actively, in-house or as from a managed service. PostgreSQL is the best open-source database I know for a wide area of workloads.

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Franck Pachot

Franck Pachot

502 Followers

Developer Advocate at Yugabyte, Open Source distributed SQL database 🚀 Also Oracle ACE Director, Oracle Certified Master, AWS Data Hero, OakTable member