Oracle Sharding methods compared to YugabyteDB

Franck Pachot
Feb 20, 2024

Oracle has long been a leader in partitioning, distributing, and replicating databases. It offers shared-storage RAC for High Availability and log-streaming Data Guard for Disaster Recovery. However, for built-in shared-nothing replication, Oracle is now a follower of the Distributed SQL innovation initiated by Spanner and pursued by CockroachDB, TiDB, and YugabyteDB with Raft consensus replication. In 23c, Oracle added Raft replication, following its competitors, as an alternative to primary-standby configurations. Still, it takes a great deal of innovation to provide this on top of Oracle’s existing partitioning schemes without building a new database architecture, and that’s what I’ll describe here.

Adding Raft replication does not turn a traditional monolithic database into a cloud-native distributed SQL database. The database becomes distributed only when the table rows, index entries, and transaction intents are distributed to multiple Raft groups. This involves sharding/partitioning. Oracle documentation for Globally Distributed Databases 23c defines four Sharded Data Distribution Methods: System-Managed (consistent HASH), User-Defined (LIST or RANGE), Composite Sharding (combining System and User-Defined), and Directory-Based (User-Defined with a mapping table). This provides multiple sharding methods to cover various use cases. So, how does this compare to Distributed SQL databases?

In short, Oracle Globally Distributed Database and YugabyteDB can be used for all kinds of scenarios but in different ways. To be compatible with its existing features, Oracle introduced many new concepts, such as chunks, tablespace sets, table families, shards, shardspaces, and partition sets. These add to already-known concepts like tablespaces, partitions, and subpartitions, and they involve both SQL commands and GDS (Global Data Services) commands. You can imagine the operational complexity of such a deployment as well as the many possibilities.

YugabyteDB has a two-layer architecture that simplifies data distribution. The storage and transaction layer has automatic sharding, which uses HASH or RANGE to distribute data to Raft groups for high availability and elasticity. The second layer is PostgreSQL declarative partitioning, which allows users to partition by HASH, RANGE, or LIST and define their placement through tablespaces. One is to distribute automatically, the other to add user-defined data placement preferences or constraints. Before comparing them with Oracle’s sharding methods, let’s describe the YugabyteDB methods first, as they are easier to understand.

YugabyteDB Distribution Methods

Range Sharding (system-managed)

The straightforward method to distribute table rows and index entries across multiple Raft groups (tablets in YugabyteDB) is splitting the range of their key values (primary key for the table, indexed columns for the index). This is a must for all Distributed SQL databases because SQL applications can query data on ranges, for example with ‘>’, ‘<’, or ‘between’ in WHERE clauses, or to get a sorted result for ORDER BY. All Distributed SQL databases provide this (Spanner, CockroachDB, TiDB, YugabyteDB), and some provide only this.

Because it is a distribution method, sharding must be automatic so data can be re-balanced automatically when scaling horizontally. We can pre-split a table on specific values, especially when we know the values we will bulk-load, but small tables will start with one shard and be automatically split when growing. YugabyteDB auto-split thresholds are described in the documentation.

The syntax for range sharding is easy in YugabyteDB: you define ASC or DESC as with any SQL index definition. For example, an index on (timestamp ASC, name ASC) will be ordered by timestamp, name and split in the middle when growing.
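As a minimal sketch of this syntax, assuming hypothetical table, column, and index names, range sharding is declared through the sort order of the key columns. The pre-split clause is optional, and the split values below are made up for illustration.

```sql
-- Hypothetical names. ASC/DESC on the key columns selects range sharding.
CREATE TABLE events (
  ts      timestamptz NOT NULL,
  name    text        NOT NULL,
  payload text,
  PRIMARY KEY (ts ASC, name ASC)
);

-- A secondary index is range-sharded the same way.
CREATE INDEX events_name_ts ON events (name ASC, ts DESC);

-- Optional pre-split on known key values: two split points give three tablets.
CREATE TABLE readings (
  sensor_id int  NOT NULL,
  label     text NOT NULL,
  PRIMARY KEY (sensor_id ASC, label ASC)
) SPLIT AT VALUES ((100), (200));

-- A range predicate with ORDER BY on the key can read rows in key order.
SELECT ts, name
FROM events
WHERE ts >= now() - interval '1 day'
ORDER BY ts, name;
```

Without the SPLIT AT VALUES clause, the table starts small and relies on automatic splitting as it grows.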

Hash Sharding (system-managed)

There are two issues with range sharding. Firstly, it isn’t easy to distribute data before knowing the values. Secondly, it can create a hotspot when inserting rows in the same range, such as with timestamps or sequences. A hash value can be applied when a key is only used for point queries (equality predicate on the key). YugabyteDB can do this automatically when you define a hash-key part as it applies a hash function to get a value in the range of 0–65535 that will be added as a prefix to the internal key. Then, range sharding will be applied to this hash value only. YugabyteDB extends the PostgreSQL syntax by adding HASH, like in a primary key defined as ( id HASH ). One advantage of using hash values is that the distribution is predetermined, allowing the database to split the data into multiple tablets automatically. This helps avoid the issue of hotspots that can arise with traditional indexes. For example, if an identifier is generated from a sequence, it will be distributed across multiple tablets, ensuring a more even data distribution. Not all Distributed SQL databases provide Hash Sharding in addition to Range Sharding. YugabyteDB offers the two methods, the default being HASH on the first column of the key.
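Here is a small sketch of hash sharding, assuming a hypothetical orders table. The number of tablets is arbitrary, and the yb_hash_code() function, used here only to display the hash value, is assumed to be available in the YugabyteDB version in use.

```sql
-- Hypothetical table: the sequence-generated id is spread by its hash value.
CREATE TABLE orders (
  id       bigserial,
  customer text,
  PRIMARY KEY (id HASH)
) SPLIT INTO 6 TABLETS;   -- the 0 to 65535 hash range is divided among 6 tablets

INSERT INTO orders (customer) VALUES ('alice'), ('bob'), ('carol');

-- Consecutive ids get unrelated hash codes, so inserts do not hit one tablet.
SELECT id, yb_hash_code(id) AS hash_code
FROM orders
ORDER BY id;

-- Point query: the hash of the value identifies the single tablet to read.
SELECT * FROM orders WHERE id = 2;
```

The trade-off is that a range predicate on id alone can no longer be served by reading one contiguous key range.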

Hash Sharding + Clustering key

HASH can be combined with ASC/DESC so that a key has two components: a hash key (also called partitioning key in some NoSQL databases) to distribute to multiple shards and a range key (also called clustering key or sort key in some NoSQL databases) to group the values that are queried together. With YugabyteDB, you may declare a multi-column primary key like ( device_id HASH, timestamp ASC).
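To make this concrete, here is a sketch with a hypothetical measurements table. The column names and the device identifier are made up for illustration.

```sql
-- Hypothetical table: device_id is the hash key, ts is the clustering key.
CREATE TABLE measurements (
  device_id uuid             NOT NULL,
  ts        timestamptz      NOT NULL,
  value     double precision,
  PRIMARY KEY (device_id HASH, ts ASC)
);

-- Equality on the hash key locates the tablet, and the range on ts reads
-- rows that are stored together, already sorted by time.
SELECT ts, value
FROM measurements
WHERE device_id = '00000000-0000-0000-0000-000000000001'
  AND ts >= now() - interval '1 hour'
ORDER BY ts;
```

This layout fits workloads that always filter on one device and then scan a time window.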

Note that the hash function here is known as linear or consistent hashing (an excellent description of the common sharding methods is here) and differs from the hash function used in SQL partitioning by hash. It is ideal for scaling out and rebalancing, as the range of hash values can be split further. This method makes sense for high-cardinality values. For low-cardinality ones, you may prefer to add your own bucket number (example here).

Range, List, Hash Partitioning (user-defined)

The sharding methods we have seen above are automatic and done at the storage level. The database manages the distribution over the cluster according to the key definition and the global settings for fault tolerance. When you want more user control on the data placement, for lifecycle management, latency, or data residency reasons, YugabyteDB offers all PostgreSQL partitioning methods on top of the automatic sharding. Typically, you may partition by range on a date to purge the old data quickly. Or you may partition by list of countries to store them in a specific region for regulatory or performance reasons.

This uses the PostgreSQL tablespaces, where YugabyteDB adds some placement information. Tablespaces in PostgreSQL define the location in a single node as a filesystem directory. YugabyteDB tablespaces are global and determine a geographical part of a cluster that spans multiple data centers. Each tablespace can set its specific replication factor and multiple placement blocks mapped to cloud providers, regions, and zones when in the cloud or rack and data centers when on-premises. It can additionally define a preference for the Raft leader placement to reduce latency. Here is an example.
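A tablespace definition along these lines might look as follows. The tablespace name, cloud, region, and zones are assumptions for illustration, and the leader_preference attribute is assumed to be supported by the YugabyteDB version in use.

```sql
-- Hypothetical tablespace: three replicas, one per availability zone,
-- with Raft leaders preferred in the first zone.
CREATE TABLESPACE eu_west WITH (
  replica_placement = '{
    "num_replicas": 3,
    "placement_blocks": [
      {"cloud": "aws", "region": "eu-west-1", "zone": "eu-west-1a",
       "min_num_replicas": 1, "leader_preference": 1},
      {"cloud": "aws", "region": "eu-west-1", "zone": "eu-west-1b",
       "min_num_replicas": 1, "leader_preference": 2},
      {"cloud": "aws", "region": "eu-west-1", "zone": "eu-west-1c",
       "min_num_replicas": 1}
    ]
  }'
);
```

Tables, indexes, and partitions created in this tablespace keep their tablets within the listed zones.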

As all PostgreSQL partitioning methods are available, partitioning by Hash can also add modulo-based hashing on top of the consistent hash from automatic sharding. I described it here, but it is rarely needed.

With YugabyteDB, all partitioning methods can be combined with all sharding techniques. Here is an example where a table of people is partitioned by country to store them in specific regions, and each partition is distributed by hash across the availability zones of their region:

create table people (primary key(id hash, country asc), id uuid, country char(2), name text) partition by list (country);
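The statement above only creates the partitioned parent. As a sketch, assuming two tablespaces named eu_west and us_east already exist with placement in the corresponding regions (both names are hypothetical, as are the country lists), the partitions could be declared like this:

```sql
-- Each partition is pinned to a region through its tablespace, and its rows
-- are hash-sharded on id within that region.
CREATE TABLE people_eu PARTITION OF people
  FOR VALUES IN ('FR', 'DE', 'IT')
  TABLESPACE eu_west;

CREATE TABLE people_us PARTITION OF people
  FOR VALUES IN ('US')
  TABLESPACE us_east;

-- A predicate on the partition key lets the planner prune to one partition.
SELECT name
FROM people
WHERE country = 'FR'
  AND id = '00000000-0000-0000-0000-000000000001';
```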

Oracle Distribution Methods

There are four methods described in 23c documentation. They were added through 12c, 19c, and 23c on top of the existing partitioning features and global data services coordinator.

System-Managed Sharding

Oracle’s System-Managed Sharding is the equivalent of YugabyteDB Hash sharding. With YugabyteDB, defining the HASH function in the primary key or index key definition is sufficient because the distribution is built into the key-value distributed storage (DocDB).

With Oracle, you define the partition key with PARTITION BY CONSISTENT HASH. Even if sharding is automatic, it has to map to the traditional storage attributes: databases, tablespaces, extents, and blocks, and you must additionally create a TABLESPACE SET to create a tablespace on each node. Each node is a complete Oracle Database.
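As a rough sketch of what this looks like, assuming a sharded database is already deployed and the statements are run on the shard catalog, with a hypothetical tablespace set and table:

```sql
-- Run on the shard catalog so the DDL is propagated to the shards.
ALTER SESSION ENABLE SHARD DDL;

-- One tablespace per chunk is created on each shard from this definition.
CREATE TABLESPACE SET ts1;

-- Hypothetical table: rows are assigned to chunks by consistent hash of cust_id.
CREATE SHARDED TABLE customers (
  cust_id NUMBER        NOT NULL,
  name    VARCHAR2(100),
  CONSTRAINT customers_pk PRIMARY KEY (cust_id)
)
PARTITION BY CONSISTENT HASH (cust_id)
PARTITIONS AUTO
TABLESPACE SET ts1;
```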

As far as I know, Hash Sharding is the only automatic method in 23c, and there is no equivalent of Range Sharding that can be automatically split when the table grows. System-Managed Sharding is Hash only and is suitable only for high-cardinality columns that are not queried by range.

User-Defined Sharding

Oracle’s User-Defined Sharding is the equivalent of YugabyteDB Range or List Partitioning. Partitions are assigned to tablespaces that define their location in a subset of the cluster. With YugabyteDB, this location is a set of placement blocks defining the replication factor and the nodes (cloud provider, cloud region, availability zone) where Raft leaders and followers can be placed.

With Oracle, you define each tablespace with a SHARDSPACE that you must configure in GDSCTL to map to the nodes (shards) because each of those nodes is a monolithic CDB (Container Database).
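A sketch of the SQL side, assuming two shardspaces named west and east have already been configured with GDSCTL and mapped to shards (the shardspace, tablespace, and table names are hypothetical):

```sql
-- Run on the shard catalog so the DDL is propagated to the shards.
ALTER SESSION ENABLE SHARD DDL;

-- Each tablespace is created in one shardspace, which ties it to specific shards.
CREATE TABLESPACE tbs_west IN SHARDSPACE west;
CREATE TABLESPACE tbs_east IN SHARDSPACE east;

-- Hypothetical table: the list of states decides which shardspace holds each row.
CREATE SHARDED TABLE accounts (
  id     NUMBER      NOT NULL,
  state  VARCHAR2(2) NOT NULL,
  status VARCHAR2(1),
  CONSTRAINT accounts_pk PRIMARY KEY (id, state)
)
PARTITION BY LIST (state) (
  PARTITION p_west VALUES ('OR', 'WA', 'CA') TABLESPACE tbs_west,
  PARTITION p_east VALUES ('NY', 'MA', 'VT') TABLESPACE tbs_east
);
```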

Directory-Based Sharding

Oracle’s Directory-Based Sharding has no direct equivalent in YugabyteDB because it requires a directory table to store the mapping between column values and partitions. To scale linearly, YugabyteDB avoids such a central table. The use cases fall into other YugabyteDB methods (range sharding for uneven data distribution, list partitioning to group multiple key values, range sharding on additional columns for custom policy). If you use Directory-Based Sharding in Oracle and move to YugabyteDB, you should look at what you wanted to achieve with it. There’s a good chance that automatic Range Sharding is the solution.

Composite Sharding

Oracle’s Composite Sharding is the equivalent of using YugabyteDB partitioning and sharding. With YugabyteDB, each partition declared with Range, List, and Hash partitioning in the query layer (PostgreSQL) is like a table for the storage layer where sharding applies on the key, so all combinations are possible.

With Oracle, you have to declare PARTITION with SHARDSPACE for system-managed partitioning and PARTITION SET with TABLESPACE SET for user-defined partitioning.
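The following sketch shows the general shape of such a declaration, assuming two shardspaces named shspace1 and shspace2 already exist. All object names are hypothetical, and the keyword is written PARTITIONSET (one word) as I recall it from the Oracle documentation examples, so check it against the documentation for your version.

```sql
-- Run on the shard catalog so the DDL is propagated to the shards.
ALTER SESSION ENABLE SHARD DDL;

-- One tablespace set per shardspace.
CREATE TABLESPACE SET tbs1 IN SHARDSPACE shspace1;
CREATE TABLESPACE SET tbs2 IN SHARDSPACE shspace2;

-- Hypothetical table: the list on class picks the shardspace (user-defined
-- level), then the consistent hash on cust_id spreads rows within it.
CREATE SHARDED TABLE customers (
  cust_id NUMBER       NOT NULL,
  name    VARCHAR2(50),
  class   VARCHAR2(3)  NOT NULL,
  CONSTRAINT cust_pk PRIMARY KEY (cust_id, class)
)
PARTITIONSET BY LIST (class)
PARTITION BY CONSISTENT HASH (cust_id)
PARTITIONS AUTO (
  PARTITIONSET gold   VALUES ('gld') TABLESPACE SET tbs1,
  PARTITIONSET silver VALUES ('slv') TABLESPACE SET tbs2
);
```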

This is different from sub-partitioning, which combines multiple user-defined partitioning methods. In YugabyteDB, like in PostgreSQL, because a partition is like a table, you can do the same by partitioning a partition, but this should rarely be needed given that automatic sharding allows Hash and Range.

Quick Comparison

It isn’t easy to compare the sharding methods between two different architectures.

  • Oracle Globally Distributed Database adds distribution and replication on top of a set of monolithic databases.
  • YugabyteDB was designed with built-in sharding in the transaction and storage layer and PostgreSQL on top of it to add all SQL features.

When comparing current versions, Oracle has more possibilities in its legacy partitioning, like operations to merge and split user-defined partitions. Some can be used to work around the lack of automatic range sharding, which is a must for SQL applications with range queries and is implemented in all Distributed SQL databases.

For migrations, you should look at the requirements for sharding (to scale data storage and processing) and partitioning (for geo-distribution). Both databases have their solution, with different operational complexity. You do more with legacy partitioning methods in Oracle and more with automatic sharding methods in Distributed SQL databases. You can do both in YugabyteDB.

Oracle Database Sharding uses monolithic databases to store parts of a global one, with its well-known proprietary RDBMS and the coordination of Global Data Services. YugabyteDB is a new database that is horizontally scalable, open-source, and PostgreSQL-compatible. It also uses proven technology (PostgreSQL, RocksDB, Apache Kudu) but with a different architecture (Distributed SQL).


Written by Franck Pachot

Developer Advocate for YugabyteDB (Open-Source, PostgreSQL-compatible Distributed SQL Database). Oracle Certified Master and AWS Data Hero.
