UUID (aka GUID) vs. Oracle sequence number

  • large values are less efficient (more space in disk, memory and more CPU to process)

NUMBER from SEQUENCE

In order to put some numbers about this argument, I create a 10 million rows table with an ID generated from a SEQUENCE:

SQL> create sequence DEMO_NUM_ID cache 1e6;
Sequence DEMO_NUM_ID created.
SQL> create table DEMO_NUM
(ID constraint DEMO_NUM_ID primary key, NUM)
as select DEMO_NUM_ID.nextval, rownum
from xmltable('1 to 10000000');
Table DEMO_NUM created.
Elapsed: 00:04:37.796
SQL> desc DEMO_NUMName Null?    Type       
---- -------- ----------
ID NOT NULL NUMBER(38)
NUM NUMBER
SQL> select sum(vsize(ID))/count(*),max(ID),count(*) from DEMO_NUM;SUM(VSIZE(ID))/COUNT(*)    MAX(ID)   COUNT(*)
----------------------- ---------- ----------
4.8888893 10000000 10000000

UUID from SYS_GUID()

Here is a similar table with the ID generated as a GUID:

SQL> create table DEMO_GUID
(ID constraint DEMO_GUID_ID primary key, NUM)
as select sys_guid(), rownum
from xmltable('1 to 10000000');
Table DEMO_GUID created.
Elapsed: 00:05:45.900
SQL> desc DEMO_GUID
Name Null? Type
---- -------- ------------
ID NOT NULL RAW(16 BYTE)
NUM NUMBER
SQL> select sum(vsize(ID))/count(*) from DEMO_GUID;SUM(VSIZE(ID))/COUNT(*)
-----------------------
16

RAW from NUMBER

I see only one advantage in GUID primary keys: they are RAW datatypes. I like it because we don’t want arithmetic operations on it. And in addition to that nobody will complain about gaps in numbering. Then, can we store our NUMBER from the sequence as a ROW?

SQL> create sequence DEMO_RAW_ID cache 1e6;
Sequence DEMO_RAW_ID created.
SQL> create table DEMO_RAW
(ID constraint DEMO_RAW_ID primary key, NUM)
as select hextoraw(to_char(DEMO_RAW_ID.nextval,rpad('FM',65,'X')))
, rownum
from xmltable('1 to 10000000');
Table DEMO_RAW created.
Elapsed: 00:04:21.259
SQL> select * from 
(select sum(vsize(ID))/count(*) "vsize(NUM)" from DEMO_NUM),
(select sum(vsize(ID))/count(*) "vsize(RAW)" from DEMO_RAW),
(select sum(vsize(ID))/count(*) "vsize(GUID)" from DEMO_GUID);
vsize(NUM) vsize(RAW) vsize(GUID)
---------- ---------- -----------
4.8888893 2.993421 16
SQL> select segment_name,segment_type,bytes/1024/1024 MBytes 
from dba_segments
where owner=user and segment_name like 'DEMO%' order by mbytes;
SEGMENT_NAME SEGMENT_TYPE MBYTES
------------------------------ ------------------ ----------
DEMO_RAW_ID INDEX 160
DEMO_RAW TABLE 168
DEMO_NUM_ID INDEX 174
DEMO_NUM TABLE 188
DEMO_GUID_ID INDEX 296
DEMO_GUID TABLE 304

Compression

12cR2 introduced a compression algorithm for indexes which is interesting even when there are no repeated column values: ADVANCED HIGH (which is available in Enterprise Edition with Advanced Compression Option):

SQL> alter index DEMO_GUID_ID rebuild compress advanced high;Index DEMO_GUID_ID altered.
Elapsed: 00:01:30.035
SQL> alter index DEMO_RAW_ID rebuild compress advanced high;Index DEMO_RAW_ID altered.
Elapsed: 00:00:57.193
SQL> alter index DEMO_NUM_ID rebuild compress advanced high;Index DEMO_NUM_ID altered.
Elapsed: 00:00:49.574
SQL> select segment_name,segment_type,bytes/1024/1024 MBytes from dba_segments where owner=user and segment_name like 'DEMO%' order by mbytes;SEGMENT_NAME                   SEGMENT_TYPE           MBYTES
------------------------------ ------------------ ----------
DEMO_RAW_ID INDEX 80
DEMO_NUM_ID INDEX 80
DEMO_GUID_ID INDEX 136
DEMO_RAW TABLE 168
DEMO_NUM TABLE 188
DEMO_GUID TABLE 304

When is GUID smaller than a NUMBER?

Here is where a NUMBER starts to be larger than a 16 bytes GUID:

In summary: do not use GUID or UUID for your primary keys.

A NUMBER generated from a SEQUENCE is the right/efficient/scalable way to get a unique identifier. And if you want to be sure to have no collision when merging or replicating from another system, then just add another ‘system identifier’ as an additional column in the primary key. This can be a node number in a distributed system, or a 3-letter identifier of the company. When column values have a repeated subset, there’s no efficient deduplication or compression techniques. But when it is a different column that is fully repeated, table and index compression can be used. So, if you really need to add a hashed hostname, put it in an additional column rather than the UUID idea of mixing all in one value.

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Franck Pachot

Franck Pachot

Developer Advocate at Yugabyte, Open Source distributed SQL database 🚀 Also Oracle ACE Director, Oracle Certified Master, AWS Data Hero, OakTable member