19c High-Frequency statistics gathering and Real-Time Statistics
These are the two exciting new features about optimizer statistics that arrived in 19c, the latest release of the 12cR2 family. Less exciting is that we are not allowed to use them on any platform other than Exadata:
But let’s cross our fingers and hope that this will be opened up in the future, because these features solve real-life problems such as out-of-range queries. Here is a little example involving both of them. A table starts empty and grows during the day. Relying only on the statistics gathered during the maintenance window will give bad estimates. And dynamic sampling may not sample the right blocks.
A little example
Here is a little example where I insert one row every second into a DEMO table and look at the statistics.
Initialization
First I initialize the example by creating the table, gathering stats on it, and setting the global preference AUTO_TASK_STATUS to ON. It is not obvious from the name, but in the context of DBMS_STATS, this Auto Task is the “high-frequency” one, running outside of the maintenance window, every 15 minutes by default, as opposed to the Auto Job that runs during the maintenance window every 4 hours.
spool hi-freq-stat.log
set echo on time on
whenever sqlerror exit failure
alter session set nls_date_format='dd-MON-yyyy hh24:mi:ss';
create table DEMO(d date,n number(*,2),h varchar2(2),m varchar2(2));
exec dbms_stats.gather_table_stats(user,'DEMO');
show parameter exadata
exec dbms_stats.set_global_prefs('auto_task_status','on');
This is a lab where I simulate Exadata features. High-Frequency Automatic Optimizer Statistics Collection is available only on Exadata.
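To double-check what the task will do, the global preferences can be read back with GET_PREFS. A quick sanity check, as a sketch (AUTO_TASK_INTERVAL and AUTO_TASK_MAX_RUN_TIME are the documented 19c preferences, expressed in seconds):
select dbms_stats.get_prefs('AUTO_TASK_STATUS') as task_status
      ,dbms_stats.get_prefs('AUTO_TASK_INTERVAL') as interval_seconds
      ,dbms_stats.get_prefs('AUTO_TASK_MAX_RUN_TIME') as max_run_seconds
from dual;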
Run every second
Every second I run this transaction to add one row. The D column will contain SYSDATE at the time of insert:
insert into DEMO (D, N, H, M) values (sysdate
,sysdate-trunc(sysdate,'HH24')
,to_char(sysdate,'HH24')
,to_char(sysdate,'MI'));
select count(*),min(d),max(d) from DEMO;
commit;
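As a side note, if you prefer to drive this load from the database rather than from a shell loop, here is a minimal PL/SQL sketch of the same one-row-per-second insert (assuming DBMS_SESSION.SLEEP, available since 18c; the iteration count is arbitrary):
begin
  for i in 1 .. 60 loop
    -- same columns as above: date, fraction of the day since the last hour, hour, minute
    insert into DEMO (D, N, H, M) values (sysdate
     ,sysdate-trunc(sysdate,'HH24')
     ,to_char(sysdate,'HH24')
     ,to_char(sysdate,'MI'));
    commit;
    dbms_session.sleep(1); -- wait one second between transactions
  end loop;
end;
/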
Then I query the column statistics for this D column:
exec dbms_stats.flush_database_monitoring_info;
with function d(r raw) return date as o date;
begin dbms_stats.convert_raw_value(r,o); return o; end;
select column_name,d(low_value),d(high_value),num_distinct,notes
from dba_tab_col_statistics where owner=user and table_name='DEMO'
and column_name='D'
/
As an example, here is the 10000th iteration 2 hours 46 minutes after the initialization:
The min value is from 20:22:16 and the max value is from this insert at 23:18:19.
Without those 19c features, the statistics would have stayed at 0 distinct values and null low/high values, as they were gathered when the table was empty.
However, I have two sets of statistics here (visible in the dictionary after flushing the database monitoring info):
- Real-Time Statistics (NOTES=STATS_ON_CONVENTIONAL_DML), which are updated when DML occurs and can be used as dynamic statistics without the need to sample. The number of distinct values is not maintained there (it is not updated on the fly because it would be expensive to know whether new values belong to the existing set or not). The low/high values are accurate: in this example, less than 2 minutes stale (23:16:42 is the high value known at the 23:19:18 point in time).
- The regular statistics (no NOTES), which are not as accurate, but not too stale either: nearly 15 minutes stale (23:02:06 is the high value known at the 23:19:18 point in time). And they are full statistics (gathered on the table with Auto Sample Size): the number of distinct values is the one we had at 23:02:06, when the statistics were gathered by the “high-frequency” task. The same dual bookkeeping is visible at table level, as the query below shows.
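Here is that query, listing the row count per source (the NOTES column distinguishes them):
select num_rows,last_analyzed,notes
from dba_tab_statistics
where owner=user and table_name='DEMO';
After flushing the database monitoring info, you should see one row with NOTES null (the regular statistics) and, while un-gathered DML is pending, one flagged STATS_ON_CONVENTIONAL_DML (the real-time statistics).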
Cleanup
To clean up the test, I drop the table and set auto_task_status back to the default (off):
set termout off
drop table DEMO;
set termout on
exec dbms_stats.set_global_prefs('auto_task_status','off');
High-Frequency statistics gathering task result
I have run this transaction every second for 10 hours, generating the SQL*Plus run commands with:
for i in {1..36000} ; do echo @ /tmp/sql2.sql ; done
And I awk’d the spool file to get the distinct statistics we had during that time (regular statistics only) and the first time each was known:
awk '
#ignore real-time stats here
/STATS_ON_CONVENTIONAL_DML/{next}
#store update time
/> commit;/{time=$1}
# stats on column inserted with SYSDATE
/^D/{if (logtime[$0]==""){logtime[$0]=time}}
END{for (i in logtime){print i,logtime[i]}}
' hi-freq-stat.log | sort -u
As the high value is the SYSDATE at the time of insert, this shows the staleness:
Here, I started my script at 20:22:16 with an empty table and gathered the statistics, which then showed 0 rows and null low/high values. Then one row was inserted each second. The statistics stayed like that until 20:31:30, when they showed 523 rows. The high value there is from 20:31:29, when the high-frequency task ran. Those stats were used by the optimizer until 20:46:38, when the task ran again.
All those task and job executions are logged and visible from the same view:
select * from DBA_AUTO_STAT_EXECUTIONS order by OPID;
The detail for this specific table comes from DBA_OPTSTAT_OPERATION_TASKS:
select target,start_time,end_time,notes
from DBA_OPTSTAT_OPERATION_TASKS
where target like '%DEMO%' order by OPID desc;
We see the runs every 15 minutes from 20:22:16, then 20:31:30, then 20:46:38… until 23:02:06.
Then, I was still inserting at the same rate, but the task, still running every 15 minutes, gathered statistics on this table only every 30 minutes: 23:02:06, then 23:32:12… and we see the latest ones here only every 60 minutes.
What do you think happened for the high-frequency task to not gather statistics on this table 15 minutes after the 23:02:06 run?
Let’s look at the numbers:
- I insert one row per second
- The task runs every 15 minutes
This means that when the task runs, I have inserted 900 rows. I didn’t change the default STALE_PERCENT, which is 10%. And when do those 900 new rows stop reaching the 10% threshold? When the table has more than 9000 rows.
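You can verify this arithmetic against the dictionary; here is a sketch joining the regular statistics with the monitored DML counters (flushing the monitoring info first so the counters are visible):
exec dbms_stats.flush_database_monitoring_info;
select t.num_rows
      ,m.inserts+m.updates+m.deletes as mods
      ,round(100*(m.inserts+m.updates+m.deletes)/nullif(t.num_rows,0),1) as pct_stale
from dba_tab_statistics t
join dba_tab_modifications m
  on m.table_owner=t.owner and m.table_name=t.table_name
where t.owner=user and t.table_name='DEMO'
and t.notes is null -- the regular statistics row
/
When pct_stale crosses STALE_PERCENT, the next run of the task picks the table up again.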
And now look at the log at 23:02:06 before the task has run:
The statistics show 8222 rows (that is NUM_ROWS from DBA_TAB_STATISTICS, but as I know the values are all unique, I can also read it from NUM_DISTINCT in DBA_TAB_COL_STATISTICS), and the 900 recorded modifications count for 11%. This is higher than 10%, so the table statistics are stale, and the next run of the task re-gathered them:
A side note: the “STATS_ON_CONVENTIONAL_DML” row disappeared because statistics were just gathered. But that’s for later…
Now that the table is known to have 9078 rows, when will it be stale again? 15 minutes later, we will have inserted 900 more rows, but 900/9078 = 9.9%, which is under the 10% threshold. This is why the next run of the task did not gather statistics again. But after another 15 minutes, 1800 rows have been inserted, and that is 19.8% staleness.
You see the picture: this high-frequency task takes care of the very volatile tables. It runs every 15 minutes (the default AUTO_TASK_INTERVAL) and spends no more than 1 hour (the default AUTO_TASK_MAX_RUN_TIME).
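Both defaults are global preferences and can be changed with SET_GLOBAL_PREFS, with values in seconds. A sketch, keeping the documented defaults:
-- 900 seconds = 15 minutes between runs of the task
exec dbms_stats.set_global_prefs('auto_task_interval','900');
-- 3600 seconds = 1 hour maximum run time
exec dbms_stats.set_global_prefs('auto_task_max_run_time','3600');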
Real-Time statistics
And to go further, Real-Time Statistics not only count the modifications to evaluate staleness but also store information about the low and high values.
Here is my awk script now showing the distinct information from the real-time statistics:
awk '
#keep only the real-time stats here
#store update time
/> commit;/{time=$1}
# stats on column inserted with SYSDATE
/^D.*STATS_ON_CONVENTIONAL_DML/{if (logtime[$0]==""){logtime[$0]=time}}
END{for (i in logtime){print i,logtime[i]}}
' hi-freq-stat.log | sort -u
and an excerpt around the time when the high-frequency task gathered this table’s statistics only every 15 or 30 minutes:
While I was inserting every second, the staleness is only 2 minutes. This reduces even further the risk of out-of-range queries.
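To picture the out-of-range problem, consider a hypothetical query on the most recent inserts (this predicate is mine, not part of the test script):
-- out-of-range example: the predicate is beyond a stale high value
select count(*) from DEMO where d > sysdate - interval '5' minute;
With only the maintenance-window statistics, the known high value could be hours in the past and the optimizer would estimate close to zero rows here; with real-time statistics, the high value is at worst a couple of minutes old.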
Basically, the high-frequency task:
- gathers statistics in the same way as in the maintenance window job (stale tables first) but more frequently and using fewer resources
and real-time statistics:
- add some fresher information, not from gathering on the table, but inferred from the DML that was monitored.
A little remark though: there’s no magic to make new query executions use the new statistics immediately. The high-frequency task just calls dbms_stats with the default rolling invalidation, so dependent cursors are invalidated progressively over a time window rather than all at once.
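For reference, this default is DBMS_STATS.AUTO_INVALIDATE. A sketch of the explicit equivalent when gathering manually:
-- rolling invalidation: dependent cursors are invalidated over a time window
exec dbms_stats.gather_table_stats(user,'DEMO',no_invalidate=>dbms_stats.auto_invalidate);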