Oracle Refresh Group consistency with nested materialized views.
A Refresh Group can be used to refresh a list of materialized views and ensures that all data is read with transaction consistency. This post is about the meaning of consistency when one materialized view depends on the other one. That’s just a quick test to understand the behavior. Any comment is welcome (preferably on twitter — @FranckPachot)
Documentation
The documentation mentions:
To preserve referential integrity and transactional (read) consistency among multiple materialized views, Oracle Database can refresh individual materialized views as part of a refresh group. After refreshing all of the materialized views in a refresh group, the data of all materialized views in the group correspond to the same transactionally consistent point in time.
But what does that mean when one materialized view depends on the other? This can happen when we do a detailed aggregate, and then build finer aggregates from them. What I would like in this case is all materialized views showing data as it were at the same point-in-time in the original tables. But is that possible? Here is a little test.
Demo
I create a DEMO table where I’ll store a sequence number and a timestamp:
SQL> connect demo/demo@//localhost/PDB1
Connected.
SQL> set echo on
SQL> create table DEMO as select 0 seq,current_timestamp now
from xmltable('1 to 1000');Table created.
I run a background job that will constantly increase the sequence number and update the timestamp:
SQL> set escape on
SQL> host { echo "set echo on" ; echo "whenever sqlerror exit failure" ; echo "exec loop update demo set seq=seq+1,now=current_timestamp where rownum=1; commit; end loop;" ; } | sqlplus -s demo/demo@//localhost/PDB1 \& :SQL> host sleep 1;
This is a loop that updates at high rate, so do it in a noarchivelog database.
Now that I have this constantly changing table, I create a first materialized view DEMOMV1 which is just a select on DEMO:
12:17:07 SQL> create materialized view DEMOMV1
as select * from DEMO;Materialized view created.12:17:07 SQL> select max(seq),max(now)
from DEMOMV1; MAX(SEQ) MAX(NOW)
___________ ______________________________________
21393 01-JUL-19 12.17.07.221922000 PM GMT
This mview contains the original data as-of its creation time: SEQ=21393
And then I create DEMOMV2 that queries both the DEMO table and the DEMOMV1 materialized view, to show the consistency easily:
12:17:07 SQL> create materialized view DEMOMV2 as
select 'MVIEW' source,seq,now from DEMOMV1
union all
select 'TABLE' source,seq,now from DEMO;Materialized view created.12:17:07 SQL> select source,max(seq),max(now)
from DEMOMV2 group by source;
SOURCE MAX(SEQ) MAX(NOW)
_________ ___________ ______________________________________
MVIEW 21393 01-JUL-19 12.17.07.221922000 PM GMT
TABLE 22099 01-JUL-19 12.17.07.336869000 PM GMT
I see the original data as-of two point in times:
- DEMO table was read at the time of the creation of this DEMOMV2 mview, which contains data at this point-in-time where SEQ was 22099
- DEMOMV1 was read at the time of the creation of this DEMOMV2 mview, which contain data from the last refresh of DEMOMV1, which means the original data when SEQ was 21393
So far so good, this is exactly what I expected. The refreshes were done one after the other. I cannot expect to see the original data as of the same point-in-time.
Refresh Group
Here is the creation of the refresh group:
12:17:07 SQL> exec dbms_refresh.make('DEMO.DEMORGROUP', list=>'DEMOMV1,DEMOMV2', next_date=>sysdate+100, interval=>'null');PL/SQL procedure successfully completed.
and the current state of my tables:
12:17:07 SQL> select max(seq),max(now) from DEMO; MAX(SEQ) MAX(NOW)
___________ ______________________________________
22214 01-JUL-19 12.17.07.377349000 PM GMT12:17:07 SQL> select max(seq),max(now) from DEMOMV1; MAX(SEQ) MAX(NOW)
___________ ______________________________________
21393 01-JUL-19 12.17.07.221922000 PM GMT12:17:07 SQL> select source,max(seq),max(now) from DEMOMV2 group by source; SOURCE MAX(SEQ) MAX(NOW)
_________ ___________ ______________________________________
MVIEW 21393 01-JUL-19 12.17.07.221922000 PM GMT
TABLE 22099 01-JUL-19 12.17.07.336869000 PM GMT12:17:07 SQL> select max(seq),max(now) from DEMO; MAX(SEQ) MAX(NOW)
___________ ______________________________________
22241 01-JUL-19 12.17.07.386421000 PM GMT
DEMO had some changes, but the materialized views are still at the same state as they were not refreshed.
Now refreshing the group (I enable some traces before):
12:17:07 SQL> alter session set tracefile_identifier=rgroup
events='10979 trace name context forever, level 1
: 10046 trace name context forever, level 3';Session altered.12:17:07 SQL> exec dbms_refresh.refresh('DEMO.DEMORGROUP');PL/SQL procedure successfully completed.
DEMO has newer sequence:
12:17:07 SQL> select max(seq),max(now) from DEMO; MAX(SEQ) MAX(NOW)
___________ ______________________________________
22713 01-JUL-19 12.17.07.532700000 PM GMT
Then I expect the materialize views to have been refreshed when DEMO was at a point where SEQ was between 22214 and 22713.
12:17:07 SQL> select max(seq),max(now) from DEMOMV1; MAX(SEQ) MAX(NOW)
___________ ______________________________________
22507 01-JUL-19 12.17.07.464718000 PM GMT
That’s right, this refresh of DEMOMV1 occurred when SEQ was 22507.
Now the big question is about DEMOMV2:
12:17:07 SQL> select source,max(seq),max(now)
from DEMOMV2 group by source; SOURCE MAX(SEQ) MAX(NOW)
_________ ___________ ______________________________________
MVIEW 22507 01-JUL-19 12.17.07.464718000 PM GMT
TABLE 22610 01-JUL-19 12.17.07.499890000 PM GMT
Same as when I did the refreshes manually, one by one: DEMOMV2 was refreshed after DEMOMV1 and sees DEMOMV1 as-of this point in time where SEQ was 22507. And data from the original table is fresher than that: SEQ was 22610 when DEMOMV2 was refreshed.
This means that if you build aggregates on top of aggregates, while the base table is changing, then the summaries will not show the aggregate data as-of the same point-in-time. It is perfectly normal that they are stale, but they don’t have the same staleness because they were nested.
Flashback query?
For a real consistency, I would have expected that DEMOMV2 had read DEMO as-of the same point-in-time as the other mviews in the refresh group. But here it behaved like a read-commited isolation level transaction: consistency is at statement level.
We can think of Flashback Query in this case. This would be fine to read DEMO. But reading DEMOMV1 with flashback query would ignore the changes done in the transaction, and I want to read the refreshed DEMOMV1 or I’ll get values from the previous refresh.
Anyway, flashback mode is disallowed here:
Refresh order
We do not define any order when we add a materialized view to a refresh group. And, as we have seen, the order matters when one mview reads another one. The refresh group takes care of the dependencies as we can see in the 10979 trace:
The ‘Sched. In’ and ‘Out’ link the dependencies. Here DEMOMV2 (which is #2 in the group) takes input from DEMOMV1 (which is #1) and then is scheduled after it.
Actually, in this example, the refresh works exactly as if I had refreshed DEMOMV2 with the nested option, triggering the refresh of the other mviews it depends on, like:
SQL> exec dbms_mview.refresh('DEMO.DEMOMV2',nested=>true);
Serializable
Finally, what I would have expected here is that the refresh was done in a SET TRANSACTION ISOLATION LEVEL SERIALIZABLE so that the refresh of DEMOMV2 sees DEMO at the same query SCN as when DEMOMV1 was refreshed, and sees DEMOMV1 as refreshed (uncommitted changes visible from the current transaction). Here is how it should work in my opinion, showing it with normal tables:
12:07:14 SQL> set transaction isolation level serializable;Transaction succeeded.12:07:14 SQL> delete from DEMOT1;1 row deleted.12:07:14 SQL> insert into DEMOT1 select * from DEMO;1 row created.12:07:14 SQL> delete from DEMOT2;2 rows deleted.12:07:14 SQL> insert into DEMOT2
select 'MVIEW' source,seq,now from DEMOT1
union all
select 'TABLE' source,seq,now from DEMO;2 rows created.12:07:14 SQL> select max(seq),max(now) from DEMO; MAX(SEQ) MAX(NOW)
___________ ______________________________________
2023485 02-JUL-19 12.07.14.394374000 PM GMT12:07:14 SQL> select max(seq),max(now) from DEMOT1; MAX(SEQ) MAX(NOW)
___________ ______________________________________
2023485 02-JUL-19 12.07.14.394374000 PM GMT12:07:14 SQL> select source,max(seq),max(now) from DEMOT2 group by source; SOURCE MAX(SEQ) MAX(NOW)
_________ ___________ ______________________________________
MVIEW 2023485 02-JUL-19 12.07.14.394374000 PM GMT
TABLE 2023485 02-JUL-19 12.07.14.394374000 PM GMT
This is what I would like to do with a materialized view refresh, but unfortunately:
This failed in the refresh of DEMOMV1:
/* MV_REFRESH (INS) */INSERT /*+ BYPASS_RECURSIVE_CHECK */ INTO "DEMO"."DEMOMV1" select * from DEMO
About the reason, it doesn’t work as I would expect it, my guess is that it has something to do with: the fact that a refresh is somehow considered DDL:
So… be careful if you have materialized views on top on the other, and want a consistent refresh, and the base tables are changing. And test it. The consistency issues are always discovered too late in production. Do you have UAT checking that?