Where to check Data Guard gap?

5 min readJan 27, 2019

At work, we had a discussion with well-known colleagues, Luca Canali and Ludovico Caldara, about where we check that Data Guard recovery works as expected without gap. Several views can be queried, depending on the context. Here are a few comments about them.

v$database

This is my preferred because it relies on the actual state of the database, whatever the recovery process is:

SQL> select scn_to_timestamp(current_scn) 
     from v$database;SCN_TO_TIMESTAMP(CURRENT_SCN)
----------------------------------------------------------
22-JAN-19 03.08.32.000000000 PM

This reads the current System Change number (DICUR_SCN from X$KCCDI) and maps it to a timestamp (using the mapping SMON_SCN_TIME table).

However, relying on a function which is available only when the database is opened, this is easy only with Active Data Guard. When the database is in mount state, you will get ORA-00904: “SCN_TO_TIMESTAMP”: invalid identifier.

Note that I’ve also seen cases where, in case of gap, the SMON_SCN_TIME was not up-to-date and I got ORA-08181: specified number is not a valid system change number. Then this is not for automatic monitoring.

Without Active Data Guard, you need to do the SCN to timestamp conversion on the primary. Or read the SCN from the datafiles, but this is not the latest apply but the lastest checkpointed:

SQL> alter session set nls_date_format='dd-mm-yy hh24:mi:ss';
Session altered.SQL> select max(checkpoint_time) from v$datafile_header;
MAX(CHECKPOINT_TI
-----------------
26-01-19 17:45:04

Reading from V$DATABASE does not rely on Data Guard and then is also available when the MRP is not started and also in Standard Edition non-managed standby.

Update 13–08–2019

The datafile headers may not be updated when MRP is running, see:

Oracle Database 18c changed MRP checkpointing from log boundary to time-based

The story started with an always-unpleasant alert log message: 2019-08-08T09:46:32.915070+02:00 PR00 (PID:10646): Media…

www.usn-it.de

gv$recovery_progress

This is Luca’s favored one (see is adg.sql script among many other interesting ones in https://github.com/LucaCanali/Oracle_DBA_scripts):

SQL> select inst_id, max(timestamp) 
     from gv$recovery_progress group by inst_id;INST_ID MAX(TIMESTAMP)
---------- --------------------
         1 22-JAN-2019 15:08:51

Where does this information come from? If you look at the execution plan you will see that it reads X$KSULOP which is the X$ that is behind V$SESSION_LONGOPS. You can get the same timestamp from it:

SQL> alter session set nls_date_format='dd-mm-yy hh24:mi:ss';
Session altered.SQL> select inst_id,opname,timestamp from gv$session_longops 
where opname='Media Recovery' and target_desc='Last Applied Redo';INST_ID OPNAME               TIMESTAMP
---------- -------------------- -----------------
         1 Media Recovery       26-01-19 18:56:39
         1 Media Recovery       26-01-19 19:40:35

As this information comes from what the MRP (Managed Recovery Process) logs, this view is available only when the recovery is running (APPLY-ON).

v$managed_standby;

Talking about what is logged by MRP, Ludo goes to the MRP status in v$managed_standby to see the sequence and block# increase. This is very interesting as we can compare the remaining work to do, from what is received by RFS:

19:59:46 SQL> select inst_id,process,status,client_process,thread#,sequence#,block#,blocks from gv$managed_standby;I PROCESS   STATUS       CLIENT_P THREAD# SEQUENCE# BLOCK# BLOCKS
- --------- ------------ -------- ------- --------- ------ ------
1 ARCH      CLOSING      ARCH           1        34  32768    481
1 DGRD      ALLOCATED    N/A            0         0      0      0
1 DGRD      ALLOCATED    N/A            0         0      0      0
1 ARCH      CLOSING      ARCH           1        41  28672   2046
1 ARCH      CLOSING      ARCH           1        37  30720   1804
1 ARCH      CLOSING      ARCH           1        40  26624   1608
1 RFS       IDLE         Archival       1         0      0      0
1 RFS       IDLE         UNKNOWN        0         0      0      0
1 RFS       IDLE         LGWR           1        42    121      1
1 RFS       IDLE         UNKNOWN        0         0      0      0
1 MRP0      APPLYING_LOG N/A            1        42    121  4096011 rows selected.19:59:50 SQL> /I PROCESS   STATUS       CLIENT_P THREAD# SEQUENCE# BLOCK# BLOCKS
- --------- ------------ -------- ------- --------- ------ ------
1 ARCH      CLOSING      ARCH           1        34  32768    481
1 DGRD      ALLOCATED    N/A            0         0      0      0
1 DGRD      ALLOCATED    N/A            0         0      0      0
1 ARCH      CLOSING      ARCH           1        41  28672   2046
1 ARCH      CLOSING      ARCH           1        37  30720   1804
1 ARCH      CLOSING      ARCH           1        40  26624   1608
1 RFS       IDLE         Archival       1         0      0      0
1 RFS       IDLE         UNKNOWN        0         0      0      0
1 RFS       IDLE         LGWR           1        42    124      1
1 RFS       IDLE         UNKNOWN        0         0      0      0
1 MRP0      APPLYING_LOG N/A            1        42    124  4096011 rows selected.

v$archived_log

V$ARCHIVED_LOG has an ‘APPLIED’ flag, but it is not really helpful here as it does not consider the real-time apply. In the following screenshot the changes up to 20:35:46 in sequence# 55 have been applied but V$ARCHIVED_LOG shows sequence 52 as not applied:

This view is about archived logs. But before being archived, the redo stream is received to the standby logs.

v$standby_log

With real-time apply (RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE) the redo is applied as soon as it is received in the standby redo logs. The gap should be small and is visible in v$managed_standby (number of blocks between RFS from LGWR and MRP apply). I’ve seen some monitoring queries on V$STANDBY_LOG. The idea is to read the actual state of the transport, in the same idea that when I read V$DATABASE for the actual state of apply, without relying on what is logged by the processes:

SQL> select max(last_time) 
     from gv$standby_log;MAX(LAST_TIME)
--------------------
22-JAN-2019 15:08:55

However, this query reads X$KCCSL which is not very efficient as it reads the standby redo log files. This can be long when they are large and full:

So… be careful with this one.

v$dataguard_stats

The DG Broker ‘show database’ displays the gap information. This comes from V$DATAGUARD_STATS as Data Guard checks the state at regular interval and stores the latest here:

SQL> select name||' '||value ||' '|| unit
     ||' computed at '||time_computed 
     from v$dataguard_stats;NAME||''||VALUE||''||UNIT||'COMPUTEDAT'||TIME_COMPUTED
------------------------------------------------------transport lag +00 00:00:00 day(2) to second(0) interval computed at 01/27/2019 22:08:33apply lag +00 00:00:01 day(2) to second(0) interval computed at 01/27/2019 22:08:33apply finish time +00 00:00:06.493 day(2) to second(3) interval computed at 01/27/2019 22:08:33

You must always check when the value was calculated (TIME_COMPUTED) and may add this to gap to estimate the gap from the current time, as with DGMGRL:

Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      0 seconds (computed 1 second ago)
  Apply Lag:          1 second (computed 1 second ago)
  Average Apply Rate: 0 Byte/s
  Real Time Query:    OFF
  Instance(s):
    CDB2