
Oracle GoldenGate BRINTERVAL: Why You Should Always Configure It to Avoid Extract Lag

Are you dealing with Extract lag, especially after a restart? You’re not alone, and this post might help you. Let’s dive in!

 

Recently, I faced an interesting case at a customer site that perfectly illustrates the importance of the BRINTERVAL parameter in Oracle GoldenGate. If you don’t know how it works, you really should.

 

 

What is BRINTERVAL in Oracle GoldenGate?

The BRINTERVAL parameter controls how often GoldenGate creates bounded recovery checkpoints. Unlike normal checkpoints, bounded recovery checkpoints include not just the read/write position, but also the full transaction state, memory structures, and open transaction context. This means that after an unexpected stop, GoldenGate can resume from a safe point without having to go all the way back to the oldest open transaction.

Real Case: Extract Recovery Goes Back 20 Hours

The environment had two Extract processes. One was configured with BRINTERVAL 1H, while the other was running without this parameter.

So, what happened after an unexpected stop?

  • ✅ The Extract with BRINTERVAL 1H recovered quickly.
  • ❌ The other one started a recovery process that went back more than 20 hours, because of an old open transaction, and this significantly delayed the restart.

 

That single missing parameter made a huge difference in the behavior of the two Extracts. The one without BRINTERVAL had to scan the archived logs to find the beginning of an open transaction that had started 20 hours earlier, then read through all those logs until it reached the last valid checkpoint. Only then could it resume capturing new transactions. Meanwhile, the replication lag kept growing until the recovery completed, and it took nearly 2 hours before the Extract caught up and real-time capture was re-established.
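To put rough numbers on that difference, here is a deliberately simplified Python model of where reading has to restart. It is only a sketch, not how GoldenGate computes its restart position internally: the function name `replay_start` and the timestamps are hypothetical, and it assumes the last bounded recovery checkpoint successfully persisted the state of every transaction open at that moment.

```python
from datetime import datetime, timedelta

def replay_start(oldest_open_txn_start, last_br_checkpoint=None):
    """Toy model: earliest point the Extract must reread after a restart.

    Assumption: a bounded recovery (BR) checkpoint holds the state of all
    transactions that were open when it was written.
    """
    if last_br_checkpoint is None:
        # No usable BR checkpoint: go back to the start of the oldest open transaction.
        return oldest_open_txn_start
    # With a BR checkpoint, older transaction state is restored from disk,
    # so reading resumes no earlier than that checkpoint.
    return max(oldest_open_txn_start, last_br_checkpoint)

# Hypothetical timeline modeled on the case above.
stop_time = datetime(2025, 1, 2, 12, 0)
txn_start = stop_time - timedelta(hours=20)        # old open transaction
br_checkpoint = stop_time - timedelta(minutes=45)  # last hourly BR checkpoint

print(stop_time - replay_start(txn_start))                 # 20:00:00 of redo to reread
print(stop_time - replay_start(txn_start, br_checkpoint))  # 0:45:00 of redo to reread
```

In this model, the Extract without a usable bounded recovery checkpoint rereads 20 hours of redo, while the one with an hourly checkpoint rereads less than an hour.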

 

 

In real-time replication, this kind of lag after a single stop and start is simply unacceptable, don’t you agree? The longer the Extract spends catching up, the longer your entire replication flow waits for new data before real-time replication is fully re-established.

 

After the process finally completed its recovery, the BRINTERVAL parameter was added to the parameter file. However, parameter file changes only take effect at the next start, and restarting was not an option: it would have triggered the entire long recovery cycle all over again.

 

The immediate fix was to issue these commands against the running process:

SEND EXTRACT <extract_name> BR BRSTATUS
SEND EXTRACT <extract_name> BR BRINTERVAL 1H

 

This forced the creation of a new bounded recovery checkpoint and ensured that future stops would recover much faster.

Now, here’s the key part: GoldenGate has a default BRINTERVAL of 4 hours, even if you don’t set it explicitly. So why did the recovery go back more than 20 hours?

 

The explanation lies in how Bounded Recovery works. As described in a recent Oracle blog post by Sourav Bhattacharya:

  • A bounded recovery checkpoint is not the same as a normal checkpoint.
  • It persists not only the read/write positions, but also the transaction context (open transactions, memory structures, threads, etc.).
  • If something prevents bounded recovery files from being written, such as I/O issues, permission problems, parameter changes, or very long transactions that were never captured in a BR cycle, GoldenGate may fall back to the last available recovery point, which can be much older than the 4-hour default.

 

 

That’s exactly what happened in this case. Although the default is technically 4h, relying on it can be risky because it depends on successful and consistent checkpointing.

How to Configure BRINTERVAL

According to the official Oracle documentation (OGG 23ai), the limits and default for BRINTERVAL are:

  • Minimum: 20 minutes
  • Maximum: 96 hours
  • Default: 4 hours

 

 

Also, the interval you set must be a multiple of the CHECKPOINTSECS value of the Extract process. For example, if CHECKPOINTSECS is 60 seconds, you can set BRINTERVAL to 1h, 2h, 4h, and so on. CHECKPOINTSECS is mentioned here only for context; you should not change it without a recommendation from Oracle Support.
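If you want to sanity-check a value before putting it in a parameter file, the rules above can be sketched in a few lines of Python. This is only an illustrative helper, not an Oracle tool: the function names are hypothetical, and the parsing assumes values written as a number followed by `M` (minutes) or `H` (hours), like the `1H` used in this post.

```python
import re

MIN_SECONDS = 20 * 60        # minimum quoted above: 20 minutes
MAX_SECONDS = 96 * 60 * 60   # maximum quoted above: 96 hours

def parse_brinterval(value):
    """Convert a value such as '1H' or '30M' to seconds (assumed format)."""
    match = re.fullmatch(r"(\d+)\s*([MH])", value.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized BRINTERVAL value: {value!r}")
    number, unit = int(match.group(1)), match.group(2)
    return number * (3600 if unit == "H" else 60)

def check_brinterval(value, checkpointsecs):
    """Return a list of problems; an empty list means the value passes."""
    seconds = parse_brinterval(value)
    problems = []
    if seconds < MIN_SECONDS:
        problems.append("below the 20-minute minimum")
    if seconds > MAX_SECONDS:
        problems.append("above the 96-hour maximum")
    if seconds % checkpointsecs != 0:
        problems.append(f"not a multiple of CHECKPOINTSECS ({checkpointsecs}s)")
    return problems

print(check_brinterval("1H", 60))   # []
print(check_brinterval("10M", 60))  # ['below the 20-minute minimum']
```

The CHECKPOINTSECS value is passed in explicitly so the check uses whatever your Extract actually runs with, rather than a guessed default.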

 

Here’s how you can configure it in your parameter file:

EXTRACT E_GGBR
USERIDALIAS ogg_connect
EXTTRAIL gg
-- Bounded recovery checkpoint every hour
BR BRINTERVAL 1H
TABLE SOURCE.*;

 

In this example, GoldenGate creates a bounded recovery checkpoint every hour, ensuring faster recovery even if there are open long-running transactions.

How to Validate Bounded Recovery

 

You can check that bounded recovery checkpoints are being written properly:

SEND EXTRACT <extract_name> SHOWTRANS TABULAR → shows open transactions.
INFO EXTRACT <extract_name> SHOWCH → shows the checkpoints, including bounded recovery files.

In addition, monitor the BR directory:

  • Classic Architecture (CA): $OGG_HOME/BR
  • Microservices Architecture (MA): <DEPLOYMENT_HOME>/var/lib/data/BR

 

These directories should contain bounded recovery files created at the expected intervals. If the directory is empty or the files are stale, bounded recovery is not functioning properly.
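A staleness check like this is easy to automate. The following Python sketch reports how old the newest file under a BR directory is and compares that age with the configured interval. It makes several assumptions: the function names are hypothetical, it treats any file anywhere under the directory as a bounded recovery file, and the slack factor of 2 is an arbitrary tolerance you should tune.

```python
import time
from pathlib import Path

def newest_file_age_seconds(br_dir, now=None):
    """Age in seconds of the most recently modified file under br_dir.

    Returns None if the directory is missing or contains no files.
    """
    root = Path(br_dir)
    if not root.is_dir():
        return None
    now = time.time() if now is None else now
    mtimes = [p.stat().st_mtime for p in root.rglob("*") if p.is_file()]
    if not mtimes:
        return None
    return now - max(mtimes)

def br_looks_healthy(br_dir, brinterval_seconds, slack=2.0, now=None):
    """True if the newest file is younger than slack * BRINTERVAL (assumed rule)."""
    age = newest_file_age_seconds(br_dir, now)
    return age is not None and age <= brinterval_seconds * slack

if __name__ == "__main__":
    # Replace with your own BR directory; this relative path is only a placeholder.
    print(br_looks_healthy("BR", brinterval_seconds=3600))
```

Run on a schedule, a check like this can raise an alert when the files stop being refreshed, well before the next unexpected stop.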

 

Another excellent source of information on this topic, with a hands-on approach and command examples, can be found in Gavin Soorma’s blog: Oracle GoldenGate Bounded Recovery

Final thoughts

Although GoldenGate provides a default of 4 hours, my experience and Oracle’s own documentation show that relying on it can be risky. The best practice is simple: always set BRINTERVAL explicitly, and choose an interval that matches your availability requirements. For most real-time, mission-critical replication use cases, 1 hour is a solid choice.