Oracle’s SCN Flaw – could it happen in DB2?

6 Responses

  1. jipy says:

    Hi Ember –

    I manage commerce DB2 at {Company Redacted}. I like your blog and occasionally head over here when I find the topic that pops up in my RSS reader interesting.

    Believe it or not, this has happened to me once in a previous life. It was a DB that managed a service bus architecture (webMethods, I believe). The transactions were very short-lived and the application always disconnected after the work. This frequent deactivation of DB2 caused log files to be truncated, which advanced the LSN. Eventually it hit the limit, got reset to FFFFFFFs, and the DB became inaccessible/read-only. I think we had to export and reload the data to fix the issue.

    Going forward, we manually activated the DB every time after a recycle, so that the disconnects didn’t deactivate the DB and in turn didn’t cause the log truncations.

    Fun memories.

    Cheers –

    • Ember Crooks says:

      Good to know it can happen, and a good reminder of one of the many reasons to activate your database if you don’t have an application that holds a continuous connection like WebSphere Commerce does. I support some ESB databases myself, but nothing with the kind of volume you saw. In my experience they’re usually smaller databases since most of the data is transitory – so it is at least possible to export and reload the data, though no outage like that is good.

      I notice that it was only with DB2 9.7 that the number was increased to the value I listed above, so they’re apparently aware of the need to increase it.

      • Ember Crooks says:

        One other comment – one of the most concerning things with the Oracle flaw is that once you reach the limit for one database, you reach it for all interconnected databases. Bad enough for one database, but for multiple databases in your organization?

  2. David Tolleson says:

    Hi, Ember,

    Data Replication affects the LSN of target databases by the simple fact that it’s replaying transactions from the source database. However, it’s just issuing SQL (insert, update, delete) the same way any other app does. It doesn’t touch or even know about the target’s LSN directly (unless you’re also capturing from the target :).

    In other words, Data Replication would likely only push the target towards an LSN limit if the source workload was also pushing the source database towards the limit.


    • Ember Crooks says:

      Thanks for the details. I haven’t actually worked with Replication since version 7, and while I remember that’s where I learned how to find the LSN using db2flsn, I was thinking that it was only for some investigative purposes on one side, not that it was synchronized. So even my statement that it’s “used” by replication is an overstatement – it’s used by replication the same way as any other app “uses” it.

  3. Pete Suhner says:

    Even though IBM are increasing the size of the RBA (Relative Byte Address) field for DB2 z/OS 11, several companies have been hit by this issue for their DB2 z/OS running in single system mode in the recent past. So they would have been happy if IBM had come up with this improvement one or two versions earlier. While the RBA can be reset to zero, some downtime is unavoidable to get this done. DB2 z/OS also has a soft limit after which a warning message is issued. And IBM provides good and active support for customers who realize that the hard limit is coming closer.

    The whole thing is however NOT an issue for DB2 z/OS in Data Sharing mode as these systems use a different implementation (LRSN – Logical Record Sequence Number).

    So it looks like yes, it can happen to some flavours of DB2, but it is extremely unlikely to happen to recent versions of DB2 LUW.

    And one might assume that vendors sometimes copy the wrong ideas and concepts from their competitors. One of the many shades of “copy right”…
