DB2 Version 10
There are certainly more authoritative souces on 10 than me, but having just come from the conference and learned at least enough to get certified, I thought I’d share. They seem to be pushing the big data/data warehousing angle, which I find a bit annoying as a OLTP DBA, because I think there are some good/interesting features for OLTP/e-commerce too. You may notice a lot of my links below are to the info center – that’s because as of this writing, there’s not a lot of information out on DB2 10 yet.
When DB2 10 is a Real Option
First of all, I’d never recommend going to a new version until at least FixPack 2, and more likely FixPack 3 or 4. There are usually some pretty significant bugs in GA and the first FixPack. If you have a vended application(such as WebSphere Commerce), most of the time, you shouldn’t attempt an upgrade until your vendor specifically states that they support the new version. I’d love to be at a company willing to take the risk for beta or for early adoption, but am just not.
The feature that stands out above the rest, assuming it is true, is the performance improvement. IBM is saying that with no changes, most clients will see 5% to 30% or more performance improvement. That sounds pretty cool to me. Will be interesting to see how that pans out without the close involvement of beta.
New Layer of Abstraction Between Tablespace Container and Storage
This is mostly being sold to implement HOT/WARM/COLD forms of data storage – so you can keep most frequently data on SSD, and so forth. That’s a great idea, and if I had it now, I’d be trying to get SSD for my product catalog tables. But there’s more this can be used for too. Essentially instead of creating tablespace containers on paths or in files, you’ll start creating them in storage groups. All tablsepaces will be AS (Automatic Storage) tablespaces, and instead of the DB2 9 way of only one set of paths for all AS tablespaces in the database, you’ll be able to specify different sets of paths and associate those with different tablespaces. I applaud this, as there are still many clients who can and want to fully separate the I/O paths for Data, Indexes, and Temp.
The re-balancing of everything here sounds pretty seamless. You can have up to 256 storage groups. So that’s not enough for a PeopleSoft database to have separate storeage paths for each tablespace, but come on, that’s a bit ridiculous. This is also closer to the way DB2 for z/OS does things if I understand correctly. Serge Rielau’s blog has a good description of the approach as IBM envisions it with multi-temperature data: https://www.ibm.com/developerworks/mydeveloperworks/blogs/SQLTips4DB2LUW/entry/multitempdata?lang=en_us
New Join Strategies
Gap/Jump scans sound pretty cool. Basically before DB2 10, db2 cannot use an index unless the column(s) it needs is first in order in the index. With Gap/Jump scans, DB2 10 will consider the use of these, but based on the implementation, it still may not choose them unless the cardinality of the index column(s) being skiped is very low. Still, it will be interesting to see how this performs in the real world. The Info Center mentions them here: http://www.ibm.com/support/knowledgecenter/SSEPGG_10.1.0/com.ibm.db2.luw.admin.perf.doc/doc/c0005299.html
RCAC – New Security Model
I actually find this pretty interesting. Basically, you can use permissions and a where clause to limit the rows in a table that a user or a role can even see exists, and you can use a CASE statement in a mask to control what a user or role sees – if they’re not allowed to see something, you can choose what else they see instead. I can espeically see this being useful for ensuring that users cannot even see the encrypted value of a credit card number or whatever. The other thing that stood out to me here is that once you turn this on for a table, nobody has permissions unless explicitly defined – which could be a good or a bad thing.
While I see a place for LBAC, this new RCAC seems like something I’d be more likely to use myself.
Info Center entry on this feature: http://www.ibm.com/support/knowledgecenter/SSEPGG_10.1.0/com.ibm.db2.luw.admin.sec.doc/doc/c0057423.html
HADR: Multiple Standbys
Thrilled to see this feature. The way they’re doing it, you still have your primary and your “principal” standby – they can use any of the synchronization modes. Then you can have up to 2 other standbys using super-async mode (meaning DB2 sends off the log data, but doesn’t wait for a response before finishing the transaction). This will allow you to have HADR across a larger distance. You can also have one(or both) of those standbys on a delayed roll-forward, allowing you to use HADR for recovery from user or logical data issues, not just system issues. Assuming you get your licensing right, you can also have either of these auxiliary standbys allow reads on the standby.
The feature I’m still waiting for is tighter integration with tsa. I’d love to be able to use db2haicu to get tsa out of some of the wonky states it appears to get into.
Info Center on this feature: http://www.ibm.com/support/knowledgecenter/SSEPGG_10.1.0/com.ibm.db2.luw.admin.ha.doc/doc/c0059994.html
Time Travel Query
Conceptually, this is cool, and similar to Oracle’s Flashback Query as I understand it. There are essentially 3 forms of temporal tables in this area. System-period temporal tables record changes to the table and when they occurred (using a history table), and then allow you to query the data “AS OF” a certain time. Application-period temporal tables include columns that note when a particular row was valid (such as dates of an insurance policy). Bi-temporal tables allow you to do both at the same time. Part of the functionality allows you to have columns that are implicitly hidden. A dba I respect actually suggested hiding every column in a table so that a “select *” wouldn’t even work. The columns are still accessible by name, so it actually doesn’t seem like a bad way to force developers to use best practices when accessing table data.
Info Center on this feature: http://www.ibm.com/support/knowledgecenter/SSEPGG_10.1.0/com.ibm.db2.luw.admin.dbobj.doc/doc/c0058476.html
CDI – Continuous Data Ingest
This feature is on the client, so you could use a db210 client to run this against a 9.7 or 9.5 (the old two down, one up rule) database. They said it was faster than LOADing into a staging table and then inserting that data into the real table. But I won’t be convinced that it is feasible or fast until I’ve seen it, since I heard multiple times that it was just regular old SQL inserts/updates behind the scenes. I guess the idea is that you’ll just Ingest data into your database “continuously”, and when they said a similiar thing for REORGS, it turned out to only be true up to a point.
Info Center on this feature: http://www.ibm.com/support/knowledgecenter/SSEPGG_10.1.0/com.ibm.db2.luw.admin.dm.doc/doc/c0057237.html
Compression, WLM, and Other Pay-For-Use Features
I’m sure compression is great. And the new Adaptive Compression in DB2 10 that adds compression at the page level is probably also good. But it doesn’t come with WebSphere Commerce, and many of my clients OLTP databases are less than 100GB. So I have no idea when I’m actually going to get my hands on it.
Info Center on this feature: http://www.ibm.com/support/knowledgecenter/SSEPGG_10.1.0/com.ibm.db2.luw.admin.dbobj.doc/doc/c0059054.html
Same thing goes for WLM. I’ll probably start using thresholds and actions on the default class to do what I used to do with the db2 governor, because the licensing as I understand it allows me to do that. But the new CPUSHARES and CPULIMIT and all that, while they sound neat, I’m not likely to get my hands on any time soon. Note that the wlm dispatcher has a 3% overhead, so only have it enabled if you really plan to use it.
If you’re thinking about db2 10, be sure to check out this webinar on June 22, 2012: http://www.idug.org/p/fo/et/thread=40636
Update: Here’s the book I used to prepare for the 10.1 certifications – it is pretty good: http://public.dhe.ibm.com/common/ssi/ecm/en/imm14091usen/IMM14091USEN.PDF