DBClean – Guest Users
This is part of a series focusing on the details of each item that needs to be pruned in a Commerce database.
What is it?
Guest Users are created by Commerce in the database. Technically they are only needed for when you allow a customer to add items to their cart before logging in or registering, or for a few other actions. In reality, if the developers have not properly used/overridden the ‘isGeneric’ method, then you can end up with tens of thousands of guest users per day. The problem is that it takes a very, very, very long time to delete any user in a Commerce database, because for whatever (really stupid) reason, IBM chose to make just about every table have a foreign key to the member or users table. So if you delete one row from the users or member table, it’s really deleting from or looking for dependencies in 300 or more tables.
Potential for impact
The users and member tables are frequently two of the top 10 Commerce tables in a database in terms of size. The real impact here is that it is possible to create guest users faster than you can actually delete them. That’s right, even if you run DBClean 24/7, sometimes you still cannot delete them as fast as you add them. This is one area where it is critical to talk to your developers and get them to properly reduce the creation of guest users. On three different clients, I’ve had to take the database offline for up to 6 hours and perform a much riskier operation to get rid of the extra guest users. Even once you’ve done that, you may have to spend hours every night just deleting guest users – I had one client where it was 4-6 hours every night just to delete the guest users.
Since the users and members tables are so heavily used, anything you can do to make them smaller is a good thing for database performance.
Where this Applies
This applies to nearly every Commerce database. I believe I once saw a B2B Commerce database without guest users, but the dozens(hundreds?) of others I’ve worked with all had guest users and needed them to be pruned.
Table (other tables may be involved in cascading deletes)
MEMBER (cascades to USERS and hundreds of other tables)
note: always test for yourself, don’t take my word for it
note also that you should be cautious running the SQL during peak loads, and always use “with ur”
Number of rows to be deleted:
select count(*) from member where member_id in ( select users_id from users T1 where registertype='G' and (days(CURRENT TIMESTAMP) - days(prevlastsession)) >= <days_to_retain> And not Exists (select 1 from orders where orders.member_id=t1.users_id and status != 'Q') and (users_id > 0)) with ur
If your Commerce database does not take orders (rare, but I’ve seen it), then you can vastly improve performance by tweaking the sql in WSCOMUSR.CLEANCONF in a custom dbclean job to eliminate the subquery for orders.
You can use dbclean to prune this one. Object=user and Type=guest
If you wanted to prune it manually, you could, keeping in mind all the gotchas for big deletes – you’d probably want to break it up. You don’t want to run it as one statement, but it would be:
delete from member where member_id in ( select users_id from users T1 where registertype='G' and (days(CURRENT TIMESTAMP) - days(prevlastsession)) >= ? And not Exists (select 1 from orders where orders.member_id=t1.users_id and status != 'Q') and (users_id > 0))
This statement changes from version to version of Commerce, so be sure to look in the CLEANCONF table for the best statement.
Guest users are obsolete pretty quickly. The actual time depends on how long before the cookies expire. Once a user’s cookie has expired, he cannot return to the guest user, and the guest user cannot be otherwise accessed by Commerce. Some sites actually allow guest users to complete orders – this pruning area will not delete users with orders. Orders in a commerce database include carts, so you may want to prune abandoned shopping carts before pruning guest users
What happens if you get behind?
By this, I mean what if you can’t delete the guest users as fast as they are created or you get to a point where you could run dbclean for days and still not delete all the users you need to. Sometimes you’re even deleting more than 50% of the tables. Guest users is the most likely area for this to happen. If you get to that point, then you have to essentially stop Commerce, export the data from the Users and Member tables that you want to keep and then LOAD that data back into the tables using the replace keyword. LOAD is required because IMPORT won’t let you do replace with the RI. After the LOAD, you have to set integrity (on approximately 300 tables – better have it scripted), preferably using exception tables, and then re-start Commerce. NEVER do this without testing thoroughly – prefferably with a full copy of the database on a development environment. This will give you timing information, but also help you learn important lessons (pay special attention to every row kicked out in the exception tables by the set integrity statements). I’ve done it three times, and had to tweak it each time as to what statements I was using because there were different pieces of data that got kicked out during the set integrity. Remember never to delete users with negative member_ids or users_ids as a starting place.
This method is risky. You’re fairly likely to accidentally delete something you needed – I’ve never had it just right on the first try, and I believe in one case went through 5 iterations before I had it right.
So there are some users that don’t have a value in prevlastsession or lastsession. Generally, you would want to give them a date there, BUT if you give them all the same date, then they’ll all come up with the same deletion date too, and you could end up with a long running dbclean that day. This is rarer with Guest Users than registered users.
I’m not sure this next thing was with Guest or Registered users, so I’ll mention it in both places. Sometimes users were created by a call center, and have never technically logged in, but should still not be deleted (they may have orders, addresses, etc). In this case, you do not want to assign them a prevlastsession or delete them.
So, good information? More detail needed in any areas? Let me know what you think.