Rollbacks in the Real World

Audience:
Topic:

As someone who have been working for slightly over a year now, there are some things I wish I learned earlier on. Space and time performance of code has been important but I hardly knew about that in the workplace where databases are complicated, code isn't as clean as you like it to be and architecture riles people up more so as it becomes philosophical debates about what should be used for what cases.

With all those noise and debates aside, my tech lead was kind enough to point to me all these different ways of people logging their changes when they wrote long scripts that made huge changes to our databases. In June, I was tasked with an important project to update the pricing of our product. Long story short, I was required to write a script that changed pricing depending on 7/8 variables. You can only imagine it's like selling camp cups, the price depends on the size, the color, the material used and where you are selling it, the list goes on about price discrimination). It sounded like a simple task that I thought I could handle it quickly but as I made the changes. I learnt things the hard way:

1. DO a dry run because if you don't, when you mess up your local database, it may take you a VERY LONG TIME to reseed and build it. 

2. ALWAYS log your script to give you a sense of how and when the changes are being made so that you can eyeball/keep track of it as the changes go on. I didn't log it correctly, in fact, I was only displaying my output on my own console so I didn't have proof of how my previous run went.

3. The logs will guide you on your rollback because an undo isn't as simple as ctrl/commandZ, especially when so many people are doing things to the same database (multiversion concurrency control is not the focus of this talk). Log it where you can share it, compare data if you need to.

In this talk, I will be showing a live rollback of a simple database (<5 columns, 1 row) vs a complex database (about 10 colums, ~10,000 rows). This is to show why logging is important during rollbacks especially for big databases. I will also talk about how long we should keep those logs for and how do performances compare between MySQL and PostgreSQL along the way. 

 

<h2>How Simple Rollbacks become Complicated</h2>

We have all been there when we made a mistake and wish to undo it like we can in our text editors. But, when you are making changes to a database, say one that requires you to make that small change but on a big scale, a small mistake may be amplified. 

When you write a script to make changes to a database with a thousand rows of changes, and near the end of that script, someone watching your logs points out, 

 

"Hey, you're off by a decimal point."

The script completes.

You panic.

Because you are now paying someone more by a hundred dollars. 

And that "someone" is actually a few thousand people because you have written an effective script to not do manual work of paying one by one so you are now essentially off by thousands too many.

 

<h2>Don't Panic, There's a Rollback Coming</h2>

The simple way we can go about it is to halt the thing we are doing and do a rollback, but how do we really know that the change is being made? Well, because you know the exact error made, you could just use `mysql/postgreSQL> ROLLBACK;`

 

But to confidently handle the errors, we really want to have seen some logs while running to keep track of the progress, what is being changed and perhaps, when/who is making the changes.

 

<h2>Undo Fearlessly</h2>

What we want instead is not a blind undo, we, ideally, should have seen something like this while the script is running: 

"You paid Jane $11.00 previously, and are now paying her $110.0".

 

 So that when we do a rollback, we can now also see,

"You paid Jane $110.0 previously, and are now paying her $11.00".

 

We can make this even better, we can change "You" to log your user name and id so that we have a record of people who are touching this very important script. We can also indicate when we are making the change like so,

"You (user id: 1), paid Jane (vendor id: 1)  $11.00 on 12:28pm EST, 11/2/2018, and will be paying her $110.00 on <time.zone.today>."

 

The user log will provide other people on your team evidence of who touched what and when because this is an important log to keep people accountable especially when payments are being made. The amount on the log will also give you an indication like a receipt to check your database and make sure those numbers are committed. Lastly, the timestamp will give you a sense on when the changes are made and how to intervene if large payments are being made while you wish to abort.

 

<h2>How much do we need to remember?</h2>

With cases like payments that are highly sensitive as an actual transaction is being performed, money is being exchanged, the logs are also highly important to back up those payments in case of disputes. Under the Bank Security Act, banks must keep a detailed history of each checking and savings account for at least five years after the information is obtained. Personal data under GDPR is another issue to watch out for. 

Room:
Room 105
Time:
Thursday, March 5, 2020 - 17:15 to 18:00