When processing data that is time sensitive, always process it at one unit of time smaller than is required.
If your report is required quarterly, run it monthly. Due at the end of the month? Run it weekly. Weekly gets run daily; and finally, I run a lot of daily reports every 4 hours.
This rule comes from doing a lot of report generation and time-sensitive data aggregation.
Observant readers may have noticed something about the initial examples of time: the interval between runs is approaching 0. That is the correct solution: continuous running. Rather than checking to see if something has arrived, send notifications from the data source and process until complete.
You are responsible for a payroll system that uploads data to the bank every week. Being a responsible individual, and aware of what makes a system robust, you decide to run the payroll data aggregation process every night. This means that every night, you construct a file that can be sent up to the bank.
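As a rough illustration of "send notifications and process until complete", here is a minimal sketch. It assumes the data source can push (key, payload) tuples onto an in-process queue; the function name `process_until_complete` and the day keys are hypothetical, and a real system would more likely sit behind a message broker or a webhook.

```python
import queue


def process_until_complete(events, handle, expected):
    """Handle notifications from the data source until every expected key has arrived.

    events   -- a queue.Queue of (key, payload) tuples pushed by the source (assumed)
    handle   -- callable invoked once per notification
    expected -- set of keys that must all arrive before the run counts as complete
    """
    seen = set()
    while not expected <= seen:
        key, payload = events.get()  # blocks until the source notifies us
        handle(key, payload)
        seen.add(key)
    return seen


# Usage: the source pushes one notification per day of timesheet data.
events = queue.Queue()
for day in ("Sun", "Mon", "Tue"):
    events.put((day, f"timesheets for {day}"))

processed = []
process_until_complete(events, lambda day, data: processed.append(day), {"Sun", "Mon", "Tue"})
print(processed)  # ['Sun', 'Mon', 'Tue']
```

Nothing here polls on a schedule: the work happens when the data shows up, and the run is done when everything expected has been seen.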
On Sunday your file processes with:
       S     M     T     W     R     F     S
    S: 0%    0%    0%    0%    0%    0%    0%
But on Monday:
       S     M     T     W     R     F     S
    M: 100%  0%    0%    0%    0%    0%    0%
Sunday got filled with data. The rest of the week is still empty, but that’s because nobody has worked those days yet.
       S     M     T     W     R     F     S
    M: 100%  0%    0%    0%    0%    0%    0%
    T: 100%  100%  0%    0%    0%    0%    0%
    W: 100%  100%  100%  0%    0%    0%    0%
    R: 100%  100%  100%  100%  0%    0%    0%
    F: 100%  100%  100%  100%  100%  0%    0%
    S: 100%  100%  100%  100%  100%  100%  0%
    S: 100%  100%  100%  100%  100%  100%  100%
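For a sense of what each nightly run is computing, here is a small sketch of the per-day completeness figures in the rows above. The input shape is an assumption for illustration only: timesheets arrive as (employee, day index) pairs with Sunday as day 0, and `daily_completeness` is a hypothetical name, not part of any real payroll system.

```python
from collections import defaultdict


def daily_completeness(timesheets, employees, days=7):
    """Percent of employees with a timesheet entry for each day of the pay week.

    timesheets -- iterable of (employee, day_index) pairs, Sunday = 0 (assumed shape)
    employees  -- set of everyone who is expected to submit
    """
    submitted = defaultdict(set)
    for employee, day in timesheets:
        submitted[day].add(employee)
    return [
        round(100 * len(submitted[day] & employees) / len(employees))
        for day in range(days)
    ]


# Monday night's run: only Sunday has been worked so far.
staff = {"ann", "bob"}
print(daily_completeness([("ann", 0), ("bob", 0)], staff))
# [100, 0, 0, 0, 0, 0, 0]
```

Running this every night costs very little, and each run leaves behind a file that is as complete as the week allows.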
TIP: It is at this point your manager will be pissed off for wasting resources. Persevere.
One weekend (Sunday, 4am) you get a call: the payroll system has completely tanked. A fire broke out in the server room!
Your responsibility is getting employees paid, so your first thought is that it is a pay week, and people are expecting to get paid TONIGHT. You quickly remote into your desktop in the office and start an assessment:
- The final pay run cannot reach the bank
- It failed because the network failed
- The server that stores the timesheet data is toast
- Your server that sends the data seems to be OK
Your data file from last night’s run looks something like this:
       S     M     T     W     R     F     S
    S: ad%   XgL%  100%  bWV%  9V%   Ft%   12%
    ^&**MCIIUEsDBAoAAAAAAPO1jkwAAAAAAAAAAAAAA wAc2FtcGxlYXNzaWdubWVudC9VVAkAAwmF0lqj2tNadXgLAAEE6AMAAAToAwAAUEsDBAoAAAAAAPO1j kwAAAAAAAAAAAAAAAAaABwAc2FtcGxlYXNzaWdubWVudC9zdHVkZW50NS9VVAkAAwmF0lqj2tNadXgL AAEE6AMAAAToAwAAUEsDBAoACAAAAPO1jkwAAAAAYEUIAGBFCAAoABwAc2FtcGxlYXNzaWdubWVudC9 zdHVkZW50NS9hc3NpZ25tZW50LnppcFVUCQADCYXSWpjh01p1eAsAAQToAwAABOgDAABQSwMECgAAAA AAQaqITAAAAAAAAAAAAAAAAAsAHA
That’s not good.
But as it happens, because you run nightly, you usually have two files in existence: the current run and the previous run. The previous run does not get cleaned up until the next job completes.
TIP: It is at this point that your manager will start thinking, “We should never delete any delivery files.” It is going to take some coaching to make them realize that storing that much data wastes resources and offers no extra benefit. Be gentle, they are confused and scared.
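One way to get that "current plus previous" behaviour is to rotate files only after a new run has finished writing. The sketch below is an assumption about how such a job could be built, not a description of any particular system; `publish_run` and the file names `payroll.current` and `payroll.previous` are made up for illustration.

```python
import os
import tempfile
from pathlib import Path


def publish_run(build, out_dir):
    """Write a new run, keeping the prior run until the new one is complete.

    build   -- callable that writes the delivery file to an open text file object
    out_dir -- directory holding payroll.current and payroll.previous (hypothetical names)
    """
    out_dir = Path(out_dir)
    current = out_dir / "payroll.current"
    previous = out_dir / "payroll.previous"

    # Build into a temporary file so a crash mid-run never touches existing files.
    fd, tmp_name = tempfile.mkstemp(dir=out_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            build(tmp)
    except BaseException:
        os.unlink(tmp_name)
        raise

    # Only now, with the new run finished, is the old "previous" overwritten.
    if current.exists():
        os.replace(current, previous)
    os.replace(tmp_name, current)
    return current, previous
```

Note what this does not protect against: a run that finishes "successfully" but writes garbage still becomes the current file. That is exactly why the previous file is worth keeping around for one more night.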
Inspecting the previous night’s run looks like this:
       S     M     T     W     R     F     S
       100%  100%  100%  100%  100%  100%  0%
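Picking the file to send can be automated along the same lines as the inspection above: walk the runs from newest to oldest and take the first one that parses cleanly. This is only a sketch, and it assumes a made-up format in which each run boils down to one row of seven percentages; `parse_row` and `newest_usable` are hypothetical helpers.

```python
import re

PERCENT = re.compile(r"(100|[1-9]?\d)%")  # 0% through 100%, nothing else


def parse_row(line):
    """Return the seven day percentages in a row, or None if the row is malformed."""
    fields = line.split()
    if len(fields) != 7:
        return None
    values = []
    for field in fields:
        if not PERCENT.fullmatch(field):
            return None
        values.append(int(field[:-1]))
    return values


def newest_usable(rows):
    """Given rows ordered newest first, return the first one that parses cleanly."""
    for row in rows:
        parsed = parse_row(row)
        if parsed is not None:
            return parsed
    return None


last_night = "ad% XgL% 100% bWV% 9V% Ft% 12%"
night_before = "100% 100% 100% 100% 100% 100% 0%"
print(newest_usable([last_night, night_before]))
# [100, 100, 100, 100, 100, 100, 0]
```

A check this simple would not catch every kind of corruption, but it is enough to tell last night's garbage from the night before's clean six days.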
Let’s summarize what our state is.
- The input data is not available
- You have clean data, except for one day
- People expect to be paid tomorrow
What do you do?
TIP: A manager is going to suggest that we hold off the pay run until all the timesheet data is available. Giving an incomplete report could have negative consequences. It is better to not pay at all, and wait until we have all the information. (Be gentle and firm; they are scared, but wrong.)
Get that data to the bank.
You will probably be phoning the bank for support, reading through manuals about how to move the data, possibly reading the processes on the sending server, but you are going to get that data to the bank.
People’s lives depend on this stuff:
- Rent and Mortgage payments are due, and evictions are pending
- Children go hungry while they wait for groceries
- Sick dependants will live in pain while prescriptions wait to be filled
- A parent is going to lose access to their children if they miss a child support payment
You just saved the day. Because you run nightly, you are able to get most people paid for most of their time. It’s a little inaccurate, but we can fix that last day later. Mostly people are going to be unhappy, but OK.
The example draws from several actual cases I have been involved in over the years, though it is worse than any one failure I’ve ever experienced.
The managerial objections are not made up. They are pretty much quotes (from memory), and they were dangerously wrong in each case. I mention this because it is worth noting that most managers’ career paths have not equipped them for long-term risk assessment. You have likely been hired for your expertise in this domain, and you have a responsibility to protect them from their lack of understanding in the domain of risk management.
In data management, one of the most important skills you can master is “Managerial Management”: the art of keeping managers from making damaging decisions. Managers, your job is to answer the phone and run interference while the analysts solve the problem.