Dude, Where's My Data?
This isn't a question I like to hear after a night of successful job runs, yet here we are. I can see the panic in the faces of my sweet colleagues fielding calls from the business: folks are insisting that dashboards and scheduled reports have stale data. They want answers. They also want pitchforks.
It turns out that when your analytics jobs run on data that is two replication processes removed from production, there is plenty of opportunity for people and systems to fall out of sync. So what happened here?
When everything works as designed, the production database maintained by the vendor is backed up and restored to a reporting instance. When the restore completes, a record is created. An in-house job looks for the record periodically, and once it's found, the steps to update the data warehouse begin. But what happens when the trigger record is never created?
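That handoff can be sketched roughly like this. The table name `restore_log`, its columns, and the polling cadence are invented stand-ins for whatever the real restore process writes; an in-memory SQLite database fills in for the reporting instance.

```python
import sqlite3
import time

def wait_for_restore_record(conn, restore_date, attempts=3, delay_seconds=60):
    """Poll the trigger table until a completion record for the given
    date appears, or give up after `attempts` tries."""
    for _ in range(attempts):
        row = conn.execute(
            "SELECT 1 FROM restore_log"
            " WHERE restore_date = ? AND status = 'complete'",
            (restore_date,),
        ).fetchone()
        if row:
            return True  # restore finished; warehouse update can begin
        time.sleep(delay_seconds)
    return False  # the trigger record never showed up

# Demo against an in-memory SQLite stand-in for the reporting instance.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE restore_log (restore_date TEXT, status TEXT)")

# No record yet: the poller gives up and the warehouse update never starts.
print(wait_for_restore_record(conn, "2024-01-15", delay_seconds=0))  # False

# The restore writes its record; the next poll finds it.
conn.execute("INSERT INTO restore_log VALUES ('2024-01-15', 'complete')")
print(wait_for_restore_record(conn, "2024-01-15", delay_seconds=0))  # True
```

Notice that the `False` branch is the whole story here: nothing downstream ever learns that the poller quietly gave up.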
You Look Like That Flashcard They Told Me Means Sadness
The data warehouse never updates, but the analytics jobs go about their business as usual, churning through stale warehouse data. In the eyes of the business, the jobs have successfully run nothing, and the business finds the issue before the data people do.
My strategy was to immediately implement a way to catch this issue sooner, and to approach long-term process improvements over time. I settled on a critical data point to compare between the data warehouse and the reporting database. If the two disagree on that data point, an error is thrown as the first step of the analytics jobs, and the job emails the analytics team that the data warehouse is not up to date. The riots can start much earlier in the morning, with more information at hand.
While this solution is a short-term band-aid that only addresses a particular replication issue, it has already proven useful. And for that, I can spare a few lines of code.
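A minimal sketch of that first-step check, with invented names throughout: two in-memory SQLite databases stand in for the warehouse and the reporting instance, and the latest date in a hypothetical `transactions` table plays the role of the critical data point. The real job would query the actual databases and send the notification email where the comment indicates.

```python
import sqlite3

def check_warehouse_freshness(warehouse, reporting):
    """Compare a critical data point (here: the latest transaction date)
    between the warehouse and the reporting database. Raise if they differ."""
    wh_val = warehouse.execute("SELECT MAX(txn_date) FROM transactions").fetchone()[0]
    rp_val = reporting.execute("SELECT MAX(txn_date) FROM transactions").fetchone()[0]
    if wh_val != rp_val:
        raise RuntimeError(
            f"Warehouse is stale: warehouse={wh_val}, reporting={rp_val}"
        )

# Stand-ins: the warehouse is one day behind the reporting instance.
warehouse = sqlite3.connect(":memory:")
reporting = sqlite3.connect(":memory:")
for db, latest in ((warehouse, "2024-01-14"), (reporting, "2024-01-15")):
    db.execute("CREATE TABLE transactions (txn_date TEXT)")
    db.execute("INSERT INTO transactions VALUES (?)", (latest,))

try:
    check_warehouse_freshness(warehouse, reporting)
except RuntimeError as exc:
    print(exc)  # here is where the job would email the analytics team
```

Running this check before the analytics jobs do any real work turns a silent stale-data morning into a loud, timestamped failure.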