Merging with Time Travel
Sorry science fiction fans this post isn’t about time travel or the new fascination with surviving while traveling through a black hole.
It’s about something much scarier and harder to deal with. End users who are super demanding.
You know the ones I’m talking about. The ones that no matter how often you refresh the data in your Qlik Sense application, they still complain. Because they want the power to press a magic button and get the latest and greatest data in the database right then. You know, so that they have complete control.
There are times you can’t really blame them but you are in a bit of a fix. You would like to use Dynamic Views because you loved my post but they need the ability to interact with all of the data or ask for Insights. You quickly pivot as you know how to do Partial Reloads in Qlik Sense and you know how to do incremental loads but then reality hits you in the face. You read the comments in the load script and they tell you that the reason the scheduled reloads are hourly is because the tables in this system don’t have any Last_Modified_Timestamp type fields that would allow you to load just a handful of records.
It would be out of this world, if you could just travel through time and collect the records that have changed since the application was last reloaded on it’s schedule, and then merge them with all of the existing data. Wouldn’t it?
As it turns out my friends. You can travel through time if you are using Snowflake. Well sort of. You see they have a cool feature set built into the product for something called Time Travel. One form of that maintains Changes that are made to the data. It adds metadata columns behind the scenes to your tables that allow you to modify your typical SQL Selection and say “give me only the changes that have occurred since X time”. I’m not kidding, check out my query and the results.
The link for Time Travel Changes above has all of the details so I will focus on the highlights. I’ve wrapped a normal query with a little bit of syntax that says “get me only the changes that have occurred since ‘2021-06-29 18:14:18.779 -0400′” One tricky part is the last line that says “Hey Snowflake do me a favor, I know that you store updates as deletes and inserts, and frankly I don’t need those because I just care about the new data.” Oh Snowflake will still return any actual deletes that occur, but only the ones that aren’t part of an update or upsert if you will. You gotta like that.
PS – Snowflake’s Time Travel includes other fantastic features as well as this change stuff I’m showing you here. If you really want to have your mind blown check them out.
I hear you out there. “That’s cool and all but how does that help me. It’s not like Qlik Sense will just let me merge that result set from Snowflake with my current gigantic tables or anything.”
You see while you have been having so much fun Asking the Insight Advisor questions, setting up Alerting and adding Collaboration to your applications Qlik released a new method of incremental loading that literally does exactly that. The new method is aptly named Merge. As you can see below I literally take the command from above and just tell Qlik Sense to Merge it with the Accounts table and that the Hospital_Account_ID field is the Primary Key value it should use.
Your DBA left you hanging and you had no way to know which records could be loaded incrementally, but Qlik and Snowflake had your back.
We’ve got your back
But it gets better when you think about this from a performance standpoint. No more of that complicated incremental load script code to handle deletes. They are handed to you on a silver platter, quickly, and the Merge handles them for you. Which means you can invoke it as often as you would like. Like those situations where pesky users want to press a button.
In this screenshot I’ve taken the Tiny Reload button and added it to the screen so that end users can press it when they want to know the exact up to the second account balances (and any other account details.) If you aren’t familiar with how to add extensions this one, or are curious why I chose it please check out my video called Qlik SaaS Enablement – Extensions. In it I demonstrate how to add it and demonstrate why.
While slightly overly dramatic, you will see that in my details table I have a column that shows me when the data was last refreshed, and any rows that were loaded by the end user are displayed in green. (Be sure to let me know if you think that’s a bit cheesy or a neat feature your end users might like.)
Please refer to the documentation for Partial Reload for all of the information you might need, but I wanted to at least show you just how easy it is.
As a developer you need to modify your code slightly and call the system function IsPartialReload to see if the loading is for a full reload or a partial reload. If it is False then you do your normal table loading like always. At the onset of my “full reload” I run a simple query against Snowflake to ask for it’s current timestamp so I know when (in it’s time system) I know when I’m starting my data load.
After I finish all of the table loading script there is naturally an Else where this magic happens. That incredibly complex, I mean super simple, Merge statement that does it’s Snowflake Time Travel. Lastly I swap out the variables and reset my LastTimestamp table.
When I got started on this journey through time with Snowflake I read in their help that Time Travel is enabled for all Databases. “WooHoo! I can just run my code.” To which I was promptly greeted with an error message saying that time travel wasn’t enabled. That seems like a contradiction. But then it hit me. There is storage required to handle those metadata columns I mentioned, and perhaps I need to tell them specifically that I need to do want these changes for particular tables. Sure enough I needed to execute the following statement and then I was cooking with gas.
Alter Table “GENERALHOSPITALDB”.”dbo”.”Accounts” set change_tracking = true
As you are fully aware at this point in your career, there is no such a thing as “one size fits all”. There are use cases where this makes sense to allow end users to do partial reloads so they are in control, and there are use cases where this doesn’t make sense. You know tables with tons of data that are simply used for historical analysis; Trends over time.
It’s comforting to know that for nearly 30 years Qlik has continued to adapt it’s solutions to provide developers with ways to handle disparate data from anywhere and ensure you can respond to those demanding use cases. Even cases like this where you are on the leading, with data that does time travel with Snowflake? Yeah they’ve got you covered. In an easy to implement way that will put a smile on your end users faces.