Datapipe Weekly #22
Hello friend! This is a newsletter for builders.
What do you like to build?
I hope the ideas in this week’s newsletter can help you get it done.
In this week’s newsletter
💻 Tip: AWS Redshift Data Quality Assertions
📜 Quote of the week
Tip: AWS Redshift Data Quality Assertions
A regular part of my data pipelines is to drop a “lookback window” of data (say, the last 5 days) and backfill it with new data. This may seem redundant, but it is a simple way to guard against gaps that appear when jobs fail to run, and it helps with other quirks of digital marketing data.
But this preventative measure can sometimes harm data quality. What if the current job fails to pull API data? Then it will “update” the database over the lookback window with nothing, in effect creating a gap.
To address this, I can add assertions to the pipeline after pulling from the API, and before dropping the lookback window. Such an assertion can be defined in AWS Redshift as a UDF with the following code:
create or replace function assert_nonzero(a int) returns bool stable as $$
    if a == 0:
        raise Exception('assert_nonzero failed')
    else:
        return True
$$ language plpythonu;
And it can be used like this:
>>> select assert_nonzero(19);
true
>>> select assert_nonzero(0);
Exception: assert_nonzero failed. Please look at svl_udf_log for more information
In practice, I use this along with an inner select, such as this example:
select assert_nonzero((select count(*) from table_name)::int);
… and keep that data looking good!
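To show where the assertion sits in the overall job, here is a minimal sketch of the guard-then-replace order in plain Python. It is not my production code: sqlite3 stands in for Redshift so the example runs anywhere, and the table daily_metrics, its columns, and the function refresh_lookback are hypothetical names made up for illustration.

```python
import sqlite3
from datetime import date, timedelta


def assert_nonzero(count):
    """Python mirror of the UDF above: raise if the count is zero."""
    if count == 0:
        raise Exception("assert_nonzero failed")
    return True


def refresh_lookback(conn, rows, today, lookback_days=5):
    """Replace the lookback window with `rows`, but only if `rows` is non-empty.

    `rows` is assumed to be a list of (day, clicks) tuples pulled from the API.
    """
    # Guard first: an empty API pull must never reach the DELETE below.
    assert_nonzero(len(rows))

    cutoff = (today - timedelta(days=lookback_days)).isoformat()
    with conn:  # commits on success, rolls back if anything raises
        conn.execute("DELETE FROM daily_metrics WHERE day >= ?", (cutoff,))
        conn.executemany(
            "INSERT INTO daily_metrics (day, clicks) VALUES (?, ?)", rows
        )


def demo():
    """Run one good pull and one failed (empty) pull against a toy table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE daily_metrics (day TEXT, clicks INTEGER)")
    conn.execute("INSERT INTO daily_metrics VALUES ('2024-01-01', 5)")
    conn.execute("INSERT INTO daily_metrics VALUES ('2024-01-08', 7)")
    conn.commit()
    today = date(2024, 1, 10)

    # Good pull: the window is dropped and backfilled.
    refresh_lookback(conn, [("2024-01-08", 9), ("2024-01-09", 11)], today)

    # Failed pull: the assertion raises before anything is deleted.
    try:
        refresh_lookback(conn, [], today)
    except Exception as exc:
        print(exc)

    return conn.execute(
        "SELECT day, clicks FROM daily_metrics ORDER BY day"
    ).fetchall()


if __name__ == "__main__":
    print(demo())
```

The important part is the ordering: the check runs before the delete, so an empty pull stops the job and leaves the existing lookback window untouched.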
Quote of the week
“Someday soon, you'll be nostalgic for this. Someday soon, you'll have new problems and new pleasures.”
- @nameandnoun
I read this on Twitter a few months ago, and it floated back into my mind today.
Standing in my kitchen this morning, looking into my office with sun pouring in through the window. Feeling well rested and healthier than yesterday - or last week.
Lately I’ve been itching for a change. This change will bring problems, brand new ones I’ve never encountered before. It’s my intention, though, to lay the groundwork for new pleasures. Things I have only ever dreamed of.
Will the change be worth it? I’m not sure. But it will surely leave me feeling nostalgic for today. For this.
-Alex
Thank you for reading Datapipe 👋