Why this topic? This is a lead-in to next week’s episode, where I will talk about calculating moving averages in Pandas. Subsequently, I will show how nulls in the data set will affect how you want to structure your code. Therefore, I need to get everyone on the same page in understanding how nulls work.
Tag: SQL
Read and write files with Jupyter Notebooks
Now that you’ve set up your Jupyter notebook, you can start getting data into it. To this purpose, this post discusses how to read and write files into and out of your Jupyter Notebooks. Furthermore, it tells you about the Python libraries you need for analyzing data. First things first: Essential Python libraries Your Jupyter
Take ownership of your data: The end-goal
We’ve come to the end of the Survival SQL series. Hopefully, you think the posts are relevant and easy to understand. Above all, I hope that your newfound knowledge has empowered you to take greater ownership of the data in your company. What does ownership mean? Your company probably has dedicated teams of data engineers
Subqueries and CTEs: Multi-step problems
As you progress, you’ll find situations where you need to combine more than one data pull to answer your question. In these situations, you’ll use subqueries and join or filter the results from these. Actually, you already have seen some subqueries in the examples from the last 2 posts. This post aims to provide an
Window functions : Automate lengthy Excel tasks
“Find our 5 best-selling products by country.””What was the 7-day moving average revenue for every day last month?” Chances are, you’ve come across one of these business questions before. And previously, you’ve probably solved them either by creating a ticket for your analytics team, or by writing manual formulas in Excel. Instead of doing these
Sort your data with “ORDER BY”
Most of you will use Excel to manipulate and plot the data you pull with SQL. Eventually, you’re likely to create charts for presentations to your management. If you sort your data with “ORDER BY” when you pull it in SQL, you’ll save time. As a result, you can skip the “Sort and Filter” step
When to use more complex joins
As you pull your data more often, you may encounter situations where you need more complex joins than what we’ve previously discussed. Here are some examples: You’re looking for something that did not happen: e.g. products without orders in the past 7 days. In this case, you need an outer join. Specifically, you would pull
Define your own metrics: Operators & conditionals
As you become confident with pulling data, you’ll start wanting more control over the metrics you report on. Hence, you’ll want to define your own metrics. Therefore, you’ll want to learn how to use operators and conditional statements. This Kaggle notebook contains all the code you need to follow along with the examples. Run cells
Aggregating metrics: Using “GROUP BY”
Earlier, we’ve discussed that your database stores data at its most granular level. This means that every order creates one row in your orders table. Likewise, every click a customer makes creates one row in your clicks table. And every review you receive creates one row in your product reviews table. When this builds up
Filtering the data you want: The “WHERE” clause
Why is SQL necessary and useful? In essence, we have huge amounts of data and SQL enables us to extract what we need for specific business insights. So, filtering with the “WHERE” clause is a crucial part of using SQL. This Kaggle notebook contains all the code for following along with the examples. Run cells