Your Jupyter notebook has read the data from your files and/or SQL queries. You now have one DataFrame for each of what would have been your data sheets in Excel. At this point, it’s time to inspect and explore your data. Inspect what the first few rows of data look like. Firstly, to
Month: September 2020
Run Jupyter notebooks in minimal time
You need to re-run a report your colleague has created in a notebook. Or perhaps you want to pass a report you’ve made to a colleague to refresh. In either situation, here are some shortcuts for running Jupyter notebooks in minimal time. Run entire Jupyter notebooks or cells within notebooks. For instances when you
Read and write files with Jupyter Notebooks
Now that you’ve set up your Jupyter notebook, you can start getting data into it. To that end, this post discusses how to read files into, and write files out of, your Jupyter notebooks. It also tells you about the Python libraries you need for analyzing data. First things first: Essential Python libraries. Your Jupyter
Jupyter notebook setup basics
When you set up Python on your computer for the first time, you also need software to write and run code. At my former employer, I used Anaconda, an open-source toolkit for Python which includes Jupyter notebooks. These allow you to write small blocks of code and run them immediately to check your work along
On feature development in mask making…
Since the pandemic began, I’ve made 889 masks. Of those, 750 went to 9 institutions in 6 states, with the rest going to friends, neighbours, and family. Initially, the process was very hectic because of the severe shortage in healthcare. But thankfully, things eased by June. With donations out of the way, I could consider
Pandas: Do work quickly. And learn programming!
I love pandas! This kind, of course… … but also the Python Data Analysis Library, an open-source software library. Even if you’re not trained as a programmer (like me), it can become a valuable time-saving tool. And here’s why. The data looks like Excel. So, it’s familiar. Pandas uses two basic formats (“objects”, in programming-speak)
Think Stats runner problem (Chap 3): My solution
Lately, I’ve been working through Think Stats by Professor Allen Downey. This book provides a novel way to learn statistics. Instead of formulas and theory, it teaches the same concepts with hands-on coding exercises. Because I am a business analyst, I identify with the practical approach of Think Stats. And today, I’d like to share
Take ownership of your data: The end-goal
We’ve come to the end of the Survival SQL series. Hopefully, you think the posts are relevant and easy to understand. Above all, I hope that your newfound knowledge has empowered you to take greater ownership of the data in your company. What does ownership mean? Your company probably has dedicated teams of data engineers
Subqueries and CTEs: Multi-step problems
As you progress, you’ll find situations where you need to combine more than one data pull to answer your question. In these situations, you’ll use subqueries, then join or filter their results. In fact, you have already seen some subqueries in the examples from the last two posts. This post aims to provide an
Window functions: Automate lengthy Excel tasks
“Find our 5 best-selling products by country.” “What was the 7-day moving average revenue for every day last month?” Chances are, you’ve come across one of these business questions before. And previously, you’ve probably solved them either by creating a ticket for your analytics team, or by writing manual formulas in Excel. Instead of doing these