Humans and computers prefer text formatted in different ways. Human readers want familiar terms with capital letters and spaces in the right places. Code, however, is easier to write when names contain no spaces. This post therefore covers how to rename the columns and data in your DataFrames. With this, you can make your text
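As a rough illustration of the renaming idea in this teaser, here is a minimal sketch using pandas' `DataFrame.rename`. The data and column names are made up for the example and are not taken from the post.

```python
import pandas as pd

# Hypothetical data; the column names are illustrative only.
df = pd.DataFrame({"Customer Name": ["Ann", "Bob"], "Order Total": [10.5, 20.0]})

# Code-friendly names: lowercase, with underscores instead of spaces.
df_code = df.rename(columns=lambda name: name.strip().lower().replace(" ", "_"))

# Reader-friendly names for a report: map specific columns back explicitly.
df_report = df_code.rename(
    columns={"customer_name": "Customer Name", "order_total": "Order Total"}
)

print(list(df_code.columns))
print(list(df_report.columns))
```

Passing a function to `columns=` applies the same rule to every column, while passing a dictionary renames only the columns you list.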
Explore data in your DataFrame
Your Jupyter notebook has read the data from your files and/or SQL queries. You now have one DataFrame for each of what would have been your data sheets in Excel. At this point, it’s time to inspect and explore your data. Inspect what the first few rows of data look like. Firstly, to
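One common way to look at the first few rows is pandas' `DataFrame.head`. The sketch below assumes a small made-up DataFrame; the post itself may use different data or steps.

```python
import pandas as pd

# Hypothetical data standing in for a sheet you have already loaded.
df = pd.DataFrame(
    {
        "region": ["North", "South", "East", "West", "North", "South"],
        "sales": [120, 95, 130, 80, 110, 105],
    }
)

# head(n) returns the first n rows (5 if n is omitted).
first_rows = df.head(3)
print(first_rows)

# shape gives (number of rows, number of columns) for a quick size check.
print(df.shape)
```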
Run Jupyter notebooks in minimal time
You need to re-run a report that a colleague created in a notebook. Or perhaps you want to pass a report you’ve made to a colleague to refresh. In either situation, these shortcuts help you run Jupyter notebooks in minimal time. Run entire Jupyter notebooks or cells within notebooks. For instances when you
Read and write files with Jupyter Notebooks
Now that you’ve set up your Jupyter notebook, you can start getting data into it. To that end, this post discusses how to read files into your Jupyter notebooks and write files out of them. It also tells you about the Python libraries you need for analyzing data. First things first: Essential Python libraries. Your Jupyter
Jupyter notebook setup basics
When you set up Python on your computer for the first time, you also need software to write and run code. At my former employer, I used Anaconda, an open-source toolkit for Python which includes Jupyter notebooks. These allow you to write small blocks of code and run them immediately to check your work along
On feature development in mask making…
Since the pandemic began, I’ve made 889 masks. Of those, 750 went to 9 institutions in 6 states, with the rest going to friends, neighbours, and family. Initially, the process was very hectic because of the severe mask shortage in healthcare. But thankfully, things eased by June. With donations out of the way, I could consider
Pandas: Do work quickly. And learn programming!
I love pandas! This kind, of course… but also the Python Data Analysis Library, an open-source software library. Even if, like me, you’re not trained as a programmer, it can become a valuable time-saving tool. And here’s why. The data looks like Excel. So, it’s familiar. Pandas uses two basic formats (“objects”, in programming-speak)
Think Stats runner problem (Chap 3): My solution
Lately, I’ve been working through Think Stats by Professor Allen Downey. This book provides a novel way to learn statistics. Instead of formulas and theory, it teaches the same concepts with hands-on coding exercises. Because I am a business analyst, I identify with the practical approach of Think Stats. And today, I’d like to share
Take ownership of your data: The end-goal
We’ve come to the end of the Survival SQL series. Hopefully, you think the posts are relevant and easy to understand. Above all, I hope that your newfound knowledge has empowered you to take greater ownership of the data in your company. What does ownership mean? Your company probably has dedicated teams of data engineers
Subqueries and CTEs: Multi-step problems
As you progress, you’ll find situations where you need to combine more than one data pull to answer your question. In these situations, you’ll use subqueries, then join or filter their results. In fact, you have already seen some subqueries in the examples from the last two posts. This post aims to provide an
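To make the multi-step idea concrete, here is a minimal sketch of a CTE that first totals orders per customer and then filters those totals. It runs the SQL through Python's built-in `sqlite3` module so it is self-contained; the `orders` table, its columns, and its values are hypothetical and not from the series.

```python
import sqlite3

# In-memory database with a hypothetical orders table.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (customer TEXT, amount REAL);
    INSERT INTO orders VALUES ('Ann', 50), ('Ann', 70), ('Bob', 20), ('Cy', 200);
    """
)

# Step 1 (the CTE): total spend per customer.
# Step 2 (the outer query): keep only customers whose total exceeds 100.
query = """
WITH customer_totals AS (
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
)
SELECT customer, total
FROM customer_totals
WHERE total > 100
ORDER BY customer;
"""

rows = conn.execute(query).fetchall()
print(rows)
conn.close()
```

The same result could be written with a subquery in the `FROM` clause; naming the intermediate step with `WITH` tends to be easier to read once there are several steps.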