If you use Excel heavily, you will have used their functions at some point. In short, a function is a “black box” of code with a useful purpose, that you need over and over again. Subsequently, anyone can call it and get a consistent result. In this post, you will learn how to write your own Python functions.

### Syntax to write your own Python functions

As a start, we’ll break down the function from the previous post on renaming columns into its structure below. Additionally, you can get step-by-step explanations from Python For Everybody here.

### When will you need to write functions to crunch business data?

Just like in Excel, you can do most arithmetic calculations among columns of a DataFrame with operators. In this case, you need to thank NumPy because its underpinnings allow you to match the values in the columns to calculate, and applies the calculation to all rows in the column. So, you’ll need to write functions only if you need multiple steps of “if-then-else” logic.

### Example of using operators: Calculating profit margin

Let’s return to the superstore data set. As a start, we had converted all the column names to lower case with underscores. Next, we want to calculate the profit margin, which is profit / sales.

See what happened with the code in the red rectangle? You can create a new column and give it any name (just don’t repeat a name already in that DataFrame unless you want to replace the column). Next, you can use a mathematical expression involving other columns to tell Python how to calculate that new column. As a result, Python creates a new column (in this example, named “profit_margin”) and its values would be the result of performing the math on each row. In other words, the profit margin value in each row here is the value in the “profit” column of that row, divided by the value in the “sales” column of the same row. So far, it works just like Excel!

These are the common Python operators: +, , *, /, ** (power), % (modulo i.e. the remainder of dividing one number by another, and // (floor i.e. the whole number quotient dividing one number by another).

### Python built-in functions and libraries

In addition to the above operators, Python also has some built-in functions. These are functions that were written as part of the Python language. Many of them are more useful for developers, but the ones that may interest you immediately as an Excel user are abs(), max() and min(), which work similarly to how they would in Excel.

Further to the standard library (part of the Python language), we discussed libraries when we talked about importing numpy and pandas. Most other functions that you use in Excel, such as std(), median() etc, can be found in Numpy. Links to the official documentation are at the bottom of this post.

### Solving an Excel pain point: if-then-else logic

While we’re on the topic of Excel – have you ever used the “IF” function in a spreadsheet? If you’ve done many of these, perhaps you might agree with my pain points:

• Everything is crammed tightly into one cell, so it’s hard for a third party to read.
• Furthermore, the logic is based on cell references, so a third party has to keep looking back and forth between cells to decipher the reasoning.

Therefore, using Python to apply if-then-else logic to business data helps not only your colleagues, but also the future you. Since your code clearly shows your columns and your logic, it will be easier to follow your thought process.

For instance, let’s suppose each order was supposed to ship within 2 days. So, we want to create a new column that says “late” if the shipping date minus the order date was more than 2 days, and “on time” otherwise. To achieve this, we can write a function with an “if” statement, such as this:

##### Syntax for “if” statement:

The Python For Everybody tutorial here provides a great foundation for understanding “if” statements. If you’ve watched this before but need a quick memory jog of the key points, remember these:

• Python keywords: use if for the first condition, elif for subsequent conditions, and else for the last one.
• You need to follow every condition and else with a colon.
• Indent all the code associated with each condition. This tells Python exactly which code to run when a particular condition is satisfied, and when to stop.
• Since you are writing a function here, your “if” statement has a single indent. Thus, the code stating the action to take for each condition will have a double-indent (see picture above). Jupyter will manage your indents for you, as long as you place your colons properly.
• Remember that Python goes through the conditions from top to bottom, so the order matters.
• Conditional operators for your “if” statements include == (equals), != (not equals), >, <, >= and <=.
##### Syntax for .apply: dataframe_name= dataframe_name.apply(function_name, axis = 1)

Unlike the code in the picture above, you want to save the DataFrame that you’ve applied your function to. So, you have to declare your DataFrame variable again (using =). You can use the same name, in which case you will overwrite your original DataFrame, or use a different name if you want to preserve both versions.

### Technical resources

I believe that the above material is already quite technical, but if you want to be more exhaustive, you can refer to the links below.

This site uses Akismet to reduce spam. Learn how your comment data is processed.