---
title: 'Intro to Functions and Conditionals'
---
```{python}
import pandas as pd
pd.options.display.max_rows = 7
```
## Intro
So far in this course you have mostly used functions written by others. In this lesson, you will learn how to write your own functions in Python.
## Learning Objectives
By the end of this lesson, you will be able to:
1. Create and use your own functions in Python.
2. Design function arguments and set default values.
3. Use conditional logic like `if`, `elif`, and `else` within functions.
## Packages
Run the following code to load the packages needed for this lesson:
```{python}
# Import packages
import pandas as pd
import numpy as np
import vega_datasets as vd
```
## Basics of a Function
Let's start by creating a very simple function. Consider the following function that converts pounds (a unit of weight) to kilograms (another unit of weight):
```{python}
def pounds_to_kg(pounds):
    return pounds * 0.4536
```
If you execute this code, you will create a function named `pounds_to_kg`, which can be used directly in a script or in the console:
```{python}
print(pounds_to_kg(150))
```
Let's break down the structure of this first function step by step.
First, a function is created using the `def` keyword, followed by the function name, a pair of parentheses, and a colon.
```{python}
# | eval: false
def function_name():
    # Function body
```
Inside the parentheses, we indicate the **arguments** of the function. Our function only takes one argument, which we have decided to name `pounds`. This is the value that we want to convert from pounds to kilograms.
```{python}
# | eval: false
def pounds_to_kg(pounds):
    # Function body
```
Of course, we could have named this argument anything we wanted, e.g. `p` or `weight`.
The next element, after the colon, is the **body** of the function. This is where we write the code that we want to execute when the function is called.
```{python}
def pounds_to_kg(pounds):
    return pounds * 0.4536
```
We use the `return` statement to specify what value the function should output.
You could also assign the result to a variable and then return that variable:
```{python}
def pounds_to_kg(pounds):
    kg = pounds * 0.4536
    return kg
```
This is a bit more wordy, but it makes the function clearer.
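To see why the `return` statement matters, here is a minimal sketch. The `pounds_to_kg_no_return` variant is a hypothetical name used only for illustration: it computes the result but never returns it, so calling it gives back `None`.

```python
def pounds_to_kg_no_return(pounds):
    # Hypothetical variant: the value is computed but never returned
    kg = pounds * 0.4536


def pounds_to_kg(pounds):
    kg = pounds * 0.4536
    return kg


missing = pounds_to_kg_no_return(150)  # nothing comes back from this call
converted = pounds_to_kg(150)          # the converted weight comes back

print(missing)    # None
print(converted)
```

Without `return`, the result stays inside the function and cannot be stored or reused by the code that called it.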
We can now use our function like this with a named argument:
```{python}
pounds_to_kg(pounds=150)
```
Or without a named argument:
```{python}
pounds_to_kg(150)
```
To use this in a DataFrame, you can create a new column:
```{python}
pounds_df = pd.DataFrame({'pounds': [150, 200, 250]})
pounds_df['kg'] = pounds_to_kg(pounds_df['pounds'])
pounds_df
```
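This works because multiplying a pandas Series by a number is carried out element by element, so the same function body can handle a single value or a whole column. A minimal sketch (the variable names `single` and `column` are chosen only for illustration):

```python
import pandas as pd


def pounds_to_kg(pounds):
    return pounds * 0.4536


# A single number in, a single number out
single = pounds_to_kg(150)

# A Series in, a Series of the same length out
column = pounds_to_kg(pd.Series([150, 200, 250]))

print(type(single))
print(type(column))
print(len(column))  # 3
```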
And that's it! You have just created and used your first function in Python.
::: {.callout-tip title="Practice"}
### Age in Months Function
Create a simple function called `years_to_months` that transforms age in years to age in months.
Use it on the `riots_df` DataFrame imported below to create a new column called `age_months`:
```{python}
riots_df = vd.data.la_riots()
riots_df
```
:::
## Functions with Multiple Arguments
Most functions take multiple arguments rather than just one. Let's look at an example of a function that takes three arguments:
```{python}
def calc_calories(carb_grams, protein_grams, fat_grams):
    result = (carb_grams * 4) + (protein_grams * 4) + (fat_grams * 9)
    return result

calc_calories(carb_grams=50, protein_grams=25, fat_grams=10)
```
The `calc_calories` function computes the total calories based on the grams of carbohydrates, protein, and fat. Carbohydrates and proteins are estimated to be 4 calories per gram, while fat is estimated to be 9 calories per gram.
If you attempt to use the function without supplying all the arguments, it will raise an error.
```{python}
# | eval: false
calc_calories(carb_grams=50, protein_grams=25)
```
```
TypeError: calc_calories() missing 1 required positional argument: 'fat_grams'
```
You can define **default values** for your function's arguments. If the function is called without a value for an argument, that argument takes its default value. Let's make all arguments optional by giving them all default values:
```{python}
def calc_calories(carb_grams=0, protein_grams=0, fat_grams=0):
    result = (carb_grams * 4) + (protein_grams * 4) + (fat_grams * 9)
    return result
```
Now, we can call the function with only some arguments without getting an error:
```{python}
calc_calories(carb_grams=50, protein_grams=25)
```
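As a further sketch of how default values behave, the calls below mix positional and named arguments. The variable names are illustrative only; any argument that is left out falls back to its default of 0.

```python
def calc_calories(carb_grams=0, protein_grams=0, fat_grams=0):
    result = (carb_grams * 4) + (protein_grams * 4) + (fat_grams * 9)
    return result


# Only the last argument is supplied, by name
only_fat = calc_calories(fat_grams=10)

# The first two arguments are supplied by position; fat_grams defaults to 0
positional = calc_calories(50, 25)

# No arguments at all: every argument uses its default
no_args = calc_calories()

print(only_fat)    # 90
print(positional)  # 300
print(no_args)     # 0
```

Naming the argument is the safer choice when you skip earlier ones, since positional values are always matched from left to right.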
Let's use this on a sample dataset:
```{python}
food_df = pd.DataFrame({
    'food': ['Apple', 'Avocado'],
    'carb_grams': [25, 10],
    'protein_grams': [0, 1],
    'fat_grams': [0, 14]
})
food_df['calories'] = calc_calories(food_df['carb_grams'], food_df['protein_grams'], food_df['fat_grams'])
food_df
```
::: {.callout-tip title="Practice"}
### BMI Function
Create a function named `calc_bmi` that calculates the Body Mass Index (BMI) for one or more individuals, then apply the function by running the code chunk further below. The formula for BMI is weight (kg) divided by height (m) squared.
```{python}
# Your code here
```
```{python}
# | eval: false
bmi_df = pd.DataFrame({
    'Weight': [70, 80, 100],  # in kg
    'Height': [1.7, 1.8, 1.2]  # in meters
})
bmi_df['BMI'] = calc_bmi(bmi_df['Weight'], bmi_df['Height'])
bmi_df
```
:::
## Intro to Conditionals: `if`, `elif`, and `else`
Conditional statements allow you to execute code only when certain conditions are met. The basic syntax in Python is:
```{python}
# | eval: false
if condition:
    # Code to execute if condition is True
elif another_condition:
    # Code to execute if the previous condition was False and this condition is True
else:
    # Code to execute if all previous conditions were False
```
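Here is a small runnable sketch of that pattern outside of a function. The variable `temperature_c` and the cut-off values are hypothetical, chosen only to illustrate the control flow. Only the first branch whose condition is `True` runs.

```python
temperature_c = 38.2  # hypothetical body temperature in degrees Celsius

if temperature_c >= 38:
    # Runs because 38.2 >= 38 is True
    status = "Fever"
elif temperature_c < 35:
    # Skipped: an earlier branch already matched
    status = "Low temperature"
else:
    # Skipped: runs only when no condition above is True
    status = "Normal"

print(status)  # Fever
```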
Let's look at an example of using conditionals within a function. Suppose we want to write a function that classifies a number as positive, negative, or zero.
```{python}
def class_num(num):
    if num > 0:
        return "Positive"
    elif num < 0:
        return "Negative"
    else:
        return "Zero"

print(class_num(10))  # Output: Positive
print(class_num(-5))  # Output: Negative
print(class_num(0))  # Output: Zero
```
If you try to apply this function to a DataFrame column the way we did above with, for example, the BMI function, you will get an error:
```{python}
num_df = pd.DataFrame({'num': [10, -5, 0]})
num_df
```
```{python}
# | eval: false
num_df['category'] = class_num(num_df['num'])
```
```
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
```
The reason is that `if` statements are not vectorized: they expect a single `True` or `False` value, not a Series of them. To get around this, we can use the `np.vectorize` function to create a vectorized version of the function:
```{python}
class_num_vec = np.vectorize(class_num)
num_df['category'] = class_num_vec(num_df['num'])
num_df
```
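Conceptually, the vectorized function calls `class_num` once per element and collects the results, much like a loop would. The sketch below compares the two. The list comprehension is shown only for illustration and is not a claim about how `np.vectorize` is implemented internally.

```python
import numpy as np


def class_num(num):
    if num > 0:
        return "Positive"
    elif num < 0:
        return "Negative"
    else:
        return "Zero"


values = [10, -5, 0]

# Vectorized version: accepts the whole collection at once
class_num_vec = np.vectorize(class_num)
vectorized = class_num_vec(values).tolist()

# Plain loop: calls the original function one value at a time
looped = [class_num(v) for v in values]

print(vectorized)
print(looped)
```

Both approaches give the same labels. `np.vectorize` is a convenience for writing the call in one step, not a way to make the function run faster.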
To get more practice with conditionals, let's write a function that categorizes grades into simple categories:
- If the grade is 85 or above (up to 100), the category is 'Excellent'.
- If the grade is at least 60 but below 85, the category is 'Pass'.
- If the grade is at least 0 but below 60, the category is 'Fail'.
- If the grade is negative or above 100, return 'Invalid grade'.
```{python}
def categorize_grade(grade):
    if grade >= 85 and grade <= 100:
        return 'Excellent'
    elif grade >= 60 and grade < 85:
        return 'Pass'
    elif grade >= 0 and grade < 60:
        return 'Fail'
    else:
        return 'Invalid grade'

categorize_grade(95)  # Output: Excellent
```
We can apply this function to a column in a DataFrame, but first we need to vectorize it:
```{python}
# Keep the original function intact and store the vectorized version under a new name
categorize_grade_vec = np.vectorize(categorize_grade)
```
```{python}
grades_df = pd.DataFrame({'grade': [95, 82, 76, 65, 58, -5]})
grades_df['grade_cat'] = categorize_grade_vec(grades_df['grade'])
grades_df
```
::: {.callout-tip title="Practice"}
### Age Categorization Function
Now, try writing a function that categorizes age into different life stages. Use the following criteria:
- If the age is under 18, the category is 'Minor'.
- If the age is greater than or equal to 18 and less than 65, the category is 'Adult'.
- If the age is greater than or equal to 65, the category is 'Senior'.
- If the age is negative or invalid, return 'Invalid age'.
Use it on the `riots_df` DataFrame printed below to create a new column called `Age_Category`.
```{python}
# Your code here
riots_df = vd.data.la_riots()
riots_df
```
:::
::: {.callout-note title="Side Note"}
### Apply vs Vectorize
Another way to use functions containing `if` statements on a DataFrame is the `apply` method. Here is how you can apply the grade categorization function with `apply`:
```{python}
grades_df['grade_cat'] = grades_df['grade'].apply(categorize_grade)
grades_df
```
The `np.vectorize` approach is easier to use with functions that take multiple arguments, but you will encounter the `apply` method further down the road.
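To illustrate the difference with more than one argument, here is a minimal sketch. The function `final_result`, the `attended` column, and the sample data are all hypothetical and not part of the lesson's datasets.

```python
import numpy as np
import pandas as pd


def final_result(grade, attended):
    # Hypothetical rule: a pass needs a grade of at least 60 and attendance
    if grade >= 60 and attended:
        return "Pass"
    else:
        return "Fail"


results_df = pd.DataFrame({
    "grade": [95, 58, 76],
    "attended": [True, True, False],
})

# np.vectorize: pass each column as a separate argument
final_result_vec = np.vectorize(final_result)
results_df["result_vec"] = final_result_vec(results_df["grade"], results_df["attended"])

# apply: go row by row (axis=1) and pick the values out of each row
results_df["result_apply"] = results_df.apply(
    lambda row: final_result(row["grade"], row["attended"]), axis=1
)

print(results_df)
```

Both columns hold the same labels. The `apply` version needs a `lambda` and `axis=1` to reach several columns at once, which is why the vectorized call tends to read more simply here.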
:::
## Conclusion
In this lesson, we've introduced the basics of writing functions in Python and how to use conditional statements within those functions. Functions are essential building blocks in programming that allow you to encapsulate code for reuse and better organization. Conditional statements enable your functions to make decisions based on input values or other conditions.