Chapter 16 Functions
Note: Chapter under development
Functions are set sof instructions that take inputs (arguments), processes them, and returns an output. Instead of writing the same code repeatedly, you can define a function once and use it whenever needed. This makes your scripts cleaner and easier to maintain.
I recommend Hadley Wickham’s great chapter on functions in his book R for Data Science 2e.
16.1 Components
A function in R is defined using the function
keyword. Here is a simple function that adds two numbers:
add_numbers <- function(a, b) {
result <- a + b
return(result)
}
# Using the function
add_numbers(3, 5)
In this example, we see a function consisting of:
- A name (e.g.,
add_numbers
) - Arguments (e.g.,
a
,b
) - A body that performs calculations or operations
- A return statement (optional, but recommended)
We can also reference functions (from other packages) for use in our own custom functions. This data cleaning functions references dplyr
, stringr,
and tidyr
.
# Practical Example: Data Cleaning Function
library(dplyr)
library(stringr)
library(tidyr)
data_cleaner <- function(df) {
df <- df %>%
dplyr::mutate(across(where(is.character),
stringr::str_trim)) %>% # Trim whitespace
tidyr::drop_na() # Remove missing values
return(df)
}
# Example
df <- data.frame(name = c("Alice ", " Bob", NA), age = c(25, 30, 22))
data_cleaner(df)
16.2 Luhn example
It turns out credit card numbers aren’t a random 16 digits, there’s an algorithm (named after it’s inventor Hans Peter Luhn) that determines the combinations possible.
It follows these steps:
- Starting from the right, double every second digit.
- If doubling a digit results in a number greater than 9, subtract 9 from it.
- Sum all the digits.
- If the total sum is a multiple of 10, the number is valid; otherwise, it is invalid.
That’s a lot of info… instead let’s write a function to check if a credit card number is valid:
validate_credit_card <- function(card_number) {
digits <- as.numeric(strsplit(as.character(card_number), "")[[1]])
n <- length(digits)
# Double every second digit from the right
for (i in seq(n - 1, 1, by = -2)) {
digits[i] <- digits[i] * 2
if (digits[i] > 9) {
digits[i] <- digits[i] - 9
}
}
# Check if the sum is a multiple of 10
return(sum(digits) %% 10 == 0)
}
# Example usage
validate_credit_card(4532015112830366) # Should return TRUE
validate_credit_card(1234567812345678) # Should return FALSE
16.3 Loops in functions
The real power of functions comes when we introduce loops. Loops allow us to repeat a calculation for different parameter values. Factorials are a great example where a loop is needed.
A factorial is a mathematical operation denoted by an exclamation mark (!), used to find the product of all positive integers less than or equal to a given number.
Let’s write a looping function to calculate 10! (10x9x8x7x6x5x4x3x2x1)
## Calculating Factorial Using a Loop
factorial_loop <- function(n) {
if (n < 0) {
stop("Factorial is not defined for negative numbers.")
}
result <- 1
for (i in 1:n) {
result <- result * i
}
return(result)
}
# Example usage
factorial_loop(10) # Test 10!
We see factorials blow out in size very quickly. Try inputting 100! and see what happens.