This is a demonstration of R equivalents for common Excel tasks. Specifically, I am showing six key functions from the dplyr package: select, mutate, filter, summarize, group_by, and arrange.
The data comes from the US National Health and Nutrition Examination Survey (NHANES) and was obtained from the NHANES R package. It is a sample of 10,000 people, weighted to be representative of the US population.
I used the clean_names function from the janitor package to make the variable names easy to work with.
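As a rough sketch of what clean_names does, the example below applies it to a small made-up data frame. The column names and values are hypothetical and are not the actual raw NHANES columns.

```r
library(janitor)

# Hypothetical data frame with inconsistent column names (not the real NHANES data)
messy <- data.frame(
  "Gender" = c("female", "male"),
  "Sleep Hours" = c(7, 8),
  check.names = FALSE  # keep the messy names exactly as written
)

# clean_names() converts column names to lowercase snake_case
tidy <- clean_names(messy)
names(tidy)
```

Consistent lowercase names mean the columns can be typed without backticks or having to remember capitalization.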
Let’s import the NHANES data into a data frame and then take a look at it.
library(tidyverse)  # loads readr (read_csv) and dplyr (the verbs and %>%)
nhanes <- read_csv("nhanes.csv")
nhanes
I’ve also made a codebook in case you want to learn more about the variables (most are self-explanatory from their names).
nhanes_codebook <- read_csv("nhanes-codebook.csv")
nhanes_codebook
# select: keep only the height column
nhanes %>%
  select(height)
# mutate: add a height_inches column (height is recorded in centimeters)
nhanes %>%
  mutate(height_inches = height / 2.54) %>%
  select(height, height_inches)
# filter: keep only rows where height is greater than 150
nhanes %>%
  filter(height > 150) %>%
  select(height)
# summarize: compute the mean height, ignoring missing values
nhanes %>%
  summarize(mean_height = mean(height, na.rm = TRUE))
# group_by + summarize: compute the mean height separately for each gender
nhanes %>%
  group_by(gender) %>%
  summarize(mean_height = mean(height, na.rm = TRUE))
# arrange: sort rows by height, smallest first
nhanes %>%
  arrange(height) %>%
  select(height)
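These verbs can also be chained into a single pipeline. The sketch below does so on a small made-up data frame so that it runs without nhanes.csv. The values are invented for illustration, and `toy` and `height_by_gender` are hypothetical names.

```r
library(dplyr)

# Made-up stand-in for the nhanes data frame (values invented for illustration)
toy <- data.frame(
  gender = c("female", "male", "female", "male"),
  height = c(160, 175, 148, NA)
)

height_by_gender <- toy %>%
  filter(height > 150) %>%                   # keep rows above 150 cm; NA rows are dropped too
  mutate(height_inches = height / 2.54) %>%  # convert centimeters to inches
  group_by(gender) %>%                       # one group per gender
  summarize(mean_height_inches = mean(height_inches, na.rm = TRUE)) %>%
  arrange(mean_height_inches)                # sort groups by mean height, smallest first

height_by_gender
```

Each step receives the data frame produced by the previous one, much like building up a worksheet one helper column or filter at a time.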