Lesson 1. Write Clean Code - Expressive or Literate Programming in R - Data Science for Scientists 101
Clean Code & Getting Help - Earth analytics course module
Welcome to the first lesson in the Clean Code & Getting Help module. This module covers how to write clean code that is easier to read. It also covers some basic approaches to getting help when working in R. Finally, it reviews how to install QGIS, a free and open source GIS tool, on your computer. This lesson reviews best practices associated with clean coding.
Learning Objectives
At the end of this activity, you will be able to:
- Write code using Hadley Wickham’s style guide.
What You Need
You need R and RStudio to complete this tutorial. We also recommend that you have an earth-analytics directory set up on your computer with a /data directory within it.
Clean code means that your code is organized in a way that is easy for you and for someone else to follow and read. Certain conventions are suggested to make code easier to read. For example, many guides suggest putting a space after the pound sign that starts a comment, like so:
#poorly formatted comments are missing the space after the pound sign.
# good comments have a space after the pound sign
While these types of guidelines may seem unimportant when you first begin to code, after a while you’ll realize that consistently formatted code is much easier for your eye to scan and quickly understand.
Consistent, Clean Code
Take some time to review Hadley Wickham’s style guide. From here on in, you will follow this guide for all of the assignments in this class.
Object Naming Best Practices
- Keep object names short: This makes them easier to read when scanning through code.
- Use meaningful names: For example, precip is a more useful name than x because it tells us something about the object.
- Don’t start names with numbers! Object names that start with a number are NOT VALID in R.
- Avoid names that are reserved words or existing functions in R: e.g. if, else, for (run ?Reserved in R for the full list of reserved words).
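The naming rules above can be checked directly in R. The following is a minimal sketch: the helper is_valid_name() and the precipitation values are made up for illustration and are not part of the course materials. It relies on the base R function make.names(), which returns a syntactically valid version of a string, so a name is valid only if make.names() leaves it unchanged.

```r
# A meaningful name tells the reader what the object holds
# (the values are invented for illustration)
precip <- c(0.1, 1.0, 0.3)

# Hypothetical helper: a name is syntactically valid if make.names()
# does not need to change it
is_valid_name <- function(name) {
  make.names(name) == name
}

is_valid_name("precip")         # TRUE
is_valid_name("1variable")      # FALSE: starts with a number
is_valid_name("mean-variable")  # FALSE: the hyphen is read as a minus sign
is_valid_name("if")             # FALSE: reserved word
```

Note that this only checks syntax. A name such as x passes the check but still tells the reader nothing about what the object contains.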
A few other notes about object names in R:
- R is case sensitive (e.g. weight_kg is different from Weight_kg).
- Avoid the names of other existing functions and built-in objects (e.g. c, T, mean, data, df, weights).
- Use nouns for variable names and verbs for function names.
- Avoid using dots in object names (e.g. my.dataset). Dots have a special meaning in R (for methods) and in other programming languages. Instead, use underscores: my_dataset.
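The notes above can be illustrated with a short sketch. All object names and values here are made up for illustration, and the class rain_gauge is hypothetical.

```r
# R is case sensitive: these are two different objects
weight_kg <- 55
Weight_kg <- 60
weight_kg == Weight_kg  # FALSE

# T is an ordinary variable that starts out equal to TRUE, so it can be
# overwritten by accident (TRUE itself is a reserved word and cannot be)
T <- 0
isTRUE(T)  # FALSE: T no longer means TRUE
rm(T)      # remove our copy so the built-in T is visible again

# Dots drive S3 method dispatch: a function named generic.class is
# treated as the method of `generic` for objects of class `class`.
# The class "rain_gauge" is hypothetical.
summary.rain_gauge <- function(object, ...) {
  paste("rain gauge with", length(object), "readings")
}
gauge <- structure(c(0.1, 1.0, 0.3), class = "rain_gauge")
summary(gauge)  # "rain gauge with 3 readings"
```

Because a dot can signal a method in this way, a name like my.dataset can be mistaken for a method of a function called my, which is one reason underscores are preferred.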
Challenge
Take a look at the code below.
- Create a list of all of the things that could be improved to make the code easier to read and work with.
- Add to that list things that don’t fit the Hadley Wickham style guide standards.
- Try to run the code in R. Any issues?
#my code
#load stuff
library(ggplot2)
#turn off factors
options(stringsAsFactors = FALSE)
1variable <- 3 * 6
meanVariable <- 1variable
#calculate something important
mean-variable <- meanvariable * 5
thefinalthingthatineedtocalculate <- mean-variable + 5
#get things that are important
download.file(url = "https://ndownloader.figshare.com/files/7010681",
destfile = "data/boulder-precip.csv")
my.data <- read.csv(file="data/boulder-precip.csv")
head(my_data)
str(my.data)
qplot(x=my.data$DATE,
y=my.data$PRECIP)