Leah Wasser
Leah Wasser has contributed to the materials listed below. Leah is the director of the Earth Analytics Education Initiative at Earth Lab and maintains this website.
Course Lessons
Course lessons are developed as a part of a course curriculum. They teach specific learning objectives associated with data and scientific programming. Leah Wasser has contributed to the following lessons:
Calculate Seasonal Summary Values from Climate Data Variables Stored in NetCDF 4 Format: Work With MACA v2 Climate Data in Python
Learn how to calculate seasonal summary values for MACA v2 climate data using xarray and regionmask in open source Python.
Calculate Summary Values Using Spatial Areas of Interest (AOIs) including Shapefiles for Climate Data Variables Stored in NetCDF 4 Format: Work With MACA v2 Climate Data in Python
Climate datasets stored in NetCDF 4 format often cover the entire globe or an entire country. Learn how to subset climate data spatially and by time slices using xarray and regionmask in open source Python.
How to Open and Process NetCDF 4 Data Format in Open Source Python
Historic and projected climate data are most often stored in NetCDF 4 format. Learn how to open and process MACA version 2 climate data for the Continental United States using the open source Python package xarray.
How to Download MACA2 Climate Data Using Python
MACA v2 climate data provide both historical and future predictions of climate variables using different models. Learn how to download data in NetCDF 4 format programmatically using open source Python and open the data with xarray.
Introduction to the CMIP and MACA v2 Climate Data
In this lesson you will learn the basics of what CMIP5 and MACA v2 data are and how global climate data are downscaled to higher resolutions to support regional analysis.
Introduction to the NetCDF4 Hierarchical Data Format
In this lesson you will learn about the NetCDF 4 data format, which is commonly used to store climate data. In later lessons you will learn how to open climate data using open source Python tools.
Loops in Python Exercise
Loops can be used to automate data tasks in Python by iteratively executing the same code on multiple data structures. Practice using loops to automate certain functionality in Python.
Introduction to List Comprehensions in Python: Write More Efficient Loops
A list comprehension in Python is a type of loop that is often faster than a traditional loop. Learn how to create list comprehensions to automate data tasks in Python.
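As a minimal sketch of the idea (the variable names and values are made up for illustration and do not come from the lesson), the same unit conversion can be written as a traditional loop and as a list comprehension:

```python
# Hypothetical monthly precipitation values in inches (illustrative only)
precip_inches = [0.70, 0.75, 1.85]

# Traditional loop: build the converted list one item at a time
precip_mm_loop = []
for value in precip_inches:
    precip_mm_loop.append(round(value * 25.4, 2))

# List comprehension: the same conversion written as a single expression
precip_mm = [round(value * 25.4, 2) for value in precip_inches]
```

Both versions produce the same list; the comprehension simply expresses the loop and the append in one line.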
Activity: Plot Spatial Raster Data in Python
Practice your skills creating maps of raster and vector data using open source Python.
Activity: Plot Time Series Data Using Pandas in Open Source Python
Practice your skills plotting time series data stored in Pandas Data Frames in Python.
Activity: Practice Plotting Tabular Data Using Matplotlib and Pandas in Open Source Python
Practice your skills plotting time series data stored in Pandas Data Frames in Python.
File Formats Exercise
Complete these exercises to practice the skills you learned in the file formats chapters.
Introduction to Spatial Vector Data File Formats in Open Source Python
Vector data is one of the two most common spatial data types. Learn to work with vector data for earth data science.
Use Raster Data for Earth Data Science
Raster data is one of the two most common spatial data types. Learn to work with raster data for earth data science.
Spatial Data Formats for Earth Data Science
Two of the major spatial data formats used in earth data science are vector and raster data. Learn about these two common spatial data formats for earth data science workflows in this chapter.
Use Twitter Data to Explore the 2013 Colorado Flood Using Open Source Python
In this lesson you will learn how to parse a JSON file containing Twitter data to better understand the 2013 Colorado Floods using open source Python tools.
Open and Use MODIS Data in HDF4 format in Open Source Python
MODIS is remote sensing data that is stored in the HDF4 file format. Learn how to open and use MODIS data in HDF4 form in Open Source Python.
Introduction to the HDF4 Data Format - Explore H4 Files Using HDFView
MODIS is remote sensing data that is stored in the HDF4 file format. Learn how to view and explore HDF4 files (and their metadata) using the free HDF viewer provided by the HDF group.
Find and Download MODIS Data From the USGS Earth Explorer Website
Learn how to find and download MODIS data from the USGS Earth Explorer website.
Work with MODIS Remote Sensing Data using Open Source Python
MODIS is a satellite remote sensing instrument that collects data daily across the globe at 250-500 m resolution. Learn how to import, clean up and plot MODIS data in Python.
Practice Opening and Plotting Landsat Data in Python Using Rasterio
A set of activities for you to practice your skills using Landsat Data in Open Source Python.
Find and Download Landsat 8 Remote Sensing Data From the USGS Earth Explorer Website
Learn how to find and download Landsat 8 remote sensing data from the USGS Earth Explorer website.
How to Replace Raster Cell Values with Values from A Different Raster Data Set in Python
Most remote sensing data sets contain no data values that represent pixels that contain invalid data. Learn how to handle no data values in Python for better raster processing.
Clean Remote Sensing Data in Python - Clouds, Shadows & Cloud Masks
Landsat remote sensing data often has pixels that are covered by clouds and cloud shadows. Learn how to remove cloud-covered Landsat pixels using open source Python.
Open and Crop Landsat Remote Sensing Data in Open Source Python
Learn how to open up and create a stack of Landsat images and crop them to a certain extent using open source Python.
Work with Landsat Remote Sensing Data in Python
Landsat 8 data are downloaded in tif file format. Learn how to open and manipulate Landsat 8 data in Python. Also learn how to create RGB and color infrared Landsat image composites.
Calculate and Plot Difference Normalized Burn Ratio (dNBR) using Landsat 8 Remote Sensing Data in Python
The Normalized Burn Ratio (NBR) is used to quantify the amount of area that was impacted by a fire. Learn how to calculate the normalized burn ratio and classify your data using Landsat 8 data in Python.
Calculate NDVI Using NAIP Remote Sensing Data in the Python Programming Language
A vegetation index is a single value that quantifies vegetation health or structure. Learn how to calculate the NDVI vegetation index using NAIP data in Python.
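As a rough illustration, NDVI is computed per pixel as (NIR - Red) / (NIR + Red). The sketch below uses small made-up numpy arrays in place of real NAIP bands; the array names and values are assumptions for illustration only.

```python
import numpy as np

# Hypothetical reflectance values for a tiny 2 x 2 image (illustrative only)
red = np.array([[0.10, 0.20], [0.30, 0.40]])
nir = np.array([[0.50, 0.60], [0.30, 0.20]])

# NDVI = (NIR - Red) / (NIR + Red), calculated cell by cell
ndvi = (nir - red) / (nir + red)
```

Values closer to 1 generally indicate healthier or denser vegetation. With real imagery, the two bands would be read from the raster file rather than typed in by hand.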
Calculate Vegetation Indices in Python
A vegetation index is a value that quantifies vegetation health or structure. Learn how to calculate the NDVI and NBR vegetation indices to study vegetation health and wildfire impacts in Python.
Summary Activity for Time Series Data
An activity to practice all of the skills you just learned.
Customize Dates on Time Series Plots in Python Using Matplotlib
When you plot time series data using the matplotlib package in Python, you often want to customize the date format that is presented on the plot. Learn how to customize the date format on time series plots created using matplotlib.
Resample or Summarize Time Series Data in Python With Pandas - Hourly to Daily Summary
Sometimes you need to take time series data collected at a higher resolution (for instance many times a day) and summarize it to a daily, weekly or even monthly value. This process is called resampling in Python and can be done using pandas dataframes. Learn how to resample time series data in Python with Pandas.
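A minimal sketch of resampling with pandas, assuming an hourly series with a datetime index. The data below are made up for illustration; a real workflow would read them from a file.

```python
import pandas as pd

# Hypothetical hourly precipitation record covering two days (illustrative only)
hours = pd.date_range("2013-09-12", periods=48, freq=pd.Timedelta(hours=1))
hourly_precip = pd.Series(1.0, index=hours)

# Resample from hourly to daily ("D") values, summing within each day
daily_precip = hourly_precip.resample("D").sum()
```

Swapping `.sum()` for `.mean()` or `.max()` gives a different daily summary, and a different rule such as `"W"` summarizes by week instead.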
Subset Time Series By Dates Python Using Pandas
Sometimes you have data over a longer time span than you need for your analysis or plot. Learn how to subset your data using a begin and end date in Python.
Work With Datetime Format in Python - Time Series Data
Python provides a datetime object for storing and working with dates. Learn how you can convert columns in a pandas dataframe containing dates and times as strings into datetime objects for more efficient analysis and plotting.
The Relationship Between Precipitation and Stream Discharge | Explore Mass Balance
Learn how to create a cumulative sum plot in Pandas to better understand stream discharge in a watershed.
Why A Hundred Year Flood Can Occur Every Year. Calculate Exceedance Probability and Return Periods in Python
Learn how to calculate exceedance probability and return periods associated with a flood in Python.
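One common way to estimate these quantities is sketched below. It assumes a rank-based plotting position, probability = rank / (n + 1), which may differ from the exact approach used in the lesson, and the discharge values are made up for illustration.

```python
# Hypothetical annual maximum discharge values (illustrative only)
annual_max_discharge = [120.0, 340.0, 95.0, 410.0, 180.0, 260.0, 150.0, 220.0, 300.0]

n_years = len(annual_max_discharge)

# Rank the values from largest (rank 1) to smallest (rank n)
ranked = sorted(annual_max_discharge, reverse=True)

results = []
for rank, discharge in enumerate(ranked, start=1):
    # Assumed plotting position: exceedance probability = rank / (n + 1)
    probability = rank / (n_years + 1)
    # The return period in years is the inverse of the exceedance probability
    return_period = 1 / probability
    results.append((discharge, probability, return_period))
```

Because the return period is the inverse of an annual probability, a 100-year flood has a 1% chance of occurring in any given year, which is why such a flood can occur in consecutive years.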
Write Functions with Multiple Parameters in Python
A function is a reusable block of code that performs a specific task. Learn how to write functions that can take multiple as well as optional parameters in Python to eliminate repetition and improve efficiency in your code.
Write Functions in Python
A function is a reusable block of code that performs a specific task. Learn how to write functions in Python to eliminate repetition and improve efficiency in your code.
Introduction to Writing Functions in Python
A function is a reusable block of code that performs a specific task. Learn how functions can be used to write efficient and DRY (Do Not Repeat Yourself) code in Python.
Create Data Workflows with Loops
Loops can be an important part of creating a data workflow in Python. Use loops to go from raw data to a finished project more efficiently.
Automate Data Tasks With Loops in Python
Loops can be used to automate data tasks in Python by iteratively executing the same code on multiple data structures. Learn how to automate data tasks in Python using data structures such as lists, numpy arrays, and pandas dataframes.
Introduction to Using Loops to Automate Workflows in Open Source Python
Loops can help reduce repetition in code by iteratively executing the same code on a range or list of values. Learn about the basic types of loops in Python and how they can be used to write Do Not Repeat Yourself, or DRY, code in Python.
Conditional Statements with Alternative or Combined Conditions
Conditional statements in Python can be written to check for alternative conditions or combinations of multiple conditions. Learn how to write conditional statements in Python that choose between alternative conditions or check for combinations of conditions before executing code.
Intro to Conditional Statements in Python
Conditional statements help you to control the flow of code by executing code only when certain conditions are met. Learn about the structure of conditional statements in Python and how they can be used to write Do Not Repeat Yourself, or DRY, code in Python.
Practice Forking a GitHub Repository and Submitting Pull Requests
A pull request allows anyone to suggest changes to a repository on GitHub that can be easily reviewed by others. Learn how to submit pull requests on GitHub.com to suggest changes to a GitHub repository.
An Example of a Github Collaborative Workflow for Team Science
GitHub.com can be used to store and access files in the cloud using GitHub repositories. Learn how to submit pull requests on GitHub.com to suggest changes to a GitHub repository.
Track, Manage and Discuss Project Changes and Updates Using GitHub Issues
An issue is a GitHub project management tool that allows anyone to identify and discuss potential changes to a repo. Learn how to create and manage GitHub issues to support collaborative open reproducible science projects.
Sync a GitHub Repo: How To Ensure Your GitHub Fork Is Up To Date
When you are working on a forked GitHub repository you will need to update your files frequently. Learn how to update your GitHub fork using a reverse pull request.
How To Create A Pull Request on Github: Propose Changes to GitHub Repositories
A pull request allows anyone to suggest changes to a repository on GitHub that can be easily reviewed by others. Learn how to submit pull requests on GitHub.com to suggest changes to a GitHub repository.
Learn How To Use GitHub to Collaborate on Open Science Projects
GitHub is a website that supports git-based version control and collaborative project management. Learn how to use git and GitHub to collaborate on projects in support of open reproducible science.
Select Data From Pandas Dataframes
Pandas dataframes are a commonly used scientific data structure in Python that store tabular data using rows and columns with headers. Learn how to use indexing and filtering to select data from pandas dataframes.
Run Calculations and Summary Statistics on Pandas Dataframes
Pandas dataframes are a commonly used scientific data structure in Python that store tabular data using rows and columns with headers. Learn how to run calculations and summary statistics (such as mean or maximum) on columns in pandas dataframes.
Import CSV Files Into Pandas Dataframes
Pandas dataframes are a commonly used scientific data structure in Python that store tabular data using rows and columns with headers. Learn how to import text data from .csv files into pandas dataframes.
Intro to Pandas Dataframes
Pandas dataframes are a commonly used scientific data structure in Python that store tabular data using rows and columns with headers. Learn about the key characteristics of pandas dataframes that make them a useful data structure for storing and working with labeled scientific datasets.
Slice (or Select) Data From Numpy Arrays
Numpy arrays are an efficient data structure for working with scientific data in Python. Learn how to use indexing to slice (or select) data from one-dimensional and two-dimensional numpy arrays.
Run Calculations and Summary Statistics on Numpy Arrays
Numpy arrays are an efficient data structure for working with scientific data in Python. Learn how to run calculations and summary statistics (such as mean or maximum) on one-dimensional and two-dimensional numpy arrays.
Import Text Files Into Numpy Arrays
Numpy arrays are an efficient data structure for working with scientific data in Python. Learn how to import text data from .txt and .csv files into numpy arrays.
Intro to Numpy Arrays
Numpy arrays are a commonly used scientific data structure in Python that store data as a grid, or a matrix. Learn about the key characteristics of numpy arrays that make them an efficient data structure for storing and working with large scientific datasets.
Use the OS and Glob Python Packages to Manipulate File Paths
The os and glob packages are very useful tools in Python for accessing files and directories and for creating lists of paths to files and directories, respectively. Learn how to manipulate and parse file and directory paths using os and glob.
Write Code That Will Work On Any Computer: Introduction to Using the OS Python Package to Set Up Working Directories and Construct File Paths
Manually constructed file paths will often not run on computers with different operating systems. Learn how to construct file paths in Python that will work on Mac, Linux and Windows, in support of open reproducible science.
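A small sketch of the idea using `os.path.join` from the standard library. The directory and file names are hypothetical and only illustrate how a path is built from its parts.

```python
import os

# Build the path from its parts; os.path.join inserts the separator
# used by the operating system that is running the code
data_path = os.path.join("data", "colorado-flood", "precip.csv")  # hypothetical path

# The file name can be recovered the same way on any operating system
file_name = os.path.basename(data_path)
```

Because no separator is typed by hand, the same code produces a valid path on Mac, Linux and Windows.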
Working Directories, Absolute and Relative Paths and Other Science Project Management Terms Defined
A directory refers to a folder on a computer that has relationships to other folders. Learn about directories, files, and paths, as they relate to creating reproducible science projects.
Install Packages in Python
Packages in Python provide pre-built functionality that adds to the functionality available in base Python. Learn how to install packages in Python using conda environments.
Use Conda Environments to Manage Python Dependencies: Everything That You Need to Know
A conda environment is a self-contained Python environment that allows you to run different versions of Python (with different installed packages) on your computer. Learn how conda environments can help you manage Python packages and dependencies.
Python Packages for Earth Data Science
The Python programming language provides many packages and libraries for working with scientific data. Learn about key Python packages for earth data science.
Customize your Maps in Python using Matplotlib: GIS in Python
In this lesson you will review how to customize matplotlib maps created using vector data in Python. You will review how to add legends, titles and how to customize map colors.
Customize Matplotlib Dates Ticks on the x-axis in Python
When you plot time series data in matplotlib, you often want to customize the date format that is presented on the plot. Learn how to customize the date format in a Python matplotlib plot.
Customize Your Plots Using Matplotlib
Matplotlib is the most commonly used plotting library in Python. Learn how to customize the colors, symbols, and labels on your plots using matplotlib.
Introduction to Plotting in Python Using Matplotlib
Matplotlib is the most commonly used plotting library in Python. Learn how to create plots using the matplotlib object oriented approach.
DRY Code and Modularity
DRY (Do Not Repeat Yourself) code supports reproducibility by removing repetition and making code easier to read. Learn about key strategies to write DRY code in Python.
Make Your Code Easier to Read By Using Expressive Variable Names in Python
Expressive variable names refer to function and variable names that describe what the variable contains or what the function does. Using expressive names makes your code easier to understand. Learn how to create expressive names for objects in your Python code.
Clean Code Syntax for Python: Introduction to PEP 8 Style Guide
Using a standard format and syntax when programming makes your code easier to read. Learn more about PEP 8, a set of guidelines for writing clean code in Python.
Introduction to Writing Clean Code and Literate Expressive Programming
Clean code refers to writing code that runs efficiently, is not redundant and is easy for anyone to understand. Learn about the characteristics and benefits of writing clean, expressive code in Python.
Python Fundamentals Exercise
Complete these exercises to practice the skills you learned in the Python fundamentals chapters.
Basic Operators in Python
Operators are symbols in Python that carry out a specific computation, or operation, such as arithmetic calculations. Learn how to use basic operators in Python.
Lists in Python
A Python list is a data structure that stores a collection of values in a specified order (or sequence) and is mutable (or changeable). Learn how to create and work with lists in Python.
Variables in Python
Variables store data (i.e. information) that you want to re-use in your code (e.g. single numeric value, path to a directory or file). Learn how to create and work with variables in Python.
Introduction to the Python Scientific Programming Language for Earth Data Science
Python is a free, open source programming language that can be used to work with scientific data. Learn about using Python to develop scientific workflows.
Use Tabular Data for Earth Data Science
Tabular data is common in all analytical work, most commonly seen as .txt and .csv files. Learn to work with tabular data for earth data science in this lesson.
Format Text In Jupyter Notebook With Markdown
Markdown allows you to format text using simple, plain-text syntax and can be used to document code in a variety of tools, including Jupyter Notebook. Learn how to format text in Jupyter Notebook using Markdown.
Text File Formats for Earth Data Science
There are many text file formats that are useful for earth data science workflows including Markdown, text (.txt, .csv) files, and YAML (Yet Another Markup Language). Learn about these common text file formats for earth data science workflows.
Useful Jupyter Notebook Shortcuts
The Jupyter ecosystem contains many useful tools for working with Python including Jupyter Notebook, an interactive coding environment. Learn useful shortcuts in Jupyter Notebook that can help you complete your tasks quickly and efficiently.
Manage Jupyter Notebook Files
The Jupyter ecosystem contains many useful tools for working with Python including Jupyter Notebook, an interactive coding environment, and the Jupyter Notebook dashboard, which allows you to manage files and directories in your Jupyter environment. Learn how to manage Jupyter Notebook files including saving, renaming, deleting, moving, and downloading notebooks.
Manage Directories in Jupyter Notebook Dashboard
The Jupyter ecosystem contains many useful tools for working with Python including the Jupyter Notebook dashboard, which allows you to manage files and directories in your Jupyter environment. Learn how to create, rename, move, and delete directories using the Jupyter Notebook dashboard.
Code and Markdown Cells in Jupyter Notebook
The Jupyter ecosystem contains many useful tools for working with Python including Jupyter Notebook, an interactive coding environment. Learn how to work with cells, including Python code and Markdown text cells, in Jupyter Notebook.
Get Started With Jupyter Notebook For Python
The Jupyter ecosystem contains many useful tools for working with Python including Jupyter Notebook, an interactive coding environment. Learn how to launch and close Jupyter Notebook sessions and how to navigate the Jupyter Dashboard to create and open Jupyter Notebook files (.ipynb).
Introduction to Jupyter For Python
The Jupyter ecosystem contains many useful tools for working with Python including Jupyter Notebook, an interactive coding environment. Learn how the components and functionality of Jupyter Notebook can help you implement open reproducible science workflows.
Bash Commands to Manage Directories and Files
Bash or Shell is a command line tool that is used in open science to efficiently manipulate files and directories. Learn how to run useful Bash commands to access and manage directories and files on your computer.
Introduction to Bash (Shell) and Manipulating Files and Directories at the Command Line
Bash or Shell is a command line tool that is used in open science to efficiently manipulate files and directories. Learn how to use Bash to manipulate files in support of reproducible science.
Customize your Maps in Python: GIS in Python
In this lesson you will learn how to adjust the x and y limits of your matplotlib and geopandas map to change the spatial extent.
How To Organize Your Project: Best Practices for Open Reproducible Science
Open reproducible science refers to developing workflows that others can easily understand and use. Learn about best practices for organizing open reproducible science projects including the use of machine readable names.
Tools For Open Reproducible Science
Key tools for open reproducible science include Shell (Bash), git and GitHub, Jupyter, and Python. Learn how these tools help you implement open reproducible science workflows.
What Is Open Reproducible Science
Open reproducible science refers to developing workflows that others can easily understand and use. It enables you to build on others' work rather than starting from scratch. Learn about the importance and benefits of open reproducible science.
Customize your Maps in Python using Matplotlib: GIS in Python
When making maps, you often want to create legends, customize colors, adjust zoom levels, or even make interactive maps. Learn how to customize maps created using vector data in Python with matplotlib, geopandas, and folium.
Handle missing spatial attribute data: GIS in Python
Sometimes vector data are missing attribute data, and it can be helpful to clean up your data. Learn how to handle missing attribute data in Python using GeoPandas.
How to Join Attributes From One Shapefile to Another in Open Source Python Using Geopandas: GIS in Python
A spatial join is when you assign attributes from one shapefile to another based upon its spatial location. Learn how to perform spatial joins in Python.
How to Dissolve Polygons Using Geopandas: GIS in Python
When you dissolve polygons, you remove the interior boundaries of a set of polygons with the same attribute value and create one new merged or combined polygon for each attribute value. Learn how to dissolve polygons in Python using GeoPandas.
Clip a spatial vector layer in Python using Shapely & GeoPandas: GIS in Python
Sometimes you may want to spatially clip a vector data layer to a specified boundary for easier plotting and analysis of smaller spatial areas. Learn how to clip a vector data layer in Python using GeoPandas and Shapely.
GIS in Python: Reproject Vector Data.
Often when spatial data do not line up properly on a plot, it is because they are in different coordinate reference systems (CRS). Learn how to reproject a vector dataset to a different CRS in Python using the to_crs() function from GeoPandas.
How Do You Design and Automate a Data Workflow
Designing and developing data workflows can help you complete your work more efficiently by allowing you to repeat and automate data tasks. Learn how to design and develop efficient workflows to automate data analyses in Python.
Learn to Write Pseudocode for Python Programming
Pseudocode can help you design data workflows by listing out the individual steps of a workflow in plain language, so the focus is on the overall data process rather than on the specific code needed. Learn best practices for writing pseudocode for data workflows.
Data Workflow Best Practices - Things to Consider When Processing Data
Identifying aspects of a workflow that can be modularized and tested can help you design efficient and effective data workflows. Learn best practices for designing efficient data workflows.
About the ReStructured Text Format - Introduction to .rst
Restructured text (RST) is a text format similar to Markdown that is often used to document Python software. Learn how to create headings, lists and code blocks in a text file using RST syntax.
Introduction to Documenting Python Software
Lack of documentation will limit people's use of your code. In this lesson you will learn about two ways to document Python code: docstrings and online documentation. You will also learn how to improve documentation in other software packages.
The GitHub Workflow - How to Contribute To Open Source Software
Open source means that you can view and contribute to software code like packages you use in Python. Learn about the ways that you can contribute without being an expert programmer.
Introduction to Open Source Software - What Is It and How Can You Help?
Open source means that you can view and contribute to software code like packages you use in Python. Learn about the ways that you can contribute without being an expert programmer.
Remote Sensing to Study Wildfire
Scientists often use remote sensing methods to study the impacts of wildfire through calculations of vegetation indices before and after wildfire. Learn more about how remote sensing can be used to study wildfire impacts.
Field Methods to Study Wildfire
Scientists often use field survey methods to study the impacts of wildfire through measurements of biomass and soil. Learn more about how survey methods can be used to study wildfire impacts.
An Overview of the Cold Springs Wildfire
The Cold Springs wildfire burned a total of 528 acres of land between July 9, 2016 and July 14, 2016. Learn more about this wildfire and how scientists study wildfire using both field and remote sensing methods.
Practice Using Git and GitHub to Manage Files
Practice your skills setting up git locally, committing changes to files and pushing and pulling files to GitHub.com.
Undo Local Changes With Git
A version control system allows you to track and manage changes to your files. Learn how to undo changes in git after they have been added or committed to version control.
Get Started with Git Commands for Version Control
A version control system allows you to track and manage changes to your files. Learn how to use some basic Git commands including add, commit and push.
How To Setup Git Locally On Your Computer
Learn how to setup git locally on your computer.
Copy (Fork) and Download (Clone) GitHub Repositories
GitHub.com can be used to store and access files in the cloud to share with others or simply as a backup of your local files. Learn how to create a copy of files on GitHub (fork) and to download files from GitHub to your computer (clone).
What Is Version Control
A version control system allows you to track and manage changes to your files. Learn benefits of version control for scientific workflows and how git and GitHub.com support version control.
Guided Activity on Git/Github.com For Collaboration
This lesson teaches you how to collaborate with others in a project, including tasks such as notifying others that an assigned task has been completed.
Guided Activity on Undo Changes in Git
This lesson teaches you how to undo changes in Git after they have been added or committed.
Crop a Spatial Raster Dataset Using a Shapefile in Python
This lesson covers how to crop a raster dataset and export it as a new raster in Python.
How to Dissolve Polygons Using Geopandas: GIS in Python
In this lesson you will review how to dissolve polygons in Python. A spatial join is when you assign attributes from one shapefile to another based upon its spatial location.
How to Reproject Vector Data in Python Using Geopandas - GIS in Python
Sometimes two shapefiles do not line up properly even if they cover the same area because they are in different coordinate reference systems. Learn how to reproject vector data in Python using geopandas to ensure your data line up.
GIS in Python: Introduction to Vector Format Spatial Data - Points, Lines and Polygons
This lesson introduces what vector data are and how to open vector data stored in shapefile format in Python.
Subtract Raster Data in Python Using Numpy and Rasterio
Sometimes you need to manipulate multiple rasters to create a new raster output data set in Python. Learn how to create a CHM by subtracting an elevation raster dataset from a surface model dataset in Python.
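As a minimal sketch of the subtraction step, the example below uses two small made-up numpy arrays in place of rasters opened from files. The array names and elevation values are assumptions for illustration only.

```python
import numpy as np

# Hypothetical 2 x 2 elevation grids in meters (illustrative only)
surface_model = np.array([[1650.0, 1662.0], [1655.0, 1651.0]])    # top of trees and buildings
elevation_model = np.array([[1650.0, 1650.0], [1651.0, 1651.0]])  # bare ground

# Canopy height = surface elevation minus ground elevation, cell by cell
canopy_height = surface_model - elevation_model
```

The subtraction only makes sense when both arrays have the same shape and cover the same area at the same resolution, so that each cell pair describes the same location.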
Open, Plot and Explore Lidar Data in Raster Format with Python
This lesson introduces the raster GeoTIFF file format, which is often used to store lidar raster data. You will learn the three key spatial attributes of a raster dataset: coordinate reference system, spatial extent and resolution.
Get Started With GIS in Open Source Python - Geopandas, Rasterio & Matplotlib
There is a suite of powerful open source Python libraries that can be used to work with spatial data. Learn how to use geopandas, rasterio and matplotlib to plot and manipulate spatial data in Python.
Text Editors for the Command Line and Scientific Programming
Text editors can be used to edit code and for commit messages in git. Learn about features to look for in a text editor and how to change your default text editor at the command line.
Set Up Your Conda Earth Analytics Python Environment
Conda environments allow you to easily manage the Python package installations on your computer. Learn how to install a conda environment using a yml file.
How to Access and Use Shell to Set Up a Working Directory
This tutorial walks you through how to access the shell through the terminal, use basic commands in the terminal for file organization, and set up a working directory for the course.
Setup Git, Bash, and Conda on Your Computer
Learn how to install Git, GitBash (a version of command line Bash) and the Miniconda Python distribution on your computer.
Setup Your Earth Analytics Python, Git, Bash Environment On Your Computer
There are several core tools that are required to work with data. These include Shell/Bash, Git/Github and Python. Learn how to set all of these tools up on your computer so you can work with different types of data using open science workflows.
Get NAIP Remote Sensing Data From the Earth Explorer Website
In this lesson you will review how to find and download USDA NAIP imagery from the USGS Earth Explorer website.
Learn to Use NAIP Multiband Remote Sensing Images in Python
Learn how to open up a multi-band raster layer or image stored in .tiff format in Python using Rasterio. Learn how to plot histograms of raster values and how to plot 3 band RGB and color infrared or false color images.
How multispectral imagery is drawn on computers - Additive Color Models
Learn the basics of how additive color models are used to render RGB images in Python.
Introduction to Multispectral Remote Sensing Data in Python
Multispectral remote sensing data can be in different resolutions and formats and often has different bands. Learn about the differences between NAIP, Landsat and MODIS remote sensing data as it is used in Python.
Create interactive leaflet maps using folium in Jupyter Notebook: GIS in Python
Interactive maps allow you to easily explore data. Learn how to create interactive leaflet maps embedded in a Jupyter Notebook using Python and folium.
Customize Map Extents in Python: GIS in Python
When making maps, sometimes you want to zoom in to a specific area in your map. Learn how to adjust the x and y limits of your matplotlib and geopandas map to change the spatial extent that is displayed.
Customize Map Legends and Colors in Python using Matplotlib: GIS in Python
When making maps, you often want to add legends and customize the map colors. Learn how to customize legends and colors in matplotlib maps created using vector data in Python.
Overlay Raster and Vector Spatial Data in A Matplotlib Plot Using Extents in Python
When plotting raster and vector data together, the extent of the plot needs to be defined for the data to overlay with each other correctly. Learn how to define plotting extents for Python Matplotlib Plots.
Customize Matplotlib Raster Maps in Python
Sometimes you want to customize the colorbar and range of values plotted in a raster map. Learn how to create breaks to plot rasters in Python.
Interactive Maps in Python
Folium is a Python package that can be used to create interactive maps in Jupyter Notebook. Learn how to create interactive maps with raster overlays in Python using Folium.
Layer a raster dataset over a hillshade in Python to create a beautiful basemap that represents topography.
A hillshade is a representation of the earth's surface as it would look with shade and shadows from the sun. Learn how to overlay raster data on top of a hillshade in Python.
Plot Spatial Raster Data in Python.
When plotting rasters, you often want to overlay two rasters, add a legend, or make the raster interactive. Learn how to make a map of raster data that has these attributes using Python.
Canopy Height Models, Digital Surface Models & Digital Elevation Models - Work With LiDAR Data in Python
This lesson defines 3 lidar data products: the digital elevation model (DEM), the digital surface model (DSM) and the canopy height model (CHM).
How lidar point clouds are converted to raster data formats
Rasters are gridded data composed of pixels that store values, such as an image or elevation data file. Learn how a lidar data point cloud is converted to a raster format such as a GeoTIFF.
Get to know Lidar (Light Detection and Ranging) Point Cloud Data - Active Remote Sensing
This lesson covers what a lidar point cloud is. You will use the free plas.io point cloud viewer to explore a point cloud.
Introduction to Light Detection and Ranging (Lidar) Remote Sensing Data
This lesson reviews what Lidar remote sensing is, what the lidar instrument measures and discusses the core components of a lidar remote sensing system.
Measure Changes in the Terrain Caused by a Flood Using Lidar Data
A flood event often changes the terrain as water moves sediment and debris across the landscape. Learn how terrain changes are measured using lidar remote sensing data.
Rain: a Driver of the 2013 Colorado Floods
The amount and/or duration of rainfall can impact how severe a flood is. Learn how rainfall is measured and used to understand flood impacts.
About the Stream Discharge Data Used in this Data Story
Learn more about the stream discharge data that is used in this data story.
How the Atmosphere Drives Floods: The 2013 Colorado Floods
Changes in the atmosphere, including how quickly a storm moves, can impact the severity of a flood. Learn more about how atmospheric conditions impact flood events.
An Overview of the 2013 Colorado Floods
The 2013 flood event caused significant damage throughout the state of Colorado, USA. Learn about what caused the 2013 floods in Colorado and also some of the impacts.
Analyze The Sentiment of Tweets From Twitter Data and Tweepy in Python
One way to analyze Twitter data is to analyze attitudes (or sentiment) in the tweet text. Learn how to analyze sentiments in Twitter data using open source Python.
Analyze Co-occurrence and Networks of Words Using Twitter Data and Tweepy in Python
One common way to analyze Twitter data is to identify the co-occurrence and networks of words in Tweets. Learn how to analyze word co-occurrence (i.e. bigrams) and networks of words using Python.
Analyze Word Frequency Counts Using Twitter Data and Tweepy in Python
One common way to analyze Twitter data is to calculate word frequencies to understand how often words are used in tweets on a particular topic. To complete any analysis, you need to first prepare the data. Learn how to clean Twitter data and calculate word frequencies using Python.
Automate Getting Twitter Data in Python Using Tweepy and API Access
You can use the Twitter RESTful API to access tweet data from Twitter. Learn how to use tweepy to download and work with Twitter social media data in Python.
Use Twitter Social Media Data in Python - An Introduction
You can automatically access Twitter social media data using the Twitter API in Python. Learn about the basics of downloading Twitter data using open source Python.
Programmatically Accessing Geospatial Data Using APIs
This lesson walks through the process of retrieving and manipulating surface water data housed in the Colorado Information Warehouse. These data are stored in JSON format with spatial x, y information that support mapping.
Introduction to Working With JSON Data in Open Source Python
This lesson introduces how to work with the JSON data structure in Python, using the JSON and Pandas libraries to create and convert JSON objects.
Introduction to JSON Data in Python
JSON is a powerful text based data format that contains hierarchical data. JSON and GeoJSON are common data formats that are returned when accessing data automatically using an API. Learn more about JSON and GeoJSON data.
Introduction to APIs
APIs allow you to automate accessing and downloading data in your code to support open reproducible science. Learn how to use APIs to download data from the internet using open source Python.
Reproject Raster Data Python
Sometimes you will work with multiple rasters that are not in the same projections, and thus, need to reproject the rasters, so they are in the same coordinate reference system. Learn how to reproject raster data in Python using Rasterio.
Crop Spatial Raster Data With a Shapefile in Python
Sometimes a raster dataset covers a larger spatial extent than is needed for a particular purpose. In these cases, you can crop a raster file to a smaller extent. Learn how to crop raster data using a shapefile and export it as a new raster in open source Python.
Classify and Plot Raster Data in Python
Reclassifying raster data allows you to use a set of defined values to organize pixel values into new bins or categories. Learn how to classify a raster dataset and export it as a new raster in Python.
Subtract One Raster from Another and Export a New GeoTIFF in Open Source Python
Often you need to process two raster datasets together to create a new raster output and then save that output as a new file. Learn how to subtract rasters and create a new GeoTIFF file using open source Python.
Introduction to Raster Data Processing in Open Source Python
You can perform the same raster processing steps in Python that you would in a tool like ArcGIS. Learn how to process spatial raster data using Open Source Python.
Open, Plot and Explore Raster Data with Python
Raster data are gridded data composed of pixels that store values, such as an image or elevation data file. Learn how to open, plot, and explore raster files in Python using Rasterio.
Test Your Skills: Open Raster Data Using RioXarray In Open Source Python
Challenge your skills. Practice opening, cleaning and plotting raster data in Python.
About the Geotiff (.tif) Raster File Format: Raster Data in Python
Metadata describe the key characteristics of a dataset such as a raster. For spatial data, these characteristics include the coordinate reference system (CRS), resolution and spatial extent. Learn about the use of TIF tags or metadata embedded within a GeoTIFF file to explore the metadata programmatically.
Spatial Raster Metadata: CRS, Resolution, and Extent in Python
Raster metadata includes the coordinate reference system (CRS), resolution, and spatial extent. Learn about these metadata and how to access them in Python.
Plot Histograms of Raster Values in Python
Histograms of raster data provide the distribution of pixel values in the dataset. Learn how to explore and plot the distribution of values within a raster using histograms.
Open, Plot and Explore Raster Data with Python and Xarray
Raster data are gridded data composed of pixels that store values, such as an image or elevation data file. Learn how to open, plot, and explore raster files in Python.
What is Raster Data
Rasters are gridded data composed of pixels that store values. Learn more about the structure of raster data and how to use them to store data, such as imagery or elevation values.
Understand EPSG, WKT and Other CRS Definition Styles
Coordinate Reference System (CRS) information is often stored in three key formats, including proj.4, EPSG and WKT. Learn more about the ways that coordinate reference system data are stored including proj4, well known text (wkt) and EPSG codes.
Geographic vs projected coordinate reference systems - GIS in Python
Geographic coordinate systems span the entire globe (e.g. latitude / longitude), while projected coordinate systems are localized to minimize visual distortion in a particular region (e.g. Robinson, UTM, State Plane). Learn more about key differences between projected vs. geographic coordinate reference systems.
GIS in Python: Intro to Coordinate Reference Systems in Python
A coordinate reference system (CRS) defines the translation between a location on the round earth and that same location, on a flattened, 2 dimensional coordinate system. Learn how to explore and reproject data into geographic and projected CRS in Python.
GIS in Python: Introduction to Vector Format Spatial Data - Points, Lines and Polygons
Vector data are composed of discrete geometric locations (x, y values) known as vertices that define the shape of the spatial object. Learn more about the structure of vector data and how to open vector data stored in shapefile format in Python.
Explore Precipitation and Stream Flow Data Using Interactive Plots: The 2013 Colorado Floods
Practice interpreting data on plots that show rainfall (precipitation) and stream flow (discharge) as it changes over time.
Create Data Driven Reports using Jupyter Notebooks | 2013 Colorado Flood Data
Connecting data to analysis and outputs is an important part of open reproducible science. In this lesson you will explore the value of a well documented workflow.
Use Google Earth Time Series Images to View Flood Impacts
Learn how to use the time series feature in Google Earth to view before and after images of a location.
Challenge Yourself
This lesson contains a series of challenges that require using tidyverse functions in R to process data.
Automate Workflows Using Loops in R
When you are programming, it can be easy to copy and paste code that works. However this approach is not efficient. Learn how to create for-loops to process multiple files in R.
Handle Missing Data in R
Learn how to handle missing data in the R programming language.
Use tidyverse group_by and summarise to Manipulate Data in R
Learn how to write pseudocode to plan out your approach to working with data. Then use tidyverse functions including group_by and summarise to implement your plan.
Get Started with Clean Coding in R
Learn...
Learn to Use tidyverse and Clean Code to Work With Data in R
When working with data, you often spend the most time cleaning your data. Learn how to write more efficient code using the tidyverse in R.
Submit a pull request on the GitHub website
Learn how to create and submit a pull request to another repo.
How to fork a repo in GitHub
Learn how to fork a repository using the GitHub website.
Introduction to undoing things in git
Learn how to undo changes in git after they have been added or committed.
First steps with git: clone, add, commit, push
Learn basic git commands, including clone, add, commit, and push.
An introduction to version control
Learn what version control is, and how Git and GitHub are used in a typical version control workflow.
Make Interactive Maps with Leaflet R - GIS in R
In this lesson you learn the steps to create a map in R using ggplot.
Maps in R: R Maps Tutorial Using Ggplot
You can use R as a GIS. Learn how to create a map in R using ggplot in this R maps tutorial.
Sentiment Analysis of Colorado Flood Tweets in R
Learn how to perform a basic sentiment analysis using the tidytext package in R.
Create Maps of Social Media Twitter Tweet Locations Over Time in R
This lesson provides an example of modularizing code in R.
Use Tidytext to Text Mine Social Media - Twitter Data Using the Twitter API from Rtweet in R
This lesson provides an example of modularizing code in R.
Text Mining Twitter Data With TidyText in R
Text mining is used to extract useful information from text - such as Tweets. Learn how to use the Tidytext package in R to analyze twitter data.
Twitter Data in R Using Rtweet: Analyze and Download Twitter Data
You can use the Twitter RESTful API to access data about Twitter users and tweets. Learn how to use rtweet to download and analyze twitter social media data in R.
Work With Twitter Social Media Data in R - An Introduction
This lesson will discuss some of the challenges associated with working with social media data in science. These challenges include working with non standard text, large volumes of data, API limitations, and geolocation issues.
Creating Interactive Spatial Maps in R Using Leaflet
This lesson covers the basics of creating an interactive map using the leaflet API in R. We will import data from the Colorado Information Warehouse using the SODA RESTful API and then create an interactive map that can be published to an HTML formatted file using knitr and rmarkdown.
Programmatically Accessing Geospatial Data Using API's - Working with and Mapping JSON Data from the Colorado Information Warehouse in R
This lesson walks through the process of retrieving and manipulating surface water data housed in the Colorado Information Warehouse. These data are stored in JSON format with spatial x, y information that support mapping.
Understand Namespaces in R - What Package Does Your fromJSON() Function Come From?
This lesson covers namespaces in R and how we can tell R where to get a function from (what code to use) in R.
Programmatically Access Data Using an API in R - The Colorado Information Warehouse
This lesson covers accessing data via the Colorado Information Warehouse SODA API in R.
Introduction to the JSON data structure
This lesson covers the JSON data structure. JSON is a powerful text based format that supports hierarchical data structures. It is the core structure used to create GeoJSON, which is a spatial version of JSON that can be used to create maps. JSON is preferred for use over .csv files for data structures as it has been proven to be more efficient, particularly as data size becomes large.
Access Secure Data Connections Using the RCurl R Package.
This lesson reviews how to use functions within the RCurl package to access data on a secure (https) server in R.
An Example of Creating Modular Code in R - Efficient Scientific Programming
This lesson provides an example of modularizing code in R.
Introduction to APIs
In this module, you learn various ways to access, download and work with data programmatically. These methods include downloading text files directly from a website onto your computer and into R, reading in data stored in text format from a website, into a data.frame in R and finally, accessing subsets of particular data using REST API calls in R.
Use lapply in R Instead of For Loops to Process .csv files - Efficient Coding in R
Learn how to take code in a for loop and convert it to be used in an apply function. Make your R code more efficient and expressive.
If Statements, Functions, and For Loops
Learn how to combine if statements, functions and for loops to process sets of text files.
Create For Loops
Learn how to write a for loop to process a set of .csv format text files in R.
Working with Function Arguments
Learn how to work with function arguments in the R programming language.
Get to Know the Function Environment & Function Arguments in R
This lesson introduces the function environment and documenting functions in R. When you run a function, intermediate variables are not stored in the global environment. This not only saves memory on your computer but also keeps your environment clean, reducing the risk of conflicting variables.
How to Write a Function in R - Automate Your Science
Learn how to write a function in the R programming language.
Write Efficient Scientific Code - the DRY (Don't Repeat Yourself) Principle
This lesson will cover the basic principles of using functions and why they are important.
Work with MODIS Remote Sensing Data in R.
In this lesson you will explore how to import and work with MODIS remote sensing data in raster geotiff format in R. You will cover importing many files using regular expressions and cleaning raster stack layer names for nice plotting.
Calculate and Plot Difference Normalized Burn Ratio (dNBR) from Landsat Remote Sensing Data in R
In this lesson you review how to calculate difference normalized burn ratio using pre and post fire NBR rasters in R. Finally, you will classify the dNBR raster.
Work with the Difference Normalized Burn Index - Using Spectral Remote Sensing to Understand the Impacts of Fire on the Landscape
In this lesson you review the normalized burn ratio (NBR) index which can be used to identify the area and severity of a fire. Specifically you will calculate NBR using Landsat 8 spectral remote sensing data in raster, .tif format.
How to Replace Raster Cell Values with Values from A Different Raster Data Set in R
Often data have missing or bad data values that you need to replace. Learn how to replace missing or bad data values in a raster, with values from another raster in the same pixel location using the cover function in R.
Get Landsat Remote Sensing Data From the Earth Explorer Website
In this lesson you will review how to find and download Landsat imagery from the USGS Earth Explorer website.
Clean Remote Sensing Data in R - Clouds, Shadows & Cloud Masks
In this lesson, you will learn how to deal with clouds when working with spectral remote sensing data. You will learn how to mask clouds from Landsat and MODIS remote sensing data in R using the mask() function. You will also discuss issues associated with cloud cover, particularly as they relate to a research topic.
How to Convert Day of Year to Year, Month, Day in R
Learn how to convert a day of year value to a normal date format in R.
Adjust plot extent in R.
In this lesson you will review how to adjust the extent of a spatial plot in R using the ext() or extent argument and the extent of another layer.
Plot Grid of Spatial Plots in R.
In this lesson you learn to use the par() or parameter settings in R to plot several raster RGB plots in R in a grid.
How to Remove Borders and Add Legends to Spatial Plots in R.
In this lesson you review how to remove those pesky borders from a raster plot using base plot in R. We also cover adding legends to your plot outside of the plot extent.
How to Reuse Functions That You Create In Scripts - Source a Function in R
Learn how to source a function in R. Learn how to import functions that are stored in a separate file into a script or R Markdown file.
Landsat Remote Sensing tif Files in R
In this lesson you will cover the basics of using Landsat 7 and 8 in R. You will learn how to import Landsat data stored in .tif format - where each .tif file represents a single band rather than a stack of bands. Finally you will plot the data using various 3 band combinations including RGB and color-infrared.
Calculate NDVI in R: Remote Sensing Vegetation Index
NDVI is calculated using near infrared and red wavelengths or types of light and is used to measure vegetation greenness or health. Learn how to calculate remote sensing NDVI using multispectral imagery in R.
How Multispectral Imagery is Drawn on Computers - Additive Color Models
In this lesson you will learn the basics of using Landsat 7 and 8 in R. You will learn how to import Landsat data stored in .tif format - where each .tif file represents a single band rather than a stack of bands. Finally you will plot the data using various 3 band combinations including RGB and color-infrared.
How to Open and Work with NAIP Multispectral Imagery in R
In this lesson you learn how to open up a multi-band raster layer or image stored in .tiff format in R. You are introduced to the stack() function in R which can be used to import more than one band into a stack object in R. You also review using plotRGB to plot a multi-band image using RGB, color-infrared or other band combinations.
Introduction to Spatial and Spectral Resolution: Multispectral Imagery
Multispectral imagery can be provided at different resolutions and may contain different bands or types of light. Learn about spectral vs spatial resolution as it relates to spectral data.
Import and Summarize Tree Height Data and Compare it to Lidar Derived Height in R
It is important to compare differences between tree height measurements made by humans on the ground to those estimated using lidar remote sensing data. Learn how to perform this analysis and calculate error or uncertainty in R.
Extract Raster Values Using Vector Boundaries in R
This lesson reviews how to extract pixels from a raster dataset using a vector boundary. You can use the extracted pixels to calculate mean and max tree height for a study area (in this case a field site where tree heights were measured on the ground). Finally you will compare tree heights derived from lidar data to tree heights measured by humans on the ground.
Sources of Error in Lidar and Human Measured Estimates of Tree Height
There are different sources of error when you measure tree height using lidar. Learn about accuracy, precision and the sources of error associated with lidar remote sensing data.
GIS in R: Plot Spatial Data and Create Custom Legends in R
In this lesson you break down the steps required to create a custom legend for spatial data in R. You learn about creating unique symbols per category, customizing colors and placing your legend outside of the plot using the xpd argument combined with x,y placement and margin settings.
GIS in R: How to Reproject Vector Data in Different Coordinate Reference Systems (crs) in R
In this lesson you learn how to reproject a vector dataset using the spTransform() function in R.
GIS in R: Understand EPSG, WKT and other CRS definition styles
This lesson discusses ways that coordinate reference system data are stored including proj4, well known text (wkt) and EPSG codes.
GIS With R: Projected vs Geographic Coordinate Reference Systems
Geographic coordinate reference systems are often used to make maps of the world. Projected coordinate reference systems are used to optimize spatial analysis for a region. Learn about WGS84 and UTM Coordinate Reference Systems as used in R.
Coordinate Reference System and Spatial Projection
Coordinate reference systems are used to convert locations on the earth, which is round, to a two dimensional (flat) map. Learn about the differences between coordinate reference systems.
GIS in R: shp, shx and dbf + prj - The Files That Make up a Shapefile
This lesson reviews the core files that are required to use a shapefile including: shp, shx and dbf. It also covers the .prj format which is used to define the coordinate reference system (CRS) of the data.
GIS in R: Intro to Vector Format Spatial Data - Points, Lines and Polygons
This lesson introduces what vector data are and how to open vector data stored in shapefile format in R.
Clip Raster in R
You can clip a raster to a polygon extent to save processing time and make image sizes smaller. Learn how to crop a raster dataset in R.
Classify a Raster in R.
This lesson presents how to classify a raster dataset and export it as a new raster in R.
Create a Canopy Height Model With Lidar Data
A canopy height model contains height values for trees and can be used to understand landscape change over time. Learn how to use lidar elevation data to calculate canopy height and change in terrain over time.
How to Open and Use Files in Geotiff Format
A GeoTIFF is a standard file format with spatial metadata embedded as tags. Use the raster package in R to open geotiff files and spatial metadata programmatically.
Plot Histograms of Raster Values in R
This lesson introduces the raster geotiff file format - which is often used to store lidar raster data. You learn the 3 key spatial attributes of a raster dataset including Coordinate reference system, spatial extent and resolution.
Introduction to Lidar Raster Data Products
This lesson introduces the raster geotiff file format - which is often used to store lidar raster data. You learn the 3 key spatial attributes of a raster dataset including Coordinate reference system, spatial extent and resolution.
How Lidar Point Clouds Are Converted to Raster Data Formats - Remote Sensing
This lesson reviews how a lidar data point cloud is converted to a raster format such as a geotiff.
Introduction to Lidar Point Cloud Data - Active Remote Sensing
This lesson covers what a lidar point cloud is. We will use the free plas.io point cloud viewer to explore a point cloud.
What is Lidar Data
This lesson reviews what lidar remote sensing is, what the lidar instrument measures and discusses the core components of a lidar remote sensing system.
Layer a Raster Dataset Over a Hillshade Using R Baseplot to Create a Beautiful Basemap That Represents Topography
This lesson covers how to overlay raster data on a hillshade in R using baseplot and layer opacity arguments.
Add a Basemap to an R Markdown Report Using ggmap
This lesson covers creating a basemap with the ggmap package in R. Given some ongoing bugs with ggmap it also covers the map package as a backup!
Create Interactive Plots in R - Time Series & Scatterplots Using plotly and dygraphs
Learn how to create interactive reports using plotly and dygraphs in R for plotting.
Subset & Aggregate Time Series Precipitation Data in R Using mutate(), group_by() and summarise()
This lesson introduces the mutate() and group_by() dplyr functions - which allow you to aggregate or summarize time series data by a particular field - in this case you will aggregate data by day to get daily precipitation totals for Boulder during the 2013 floods.
Homework Challenge: Plot USGS Stream Discharge Data in R
This lesson illustrates what your final stream discharge homework plots should look like for the week. Use all of the skills that you've learned in the previous lessons to complete it.
Summarize Time Series Data by Month or Year Using Tidyverse Pipes in R
Learn how to summarize time series data by day, month or year with Tidyverse pipes in R.
Use Tidyverse Pipes to Subset Time Series Data in R
Learn how to extract and plot data by a range of dates using pipes in R.
Time Series Data: Work with Dates in R
Time series data can be manipulated efficiently in R. Learn how to work with, plot and subset data with dates in R.
Plot Data and Customize Plots with ggplot Plots in R - Earth Analytics - Data Science for Scientists
Learn how to plot data and customize your plots using ggplot in R.
How to Address Missing Values in R
Missing data in R can be caused by issues in data collection and / or processing and presents challenges in data analysis. Learn how to address missing data values in R.
How to Import, Work with and Plot Spreadsheet (Tabular) Data in R
Learn how to import and plot data in R using the read_csv & qplot / ggplot functions.
Understand the Vector Data Type in R and Classes Including Strings, Numbers and Logicals - Data Science for Scientists 101
This tutorial introduces vectors in R. It also introduces the differences between strings, numbers and logical or boolean values (True / False) in R.
Creating Variables in R and the String vs Numeric Data Type or Class - Data Science for Scientists 101
This lesson covers creating variables or objects in R. It also introduces some of the basic data types or classes including strings and numbers. This lesson is designed for someone who has not used R before.
The Syntax of the R Scientific Programming Language - Data Science for Scientists 101
This lesson introduces the basic syntax associated with the R scientific programming language. You will learn about assignment operators (<-), comments and basic functions that are available to use in R to perform basic tasks including head(), qplot() to quickly plot data and others. This lesson is designed for someone who has not used R before. You will work with precipitation and stream discharge data for Boulder County.
Get Help with R - Data Science for Scientists 101
This tutorial covers ways to get help when you are not sure how to perform a task in R.
Write Clean Code - Expressive or Literate Programming in R - Data Science for Scientists 101
This lesson covers the basics of clean coding, meaning that you ensure that the code that you write is easy for someone else to understand. The lesson will briefly cover style guides, consistent spacing and literate object naming best practices.
Use Regression Analysis to Explore Data Relationships & Bad Data
You often want to understand the relationships between two different types of data. Learn how to use regression to determine whether there is a relationship between two variables.
Compare Lidar to Measured Tree Height
To explore uncertainty in remote sensing data, it is helpful to compare ground-based measurements and data that are collected via airborne instruments or satellites. Learn how to create scatter plots that compare values across two datasets.
Extract Raster Values at Point Locations in Python
For many scientific analyses, it is helpful to be able to select raster pixels based on their relationship to a vector dataset (e.g. locations, boundaries). Learn how to extract data from a raster dataset using a vector dataset.
Compare Lidar With Human Measured Tree Heights - Remote Sensing Uncertainty
Uncertainty quantifies the range of values within which a measurement could fall, given a specified level of confidence. Learn about the types of uncertainty that you can expect when working with tree height data derived from both lidar remote sensing and human measurements, and learn about sources of error including systematic vs. random error.
Explore Precipitation and Stream Flow Data Using Interactive Plots: The 2013 Colorado Floods
Practice interpreting data on plots that show rainfall (precipitation) and stream flow (discharge) as it changes over time.
Work With Precipitation Data in R: 2013 Colorado Floods
Learn why documentation is important when analyzing data by evaluating someone else's report on the Colorado floods.
Use Google Earth Time Series Images to Explore Flood Impacts
Learn how to use the time series feature in Google Earth to view before / after images of a location.
R Markdown Resources
Find resources that will help you use the R Markdown format.
Add Citations and Cross References to an R Markdown Report with Bookdown
Learn how to use bookdown in R to add citations and cross references to your data-driven reports.
Add Images to an R Markdown Report
This lesson covers how to use markdown to add images to a report. It also discusses good file management practices associated with saving images within your project directory to avoid losing them if you have to go back and work on the report in the future.
Convert R Markdown to PDF or HTML
Knitr can be used to convert R Markdown files to different formats, including web-friendly formats. Learn how to convert R Markdown to PDF or HTML in RStudio.
How to Use R Markdown Code Chunks
Code chunks separate code from text in an R Markdown (Rmd) file. Learn how to create reports using R Markdown.
Introduction to Markdown Syntax - a Primer
Learn how to write using the markdown syntax in an R Markdown document.
How to Create an R Markdown File in RStudio and the R Markdown File Structure
Learn about the format of an R Markdown file, including a YAML header, R code, and markdown-formatted text.
Introduction to R Markdown & Knitr - Connect Data, Methods and Results
Learn what open science is and how R Markdown can help you document your work.
Create a Project & Working Directory Setup
Learn how to create a well-organized working directory to store your course data.
File Organization 101
Learn key principles for naming and organizing files and folders in a working directory.
Install & Use Packages in R
Learn what a package is in R and how to install packages to work with your data.
Get to Know RStudio
Learn how to work with R using the RStudio application.
Install & Set Up R and RStudio on Your Computer
Learn how to download and install R and RStudio on your computer.
Introduction to Open Reproducible Science Teaching Activity
A hands-on activity where students review a project for readability, organization, and related qualities, and identify key elements that would make it more usable and readily reproducible.
Open Science Lesson Instructor Notes
Instructor notes for the open science lesson.