Lesson 4. File Organization 101


In the previous lessons, you set up R and RStudio. The last step of your setup is to create your working directory. A working directory is an organized space (a directory, or folder) on your computer where you keep your data, scripts, and outputs. It is important to think about how you organize that directory, both to make your own future life easier (so you can find things) and to make it easier to collaborate with other people.
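As a minimal sketch of what a working directory means in practice, base R's `getwd()` reports the directory R is currently working in, and `file.path()` builds paths relative to it. The folder and file names used here (`data`, `rainfall.csv`) are hypothetical and appear only for illustration; they are not files from this lesson.

```r
# Report the directory that R currently treats as the working directory.
current_dir <- getwd()
print(current_dir)

# Build a path relative to the working directory.
# "data" and "rainfall.csv" are hypothetical names used for illustration.
data_path <- file.path("data", "rainfall.csv")
print(data_path)

# R resolves a relative path like data_path against the working directory,
# so the full location of the file would be:
full_path <- file.path(current_dir, data_path)
print(full_path)
```

Because the path is relative, the same script should keep working if the whole project directory is moved or shared, as long as R's working directory is set to the project directory.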

Set Up Your Project

Project organization is integral to efficient research. A well-organized project structure makes it easier for you to find the components of your project and easier for the people you work with to understand and find data, code, and results. In this tutorial, you will create a well-organized working directory.

Learning Objectives

At the end of this activity, you will:

  • Be able to describe the key characteristics of a well structured project.
  • Be able to summarize in 1-3 sentences why good project structure can make your work more efficient and make it easier to collaborate with colleagues.
  • Be able to explain what a working directory is.

What You Need

You will need the most current version of R and, preferably, RStudio loaded on your computer to complete this tutorial.

Characteristics of a Well-structured Project and Working Directory

Please note that in this lesson, you will be using your project directory as your working directory. Thus, these terms will be used interchangeably throughout.

1. Organization - Files & Directories

When it comes to naming the files and folders that make up your project, the more self-explanatory, the better.

A well structured project uses directory (folder) names that describe the contents of the directory. Source: Jenny Bryan, Reproducible Science Curriculum.

A well structured project directory should:

  • Utilize a naming convention that is:
    • Human readable - use directory names that are easy to understand.
    • Machine readable - avoid special characters and spaces.
    • Supportive of sorting - if you have a list of input files, it’s nice to be able to sort them to quickly see what’s there and find what you need.
  • Preserve raw data so it’s not modified. You’ll worry about this later.
  • Have easy-to-read directory names that identify the components of the project (e.g. code, data, outputs, figures).
Example of a well-organized project directory. Source: Jenny Bryan, Reproducible Science Curriculum.
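One way to build a directory structure like this from R is sketched below, using base R's `dir.create()`. The project name `earth-analytics` and the four subdirectory names are assumptions chosen for illustration, and the sketch works inside a temporary directory so that it does not touch any real files.

```r
# Create a small project skeleton inside a temporary directory.
# The project and folder names here are hypothetical examples.
project_dir <- file.path(tempdir(), "earth-analytics")
sub_dirs <- c("code", "data", "outputs", "figures")

for (sub_dir in sub_dirs) {
  # recursive = TRUE also creates project_dir itself if it is missing;
  # showWarnings = FALSE keeps reruns quiet when a folder already exists.
  dir.create(file.path(project_dir, sub_dir),
             recursive = TRUE, showWarnings = FALSE)
}

# List the folders that now exist directly inside the project directory.
created <- list.dirs(project_dir, full.names = FALSE, recursive = FALSE)
print(sort(created))
```

To create the folders somewhere permanent, you could replace `tempdir()` with a location of your choice. Each folder name describes what it holds, which is the characteristic the list above asks for.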

Which Filenames are Most Self-Explanatory?

Your goal when structuring a project directory is to use a naming convention that someone who is not familiar with your project can quickly understand. Case in point, have a look at the graphic below. Which list of file names is the most self-explanatory: the one on the left or the one on the right?

Compare the list of file names on the left to the list on the right. Which ones are easier to understand quickly? Source: Jenny Bryan, Reproducible Science Curriculum.
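The naming guidelines in this lesson can also be checked in R. The sketch below uses made-up file names and one possible definition of "machine readable" (only letters, digits, underscores, hyphens, and periods); both the names and the pattern are assumptions for illustration, not a standard.

```r
# Hypothetical file names: the first three follow the naming convention,
# the last two contain spaces or special characters.
file_names <- c(
  "2016-06-01_boulder_precip.csv",
  "2016-05-01_boulder_precip.csv",
  "2016-07-01_boulder_precip.csv",
  "my data (final).csv",
  "precip#2.csv"
)

# One possible test for "machine readable": the whole name consists only of
# letters, digits, underscores, hyphens, and periods.
is_machine_readable <- grepl("^[A-Za-z0-9._-]+$", file_names)
print(data.frame(file_names, is_machine_readable))

# Names that start with a YYYY-MM-DD date sort chronologically
# when sorted as plain text.
good_names <- file_names[is_machine_readable]
print(sort(good_names))
```

Putting the date first in year-month-day order is what makes these names supportive of sorting: a plain alphabetical sort is also a chronological sort.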

Consider the structure of your project as you build the project or working directory for your earth analytics tutorials in the next lesson.
