Working with files

Importing data into R

One of the most important things we need to be able to do in R is import existing data, whether it comes from .txt files, .csv files, or even .xls files (Excel). If we cannot import data into R, then we cannot analyze it.

Working with Notepad

Step 1: Open Notepad.
Step 2: Enter the data as shown below, with a header row first and values separated only by commas (no spaces).

[Screenshot: the rain data typed into Notepad, with the column headers month, rain_mm, flow_cms]

Step 3: Save the file as ‘rain.txt’ on your Desktop.

Now it is time to import this data into R. We have two options: importing through the RStudio environment (the easy way), or importing with standard R functions.

Import through RStudio

Step 4: Click the ‘Import Dataset’ button, then click ‘From Local File’

[Screenshots: the ‘Import Dataset’ button and the ‘From Local File’ option]

Step 5: Navigate to the ‘rain.txt’ file on your Desktop and click ‘Open’. The next dialog box shows the values contained in the file, along with the import options. A few things to notice: ‘Name’ at the top has been set to “rain”, which will become the name of the variable that holds our data in R. The ‘Heading’ radio button has already been set to ‘Yes’ because RStudio has recognized our column headers (month, rain_mm, flow_cms). The ‘Separator’ has also been set to ‘Comma’, since we made a comma-delimited text file. All you have to do is click ‘Import’.

[Screenshot: the import dialog box with the ‘Name’, ‘Heading’, and ‘Separator’ options]

Step 6: RStudio automatically opens the ‘rain’ dataset as a table in a new tab. RStudio also shows the snippet of code it used to import the data. You can copy that code and paste it into your R script file for future use.

[Screenshots: the ‘rain’ table in a new tab, and the import code shown by RStudio]

Import data using functions

There are lots of functions that can be used to import data into R: read.table, read.csv, etc.

  • read.csv(): for reading “comma-separated value” files (“.csv”).
  • read.csv2(): variant used in countries that use a comma “,” as the decimal point and a semicolon “;” as the field separator.
  • read.delim(): for reading “tab-separated value” files (“.txt”). By default, a point (“.”) is used as the decimal point.
  • read.delim2(): for reading “tab-separated value” files (“.txt”). By default, a comma (“,”) is used as the decimal point.
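To make the differences between these four readers concrete, here is a small sketch. The values are made up for illustration, and the `text =` argument is used in place of a file name so the example runs without any files on disk.

```r
# Hypothetical toy values, used only to show how each reader parses its format.
csv_text  <- "month,rain_mm\nJan,10.5\nFeb,20.25"  # comma separator, "." decimal
csv2_text <- "month;rain_mm\nJan;10,5\nFeb;20,25"  # semicolon separator, "," decimal

# Build the tab-separated versions by swapping the field separator for a tab.
tsv_text  <- gsub(",", "\t", csv_text,  fixed = TRUE)  # tab separator, "." decimal
tsv2_text <- gsub(";", "\t", csv2_text, fixed = TRUE)  # tab separator, "," decimal

from_csv    <- read.csv(text = csv_text)
from_csv2   <- read.csv2(text = csv2_text)
from_delim  <- read.delim(text = tsv_text)
from_delim2 <- read.delim2(text = tsv2_text)

# All four calls should produce the same two-column data frame.
str(from_csv)
```

Each function is read.table with different defaults for `sep` and `dec`, so picking the one that matches your file saves you from setting those arguments by hand.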

We will use read.table in this example.

To understand how this function works, let’s open up the R help by typing ?read.table.

[Screenshot: typing ?read.table in the R console]

That should open a help file in the ‘Help’ pane at the lower right if you are using RStudio. The main arguments of this function we need to set are ‘file’, ‘header’, and ‘sep’.
-The ‘file’ argument is the name and path of the file we want to import.
-The ‘header’ argument is set to TRUE or FALSE depending on whether the first line of the file contains column names.
-The ‘sep’ argument is the separator used within the file (in our case, a comma).
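The effect of ‘header’ and ‘sep’ is easiest to see by changing them on the same input. The sketch below uses made-up values in the same layout as our file, and passes them through the `text =` argument instead of ‘file’ so it runs without a file on disk.

```r
# Hypothetical two-row sample in the tutorial's comma-delimited layout.
sample_text <- "month,rain_mm,flow_cms\nJan,10,1.5\nFeb,20,2.5"

# header = TRUE: the first line supplies the column names.
with_header <- read.table(text = sample_text, header = TRUE, sep = ",")

# header = FALSE: the first line is treated as data, and R names the
# columns V1, V2, V3.
no_header <- read.table(text = sample_text, header = FALSE, sep = ",")

# Default sep (whitespace): the commas are not split, so each line
# ends up in a single column.
wrong_sep <- read.table(text = sample_text, header = TRUE)

names(with_header)
names(no_header)
ncol(wrong_sep)
```

If an import gives you one wide column or column names like V1, these two arguments are the first things to check.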

So, here is how the code should look:

[Screenshot: the read.table call that imports rain.txt]
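A minimal sketch of such a call is given here; it is not necessarily character-for-character the code in the screenshot. To keep it runnable anywhere, it first writes a stand-in rain.txt to a temporary folder with made-up values. The variable name `rain_path`, the sample values, and the Windows-style path in the comment are all assumptions.

```r
# Stand-in for the Desktop file so the sketch runs on any machine.
# On your computer, point rain_path at the real file instead, for example
# (assumed Windows layout): "C:/Users/YOUR-NAME/Desktop/rain.txt"
rain_path <- file.path(tempdir(), "rain.txt")

# Hypothetical placeholder values, NOT the tutorial's actual data.
writeLines(c("month,rain_mm,flow_cms",
             "Jan,10,1.5",
             "Feb,20,2.5"),
           rain_path)

# file:   where the file lives
# header: TRUE because the first line holds the column names
# sep:    "," because the file is comma-delimited
rain <- read.table(file = rain_path, header = TRUE, sep = ",")

# Inspect the imported data frame.
str(rain)
```

Using forward slashes in the path works on Windows as well and avoids having to escape backslashes.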

Two things to note in the code above:
-The “YOUR-NAME” part of the path depends on your computer login settings. It might be something like ‘Tim’, ‘Jane’, ‘PeterC’, etc.
-I have stored this data in R under the name “rain”, using the rain <- bit of code.

Here is how the data looks in R.

[Screenshot: the ‘rain’ data printed in the R console]

Summary:

  • Import a local tab-separated .txt file: read.delim(file.choose())

  • Import a local .csv file: read.csv(file.choose())

  • Import a file from the internet: read.delim(url) if it is a tab-separated .txt file, or read.csv(url) if it is a .csv file.
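The summary can be folded into one small helper that picks the reader from the file extension. This is only a sketch: `read_by_extension` is a hypothetical name, not a built-in R function, and the files it reads below are temporary files filled with made-up values.

```r
# Hypothetical helper: choose a reader based on the file extension.
# Extra arguments (such as sep) are passed on to the chosen reader.
read_by_extension <- function(path, ...) {
  ext <- tolower(tools::file_ext(path))
  switch(ext,
         csv = read.csv(path, ...),
         txt = read.delim(path, ...),
         stop("Unsupported file extension: ", ext))
}

# Temporary demo files with made-up values.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("month,rain_mm", "Jan,10", "Feb,20"), csv_path)

txt_path <- tempfile(fileext = ".txt")
writeLines(paste(c("month", "Jan", "Feb"),
                 c("rain_mm", "10", "20"),
                 sep = "\t"),
           txt_path)

from_csv <- read_by_extension(csv_path)
from_txt <- read_by_extension(txt_path)

# A comma-delimited .txt file (like rain.txt above) needs sep = ",",
# because read.delim() expects tabs by default.
comma_txt_path <- tempfile(fileext = ".txt")
writeLines(c("month,rain_mm", "Jan,10", "Feb,20"), comma_txt_path)
from_comma_txt <- read_by_extension(comma_txt_path, sep = ",")
```

In an interactive session, the same helper could be called as read_by_extension(file.choose()) to pick the file from a dialog box.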

Another example of data:

Excel file