Israel Arevalo 2022-09-23
This guide was created to assist a new user of R in importing a dataset.
R is a powerful tool that allows the user to import a variety of data
types (i.e., .csv, .sav, .dat, and more). To start, we want to download
a package called haven
by running the following code.
Note
While R supports reading filetypes such as .csv, not all data comes in this format. Therefore, we will be installing thehaven
package to allow us to import various different file types
install.packages('haven')
Now that we’ve downloaded the haven
package, let’s load it into our
environment. We will do that by running the library
function. Notice
that now that we have installed our package, we don’t need to wrap our
package’s name with ” ” or ’ ’. However, doing so will not yield an
error.
library(haven)
The Haven package will allow you to import a variety of data types into
your R environment. In order to read
these datasets, we will be using
the read_DATATYPE
function within the Haven package. You can find the
full documentation for the Haven library
here. The
Haven package allows the user to read .dta, .sas, .spss, and .xpt files.
If you are importing a natively supported datatype such as .csv
,
click here to skip the
following sections and navigate to the (Importing CSV files into R)
section of this tutorial.
Several assumptions are made prior to this process.
- You have installed R on your computer.
- You have installed an Integrated Development Environment (IDE) like RStudio on your computer.
- You have a dataset that you can import into RStudio.
Now that we have installed the haven
package, let’s start importing
our data. Importing your data is the same, regardless of whether you are
using a MacOS, Windows, or Linux; however, you may come across issues
when running your code on different operating systems. This is due to
the fact that the paths (even though they may be in the same folder) may
vary slightly due to the nature of the operating system. This can be a
problem if you house your dataset in a different folder than the one you
created your Rmarkdown/Rscript document. To avoid this issue, a common
practice is to house your dataset and your R document within the same
folder. However, this may not always be possible. A workaround this
issue is to create a read_DATATYPE
command for each environment you
will be working on and comment out #
the one you are not using at the
time.
Now, let’s see how importing a file looks like. In the example below, we
are importing a dataset that is located within the same folder as our R
file and calling it df
.
df <- read_spss(mylovelydata.sav)
Now let’s look at how our command would look like if our data is located
in a different path than our R file. Notice how the path now needs to be
wrapped using a "
.
df <- read_spss("E:\Dropbox\Research\Data\mylovelydata.sav")
Here’s a list of the read functions that come within the haven
package
to help you import your dataset into R.
# Importing SPSS Data Files
df <- read_spss(mylovelydata.sav)
# Importing Stata DTA Files
df <- read_dat(mylovelydata.dta)
# Importing SAS Data
df <- read_sas(mylovelydata.sas7bdat)
Importing a .csv
file into R is very similar to importing a file using
the haven
package; however, instead of using the read_DATATYPE
function, we will be using read.csv
function from native R. Let’s view
this command by importing a dataset called mylovelydata.csv
into a
dataframe called df
.
# Importing CSV File
df <- read.csv("E:\Dropbox\Research\Data\mylovelydata.csv")
Lastly, we will review importing Excel files into R. To do this, we will
need to download a package called xlsx
that will allow us to read
.xlsx
files. Below is an example of how we would install and load the
package, read the data, and call it ‘df’.
install.packages('xlsx')
library(xlsx)
df <- read.xlsx("E:\Dropbox\Research\Data\mylovelydata.xlsx")
While short, I hope this guide provided you with the basics on importing your data into the R environment. For further details, alternative methods, and documentation on importing data into R, please visit RStudio’s guide on this topic.