Chapter 2 Data Import

2.1 Entering data

You can’t use the R language to analyze data until you put your data in R.

Importing data into R is fairly simple. For Stata and Systat, use the foreign package. For SPSS and SAS, I would recommend the Hmisc package for ease and functionality. See the Quick-R section on packages for information on obtaining and installing these packages. Examples of importing data are provided below.

2.2 From Text

The read.csv function accepts parameters that may need to be set for the file to be read correctly (see ?read.csv for more info). Parameters are entered after the file name and separated by commas. read.csv belongs to a family of related functions, each with defaults suited to a particular text format:

  • read.csv for comma-separated values with a period as the decimal separator.

  • read.csv2 for semicolon-separated values with a comma as the decimal separator.

  • read.delim for tab-delimited files with a period as the decimal separator.

  • read.delim2 for tab-delimited files with a comma as the decimal separator.

  • read.fwf for fixed-width files, in which each column occupies a predetermined number of characters.

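Before reading a file, set the working directory so that the file name can be given relative to it:

setwd("~/Desktop/")  # set working directory

setwd("C:/Users/mateu/Desktop/data")  # the same, using a full Windows path

person <- read.csv(file = "data.csv", header = FALSE, col.names = c("age", "height"), sep = ";")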
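The call above can be tried without an existing data.csv. The following is a minimal sketch, assuming a semicolon-separated file with no header row; the temporary path and the sample values are made up for illustration.

```r
# Write a small semicolon-separated file without a header row.
# The path and the values are invented for this example.
path <- tempfile(fileext = ".csv")
writeLines(c("23;170", "35;182", "41;165"), path)

# header = FALSE: the first line is data, not column names.
# col.names supplies the names; sep = ";" overrides the default comma.
person <- read.csv(file = path, header = FALSE,
                   col.names = c("age", "height"), sep = ";")

str(person)
```

Because header = FALSE, all three lines are read as data rows, and the names given in col.names replace the default V1, V2.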

(To practice importing a csv file, try this exercise.)
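The other functions in the read.csv family differ mainly in their default separator and decimal mark. The sketch below reads the same two made-up records in three formats; write_tmp is a hypothetical helper defined here only to keep the example self-contained.

```r
# Helper that writes lines to a temporary file and returns its path.
# (write_tmp is a hypothetical helper, not part of base R.)
write_tmp <- function(lines) {
  path <- tempfile()
  writeLines(lines, path)
  path
}

# Semicolon-separated, comma as decimal separator: read.csv2
eu <- read.csv2(write_tmp(c("name;height", "Ana;1,70", "Rui;1,82")))

# Tab-delimited, period as decimal separator: read.delim
tab <- read.delim(write_tmp(c("name\theight", "Ana\t1.70", "Rui\t1.82")))

# Fixed width: 3 characters for name, 4 for height; no header by default
fw <- read.fwf(write_tmp(c("Ana1.70", "Rui1.82")),
               widths = c(3, 4), col.names = c("name", "height"))

# All three data frames should hold the same numeric heights
eu$height
tab$height
fw$height
```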

2.3 From Excel

One of the best ways to read an Excel file is to export it to a comma-delimited file and import it using the method above. Alternatively, you can use the xlsx package to read Excel files directly. In either case, the first row should contain variable/column names.

Read in the first worksheet from the workbook myexcel.xlsx. First row contains variable names.

library(xlsx)

mydata <- read.xlsx("c:/myexcel.xlsx", 1)

Read in the worksheet named mysheet:

mydata <- read.xlsx("c:/myexcel.xlsx", sheetName = "mysheet")
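The two ways of selecting a worksheet can be compared with a round trip through a temporary workbook. This is only a sketch: it assumes the xlsx package (which depends on Java) is installed, it skips itself otherwise, and the data and sheet name are made up.

```r
# Runs only if the xlsx package can be loaded; otherwise prints a message.
if (requireNamespace("xlsx", quietly = TRUE)) {
  path <- tempfile(fileext = ".xlsx")
  people <- data.frame(age = c(23, 35), height = c(170, 182))

  # Write one worksheet named "mysheet"; the first row holds column names
  xlsx::write.xlsx(people, path, sheetName = "mysheet", row.names = FALSE)

  # Select the worksheet by position, then by name
  by_index <- xlsx::read.xlsx(path, sheetIndex = 1)
  by_name  <- xlsx::read.xlsx(path, sheetName = "mysheet")

  print(by_index)
  print(by_name)
} else {
  message("The xlsx package is not installed; skipping the example.")
}
```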

(To practice, try this exercise on importing an Excel worksheet into R.)

2.4 From SPSS

Save SPSS dataset in transport format (in SPSS):

get file='c:\mydata.sav'.

export outfile='c:\mydata.por'.

In R:

library(Hmisc)

mydata <- spss.get("c:/mydata.por", use.value.labels=TRUE)

  • The last option, use.value.labels=TRUE, converts SPSS value labels to R factors.
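To see what converting value labels to factors means, here is a base R sketch that needs no SPSS file. The codes and labels are hypothetical; with use.value.labels=TRUE, spss.get should produce a comparable factor from the labels stored in the file.

```r
# Numeric codes as they might be stored in an SPSS variable (made-up data)
sex_codes <- c(1, 2, 2, 1)

# Attach the value labels to the codes (hypothetical labels)
sex <- factor(sex_codes, levels = c(1, 2), labels = c("male", "female"))

levels(sex)       # the labels become the factor levels
table(sex)        # counts are reported by label, not by code
as.integer(sex)   # position of each level (here equal to the original codes)
```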

(To practice importing SPSS data with the foreign package, try this exercise.)

2.5 From SAS

Save SAS dataset in transport format (in SAS):

libname out xport 'c:/mydata.xpt';

data out.mydata;

set sasuser.mydata;

run;

In R:

library(Hmisc)

mydata <- sasxport.get("c:/mydata.xpt")

2.6 From Stata

Input Stata file:

library(foreign)

mydata <- read.dta("c:/mydata.dta")
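If no Stata file is at hand, read.dta can be tried with a round trip through write.dta, which is also part of the foreign package. This is a minimal sketch with made-up data and a temporary file.

```r
library(foreign)  # recommended package, normally installed with R

# Made-up data written to a temporary .dta file
people <- data.frame(age = c(23, 35, 41), height = c(170, 182, 165))
path <- tempfile(fileext = ".dta")
write.dta(people, path)

# Read it back the same way as in the example above
mydata <- read.dta(path)
str(mydata)
```

Note that read.dta reads files in the binary format of Stata versions 5 to 12; files saved in the format of later Stata releases may not be readable this way.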

(To practice importing Stata data with the foreign package, try this exercise.)

2.7 From Systat

Input Systat file:

library(foreign)

mydata <- read.systat("c:/mydata.syd")  # Systat data files end in .sys or .syd
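The importers covered in this chapter can be gathered behind a single function that picks one by file extension. This is only a sketch: import_by_extension is a hypothetical helper, not part of any package, and the mapping from extension to importer is an assumption based on the examples above. Each package is needed only when its branch is actually used.

```r
# import_by_extension is a hypothetical helper, not part of any package.
import_by_extension <- function(path, ...) {
  ext <- tolower(tools::file_ext(path))
  switch(ext,
    csv  = read.csv(path, ...),
    xlsx = xlsx::read.xlsx(path, 1, ...),          # first worksheet
    por  = Hmisc::spss.get(path, use.value.labels = TRUE, ...),
    xpt  = Hmisc::sasxport.get(path, ...),
    dta  = foreign::read.dta(path, ...),
    sys  = ,                                       # falls through to syd
    syd  = foreign::read.systat(path, ...),
    stop("No importer known for extension: ", ext)
  )
}

# Usage with a temporary CSV file (made-up data)
path <- tempfile(fileext = ".csv")
writeLines(c("age,height", "23,170", "35,182"), path)
mydata <- import_by_extension(path)
str(mydata)
```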