# Introduction to Statistical Analysis Using R

Nick, Caroline, Tanya

## What is R?

- R is a programming language for data analysis and graphics.
- All information about R can be found at http://www.R-project.org
- The R system contains two major components:
  1. The base system, which contains the R language software and the high-priority add-on packages listed on pg. 3
  2. User-contributed add-on packages

## Who uses R?

- Scientists in all fields, especially those working in developing countries.
- It allows universal free access to state-of-the-art tools for statistical data analysis.

- It is most widely used for teaching statistics to undergraduates and graduates, because students can use it free of cost.

## Installing the R Base System

1. Go to http://CRAN.R-project.org
2. Choose your operating system from the list (Linux, MacOS X, or Windows).
3. Click on Base (the choices are Base or Contrib).
4. Click on R-2.6.1-win32.exe (the Windows installer).
5. Save R.

## Getting Started

- Changing the prompt (pg. 3)
- Example using R as a pocket calculator (pg. 3)
- Storing vs. printing
- R is not space sensitive, but it is case sensitive.
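The Getting Started points above can be tried out with a few lines at the prompt. This is only a minimal sketch: the object names `x`, `y`, and `X` and the numbers are illustrative choices, not taken from the slides.

```r
# Change the prompt from "> " to "R> " (only visible in an interactive session)
options(prompt = "R> ")

# Pocket calculator: an expression typed at the prompt is evaluated and printed
sqrt(25) + 2

# Storing: assignment keeps the result in an object and prints nothing
x <- sqrt(25) + 2

# Printing: typing the object's name (or calling print) shows the stored value
x
print(x)

# Spacing between tokens does not matter: both lines store the same value
y<-3*4
y <- 3 * 4

# Case matters: x and X are two different objects
X <- 100
x == X
```

One caveat on spacing: the assignment arrow `<-` has to be typed without a space inside it, because `x < -3` is read as a comparison of `x` with -3.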

## Getting Help in R

- The help system is a collection of manual pages describing each function and data set that comes with R.
- A manual page is shown when the name of the function we would like help for is supplied to the `help` function.
- Ex. `help("mean")`, `help(mean)`, or `?mean`

## Installing Add-on Packages

- All packages are available at http://CRAN.R-project.org/src/contrib/PACKAGES.html
- Pick a package from the list and download it.

To install an add-on package:

1. `install.packages("package name")`
2. `library(package name)`

## Forbes2000 Example

- Go to http://CRAN.R-project.org/src/contrib/PACKAGES.html and select HSAUR from the list.
- Choose the file that pertains to your computer, e.g. the Windows binary HSAUR 1.2-1.zip.
- Save it to the desktop.
- Find the Forbes2000 list in the rawdata folder.
- Install and load the package in R: `install.packages("HSAUR")`, then `library(HSAUR)`.

## Working with Data Sets

Ex. the Forbes 2000 list

- Vector: the elementary structure for data handling in R; a set of simple elements, all being objects of the same class. Ex. the first 3 companies in Forbes: `Forbes2000[, "name"][1:3]`
- Variable names (headings): `names(Forbes2000)`
- Structure of the data set (useful for large data sets): `str(Forbes2000)`
- Dimensions: `dim(Forbes2000)`, `nrow(Forbes2000)`, `ncol(Forbes2000)`
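The inspection functions above can be sketched on a small example. So that it runs without downloading anything, the block below uses `forbes_demo`, a hypothetical stand-in data frame with made-up values (it is not the real Forbes 2000 list); the commented lines show how the real data would be loaded, assuming the HSAUR package can be installed from CRAN.

```r
# With HSAUR available, the real data set would be loaded like this
# (commented out because it needs the package and a network connection):
# install.packages("HSAUR")
# library(HSAUR)
# data("Forbes2000", package = "HSAUR")

# Hypothetical stand-in with made-up values, NOT the real Forbes 2000 list
forbes_demo <- data.frame(
  name   = c("Company A", "Company B", "Company C", "Company D"),
  sales  = c(10.5, 20.0, 7.25, 14.0),
  assets = c(100, 250, 80, 120),
  stringsAsFactors = FALSE
)

# A column taken from a data frame is a vector; [1:3] keeps its first three elements
first_three <- forbes_demo[, "name"][1:3]
print(first_three)

# Variable names (column headings)
print(names(forbes_demo))

# Compact description of each column: its class and first few values
str(forbes_demo)

# Dimensions: rows and columns together, then each count on its own
print(dim(forbes_demo))
print(nrow(forbes_demo))
print(ncol(forbes_demo))
```

With the real data, replacing `forbes_demo` by `Forbes2000` gives the calls shown on the slides.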

## Simple Summary Statistics

- Mean: `mean(Forbes2000[, "sales"])`
- Median: `median(Forbes2000[, "assets"])`
- Range: `range(Forbes2000[, "sales"])`

## Importing Data Not Part of a Package

When is this used?

- Most data sets are not part of a downloadable package.
- Most people need to import their own data sets into R.

Example: airport data (download to the Desktop)

- In R: File > Change Dir > Desktop > OK
- `name_given <- read.table("airport.csv", header = TRUE, sep = ",", row.names = 1)`, where `name_given` is whatever name you choose for the data set

## Making a Graph

Graph of airport Rank vs. Shop:

`plot(Rank ~ Shop, data = name_given, pch = "O")`

## Homework

- Change the prompt from `>` to `R>`.

- Import the airport data set from Excel.
- Print the data set in R.
- Find the dimensions, the number of columns, and the number of rows in the data set.
- Find the structure of the data set.
- Find the median of the category Shop.
- Find the mean of Domestic.

## Contact Info

- Tanya: [email protected]
- Caroline: [email protected]
- Nick: [email protected]
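## Recap Sketch: Import, Summarize, Plot

As a recap of the import, summary statistics, and graph slides, here is a minimal sketch of the whole workflow. The real airport.csv is not reproduced here, so the block writes a small hypothetical file with made-up values; the file name `airport_demo.csv`, the airport names, and all numbers are assumptions for illustration only. The column names Rank, Shop, and Domestic follow the ones mentioned in the slides.

```r
# Hypothetical airport data with made-up values (NOT the real airport.csv)
csv_lines <- c(
  "Airport,Rank,Shop,Domestic",
  "Airport A,1,12,30",
  "Airport B,2,8,45",
  "Airport C,3,15,20",
  "Airport D,4,5,65"
)
csv_path <- file.path(tempdir(), "airport_demo.csv")
writeLines(csv_lines, csv_path)

# header = TRUE: the first line holds the column names
# sep = ","    : fields are separated by commas
# row.names = 1: the first column (airport names) becomes the row labels
airport <- read.table(csv_path, header = TRUE, sep = ",", row.names = 1)

# Print the data set and its dimensions
print(airport)
print(dim(airport))

# Simple summary statistics on single columns
shop_median   <- median(airport[, "Shop"])
domestic_mean <- mean(airport[, "Domestic"])
rank_range    <- range(airport[, "Rank"])
print(shop_median)
print(domestic_mean)
print(rank_range)

# Scatterplot of Rank against Shop, drawn with the letter "O" as plotting symbol.
# When run as a script there is no plot window, so draw to a null PDF device.
if (!interactive()) pdf(file = NULL)
plot(Rank ~ Shop, data = airport, pch = "O")
if (!interactive()) invisible(dev.off())
```

For a file saved on the Desktop, the working directory would first be changed there (File > Change Dir) so that `read.table("airport.csv", ...)` can find it by name alone.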

