## M19-513

2015

July 6-8, in the Doll & Hill Room, Taylor Avenue Building.

1-credit course for graduate students, postdoctoral fellows, and residents.

- Instructional Staff:

Jeff Gill, Instructor/Course Master, email

Jung Ae Lee, Instructor, email

**Description**: This is a short 1-credit primer that introduces the R Statistical Environment to new users. R is "a freely available language and environment for statistical computing and graphics which provides a wide variety of statistical and graphical techniques: linear and nonlinear modelling, statistical tests, time series analysis, classification, clustering, etc." The goal is to give you a set of tools to perform sophisticated statistical analysis in medicine, biology, or epidemiology.

**Competencies**: At the conclusion of this primer, participants will be able to manipulate and analyze data, write basic models, understand the R environment for using packages, and create standard or customized graphics.

**Prerequisite Details**: This primer assumes a knowledge of basic statistics as taught in a first-semester undergraduate or graduate sequence. Topics should include: probability, cross-tabulation, basic statistical summaries, and linear regression in either scalar or matrix form.

**Grading**:

- Attendance/Participation: 20%
- Data Assignment 1: 40%
- Data Assignment 2: 40%

**Datasets and Assignments**:

- Assignment 1:
  - Read the description and download the Hemodialysis Data.
  - Describe each variable numerically and graphically.
  - Find an interesting relationship between two dichotomous variables and describe it with a 2 x 2 table.

- Colon Cancer for Tuesday labwork.
- Assignment 2:

**Available Reading**: The following high-quality monographs are freely available online.

- An Introduction to R
- Using R for Data Analysis and Graphics, J.H. Maindonald
- simpleR - Using R for Introductory Statistics, John Verzani
- The R Guide
- Analysis of Epidemiological Data Using R and Epicalc, Virasakdi Chongsuvivatwong
- Statistics Using R with Biological Examples, Kim Seefeld and Ernst Linder
- An Introduction to R: Software for Statistical Modeling & Computing, P. Kuhnert & B. Venables

- Course Outline:

- Module 1, Monday, July 6, 2015, 8:30-12, Jeff Gill

- Downloading and Installing R
- Downloading and Installing R Packages
- Setting Up Help
- Basic Syntax (comments, naming conventions, etc.)
- Data Types and Data Structures
- Basic Operations
- Data Import and Export
- Quitting R and Saving R Objects
- Basic Tabular Analysis
- slides for introduction (do not print).

- Module 2, Monday, July 6, 2015, 1:30-5, Jeff Gill

- Basic Plotting Commands
- Plotting Categorical Data
- Basic Two-Dimensional Plotting Commands
- Combining Different Graphs
- Important Ways To Export Your Graphs
- Setting Up The Graphics Window
- Following Trends
- Illustrating The Law of Large Numbers
- Illustrating The Central Limit Theorem
- ROC Curves
- Graphing Networks
- slides for plotting (do not print).

- Module 3, Tuesday, July 7, 2015, 8:30-12, Jung Ae Lee

- Data Manipulation
- Univariate Data
- Multivariate Data
- Regression Analysis

- Module 4, Tuesday, July 7, 2015, 1:30-5, Jung Ae Lee

- ANOVA
- Generalized Linear Models
- Survival Models

- Module 5, Wednesday, July 8, 2015, 8:30-12, Jeff Gill

- Defining Functions
- Multiple Arguments
- Naming Your Functions
- Loops In Functions
- Termination
- Counting Rules and Permutations

- Module 6, Wednesday, July 8, 2015, 1:30-5, Jeff Gill

- Sampling
- Generating Samples in R
- Random Imputation
- Monte Carlo Introduction
- Basic Monte Carlo Integration
- Rejection Sampling
- Bootstrapping for Standard Errors
- slides for functions (do not print).
- slides for sampling (do not print).

Helpful Websites: