Preface
Why Statistics with R?
Philosophy
What is in this handbook?
Resources
About me
Four basic ingredients
0.1
R
0.2
RStudio
0.2.1
The RStudio IDE
0.2.2
Install packages
0.2.3
RStudio Projects
0.3
Git & GitHub
0.4
Resources
1
Data Import
1.1
Entering data
1.2
From Text
1.3
From Excel
1.4
From SPSS
1.5
From SAS
1.6
From Stata
1.7
From Systat
1.8
Data from R packages
2
R-Basics
2.1
Help
2.2
Data structures
2.2.1
Vectors
2.2.2
Sequences
2.2.3
Factors
2.2.4
Data frames
2.2.5
Tibbles
2.2.6
Matrix
2.2.7
List
2.2.8
Array
2.3
Dates
2.3.1
Date Conversion
2.3.2
Date to Character
2.4
Piping
2.5
Base pipes
2.6
Tidy pipes
3
R-Markdown
3.1
Installation
3.2
Resources
3.3
PowerPoint
I Data Preprocessing
4
Data Manipulation
4.1
Tidy data
4.2
Tutorial
5
Data Wrangling
5.1
Wrangling Tutorial
5.2
Wrangling Tutorial 2
5.3
Data Manipulations
6
Missing Values
6.1
Deleting NAs
6.2
Multiple Imputations
6.3
NAs tutorial
7
Outliers
7.1
Outliers
7.1.1
Detection by plots
7.1.2
Using statistics
7.1.3
Using MAD
7.1.4
Interquartile Range (IQR)
7.1.5
Grubbs’ Test
7.1.6
Tools in R
7.2
Leverage
7.3
Influential
II Data Visualization
Introduction
ggplot2
7.3.1
Syntax
8
Aesthetic Mappings
8.1
Aesthetics
8.2
Coordinate systems
8.3
Color scales
8.4
Figure design
8.5
Right order
9
Visualizing Amounts
10
Visualizing Distributions
10.1
Histograms
10.2
Boxplots
11
Visualizing Proportions
12
Visualizing Trends
III Descriptive Statistics
Introduction
13
Data Tabulation
13.1
Frequency Tables
13.1.1
Tables in R
13.2
Cross-tabulations
13.2.1
Cross-tabs in R
13.3
Kable package
13.4
Tutorial
14
Univariate Analysis
14.1
Measurement Scales
14.2
Central Tendency
14.2.1
Arithmetic mean
14.2.2
Median
14.2.3
Mode
14.2.4
Quantiles
14.3
Dispersion
14.3.1
Range
14.3.2
Interquartile range
14.3.3
Variance
14.3.4
Standard deviation
14.3.5
% Variability
14.4
Chebyshev’s rule
14.5
Empirical rule
14.6
Method of moments
14.7
Skewness
14.7.1
Skewness risk
14.8
Kurtosis
14.8.1
Kurtosis risk
14.9
Robust Statistics
14.9.1
Trimmed mean
14.9.2
Winsorized mean
14.9.3
Trimmed sd
14.9.4
MAD
14.9.5
IQR deviation
14.10
Summary reports
14.11
Tutorial
15
Bivariate Analysis
15.1
Spurious correlations
15.2
Bivariate data
15.3
Quantitative pairs
15.3.1
Scatterplots
15.3.2
Linear correlation
15.3.3
Partial correlations
15.3.4
Part correlation
15.4
Mixed scales
15.4.1
Dotplots
15.4.2
Boxplots
15.4.3
Rank correlations
15.4.4
Point-biserial correlation
15.5
Nonlinear correlation
15.5.1
eta
15.6
Correlation matrix
15.7
Qualitative pairs
15.7.1
Contingency table
15.7.2
Chi-square statistic
15.7.3
Mosaic plots
15.7.4
Pie charts
15.7.5
Barplots
15.7.6
Contingency correlations
15.8
Recreating data
IV Regression Analysis
Introduction
16
Simple Regression
16.1
Linear regression
16.2
Sample data
16.2.1
Univariate analysis
16.2.2
Scatterplots
16.2.3
Batter up
16.3
Sum of squared residuals
16.4
The linear model
16.5
Prediction and prediction errors
16.6
Model diagnostics
17
Multiple Regression
17.1
Sample data
17.1.1
Univariate analysis and correlation plots
17.1.2
Scatterplots
17.2
Simple Model
17.3
Model validation
17.4
Model diagnostics
17.4.1
Linearity
17.4.2
Nearly normal residuals
17.4.3
Constant variability
17.4.4
Outliers
17.4.5
Leverage points
17.4.6
Influential observations
17.4.7
Global tests of linear model assumptions
17.5
Nonlinear regression model
17.6
Multiple variables regression model
17.7
Evaluating multi-collinearity
17.8
Best subset regression
17.9
Stepwise regression
17.10
Comparing competing models
17.10.1
Akaike Information Criterion
17.10.2
Bayesian Information Criterion
17.10.3
Adjusted R-Squared
17.11
Cross Validation
17.12
Printing the final regression table
17.12.1
The ‘jtools’ package
17.12.2
The final model
17.12.3
The ‘stargazer’ package
17.13
Tutorial
18
GLM Regression
18.1
Beyond linear models
18.2
Logistic regression
18.2.1
Fitting a logistic regression model with glm()
18.2.2
Log-odds transform
18.2.3
Worked Example
18.2.4
Over-dispersion
18.2.5
Comparing overall models
18.3
Modeling probabilities
18.3.1
Dissecting the logistic model
18.3.2
Predicting
18.4
Case study
18.5
Probit regression
18.6
Summary
V Time Series Analysis
Introduction
19
Time Series
20
Time Series Smoothing
21
Time Series Models
Appendix
A
R-Pubs
A.1
Prerequisites
A.2
Instructions
B
Google Colab
C
Jupyter
D
R & SQL
Statistics with R
C
Jupyter