Introduction to Machine Learning

Catalina Canizares

Agenda

  1. Types of Models
  2. Getting the terms right
  3. Machine Learning vs. Traditional Statistics
  4. The Tradeoff
  5. Types of Machine Learning Models
  6. The Modeling Process
  7. How do we spend our data?

Types of Models

  • Descriptive models: describe or illustrate characteristics of some data, with no purpose other than to visually emphasize some trend or artifact in the data.

  • Inferential models: explore a specific hypothesis using statistical tests. An inferential model starts with a predefined conjecture or idea about a population and produces a statistical conclusion, such as an interval estimate or the rejection of a hypothesis. For example, the goal of a clinical trial might be to provide confirmation that a new therapy does a better job of prolonging life.

  • Predictive models: produce the most accurate prediction possible; predicted values have the highest possible fidelity to the true value of the new data.

Artificial Intelligence? Machine Learning? What?

  • Artificial intelligence is the name of a whole field of knowledge.

  • Machine learning is a part of artificial intelligence.

  • Neural networks are one type of machine learning.

A Little Bit of History

Supervised vs Unsupervised Learning

Unsupervised learning: algorithms that analyze and cluster unlabeled datasets.

  • Clustering: groups unlabeled data based on their similarities or differences

  • Dimensionality reduction: Principal component analysis
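A minimal sketch of both ideas with scikit-learn (the iris data, the cluster count, and the component count are illustrative assumptions, not part of the slides):

```python
# A minimal sketch of unsupervised learning (dataset is illustrative).
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# No labels are used: only the features X.
X, _ = load_iris(return_X_y=True)

# Clustering: group observations by their similarities.
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Dimensionality reduction: project onto the first two principal components.
X_2d = PCA(n_components=2).fit_transform(X)

print(clusters[:10])
print(X_2d[:3])
```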

Supervised learning: the use of labeled datasets to train algorithms that classify data or predict outcomes accurately.

There is a “y” or outcome variable.

Popular algorithms:

  • Naive Bayes.
  • Decision Trees.
  • Logistic Regression.
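A minimal sketch fitting one of the algorithms above with scikit-learn (the breast-cancer dataset and the settings are illustrative assumptions):

```python
# A minimal sketch of supervised learning: labeled data (X, y) train a model.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression

# Labeled data: X holds the features, y is the outcome variable.
X, y = load_breast_cancer(return_X_y=True)

# Fit one of the popular algorithms listed above.
model = LogisticRegression(max_iter=5000)
model.fit(X, y)

# The trained model can now classify new observations.
print(model.predict(X[:5]))
```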

How is it used in our daily lives?

Machine Learning vs. “Traditional Statistics”?

Traditional statistics:

  • Cares about variability.
  • Cares about defining the range of normal values across samples (the standard error).
  • Focuses on estimating the betas:

\[{y} = \alpha + \beta_1x_1 + \beta_2x_2 + \dots + \beta_nx_n + \epsilon\]

Machine learning:

  • Cares about prediction.

  • Focuses on estimating \(\hat{y}\):

    \[\hat{y} = \hat{f}(x_1, x_2, \dots, x_n)\]

  • \(\hat{y}\) represents the resulting prediction for \(Y\).

  • \(\hat{f}\) represents the estimate for \(f\), which is often treated as a black box: no one is concerned with the exact form of \(\hat{f}\), provided that it yields accurate predictions for \(Y\) (Introduction to Statistical Learning).
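A small sketch of the contrast (the simulated data and the linear model are illustrative assumptions): traditional statistics inspects the estimated betas, while machine learning inspects \(\hat{y}\).

```python
# Sketch: same fitted model, two different emphases.
import numpy as np
from sklearn.linear_model import LinearRegression

# Simulated data with known betas (illustrative assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=100)

fit = LinearRegression().fit(X, y)

# "Traditional" focus: the estimated coefficients (the betas).
print(fit.intercept_, fit.coef_)

# ML focus: the predictions y-hat, regardless of f-hat's exact form.
y_hat = fit.predict(X)
print(y_hat[:5])
```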

Machine Learning vs. “Traditional Statistics”

Tradeoff Bias/Variance

  • “When we discuss prediction models, prediction errors can be decomposed into two main subcomponents we care about: error due to ‘bias’ and error due to ‘variance’.” (Fortmann-Roe, 2012)

  • “There is a tradeoff between a model’s ability to minimize bias and variance.” (Fortmann-Roe, 2012)

Bias

The error due to bias is taken as the difference between the expected (or average) prediction of our model and the correct value which we are trying to predict.

Variance

The error due to variance is taken as the variability of a model prediction for a given data point.

  • The variance is how much the predictions for a given point vary between different realizations of the model.
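For squared-error loss, these two components, plus irreducible noise \(\sigma^2\), combine additively. At a point \(x\), the expected prediction error decomposes as:

\[\mathbb{E}\big[(y - \hat{f}(x))^2\big] = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2} + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}} + \underbrace{\sigma^2}_{\text{irreducible error}}\]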

Graphical Representation of the Tradeoff

An Example with Data

Animation

Final Remarks About Bias/Variance

What are the different models?

So Where Do We Start?

How do we Spend our Data?

For machine learning, we split data into training and test sets:

  1. The training set is used to estimate model parameters.

  2. The testing set is used to find an independent assessment of model performance.

🚫 CAUTION: Do not use the test set during training.
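A minimal sketch of the split with scikit-learn (the dataset, the 75/25 proportion, and the seed are illustrative assumptions):

```python
# A minimal sketch of a train/test split (proportions are illustrative).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Hold out 25% of the data; the model never sees it during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=123
)

# Estimate model parameters on the training set only.
model = DecisionTreeClassifier(random_state=123).fit(X_train, y_train)

# The test set is touched exactly once, for the final assessment.
print(model.score(X_test, y_test))
```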


Let’s take a look

Animation

Resampling Methods

Resampling methods are tools that consist of repeatedly drawing samples from a dataset and calculating statistics and metrics on each of those samples.

Cross-Validation

This approach involves randomly dividing the set of observations into k folds of nearly equal size. The first fold is treated as a validation set, and the model is fit on the remaining folds. The process is repeated k times, with each fold serving once as the validation set.
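A sketch of k-fold cross-validation with scikit-learn (k = 5, the dataset, and the model choice are assumptions):

```python
# k-fold cross-validation: each fold serves once as the validation set.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
model = LogisticRegression(max_iter=5000)

# cv=5 splits the data into 5 folds of nearly equal size.
scores = cross_val_score(model, X, y, cv=5)
print(scores)         # one accuracy score per fold
print(scores.mean())  # averaged estimate of performance
```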

Leave-One-Out

Only one observation is used for validation, and the rest are used to fit the model. This is repeated once for every observation in the dataset.

Bootstrapping

Samples of the same size as the original dataset are drawn with replacement; the statistic of interest is calculated on each resample, and its variability across resamples estimates quantities such as the standard error.
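A sketch of the bootstrap in plain NumPy (the simulated data and the mean as the statistic are illustrative assumptions):

```python
# One statistic, many bootstrap resamples drawn with replacement.
import numpy as np

rng = np.random.default_rng(42)
data = rng.normal(loc=10, scale=2, size=200)  # illustrative data

# Draw 1000 resamples, each the same size as the original dataset,
# and compute the statistic of interest (here, the mean) on each.
boot_means = [
    rng.choice(data, size=data.size, replace=True).mean()
    for _ in range(1000)
]

# The spread of the bootstrap statistics estimates the standard error.
print(np.mean(boot_means), np.std(boot_means))
```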

Tuning Hyperparameters

Method          Hyperparameter      Description
--------------  ------------------  ---------------------------------------------------------------
Lasso           lambda              Regularization strength
KNN             n_neighbors         Number of neighbors to consider
KNN             weights             Weight function used in prediction: “uniform” or “distance”
Trees           max_depth           Maximum depth of the tree
Trees           min_samples_split   Minimum number of samples required to split an internal node
Trees           min_samples_leaf    Minimum number of samples required to be at a leaf node
Trees           max_features        Number of features to consider when looking for the best split
Random Forest   n_estimators        Number of decision trees in the forest
Random Forest   max_depth           Maximum depth of the decision trees
Random Forest   min_samples_split   Minimum number of samples required to split an internal node
Random Forest   min_samples_leaf    Minimum number of samples required to be at a leaf node
Random Forest   max_features        Number of features to consider when looking for the best split
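A sketch of tuning a few of the hyperparameters above via grid search with cross-validation (the grid values are illustrative, not recommendations):

```python
# Grid search: try every hyperparameter combination, score each with CV.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_breast_cancer(return_X_y=True)

# Candidate values for three of the random-forest hyperparameters above.
param_grid = {
    "n_estimators": [100, 300],
    "max_depth": [3, 5, None],
    "min_samples_leaf": [1, 5],
}

search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)  # the winning combination
print(search.best_score_)   # its cross-validated score
```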

The Actual Process:

  1. Collect data.
  2. Data exploration and preparation.
  3. Model training.
  4. Model evaluation (don’t panic, we will cover this next session).
  • Look at RMSE or contingency table statistics (accuracy, sensitivity, specificity, etc.).
  5. Model improvement.
  • Tweak the preparation, reparametrize a method, or use a different method.
  6. Use the test data to evaluate the final model.
  7. Share/publish results.
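An end-to-end sketch tying these steps together with scikit-learn (every dataset and parameter choice below is an illustrative assumption):

```python
# An end-to-end sketch of the process above (all choices illustrative).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# 1-2. Collect and prepare the data (scaling is handled in the pipeline).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=123
)

# 3-5. Train, evaluate with cross-validation, and improve by tuning.
pipe = Pipeline([("scale", StandardScaler()), ("knn", KNeighborsClassifier())])
grid = {"knn__n_neighbors": [3, 5, 11], "knn__weights": ["uniform", "distance"]}
search = GridSearchCV(pipe, grid, cv=5).fit(X_train, y_train)

# 6. The untouched test set gives the final assessment.
print(search.best_params_, search.score(X_test, y_test))
```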

The Actual Process - As an Image

How to implement all of these?