## Contents

- k-Nearest Neighbours (kNN)
  - How it works
  - Finding a good value for k
  - Importance of normalising data

- Naive Bayes
  - Background on Bayesian Methods
  - Probabilistic model
  - Flavours of Naive Bayes
  - Laplace smoothing
  - Document Classifier

- Model Evaluation
  - Confusion Matrix
  - Accuracy
  - Recall / Sensitivity
  - Precision
  - Specificity
  - Positive/Negative Predictive Value
  - F Measure
  - ROC and AUC

- Costs of Errors

- Decision Trees
  - Recursive Partitioning algorithm
  - Pruning
  - Model parameters (preventing underfitting and overfitting)
  - A variation: Conditional Inference Trees

- Support Vector Machine
  - Maximum Margin Classifiers
  - Support Vector Classifiers
  - The Kernel Trick
  - Non-Linear Boundaries
  - Polynomial Kernel
  - Radial Kernel

- Unbalanced Data
  - Oversampling
  - Undersampling
  - Synthetic Data Generation

## Prior Knowledge

We assume that participants have prior experience with R, ideally having completed both the Introduction to R and Data Wrangling courses.