Review: Machine Learning with R Cookbook

“Machine Learning with R Cookbook” by Chiu Yu-Wei is nothing more or less than it purports to be: a collection of 110 recipes for applying Data Analysis and Machine Learning techniques in R. I was asked by the publishers to review this book and found it to be an interesting and informative read. It will not help you understand how Machine Learning works (that’s not the goal!) but it will help you quickly learn how to apply Machine Learning techniques to your own problems.

The recipes are broken down into chapters which address the following topics:

  • Installing R and an Introduction to R
  • Exploring an Example Data Set
  • Basic Statistics in R
  • Regression Analysis (including Poisson and Binomial models)
  • Classification (Decision Trees, Nearest Neighbour, Logistic Regression and Naïve Bayes)
  • Classification (Neural Networks and SVMs)
  • Model Evaluation and Comparison
  • Ensemble Models
  • Clustering
  • Mining Associations and Sequences
  • Dimensionality Reduction
  • Big Data and Integration of R with Hadoop

This is a fairly comprehensive list of topics. The last chapter might better have been omitted, but it still provides a useful introduction to the use of R with massive data sets.

Each recipe in the book is divided into four parts entitled “Getting Ready”, “How to do it…”, “How it works…” and “See also”. This is a clever structure and an intuitive way to organise the material. In general the “Getting Ready” part provides sufficient background material to prepare you for the task at hand. “How to do it…” presents the meat of the recipe as a step-by-step procedure. The intention with “How it works…” is to explain how and why the recipe works. In many instances the explanations are somewhat superficial or refer to details which are not discussed in sufficient depth. They are generally helpful, though. The “See also” part provides links to additional material or alternative ways to solve the same problem.

There are some errors in the text, and the language and grammar are sometimes imperfect. However, if you want to learn more about using R for Machine Learning, this might be a useful book to have in your collection. I should note that all of these recipes could easily be constructed from online resources, but this book has the merit of assembling them all in one place.

Categorically Variable