Book Review: R for Business Analytics

The book R for Business Analytics by Ajay Ohri sets out to look at "some of the most common tasks performed by business analysts and helps the user navigate the wealth of information in R and its 4000 packages." In my opinion it succeeds in covering an extensive range of topics but fails to provide anything of substantial use to its intended audience. At least, not anything that could not be uncovered by a brief internet search.

R-for-Business-Analytics

There is little detailed treatment of any particular topic related to Business Analytics. It is, at best, a high level overview of many of the capabilities of R. There is scant logical flow and the book appears to have been cobbled together from information readily available on the Internet. The editing is questionable and the code examples are poorly formatted. I also found the use of eponymous variable names in many of the code examples a little disturbing: is this really good practice? I don't think so.

Admittedly the author points out that the book is not aimed at statisticians, but rather at analytics professionals and students, where previous experience with R is not a prerequisite. However I suspect that neither of these groups would find the book very fulfilling.

Not everything about the book is bad though: the author does cover a diverse range of packages and I was pleased to learn about some interesting packages, of which I was previously unaware.

I feel a little uncomfortable posting such a negative review. For a more positive take on the book, you can read this review. However, if you are considering buying this book, I would personally advise against it. I felt more than a little cheated.

Downloading Option Chain Data from Google Finance in R: An Update

I recently read an article which showed how to download Option Chain data from Google Finance using R. Interestingly, that article appears to be a close adaptation of another article which does the same thing using Python.

While playing around with the code from these articles I noticed a couple of things that might benefit from minor tweaks. Before I look at those though, it's worthwhile pointing out that there already is a function in quantmod for retrieving Option Chain data from Yahoo! Finance. What I am doing here is thus more for my own personal edification (but hopefully you will find it interesting too!).
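
That quantmod function is getOptionChain(), and using it looks something like this (a quick sketch: AAPL.OPT is just an arbitrary name and, if I recall correctly, passing NULL for the expiry retrieves all available expiration dates):

> library(quantmod)
> AAPL.OPT <- getOptionChain("AAPL", NULL)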

Background

An Option Chain is just a list of all available options for a particular security spanning a range of expiration dates.

The Code

First we need to load a few packages which facilitate the downloading, parsing and manipulation of the data.

> library(RCurl)
> library(jsonlite)
> library(plyr)

We'll be retrieving the data in JSON format. Somewhat disturbingly, the JSON data from Google Finance do not appear to be fully compliant with the JSON standard because the keys are not quoted. We'll use a helper function which runs through the data and inserts quotes around each of the keys. The original code for this function looped through a list of key names, which is a little inefficient and would also break if additional keys were introduced. We'll get around that by using a different approach which avoids stipulating key names.

> fixJSON <- function(json){
+   gsub('([^,{:]+):', '"\\1":', json)
+ }
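
A quick check on a made-up fragment (the keys here are invented purely for illustration):

> fixJSON('{cid:43625,expiry:{y:2015,m:1,d:17}}')
[1] "{\"cid\":43625,\"expiry\":{\"y\":2015,\"m\":1,\"d\":17}}"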

To make the download function more concise we'll also define two URL templates.

> URL1 = 'http://www.google.com/finance/option_chain?q=%s&output=json'
> URL2 = 'http://www.google.com/finance/option_chain?q=%s&output=json&expy=%d&expm=%d&expd=%d'

And finally the download function itself, which proceeds through the following steps for a specified ticker symbol:

  1. downloads summary data;
  2. extracts expiration dates from the summary data and downloads the options data for each of those dates;
  3. concatenates these data into a single structure, neatens up the column names and selects a subset.

> getOptionQuotes <- function(symbol){
+   url = sprintf(URL1, symbol)
+   #
+   chain = fromJSON(fixJSON(getURL(url)))
+   #
+   options = mlply(chain$expirations, function(y, m, d) {
+     url = sprintf(URL2, symbol, y, m, d)
+     expiry = fromJSON(fixJSON(getURL(url)))
+     #
+     expiry$calls$type = "Call"
+     expiry$puts$type  = "Put"
+     #
+     prices = rbind(expiry$calls, expiry$puts)
+     #
+     prices$expiry = sprintf("%4d-%02d-%02d", y, m, d)
+     prices$underlying.price = expiry$underlying_price
+     #
+     prices
+   })
+   #
+   options = cbind(data.frame(symbol), rbind.fill(options))
+   #
+   names(options)[c(6, 10, 11, 12)] = c("price", "bid", "ask", "open.interest")
+   #
+   for (col in c("strike", "price", "bid", "ask")) options[, col] = as.numeric(options[, col])
+   options[, "open.interest"] = suppressWarnings(as.integer(options[, "open.interest"]))
+   #
+   options[, c(1, 16, 15, 6, 10, 11, 17, 14, 12)]
+ }

Results

Let's give it a whirl. (The data below were retrieved on Saturday 10 January 2015).

> AAPL = getOptionQuotes("AAPL")
> nrow(AAPL)
[1] 1442

This is what the resulting data look like, with all available expiration dates consolidated into a single table:

> head(AAPL)
  symbol type     expiry price   bid   ask underlying.price strike open.interest
1   AAPL Call 2015-01-17 82.74 84.00 84.35           112.01  27.86           505
2   AAPL Call 2015-01-17 83.75 83.20 83.70           112.01  28.57          1059
3   AAPL Call 2015-01-17 84.75 82.60 82.90           112.01  29.29            13
4   AAPL Call 2015-01-17 81.20 81.80 82.40           112.01  30.00            29
5   AAPL Call 2015-01-17 83.20 81.10 81.65           112.01  30.71           150
6   AAPL Call 2015-01-17 79.75 80.35 80.75           112.01  31.43           396
> tail(AAPL)
     symbol type     expiry price   bid   ask underlying.price strike open.interest
1437   AAPL  Put 2017-01-20 47.57 46.45 48.20           112.01    150           108
1438   AAPL  Put 2017-01-20 51.00 50.45 52.30           112.01    155            72
1439   AAPL  Put 2017-01-20 52.45 54.55 56.50           112.01    160           203
1440   AAPL  Put 2017-01-20 58.55 58.65 60.75           112.01    165            76
1441   AAPL  Put 2017-01-20 67.90 63.15 64.50           112.01    170           167
1442   AAPL  Put 2017-01-20 68.00 67.35 68.25           112.01    175           239

There is a load of data there. To get an idea of what it looks like we can generate a couple of plots. Below is the Open Interest as a function of Strike Price across all expiration dates. The underlying price is indicated by the vertical dashed line. As one might expect, the majority of interest is associated with the next expiration date on 17 January 2015.
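
Something along the following lines (using ggplot2) should produce a similar picture. This is a sketch rather than the exact code behind the plot below; the faceting and colour choices are my own:

> library(ggplot2)
> 
> ggplot(AAPL, aes(x = strike, y = open.interest)) +
+   geom_vline(xintercept = AAPL$underlying.price[1], linetype = "dashed") +
+   geom_point(aes(colour = type), alpha = 0.5) +
+   facet_wrap(~ expiry) +
+   labs(x = "Strike Price", y = "Open Interest")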

open-interest-strike-price-AAPL

It's pretty clear that this is not the optimal way to look at these data and I would be extremely interested to hear from anybody with a suggestion for a better visualisation. Trying to look at all of the expiration dates together is probably the largest problem, so let's focus our attention on those options which expire on 17 January 2015. Again the underlying price is indicated by a vertical dashed line.
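
Again, a sketch of one way to build such a view, plotting both open interest and premium against strike for the 17 January 2015 expiry (reshape2 and the faceted layout are my own choices, not necessarily what produced the figure below):

> library(reshape2)
> 
> jan <- subset(AAPL, expiry == "2015-01-17")
> jan <- melt(jan[, c("strike", "type", "price", "open.interest")],
+             id.vars = c("strike", "type"))
> 
> ggplot(jan, aes(x = strike, y = value)) +
+   geom_vline(xintercept = AAPL$underlying.price[1], linetype = "dashed") +
+   geom_point(aes(colour = type), alpha = 0.5) +
+   facet_wrap(~ variable, ncol = 1, scales = "free_y") +
+   labs(x = "Strike Price", y = "")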

open-interest-premium-strike-price-AAPL

This is the first time that I have seriously had a look at options data, but I will now readily confess to being intrigued. Having the data readily available, there is no reason not to explore further. Details to follow.

Simulating Intricate Branching Patterns with DLA

Manfred Schroeder's book Fractals, Chaos, Power Laws: Minutes from an Infinite Paradise is a fruitful source of interesting topics and projects. He gives a thorough description of Diffusion-Limited Aggregation (DLA) as a technique for simulating physical processes which produce intricate branching structures. Examples, as illustrated below, include Lichtenberg Figures, dielectric breakdown, electrodeposition and Hele-Shaw flow.

dla-post-illustration

Diffusion-Limited Aggregation

DLA is conceptually simple. A seed particle is fixed at the origin of the coordinate system. Another particle is introduced at a relatively large distance from the origin, which then proceeds to move on a random walk. Either it will wander off to infinity or it will come into contact with the particle at the origin, to which it sticks irreversibly. Now another particle is introduced and the process repeats itself. As successive particles are added to the system, a portion of them become bound to the growing cluster of particles at the origin.

The objects which evolve from this process are intrinsically random, yet have self-similar structure across a range of scales. There is also an element of positive feedback, where once a protuberance has formed on the cluster, further particles are more likely to adhere to it since they will probably encounter it first.

A Simple Implementation in R

First we need to construct a grid. We will start small: a 20 by 20 grid filled with NA except for four seed locations at the centre.

> # Dimensions of grid
> W <- 20
> grid <- matrix(NA, nrow = W, ncol = W)
> grid[W/2 + c(0, 1), W/2 + c(0, 1)] = 0
> grid
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20]
 [1,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
 [2,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
 [3,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
 [4,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
 [5,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
 [6,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
 [7,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
 [8,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
 [9,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
[10,]   NA   NA   NA   NA   NA   NA   NA   NA   NA     0     0    NA    NA    NA    NA    NA    NA    NA    NA    NA
[11,]   NA   NA   NA   NA   NA   NA   NA   NA   NA     0     0    NA    NA    NA    NA    NA    NA    NA    NA    NA
[12,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
[13,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
[14,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
[15,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
[16,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
[17,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
[18,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
[19,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
[20,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA

We need to generate two dimensional random walks. To do this I created a table of possible moves, from which individual steps could be sampled at random. The table presently only caters for the cells immediately above, below, left or right of the current cell. It would be a simple matter to extend the table to allow diagonal moves as well, but these more distant moves would need to be weighted accordingly.

> moves <- data.frame(dx = c(0, 0, +1, -1), dy = c(+1, -1, 0, 0))
> #
> M = nrow(moves)
> moves
  dx dy
1  0  1
2  0 -1
3  1  0
4 -1  0
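
The weighting mentioned above might look something like this (a sketch only; the 1/distance weights are my own assumption and this extension is not used in what follows):

> moves8 <- data.frame(dx = c(0, 0, +1, -1, +1, +1, -1, -1),
+                      dy = c(+1, -1, 0, 0, +1, -1, +1, -1))
> moves8$weight <- 1 / sqrt(moves8$dx^2 + moves8$dy^2)
> #
> # A weighted step would then be drawn with something like:
> #
> #   moves8[sample(nrow(moves8), 1, prob = moves8$weight), c("dx", "dy")]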

Next, a function to transport a particle from its initial location until it either leaves the grid or adheres to the cluster at the origin. Again, a possible refinement here would be to allow sticking to next-nearest neighbours as well.

> diffuse <- function(p) {
+   count = 0
+   #
+   while (TRUE) {
+     p = p + moves[sample(M, 1),]
+     #
+     count = count + 1
+     #
+     # Black boundary conditions
+     #
+     if (p$x > W | p$y > W | p$x < 1 | p$y < 1) return(NA)
+     #
+     # Check if it sticks (to nearest neighbour)
+     #
+     if (p$x < W && !is.na(grid[p$x+1, p$y])) break
+     if (p$x > 1 && !is.na(grid[p$x-1, p$y])) break
+     if (p$y < W && !is.na(grid[p$x, p$y+1])) break
+     if (p$y > 1 && !is.na(grid[p$x, p$y-1])) break
+   }
+   #
+   return(c(p, count = count))
+ }

Finally we are ready to apply this procedure to a batch of particles.

> library(foreach)
>
> # Number of particles per batch
> #
> PBATCH <- 5000
> #
> # Select starting position
> #
> phi = runif(PBATCH, 0, 2 * pi)
> #
> x = round((1 + cos(phi)) * W / 2 + 0.5)
> y = round((1 + sin(phi)) * W / 2 + 0.5)
> #
> particles <- data.frame(x, y)
> 
> result = foreach(n = 1:PBATCH) %do% diffuse(particles[n,])
> 
> lapply(result, function(p) {if (length(p) == 3) grid[p$x, p$y] <<- p$count})

The resulting grid shows all of the locations where particles have adhered to the cluster. The number at each location is the diffusion time, which indicates the number of steps required for the particle to move from its initial location to its final resting place. The shape of the cluster is a little boring at present: essentially a circular disk centred on the origin. This is due to the size of the problem and we will need to have a much larger grid to produce more interesting structure.

> grid
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20]
 [1,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
 [2,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
 [3,]   NA   NA   NA   NA   NA   NA   NA   NA   NA     2    18    NA    NA    NA    NA    NA    NA    NA    NA    NA
 [4,]   NA   NA   NA   NA   NA   NA   NA   NA    8     6    17    10    NA    NA    NA    NA    NA    NA    NA    NA
 [5,]   NA   NA   NA   NA   NA   NA   NA    6   11     7    54    25    15    NA    NA    NA    NA    NA    NA    NA
 [6,]   NA   NA   NA   NA   NA   NA   16   10   11    58    69    18    31    16    NA    NA    NA    NA    NA    NA
 [7,]   NA   NA   NA   NA   NA   20   19   10   21    24    32    50    24    65     8    NA    NA    NA    NA    NA
 [8,]   NA   NA   NA   NA   18   12   55   26   13   151    86    20    21    26    27    34    NA    NA    NA    NA
 [9,]   NA   NA   NA   56   21   43   19   53   43    30    26    37    66    52    30    22    10    NA    NA    NA
[10,]   NA   NA   29    9    9   23   70   38   48     0     0   122    26    44    22    10    27     5    NA    NA
[11,]   NA   NA    2    9   10   36   32   38   24     0     0    54    14    21    65    14    30    29    NA    NA
[12,]   NA   NA   NA    5   10   16   13   83   52    43    23    42    39    23    66     9    32    NA    NA    NA
[13,]   NA   NA   NA   NA   21   70   28   31   NA    41    61    15    17    29    25    17    NA    NA    NA    NA
[14,]   NA   NA   NA   NA   NA    8   29   19    7    47   119    37    19     9    10    NA    NA    NA    NA    NA
[15,]   NA   NA   NA   NA   NA   NA   15   33   68    26    38    13    33     8    NA    NA    NA    NA    NA    NA
[16,]   NA   NA   NA   NA   NA   NA   NA   10   12    12    15    35    11    NA    NA    NA    NA    NA    NA    NA
[17,]   NA   NA   NA   NA   NA   NA   NA   NA   20     8     6     5    NA    NA    NA    NA    NA    NA    NA    NA
[18,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    18    20    NA    NA    NA    NA    NA    NA    NA    NA    NA
[19,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
[20,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA

Taking a look at the distribution of diffusion times below we can see that there is a strong peak between 10 and 15. The majority of particles diffuse in less than 40 steps, while the longest diffusion time is 151 steps. The grid above shows that, as one would expect, smaller diffusion times are found close to the surface of the cluster, while longer times are embedded closer to the core.
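
A minimal way to extract and plot those diffusion times from the batch above (times is just a throwaway name):

> times <- sapply(result, function(p) if (length(p) == 3) p$count else NA)
> hist(na.omit(times), breaks = 30, xlab = "Diffusion time (steps)", main = "")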

dla-big-steps-hist-sample

Scaling it Up

To produce a more interesting cluster we need to scale the grid size up considerably. But this also means that the run time escalates enormously. So, to make this even remotely tractable, I had to parallelise the algorithm. I did this using the SNOW package and ran it on an MPI cluster. The changes to the code are trivial, involving only the creation and initialisation of the cluster and changing %do% to %dopar% in the foreach() loop.
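
The cluster setup isn't shown here, but with the doSNOW backend it would look roughly like the sketch below. The worker count and the MPI cluster type are assumptions, and .export simply makes the required objects visible on the workers:

> library(doSNOW)
> 
> cluster <- makeCluster(8, type = "MPI")
> registerDoSNOW(cluster)
> 
> result = foreach(n = 1:PBATCH,
+                  .export = c("diffuse", "particles", "moves", "M", "grid", "W")) %dopar%
+   diffuse(particles[n,])
> 
> stopCluster(cluster)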

dla-big

Conclusion

This is nowhere near being an efficient implementation of DLA. But it gets the job done and is illustrative of how the algorithm works. To do production runs one would want to code the algorithm in a low-level language like C or C++ and take advantage of the inherent parallelism of the algorithm.

Lessons I've Learned from Physics

These are the slides from a short talk that I gave to a group of developers at DERIVCO Dev night. The title of the talk seemed a little pompous in retrospect and I considered retitling it The Physics of Cats and Mayonnaise. One of my colleagues came up with the title Levers and Gravity and Shit, which I rather like too.

The content of the talk is a little hard to follow from the slides alone, but the general gist of it is that there are some broad principles in Physics which can be interpreted (tenuously extended?) to the world of software development.

A very nice TED talk in a similar vein was given by Dan Cobley, entitled What physics taught me about marketing.

Zacks Data on Quandl

zacks-logo

Data from Zacks Research have just been made available on Quandl. Registered Quandl users have free preview access to these data, which cover earnings estimates, earnings surprises and dividends.

These data describe over 5000 publicly traded US and Canadian companies and are updated daily.

Finding the Data

If you are not already a registered Quandl user, now is the time to sign up. You will find links to all of the data sets mentioned above from the Quandl vendors page. Then, for example, from the Earnings Estimates page you can search for a particular company. I selected Hewlett Packard, which links to a page giving summary data on the Earnings per Share (EPS) for the next three years. These data are presented both in tabular format as well as an interactive plot.

Zacks-HP-EPS

Browsing the data via the Quandl web site gives you a good appreciation of what is available and the general characteristics of the data. However, to do something meaningful you would probably want to download data into an offline analysis package.

Getting the Data into R

I am going to focus on accessing the data through R using the Quandl package.

> library(Quandl)

Obtaining the data is remarkably simple. First you need to authenticate yourself.

> Quandl.auth("ZxixTxUxTxzxyxwxFxyx")

You will find your authorisation token under the Account Settings on the Quandl web site.

Grabbing the data is done via the Quandl() function, to which you need to provide the appropriate data set code.

Quandl-menu

Beneath the data set code you will also find a number of links which will pop up the precise code fragment required for downloading the data in a variety of formats and on a selection of platforms (notable amongst these are R, Python and Matlab, although there are interfaces for a variety of other platforms too).

> # Annual estimates
> #
> Quandl("ZEE/HPQ_A", trim_start="2014-10-31", trim_end="2017-10-31")[,1:5]
        DATE EPS_MEAN_EST EPS_MEDIAN_EST EPS_HIGH_EST EPS_LOW_EST
1 2017-10-31         3.90          3.900         3.90        3.90
2 2016-10-31         4.09          4.100         4.31        3.87
3 2015-10-31         3.94          3.945         4.01        3.88
4 2014-10-31         3.73          3.730         3.74        3.70

> # Quarterly estimates
> #
> Quandl("ZEE/HPQ_Q", trim_start="2014-10-31", trim_end="2017-10-31")[,1:5]
        DATE EPS_MEAN_EST EPS_MEDIAN_EST EPS_HIGH_EST EPS_LOW_EST
1 2015-10-31         1.10           1.10         1.14        1.04
2 2015-07-31         0.97           0.98         1.00        0.94
3 2015-04-30         0.95           0.95         1.00        0.91
4 2015-01-31         0.92           0.92         1.00        0.85
5 2014-10-31         1.05           1.05         1.07        1.03

Here we see a subset of the EPS data available for Hewlett Packard, giving the maximum and minimum as well as the mean and median projections of EPS at both annual and quarterly resolution.

Next we'll look at a comparison of historical actual and estimated earnings.

> Quandl("ZES/HPQ", trim_start="2011-11-21", trim_end="2014-08-20")[,1:6]
         DATE EPS_MEAN_EST EPS_ACT EPS_ACT_ADJ EPS_AMT_DIFF_SURP EPS_PCT_DIFF_SURP
1  2014-08-20         0.89    0.89       -0.37              0.00              0.00
2  2014-05-22         0.88    0.88       -0.22              0.00              0.00
3  2014-02-20         0.85    0.90       -0.16              0.05              5.88
4  2013-11-26         1.00    1.01       -0.28              0.01              1.00
5  2013-08-21         0.87    0.86       -0.15             -0.01             -1.15
6  2013-05-22         0.81    0.87       -0.32              0.06              7.41
7  2013-02-21         0.71    0.82       -0.19              0.11             15.49
8  2012-11-20         1.14    1.16       -4.65              0.02              1.75
9  2012-08-22         0.99    1.00       -5.50              0.01              1.01
10 2012-05-23         0.91    0.98       -0.18              0.07              7.69
11 2012-02-22         0.87    0.92       -0.18              0.05              5.75
12 2011-11-21         1.13    1.17       -1.05              0.04              3.54

The last column gives the EPS surprise (the difference between actual and estimated EPS) as a percentage. For example, on 2014-02-20 the actual EPS of 0.90 exceeded the estimate of 0.85 by 0.05, which is 0.05 / 0.85 ≈ 5.88%. It's clear that the estimates are generally rather good.

The last thing that we are going to look at is dividend data.

> Quandl("ZDIV/HPQ", trim_start="2014-11-07", trim_end="2014-11-07")[,1:6]
       AS_OF DIV_ANNOUNCE_DATE DIV_EX_DATE DIV_PAY_DATE DIV_REC_DATE DIV_AMT
1 2014-11-07          20140717    20140908     20141001     20140910    0.16

Here we see that a $0.16 per share dividend was announced on 17 July 2014 and paid on 1 October 2014.

Having access to these data for a wide range of companies promises to be an enormously useful resource. Unfortunately access to the preview data is fairly limited, but if you plan on making full use of the data, then the premium access starting at $100 per month seems like a reasonable deal.

Creating More Effective Graphs

A few years ago I ordered a copy of the 2005 edition of Creating More Effective Graphs by Naomi Robbins. Somewhat shamefully I admit that the book got buried beneath a deluge of papers and other books and never received the attention it was due. Having recently discovered the R Graph Catalog, which implements many of the plots from the book using ggplot2, I had to dig it out and give it some serious attention.

Both the book and web site are excellent resources if you are looking for informative ways to present your data.

Being a big fan of xkcd, I rather enjoyed the example plot in xkcd style (which I don't think is covered in the book...). The code provided on the web site is used as the basis for the plot below.

life-expectancy

This plot is broadly consistent with the data from the Public Data archive on Google, but the effects of smoothing in the xkcd style plot can be clearly seen. Is this really important? Well, I suppose that depends on the objective of the plot. If it's just to inform (and look funky in the process), then the xkcd plot is perfectly fine. If you are looking for something more precise, then a more conventional plot without smoothing would be more appropriate.

life-expectancy-google

I like the xkcd style plot though and here's the code for generating it, loosely derived from the code on the web site.

> library(ggplot2)
> library(xkcd)
> 
> countries <- c("Rwanda", "South Africa", "Norway", "Swaziland", "Brazil")
> 
> hdf <- droplevels(subset(read.delim(file = "http://tiny.cc/gapminder"), country %in% countries))
> 
> direct_label <- data.frame(year = 2009,
+ 	lifeExp = hdf$lifeExp[hdf$year == 2007],
+ 	country = hdf$country[hdf$year == 2007])
> 
> set.seed(123)
> 
> ggplot() +
+ 	geom_smooth(data = hdf,
+ 		aes(x = year, y = lifeExp, group = country, linetype = country),
+ 		se = FALSE, color = "black") +
+ 	geom_text(aes(x = year + 2.5, y = lifeExp + 3, label = country), data = direct_label,
+ 		hjust = 1, vjust = 1, family = "xkcd", size = 7) +
+ 	theme(legend.position = "none") +
+ 	ylab("Life Expectancy") +
+ 	xkcdaxis(c(1952, 2010), c(20, 83))
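
For the more conventional, unsmoothed rendering mentioned above, swapping the geom_smooth() layer for geom_line() should be enough (a sketch reusing hdf from the code above):

> ggplot(hdf, aes(x = year, y = lifeExp, group = country, linetype = country)) +
+   geom_line(color = "black") +
+   ylab("Life Expectancy")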

Standard Bank: Striving for Mediocrity

Recently I was in my local Standard Bank branch. After finally reaching the front of the queue and being helped by a reasonably courteous young man, I was asked if I would mind filling out a survey. Sure. No problem. I had been in the bank for 30 minutes, I could probably afford another 30 seconds.

And then I was handed this abomination:

standard-bank-survey

So, if I was deliriously satisfied with the service that I had received, then I would award them a 10. If I was neither impressed nor dismayed, I would give them a 9. But if I was not happy at all, then I would give them an 8.

Let me repeat that so that the horror sinks in: if I was completely dissatisfied with their service then I would give them an 8! Out of 10. That's 80%.

80% for shoddy service!

Whoever is managing this little piece of supposed market research should be ashamed. What a load of rubbish.