It’s all very well generating myriad statistics characterising your data. How do you know whether or not those statistics are telling you something interesting? The answer is hypothesis tests. To that end, we’ll be looking at the HypothesisTests package today.
The first (small) hurdle is loading the package.
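As a minimal sketch, assuming the package has already been installed (if it has not, `Pkg.add("HypothesisTests")` from the standard `Pkg` library should fetch it), loading looks like this:

```julia
# Load the package; this assumes it is already installed in the active environment.
using HypothesisTests
```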
That wasn’t too bad. Next we’ll assemble some synthetic data.
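The exact data are not reproduced here, so the following is only a plausible reconstruction. The seed, the sample sizes and the distribution parameters are all assumptions, chosen to match the descriptions further down: x1 is centred on zero, x2 is not, and x5 holds Boolean draws with a success rate well below 50%.

```julia
using Random

Random.seed!(357)            # arbitrary seed (assumption), only for reproducibility

x1 = randn(1000)             # assumed: standard normal, true mean 0
x2 = randn(1000) .+ 0.5      # assumed: normal with true mean 0.5
x5 = rand(100) .< 0.25       # assumed: 100 Boolean draws, 25% success rate
```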
We’ll apply a one-sample t-test to x1 and x2. The output indicates that x2 has a mean which differs significantly from zero while x1 does not. This is consistent with our expectations based on the way that these data were generated. I’m impressed by the level of detail in the output from OneSampleTTest(): different aspects of the test are neatly broken down into sections (population, test summary and details) and there is automated high-level interpretation of the test results.
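A sketch of those two tests follows. The data are regenerated under the same assumptions as before (the seed and the distribution parameters are mine, not necessarily the original ones), so the numbers you see will depend on them.

```julia
using HypothesisTests, Random

Random.seed!(357)            # arbitrary seed (assumption)
x1 = randn(1000)             # assumed: true mean 0
x2 = randn(1000) .+ 0.5      # assumed: true mean 0.5

# With no second argument the null hypothesis is a mean of zero.
t1 = OneSampleTTest(x1)
t2 = OneSampleTTest(x2)

# Print the full report for each test.
display(t1)
display(t2)
```

A different null value can be supplied as a second argument, for example `OneSampleTTest(x2, 0.5)`.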
Using pvalue() we can further interrogate the p-values generated by these tests. The values reported in the test output are for the two-sided test, but we can look specifically at values associated with either the left or the right tail of the distribution. This makes the outcome of the test a lot more specific.
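Something along these lines should do it, using the `tail` keyword of `pvalue()`. The data are again an assumed stand-in for x2.

```julia
using HypothesisTests, Random

Random.seed!(357)            # arbitrary seed (assumption)
x2 = randn(1000) .+ 0.5      # assumed: true mean 0.5

t2 = OneSampleTTest(x2)

p_both  = pvalue(t2)                  # two-sided (the default)
p_left  = pvalue(t2, tail = :left)    # alternative: mean is less than zero
p_right = pvalue(t2, tail = :right)   # alternative: mean is greater than zero

println((p_both, p_left, p_right))
```

Since the sample mean of this x2 sits well above zero, the right-tail p-value should be tiny and the left-tail one close to 1.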
The associated confidence intervals are also readily accessible. We can choose between two-sided or left/right one-sided intervals as well as change the significance level.
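A sketch using `confint()` as it appears in recent releases of the package, where the confidence level is given with the `level` keyword. I believe older releases took the significance level instead, so check the documentation for the version you have. The data are the same assumed stand-in for x2.

```julia
using HypothesisTests, Random

Random.seed!(357)            # arbitrary seed (assumption)
x2 = randn(1000) .+ 0.5      # assumed: true mean 0.5

t2 = OneSampleTTest(x2)

ci95       = confint(t2)                   # two-sided, 95% by default
ci95_left  = confint(t2, tail = :left)     # one-sided, unbounded below
ci95_right = confint(t2, tail = :right)    # one-sided, unbounded above
ci99       = confint(t2, level = 0.99)     # two-sided, 99%

println(ci95)
println(ci95_left)
println(ci95_right)
println(ci99)
```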
As a second (and final) example we’ll look at BinomialTest(). There are various ways to call this function. First, without looking at any particular data, we’ll check whether 25 successes from 100 samples is inconsistent with a 25% success rate (obviously not and, as a result, we fail to reject this hypothesis).
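Called with counts, the arguments are the number of successes, the number of trials and the success probability under the null hypothesis. A minimal sketch:

```julia
using HypothesisTests

# 25 successes in 100 trials, tested against a 25% success rate.
b1 = BinomialTest(25, 100, 0.25)

display(b1)
println(pvalue(b1))
```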
Next we’ll see whether the Bernoulli samples in x5 contradict an assumed 50% success rate. Based on the way that x5 was generated, we are not surprised to find a vanishingly small p-value, and the hypothesis is soundly rejected.
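The same function also accepts a vector of Booleans together with the success probability under the null hypothesis. In this sketch x5 is an assumed stand-in (100 draws with a 25% success rate), not the original data.

```julia
using HypothesisTests, Random

Random.seed!(357)            # arbitrary seed (assumption)
x5 = rand(100) .< 0.25       # assumed: 100 Boolean draws, 25% success rate

# Test the observed successes against a 50% success rate.
b2 = BinomialTest(x5, 0.5)

display(b2)
println(pvalue(b2))
```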
There are a number of other tests available in this package, including a range of non-parametric tests which I have not even mentioned above. Certainly HypothesisTests should cover most of the bases for statistical inference. For more information, read the extensive documentation. Check out the sample code on GitHub for further examples.
Look here for an explanation of the xkcd cartoon (although if you are reading this blog, then that probably won’t be necessary).