As a teeny-tiny follow-up to the previous post, I realised that bootstrapping is also a great way to compare the power of statistical tests. It is generally accepted that Mood’s Median test has lower power than the Mann-Whitney-Wilcoxon test, but you can demonstrate this quite easily with bootstrapping.

First of all, we need to define a Mood’s Median test in R:

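# This matches Minitab's definition (which might not be the best idea, but...)
minitab.moods.median.test <- function(x, y) {
    z <- c(x, y)                              # pool both samples
    g <- rep(1:2, c(length(x), length(y)))    # group label for each pooled value
    m <- median(z)                            # grand median of the pooled data
    # Chi-squared test on the 2x2 table of (value <= grand median) by group,
    # without Yates' continuity correction
    chisq.test(z <= m, g, correct = FALSE)
}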

The above was inspired by this implementation, which is wrong as far as the definition on Wikipedia goes (and also how Minitab does things). However, I appreciate that any person picked at random from the R mailing list is likely to know more about statistics than I do, and hence that approach is probably still valid.

We can then define separate power functions for each test. Running these with the same data as last time shows that the Mood’s Median test needs ~43 sample points to reach 80% power, whereas the Wilcoxon test needs only 24.
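For anyone who wants to try this, here is a rough sketch of what such a power function might look like. It is not the exact code behind the numbers above: `boot.power` is a name I have made up for illustration, it is one generic function taking the test as an argument rather than one function per test, and the data are synthetic normal samples standing in for the data from the previous post, so the sample sizes it produces will not match the figures quoted here.

```r
# Mood's median test as defined earlier in the post (Minitab-style).
minitab.moods.median.test <- function(x, y) {
    z <- c(x, y)
    g <- rep(1:2, c(length(x), length(y)))
    m <- median(z)
    chisq.test(z <= m, g, correct = FALSE)
}

# Hypothetical helper: bootstrap estimate of power at sample size n.
# Resample n points (with replacement) from each observed sample, run the
# test, and return the fraction of resamples with p-value below alpha.
boot.power <- function(test, x, y, n, reps = 1000, alpha = 0.05) {
    p <- replicate(reps, {
        xs <- sample(x, n, replace = TRUE)
        ys <- sample(y, n, replace = TRUE)
        # Warnings about ties / small expected counts are suppressed.
        # If the test cannot be computed at all (e.g. a degenerate table),
        # count that resample as a non-rejection by returning p = 1.
        tryCatch(suppressWarnings(test(xs, ys)$p.value),
                 error = function(e) 1)
    })
    mean(p < alpha)
}

# Assumption: synthetic stand-in data, NOT the data from the previous post.
set.seed(42)
x <- rnorm(50, mean = 0)
y <- rnorm(50, mean = 1)

# Estimated power of each test over a range of sample sizes.
sizes <- seq(10, 50, by = 10)
power.mood <- sapply(sizes, function(n)
    boot.power(minitab.moods.median.test, x, y, n))
power.wilcox <- sapply(sizes, function(n)
    boot.power(wilcox.test, x, y, n))

print(data.frame(n = sizes, mood = power.mood, wilcoxon = power.wilcox))
```

Reading down the two columns, the smallest n at which each test crosses 0.8 gives the kind of comparison described above. Treating a test that fails to compute as a non-rejection is my own choice here; dropping those resamples instead would be just as defensible.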