Another one in the series
Because I’m a bit slow, I’ve been stuck for a while wondering how to determine sample size for Levene’s test. I understood how to shift data so that we have two data sets with a known difference in medians, for tests like Mood’s median test, but I couldn’t figure out how to transform the data so that we have a known difference in variation.
It’s embarrassingly simple.
Say, for the sake of example, I have a set of non-normal data and I’m measuring its spread using the span from the 5th to the 95th percentile, and this value is 150. I have a second set of data whose span is 50, and I’m going to compare these sets with Levene’s test, but I want to know the power my sample size provides.
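Well, all I have to do is take my first set of data and make a new data set from it by multiplying every value by 50/150. This “shifts” (strictly speaking, rescales) it so we have two data sets with a known difference in variation (span). The multiplication also moves the median, but that doesn’t matter here, because Levene’s test works on deviations from each group’s own centre.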
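To convince myself the scaling does what I want, here is a minimal sketch. The `span` helper and the gamma-distributed sample are made up for illustration (they are not my real data); the only point is that multiplying by target/span gives a data set whose 5th to 95th percentile span is exactly the target.

```r
# Hypothetical helper: span between the 5th and 95th percentiles
span <- function(x) unname(diff(quantile(x, c(0.05, 0.95))))

set.seed(42)
x <- rgamma(500, shape = 2, scale = 30)  # made-up skewed data, for illustration only

target <- 50                             # the span we want the second data set to have
scaled <- x * target / span(x)           # rescale every value by the same factor

span(scaled)                             # equals target, up to floating-point error
var(scaled) / var(x)                     # the variance changes by the square of the factor
```

Note that the span scales linearly with the factor while the variance scales with its square, so it is worth being clear about which measure of spread the "known difference" refers to.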
Then I can bootstrap, as before, to estimate the power of a given sample size:
# leveneTest() comes from the car package
library(car)

# First data set
data1 <- c(...)

# Scale the data (as.integer() truncates; no real need for integers, beyond prettier data)
data.shift <- as.integer(data1 * 50/150)

# Function to assess the power of a given sample size
power.of.levenes <- function(group1, group2, reps = 5000, size = 10) {
  results <- sapply(1:reps, function(r) {
    # Resample both groups with replacement
    resample1 <- sample(group1, size = size, replace = TRUE)
    resample2 <- sample(group2, size = size, replace = TRUE)
    # stack() gives a long data frame with columns 'values' and 'ind'
    test <- leveneTest(values ~ ind, data = stack(data.frame(resample1, resample2)))
    test$"Pr(>F)"[1]
  })
  # Proportion of resamples in which the test rejects at the 5% level
  sum(results < 0.05) / reps
}

# Calculate power
power.of.levenes(data1, data.shift, reps = 1000, size = 18)
And this will give me the power of my sample size for detecting the difference in variation I’m interested in.
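Since the real question is which sample size to pick, the same idea can be run over several candidate sizes. This is only a rough sketch under a few assumptions: `levene.p` is a base-R stand-in I wrote for Levene's test with median centring (so the example runs without extra packages), `power.at` is a hypothetical helper, and the pilot data are simulated rather than real.

```r
# Base-R stand-in for Levene's test with median centring (an assumption, used
# so this sketch needs no extra packages): one-way ANOVA on the absolute
# deviations from each group's median.
levene.p <- function(x, y) {
  dev <- c(abs(x - median(x)), abs(y - median(y)))
  grp <- factor(rep(c("x", "y"), times = c(length(x), length(y))))
  anova(lm(dev ~ grp))$"Pr(>F)"[1]
}

# Bootstrap estimate of power at one sample size (hypothetical helper)
power.at <- function(group1, group2, size, reps = 500) {
  p <- replicate(reps, levene.p(sample(group1, size, replace = TRUE),
                                sample(group2, size, replace = TRUE)))
  mean(p < 0.05)
}

set.seed(1)
pilot <- rexp(200, rate = 1 / 50)   # made-up skewed pilot data
scaled <- pilot * 50 / 150          # same shape, one third of the spread

sizes <- c(10, 20, 40, 80)          # hypothetical candidate sample sizes
power <- sapply(sizes, function(n) power.at(pilot, scaled, size = n))
data.frame(size = sizes, power = power)
```

Reading down the resulting table, the smallest size whose estimated power clears the target (0.8 is a common choice) is the one to go for. With only 500 replicates the estimates are a little noisy, so more replicates are worth the wait near the threshold.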