I wanted to write a post about the permutation test video I uploaded to YouTube. I have linked the video and put up the materials on the Advanced Statistics page.
I mainly wanted to cover the advantages of permutation tests:
- You are not relying on some magical population. I hope I expressed this idea well in the video. The more I do research, the more I realize that populations are a thing of magic that just doesn’t exist – especially because, short of a lot of money, how are we supposed to randomly sample from that population anyway?
- Those pesky assumptions! I am a big proponent of checking your assumptions – which is why all my videos have information about data screening in them. However, I am also guilty of being like “oh well shrug, there goes some power because what else am I supposed to do?”. Or even better … what do you do when all the reviewers only know ANOVA, and you do want to use something special? It’s a messy system we have going here.
- They have a certain elegance to them … I test my data, and the test statistic comes out to some value X. If I randomize that data, how many times do I get X or greater? How simple is that idea?
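To make that last idea concrete, here is a minimal sketch of the "shuffle and count" logic for two groups. It is not the code from the video: the function name `permutation_p_value`, the choice of the absolute difference in means as the statistic, and the default number of shuffles are all assumptions made for illustration.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def permutation_p_value(group_a, group_b, n_perm=5000, seed=42):
    """Two-sided permutation p value for a difference in group means.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # The statistic "X" computed on the data as actually observed.
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    count = 0
    for _ in range(n_perm):
        # Randomize the group labels by shuffling the pooled scores.
        rng.shuffle(pooled)
        stat = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        # Count shuffles giving X or greater (small tolerance for float noise).
        if stat >= observed - 1e-12:
            count += 1
    # Add 1 to both parts so the observed arrangement counts as one shuffle.
    return (count + 1) / (n_perm + 1)


# Made-up scores, not data from the video.
print(permutation_p_value([1, 2, 3, 4, 5, 6], [11, 12, 13, 14, 15, 16]))
```

The p value is just the proportion of shuffles that produced a statistic at least as large as the one you actually observed.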
The hidden side of permutation tests is that they still rely on some form of probability, and potentially, the same faulty logic that we use now for null hypothesis significance testing. Additionally, I can see someone running the test to fish – because every run uses a different set of random shuffles, the p value moves around a little each time, so if something is close, you could rerun the permutations until it comes out your way.
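The fishing worry is easy to see in a small sketch. The helper `perm_p`, the scores, and the deliberately small number of shuffles are all made up for illustration; the point is only that the p value depends on the random seed, while the same seed always gives the same answer.

```python
import random


def perm_p(a, b, n_perm, seed):
    """Two-sided permutation p value for a difference in means (hypothetical helper)."""
    rng = random.Random(seed)
    pooled = list(a) + list(b)
    n_a = len(a)

    def diff(xs):
        return abs(sum(xs[:n_a]) / n_a - sum(xs[n_a:]) / (len(xs) - n_a))

    observed = diff(pooled)  # computed before any shuffling
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if diff(pooled) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Made-up scores for two groups.
a = [4.1, 5.3, 6.0, 5.8, 4.9, 6.4]
b = [5.9, 6.8, 6.1, 7.2, 5.5, 7.0]

# Same data, ten different seeds: the p value wobbles from run to run.
p_values = [perm_p(a, b, n_perm=200, seed=s) for s in range(10)]
print(min(p_values), max(p_values))
```

One possible guard, offered as a suggestion rather than a rule, is to decide on the seed and the number of permutations before looking at the result, and to report both.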
I do know that I said something a bit wrong at the end of the video, around 30 minutes in … you can’t really calculate F for the permutation test, because there are lots of F values (that’s the point). I would suggest reporting the p value and potentially calculating F for the original data as MS Variable / MS Residual, while making it very clear that the p value comes from a permutation test. I also highly recommend adding eta or eta squared for effect size, using the SS Variable and SS Residual information provided in the table. If you compare the aovp() output to a regular ANOVA, you will find the SS and MS are approximately the same, but p changes based on the randomization results.
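As a rough sketch of that arithmetic, here is how F and eta squared fall out of the SS and df columns. The function name and the numbers are invented for illustration and are not output from aovp(). With a single factor, SS Variable / (SS Variable + SS Residual) is eta squared; with more than one factor in the model, the same formula gives partial eta squared instead.

```python
def f_and_eta_squared(ss_variable, df_variable, ss_residual, df_residual):
    """F for the original data and eta squared, from an ANOVA-style table.

    Hypothetical helper; pass in the values read off your own table.
    """
    ms_variable = ss_variable / df_variable
    ms_residual = ss_residual / df_residual
    f_value = ms_variable / ms_residual
    # Eta squared for a one-way design (partial eta squared otherwise).
    eta_sq = ss_variable / (ss_variable + ss_residual)
    return f_value, eta_sq


# Made-up table values: SS Variable = 40 on 2 df, SS Residual = 120 on 27 df.
f_value, eta_sq = f_and_eta_squared(40.0, 2, 120.0, 27)
print(f_value, eta_sq)
```

When writing this up, it helps to say explicitly that F comes from the observed data and p from the permutation distribution.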