A new paper is now in press at Psychonomic Bulletin & Review, entitled A Meta-Analysis of the Survival Processing Advantage in Memory. The paper compares several meta-analytic techniques and bias-correction tools on the topic of survival processing. The abstract is posted below; check out the unformatted manuscript online at https://osf.io/6sd8e/.

“The survival processing advantage occurs when processing words for their survival value improves later performance on a memory test. Due to the interest in this topic, we conducted a meta-analysis to review literature regarding the survival processing advantage to estimate a bias-corrected effect size. Traditional meta-analytic methods were used, as well as the Test of Excessive Success, p-curve, p-uniform, trim and fill, PET-PEESE, and selection models to re-evaluate effect sizes while controlling for forms of small-study effects. Average effect sizes for survival processing ranged between ηp² = .06 and .09 for between-subjects experiments, and between .15 and .18 for within-subjects experiments after correcting for potential bias and selective reporting. Overall, researchers can expect to find medium to large survival processing effects, with selective reporting and bias correcting techniques typically estimating lower effects than traditional meta-analytic techniques.”

We recently submitted a paper to the Journal of Psychological Inquiry that focuses on the use of undergraduate learning assistants (ULAs) in Introductory Psychology classes at Missouri State University. Research has identified many problems for students associated with large class sizes; in particular, large classes limit opportunities for interaction between students and faculty. Missouri State University has implemented a program that uses ULAs to increase interaction between course staff and students, and the course has reaped additional benefits that we discuss. In the manuscript, we review the ways in which large courses hinder student success, discuss different ways to implement undergraduate assistants, and examine data reported by prior studies on the effectiveness of ULAs. The finalized manuscript will soon be uploaded to the Open Science Framework at the following link:




Recently, I’ve become very intrigued by combined forms of treatment for anxiety disorders and phobias. Until recently, the most beneficial treatment for social anxiety disorder was thought to be a combination of therapy and medication. However, a recent study by Nordahl et al. (2016) suggests that cognitive therapy alone is more effective than either medication or a combination of medication and therapy for treating social phobia. In this study, researchers examined self-reported social anxiety in people diagnosed with social anxiety disorder across three treatment conditions: cognitive therapy, medication, and a combination of the two. The results, which showed that standalone cognitive therapy was the most effective treatment, surprised me, as medication has become the most mainstream option.

In order to understand the implications of this study for contemporary treatment of social anxiety disorders, some background on the prevalence of mainstream medication for these disorders is needed. One example of a type of medication used for anxiety disorders is the selective serotonin reuptake inhibitor (SSRI), which blocks the reuptake of serotonin in the neurons. Jenkins, Nguyen, Polglaze, and Bertrand (2016) discuss the role serotonin plays in influencing both mood and cognition. Generally, low serotonin levels are related to a lowered state of mood, as well as to poorer performance on cognitive facets such as verbal reasoning and both episodic and working memory. Granted, other medications are used to treat anxiety disorders and associated psychopathology; however, the Nordahl et al. study only examined participants taking Paxil in its medication conditions.

IMS Health publishes information on the prevalence of different types of drugs in the United States as well as other countries. Not surprisingly, over 40,000,000 people in the United States were taking antidepressant medication as of April 2014. While research has shown that medication combined with therapy has been the most effective treatment for anxiety disorders and other psychopathology, clearly other forms of treatment need to be considered for specific anxiety disorders. Below is a breakdown of antidepressant medication use in the United States, reported by the Citizens Commission on Human Rights from IMS Health data.

Antidepressants

  0-5 years           110,516
    0-1 years          26,406
    2-3 years          46,102
    4-5 years          45,822
  6-12 years          686,950
  13-17 years       1,444,422
  18-24 years       2,860,537
  25-44 years      12,400,129
  45-64 years      16,185,388
  65+ years         8,566,579

  Grand total      41,226,394

Although this single study does not guarantee generalizability to other anxiety disorders and psychopathology, it presents an interesting finding regarding the effectiveness and efficacy of therapy for a specific type of anxiety disorder. More long-term studies of this nature are needed that compare self-reported symptom measures across the following treatment conditions: medication only, therapy only, and medication combined with therapy.

Hey all!

I wanted to write a post about the permutation test video I uploaded to YouTube. I have linked the video and put the materials up on the Advanced Statistics page.

I mainly wanted to cover the advantages of permutation tests:

  • You are not relying on some magical population. I hope I expressed this idea well in the video. The more I do research, the more I realize that populations are a thing of magic that just doesn’t exist – especially because, short of a lot of money, how are we supposed to randomly sample from that population anyway?
  • Those pesky assumptions! I am a big proponent of checking your assumptions – which is why all my videos have information about data screening in them. However, I am also guilty of being like “oh well shrug, there goes some power because what else am I supposed to do?”. Or even better … what do you do when all the reviewers only know ANOVA, and you do want to use something special? It’s a messy system we have going here.
  • They have a certain elegance to them … I test my data, and it turns out to be X statistic number. If I randomize that data, how many times do I get X or greater? How simple is that idea?
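That last idea can be sketched in a few lines of base R. The two groups and their means here are simulated purely for illustration – the point is the structure: compute the observed statistic, shuffle the labels many times, and count how often the randomized statistic is as extreme as the one you observed.

```r
# Permutation test sketch: difference in means between two groups.
set.seed(42)
group1 = rnorm(30, mean = 5.0, sd = 1)  # made-up data for illustration
group2 = rnorm(30, mean = 5.5, sd = 1)
observed = mean(group2) - mean(group1)

combined = c(group1, group2)
n1 = length(group1)
nperm = 5000

# Shuffle group labels and recompute the statistic each time
perm_stats = replicate(nperm, {
  shuffled = sample(combined)
  mean(shuffled[-(1:n1)]) - mean(shuffled[1:n1])
})

# p value: proportion of randomized statistics at least as extreme as observed
p_value = mean(abs(perm_stats) >= abs(observed))
p_value
```

If I randomize the data, how many times do I get my statistic or greater – that ratio is the p value, and that is the whole test.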

The hidden side of permutation tests is that they still rely on some form of probability and, potentially, the same faulty logic that we use now for null hypothesis significance testing. Additionally, I can see someone using the test to fish – if something is close, you could rerun the permutations until the result comes out your way.

I do know that I said something a bit wrong at the end of the video around 30 minutes in … you can’t really calculate a single F for the permutation test, because there are lots of F values (that’s the point). I would suggest reporting the p values, and potentially calculating F for the original test as MS Variable / MS Residuals, while making it very clear that the p value is for a permutation test. I also highly recommend adding eta or eta squared for effect size, using the SS Variable and SS Residual information provided in the table. If you compare the aovp() output to a regular ANOVA, you will find the SS and MS values are approximately the same, but p changes based on the randomization results.
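For the effect size piece, here is a small sketch of that calculation. The SS and df values are made-up stand-ins for whatever your aovp() table actually prints:

```r
# Eta squared from an ANOVA table: SS Variable / (SS Variable + SS Residual).
# These numbers are hypothetical placeholders for real aovp() output.
SS_variable = 45.2
SS_residual = 210.7
eta_squared = SS_variable / (SS_variable + SS_residual)
round(eta_squared, 3)

# The F "for the original test" would be MS Variable / MS Residuals:
df_variable = 2
df_residual = 87
F_original = (SS_variable / df_variable) / (SS_residual / df_residual)
```

Report eta squared alongside the permutation p value, and label the p value clearly as coming from the randomization.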


Pase et al. (2017) recently published an article in the journal Stroke entitled “Sugar- and Artificially Sweetened Beverages and the Risks of Incident Stroke and Dementia”. This publication was quickly picked up by the mainstream media, and it was off to the races from there. News article titles included:

Daily dose of diet soda tied to triple risk of deadly stroke

Diet sodas may be tied to stroke, dementia risk

Diet sodas tied to dementia and stroke

Diet Sodas May raise risk of dementia and stroke, study finds

Diet soda can increase risk of dementia and stroke, study finds

With headlines like these at the fingertips of millions of people, it is important to retain a healthy dose of scientific skepticism, especially with suggestions like this one:

“if you’re partial to a can of Pepsi Max at lunch, or enjoy a splash of Coke Zero with your favorite rum – you might want to put that drink back on ice”

The driving point of the article was that drinking artificially sweetened drinks was associated with a 3x increase in the risk of incident stroke and dementia. To be fair, while some news outlets may have overstated the results from Pase et al., others included fair points about the article – for instance, that this connection was found only for artificially sweetened beverages, with no link for other sugary beverages (i.e., sugar-sweetened sodas, fruit juice). Some news articles rightly raised the classic “correlation does not equal causation” point. The lead author of the paper even noted in an interview that “In our study, 3% of the people had a new stroke and 5% developed dementia, so we’re still talking about a small number of people developing either stroke or dementia”. With 2,888 participants analyzed for incident stroke and 1,484 observed for new-onset dementia, that translates into roughly 87 people who suffered a stroke and roughly 74 people who developed new-onset dementia.

Pase et al. (2017) did adjust for age, sex, caloric intake, education, diet, exercise, and smoking, among other things. Interestingly, however, they did not control for multiple comparisons, which is the bigger point I would like to raise in this post. Whenever a researcher runs multiple tests on the same dataset (for instance, when a treatment has 3 or more levels), runs extra analyses on a subset of a dataset, or runs extra tests on variables that weren’t previously considered, the risk of Type I error increases. Type I error, a.k.a. a “false positive”, occurs when you find an effect when no effect actually exists in the population. Running multiple tests yields more chances that an effect will be found, thus inflating the risk of a Type I error. An easy solution is to divide the alpha criterion (usually .05) by the number of tests being run – the Bonferroni correction, a very popular Type I error correction because it is easy to calculate manually. For instance, if you are running 10 t-tests on the same dataset, you could adjust alpha to .005 (.05/10). Controlling for multiple comparisons this way, an effect would be deemed significant only if the p-value fell below .005, not the typical .05.
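To see why the risk inflates, and what the Bonferroni correction does about it, here is a quick sketch in base R; the p-values at the end are made up purely for illustration:

```r
# Familywise Type I error rate across k independent tests at alpha = .05:
# P(at least one false positive) = 1 - (1 - alpha)^k
alpha = .05
k = 10
familywise = 1 - (1 - alpha)^k  # about .40 across 10 tests

# Bonferroni: either test each p against alpha / k ...
alpha / k  # .005

# ... or equivalently inflate the p values themselves with base R's p.adjust
# (hypothetical p values for illustration):
p_values = c(.001, .008, .020, .049)
p.adjust(p_values, method = "bonferroni")
```

With 10 tests, the chance of at least one false positive is roughly 40%, not 5% – which is exactly why an unadjusted table of many comparisons should be read with caution.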

How much does this actually matter? Would adjusting for multiple comparisons yield any meaningful changes in statistical interpretation? To investigate this, I simulated 100 datasets, each with a sample size of 100, assuming a medium effect size. Data were generated as Likert-type data ranging from 1 to 7. One factor with five levels was considered (with means of 2.0, 2.3, 2.6, 2.9, 3.2). Post-hoc t-tests were then run for all pairwise comparisons, both with no p-value adjustment and with a Bonferroni correction. The number of significant p-values was then logged in both cases for each dataset. The R code used for this simulation can be found at the end of this post.

Simulation results revealed, probably not to anyone’s surprise, that yes, there is a difference in the number of significant p-values found, depending on whether one controls for multiple comparisons. Out of 10 possible comparisons, on average, post-hoc t-tests revealed more significant p-values (M = 5.60, SD = 1.33) when you don’t control for multiple comparisons than when you do (M = 3.64, SD = 1.38), t(99) = 17.25, p < .001, davg = 1.45, 95% CI [1.16, 1.73].

Turning back to Pase et al. (2017), what effect would this have had on their conclusions, considering they did not control for multiple comparisons? Below is a snapshot of beverage intake and the risk of stroke from Pase et al.

Apologies if the table is a bit blurry (it might be better to look up the article directly), but the top two-thirds of the table show that neither total sugary beverages nor sugar-sweetened soft drinks had any significant effect on the risk of stroke, as the p-values are above the assumed alpha criterion of .05. The bottom third of the table shows artificially sweetened soft drinks. Looking at just the results from Model 3, which controlled for the most potentially confounding factors, we see that 6 of the 8 p-values are significant. However, using the simple adjustment of .05/8 tests, our new alpha criterion is .00625, and none of the reported p-values fall below it.

Considering dementia, total sugary beverages and sugar-sweetened soft drinks both remained non-significant. Using Model 3 (the authors’ most conservative model), none of the p-values were significant for artificially sweetened drink intake even before controlling for multiple comparisons. If we consider that these regression models already control for other variables (i.e., reduced df for including many predictors), we could reduce the number of comparisons to 4 (recent intake/cumulative intake by stroke type), and the corrected alpha would be .05/4 = .0125. Given that the table is only precise to two decimals, it is difficult to tell whether the .01 values would still be considered significant. By not employing this simple adjustment, the authors increased their risk of Type I error; once the adjustment is applied, the conclusions of the paper change drastically, from significant effects to none. By making sure we control for multiple comparisons, we can avoid problems such as false positives and make for better, more robust science.

Given the large sample size and the large number of models employed here (24 regressions!), we must be careful in our interpretation – especially given that we are predicting very small categories, which is always a difficult science. The use of an effect size in the table is encouraging, especially with confidence intervals. These confidence intervals tell an even more revealing story – many of them include or sit very close to a hazard ratio of 1 (i.e., no difference in risk between groups) and are quite wide, indicating we don’t yet have a good picture of the relationship between these drinks and stroke.

R Syntax

# set up
library(data.table) # data.table(), melt()
library(mvtnorm)    # rmvnorm()
library(MOTE)       # d.dept.avg() (d.dep.t.avg in current releases)
library(ggplot2)

Means = c(2.0, 2.3, 2.6, 2.9, 3.2)
N = 100
round = 0
sig_data = data.table(no = 1:N,
                      yes = 1:N)

for (i in 1:N) { # start loop

  # create data: the covariance matrix was truncated in the original post;
  # assuming variance 3 on the diagonal and zero covariance
  sigma = diag(3, nrow = 5)
  dataset = as.data.table(rmvnorm(N, Means, sigma))
  dataset = round(dataset, digits = 0)
  dataset[dataset < 1] = 1 # clamp to the 1-7 Likert range
  dataset[dataset > 7] = 7
  dataset$partno = as.factor(1:nrow(dataset))
  longdataset = melt(dataset, id.vars = "partno")

  # pairwise comparisons
  round = round + 1
  noadjust = pairwise.t.test(longdataset$value,
                             longdataset$variable,
                             paired = TRUE,
                             p.adjust.method = "none")
  x = na.omit(as.vector(noadjust$p.value)) # all 10 pairwise p values
  sig_data$no[round] = sum(x < .05)

  yesadjust = pairwise.t.test(longdataset$value,
                              longdataset$variable,
                              paired = TRUE,
                              p.adjust.method = "bonferroni")
  y = na.omit(as.vector(yesadjust$p.value))
  sig_data$yes[round] = sum(y < .05)

} # end loop

## Differences in number of significant p-values?
t.test(sig_data$no, sig_data$yes, paired = TRUE)

m1 = mean(sig_data$no)
sd1 = sd(sig_data$no)
n = length(sig_data$no)
m2 = mean(sig_data$yes)
sd2 = sd(sig_data$yes)
d.dept.avg(m1 = m1, m2 = m2, sd1 = sd1, sd2 = sd2, n = n, a = .05, k = 2)

# plot the two counts
sig_data$partno = 1:nrow(sig_data)
noout = melt(sig_data, id.vars = "partno")

theme = theme(panel.grid.major = element_blank(),
              panel.grid.minor = element_blank(),
              panel.background = element_blank(),
              axis.line.x = element_line(colour = "black"),
              axis.line.y = element_line(colour = "black"),
              legend.key = element_rect(fill = "white"),
              text = element_text(size = 15))

ggplot(noout, aes(variable, value)) +
  stat_summary(fun.y = mean,
               geom = "point", size = 2, fill = "gray", color = "gray") +
  stat_summary(fun.data = mean_cl_normal,
               geom = "errorbar", position = position_dodge(width = 0.90),
               width = 0.2) +
  theme +
  xlab("Controlling for Multiple Comparisons") +
  ylab("Number of Significant p-values") +
  scale_x_discrete(labels = c("None", "Bonferroni"))


All blogs have to start somewhere, so I wanted to give a quick introduction. I am an Associate Professor of Quantitative Psychology at Missouri State University. I teach a lot of stuff, mostly related to statistics: baby stats (undergraduate basics), advanced stats (an undergraduate/graduate mix of multivariate methods), graduate stats (graduate basics), and structural equation modeling. I run the Statistics and Research Design certificate program at MSU and work closely with the Experimental Psychology track in our master’s program.

My research focuses on computational linguistics and applied statistics, which you can read a whole lot more about on my website. I would describe my language work as being interested in the types of psycholinguistic variables we use and the ways we use them, and in how these variables relate to judgments and memory. Statistically speaking, I usually help others explore how they might analyze their data, but more recently I have become interested in the way we do business in statistics (i.e., understanding how our statistics work and function under different scenarios) and in how to teach statistics.

Here on the blog, we will post all sorts of information, including links to new help videos, discussions about statistics in the real world, promotion of our research papers, and any random thoughts that might cross the brain. My goal is not only to promote what we are doing as scientists, but also to teach anyone interested in how we did our work and to promote the Open Science Framework.

I also have purple hair, much to the amusement of my students and small children.