StatsTools
Hi everyone!

I have been super swamped with a bunch of due dates that all hit in April. Here's a small brag, since I like making lists:

  • 9 revise and resubmits (four we’ve sent back, two have been accepted!)
  • 4 conference posters and one invited talk
  • 1 submitted grant (fingers crossed!)
  • 2 invited papers
  • 2 theses that I’m chairing, 2 that I’m on the committee for
  • Data camp!

It’s been nuts, so I haven’t left the house much or done much of anything else. Anywho, I thought I would share my mediation and moderation talk information for the conference.

You can get it here: https://osf.io/t3syq/

The information includes SPSS and R guides for mediation/moderation, including bootstrapped confidence intervals for the indirect effect. These CI values give you more to talk about rather than saying “fully” or “partially” mediated based on some magic “p” value change (don’t do this). I will record my talk and put it online on our YT page as well. Enjoy!
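If you want a feel for what a bootstrapped confidence interval for the indirect effect looks like before opening the guides, here is a minimal sketch in base R. It is not the code from the OSF materials: the data are simulated, the variable names (x, m, y) and the helper indirect_effect() are made up for illustration, and I am assuming a simple percentile bootstrap of the a*b path.

```r
# Minimal sketch: percentile bootstrap CI for an indirect effect (a * b).
# Assumptions: simulated data, hypothetical names, simple x -> m -> y model.
set.seed(42)
n <- 200
x <- rnorm(n)                        # predictor
m <- 0.5 * x + rnorm(n)              # mediator
y <- 0.4 * m + 0.2 * x + rnorm(n)    # outcome
dat <- data.frame(x, m, y)

# Hypothetical helper: a path (x -> m) times b path (m -> y, controlling for x)
indirect_effect <- function(d) {
  a <- coef(lm(m ~ x, data = d))["x"]
  b <- coef(lm(y ~ m + x, data = d))["m"]
  unname(a * b)
}

# Resample rows with replacement and recompute the indirect effect each time
n_boot <- 1000
boot_est <- replicate(n_boot, {
  rows <- sample(nrow(dat), replace = TRUE)
  indirect_effect(dat[rows, ])
})

# Percentile 95% confidence interval
ci <- quantile(boot_est, c(0.025, 0.975))
print(indirect_effect(dat))
print(ci)
```

The interval gives you a range for the size of the indirect effect to report and discuss, instead of a "fully" or "partially" label.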

My coauthor John Scofield and I just had a publication accepted at Behavior Research Methods – you can check out the publication preprint at OSF.

We threw together a website for the paper that summarizes everything we found and puts all the materials together in one place. Check it out.

We created a really nice R function to help you detect low quality data, which you can find on GitHub, and I even made a video on YouTube that explains all the parts of the function.

If you aren’t an R person, you can use our Shiny App, download the code, and watch the YouTube video that explains everything to you.

Enjoy!

Heyo! I wanted to write a post about some of the quirky things I’ve found with writing manuscripts in R Markdown, as well as provide a solution to a problem that someone else might be having.

Update: The csl file I describe below is a special formatted one, which was shared with me. You can download it from GitHub to try the suggestions below.

Update 2: Turns out, potentially, the suggestions from the manual are not working correctly, as Frederik has checked it out and opened an issue on github. I’ll write a new post when there are updates!

First, let me tell you how much I love Frederik Aust’s papaja package for R. I had been trying to integrate open science and transparency in our lab, which was helped by the switch to R to track what we were doing in our data analysis. I heard about papaja through a former student, and I jumped in head first. I know it’s helped us think a LOT about reproducibility and replication, as we want people to be able to track what we did and avoid p-hacking in our papers. Having a workflow that is integrated throughout the manuscript really forces you to think about how you are presenting your data, and knowing that others can view it especially forces you to be clear about what you did. We’ve fully embraced working transparently through Open Science Framework integration, much of our work is on GitHub, and we are writing manuscripts with papaja to make it more obvious what is what.

Before doing that, I had started learning markdown, and although I’ve been using it for a bit now, I still feel like a noob. Mix LaTeX in there, and even more so. Thankfully, I have some very awesome twitter friends that help me when I get stuck in trying to do something … like trying to stick a % symbol in a column name for a table. Whew. One thing I wish were a little bit different is citations. Currently, papaja uses pandoc-citeproc to create the text referencing for knitting to PDF or Word.

The problem with this is that any time you have the same author last names (like Erin Buchanan and Tom Buchanan), you automatically get E. Buchanan and T. Buchanan in the in-text referencing. That is APA style, but reviewers and the like do not like it. Real APA != Used APA. The other issue stems from the fact that you will get the first initials even if the other author name match is in second or third place. Therefore, if I cite myself and cite Tom but he only appears as second author, I will still get E. Buchanan in the in-text citation. That’s probably also a correct interpretation of APA but ain’t worth fighting reviewers over. Additionally, the absolute name matching often forces us to fix bibtex files a lot over things like Buchanan, E. versus Buchanan, E.M. versus Buchanan, Erin etc. The many different permutations of one person’s name that come from differences in doi citations can be tedious to fix.

Therefore! I checked out the papaja manual – which is stellar – to see if there was some other way to do it. I also googled this, but really got stuck with the translation of latex to markdown. The manual suggests you can do this:

---
output:
  papaja::apa6_pdf:
    citation_package: biblatex
---

To pass the citations through a different processor. Great! I will try that.

Latexmk: This is Latexmk, John Collins, 19 Jan. 2017, version: 4.52c.
Latexmk: applying rule 'biber QWERTY'...
Rule 'biber QWERTY': File changes, etc:
   Non-existent destination files:
      'QWERTY.bbl'
------------
Run number 1 of rule 'biber QWERTY'
------------
------------
Running 'biber  "QWERTY"'
------------
INFO - This is Biber 2.7
INFO - Logfile is 'QWERTY.blg'
ERROR - QWERTY.bcf is malformed, last biblatex run probably failed. Deleted QWERTY.bbl
INFO - ERRORS: 1
Latexmk: biber found malformed bcf file for 'QWERTY'.
  I'll ignore error, and delete any bbl file.
Rule 'pdflatex': File changes, etc:
   Non-existent destination files:
      'QWERTY.pdf'
------------
Run number 1 of rule 'pdflatex'
------------
Biber error: [427] Utils.pm:180> ERROR - QWERTY.bcf is malformed, last biblatex run probably failed. Deleted QWERTY.bbl
Latexmk: applying rule 'pdflatex'...
------------
Running 'pdflatex  -halt-on-error -interaction=batchmode -recorder  "QWERTY.tex"'
------------
This is pdfTeX, Version 3.14159265-2.6-1.40.18 (TeX Live 2017) (preloaded format=pdflatex)
restricted \write18 enabled.
entering extended mode
Latexmk: Non-existent bbl file 'QWERTY.bbl'
No file QWERTY.bbl.
=== TeX engine is 'pdfTeX'
Biber error: [427] Utils.pm:180> ERROR - QWERTY.bcf is malformed, last biblatex run probably failed. Deleted QWERTY.bbl
Latexmk: Errors, so I did not complete making targets
Collected error summary (may duplicate other messages):
  pdflatex: Command for 'pdflatex' gave return code 1
      Refer to 'QWERTY.log' for details
Latexmk: Use the -f option to force complete processing,
unless error was exceeding maximum runs of latex/pdflatex.
! LaTeX Error: Command \c@author already defined.
               Or name \end... illegal, see p.192 of the manual.
 
Error: Failed to compile QWERTY.tex. See QWERTY.log for more info.
Execution halted

Balls. I searched this error for a while and found: 1) update LaTeX: check, 2) figure out why your bibtex was messed up: check … tried with only one reference and still crashed, and 3) other stuff I don’t remember. When I tried a separate markdown file, thinking the one that I had open was the problem, I got the actual citation codes, rather than the text:

Researchers discovered that online data collection can be 
advantageous over laboratory and paper data collection, as it 
is often cheaper and more efficient (Ilieva2001;Schuldt1994;Reips2012)

I thought maybe it was my computer, so one of my coauthors tried it. Same as the first error. Maybe it’s a mac thing? Another coauthor with a mac got the second error. I’m sad to say that I don’t have an answer for either of these problems. From the looks of it, I’m following the guidelines suggested, but both problems pop up. I would love to hear if you know why.

Enter Julia! Julia helped find a workaround for the issue. In the head of your markdown file (note I used some … to shorten some of what papaja does for you automatically):

...
bibliography      : ["q_bib.bib"]
...
output            : papaja::apa6_pdf
replace_ampersands: yes
csl               : apa6.csl
---

And then be sure to put the apa6.csl in the same folder as your markdown. Now, you can confuse people with all your Buchanans, Logans, Cohens, and Fritzs. Or, in our case, we can make Reviewer #2 happy and annoy the copy editor.

Note: I had to update papaja to get this solution to work, as the replace ampersands did not work the first time.

For a recent publication comparing null hypothesis testing p-values to Bayes Factors and Observation Oriented Modeling, we created a Shiny app to graph all of our complex plots. I am particularly pleased with the plotly 3D graph, as I usually think that 3D graphs are impossible to read. This plot shows what we found in our study (albeit I would recommend viewing the 2D plots more):

  • Bayes Factors and p-values follow a power function, as we expected.
  • Bayes Factors and OOM values follow an interesting pattern, wherein as sample size increases, BF expands outwards, while PCC values tend to constrict.
  • p-values will always decrease to floor, and PCC values still tend to constrict toward the simulated effect size range.

Another component of this app I wanted to show off was the interactive response points, wherein the input options (on the left) change based on a user selected input option. Therefore, options that are normally only input are both input and output in the traditional Shiny set up.

You can see that by having the selection (first part) and the changing selection (second part) in the fluid page:

selectInput("Nselect", "Select N Scaling:",
                  c("N" = "N",
                    "Log N" = "log")),
                    
htmlOutput("slider_selector")

Which is connected to the server function below:

  ####change the slider####
  output$slider_selector = renderUI({ 
    
    if (input$Nselect == "N") { minN = 10; maxN = 1000; stepN = 10}
    if (input$Nselect == "log") { minN = round(log(10),1) 
                                  maxN = round(log(1000),1)
                                  stepN = .1}
    
    sliderInput("xaxisrange", "X-Axis Range:",
                min = minN, max = maxN,
                value = c(minN,maxN),
                sep = "",
                round = -1,
                step = stepN)
  })

These two pieces feed information back and forth depending on the user input to show either X on a real scale or X on a log scale.

Code is included below, and when our server isn’t being cranky, the app is here. The code is pretty long due to the sheer number of graphs, so it’s edited down to just the shiny parts. When you see ####GRAPH####, that’s some kicking ggplot2 graphs you can view in our github repo.

Check out the project OSF page here. You can download the entire app from our github repo (also other shiny apps!).

library(shiny)
library(ggplot2)
library(reshape)
library(plotly)

####remove data loading and reshaping####

####user interface####
ui <- fluidPage (
  
  titlePanel("Valentine et al. Interactive Graphics"),
  
  sidebarLayout(
    
    ##sidebarpanel
    sidebarPanel(
      
      br(),
      
      ##put input boxes here
      tags$em("All Graphs:"),
      selectInput("sizeselect", "Select Effect Size:",
                  c("Negligible" = "None",
                    "Small" = "Small",
                    "Medium" = "Medium",
                    "Large" = "Large")),
      
      tags$em("Percent Graphs:"),
      selectInput("Nselect", "Select N Scaling:",
                  c("N" = "N",
                    "Log N" = "log")),
      
      htmlOutput("slider_selector"),
      
      tags$em("Comparison Graphs:"),
      
      selectInput("graphselect", "Select Graph:",
                  c("PCC - p" = "pccp",
                    "PCC - BF" = "pccbf",
                    "BF - p" = "bfp")),
      
      sliderInput("bfrange", "Log BF Range:",
                  min = -5, max = 600,
                  value = c(-5,600),
                  sep = "",
                  step = 10),
      
      sliderInput("prange", "p Range:",
                  min = 0, max = 1,
                  value = c(0,1),
                  step = .01),
      
      sliderInput("pccrange", "PCC Range:",
                  min = 0, max = 1,
                  value = c(0,1),
                  step = .01)
      
    ), #close sidebar panel
    
    mainPanel(
      
      tabsetPanel(
        tabPanel("Significant", plotOutput("sigpic"),
                 br(),
                 helpText("Complete dataset available at: https://osf.io/u9hf4/")),
        tabPanel("Non-Significant", plotOutput("nonpic"),
                 br(),
                 helpText("Complete dataset available at: https://osf.io/u9hf4/")),
        tabPanel("Omnibus Agreement", plotOutput("omniagree"),
                 br(),
                 helpText("Complete dataset available at: https://osf.io/u9hf4/")),
        tabPanel("Posthoc Agreement", plotOutput("postagree"),
                 br(),
                 helpText("Complete dataset available at: https://osf.io/u9hf4/")),
        tabPanel("Criterion Comparison", plotOutput("compare"), 
                 br(),
                 helpText("Complete dataset available at: https://osf.io/u9hf4/",br(), 
                          "BF values have been log transformed to show the entire range of the data.")),
        tabPanel("3D Comparison", plotlyOutput("compare3d"), 
                 br(),
                 helpText("Complete dataset available at: https://osf.io/u9hf4/",br(), 
                          "BF values have been log transformed to show the entire range of the data."))
      )
      
    ) #close main panel 
    
  ) #close sidebar layout

) #close fluid page

####server functions####
server <- function(input, output) {
   
  ####change the slider####
  output$slider_selector = renderUI({ 
    
    if (input$Nselect == "N") { minN = 10; maxN = 1000; stepN = 10}
    if (input$Nselect == "log") { minN = round(log(10),1) 
                                  maxN = round(log(1000),1)
                                  stepN = .1}
    
    sliderInput("xaxisrange", "X-Axis Range:",
                min = minN, max = maxN,
                value = c(minN,maxN),
                sep = "",
                round = -1,
                step = stepN)
  })
  
   ####SIGNIFICANT EFFECTS####
   output$sigpic <- renderPlot({

     graphdata = subset(long_graph, Significance=="Sig" & Effect == input$sizeselect)
     
     ##log N
     if (input$Nselect == "log") { graphdata$N = log(graphdata$N) 
                                    xlabel = "Log N" } else { xlabel = "N"}
     
     ####GRAPH####
   })
   
   ####NONSIGNIFICANT EFFECTS####
   output$nonpic <- renderPlot({
     
     nsgraphdata = subset(long_graph, Significance=="Non" & Effect == input$sizeselect)
     
     ##log N
     if (input$Nselect == "log") { nsgraphdata$N = log(nsgraphdata$N)  
                                   xlabel = "Log N" } else { xlabel = "N"}
     
     ####GRAPH####
   })
   
   ####OMNIBUS AGREEMENT####
   output$omniagree <- renderPlot({
     
     ##log n to get a better graph
     if (input$Nselect == "log") { agreelong$N = log(agreelong$N)
                                   xlabel = "Log N" } else { xlabel = "N"}
     
     ####GRAPH####
   })
   
   ####POST HOC AGREEMENT####
   output$postagree <- renderPlot({
     
     ##log n to get a better graph
     if (input$Nselect == "log") { agreelong$N = log(agreelong$N)
     xlabel = "Log N" } else { xlabel = "N"}
     
     ####GRAPH####
   })
   
   ####COMPARISON GRAPHS####
   output$compare <- renderPlot({
     
     if (input$graphselect == "pccp"){
       
       ####GRAPH####
       
     } else if (input$graphselect == "pccbf"){
       
       ####GRAPH####
       
     } else if (input$graphselect == "bfp"){
       
       ####GRAPH####
       
     }
     
   })
   
   ####3D COMPARISON GRAPHS####
   output$compare3d <- renderPlotly({
     
     ####GRAPH SET UP####
     
     overall = plot_ly(overallgraph3d, 
                       x = ~overallBF,
                       y = ~oompcc,
                       z = ~omniP,
                       color = ~N,
                       symbol = ~star,
                       symbols=c("circle","cross"),
                       mode="markers") %>%
       add_markers() %>%
       layout(scene = list(xaxis = list(title = 'Bayes Factors'),
                           yaxis = list(title = 'OOM PCC'),
                           zaxis = list(title = 'p-Value')),
              annotations = list(
                x = 1.13,
                y = 1.05,
                text = colorlabel,
                xref = 'paper',
                yref = 'paper',
                showarrow = FALSE
              ))
     
     overall
     
   })
   
} #close server functions

# Run the application 
shinyApp(ui = ui, server = server)
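
If the full listing is a lot to wade through, here is a stripped-down sketch of just the dynamic slider trick from the app above. It reuses the same input ids (Nselect, slider_selector, xaxisrange) and the same limits, but the slider_bounds() helper and the chosen_range text output are hypothetical additions for illustration, and I swapped htmlOutput() for uiOutput(), which plays the same role here.

```r
library(shiny)

# Hypothetical helper: slider limits for the chosen scale.
# Limits mirror the app above (N runs from 10 to 1000).
slider_bounds <- function(scale) {
  if (scale == "log") {
    list(min = round(log(10), 1), max = round(log(1000), 1), step = 0.1)
  } else {
    list(min = 10, max = 1000, step = 10)
  }
}

ui <- fluidPage(
  # The input that controls what the slider looks like
  selectInput("Nselect", "Select N Scaling:",
              c("N" = "N", "Log N" = "log")),
  # Placeholder that the server fills in with a slider
  uiOutput("slider_selector"),
  textOutput("chosen_range")
)

server <- function(input, output) {
  # Rebuild the slider whenever the scale selection changes
  output$slider_selector <- renderUI({
    b <- slider_bounds(input$Nselect)
    sliderInput("xaxisrange", "X-Axis Range:",
                min = b$min, max = b$max,
                value = c(b$min, b$max),
                step = b$step)
  })

  # The slider is created by the server, so wait until it exists
  output$chosen_range <- renderText({
    req(input$xaxisrange)
    paste("Showing", input$xaxisrange[1], "to", input$xaxisrange[2])
  })
}

app <- shinyApp(ui = ui, server = server)
# Only launch when running interactively (e.g., from RStudio)
if (interactive()) runApp(app)
```

Pulling the limits into a small function keeps the if statements out of renderUI() and makes them easy to check on their own. Another route would be a fixed sliderInput() in the UI plus updateSliderInput() in the server, but renderUI() stays closest to how the app above is written.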