Critiques of Null Hypothesis Significance Testing in Applied Linguistics

The following is an in-progress bibliography of applied linguists who have joined the longstanding chorus of quantitative methodologists and researchers in other social sciences cautioning against the use of null hypothesis significance testing (NHST; i.e., p values and tests of statistical significance).

Brown, J. D. (2011). Quantitative research in second language studies. In E. Hinkel (Ed.), Handbook of research on second language teaching and learning (Vol. 2). New York: Routledge.

Crookes, G. (1991). Power, effect size, and second language research. Another researcher comments. TESOL Quarterly, 25, 762-765.

Egbert, J., Plonsky, L., & Schwander, M. (2014, March). The linguistic and stylistic features of SLA conference abstracts and their relationship to ratings. Paper presented at the conference of the American Association for Applied Linguistics (AAAL), Portland, OR.

Gass, S., Mackey, A., Alvarez-Torres, M. J., & Fernández-García, M. (1999). The effects of task repetition on linguistic output. Language Learning, 49, 549-581.

Gries, S. Th. (2005). Null-hypothesis significance testing of word frequencies: a follow-up on Kilgarriff. Corpus Linguistics and Linguistic Theory, 1, 277-294.

Gries, S. Th. (2006). Some proposals towards a more rigorous corpus linguistics. Zeitschrift für Anglistik und Amerikanistik, 54, 191-202.

Kilgarriff, A. (2005). Language is never, ever, ever, random. Corpus Linguistics and Linguistic Theory, 1, 263-276.

Larson-Hall, J. (2010). A guide to doing statistics in second language research using SPSS. New York: Routledge.

Larson-Hall, J., & Plonsky, L. (2015). Reporting and interpreting quantitative research findings: What gets reported and recommendations for the field. Language Learning, 65, Supp. 1, 127-159.

Lazaraton, A. (1991). Power, effect size, and second language research. A researcher comments. TESOL Quarterly, 25, 759-762.

Nassaji, H. (2012). Significance tests and generalizability of research results: A case for replication. In G. Porte (Ed.), Replication research in applied linguistics (pp. 92-115). Cambridge: Cambridge University Press.

Norris, J. M., & Ortega, L. (2000). Effectiveness of L2 instruction: A research synthesis and quantitative meta-analysis. Language Learning, 50, 417–528.

Norris, J. M., & Ortega, L. (2006). The value and practice of research synthesis for language learning and teaching. In J. M. Norris & L. Ortega (Eds.), Synthesizing research on language learning and teaching (pp. 3–50). Philadelphia: Benjamins.

Norris, J. M. (2015). Statistical significance testing in second language research: Basic problems and suggestions for reform. Language Learning, 65 (Supp. 1), 97–126.

Norris, J. M., Plonsky, L., Ross, S. J., & Schoonen, R. (2015). Guidelines for reporting quantitative methods and results in primary research. Language Learning, 65, 470-476.

Oswald, F. L., & Plonsky, L. (2010). Meta-analysis in second language research: Choices and challenges. Annual Review of Applied Linguistics, 30, 85-110.

Plonsky, L. (2009, October). “Nix the null”: Why statistical significance is overrated. Paper presented at the Second Language Research Forum (SLRF), East Lansing, MI.

Plonsky, L. (2011a). The effectiveness of second language strategy instruction: A meta-analysis. Language Learning, 61, 993-1038.

Plonsky, L. (2011b). Study Quality in SLA: A Cumulative and Developmental Assessment of Designs, Analyses, Reporting Practices, and Outcomes in Quantitative L2 Research. Unpublished doctoral dissertation, Michigan State University.

Plonsky, L. (2012). Effect sizes. In P. Robinson (Ed.), The Routledge Encyclopedia of Second Language Acquisition (pp. 200-202). New York: Routledge.

Plonsky, L. (2012). Replication, meta-analysis, and generalizability. In G. Porte (Ed.), Replication research in applied linguistics (pp. 116-132). Cambridge: Cambridge University Press.

Plonsky, L. (2013). Study quality in SLA: An assessment of designs, analyses, and reporting practices in quantitative L2 research. Studies in Second Language Acquisition, 35, 655-687.

Plonsky, L. (2014). Study quality in quantitative L2 research (1990-2010): A methodological synthesis and call for reform. Modern Language Journal, 98, 450-470.

Plonsky, L. (2015). Quantitative considerations for improving replicability in CALL and applied linguistics. CALICO Journal, 32, 232-244.

Plonsky, L. (2015). Statistical power, p values, descriptive statistics, and effect sizes: A “back-to-basics” approach to advancing quantitative methods in L2 research. In L. Plonsky (Ed.), Advancing quantitative methods in second language research (pp. 23-45). New York: Routledge.

Plonsky, L., Egbert, J., & LaFlair, G. T. (in press). Bootstrapping in applied linguistics: Assessing its potential using shared data. Applied Linguistics.

Plonsky, L., & Gass, S. (2011). Quantitative research methods, study quality, and outcomes: The case of interaction research. Language Learning, 61, 325-366.

Plonsky, L., & Gonulal, T. (2015). Methodological synthesis in quantitative L2 research: A review of reviews and a case study of exploratory factor analysis. Language Learning, 65, Supp. 1 (edited by J. M. Norris, S. J. Ross, & R. Schoonen), 9-36.

Plonsky, L., & Oswald, F. L. (2012). How to do a meta-analysis. In A. Mackey & S. M. Gass (Eds.), Research methods in second language acquisition: A practical guide (pp. 275-295). London: Wiley-Blackwell.

Plonsky, L., & Oswald, F. L. (2014). How big is ‘big’? Interpreting effect sizes in L2 research. Language Learning, 64, 878-912.

Plonsky, L., & Oswald, F. L. (2015). Meta-analyzing second language research. In L. Plonsky (Ed.), Advancing quantitative methods in second language research (pp. 106-128). New York: Routledge. [Adapted from Plonsky, L., & Oswald, F. L. (2012). How to do a meta-analysis. In A. Mackey & S. M. Gass (Eds.), Research methods in second language acquisition: A practical guide (pp. 275-295). London: Wiley-Blackwell.]

Ross, S. J. (2012). Probability and hypothesis testing. In C. A. Chapelle (Ed.), Encyclopedia of applied linguistics (pp. 4673-4679). London: Wiley-Blackwell.