![]() These studies are prevalent in many fields including biomedical science (such as those for rare diseases) and psychology (Szucs & Ioannidis, 2017). Footnote 2 A decrease in the power of the test leads to an increase in the number of false negatives, likely causing problems in small-sample studies. For example, lowering the significance level considerably decreases the power to detect true effects (Lazic, 2018). While their suggestion may lead to more reliable research findings, it also faces a number of difficulties. The proposed choice is also supported by recent findings of Held ( 2019) in the context of a novel credibility approach. In addition to general recommendations such as utilizing ‘ more robust experimental designs’, ‘ better statistics’ (Baker, 2016), and ‘ appropriately stringent significance tests’ (McNutt, 2014), Johnson ( 2013) and Benjamin et al., ( 2018) have made a rather simple proposal of redefining statistical significance as p-value less than 0.005 (“ p < 0.005”) instead of the common choice of p < 0.05. The ongoing discussion about the current replication crisis in quantitative research has led to many ad hoc suggestions (Ioannidis, 2005 Liberati et al., 2009 McNutt, 2014 Collins & Tabak, 2014 OSC, 2015 Peng, 2015 Baker, 2016). Through an extensive simulation study, it is found that the permutation versions of the Welch t-test and the Brunner-Munzel test are particularly robust and powerful while the commonly used two-sample tests which utilize t-distribution tend to be either liberal or conservative, and have peculiar power curve behaviors under skewed distributions with variance heterogeneity. Therefore, the main aim of our study is to investigate the robustness and power curve behaviors of independent (unpaired) two-sample tests for metric and ordinal data at nominal significance levels of α = 0.005 and α = 0.05. Even though their suggestion is easy to implement, it is unclear whether or not the commonly used statistical tests are robust and/or powerful at such a small significance level. ( Nature Human Behaviour, 2, 6–10 2018) recommend using the significance level of α = 0.005 (0.5 %) as opposed to the conventional 0.05 (5 %) level. Among them, Johnson ( Proceedings of the National Academy of Sciences, 110, 19313–19317, 2013) and Benjamin et al. Recent replication crisis has led to a number of ad hoc suggestions to decrease the chance of making false positive findings.
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |