References

Abadie, Alberto, Susan Athey, Guido Imbens, and Jeffrey Wooldridge. 2017. “When Should You Adjust Standard Errors for Clustering?” Working Paper 24003. Working Paper Series. National Bureau of Economic Research. https://doi.org/10.3386/w24003.
Abelson, Harold, and Gerald Jay Sussman. 1996. Structure and Interpretation of Computer Programs. Cambridge: The MIT Press.
Abeysooriya, Mandhri, Megan Soria, Mary Sravya Kasu, and Mark Ziemann. 2021. “Gene Name Errors: Lessons Not Learned.” PLOS Computational Biology 17 (7): 1–13. https://doi.org/10.1371/journal.pcbi.1008984.
Acemoglu, Daron, Simon Johnson, and James Robinson. 2001. “The Colonial Origins of Comparative Development: An Empirical Investigation.” American Economic Review 91 (5): 1369–1401. https://doi.org/10.1257/aer.91.5.1369.
Achen, Christopher. 1978. “Measuring Representation.” American Journal of Political Science 22 (3): 475–510. https://doi.org/10.2307/2110458.
Akerlof, George. 1970. “The Market for ‘Lemons’: Quality Uncertainty and the Market Mechanism.” The Quarterly Journal of Economics 84 (3): 488–500. https://doi.org/10.2307/1879431.
Alexander, Monica. 2019a. “Reproducibility in Demographic Research.” https://www.monicaalexander.com/posts/2019-10-20-reproducibility/.
———. 2019b. “The Concentration and Uniqueness of Baby Names in Australia and the US,” January. https://www.monicaalexander.com/posts/2019-20-01-babynames/.
———. 2019c. “Analyzing Name Changes After Marriage Using a Non-Representative Survey,” August. https://www.monicaalexander.com/posts/2019-08-07-mrp/.
———. 2021. “Overcoming Barriers to Sharing Code.” YouTube, February. https://youtu.be/yvM2C6aZ94k.
Alexander, Monica, and Leontine Alkema. 2022. “A Bayesian Cohort Component Projection Model to Estimate Women of Reproductive Age at the Subnational Level in Data-Sparse Settings.” Demography 59 (5): 1713–37. https://doi.org/10.1215/00703370-10216406.
Alexander, Monica, Mathew Kiang, and Magali Barbieri. 2018. “Trends in Black and White Opioid Mortality in the United States, 1979–2015.” Epidemiology 29 (5): 707–15. https://doi.org/10.1097/EDE.0000000000000858.
Alexander, Rohan, and Monica Alexander. 2021. “The Increased Effect of Elections and Changing Prime Ministers on Topics Discussed in the Australian Federal Parliament Between 1901 and 2018.” https://doi.org/10.48550/arXiv.2111.09299.
Alexander, Rohan, and Paul Hodgetts. 2021. AustralianPoliticians: Provides Datasets About Australian Politicians. https://CRAN.R-project.org/package=AustralianPoliticians.
Alexander, Rohan, and A Mahfouz. 2021. heapsofpapers: Easily Download Heaps of PDF and CSV Files. https://CRAN.R-project.org/package=heapsofpapers.
Alexander, Rohan, and Zachary Ward. 2018. “Age at Arrival and Assimilation During the Age of Mass Migration.” The Journal of Economic History 78 (3): 904–37. https://doi.org/10.1017/S0022050718000335.
Alexopoulos, Michelle, and Jon Cohen. 2015. “The power of print: Uncertainty shocks, markets, and the economy.” International Review of Economics & Finance 40 (November): 8–28. https://doi.org/10.1016/j.iref.2015.02.002.
Allen, Jeff. 2021. plumberDeploy: Plumber Deployment. https://CRAN.R-project.org/package=plumberDeploy.
Alsan, Marcella, and Amy Finkelstein. 2021. “Beyond Causality: Additional Benefits of Randomized Controlled Trials for Improving Health Care Delivery.” The Milbank Quarterly 99 (4): 864–81. https://doi.org/10.1111/1468-0009.12521.
Alsan, Marcella, and Marianne Wanamaker. 2018. “Tuskegee and the Health of Black Men.” The Quarterly Journal of Economics 133 (1): 407–55. https://doi.org/10.1093/qje/qjx029.
Altman, Douglas, and Martin Bland. 1995. “Statistics notes: The normal distribution.” BMJ 310 (6975): 298. https://doi.org/10.1136/bmj.310.6975.298.
Amaka, Ofunne, and Amber Thomas. 2021. “The Naked Truth: How the Names of 6,816 Complexion Products Can Reveal Bias in Beauty.” The Pudding, March. https://pudding.cool/2021/03/foundation-names/.
American Medical Association and New York Academy of Medicine. 1848. Code of Medical Ethics. Academy of Medicine. https://hdl.handle.net/2027/chi.57108026.
Anders, Jake, Silvan Has, John Jerrim, Nikki Shure, and Laura Zieger. 2020. “Is Canada really an education superpower? The impact of non-participation on results from PISA 2015.” Educational Assessment, Evaluation and Accountability 33 (1): 229–49. https://doi.org/10.1007/s11092-020-09329-5.
Andersen, Robert, and David Armstrong. 2021. Presenting Statistical Results Effectively. London: Sage.
Anderson, Margo. (1988) 2015. The American Census: A Social History. 2nd ed. Yale University Press.
Anderson, Margo, and Stephen Fienberg. 1999. Who Counts?: The Politics of Census-Taking in Contemporary America. Russell Sage Foundation. http://www.jstor.org/stable/10.7758/9781610440059.
Andrews, David, and Agnes Herzberg. 2012. Data: A Collection of Problems from Many Fields for the Student and Research Worker. New York: Springer Science & Business Media.
Angelucci, Charles, and Julia Cagé. 2019. “Newspapers in Times of Low Advertising Revenues.” American Economic Journal: Microeconomics 11 (3): 319–64. https://doi.org/10.1257/mic.20170306.
Angrist, Joshua, and Alan Krueger. 2001. “Instrumental Variables and the Search for Identification: From Supply and Demand to Natural Experiments.” Journal of Economic Perspectives 15 (4): 69–85. https://doi.org/10.1257/jep.15.4.69.
Angrist, Joshua, and Jörn-Steffen Pischke. 2010. “The Credibility Revolution in Empirical Economics: How Better Research Design Is Taking the Con Out of Econometrics.” Journal of Economic Perspectives 24 (2): 3–30. https://doi.org/10.1257/jep.24.2.3.
Annas, George. 2003. “HIPAA Regulations: A New Era of Medical-Record Privacy?” New England Journal of Medicine 348 (15): 1486–90. https://doi.org/10.1056/NEJMlim035027.
Ansolabehere, Stephen, Brian Schaffner, and Sam Luks. 2021. “Guide to the 2020 Cooperative Election Study.” https://doi.org/10.7910/DVN/E9N6PH.
Aprameya, Lavanya. 2020. “Improving Duolingo, One Experiment at a Time.” Duolingo Blog, January. https://blog.duolingo.com/improving-duolingo-one-experiment-at-a-time/.
Arel-Bundock, Vincent. 2021. WDI: World Development Indicators and Other World Bank Data. https://CRAN.R-project.org/package=WDI.
———. 2022. “modelsummary: Data and Model Summaries in R.” Journal of Statistical Software 103 (1): 1–23. https://doi.org/10.18637/jss.v103.i01.
———. 2023. marginaleffects: Predictions, Comparisons, Slopes, Marginal Means, and Hypothesis Tests. https://vincentarelbundock.github.io/marginaleffects/.
———. 2024. tinytable: Simple and Configurable Tables in “HTML,” “LaTeX,” “Markdown,” “Word,” “PNG,” “PDF,” and “Typst” Formats. https://vincentarelbundock.github.io/tinytable/.
Arel-Bundock, Vincent, Ryan Briggs, Hristos Doucouliagos, Marco Mendoza Aviña, and T. D. Stanley. 2022. “Quantitative Political Science Research Is Greatly Underpowered.” https://osf.io/bzj9y/.
Armstrong, Zan. 2022. “Stop Aggregating Away the Signal in Your Data.” The Overflow, March. https://stackoverflow.blog/2022/03/03/stop-aggregating-away-the-signal-in-your-data/.
Arnold, Jeffrey. 2021. ggthemes: Extra Themes, Scales and Geoms for “ggplot2”. https://CRAN.R-project.org/package=ggthemes.
Asher, Sam, Tobias Lunt, Ryu Matsuura, and Paul Novosad. 2021. “Development Research at High Geographic Resolution: An Analysis of Night Lights, Firms, and Poverty in India Using the SHRUG Open Data Platform.” World Bank Economic Review 35 (4). https://shrug-assets-ddl.s3.amazonaws.com/static/main/assets/other/almn-shrug.pdf.
Athey, Susan, and Guido Imbens. 2017a. “The Econometrics of Randomized Experiments.” In Handbook of Field Experiments, 73–140. Elsevier. https://doi.org/10.1016/bs.hefe.2016.10.003.
———. 2017b. “The State of Applied Econometrics: Causality and Policy Evaluation.” Journal of Economic Perspectives 31 (2): 3–32. https://doi.org/10.1257/jep.31.2.3.
Athey, Susan, Guido Imbens, Jonas Metzger, and Evan Munro. 2021. “Using Wasserstein Generative Adversarial Networks for the Design of Monte Carlo Simulations.” Journal of Econometrics. https://doi.org/10.1016/j.jeconom.2020.09.013.
Au, Randy. 2020. “Data Cleaning IS Analysis, Not Grunt Work,” September. https://counting.substack.com/p/data-cleaning-is-analysis-not-grunt.
———. 2022. “Celebrating Everyone Counting Things,” February. https://counting.substack.com/p/celebrating-everyone-counting-things.
Bååth, Rasmus. 2018. beepr: Easily Play Notification Sounds on any Platform. https://CRAN.R-project.org/package=beepr.
Bache, Stefan Milton, and Hadley Wickham. 2022. magrittr: A Forward-Pipe Operator for R. https://CRAN.R-project.org/package=magrittr.
Backus, John. 1981. “The History of FORTRAN I, II, and III.” In History of Programming Languages, edited by Richard Wexelblat, 25–74. Academic Press.
Bailey, Rosemary. 2008. Design of Comparative Experiments. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511611483.
Baio, Gianluca, and Marta Blangiardo. 2010. “Bayesian Hierarchical Model for the Prediction of Football Results.” Journal of Applied Statistics 37 (2): 253–64. https://doi.org/10.1080/02664760802684177.
Baker, Dominique. 2023. “Scams Will Not Save Us (Tuition Dollars),” February. http://www.dominiquebaker.com/blog/2023/2/16/scams-will-not-save-us-tuition-dollars.
Baker, Reg, Michael Brick, Nancy Bates, Mike Battaglia, Mick Couper, Jill Dever, Krista Gile, and Roger Tourangeau. 2013. “Summary Report of the AAPOR Task Force on Non-Probability Sampling.” Journal of Survey Statistics and Methodology 1 (2): 90–143. https://doi.org/10.1093/jssam/smt008.
Bandy, John, and Nicholas Vincent. 2021. “Addressing ‘Documentation Debt’ in Machine Learning: A Retrospective Datasheet for BookCorpus.” In Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks, edited by J. Vanschoren and S. Yeung. Vol. 1. https://datasets-benchmarks-proceedings.neurips.cc/paper_files/paper/2021/file/54229abfcfa5649e7003b83dd4755294-Paper-round1.pdf.
Banerjee, Abhijit, and Esther Duflo. 2011. Poor Economics: A Radical Rethinking of the Way to Fight Global Poverty. New York: PublicAffairs.
Banerjee, Abhijit, Esther Duflo, Rachel Glennerster, and Cynthia Kinnan. 2015. “The Miracle of Microfinance? Evidence from a Randomized Evaluation.” American Economic Journal: Applied Economics 7 (1): 22–53. https://doi.org/10.1257/app.20130533.
Banes, Graham, Emily Fountain, Alyssa Karklus, Robert Fulton, Lucinda Antonacci-Fulton, and Joanne Nelson. 2022. “Nine out of ten samples were mistakenly switched by The Orang-utan Genome Consortium.” Scientific Data 9 (1). https://doi.org/10.1038/s41597-022-01602-0.
Barba, Lorena. 2018. “Terminologies for Reproducible Research.” https://arxiv.org/abs/1802.03311.
Barrett, Malcolm. 2021a. Data Science as an Atomic Habit. https://malco.io/articles/2021-01-04-data-science-as-an-atomic-habit.
———. 2021b. ggdag: Analyze and Create Elegant Directed Acyclic Graphs. https://CRAN.R-project.org/package=ggdag.
Barron, Alexander, Jenny Huang, Rebecca Spang, and Simon DeDeo. 2018. “Individuals, Institutions, and Innovation in the Debates of the French Revolution.” Proceedings of the National Academy of Sciences 115 (18): 4607–12. https://doi.org/10.1073/pnas.1717729115.
Baumer, Benjamin, Daniel Kaplan, and Nicholas Horton. 2021. Modern Data Science With R. 2nd ed. Chapman and Hall/CRC. https://mdsr-book.github.io/mdsr2e/.
Baumgartner, Jason, Savvas Zannettou, Brian Keegan, Megan Squire, and Jeremy Blackburn. 2020. “The Pushshift Reddit Dataset.” arXiv. https://doi.org/10.48550/arxiv.2001.08435.
Baumgartner, Peter. 2021. “Ways I Use Testing as a Data Scientist,” December. https://www.peterbaumgartner.com/blog/testing-for-data-science/.
Beaumont, Jean-Francois. 2020. “Are Probability Surveys Bound to Disappear for the Production of Official Statistics?” Survey Methodology 46 (1): 1–29.
Beauregard, Katrine, and Jill Sheppard. 2021. “Antiwomen but Proquota: Disaggregating Sexism and Support for Gender Quota Policies.” Political Psychology 42 (2): 219–37. https://doi.org/10.1111/pops.12696.
Becker, Richard, Allan Wilks, Ray Brownrigg, Thomas Minka, and Alex Deckmyn. 2022. maps: Draw Geographical Maps. https://CRAN.R-project.org/package=maps.
Beelen, Kaspar, Timothy Alberdingk Thim, Christopher Cochrane, Kees Halvemaan, Graeme Hirst, Michael Kimmins, Sander Lijbrink, et al. 2017. “Digitization of the Canadian Parliamentary Debates.” Canadian Journal of Political Science 50 (3): 849–64.
Begley, Glenn, and Lee Ellis. 2012. “Raise Standards for Preclinical Cancer Research.” Nature 483 (7391): 531–33. https://doi.org/10.1038/483531a.
Bender, Emily, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. 2021. “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. ACM. https://doi.org/10.1145/3442188.3445922.
Bengtsson, Henrik. 2021. “A Unifying Framework for Parallel and Distributed Processing in R using Futures.” The R Journal 13 (2): 208–27. https://doi.org/10.32614/RJ-2021-048.
Benoit, Kenneth. 2020. “Text as Data: An Overview.” In The SAGE Handbook of Research Methods in Political Science and International Relations, edited by Luigi Curini and Robert Franzese, 461–97. London: SAGE Publishing. https://doi.org/10.4135/9781526486387.n29.
Benoit, Kenneth, and Michael Laver. 2006. Party Policy in Modern Democracies. Routledge.
———. 2007. “Estimating Party Policy Positions: Comparing Expert Surveys and Hand-Coded Content Analysis.” Electoral Studies 26 (1): 90–107. https://doi.org/10.1016/j.electstud.2006.04.008.
Benoit, Kenneth, Kohei Watanabe, Haiyan Wang, Paul Nulty, Adam Obeng, Stefan Müller, and Akitaka Matsuo. 2018. “quanteda: An R package for the quantitative analysis of textual data.” Journal of Open Source Software 3 (30): 774. https://doi.org/10.21105/joss.00774.
Bensinger, Greg. 2020. “Google Redraws the Borders on Maps Depending on Who’s Looking.” The Washington Post, February. https://www.washingtonpost.com/technology/2020/02/14/google-maps-political-borders/.
Berdine, Gilbert, Vincent Geloso, and Benjamin Powell. 2018. “Cuban Infant Mortality and Longevity: Health Care or Repression?” Health Policy and Planning 33 (6): 755–57. https://doi.org/10.1093/heapol/czy033.
Berkson, Joseph. 1946. “Limitations of the Application of Fourfold Table Analysis to Hospital Data.” Biometrics Bulletin 2 (3): 47–53. https://doi.org/10.2307/3002000.
Berners-Lee, Timothy. 1989. “Information Management: A Proposal.” https://www.w3.org/History/1989/proposal.html.
Berry, Donald. 1989. “Comment: Ethics and ECMO.” Statistical Science 4 (4): 306–10. https://www.jstor.org/stable/2245830.
Bertrand, Marianne, and Sendhil Mullainathan. 2004. “Are Emily and Greg More Employable Than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination.” American Economic Review 94 (4): 991–1013. https://doi.org/10.1257/0002828042002561.
Bethlehem, R. A. I., J. Seidlitz, S. R. White, J. W. Vogel, K. M. Anderson, C. Adamson, S. Adler, et al. 2022. “Brain Charts for the Human Lifespan.” Nature 604 (7906): 525–33. https://doi.org/10.1038/s41586-022-04554-y.
Betz, Timm, Scott Cook, and Florian Hollenbach. 2018. “On the Use and Abuse of Spatial Instruments.” Political Analysis 26 (4): 474–79. https://doi.org/10.1017/pan.2018.10.
Bickel, Peter, Eugene Hammel, and William O’Connell. 1975. “Sex Bias in Graduate Admissions: Data from Berkeley: Measuring Bias Is Harder Than Is Usually Assumed, and the Evidence Is Sometimes Contrary to Expectation.” Science 187 (4175): 398–404. https://doi.org/10.1126/science.187.4175.398.
Biderman, Stella, Kieran Bicheno, and Leo Gao. 2022. “Datasheet for the Pile.” https://arxiv.org/abs/2201.07311.
Birkmeyer, John, Jonathan Finks, Amanda O’Reilly, Mary Oerline, Arthur Carlin, Andre Nunn, Justin Dimick, Mousumi Banerjee, and Nancy Birkmeyer. 2013. “Surgical Skill and Complication Rates After Bariatric Surgery.” New England Journal of Medicine 369 (15): 1434–42. https://doi.org/10.1056/nejmsa1300625.
Blair, Ed, Seymour Sudman, Norman M Bradburn, and Carol Stocking. 1977. “How to Ask Questions about Drinking and Sex: Response Effects in Measuring Consumer Behavior.” Journal of Marketing Research 14 (3): 316–21. https://doi.org/10.2307/3150769.
Blair, Graeme, Jasper Cooper, Alexander Coppock, and Macartan Humphreys. 2019. “Declaring and Diagnosing Research Designs.” American Political Science Review 113 (3): 838–59. https://doi.org/10.1017/S0003055419000194.
Blair, Graeme, Jasper Cooper, Alexander Coppock, Macartan Humphreys, and Luke Sonnet. 2021. estimatr: Fast Estimators for Design-Based Inference. https://CRAN.R-project.org/package=estimatr.
Blair, James. 2019. Democratizing R with Plumber APIs. https://posit.co/resources/videos/democratizing-r-with-plumber-apis/.
Bland, Martin, and Douglas Altman. 1986. “Statistical Methods for Assessing Agreement Between Two Methods of Clinical Measurement.” The Lancet 327 (8476): 307–10. https://doi.org/10.1016/S0140-6736(86)90837-8.
Blei, David. 2012. “Probabilistic Topic Models.” Communications of the ACM 55 (4): 77–84. https://doi.org/10.1145/2133806.2133826.
Blei, David, Andrew Ng, and Michael Jordan. 2003. “Latent Dirichlet Allocation.” Journal of Machine Learning Research 3 (Jan): 993–1022. https://www.jmlr.org/papers/volume3/blei03a/blei03a.pdf.
Bloom, Howard, Andrew Bell, and Kayla Reiman. 2020. “Using Data from Randomized Trials to Assess the Likely Generalizability of Educational Treatment-Effect Estimates from Regression Discontinuity Designs.” Journal of Research on Educational Effectiveness 13 (3): 488–517. https://doi.org/10.1080/19345747.2019.1634169.
Blumenthal, Mark. 2014. “Polls, Forecasts, and Aggregators.” PS: Political Science & Politics 47 (02): 297–300. https://doi.org/10.1017/s1049096514000055.
Boland, Philip. 1984. “A Biographical Glimpse of William Sealy Gosset.” The American Statistician 38 (3): 179–83. https://doi.org/10.2307/2683648.
Bolker, Ben, and David Robinson. 2022. broom.mixed: Tidying Methods for Mixed Models. https://CRAN.R-project.org/package=broom.mixed.
Bolton, Ruth, and Randall Chapman. 1986. “Searching for Positive Returns at the Track.” Management Science 32 (August): 1040–60. https://doi.org/10.1287/mnsc.32.8.1040.
Bombieri, Giulia, Vincenzo Penteriani, Kamran Almasieh, Hüseyin Ambarlı, Mohammad Reza Ashrafzadeh, Chandan Surabhi Das, Nishith Dharaiya, et al. 2023. “A Worldwide Perspective on Large Carnivore Attacks on Humans.” PLOS Biology 21 (1): e3001946. https://doi.org/10.1371/journal.pbio.3001946.
Bor, Jacob, Atheendar Venkataramani, David Williams, and Alexander Tsai. 2018. “Police Killings and Their Spillover Effects on the Mental Health of Black Americans: A Population-Based, Quasi-Experimental Study.” The Lancet 392 (10144): 302–10. https://doi.org/10.1016/s0140-6736(18)31130-9.
Borer, Elizabeth T., Eric W. Seabloom, Matthew B. Jones, and Mark Schildhauer. 2009. “Some Simple Guidelines for Effective Data Management.” Bulletin of the Ecological Society of America 90 (2): 205–14. https://doi.org/10.1890/0012-9623-90.2.205.
Borghi, John, and Ana Van Gulick. 2022. “Promoting Open Science Through Research Data Management.” Harvard Data Science Review 4 (3). https://doi.org/10.1162/99608f92.9497f68e.
Borkin, Michelle, Zoya Bylinskii, Nam Wook Kim, Constance May Bainbridge, Chelsea Yeh, Daniel Borkin, Hanspeter Pfister, and Aude Oliva. 2015. “Beyond Memorability: Visualization Recognition and Recall.” IEEE Transactions on Visualization and Computer Graphics 22 (1): 519–28. https://doi.org/10.1109/TVCG.2015.2467732.
Bosch, Oriol, and Melanie Revilla. 2022. “When survey science met web tracking: Presenting an error framework for metered data.” Journal of the Royal Statistical Society: Series A (Statistics in Society), November, 1–29. https://doi.org/10.1111/rssa.12956.
Bouguen, Adrien, Yue Huang, Michael Kremer, and Edward Miguel. 2019. “Using Randomized Controlled Trials to Estimate Long-Run Impacts in Development Economics.” Annual Review of Economics 11 (1): 523–61. https://doi.org/10.1146/annurev-economics-080218-030333.
Bouie, Jamelle. 2022. “We Still Can’t See American Slavery for What It Was.” The New York Times, January. https://www.nytimes.com/2022/01/28/opinion/slavery-voyages-data-sets.html.
Bowen, Claire McKay. 2022. Protecting Your Privacy in a Data-Driven World. 1st ed. Chapman and Hall/CRC. https://doi.org/10.1201/9781003122043.
Bowers, Jake, and Maarten Voors. 2016. “How to Improve Your Relationship with Your Future Self.” Revista de Ciencia Política 36 (3): 829–48. https://doi.org/10.4067/S0718-090X2016000300011.
Bowley, Arthur Lyon. 1901. Elements of Statistics. London: P. S. King.
———. 1913. “Working-Class Households in Reading.” Journal of the Royal Statistical Society 76 (7): 672–701. https://doi.org/10.2307/2339708.
Box, George E. P. 1976. “Science and Statistics.” Journal of the American Statistical Association 71 (356): 791–99. https://doi.org/10.1080/01621459.1976.10480949.
Boykis, Vicki. 2019. “A Deep Dive on Python Type Hints,” July. https://vickiboykis.com/2019/07/08/a-deep-dive-on-python-type-hints/.
Boysel, Sam, and Davis Vaughan. 2021. fredr: An R Client for the “FRED” API. https://CRAN.R-project.org/package=fredr.
Bradley, Valerie, Shiro Kuriwaki, Michael Isakov, Dino Sejdinovic, Xiao-Li Meng, and Seth Flaxman. 2021. “Unrepresentative Big Surveys Significantly Overestimated US Vaccine Uptake.” Nature 600 (7890): 695–700. https://doi.org/10.1038/s41586-021-04198-4.
Braginsky, Mika. 2020. wordbankr: Accessing the Wordbank Database. https://CRAN.R-project.org/package=wordbankr.
Brandt, Allan. 1978. “Racism and Research: The Case of the Tuskegee Syphilis Study.” Hastings Center Report, 21–29. https://doi.org/10.2307/3561468.
Breiman, Leo. 1994. “The 1991 Census Adjustment: Undercount or Bad Data?” Statistical Science 9 (4). https://doi.org/10.1214/ss/1177010259.
———. 2001. “Statistical Modeling: The Two Cultures.” Statistical Science 16 (3): 199–231. https://doi.org/10.1214/ss/1009213726.
Bremer, Nadieh, and Shirley Wu. 2021. Data Sketches. A K Peters/CRC Press. https://doi.org/10.1201/9780429445019.
Brewer, Cynthia. 2015. Designing Better Maps: A Guide for GIS Users. 2nd ed.
Brewer, Ken. 2013. “Three Controversies in the History of Survey Sampling.” Survey Methodology 39 (2): 249–63.
Breznau, Nate, Eike Mark Rinke, Alexander Wuttke, Hung HV Nguyen, Muna Adem, Jule Adriaans, Amalia Alvarez-Benjumea, et al. 2022. “Observing Many Researchers Using the Same Data and Hypothesis Reveals a Hidden Universe of Uncertainty.” Proceedings of the National Academy of Sciences 119 (44): e2203150119. https://doi.org/10.1073/pnas.2203150119.
Briggs, Ryan. 2021. “Why Does Aid Not Target the Poorest?” International Studies Quarterly 65 (3): 739–52. https://doi.org/10.1093/isq/sqab035.
Brodeur, Abel, Nikolai Cook, and Anthony Heyes. 2020. “Methods Matter: p-Hacking and Publication Bias in Causal Analysis in Economics.” American Economic Review 110 (11): 3634–60. https://doi.org/10.1257/aer.20190687.
Brokowski, Carolyn, and Mazhar Adli. 2019. “CRISPR Ethics: Moral Considerations for Applications of a Powerful Tool.” Journal of Molecular Biology 431 (1): 88–101. https://doi.org/10.1016/j.jmb.2018.05.044.
Bronner, Laura. 2020. “Why Statistics Don’t Capture the Full Extent of the Systemic Bias in Policing.” FiveThirtyEight, June. https://fivethirtyeight.com/features/why-statistics-dont-capture-the-full-extent-of-the-systemic-bias-in-policing/.
———. 2021. “Quantitative Editing.” YouTube, June. https://youtu.be/LI5m9RzJgWc.
Brontë, Charlotte. 1847. Jane Eyre. https://www.gutenberg.org/files/1260/1260-h/1260-h.htm.
———. 1857. The Professor. https://www.gutenberg.org/files/1028/1028-h/1028-h.htm.
Brook, Robert, John Ware, William Rogers, Emmett Keeler, Allyson Ross Davies, Cathy Sherbourne, George Goldberg, Kathleen Lohr, Patricia Camp, and Joseph Newhouse. 1984. “The Effect of Coinsurance on the Health of Adults: Results from the RAND Health Insurance Experiment.” https://www.rand.org/pubs/reports/R3055.html.
Brown, Zack. 2018. “A Git Origin Story.” Linux Journal, July. https://www.linuxjournal.com/content/git-origin-story.
Bryan, Jenny. 2015. “Naming Things.” Reproducible Science Workshop, May. https://speakerdeck.com/jennybc/how-to-name-files.
———. 2018a. “Excuse Me, Do You Have a Moment to Talk about Version Control?” The American Statistician 72 (1): 20–27. https://doi.org/10.1080/00031305.2017.1399928.
———. 2018b. “Code Smells and Feels.” YouTube, July. https://youtu.be/7oyiPBjLAWY.
———. 2020. Happy Git and GitHub for the useR. https://happygitwithr.com.
Bryan, Jenny, and Jim Hester. 2020. What They Forgot to Teach You About R. https://rstats.wtf/index.html.
Bryan, Jenny, Jim Hester, David Robinson, Hadley Wickham, and Christophe Dervieux. 2022. reprex: Prepare Reproducible Example Code via the Clipboard. https://CRAN.R-project.org/package=reprex.
Bryan, Jenny, and Hadley Wickham. 2021. gh: GitHub API. https://CRAN.R-project.org/package=gh.
Buckheit, Jonathan, and David Donoho. 1995. “Wavelab and Reproducible Research.” In Wavelets and Statistics, 55–81. Springer. https://doi.org/10.1007/978-1-4612-2544-7_5.
Bueno de Mesquita, Ethan, and Anthony Fowler. 2021. Thinking Clearly with Data: A Guide to Quantitative Reasoning and Analysis. New Jersey: Princeton University Press.
Buhr, Ray. 2017. Using R as a Production Machine Learning Language (Part I). https://raybuhr.github.io/blog/posts/making-predictions-over-http/.
Buja, Andreas, Dianne Cook, Heike Hofmann, Michael Lawrence, Eun-Kyung Lee, Deborah F. Swayne, and Hadley Wickham. 2009. “Statistical Inference for Exploratory Data Analysis and Model Diagnostics.” Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 367 (1906): 4361–83. https://doi.org/10.1098/rsta.2009.0120.
Buja, Andreas, Dianne Cook, and Deborah Swayne. 1996. “Interactive High-Dimensional Data Visualization.” Journal of Computational and Graphical Statistics 5 (1): 78–99. https://doi.org/10.2307/1390754.
Buneman, Peter, Sanjeev Khanna, and Tan Wang-Chiew. 2001. “Why and Where: A Characterization of Data Provenance.” In Database Theory ICDT 2001, 316–30. Springer Berlin Heidelberg. https://doi.org/10.1007/3-540-44503-x_20.
Buolamwini, Joy, and Timnit Gebru. 2018. “Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification.” In Conference on Fairness, Accountability and Transparency, 77–91.
Burch, Tyler James. 2023. “2023 NHL Playoff Predictions,” April. https://tylerjamesburch.com/blog/misc/nhl-predictions.
Burton, Jason, Nicole Cruz, and Ulrike Hahn. 2021. “Reconsidering Evidence of Moral Contagion in Online Social Networks.” Nature Human Behaviour 5 (12): 1629–35. https://doi.org/10.1038/s41562-021-01133-5.
Bush, Vannevar. 1945. “As We May Think.” The Atlantic Monthly, July. https://www.theatlantic.com/magazine/archive/1945/07/as-we-may-think/303881/.
Byrd, James Brian, Anna Greene, Deepashree Venkatesh Prasad, Xiaoqian Jiang, and Casey Greene. 2020. “Responsible, Practical Genomic Data Sharing That Accelerates Research.” Nature Reviews Genetics 21 (10): 615–29. https://doi.org/10.1038/s41576-020-0257-5.
Cahill, Niamh, Michelle Weinberger, and Leontine Alkema. 2020. “What Increase in Modern Contraceptive Use Is Needed in FP2020 Countries to Reach 75% Demand Satisfied by 2030? An Assessment Using the Accelerated Transition Method and Family Planning Estimation Model.” Gates Open Research 4. https://doi.org/10.12688/gatesopenres.13125.1.
Calonico, Sebastian, Matias Cattaneo, Max Farrell, and Rocio Titiunik. 2021. rdrobust: Robust Data-Driven Statistical Inference in Regression-Discontinuity Designs. https://CRAN.R-project.org/package=rdrobust.
Cambon, Jesse, and Christopher Belanger. 2021. “tidygeocoder: Geocoding Made Easy.” Zenodo. https://doi.org/10.5281/zenodo.3981510.
Canty, Angelo, and B. D. Ripley. 2021. boot: Bootstrap R (S-Plus) Functions.
Cardoso, Tom. 2020. “Bias behind bars: A Globe investigation finds a prison system stacked against Black and Indigenous inmates.” The Globe and Mail, October. https://www.theglobeandmail.com/canada/article-investigation-racial-bias-in-canadian-prison-risk-assessments/.
Carl, Sebastian, Ben Baldwin, Lee Sharpe, Tan Ho, and John Edwards. 2023. Nflverse: Easily Install and Load the ’Nflverse’. https://CRAN.R-project.org/package=nflverse.
Carleton, Chris. 2021. “wccarleton/conflict-europe: Acce.” Zenodo. https://doi.org/10.5281/zenodo.4550688.
Carleton, Chris, Dave Campbell, and Mark Collard. 2021. “A Reassessment of the Impact of Temperature Change on European Conflict During the Second Millennium CE Using a Bespoke Bayesian Time-Series Model.” Climatic Change 165 (1): 1–16. https://doi.org/10.1007/s10584-021-03022-2.
Caro, Robert. 2019. Working. 1st ed. New York: Knopf.
Carpenter, Christopher, and Carlos Dobkin. 2014. “Replication data for: The Minimum Legal Drinking Age and Crime.” https://doi.org/10.7910/DVN/27070.
———. 2015. “The Minimum Legal Drinking Age and Crime.” The Review of Economics and Statistics 97 (2): 521–24. https://doi.org/10.1162/REST_a_00489.
Carroll, Lewis. 1871. Through the Looking-Glass. Macmillan. https://www.gutenberg.org/files/12/12-h/12-h.htm.
Castro, Marcia, Susie Gurzenda, Cassio Turra, Sun Kim, Theresa Andrasfay, and Noreen Goldman. 2023. “Research Note: COVID-19 Is Not an Independent Cause of Death.” Demography, February. https://doi.org/10.1215/00703370-10575276.
Caughey, Devin, and Jasjeet Sekhon. 2011. “Elections and the Regression Discontinuity Design: Lessons from Close U.S. House Races, 1942–2008.” Political Analysis 19 (4): 385–408. https://doi.org/10.1093/pan/mpr032.
Chamberlain, Scott, Hadley Wickham, Winston Chang, and Mauricio Vargas. 2022. Analogsea: Interface to “Digital Ocean”. https://CRAN.R-project.org/package=analogsea.
Chamberlin, Donald. 2012. “Early History of SQL.” IEEE Annals of the History of Computing 34 (4): 78–82. https://doi.org/10.1109/mahc.2012.61.
Chambliss, Daniel. 1989. “The Mundanity of Excellence: An Ethnographic Report on Stratification and Olympic Swimmers.” Sociological Theory 7 (1): 70–86. https://doi.org/10.2307/202063.
Chambru, Cédric, and Paul Maneuvrier-Hervieu. 2022. “Introducing HiSCoD: A new gateway for the study of historical social conflict.” Working Paper Series, Department of Economics, University of Zurich. https://doi.org/10.5167/uzh-217109.
Chan, Duo. 2021. “Combining Statistical, Physical, and Historical Evidence to Improve Historical Sea-Surface Temperature Records.” Harvard Data Science Review 3 (1). https://doi.org/10.1162/99608f92.edcee38f.
Chang, Winston, Joe Cheng, JJ Allaire, Carson Sievert, Barret Schloerke, Yihui Xie, Jeff Allen, Jonathan McPherson, Alan Dipert, and Barbara Borges. 2021. shiny: Web Application Framework for R. https://CRAN.R-project.org/package=shiny.
Chase, William. 2020. “The Glamour of Graphics.” RStudio Conference, January. https://posit.co/resources/videos/the-glamour-of-graphics/.
Chawla, Dalmeet Singh. 2020. “Critiqued Coronavirus Simulation Gets Thumbs up from Code-Checking Efforts.” Nature 582: 323–24. https://doi.org/10.1038/d41586-020-01685-y.
Chellel, Kit. 2018. “The Gambler Who Cracked the Horse-Racing Code.” Bloomberg Businessweek, May. https://www.bloomberg.com/news/features/2018-05-03/the-gambler-who-cracked-the-horse-racing-code.
Chen, Heng, Marie-Hélène Felt, and Christopher Henry. 2018. “2017 Methods-of-Payment Survey: Sample Calibration and Variance Estimation.” Bank of Canada. https://doi.org/10.34989/tr-114.
Chen, Wei, Xilu Chen, Chang-Tai Hsieh, and Zheng Song. 2019. “A Forensic Examination of China’s National Accounts.” Brookings Papers on Economic Activity, 77–127. https://www.jstor.org/stable/26798817.
Chen, Weijun, Yan Qi, Yuwen Zhang, Christina Brown, Akos Lada, and Harivardan Jayaraman. 2022. “Notifications: Why Less Is More,” December. https://medium.com/@AnalyticsAtMeta/notifications-why-less-is-more-how-facebook-has-been-increasing-both-user-satisfaction-and-app-9463f7325e7d.
Cheng, Joe, Bhaskar Karambelkar, and Yihui Xie. 2021. leaflet: Create Interactive Web Maps with the JavaScript “Leaflet” Library. https://CRAN.R-project.org/package=leaflet.
Cheriet, Mohamed, Nawwaf Kharma, Cheng-Lin Liu, and Ching Suen. 2007. Character Recognition Systems: A Guide for Students and Practitioners. Wiley.
Chouldechova, Alexandra, Diana Benavides-Prado, Oleksandr Fialko, and Rhema Vaithianathan. 2018. “A Case Study of Algorithm-Assisted Decision Making in Child Maltreatment Hotline Screening Decisions.” In Proceedings of the 1st Conference on Fairness, Accountability and Transparency, edited by Sorelle Friedler and Christo Wilson, 81:134–48. Proceedings of Machine Learning Research. https://proceedings.mlr.press/v81/chouldechova18a.html.
Chrétien, Jean. 2007. My Years as Prime Minister. 1st ed. Toronto: Knopf Canada.
Christensen, Garret, Allan Dafoe, Edward Miguel, Don Moore, and Andrew Rose. 2019. “A Study of the Impact of Data Sharing on Article Citations Using Journal Policies as a Natural Experiment.” PLOS ONE 14 (12): e0225883. https://doi.org/10.1371/journal.pone.0225883.
Christensen, Garret, Jeremy Freese, and Edward Miguel. 2019. Transparent and Reproducible Social Science Research. California: University of California Press.
Christian, Brian. 2012. “The A/B Test: Inside the Technology That’s Changing the Rules of Business.” Wired, April. https://www.wired.com/2012/04/ff-abtesting/.
Cirone, Alexandra, and Arthur Spirling. 2021. “Turning History into Data: Data Collection, Measurement, and Inference in HPE.” Journal of Historical Political Economy 1 (1): 127–54. https://doi.org/10.1561/115.00000005.
City of Toronto. 2021. 2021 Street Needs Assessment. https://www.toronto.ca/city-government/data-research-maps/research-reports/housing-and-homelessness-research-and-reports/.
Cleveland, William. (1985) 1994. The Elements of Graphing Data. 2nd ed. New Jersey: Hobart Press.
Clinton, Joshua, John Lapinski, and Marc Trussler. 2022. “Reluctant Republicans, Eager Democrats?” Public Opinion Quarterly 86 (2): 247–69. https://doi.org/10.1093/poq/nfac011.
Cohen, Glenn, and Michelle Mello. 2018. “HIPAA and Protecting Health Information in the 21st Century.” JAMA 320 (3): 231. https://doi.org/10.1001/jama.2018.5630.
Cohen, Jason, Steven Teleki, and Eric Brown. 2006. Best Kept Secrets of Peer Code Review. Smart Bear Incorporated.
Cohn, Alain. 2019. “Data and code for: Civic Honesty Around the Globe.” Harvard Dataverse. https://doi.org/10.7910/dvn/ykbodn.
Cohn, Alain, Michel André Maréchal, David Tannenbaum, and Christian Lukas Zünd. 2019a. “Civic Honesty Around the Globe.” Science 365 (6448): 70–73. https://doi.org/10.1126/science.aau8712.
———. 2019b. “Supplementary Materials for: Civic Honesty Around the Globe.” Science 365 (6448): 70–73.
Cohn, Nate. 2016. “We Gave Four Good Pollsters the Same Raw Data. They Had Four Different Results.” The New York Times, September. https://www.nytimes.com/interactive/2016/09/20/upshot/the-error-the-polling-world-rarely-talks-about.html.
Collins, Annie, and Rohan Alexander. 2022. “Reproducibility of COVID-19 Pre-Prints.” Scientometrics 127: 4655–73. https://doi.org/10.1007/s11192-022-04418-2.
Colombo, Tommaso, Holger Fröning, Pedro Javier García, and Wainer Vandelli. 2016. “Optimizing the Data-Collection Time of a Large-Scale Data-Acquisition System Through a Simulation Framework.” The Journal of Supercomputing 72 (12): 4546–72. https://doi.org/10.1007/s11227-016-1764-1.
Comer, Benjamin P., and Jason R. Ingram. 2022. “Comparing Fatal Encounters, Mapping Police Violence, and Washington Post Fatal Police Shooting Data from 2015-2019: A Research Note.” Criminal Justice Review, January, 073401682110710. https://doi.org/10.1177/07340168211071014.
Congelio, Bradley. 2024. Introduction to NFL Analytics with R. 1st ed. Chapman & Hall/CRC. https://bradcongelio.com/nfl-analytics-with-r-book/.
Cook, Dianne, Andreas Buja, Javier Cabrera, and Catherine Hurley. 1995. “Grand Tour and Projection Pursuit.” Journal of Computational and Graphical Statistics 4 (3): 155–72. https://doi.org/10.1080/10618600.1995.10474674.
Cook, Dianne, Nancy Reid, and Emi Tanaka. 2021. “The Foundation Is Available for Thinking about Data Visualization Inferentially.” Harvard Data Science Review 3 (3). https://doi.org/10.1162/99608f92.8453435d.
Cook, Dianne, and Deborah Swayne. 2007. Interactive and Dynamic Graphics for Data Analysis: With R and GGobi. 1st ed. Springer.
Cooley, David. 2020. mapdeck: Interactive Maps Using “Mapbox GL JS” and “Deck.gl”. https://CRAN.R-project.org/package=mapdeck.
Council of the European Union. 2016. “General Data Protection Regulation 2016/679.” https://eur-lex.europa.eu/eli/reg/2016/679/oj.
Cowen, Tyler. 2021. “Episode 132: Amia Srinivasan on Utopian Feminism.” Conversations with Tyler, September. https://conversationswithtyler.com/episodes/amia-srinivasan/.
———. 2023. “Episode 168: Katherine Rundell on the Art of Words.” Conversations with Tyler, January. https://conversationswithtyler.com/episodes/katherine-rundell/.
Cox, David. 2018. “In Gentle Praise of Significance Tests.” YouTube, October. https://youtu.be/txLj%5FP9UlCQ.
Cox, David, and Nancy Reid. 1987. “Parameter Orthogonality and Approximate Conditional Inference.” Journal of the Royal Statistical Society: Series B (Methodological) 49 (1): 1–18. https://doi.org/10.1111/j.2517-6161.1987.tb01422.x.
Cox, Murray. 2021. “Inside Airbnb—Toronto Data.” http://insideairbnb.com/get-the-data.html.
Coyle, Edward, Andrew Coggan, Mari Hopper, and Thomas Walters. 1988. “Determinants of Endurance in Well-Trained Cyclists.” Journal of Applied Physiology 64 (6): 2622–30. https://doi.org/10.1152/jappl.1988.64.6.2622.
Craiu, Radu. 2019. “The Hiring Gambit: In Search of the Twofer Data Scientist.” Harvard Data Science Review 1 (1). https://doi.org/10.1162/99608f92.440445cb.
Cramer, Jan Salomon. 2003. “The Origins of Logistic Regression.” SSRN Electronic Journal. https://doi.org/10.2139/ssrn.360300.
Crane, Nicola, Stephanie Hazlitt, and Apache Arrow. 2023. Apache Arrow R Cookbook. https://arrow.apache.org/cookbook/r/.
Crawford, Kate. 2021. Atlas of AI. 1st ed. New Haven: Yale University Press.
Crosby, Alfred. 1997. The Measure of Reality: Quantification in Western Europe, 1250-1600. Cambridge: Cambridge University Press.
Csárdi, Gábor. 2022. gitcreds: Query “git” Credentials from “R”. https://CRAN.R-project.org/package=gitcreds.
Csárdi, Gábor, Jim Hester, Hadley Wickham, Winston Chang, Martin Morgan, and Dan Tenenbaum. 2021. remotes: R Package Installation from Remote Repositories, Including “GitHub”. https://CRAN.R-project.org/package=remotes.
Cummins, Neil. 2022. “The Hidden Wealth of English Dynasties, 1892–2016.” The Economic History Review 75 (3): 667–702. https://doi.org/10.1111/ehr.13120.
Cunningham, Scott. 2021. Causal Inference: The Mixtape. 1st ed. New Haven: Yale University Press. https://mixtape.scunning.com.
D’Ignazio, Catherine, and Lauren Klein. 2020. Data Feminism. Massachusetts: The MIT Press. https://data-feminism.mitpress.mit.edu.
Dagan, Noa, Noam Barda, Eldad Kepten, Oren Miron, Shay Perchik, Mark Katz, Miguel Hernán, Marc Lipsitch, Ben Reis, and Ran Balicer. 2021. “BNT162b2 mRNA Covid-19 Vaccine in a Nationwide Mass Vaccination Setting.” New England Journal of Medicine 384 (15): 1412–23. https://doi.org/10.1056/NEJMoa2101765.
Daston, Lorraine. 2000. “Why Statistics Tend Not Only to Describe the World but to Change It.” London Review of Books 22 (8). https://www.lrb.co.uk/the-paper/v22/n08/lorraine-daston/why-statistics-tend-not-only-to-describe-the-world-but-to-change-it.
Data and Justice Criminology Lab, Institute of Criminology and Criminal Justice, Carleton University; The Centre for Research & Innovation for Black Survivors of Homicide Victims (The CRIB), at the Factor-Inwentash Faculty of Social Work, University of Toronto; Canadian Civil Liberties Association; Ethics and Technology Lab, Queen’s University. 2022. “Tracking (in)justice: A Living Data Set Tracking Canadian Police-Involved Deaths.” https://trackinginjustice.ca.
Dattani, Saloni. 2024. “The Rise in Reported Maternal Mortality Rates in the US Is Largely Due to a Change in Measurement.” Our World in Data.
Davidson, Thomas, Debasmita Bhattacharya, and Ingmar Weber. 2019. “Racial Bias in Hate Speech and Abusive Language Detection Datasets.” In Proceedings of the Third Workshop on Abusive Language Online, 25–35.
Davies, Neil M., Gibran Hemani, Jenae M. Neiderhiser, Hilary C. Martin, Melinda C. Mills, Peter M. Visscher, Loïc Yengo, Alexander Strudwick Young, and Matthew C. Keller. 2024. “The Importance of Family-Based Sampling for Biobanks.” Nature 634 (8035): 795–803. https://doi.org/10.1038/s41586-024-07721-5.
Davies, Rhian, Steph Locke, and Lucy D’Agostino McGowan. 2022. datasauRus: Datasets from the Datasaurus Dozen. https://CRAN.R-project.org/package=datasauRus.
Davis, Darren. 1997. “Nonrandom Measurement Error and Race of Interviewer Effects Among African Americans.” The Public Opinion Quarterly 61 (1): 183–207. https://doi.org/10.1086/297792.
Davison, A. C., and D. V. Hinkley. 1997. Bootstrap Methods and Their Applications. Cambridge: Cambridge University Press. http://statwww.epfl.ch/davison/BMA/.
De Jonge, Edwin, and Mark van der Loo. 2013. An introduction to data cleaning with R. Statistics Netherlands Heerlen. https://cran.r-project.org/doc/contrib/de%5FJonge+van%5Fder%5FLoo-Introduction%5Fto%5Fdata%5Fcleaning%5Fwith%5FR.pdf.
Dean, Natalie. 2022. “Tracking COVID-19 Infections: Time for Change.” Nature 602 (7896): 185. https://doi.org/10.1038/d41586-022-00336-8.
Deaton, Angus. 2010. “Instruments, Randomization, and Learning about Development.” Journal of Economic Literature 48 (2): 424–55. https://doi.org/10.1257/jel.48.2.424.
Denby, Lorraine, and Colin Mallows. 2009. “Variations on the Histogram.” Journal of Computational and Graphical Statistics 18 (1): 21–31. https://doi.org/10.1198/jcgs.2009.0002.
DeWitt, Helen. 2000. The Last Samurai. 1st ed. United States: Talk Miramax Books.
Dillman, Don, Jolene Smyth, and Leah Christian. (1978) 2014. Internet, Phone, Mail, and Mixed-Mode Surveys: The Tailored Design Method. 4th ed. Wiley.
Doggers, Peter. 2021. “Carlsen Wins Game 6, Longest World Chess Championship Game of All Time,” December. https://www.chess.com/news/view/fide-world-chess-championship-2021-game-6.
Dolatsara, Hamidreza Ahady, Ying-Ju Chen, Robert Leonard, Fadel Megahed, and Allison Jones-Farmer. 2021. “Explaining Predictive Model Performance: An Experimental Study of Data Preparation and Model Choice.” Big Data, October. https://doi.org/10.1089/big.2021.0067.
Doll, Richard, and Bradford Hill. 1950. “Smoking and Carcinoma of the Lung.” British Medical Journal 2 (4682): 739–48. https://doi.org/10.1136/bmj.2.4682.739.
Druckman, James, and Donald Green. 2021. “A New Era of Experimental Political Science.” In Advances in Experimental Political Science, 1–16. Cambridge: Cambridge University Press. https://doi.org/10.1017/9781108777919.002.
Du, Kai, Steven Huddart, and Xin Daniel Jiang. 2022. “Lost in Standardization: Effects of Financial Statement Database Discrepancies on Inference.” Journal of Accounting and Economics, December, 101573. https://doi.org/10.1016/j.jacceco.2022.101573.
Duflo, Esther. 2020. “Field Experiments and the Practice of Policy.” American Economic Review 110 (7): 1952–73. https://doi.org/10.1257/aer.110.7.1952.
Dwork, Cynthia, Frank McSherry, Kobbi Nissim, and Adam Smith. 2006. “Calibrating Noise to Sensitivity in Private Data Analysis.” In Theory of Cryptography Conference, 265–84. Springer. https://doi.org/10.1007/11681878_14.
Dwork, Cynthia, and Aaron Roth. 2013. “The Algorithmic Foundations of Differential Privacy.” Foundations and Trends in Theoretical Computer Science 9 (3-4): 211–407. https://doi.org/10.1561/0400000042.
Edelman, Murray, Liberty Vittert, and Xiao-Li Meng. 2021. “An Interview with Murray Edelman on the History of the Exit Poll.” Harvard Data Science Review 3 (1). https://doi.org/10.1162/99608f92.3a25cd24.
Edgeworth, Francis Ysidro. 1885. “Methods of Statistics.” Journal of the Statistical Society of London, 181–217.
Edwards, Jonathan. 2017. “PACE team response shows a disregard for the principles of science.” Journal of Health Psychology 22 (9): 1155–58. https://doi.org/10.1177/1359105317700886.
Efron, Bradley, and Carl Morris. 1977. “Stein’s Paradox in Statistics.” Scientific American 236 (May): 119–27. https://doi.org/10.1038/scientificamerican0577-119.
Eghbal, Nadia. 2020. Working in Public: The Making and Maintenance of Open Source Software. California: Stripe Press.
Eisenstein, Michael. 2022. “Need Web Data? Here’s How to Harvest Them.” Nature 607: 200–201. https://doi.org/10.1038/d41586-022-01830-9.
Elliott, Michael, Brady West, Xinyu Zhang, and Stephanie Coffey. 2022. “The Anchoring Method: Estimation of Interviewer Effects in the Absence of Interpenetrated Sample Assignment.” Survey Methodology 48 (1): 25–48. http://www.statcan.gc.ca/pub/12-001-x/2022001/article/00005-eng.htm.
Elson, Malte. 2018. “Question Wording and Item Formulation.” https://doi.org/10.31234/osf.io/e4ktc.
Enns, Peter, and Jake Rothschild. 2022. “Do You Know Where Your Survey Data Come From?” May. https://medium.com/3streams/surveys-3ec95995dde2.
Farrugia, Patricia, Bradley Petrisor, Forough Farrokhyar, and Mohit Bhandari. 2010. “Research Questions, Hypotheses and Objectives.” Canadian Journal of Surgery 53 (4): 278.
Feldman, Gilad. 2024. RRR Assessment Peer Review. https://mgto.org/rrrassessmentreviewtemplate.
Finkelstein, Amy, Sarah Taubman, Bill Wright, Mira Bernstein, Jonathan Gruber, Joseph Newhouse, Heidi Allen, Katherine Baicker, and Oregon Health Study Group. 2012. “The Oregon Health Insurance Experiment: Evidence from the First Year.” The Quarterly Journal of Economics 127 (3): 1057–1106. https://doi.org/10.1093/qje/qjs020.
Firke, Sam. 2023. janitor: Simple Tools for Examining and Cleaning Dirty Data. https://CRAN.R-project.org/package=janitor.
Fisher, Ronald. (1925) 1928. Statistical Methods for Research Workers. 2nd ed. London: Oliver and Boyd.
———. (1935) 1949. The Design of Experiments. 5th ed. London: Oliver and Boyd.
Fiske, Susan, and Shiro Kuriwaki. 2021. “Words to the Wise on Writing Scientific Papers,” November. https://doi.org/10.31234/osf.io/n32qw.
Fitts, Alexis Sobel. 2014. “The King of Content: How Upworthy Aims to Alter the Web, and Could End up Altering the World.” Columbia Journalism Review 53: 34–38. https://archives.cjr.org/feature/the%5Fking%5Fof%5Fcontent.php.
Flake, Jessica, and Eiko Fried. 2020. “Measurement Schmeasurement: Questionable Measurement Practices and How to Avoid Them.” Advances in Methods and Practices in Psychological Science 3 (4): 456–65. https://doi.org/10.1177/2515245920952393.
Flynn, Michael. 2022. troopdata: Tools for Analyzing Cross-National Military Deployment and Basing Data. https://CRAN.R-project.org/package=troopdata.
Ford, Paul. 2015. “What Is Code?” Bloomberg Businessweek, June. https://www.bloomberg.com/graphics/2015-paul-ford-what-is-code/.
Forster, Edward Morgan. 1927. Aspects of the Novel. London: Edward Arnold.
Foster, Gordon. 1968. “Computers, Statistics and Planning: Systems or Chaos?” Geary Lecture. https://www.esri.ie/system/files/publications/GLS2.pdf.
Fourcade, Marion, and Kieran Healy. 2017. “Seeing Like a Market.” Socio-Economic Review 15 (1): 9–29. https://doi.org/10.1093/ser/mww033.
Fowler, Martin, and Kent Beck. 2018. Refactoring: Improving the Design of Existing Code. 2nd ed. New York: Addison-Wesley Professional.
Fox, John, and Robert Andersen. 2006. “Effect Displays for Multinomial and Proportional-Odds Logit Models.” Sociological Methodology 36 (1): 225–55. https://doi.org/10.1111/j.1467-9531.2006.00180.
Fox, John, Sanford Weisberg, and Brad Price. 2022. carData: Companion to Applied Regression Data Sets. https://CRAN.R-project.org/package=carData.
Franconeri, Steven, Lace Padilla, Priti Shah, Jeffrey Zacks, and Jessica Hullman. 2021. “The Science of Visual Data Communication: What Works.” Psychological Science in the Public Interest 22 (3): 110–61. https://doi.org/10.1177/15291006211051956.
Frandell, Ashlee, Mary Feeney, Timothy Johnson, Eric Welch, Lesley Michalegko, and Heyjie Jung. 2021. “The Effects of Electronic Alert Letters for Internet Surveys of Academic Scientists.” Scientometrics 126 (8): 7167–81. https://doi.org/10.1007/s11192-021-04029-3.
Franklin, Laura. 2005. “Exploratory Experiments.” Philosophy of Science 72 (5): 888–99. https://doi.org/10.1086/508117.
Frei, Christoph, and Liam Welsh. 2022. “How the Closure of a U.S. Tax Loophole May Affect Investor Portfolios.” Journal of Risk and Financial Management 15 (5): 209. https://doi.org/10.3390/jrfm15050209.
Frick, Hannah, Fanny Chow, Max Kuhn, Michael Mahoney, Julia Silge, and Hadley Wickham. 2022. rsample: General Resampling Infrastructure. https://CRAN.R-project.org/package=rsample.
Fried, Eiko, Jessica Flake, and Donald Robinaugh. 2022. “Revisiting the Theoretical and Methodological Foundations of Depression Measurement.” Nature Reviews Psychology 1 (6): 358–68. https://doi.org/10.1038/s44159-022-00050-2.
Friedman, Jerome, Robert Tibshirani, and Trevor Hastie. 2009. The Elements of Statistical Learning. 2nd ed. Springer. https://hastie.su.domains/ElemStatLearn/.
Friendly, Michael. 2021. HistData: Data Sets from the History of Statistics and Data Visualization. https://CRAN.R-project.org/package=HistData.
Friendly, Michael, and Howard Wainer. 2021. A History of Data Visualization and Graphic Communication. 1st ed. Massachusetts: Harvard University Press.
Fry, Hannah. 2020. “Big Tech Is Testing You.” The New Yorker, February, 61–65. https://www.newyorker.com/magazine/2020/03/02/big-tech-is-testing-you.
Fryzlewicz, Piotr. 2024. “Telling Stories with Data: With Applications in R.” The American Statistician, April, 1–5. https://doi.org/10.1080/00031305.2024.2339562.
Fuller, Mark, and James Mosher. 1987. “Raptor Survey Techniques.” In Raptor Management Techniques Manual, edited by Beth Pendleton, Brian Millsap, Keith Cline, and David Bird, 37–65. National Wildlife Federation. https://www.sandiegocounty.gov/content/dam/sdc/pds/ceqa/JVR/AdminRecord/IncorporatedByReference/Appendices/Appendix-D---Biological-Resources-Report/Fuller%20and%20Mosher%201987.pdf.
Funkhouser, Gray. 1937. “Historical Development of the Graphical Representation of Statistical Data.” Osiris 3: 269–404. https://doi.org/10.1086/368480.
Gagolewski, Marek. 2022. “stringi: Fast and Portable Character String Processing in R.” Journal of Statistical Software 103 (2): 1–59. https://doi.org/10.18637/jss.v103.i02.
Galef, Julia. 2020. “Episode 248: Are Democrats Being Irrational? (David Shor).” Rationally Speaking, December. http://rationallyspeakingpodcast.org/248-are-democrats-being-irrational-david-shor/.
Gao, Lucy, Jacob Bien, and Daniela Witten. 2022. “Selective Inference for Hierarchical Clustering.” Journal of the American Statistical Association, October, 1–11. https://doi.org/10.1080/01621459.2022.2116331.
Gao, Zheng, Christian Bird, and Earl T. Barr. 2017. “To Type or Not to Type: Quantifying Detectable Bugs in JavaScript.” In 2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE). IEEE. https://doi.org/10.1109/icse.2017.75.
Garfinkel, Irwin, Lee Rainwater, and Timothy Smeeding. 2006. “A Re-Examination of Welfare States and Inequality in Rich Nations: How in-Kind Transfers and Indirect Taxes Change the Story.” Journal of Policy Analysis and Management 25 (4): 897–919. https://doi.org/10.1002/pam.20213.
Gargiulo, Maria. 2022. “Statistical Biases, Measurement Challenges, and Recommendations for Studying Patterns of Femicide in Conflict.” Peace Review 34 (2): 163–76. https://doi.org/10.1080/10402659.2022.2049002.
Garnier, Simon, Noam Ross, Robert Rudis, Antônio Camargo, Marco Sciaini, and Cédric Scherer. 2021. viridis – Colorblind-Friendly Color Maps for R. https://doi.org/10.5281/zenodo.4679424.
Gazeley, Ursula, Georges Reniers, Hallie Eilerts-Spinelli, Julio Romero Prieto, Momodou Jasseh, Sammy Khagayi, and Veronique Filippi. 2022. “Women’s Risk of Death Beyond 42 Days Post Partum: A Pooled Analysis of Longitudinal Health and Demographic Surveillance System Data in Sub-Saharan Africa.” The Lancet Global Health 10 (11): e1582–89. https://doi.org/10.1016/s2214-109x(22)00339-4.
Gebru, Timnit, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman Vaughan, Hanna Wallach, Hal Daumé III, and Kate Crawford. 2021. “Datasheets for Datasets.” Communications of the ACM 64 (12): 86–92. https://doi.org/10.1145/3458723.
Gelfand, Sharla. 2021. “Make a ReprEx... Please.” YouTube, February. https://youtu.be/G5Nm-GpmrLw.
———. 2022a. Astrologer: Chani Nicholas Weekly Horoscopes (2013-2017). http://github.com/sharlagelfand/astrologer.
———. 2022b. opendatatoronto: Access the City of Toronto Open Data Portal. https://CRAN.R-project.org/package=opendatatoronto.
Gelman, Andrew. 2016. “What has happened down here is the winds have changed,” September. https://statmodeling.stat.columbia.edu/2016/09/21/what-has-happened-down-here-is-the-winds-have-changed/.
———. 2019. “Another Regression Discontinuity Disaster and What Can We Learn from It,” June. https://statmodeling.stat.columbia.edu/2019/06/25/another-regression-discontinuity-disaster-and-what-can-we-learn-from-it/.
———. 2020. “Statistical Models of Election Outcomes.” YouTube, August. https://youtu.be/7gjDnrbLQ4k.
Gelman, Andrew, John Carlin, Hal Stern, David Dunson, Aki Vehtari, and Donald Rubin. (1995) 2014. Bayesian Data Analysis. 3rd ed. Chapman & Hall/CRC.
Gelman, Andrew, Sharad Goel, Douglas Rivers, and David Rothschild. 2016. “The Mythical Swing Voter.” Quarterly Journal of Political Science 11 (1): 103–30. https://doi.org/10.1561/100.00015031.
Gelman, Andrew, and Jennifer Hill. 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models. 1st ed. Cambridge University Press.
Gelman, Andrew, Jennifer Hill, and Aki Vehtari. 2020. Regression and Other Stories. Cambridge University Press. https://avehtari.github.io/ROS-Examples/.
Gelman, Andrew, and Guido Imbens. 2019. “Why High-Order Polynomials Should Not Be Used in Regression Discontinuity Designs.” Journal of Business & Economic Statistics 37 (3): 447–56. https://doi.org/10.1080/07350015.2017.1366909.
Gelman, Andrew, and Eric Loken. 2013. “The Garden of Forking Paths: Why Multiple Comparisons Can Be a Problem, Even When There Is No ‘Fishing Expedition’ or ‘p-Hacking’ and the Research Hypothesis Was Posited Ahead of Time.” Department of Statistics, Columbia University. http://www.stat.columbia.edu/~gelman/research/unpublished/p%5Fhacking.pdf.
Gelman, Andrew, Greggor Mattson, and Daniel Simpson. 2018. “Gaydar and the Fallacy of Decontextualized Measurement.” Sociological Science 5 (12): 270–80. https://doi.org/10.15195/v5.a12.
Gelman, Andrew, Cristian Pasarica, and Rahul Dodhia. 2002. “Let’s Practice What We Preach: Turning Tables into Graphs.” The American Statistician 56 (2): 121–30. https://doi.org/10.1198/000313002317572790.
Gelman, Andrew, and Aki Vehtari. 2021. “What Are the Most Important Statistical Ideas of the Past 50 Years?” Journal of the American Statistical Association 116 (536): 2087–97. https://doi.org/10.1080/01621459.2021.1938081.
———. 2024. Active Statistics: Stories, Games, Problems, and Hands-on Demonstrations for Applied Regression and Causal Inference. Cambridge University Press. https://doi.org/10.1017/9781009436243.
Gelman, Andrew, Aki Vehtari, Daniel Simpson, Charles Margossian, Bob Carpenter, Yuling Yao, Lauren Kennedy, Jonah Gabry, Paul-Christian Bürkner, and Martin Modrák. 2020. “Bayesian Workflow.” arXiv. https://doi.org/10.48550/arXiv.2011.01808.
Gentemann, Chelle Leigh, Chris Holdgraf, Ryan Abernathey, Daniel Crichton, James Colliander, Edward Joseph Kearns, Yuvi Panda, and Richard Signell. 2021. “Science Storms the Cloud.” AGU Advances 2 (2). https://doi.org/10.1029/2020av000354.
Gerber, Alan, and Donald Green. 2012. Field Experiments: Design, Analysis, and Interpretation. New York: WW Norton.
Gerring, John. 2012. “Mere Description.” British Journal of Political Science 42 (4): 721–46. https://doi.org/10.1017/s0007123412000130.
Gertler, Paul, Sebastian Martinez, Patrick Premand, Laura Rawlings, and Christel Vermeersch. 2016. Impact Evaluation in Practice. 2nd ed. The World Bank. https://doi.org/10.1596/978-1-4648-0779-4.
Geuenich, Michael, Jinyu Hou, Sunyun Lee, Shanza Ayub, Hartland Jackson, and Kieran Campbell. 2021a. “Automated Assignment of Cell Identity from Single-Cell Multiplexed Imaging and Proteomic Data.” Cell Systems 12 (12): 1173–86. https://doi.org/10.1016/j.cels.2021.08.012.
———. 2021b. “Replication Materials: ‘Automated Assignment of Cell Identity from Single-Cell Multiplexed Imaging and Proteomic Data’.” https://doi.org/10.5281/ZENODO.5156049.
Ghitza, Yair, and Andrew Gelman. 2020. “Voter Registration Databases and MRP: Toward the Use of Large-Scale Databases in Public Opinion Research.” Political Analysis 28 (4): 507–31. https://doi.org/10.1017/pan.2020.3.
Gibney, Elizabeth. 2022. “The leap second’s time is up: world votes to stop pausing clocks.” Nature 612 (7938): 18–18. https://doi.org/10.1038/d41586-022-03783-5.
Gleick, James. 1990. “The Census: Why We Can’t Count.” The New York Times, July. https://www.nytimes.com/1990/07/15/magazine/the-census-why-we-can-t-count.html.
Godfrey, Ernest. 1918. “History and Development of Statistics in Canada.” In The History of Statistics–Their Development and Progress in Many Countries, edited by John Koren, 179–98. New York: Macmillan.
Goodman, Leo. 1961. “Snowball Sampling.” The Annals of Mathematical Statistics 32 (1): 148–70. https://doi.org/10.1214/aoms/1177705148.
Goodrich, Ben, Jonah Gabry, Imad Ali, and Sam Brilleman. 2023. “rstanarm: Bayesian applied regression modeling via Stan.” https://mc-stan.org/rstanarm.
Google. 2022. “What to Look for in a Code Review.” Google Engineering Practices Documentation. https://google.github.io/eng-practices/review/reviewer/looking-for.html.
Gordon, Brett, Robert Moakler, and Florian Zettelmeyer. 2022. “Close Enough? A Large-Scale Exploration of Non-Experimental Approaches to Advertising Measurement.” Marketing Science, November. https://doi.org/10.1287/mksc.2022.1413.
Gordon, Brett, Florian Zettelmeyer, Neha Bhargava, and Dan Chapsky. 2019. “A Comparison of Approaches to Advertising Measurement: Evidence from Big Field Experiments at Facebook.” Marketing Science 38 (2): 193–225. https://doi.org/10.1287/mksc.2018.1135.
Gould, Elliot, Hannah Fraser, Timothy Parker, Shinichi Nakagawa, Simon Griffith, Peter Vesk, and Fiona Fidler. 2023. “Same Data, Different Analysts: Variation in Effect Sizes Due to Analytical Decisions in Ecology and Evolutionary Biology,” October. https://doi.org/10.32942/x2gg62.
Graham, Paul. 2020. “How to Write Usefully,” February. http://paulgraham.com/useful.html.
Gray, Charles T., and Ben Marwick. 2019. “Truth, Proof, and Reproducibility: There’s No Counter-Attack for the Codeless.” In Communications in Computer and Information Science, 111–29. Springer Singapore. https://doi.org/10.1007/978-981-15-1960-4_8.
Green, Donald, Terence Leong, Holger Kern, Alan Gerber, and Christopher Larimer. 2009. “Testing the Accuracy of Regression Discontinuity Analysis Using Experimental Benchmarks.” Political Analysis 17 (4): 400–417. https://doi.org/10.1093/pan/mpp018.
Green, Eric. 2020. “Nivi Research: Mister P helps us understand vaccine hesitancy,” December. https://research.nivi.io/posts/2020-12-08-mister-p-helps-us-understand-vaccine-hesitancy/.
Greenberg, Bernard, Abdel-Latif Abul-Ela, Walt Simmons, and Daniel Horvitz. 1969. “The Unrelated Question Randomized Response Model: Theoretical Framework.” Journal of the American Statistical Association 64 (326): 520–39. https://doi.org/10.1080/01621459.1969.10500991.
Greenland, Sander, Stephen Senn, Kenneth Rothman, John Carlin, Charles Poole, Steven Goodman, and Douglas Altman. 2016. “Statistical Tests, P values, Confidence Intervals, and Power: A Guide to Misinterpretations.” European Journal of Epidemiology 31 (4): 337–50. https://doi.org/10.1007/s10654-016-0149-3.
Greifer, Noah. 2021. “Why Do We Do Matching for Causal Inference Vs Regressing on Confounders?” Cross Validated, September. https://stats.stackexchange.com/q/544958.
Grimmer, Justin, Margaret Roberts, and Brandon Stewart. 2022. Text As Data: A New Framework for Machine Learning and the Social Sciences. New Jersey: Princeton University Press.
Grolemund, Garrett, and Hadley Wickham. 2011. “Dates and Times Made Easy with lubridate.” Journal of Statistical Software 40 (3): 1–25. https://doi.org/10.18637/jss.v040.i03.
Gronsbell, Jessica, Jessica Minnier, Sheng Yu, Katherine Liao, and Tianxi Cai. 2019. “Automated Feature Selection of Predictors in Electronic Medical Records Data.” Biometrics 75 (1): 268–77. https://doi.org/10.1111/biom.12987.
Groves, Robert. 2011. “Three Eras of Survey Research.” Public Opinion Quarterly 75 (5): 861–71. https://doi.org/10.1093/poq/nfr057.
Groves, Robert, and Lars Lyberg. 2010. “Total Survey Error: Past, Present, and Future.” Public Opinion Quarterly 74 (5): 849–79. https://doi.org/10.1093/poq/nfq065.
Grün, Bettina, and Kurt Hornik. 2011. “topicmodels: An R Package for Fitting Topic Models.” Journal of Statistical Software 40 (13): 1–30. https://doi.org/10.18637/jss.v040.i13.
Gustafsson, Karl, and Linus Hagström. 2017. “What Is the Point? Teaching Graduate Students How to Construct Political Science Research Puzzles.” European Political Science 17 (4): 634–48. https://doi.org/10.1057/s41304-017-0130-y.
Gutman, Robert. 1958. “Birth and Death Registration in Massachusetts: II. The Inauguration of a Modern System, 1800-1849.” The Milbank Memorial Fund Quarterly 36 (4): 373–402.
Hackett, Robert. 2016. “Researchers Caused an Uproar By Publishing Data From 70,000 OkCupid Users.” Fortune, May. https://fortune.com/2016/05/18/okcupid-data-research/.
Halberstam, David. 1972. The Best and the Brightest. 1st ed. New York: Random House.
Hamming, Richard. (1997) 2020. The Art of Doing Science and Engineering. 2nd ed. Stripe Press.
Hammond, Jennifer, Heidi Leister-Tebbe, Annie Gardner, Paula Abreu, Weihang Bao, Wayne Wisemandle, MaryLynn Baniecki, et al. 2022. “Oral Nirmatrelvir for High-Risk, Nonhospitalized Adults with Covid-19.” New England Journal of Medicine 386 (15): 1397–1408. https://doi.org/10.1056/nejmoa2118542.
Hand, David. 2018. “Statistical Challenges of Administrative and Transaction Data.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 181 (3): 555–605. https://doi.org/10.1111/rssa.12315.
Handcock, Mark, and Krista Gile. 2011. “Comment: On the Concept of Snowball Sampling.” Sociological Methodology 41 (1): 367–71. https://doi.org/10.1111/j.1467-9531.2011.01243.x.
Hangartner, Dominik, Daniel Kopp, and Michael Siegenthaler. 2021. “Monitoring Hiring Discrimination Through Online Recruitment Platforms.” Nature 589 (7843): 572–76. https://doi.org/10.1038/s41586-020-03136-0.
Hanretty, Chris. 2020. “An Introduction to Multilevel Regression and Post-Stratification for Estimating Constituency Opinion.” Political Studies Review 18 (4): 630–45. https://doi.org/10.1177/1478929919864773.
Hao, Karen. 2019. “This is How AI Bias Really Happens—And Why It’s So Hard To Fix.” MIT Technology Review, February. https://www.technologyreview.com/2019/02/04/137602/this-is-how-ai-bias-really-happensand-why-its-so-hard-to-fix/.
Hart, Edmund, Pauline Barmby, David LeBauer, François Michonneau, Sarah Mount, Patrick Mulrooney, Timothée Poisot, Kara Woo, Naupaka Zimmerman, and Jeffrey Hollister. 2016. “Ten Simple Rules for Digital Data Storage.” PLOS Computational Biology 12 (10): e1005097. https://doi.org/10.1371/journal.pcbi.1005097.
Hartocollis, Anemona. 2022. “U.S. News Ranked Columbia No. 2, but a Math Professor Has His Doubts.” The New York Times, March. https://www.nytimes.com/2022/03/17/us/columbia-university-rank.html.
Hassan, Mai. 2022. “New Insights on Africa’s Autocratic Past.” African Affairs 121 (483): 321–33. https://doi.org/10.1093/afraf/adac002.
Hastie, Trevor, and Robert Tibshirani. 1990. Generalized Additive Models. 1st ed. Boca Raton: Chapman & Hall/CRC.
Hawes, Michael. 2020. “Implementing Differential Privacy: Seven Lessons From the 2020 United States Census.” Harvard Data Science Review 2 (2). https://doi.org/10.1162/99608f92.353c6f99.
Hayot, Eric. 2014. The Elements of Academic Style. New York: Columbia University Press.
Healy, Kieran. 2018. Data Visualization. New Jersey: Princeton University Press. https://socviz.co.
———. 2020. “The Kitchen Counter Observatory,” May. https://kieranhealy.org/blog/archives/2020/05/21/the-kitchen-counter-observatory/.
———. 2022. “Unhappy in Its Own Way,” July. https://kieranhealy.org/blog/archives/2022/07/22/unhappy-in-its-own-way/.
Heckathorn, Douglas. 1997. “Respondent-Driven Sampling: A New Approach to the Study of Hidden Populations.” Social Problems 44 (2): 174–99. https://doi.org/10.2307/3096941.
Heil, Benjamin, Michael Hoffman, Florian Markowetz, Su-In Lee, Casey Greene, and Stephanie Hicks. 2021. “Reproducibility Standards for Machine Learning in the Life Sciences.” Nature Methods 18 (10): 1132–35. https://doi.org/10.1038/s41592-021-01256-7.
Heller, Jean. 2022. “AP Exposes the Tuskegee Syphilis Study: The 50th Anniversary.” AP, July. https://apnews.com/article/tuskegee-study-ap-story-investigation-syphilis-53403657e77d76f52df6c2e2892788c9.
Hermans, Felienne. 2017. “Peter Hilton on Naming.” IEEE Software 34 (3): 117–20. https://doi.org/10.1109/MS.2017.81.
———. 2021. The Programmer’s Brain: What Every Programmer Needs to Know about Cognition. 1st ed. New York: Simon & Schuster. https://www.manning.com/books/the-programmers-brain.
Hernán, Miguel, David Clayton, and Niels Keiding. 2011. “The Simpson’s Paradox Unraveled.” International Journal of Epidemiology 40 (3): 780–85. https://doi.org/10.1093/ije/dyr041.
Hernán, Miguel, and James Robins. 2023. What If. 1st ed. Boca Raton: Chapman & Hall/CRC. https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/.
Herndon, Thomas, Michael Ash, and Robert Pollin. 2014. “Does High Public Debt Consistently Stifle Economic Growth? A Critique of Reinhart and Rogoff.” Cambridge Journal of Economics 38 (2): 257–79. https://doi.org/10.1093/cje/bet075.
Hester, Jim, Florent Angly, Russ Hyde, Michael Chirico, Kun Ren, Alexander Rosenstock, and Indrajeet Patil. 2022. lintr: A “Linter” for R Code. https://CRAN.R-project.org/package=lintr.
Hester, Jim, Hadley Wickham, and Gábor Csárdi. 2021. fs: Cross-Platform File System Operations Based on “libuv”. https://CRAN.R-project.org/package=fs.
Hill, Austin Bradford. 1965. “The Environment and Disease: Association or Causation?” Proceedings of the Royal Society of Medicine 58 (5): 295–300.
Hillel, Wayne. 2017. How Do We Trust Our Science Code? https://www.hillelwayne.com/how-do-we-trust-science-code/.
Ho, Daniel, Kosuke Imai, Gary King, and Elizabeth Stuart. 2011. “MatchIt: Nonparametric Preprocessing for Parametric Causal Inference.” Journal of Statistical Software 42 (8): 1–28. https://doi.org/10.18637/jss.v042.i08.
Hodgetts, Paul. 2022. “The Negative Space of Data,” March. https://hodgettsp.netlify.app/post/data-negativespace/.
Hofmeister, Johannes, Janet Siegmund, and Daniel Holt. 2017. “Shorter Identifier Names Take Longer to Comprehend.” In 2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER), 217–27. https://doi.org/10.1109/saner.2017.7884623.
Holland, Paul. 1986. “Statistics and Causal Inference.” Journal of the American Statistical Association 81 (396): 945–60. https://doi.org/10.2307/2289064.
Holliday, Derek, Tyler Reny, Alex Rossell Hayes, Aaron Rudkin, Chris Tausanovitch, and Lynn Vavreck. 2021. “Democracy Fund + UCLA Nationscape Methodology and Representativeness Assessment.”
Hopper, Nate. 2022. “The Thorny Problem of Keeping the Internet’s Time.” The New Yorker, September. https://www.newyorker.com/tech/annals-of-technology/the-thorny-problem-of-keeping-the-internets-time.
Horst, Allison Marie, Alison Presmanes Hill, and Kristen Gorman. 2020. palmerpenguins: Palmer Archipelago (Antarctica) penguin data. https://doi.org/10.5281/zenodo.3960218.
Horton, Nicholas, Rohan Alexander, Micaela Parker, Aneta Piekut, and Colin Rundel. 2022. “The Growing Importance of Reproducibility and Responsible Workflow in the Data Science and Statistics Curriculum.” Journal of Statistics and Data Science Education 30 (3): 207–8. https://doi.org/10.1080/26939169.2022.2141001.
Horton, Nicholas, and Stuart Lipsitz. 2001. “Multiple Imputation in Practice.” The American Statistician 55 (3): 244–54. https://doi.org/10.1198/000313001317098266.
Hotz, Joseph, Christopher Bollinger, Tatiana Komarova, Charles Manski, Robert Moffitt, Denis Nekipelov, Aaron Sojourner, and Bruce Spencer. 2022. “Balancing Data Privacy and Usability in the Federal Statistical System.” Proceedings of the National Academy of Sciences 119 (31): 1–10. https://doi.org/10.1073/pnas.2104906119.
Howes, Adam. 2022. “Representing Uncertainty Using Significant Figures,” April. https://athowes.github.io/posts/2022-04-24-representing-uncertainty-using-significant-figures/.
Hug, Lucia, Monica Alexander, Danzhen You, Leontine Alkema, and UN Inter-agency Group for Child. 2019. “National, Regional, and Global Levels and Trends in Neonatal Mortality Between 1990 and 2017, with Scenario-Based Projections to 2030: A Systematic Analysis.” Lancet Global Health 7 (6): e710–20. https://doi.org/10.1016/S2214-109X(19)30163-9.
Hughes, Nicola, and Jill Rutter. 2016. “Ministers Reflect: Interview with Oliver Letwin,” December. https://www.instituteforgovernment.org.uk/ministers-reflect/person/oliver-letwin/.
Hulley, Stephen, Steven Cummings, Warren Browner, Deborah Grady, and Thomas Newman. 2007. Designing Clinical Research. 3rd ed. Lippincott Williams & Wilkins.
Hullman, Jessica, and Andrew Gelman. 2021. “Designing for Interactive Exploratory Data Analysis Requires Theories of Graphical Inference.” Harvard Data Science Review 3 (3). https://doi.org/10.1162/99608f92.3ab8a587.
Huntington-Klein, Nick. 2021. The Effect: An Introduction to Research Design and Causality. 1st ed. Chapman & Hall. https://theeffectbook.net.
———. 2022. “Library of Statistical Techniques.” https://lost-stats.github.io.
Huntington-Klein, Nick, Andreu Arenas, Emily Beam, Marco Bertoni, Jeffrey Bloem, Pralhad Burli, Naibin Chen, et al. 2021. “The Influence of Hidden Researcher Decisions in Applied Microeconomics.” Economic Inquiry 59: 944–60. https://doi.org/10.1111/ecin.12992.
Huyen, Chip. 2020. “Machine Learning Is Going Real-Time,” December. https://huyenchip.com/2020/12/27/real-time-machine-learning.html.
Hvitfeldt, Emil, and Julia Silge. 2021. Supervised Machine Learning for Text Analysis in R. 1st ed. Chapman & Hall/CRC. https://doi.org/10.1201/9781003093459.
Hyman, Michael, Luca Sartore, and Linda J Young. 2021. “Capture-Recapture Estimation of Characteristics of U.S. Local Food Farms Using a Web-Scraped List Frame.” Journal of Survey Statistics and Methodology 10 (4): 979–1004. https://doi.org/10.1093/jssam/smab008.
Hyndman, Rob, Timothy Hyndman, Charles Gray, Sayani Gupta, and Jacquie Tran. 2022. cricketdata: International Cricket Data. https://CRAN.R-project.org/package=cricketdata.
Iannone, Richard. 2022. DiagrammeR: Graph/Network Visualization. https://CRAN.R-project.org/package=DiagrammeR.
Iannone, Richard, and Mauricio Vargas. 2022. pointblank: Data Validation and Organization of Metadata for Local and Remote Tables. https://CRAN.R-project.org/package=pointblank.
International Organization Of Legal Metrology. 2007. International Vocabulary of Metrology – Basic and General Concepts and Associated Terms. 3rd ed. https://www.oiml.org/en/files/pdf%5Fv/v002-200-e07.pdf.
Ioannidis, John. 2005. “Why Most Published Research Findings Are False.” PLOS Medicine 2 (8): e124. https://doi.org/10.1371/journal.pmed.0020124.
Irizarry, Rafael. 2020. “The Role of Academia in Data Science Education.” Harvard Data Science Review 2 (1). https://doi.org/10.1162/99608f92.dd363929.
Irving, Damien, Kate Hertweck, Luke Johnston, Joel Ostblom, Charlotte Wickham, and Greg Wilson. 2021. Research Software Engineering with Python. Chapman & Hall/CRC.
Isaacson, Walter. 2011. Steve Jobs. 1st ed. Simon & Schuster.
Ishiguro, Kazuo. 1989. The Remains of the Day. 1st ed. Faber & Faber.
Izrailev, Sergei. 2022. tictoc: Functions for Timing R Scripts, as Well as Implementations of “Stack” and “List” Structures. https://CRAN.R-project.org/package=tictoc.
James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. (2013) 2021. An Introduction to Statistical Learning with Applications in R. 2nd ed. Springer. https://www.statlearning.com.
Jenkins, Jennifer, Steven Rich, Andrew Ba Tran, Paige Moody, Julie Tate, and Ted Mellnik. 2022. “How the Washington Post Examines Police Shootings in the United States.” https://www.washingtonpost.com/investigations/2022/12/05/washington-post-fatal-police-shootings-methodology/.
Jet Propulsion Laboratory. 2009. “JPL Institutional Coding Standard for the C Programming Language.” Document Number D-60411, March. https://web.archive.org/web/20111015064908/http://lars-lab.jpl.nasa.gov/JPL_Coding_Standard_C.pdf.
Johnson, Alicia, Miles Ott, and Mine Dogucu. 2022. Bayes Rules! An Introduction to Bayesian Modeling with R. 1st ed. Chapman & Hall/CRC. https://www.bayesrulesbook.com.
Johnson, Kaneesha. 2021. “Two Regimes of Prison Data Collection.” Harvard Data Science Review 3 (3). https://doi.org/10.1162/99608f92.72825001.
Johnston, Myfanwy, and David Robinson. 2022. gutenbergr: Download and Process Public Domain Works from Project Gutenberg. https://CRAN.R-project.org/package=gutenbergr.
Jones, Arnold. 1953. “Census Records of the Later Roman Empire.” The Journal of Roman Studies 43: 49–64. https://doi.org/10.2307/297781.
Jordan, Michael. 2004. “Graphical Models.” Statistical Science 19 (1). https://doi.org/10.1214/088342304000000026.
———. 2019. “Artificial Intelligence–The Revolution Hasn’t Happened Yet.” Harvard Data Science Review 1 (1). https://doi.org/10.1162/99608f92.f06c6e61.
Joyner, Michael. 1991. “Modeling: Optimal Marathon Performance on the Basis of Physiological Factors.” Journal of Applied Physiology 70 (2): 683–87. https://doi.org/10.1152/jappl.1991.70.2.683.
Jurafsky, Dan, and James Martin. (2000) 2023. Speech and Language Processing. 3rd ed. https://web.stanford.edu/~jurafsky/slp3/.
Kahan, Brennan, Suzie Cro, Fan Li, and Michael Harhay. 2023. “Eliminating Ambiguous Treatment Effects Using Estimands.” American Journal of Epidemiology, February. https://doi.org/10.1093/aje/kwad036.
Kahan, Brennan, Joanna Hindley, Mark Edwards, Suzie Cro, and Tim Morris. 2024. “The estimands framework: a primer on the ICH E9(R1) addendum.” BMJ, January, e076316. https://doi.org/10.1136/bmj-2023-076316.
Kahan, Brennan, Fan Li, Andrew Copas, and Michael Harhay. 2022. “Estimands in Cluster-Randomized Trials: Choosing Analyses That Answer the Right Question.” International Journal of Epidemiology, July. https://doi.org/10.1093/ije/dyac131.
Kahle, David, and Hadley Wickham. 2013. “ggmap: Spatial Visualization with ggplot2.” The R Journal 5 (1): 144–61. http://journal.r-project.org/archive/2013-1/kahle-wickham.pdf.
Kahneman, Daniel, Olivier Sibony, and Cass Sunstein. 2021. Noise: A Flaw in Human Judgment. William Collins.
Kalamara, Eleni, Arthur Turrell, Chris Redl, George Kapetanios, and Sujit Kapadia. 2022. “Making text count: Economic forecasting using newspaper text.” Journal of Applied Econometrics 37 (5): 896–919. https://doi.org/10.1002/jae.2907.
Kalgin, Alexander. 2014. “Implementation of Performance Management in Regional Government in Russia: Evidence of Data Manipulation.” Public Management Review 18 (1): 110–38. https://doi.org/10.1080/14719037.2014.965271.
Kapoor, Sayash, and Arvind Narayanan. 2023. “Leakage and the Reproducibility Crisis in Machine-Learning-Based Science.” Patterns 4 (9): 1–12. https://doi.org/10.1016/j.patter.2023.100804.
Karsten, Karl. 1923. Charts and Graphs. New York: Prentice-Hall.
Kasy, Maximilian, and Alexander Teytelboym. 2023. “Matching with Semi-Bandits.” The Econometrics Journal 26 (1): 45–66. https://doi.org/10.1093/ectj/utac021.
Katz, Lindsay, and Rohan Alexander. 2023a. “A new, comprehensive database of all proceedings of the Australian Parliamentary Debates (1998-2022).” Zenodo. https://doi.org/10.5281/zenodo.7799678.
———. 2023b. “Digitization of the Australian Parliamentary Debates, 1998–2022.” Scientific Data 10 (1): 1–14. https://doi.org/10.1038/s41597-023-02464-w.
Kay, Matthew. 2022. tidybayes: Tidy Data and Geoms for Bayesian Models. https://doi.org/10.5281/zenodo.1308151.
Kennedy, Lauren, and Jonah Gabry. 2020. “MRP with rstanarm,” July. https://mc-stan.org/rstanarm/articles/mrp.html.
Kennedy, Lauren, and Andrew Gelman. 2021. “Know Your Population and Know Your Model: Using Model-Based Regression and Poststratification to Generalize Findings Beyond the Observed Sample.” Psychological Methods 26 (5): 547–58. https://doi.org/10.1037/met0000362.
Kennedy, Lauren, Katharine Khanna, Daniel Simpson, Andrew Gelman, Yajun Jia, and Julien Teitler. 2022. “He, She, They: Using Sex and Gender in Survey Adjustment.” https://arxiv.org/abs/2009.14401.
Kenny, Christopher T., Shiro Kuriwaki, Cory McCartan, Evan T. R. Rosenman, Tyler Simko, and Kosuke Imai. 2021. “The use of differential privacy for census data and its impact on redistricting: The case of the 2020 U.S. Census.” Science Advances 7 (41). https://doi.org/10.1126/sciadv.abk3283.
———. 2023. “Comment: The Essential Role of Policy Evaluation for the 2020 Census Disclosure Avoidance System.” Harvard Data Science Review, no. Special Issue 2. https://doi.org/10.1162/99608f92.abc2c765.
Kent, William. 1993. “My Height: A Model for Numeric Information.” https://www.bkent.net/Doc/myheight.htm.
Keshav, Srinivasan. 2007. “How to Read a Paper.” ACM SIGCOMM Computer Communication Review 37 (3): 83–84. https://doi.org/10.1145/1273445.1273458.
Keyes, Os. 2019. “Counting the Countless.” Real Life. https://reallifemag.com/counting-the-countless/.
Kharecha, Pushker, and James Hansen. 2013. “Prevented Mortality and Greenhouse Gas Emissions from Historical and Projected Nuclear Power.” Environmental Science & Technology 47 (9): 4889–95. https://doi.org/10.1021/es3051197.
Kiang, Mathew, Alexander Tsai, Monica Alexander, David Rehkopf, and Sanjay Basu. 2021. “Racial/Ethnic Disparities in Opioid-Related Mortality in the USA, 1999–2019: The Extreme Case of Washington DC.” Journal of Urban Health 98 (5): 589–95. https://doi.org/10.1007/s11524-021-00573-8.
King, Gary. 2006. “Publication, Publication.” PS: Political Science & Politics 39 (1): 119–25. https://doi.org/10.1017/S1049096506060252.
King, Gary, and Richard Nielsen. 2019. “Why Propensity Scores Should Not Be Used for Matching.” Political Analysis 27 (4): 435–54. https://doi.org/10.1017/pan.2019.11.
King, Stephen. 2000. On Writing: A Memoir of the Craft. 1st ed. Scribner.
Kirkegaard, Emil, and Julius Bjerrekær. 2016. “The OKCupid Dataset: A Very Large Public Dataset of Dating Site Users.” Open Differential Psychology, 1–10. https://doi.org/10.26775/ODP.2016.11.03.
Kish, Leslie. 1959. “Some Statistical Problems in Research Design.” American Sociological Review 24 (3): 328–38. https://doi.org/10.2307/2089381.
Kleiber, Christian, and Achim Zeileis. 2008. Applied Econometrics with R. New York: Springer-Verlag. https://CRAN.R-project.org/package=AER.
Knuth, Donald. 1984. “Literate Programming.” The Computer Journal 27 (2): 97–111. https://doi.org/10.1093/comjnl/27.2.97.
———. 1998. Art of Computer Programming, Volume 2: Seminumerical Algorithms. 2nd ed.
Knutson, Victoria, Serge Aleshin-Guendel, Ariel Karlinsky, William Msemburi, and Jon Wakefield. 2022. “Estimating Global and Country-Specific Excess Mortality During the COVID-19 Pandemic,” May. https://cdn.who.int/media/docs/default-source/world-health-data-platform/covid-19-excessmortality/covid-methods-paper-revision.pdf.
Koenecke, Allison, Andrew Nam, Emily Lake, Joe Nudell, Minnie Quartey, Zion Mengesha, Connor Toups, John Rickford, Dan Jurafsky, and Sharad Goel. 2020. “Racial Disparities in Automated Speech Recognition.” Proceedings of the National Academy of Sciences 117 (14): 7684–89. https://doi.org/10.1073/pnas.1915768117.
Koenecke, Allison, and Hal Varian. 2020. “Synthetic Data Generation for Economists.” https://arxiv.org/abs/2011.01374.
Koenker, Roger, and Achim Zeileis. 2009. “On Reproducible Econometric Research.” Journal of Applied Econometrics 24 (5): 833–47. https://doi.org/10.1002/jae.1083.
Koerner, Lisbet. 2000. Linnaeus: Nature and Nation. Cambridge: Harvard University Press.
Kohavi, Ron, Alex Deng, Brian Frasca, Roger Longbotham, Toby Walker, and Ya Xu. 2012. “Trustworthy Online Controlled Experiments.” In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD 12, 1st ed. ACM Press. https://doi.org/10.1145/2339530.2339653.
Kohavi, Ron, Diane Tang, and Ya Xu. 2020. Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing. Cambridge University Press.
Koitsalu, Marie, Martin Eklund, Jan Adolfsson, Henrik Grönberg, and Yvonne Brandberg. 2018. “Effects of Pre-Notification, Invitation Length, Questionnaire Length and Reminder on Participation Rate: A Quasi-Randomised Controlled Trial.” BMC Medical Research Methodology 18 (3): 1–5. https://doi.org/10.1186/s12874-017-0467-5.
Krantz, Sebastian. 2023. collapse: Advanced and Fast Data Transformation. https://CRAN.R-project.org/package=collapse.
Kuhn, Max. 2022. tune: Tidy Tuning Tools. https://CRAN.R-project.org/package=tune.
Kuhn, Max, and Hannah Frick. 2022. poissonreg: Model Wrappers for Poisson Regression. https://CRAN.R-project.org/package=poissonreg.
Kuhn, Max, and Davis Vaughan. 2022. parsnip: A Common API to Modeling and Analysis Functions. https://CRAN.R-project.org/package=parsnip.
Kuhn, Max, Davis Vaughan, and Emil Hvitfeldt. 2022. yardstick: Tidy Characterizations of Model Performance. https://CRAN.R-project.org/package=yardstick.
Kuhn, Max, and Hadley Wickham. 2020. tidymodels: a collection of packages for modeling and machine learning using tidyverse principles. https://www.tidymodels.org.
———. 2022. recipes: Preprocessing and Feature Engineering Steps for Modeling. https://CRAN.R-project.org/package=recipes.
Kuriwaki, Shiro, Will Beasley, and Thomas Leeper. 2023. dataverse: R Client for Dataverse 4+ Repositories.
Kuznets, Simon, Lillian Epstein, and Elizabeth Jenks. 1941. National Income and Its Composition, 1919-1938. National Bureau of Economic Research.
Lamott, Anne. 1994. Bird by Bird: Some Instructions on Writing and Life. Anchor Books.
Landau, William Michael. 2021. “The targets R Package: A Dynamic Make-Like Function-Oriented Pipeline Toolkit for Reproducibility and High-Performance Computing.” Journal of Open Source Software 6 (57): 2959. https://doi.org/10.21105/joss.02959.
Lane, Nick. 2015. “The Unseen World: Reflections on Leeuwenhoek (1677) ‘Concerning Little Animals’.” Philosophical Transactions of the Royal Society B: Biological Sciences 370 (1666): 20140344. https://doi.org/10.1098/rstb.2014.0344.
Laouenan, Morgane, Palaash Bhargava, Jean-Benoît Eyméoud, Olivier Gergaud, Guillaume Plique, and Etienne Wasmer. 2022. “A Cross-Verified Database of Notable People, 3500BC–2018AD.” Scientific Data 9 (290). https://doi.org/10.1038/s41597-022-01369-4.
Larmarange, Joseph. 2023. labelled: Manipulating Labelled Data. https://CRAN.R-project.org/package=labelled.
Latour, Bruno. 1996. “On Actor-Network Theory: A Few Clarifications.” Soziale Welt 47 (4): 369–81. http://www.jstor.org/stable/40878163.
Lauderdale, Benjamin, Delia Bailey, Jack Blumenau, and Douglas Rivers. 2020. “Model-Based Pre-Election Polling for National and Sub-National Outcomes in the US and UK.” International Journal of Forecasting 36 (2): 399–413. https://doi.org/10.1016/j.ijforecast.2019.05.012.
Laver, Michael, Kenneth Benoit, and John Garry. 2003. “Extracting Policy Positions from Political Texts Using Words as Data.” American Political Science Review 97 (2): 311–31. https://doi.org/10.1017/S0003055403000698.
Leek, Jeff, Blakeley McShane, Andrew Gelman, David Colquhoun, Michèle Nuijten, and Steven Goodman. 2017. “Five Ways to Fix Statistics.” Nature 551 (7682): 557–59. https://doi.org/10.1038/d41586-017-07522-z.
Leek, Jeff, and Roger Peng. 2020. “Advanced Data Science 2020.” http://jtleek.com/ads2020/index.html.
Leonelli, Sabina. 2020. “Learning from Data Journeys.” In Data Journeys in the Sciences, 1–24. Springer International Publishing. https://doi.org/10.1007/978-3-030-37177-7_1.
Leos-Barajas, Vianey, Theoni Photopoulou, Roland Langrock, Toby Patterson, Yuuki Watanabe, Megan Murgatroyd, and Yannis Papastamatiou. 2016. “Analysis of Animal Accelerometer Data Using Hidden Markov Models.” Methods in Ecology and Evolution 8 (2): 161–73. https://doi.org/10.1111/2041-210x.12657.
Letterman, Clark. 2021. “Q&A: How Pew Research Center surveyed nearly 30,000 people in India,” July. https://medium.com/pew-research-center-decoded/q-a-how-pew-research-center-surveyed-nearly-30-000-people-in-india-7c778f6d650e.
Levay, Kevin, Jeremy Freese, and James Druckman. 2016. “The Demographic and Political Composition of Mechanical Turk Samples.” SAGE Open 6 (1): 1–17. https://doi.org/10.1177/2158244016636433.
Levine, Judah, Patrizia Tavella, and Martin Milton. 2022. “Towards a Consensus on a Continuous Coordinated Universal Time.” Metrologia 60 (1): 014001. https://doi.org/10.1088/1681-7575/ac9da5.
Lewis, Crystal. 2024. Data Management in Large-Scale Education Research. 1st ed. Chapman & Hall/CRC. https://datamgmtinedresearch.com/index.html.
Lichand, Guilherme, and Sharon Wolf. 2022. “Measuring Child Labor: Whom Should Be Asked, and Why It Matters,” March. https://doi.org/10.21203/rs.3.rs-1474562/v1.
Light, Richard, Judith Singer, and John Willett. 1990. By Design: Planning Research on Higher Education. 1st ed. Cambridge: Harvard University Press.
Lima, Renato de, Oliver Phillips, Alvaro Duque, Sebastian Tello, Stuart Davies, Alexandre Adalardo de Oliveira, Sandra Muller, et al. 2022. “Making Forest Data Fair and Open.” Nature Ecology & Evolution 6 (April): 656–58. https://doi.org/10.1038/s41559-022-01738-7.
Lin, Herbert. 2014. “A Proposal to Reduce Government Overclassification of Information Related to National Security.” Journal of National Security Law and Policy 7: 443–63.
Lin, Sarah, Ibraheem Ali, and Greg Wilson. 2021. “Ten Quick Tips for Making Things Findable.” PLOS Computational Biology 16 (12): 1–10. https://doi.org/10.1371/journal.pcbi.1008469.
Lips, Hilary. 2020. Sex and Gender: An Introduction. 7th ed. Illinois: Waveland Press.
Little, Roderick, and Roger Lewis. 2021. “Estimands, Estimators, and Estimates.” JAMA 326 (10): 967. https://doi.org/10.1001/jama.2021.2886.
Liu, Emily, Lenny Bronner, and Jeremy Bowers. 2022. “What the Washington Post Elections Engineering Team Had to Learn about Election Data.” Washington Post Engineering, April. https://washpost.engineering/what-the-washington-post-elections-engineering-team-had-to-learn-about-election-data-a41603daf9ca.
Lockheed Martin. 2005. “Joint Strike Fighter Air Vehicle C++ Coding Standards For The System Development And Demonstration Program.” Document Number 2RDU00001 Rev C, December. https://www.stroustrup.com/JSF-AV-rules.pdf.
Lohr, Sharon. (1999) 2022. Sampling: Design and Analysis. 3rd ed. Chapman & Hall/CRC.
Loken, Meredith, and Hilary Matfess. 2023. “Introducing the Women’s Activities in Armed Rebellion (WAAR) Project, 1946-2015.” Journal of Peace Research.
Lovelace, Robin, Jakub Nowosad, and Jannes Muenchow. 2019. Geocomputation with R. 1st ed. Chapman & Hall/CRC. https://geocompr.robinlovelace.net.
Lucas, Jack, Reed Merrill, Kelly Blidook, Sandra Breux, Laura Conrad, Gabriel Eidelman, Royce Koop, et al. 2020. “Canadian Municipal Elections Database.” Scholars Portal Dataverse. https://doi.org/10.5683/sp2/4mzjpq.
Lucas, Robert. 1978. “Asset Prices in an Exchange Economy.” Econometrica 46 (6): 1429–45. https://doi.org/10.2307/1913837.
Luebke, David Martin, and Sybil Milton. 1994. “Locating the Victim: An Overview of Census-Taking, Tabulation Technology, and Persecution in Nazi Germany.” IEEE Annals of the History of Computing 16 (3): 25–39. https://doi.org/10.1109/MAHC.1994.298418.
Lumley, Thomas. 2020. “survey: analysis of complex survey samples.” https://cran.r-project.org/web/packages/survey/index.html.
Lundberg, Ian, Rebecca Johnson, and Brandon Stewart. 2021. “What Is Your Estimand? Defining the Target Quantity Connects Statistical Evidence to Theory.” American Sociological Review 86 (3): 532–65. https://doi.org/10.1177/00031224211004187.
Luscombe, Alex, Kevin Dick, and Kevin Walby. 2021. “Algorithmic Thinking in the Public Interest: Navigating Technical, Legal, and Ethical Hurdles to Web Scraping in the Social Sciences.” Quality & Quantity 56 (3): 1–22. https://doi.org/10.1007/s11135-021-01164-0.
Luscombe, Alex, Jamie Duncan, and Kevin Walby. 2022. “Jumpstarting the Justice Disciplines: A Computational-Qualitative Approach to Collecting and Analyzing Text and Image Data in Criminology and Criminal Justice Studies.” Journal of Criminal Justice Education 33 (2): 151–71. https://doi.org/10.1080/10511253.2022.2027477.
Luscombe, Alex, and Alexander McClelland. 2020. “Policing the Pandemic: Tracking the Policing of Covid-19 Across Canada,” April. https://doi.org/10.31235/osf.io/9pn27.
Lyman, Frank. 1981. “The Responsive Classroom Discussion: The Inclusion of All Students.” Mainstreaming Digest 109: 109–13.
MacDorman, Marian, and Eugene Declercq. 2018. “The Failure of United States Maternal Mortality Reporting and Its Impact on Women’s Lives.” Birth 45 (2): 105–8. https://doi.org/10.1111/birt.12333.
Maher, Michael. 1982. “Modelling Association Football Scores.” Statistica Neerlandica 36 (3): 109–18. https://doi.org/10.1111/j.1467-9574.1982.tb00782.x.
Maier, Maximilian, František Bartoš, Tom Stanley, David Shanks, Adam Harris, and Eric-Jan Wagenmakers. 2022. “No Evidence for Nudging After Adjusting for Publication Bias.” Proceedings of the National Academy of Sciences 119 (31): e2200300119. https://doi.org/10.1073/pnas.2200300119.
Mammoliti, Anthony, Petr Smirnov, Minoru Nakano, Zhaleh Safikhani, Christopher Eeles, Heewon Seo, Sisira Kadambat Nair, et al. 2021. “Orchestrating and Sharing Large Multimodal Data for Transparent and Reproducible Research.” Nature Communications 12 (1). https://doi.org/10.1038/s41467-021-25974-w.
Manski, Charles. 2022. “Inference with Imputed Data: The Allure of Making Stuff Up.” arXiv. https://doi.org/10.48550/arXiv.2205.07388.
Marchese, David. 2022. “Her Discovery Changed the World. How Does She Think We Should Use It?” The New York Times, August. https://www.nytimes.com/interactive/2022/08/15/magazine/jennifer-doudna-crispr-interview.html.
Marlowe, Christopher. 1604. The Tragical History of Doctor Faustus. https://www.gutenberg.org/files/779/779-h/779-h.htm.
———. 1616. The Tragical History of Doctor Faustus. https://www.gutenberg.org/cache/epub/811/pg811-images.html.
Martin, Charles, and Ben Popper. 2021. “Don’t Push That Button: Exploring the Software That Flies SpaceX Rockets and Starships.” The Overflow, December. https://stackoverflow.blog/2021/12/27/dont-push-that-button-exploring-the-software-that-flies-spacex-starships/.
Martínez, Luis. 2022. “How Much Should We Trust the Dictator’s GDP Growth Estimates?” Journal of Political Economy 130 (10): 2731–69. https://doi.org/10.1086/720458.
Matias, Nathan, Kevin Munger, Marianne Aubin Le Quere, and Charles Ebersole. 2021. “The Upworthy Research Archive, a time series of 32,487 experiments in U.S. media.” Scientific Data 8 (1): 1–8. https://doi.org/10.1038/s41597-021-00934-7.
Matsumoto, Yukihiro. 2007. “Treating Code as an Essay.” In Beautiful Code, edited by Andy Oram and Greg Wilson, 477–81. O’Reilly.
Mattson, Greggor. 2017. “Artificial Intelligence Discovers Gayface. Sigh.” https://greggormattson.com/2017/09/09/artificial-intelligence-discovers-gayface/amp/.
McCarthy, Fiona M., Tamsin E. M. Jones, Anne E. Kwitek, Cynthia L. Smith, Peter D. Vize, Monte Westerfield, and Elspeth A. Bruford. 2023. “The Case for Standardizing Gene Nomenclature in Vertebrates.” Nature 614 (7948): E31–32. https://doi.org/10.1038/s41586-022-05633-w.
McClelland, Alexander. 2019. “‘Lock This Whore up’: Legal Violence and Flows of Information Precipitating Personal Violence Against People Criminalised for HIV-Related Crimes in Canada.” European Journal of Risk Regulation 10 (1): 132–47. https://doi.org/10.1017/err.2019.20.
McElreath, Richard. (2015) 2020. Statistical Rethinking: A Bayesian Course with Examples in R and Stan. 2nd ed. Chapman & Hall/CRC.
———. 2020. “Science as Amateur Software Development.” YouTube, September. https://youtu.be/zwRdO9%5FGGhY.
McIlroy, Doug, Ray Brownrigg, Thomas Minka, and Roger Bivand. 2023. mapproj: Map Projections. https://CRAN.R-project.org/package=mapproj.
McKenzie, David. 2021. “What Do You Need To Do To Make A Matching Estimator Convincing? Rhetorical vs Statistical Checks.” World Bank Blogs—Development Impact, February. https://blogs.worldbank.org/impactevaluations/what-do-you-need-do-make-matching-estimator-convincing-rhetorical-vs-statistical.
McKinney, Wes. (2011) 2022. Python for Data Analysis. 3rd ed. https://wesmckinney.com/book/.
McPhee, John. 2017. Draft No. 4. 1st ed. Farrar, Straus & Giroux.
McQuire, Scott. 2019. “One Map to Rule Them All? Google Maps as Digital Technical Object.” Communication and the Public 4 (2): 150–65. https://doi.org/10.1177/2057047319850192.
Mellon, Jonathan. 2024. “Rain, Rain, Go Away: 194 Potential Exclusion‐restriction Violations for Studies Using Weather as an Instrumental Variable.” American Journal of Political Science, 1–18. https://doi.org/10.1111/ajps.12894.
Meng, Xiao-Li. 1994. “Multiple-Imputation Inferences with Uncongenial Sources of Input.” Statistical Science 9 (4): 538–58. https://doi.org/10.1214/ss/1177010269.
———. 2012. “You Want Me to Analyze Data I Don’t Have? Are You Insane?” Shanghai Archives of Psychiatry 24 (5): 297–301. https://doi.org/10.3969/j.issn.1002-0829.2012.05.011.
———. 2018. “Statistical Paradises and Paradoxes in Big Data (I): Law of Large Populations, Big Data Paradox, and the 2016 US Presidential Election.” The Annals of Applied Statistics 12 (2): 685–726. https://doi.org/10.1214/18-AOAS1161SF.
———. 2021. “What Are the Values of Data, Data Science, or Data Scientists?” Harvard Data Science Review 3 (1). https://doi.org/10.1162/99608f92.ee717cf7.
Merali, Zeeya. 2010. “Computational Science:... Error.” Nature 467 (7317): 775–77. https://doi.org/10.1038/467775a.
Miceli, Milagros, Julian Posada, and Tianling Yang. 2022. “Studying up Machine Learning Data.” Proceedings of the ACM on Human-Computer Interaction 6 (January): 1–14. https://doi.org/10.1145/3492853.
Michener, William. 2015. “Ten Simple Rules for Creating a Good Data Management Plan.” PLOS Computational Biology 11 (10): e1004525. https://doi.org/10.1371/journal.pcbi.1004525.
Mill, James. 1817. The History of British India. 1st ed. https://books.google.ca/books?id=Orw_AAAAcAAJ.
Miller, Greg. 2014. “The Cartographer Who’s Transforming Map Design.” Wired, October. https://www.wired.com/2014/10/cindy-brewer-map-design/.
Miller, Michael, and Joseph Sutherland. 2022. “The Effect of Gender on Interruptions at Congressional Hearings.” American Political Science Review, 1–19. https://doi.org/10.1017/S0003055422000260.
Mills, David L. 1991. “Internet Time Synchronization: The Network Time Protocol.” IEEE Transactions on Communications 39 (10): 1482–93.
Mindell, David. 2008. Digital Apollo: Human and Machine in Spaceflight. 1st ed. New York: The MIT Press.
Mineault, Patrick, and The Good Research Code Handbook Community. 2021. “The Good Research Code Handbook.” https://doi.org/10.5281/zenodo.5796873.
Minsky, Yaron. 2011. “OCaml for the masses.” Communications of the ACM 54 (11): 53–58. https://doi.org/10.1145/2018396.2018413.
———. 2015. “Automated Trading and OCaml with Yaron Minsky.” Hackers — Software Engineering Daily, November. https://softwareengineeringdaily.com/2015/11/09/automated-trading-and-ocaml-with-yaron-minsky/.
Mitchell, Alanna. 2022a. “Get Ready for the New, Improved Second.” The New York Times, April. https://www.nytimes.com/2022/04/25/science/time-second-measurement.html.
———. 2022b. “Time Has Run Out for the Leap Second.” The New York Times, November. https://www.nytimes.com/2022/11/14/science/time-leap-second.html.
Mitchell, Margaret, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and Timnit Gebru. 2019. “Model Cards for Model Reporting.” Proceedings of the Conference on Fairness, Accountability, and Transparency, January. https://doi.org/10.1145/3287560.3287596.
Mitrovski, Alen, Xiaoyan Yang, and Matthew Wankiewicz. 2020. “Joe Biden Projected to Win Popular Vote in 2020 US Election.” https://github.com/matthewwankiewicz/US_election_forecast.
Miyakawa, Tsuyoshi. 2020. “No Raw Data, No Science: Another Possible Source of the Reproducibility Crisis.” Molecular Brain 13 (1): 1–6. https://doi.org/10.1186/s13041-020-0552-2.
Mok, Lillio, Samuel Way, Lucas Maystre, and Ashton Anderson. 2022. “The Dynamics of Exploration on Spotify.” In Proceedings of the International AAAI Conference on Web and Social Media, 16:663–74. https://doi.org/10.1609/icwsm.v16i1.19324.
Molanphy, Chris. 2012. “100 & Single: Three Rules to Define the Term ‘One-Hit Wonder’ in 2012.” The Village Voice, September. https://www.villagevoice.com/2012/09/10/100-single-three-rules-to-define-the-term-one-hit-wonder-in-2012/.
Morange, Michel. 2016. A History of Biology. New Jersey: Princeton University Press.
Moyer, Brian, and Abe Dunn. 2020. “Measuring the Gross Domestic Product (GDP): The Ultimate Data Science Project.” Harvard Data Science Review 2 (1). https://doi.org/10.1162/99608f92.414caadb.
Mullard, Asher. 2021. “Half of Top Cancer Studies Fail High-Profile Reproducibility Effort.” Nature 600 (7889): 368–69. https://doi.org/10.1038/d41586-021-03691-0.
Müller, Kirill. 2020. here: A Simpler Way to Find Your Files. https://CRAN.R-project.org/package=here.
Müller, Kirill, Tobias Schieferdecker, and Patrick Schratz. 2019. Visualization, Transformation and Reporting with the Tidyverse. https://krlmlr.github.io/vistransrep/.
Müller, Kirill, and Lorenz Walthert. 2022. styler: Non-Invasive Pretty Printing of R Code. https://CRAN.R-project.org/package=styler.
Müller, Kirill, and Hadley Wickham. 2022. tibble: Simple Data Frames. https://CRAN.R-project.org/package=tibble.
Murphy, Heather. 2017. “Why Stanford Researchers Tried to Create a ‘Gaydar’ Machine.” The New York Times, October. https://www.nytimes.com/2017/10/09/science/stanford-sexual-orientation-study.html.
National Academies of Sciences, Engineering, and Medicine. 2019. Reproducibility and Replicability in Science. 1st ed. National Academies Press. https://doi.org/10.17226/25303.
Navarro, Danielle. 2022. “Binding Apache Arrow to R,” January. https://blog.djnavarro.net/posts/2022-01-18%5Fbinding-arrow-to-r/.
Navarro, Danielle, Jonathan Keane, and Stephanie Hazlitt. 2022. “Larger-Than-Memory Data Workflows with Apache Arrow,” June. https://arrow-user2022.netlify.app.
Nelder, John. 1999. “From Statistics to Statistical Science.” Journal of the Royal Statistical Society: Series D (The Statistician) 48 (2): 257–69. https://doi.org/10.1111/1467-9884.00187.
Nelder, John, and Robert Wedderburn. 1972. “Generalized Linear Models.” Journal of the Royal Statistical Society: Series A (General) 135 (3): 370–84. https://doi.org/10.2307/2344614.
Neufeld, Anna, and Daniela Witten. 2021. “Discussion of Breiman’s ‘Two Cultures’: From Two Cultures to One.” Observational Studies 7 (1): 171–74. https://doi.org/10.1353/obs.2021.0004.
Neufeld, Michael. 2002. “Wernher von Braun, the SS, and Concentration Camp Labor: Questions of Moral, Political, and Criminal Responsibility.” German Studies Review 25 (1): 57–78. https://doi.org/10.2307/1433245.
Neuwirth, Erich. 2022. RColorBrewer: ColorBrewer Palettes. https://CRAN.R-project.org/package=RColorBrewer.
Newman, Daniel. 2014. “Missing Data: Five Practical Guidelines.” Organizational Research Methods 17 (4): 372–411. https://doi.org/10.1177/1094428114548590.
Neyman, Jerzy. 1934. “On the Two Different Aspects of the Representative Method: The Method of Stratified Sampling and the Method of Purposive Selection.” Journal of the Royal Statistical Society 97 (4): 558–625. https://doi.org/10.2307/2342192.
Nix, Justin, and M. James Lozada. 2020. “Police Killings of Unarmed Black Americans: A Reassessment of Community Mental Health Spillover Effects,” January. https://doi.org/10.31235/osf.io/ajz2q.
Nobles, Melissa. 2002. “Racial Categorization and Censuses.” In Census and Identity: The Politics of Race, Ethnicity, and Language in National Censuses, edited by David Kertzer and Dominique Arel, 43–70. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511606045.003.
Northcutt, Curtis, Anish Athalye, and Jonas Mueller. 2021. “Pervasive Label Errors in Test Sets Destabilize Machine Learning Benchmarks.” In Proceedings of the 35th Conference on Neural Information Processing Systems Track on Datasets and Benchmarks. https://doi.org/10.48550/arXiv.2103.14749.
Obermeyer, Ziad, Brian Powers, Christine Vogeli, and Sendhil Mullainathan. 2019. “Dissecting Racial Bias in an Algorithm Used to Manage the Health of Populations.” Science 366 (6464): 447–53. https://doi.org/10.1126/science.aax2342.
Oberski, Daniel, and Frauke Kreuter. 2020. “Differential Privacy and Social Science: An Urgent Puzzle.” Harvard Data Science Review 2 (1). https://doi.org/10.1162/99608f92.63a22079.
OECD. 2014. “The Essential Macroeconomic Aggregates.” In Understanding National Accounts, 13–46. OECD. https://doi.org/10.1787/9789264214637-2-en.
———. 2022. Quarterly GDP. https://data.oecd.org/gdp/quarterly-gdp.htm.
Ooms, Jeroen. 2014. “The jsonlite Package: A Practical and Consistent Mapping Between JSON Data and R Objects.” arXiv:1403.2805 [Stat.CO]. https://arxiv.org/abs/1403.2805.
———. 2022a. openssl: Toolkit for Encryption, Signatures and Certificates Based on OpenSSL. https://CRAN.R-project.org/package=openssl.
———. 2022b. pdftools: Text Extraction, Rendering and Converting of PDF Documents. https://CRAN.R-project.org/package=pdftools.
———. 2022c. ssh: Secure Shell (SSH) Client for R. https://CRAN.R-project.org/package=ssh.
———. 2022d. tesseract: Open Source OCR Engine. https://CRAN.R-project.org/package=tesseract.
Open Science Collaboration. 2015. “Estimating the Reproducibility of Psychological Science.” Science 349 (6251): aac4716. https://doi.org/10.1126/science.aac4716.
Orwell, George. 1946. Politics and the English Language. https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/politics-and-the-english-language/.
Osborne, Jason. 2012. Best Practices in Data Cleaning: A Complete Guide to Everything You Need to Do Before and After Collecting Your Data. SAGE Publications.
Osgood, D. Wayne. 2000. “Poisson-Based Regression Analysis of Aggregate Crime Rates.” Journal of Quantitative Criminology 16 (1): 21–43. https://doi.org/10.1023/a:1007521427059.
Palmer Station Antarctica LTER, and Gorman, Kristen. 2020. “Structural Size Measurements and Isotopic Signatures of Foraging Among Adult Male and Female Adélie Penguins (Pygoscelis Adeliae) Nesting Along the Palmer Archipelago Near Palmer Station, 2007-2009.” https://doi.org/10.6073/PASTA/98B16D7D563F265CB52372C8CA99E60F.
Pasek, Josh. 2015. “Predicting Elections: Considering Tools to Pool the Polls.” Public Opinion Quarterly 79 (2): 594–619. https://doi.org/10.1093/poq/nfu060.
Patki, Neha, Roy Wedge, and Kalyan Veeramachaneni. 2016. “The Synthetic Data Vault.” In 2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA), 399–410. https://doi.org/10.1109/DSAA.2016.49.
Paullada, Amandalynne, Inioluwa Deborah Raji, Emily Bender, Emily Denton, and Alex Hanna. 2021. “Data and Its (Dis)contents: A Survey of Dataset Development and Use in Machine Learning Research.” Patterns 2 (11): 100336. https://doi.org/10.1016/j.patter.2021.100336.
Pavlik, Kaylin. 2019. “Understanding + Classifying Genres Using Spotify Audio Features.” https://www.kaylinpavlik.com/classifying-songs-genres/.
Pedersen, Thomas Lin. 2022. patchwork: The Composer of Plots. https://CRAN.R-project.org/package=patchwork.
Penrose, Carly. 2024. “Deadly Fires: Risk of Death, Injury Highest in Toronto’s Poor Neighbourhoods.” CBC News, April. https://www.cbc.ca/news/canada/toronto/fatal-fires-lower-income-1.7177356.
Perepolkin, Dmytro. 2022. polite: Be Nice on the Web. https://CRAN.R-project.org/package=polite.
Perkel, Jeffrey. 2021. “Ten Computer Codes That Transformed Science.” Nature 589 (7842): 344–48. https://doi.org/10.1038/d41586-021-00075-2.
———. 2023. “The Sleight-of-Hand Trick That Can Simplify Scientific Computing.” Nature 617 (7959): 212–13. https://doi.org/10.1038/d41586-023-01469-0.
Phillips, Alban. 1958. “The Relation Between Unemployment and the Rate of Change of Money Wage Rates in the United Kingdom, 1861-1957.” Economica 25 (100): 283–99. https://doi.org/10.1111/j.1468-0335.1958.tb00003.x.
Piller, Charles. 2022. “Blots on a Field?” Science 377 (6604): 358–63. https://doi.org/10.1126/science.ade0209.
Pineau, Joelle, Philippe Vincent-Lamarre, Koustuv Sinha, Vincent Larivière, Alina Beygelzimer, Florence d’Alché-Buc, Emily Fox, and Hugo Larochelle. 2021. “Improving Reproducibility in Machine Learning Research (a Report from the NeurIPS 2019 Reproducibility Program).” Journal of Machine Learning Research 22 (164): 1–20. http://jmlr.org/papers/v22/20-303.html.
Pitman, Jim. 1993. Probability. 1st ed. New York: Springer. https://doi.org/10.1007/978-1-4612-4374-8.
Plant, Anne, and Robert Hanisch. 2020. “Reproducibility in Science: A Metrology Perspective.” Harvard Data Science Review 2 (4). https://doi.org/10.1162/99608f92.eb6ddee4.
Podlogar, Tim, Peter Leo, and James Spragg. 2022. “Using VO2max as a marker of training status in athletes—Can we do better?” Journal of Applied Physiology 133 (6): 144–47. https://doi.org/10.1152/japplphysiol.00723.2021.
Preece, Donald Arthur. 1981. “Distributions of Final Digits in Data.” The Statistician 30 (1): 31. https://doi.org/10.2307/2987702.
Prévost, Jean-Guy, and Jean-Pierre Beaud. 2015. Statistics, Public Debate and the State, 1800–1945: A Social, Political and Intellectual History of Numbers. Routledge.
Python Software Foundation. 2024. Python Language Reference, version 3.13.0. https://docs.python.org/3/index.html.
R Core Team. 2024. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
R Special Interest Group on Databases (R-SIG-DB), Hadley Wickham, and Kirill Müller. 2022. DBI: R Database Interface. https://CRAN.R-project.org/package=DBI.
Radcliffe, Nicholas. 2023. Test-Driven Data Analysis (Python TDDA library). https://tdda.readthedocs.io/en/latest/index.html.
Register, Yim. 2020a. “Introduction to Sampling and Randomization.” YouTube, November. https://youtu.be/U272FFxG8LE.
———. 2020b. “Data Science Ethics in 6 Minutes.” YouTube, December. https://youtu.be/mA4gypAiRYU.
Rehaag, Sean. 2023. “Supreme Court of Canada Bulk Decisions Dataset.” Refugee Law Laboratory. https://refugeelab.ca/bulk-data/scc.
Reid, Nancy. 2003. “Asymptotics and the Theory of Inference.” The Annals of Statistics 31 (6): 1695–1731. https://doi.org/10.1214/aos/1074290325.
Richardson, Neal, Ian Cook, Nic Crane, Dewey Dunnington, Romain François, Jonathan Keane, Dragoș Moldovan-Grünfeld, Jeroen Ooms, and Apache Arrow. 2023. arrow: Integration to Apache Arrow. https://CRAN.R-project.org/package=arrow.
Riederer, Emily. 2020. “Column Names as Contracts,” September. https://emilyriederer.netlify.app/post/column-name-contracts/.
———. 2021. “Causal Design Patterns for Data Analysts,” January. https://emilyriederer.netlify.app/post/causal-design-patterns/.
Riffe, Tim, Enrique Acosta, Enrique José Acosta, Diego Manuel Aburto, Anna Alburez-Gutierrez, Ainhoa Altová, Ugofilippo Alustiza, et al. 2021. “Data Resource Profile: COVerAGE-DB: A Global Demographic Database of COVID-19 Cases and Deaths.” International Journal of Epidemiology 50 (2): 390–390f. https://doi.org/10.1093/ije/dyab027.
Rilke, Rainer Maria. (1929) 2014. Letters to a Young Poet. Penguin Classics.
Roberts, Margaret, Brandon Stewart, and Dustin Tingley. 2019. “stm: An R Package for Structural Topic Models.” Journal of Statistical Software 91 (2): 1–40. https://doi.org/10.18637/jss.v091.i02.
Robinson, David, Alex Hayes, and Simon Couch. 2022. broom: Convert Statistical Objects into Tidy Tibbles. https://CRAN.R-project.org/package=broom.
Robinson, Emily, and Jacqueline Nolis. 2020. Build a Career in Data Science. Shelter Island: Manning Publications. https://livebook.manning.com/book/build-a-career-in-data-science.
Rockoff, Hugh. 2019. “On the Controversies Behind the Origins of the Federal Economic Statistics.” Journal of Economic Perspectives 33 (1): 147–64. https://doi.org/10.1257/jep.33.1.147.
Romer, Paul. 2018. “Jupyter, Mathematica, and the Future of the Research Paper,” April. https://paulromer.net/jupyter-mathematica-and-the-future-of-the-research-paper/.
Rose, Angela, Rebecca Grais, Denis Coulombier, and Helga Ritter. 2006. “A Comparison of Cluster and Systematic Sampling Methods for Measuring Crude Mortality.” Bulletin of the World Health Organization 84: 290–96. https://doi.org/10.2471/blt.05.029181.
Rosenau, James N. 1999. “A Transformed Observer in a Transforming World.” Studia Diplomatica 52 (1/2): 5–14. http://www.jstor.org/stable/44838096.
Ross, Casey. 2022. “How a Decades-Old Database Became a Hugely Profitable Dossier on the Health of 270 Million Americans.” Stat, February. https://www.statnews.com/2022/02/01/ibm-watson-health-marketscan-data/.
Rubinstein, Benjamin, and Francesco Alda. 2017. “Pain-Free Random Differential Privacy with Sensitivity Sampling.” In 34th International Conference on Machine Learning (ICML’2017).
Rudis, Bob. 2020. hrbrthemes: Additional Themes, Theme Components and Utilities for “ggplot2”. https://CRAN.R-project.org/package=hrbrthemes.
Ruggles, Steven, Catherine Fitch, Diana Magnuson, and Jonathan Schroeder. 2019. “Differential Privacy and Census Data: Implications for Social and Economic Research.” AEA Papers and Proceedings 109 (May): 403–8. https://doi.org/10.1257/pandp.20191107.
Ruggles, Steven, Sarah Flood, Sophia Foster, Ronald Goeken, Jose Pacas, Megan Schouweiler, and Matthew Sobek. 2021. “IPUMS USA: Version 11.0.” Minneapolis, MN: IPUMS. https://doi.org/10.18128/d010.v11.0.
Ryan, Philip. 2015. “Keeping a Lab Notebook.” YouTube, May. https://youtu.be/-MAIuaOL64I.
Sadowski, Caitlin, Emma Söderberg, Luke Church, Michal Sipko, and Alberto Bacchelli. 2018. “Modern Code Review: A Case Study at Google.” In Proceedings of the 40th International Conference on Software Engineering: Software Engineering in Practice, 181–90. ICSE-SEIP ’18. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3183519.3183525.
Sakshaug, Joseph, Ting Yan, and Roger Tourangeau. 2010. “Nonresponse Error, Measurement Error, and Mode of Data Collection: Tradeoffs in a Multi-Mode Survey of Sensitive and Non-Sensitive Items.” Public Opinion Quarterly 74 (5): 907–33. https://doi.org/10.1093/poq/nfq057.
Salganik, Matthew. 2018. Bit by Bit: Social Research in the Digital Age. New Jersey: Princeton University Press.
Salganik, Matthew, Peter Sheridan Dodds, and Duncan Watts. 2006. “Experimental Study of Inequality and Unpredictability in an Artificial Cultural Market.” Science 311 (5762): 854–56. https://doi.org/10.1126/science.1121066.
Salganik, Matthew, and Douglas Heckathorn. 2004. “Sampling and Estimation in Hidden Populations Using Respondent-Driven Sampling.” Sociological Methodology 34 (1): 193–240. https://doi.org/10.1111/j.0081-1750.2004.00152.x.
Sambasivan, Nithya, Shivani Kapania, Hannah Highfill, Diana Akrong, Praveen Paritosh, and Lora Aroyo. 2021. “‘Everyone Wants to Do the Model Work, Not the Data Work’: Data Cascades in High-Stakes AI.” In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. ACM. https://doi.org/10.1145/3411764.3445518.
Samuel, Arthur. 1959. “Some Studies in Machine Learning Using the Game of Checkers.” IBM Journal of Research and Development 3 (3): 210–29. https://doi.org/10.1147/rd.33.0210.
Saulnier, Lucile, Siddharth Karamcheti, Hugo Laurençon, Léo Tronchon, Thomas Wang, Victor Sanh, Amanpreet Singh, et al. 2022. “Putting Ethical Principles at the Core of the Research Lifecycle.” https://huggingface.co/blog/ethical-charter-multimodal.
Savage, Van, and Pamela Yeh. 2019. “Novelist Cormac McCarthy’s Tips on How to Write a Great Science Paper.” Nature 574 (7778): 441–42. https://doi.org/10.1038/d41586-019-02918-5.
Schaffner, Brian, Stephen Ansolabehere, and Sam Luks. 2021. “Cooperative Election Study Common Content, 2020.” Harvard Dataverse. https://doi.org/10.7910/DVN/E9N6PH.
Schloerke, Barret, and Jeff Allen. 2022. plumber: An API Generator for R. https://CRAN.R-project.org/package=plumber.
Schmertmann, Carl. 2022. “UN API Test,” July. https://bonecave.schmert.net/un-api-example.html.
Schofield, Alexandra, Måns Magnusson, and David Mimno. 2017. “Pulling Out the Stops: Rethinking Stopword Removal for Topic Models.” In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, 432–36. Valencia, Spain: Association for Computational Linguistics. https://aclanthology.org/E17-2069.
Schofield, Alexandra, Måns Magnusson, Laure Thompson, and David Mimno. 2017. “Understanding Text Pre-Processing for Latent Dirichlet Allocation.” In ACL Workshop for Women in NLP (WiNLP). https://www.cs.cornell.edu/~xanda/winlp2017.pdf.
Schofield, Alexandra, Laure Thompson, and David Mimno. 2017. “Quantifying the Effects of Text Duplication on Semantic Models.” In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2737–47. Copenhagen, Denmark: Association for Computational Linguistics. https://doi.org/10.18653/v1/D17-1290.
Scott, James. 1998. Seeing Like a State. Yale University Press.
Sekhon, Jasjeet, and Rocío Titiunik. 2017. “Understanding Regression Discontinuity Designs as Observational Studies.” Observational Studies 3 (2): 174–82. https://doi.org/10.1353/obs.2017.0005.
Sen, Amartya. 1980. “Description as Choice.” Oxford Economic Papers 32 (3): 353–69. https://doi.org/10.1093/oxfordjournals.oep.a041484.
Shankar, Shreya, Rolando Garcia, Joseph Hellerstein, and Aditya Parameswaran. 2022. “Operationalizing Machine Learning: An Interview Study.” arXiv. https://doi.org/10.48550/ARXIV.2209.09125.
Si, Yajuan. 2020. “On the Use of Auxiliary Variables in Multilevel Regression and Poststratification.” https://arxiv.org/abs/2011.00360.
Sides, John, Lynn Vavreck, and Christopher Warshaw. 2021. “The Effect of Television Advertising in United States Elections.” American Political Science Review, 1–17. https://doi.org/10.1017/s000305542100112x.
Silberzahn, Raphael, Eric Uhlmann, Daniel Martin, Pasquale Anselmi, Frederik Aust, Eli Awtrey, Štěpán Bahník, et al. 2018. “Many Analysts, One Data Set: Making Transparent How Variations in Analytic Choices Affect Results.” Advances in Methods and Practices in Psychological Science 1 (3): 337–56. https://doi.org/10.1177/2515245917747646.
Silge, Julia, and David Robinson. 2016. “tidytext: Text Mining and Analysis Using Tidy Data Principles in R.” The Journal of Open Source Software 1 (3). https://doi.org/10.21105/joss.00037.
Silver, Nate. 2020. “We Fixed an Issue with How Our Primary Forecast Was Calculating Candidates’ Demographic Strengths.” FiveThirtyEight, February. https://fivethirtyeight.com/features/we-fixed-a-mistake-in-how-our-primary-forecast-was-calculating-candidates-demographic-strengths/.
Simonsohn, Uri. 2013. “Just Post It: The Lesson from Two Cases of Fabricated Data Detected by Statistics Alone.” Psychological Science 24 (10): 1875–88. https://doi.org/10.1177/0956797613480366.
Simpkinson, Scott. 1971. “Testing to Ensure Mission Success.” In What Made Apollo a Success, edited by NASA, 21–29.
Simpson, Edward. 1951. “The Interpretation of Interaction in Contingency Tables.” Journal of the Royal Statistical Society: Series B (Methodological) 13 (2): 238–41. https://doi.org/10.1111/j.2517-6161.1951.tb00088.x.
Smith, Jessie, Saleema Amershi, Solon Barocas, Hanna Wallach, and Jennifer Wortman Vaughan. 2022. “REAL ML: Recognizing, Exploring, and Articulating Limitations of Machine Learning Research.” 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’22). https://doi.org/10.1145/3531146.3533122.
Smith, Matthew. 2018. “Should Milk Go in a Cup of Tea First or Last?” July. https://yougov.co.uk/topics/consumer/articles-reports/2018/07/30/should-milk-go-cup-tea-first-or-last.
Smith, Richard. 2002. “A Statistical Assessment of Buchanan’s Vote in Palm Beach County.” Statistical Science 17 (4): 441–57. https://doi.org/10.1214/ss/1049993203.
Sobek, Matthew, and Steven Ruggles. 1999. “The IPUMS Project: An Update.” Historical Methods: A Journal of Quantitative and Interdisciplinary History 32 (3): 102–10. https://doi.org/10.1080/01615449909598930.
Somers, James. 2015. “Toolkits for the Mind.” MIT Technology Review, April. https://www.technologyreview.com/2015/04/02/168469/toolkits-for-the-mind/.
———. 2017. “Torching the Modern-Day Library of Alexandria.” The Atlantic, April. https://www.theatlantic.com/technology/archive/2017/04/the-tragedy-of-google-books/523320/.
———. 2018. “The Scientific Paper Is Obsolete.” The Atlantic, April. https://www.theatlantic.com/science/archive/2018/04/the-scientific-paper-is-obsolete/556676/.
Spear, Mary Eleanor. 1952. Charting Statistics. https://archive.org/details/ChartingStatistics_201801/.
Sprint, Gina, and Jason Conci. 2019. “Mining GitHub Classroom Commit Behavior in Elective and Introductory Computer Science Courses.” Journal of Computing Sciences in Colleges 35 (1): 76–84.
Staicu, Ana-Maria. 2017. “Interview with Nancy Reid.” International Statistical Review 85 (3): 381–403. https://doi.org/10.1111/insr.12237.
Staniak, Mateusz, and Przemysław Biecek. 2019. “The Landscape of R Packages for Automated Exploratory Data Analysis.” The R Journal 11 (2): 347–69. https://doi.org/10.32614/RJ-2019-033.
Stantcheva, Stefanie. 2023. “How to Run Surveys: A Guide to Creating Your Own Identifying Variation and Revealing the Invisible.” Annual Review of Economics 15 (1): 205–34. https://doi.org/10.1146/annurev-economics-091622-010157.
Statistics Canada. 2020. “Sex at Birth and Gender: Technical Report on Changes for the 2021 Census.” Statistics Canada. https://www12.statcan.gc.ca/census-recensement/2021/ref/98-20-0002/982000022020002-eng.pdf.
———. 2023. “Guide to the Census of Population, 2021.” Statistics Canada. https://www12.statcan.gc.ca/census-recensement/2021/ref/98-304/98-304-x2021001-eng.pdf.
Steckel, Richard. 1991. “The Quality of Census Data for Historical Inquiry: A Research Agenda.” Social Science History 15 (4): 579–99. https://doi.org/10.2307/1171470.
Steele, Fiona. 2007. “Multilevel Models for Longitudinal Data.” Journal of the Royal Statistical Society Series A: Statistics in Society 171 (1): 5–19. https://doi.org/10.1111/j.1467-985x.2007.00509.x.
Steele, Fiona, Anna Vignoles, and Andrew Jenkins. 2007. “The Effect of School Resources on Pupil Attainment: A Multilevel Simultaneous Equation Modelling Approach.” Journal of the Royal Statistical Society Series A: Statistics in Society 170 (3): 801–24. https://doi.org/10.1111/j.1467-985x.2007.00476.x.
Stevens, Wallace. 1934. The Idea of Order at Key West. https://www.poetryfoundation.org/poems/43431/the-idea-of-order-at-key-west.
Steyvers, Mark, and Tom Griffiths. 2006. “Probabilistic Topic Models.” In Latent Semantic Analysis: A Road to Meaning, edited by T. Landauer, D. McNamara, S. Dennis, and W. Kintsch. https://cocosci.princeton.edu/tom/papers/SteyversGriffiths.pdf.
Stigler, Stephen. 1978. “Francis Ysidro Edgeworth, Statistician.” Journal of the Royal Statistical Society. Series A (General) 141 (3): 287–322. https://doi.org/10.2307/2344804.
———. 1986. The History of Statistics. Massachusetts: Belknap Harvard.
Stock, James, and Francesco Trebbi. 2003. “Retrospectives: Who Invented Instrumental Variable Regression?” Journal of Economic Perspectives 17 (3): 177–94. https://doi.org/10.1257/089533003769204416.
Stolberg, Michael. 2006. “Inventing the Randomized Double-Blind Trial: The Nuremberg Salt Test of 1835.” Journal of the Royal Society of Medicine 99 (12): 642–43. https://doi.org/10.1177/014107680609901216.
Stoler, Ann Laura. 2002. “Colonial Archives and the Arts of Governance.” Archival Science 2 (March): 87–109. https://doi.org/10.1007/bf02435632.
Stolley, Paul. 1991. “When Genius Errs: R. A. Fisher and the Lung Cancer Controversy.” American Journal of Epidemiology 133 (5): 416–25. https://doi.org/10.1093/oxfordjournals.aje.a115904.
Stommes, Drew, P. M. Aronow, and Fredrik Sävje. 2023. “On the Reliability of Published Findings Using the Regression Discontinuity Design in Political Science.” Research & Politics 10 (2). https://doi.org/10.1177/2053168023116645.
Student. 1908. “The Probable Error of a Mean.” Biometrika 6 (1): 1–25. https://doi.org/10.2307/2331554.
Sunstein, Cass, and Lucia Reisch. 2017. The Economics of Nudge. Routledge.
Suriyakumar, Vinith, Nicolas Papernot, Anna Goldenberg, and Marzyeh Ghassemi. 2021. “Chasing Your Long Tails.” In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. https://doi.org/10.1145/3442188.3445934.
Swain, Larry. 1985. “Basic Principles of Questionnaire Design.” Survey Methodology 11 (2): 161–70.
Sylvester, Christine, Anastasia Ershova, Aleksandra Khokhlova, Nikoleta Yordanova, and Zachary Greene. 2023. “ParlEE plenary speeches V2 data set: Annotated full-text of 15.1 million sentence-level plenary speeches of six EU legislative chambers.” Harvard Dataverse. https://doi.org/10.7910/DVN/VOPK0E.
Szaszi, Barnabas, Anthony Higney, Aaron Charlton, Andrew Gelman, Ignazio Ziano, Balazs Aczel, Daniel Goldstein, David Yeager, and Elizabeth Tipton. 2022. “No Reason to Expect Large and Consistent Effects of Nudge Interventions.” Proceedings of the National Academy of Sciences 119 (31): e2200732119. https://doi.org/10.1073/pnas.2200732119.
Taddy, Matt. 2019. Business Data Science. 1st ed. McGraw Hill.
Taflaga, Marija, and Matthew Kerby. 2019. “Who Does What Work in a Ministerial Office: Politically Appointed Staff and the Descriptive Representation of Women in Australian Political Offices, 1979–2010.” Political Studies 68 (2): 463–85. https://doi.org/10.1177/0032321719853459.
Tal, Eran. 2020. “Measurement in Science.” In The Stanford Encyclopedia of Philosophy, edited by Edward Zalta, Fall 2020. https://plato.stanford.edu/archives/fall2020/entries/measurement-science/; Metaphysics Research Lab, Stanford University.
Tang, John. 2015. “Pollution havens and the trade in toxic chemicals: Evidence from U.S. trade flows.” Ecological Economics 112 (April): 150–60. https://doi.org/10.1016/j.ecolecon.2015.02.022.
Tang, Jun, Aleksandra Korolova, Xiaolong Bai, Xueqiang Wang, and Xiaofeng Wang. 2017. “Privacy Loss in Apple’s Implementation of Differential Privacy on MacOS 10.12.” arXiv. https://doi.org/10.48550/arXiv.1709.02753.
Tausanovitch, Chris, and Lynn Vavreck. 2021. “Democracy Fund + UCLA Nationscape Project.” https://www.voterstudygroup.org/data/nationscape.
Taylor, Adam. 2015. “New Zealand Says No to Jedis.” The Washington Post, September. https://www.washingtonpost.com/news/worldviews/wp/2015/09/29/new-zealand-says-no-to-jedis/.
Teate, Renée. 2022. SQL for Data Scientists. Wiley.
The Economist. 2013. “Johnson: Those Six Little Rules: George Orwell on Writing,” July. https://www.economist.com/prospero/2013/07/29/johnson-those-six-little-rules.
———. 2022a. “What Spotify Data Show about the Decline of English,” January. https://www.economist.com/interactives/graphic-detail/2022/01/29/what-spotify-data-show-about-the-decline-of-english.
———. 2022b. “Will Emmanuel Macron Win a Second Term?” April. https://www.economist.com/interactive/france-2022/forecast.
———. 2022c. “France’s Presidential Election: The Second Round in Detail,” April. https://www.economist.com/interactive/france-2022/results-round-two.
The Washington Post. 2023. “Fatal Force Database.” https://github.com/washingtonpost/data-police-shootings.
The White House. 2023. “Recommendations on the Best Practices for the Collection of Sexual Orientation and Gender Identity Data on Federal Statistical Survey,” January. https://www.whitehouse.gov/wp-content/uploads/2023/01/SOGI-Best-Practices.pdf.
Thieme, Nick. 2018. “R Generation.” Significance 15 (4): 14–19. https://doi.org/10.1111/j.1740-9713.2018.01169.x.
Thistlethwaite, Donald, and Donald Campbell. 1960. “Regression-Discontinuity Analysis: An Alternative to the Ex Post Facto Experiment.” Journal of Educational Psychology 51 (6): 309–17. https://doi.org/10.1037/h0044319.
Thompson, Charlie, Daniel Antal, Josiah Parry, Donal Phipps, and Tom Wolff. 2022. spotifyr: R Wrapper for the “Spotify” Web API. https://CRAN.R-project.org/package=spotifyr.
Thomson-DeVeaux, Amelia, Laura Bronner, and Damini Sharma. 2021. “Cities Spend Millions On Police Misconduct Every Year. Here’s Why It’s So Difficult to Hold Departments Accountable.” FiveThirtyEight, February. https://fivethirtyeight.com/features/police-misconduct-costs-cities-millions-every-year-but-thats-where-the-accountability-ends/.
Thornhill, John. 2021. “Lunch with the FT: Mathematician Hannah Fry.” Financial Times, July. https://www.ft.com/content/a5e33e5a-99b9-4bbc-948f-8a527c7675c3.
Tierney, Nicholas, Di Cook, Miles McBain, and Colin Fay. 2021. naniar: Data Structures, Summaries, and Visualisations for Missing Data. https://CRAN.R-project.org/package=naniar.
Tierney, Nicholas, and Karthik Ram. 2020. “A Realistic Guide to Making Data Available Alongside Code to Improve Reproducibility.” https://arxiv.org/abs/2002.11626.
———. 2021. “Common-Sense Approaches to Sharing Tabular Data Alongside Publication.” Patterns 2 (12): 100368. https://doi.org/10.1016/j.patter.2021.100368.
Timbers, Tiffany. 2020. canlang: Canadian Census language data. https://ttimbers.github.io/canlang/.
Timbers, Tiffany, Trevor Campbell, and Melissa Lee. 2022. Data Science: A First Introduction. Chapman & Hall/CRC. https://datasciencebook.ca.
Tolley, Erin, and Mireille Paquet. 2021. “Gender, Municipal Party Politics, and Montreal’s First Woman Mayor.” Canadian Journal of Urban Research 30 (1): 40–52. https://cjur.uwinnipeg.ca/index.php/cjur/article/view/323.
Tourangeau, Roger, Lance Rips, and Kenneth Rasinski. 2000. The Psychology of Survey Response. 1st ed. Cambridge University Press. https://doi.org/10.1017/CBO9780511819322.003.
Touvron, Hugo, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, et al. 2023. “LLaMA: Open and Efficient Foundation Language Models.” arXiv. https://doi.org/10.48550/ARXIV.2302.13971.
Trisovic, Ana, Matthew Lau, Thomas Pasquier, and Mercè Crosas. 2022. “A Large-Scale Study on Research Code Quality and Execution.” Scientific Data 9 (1). https://doi.org/10.1038/s41597-022-01143-6.
Tukey, John. 1962. “The Future of Data Analysis.” The Annals of Mathematical Statistics 33 (1): 1–67. https://doi.org/10.1214/aoms/1177704711.
———. 1977. Exploratory Data Analysis.
Turcotte, Alexi, Aviral Goel, Filip Křikava, and Jan Vitek. 2020. “Designing Types for R, Empirically.” Proceedings of the ACM on Programming Languages 4 (OOPSLA): 1–25. https://doi.org/10.1145/3428249.
UN IGME. 2021. “Levels and Trends in Child Mortality, 2021.” https://childmortality.org/wp-content/uploads/2021/12/UNICEF-2021-Child-Mortality-Report.pdf.
Urban, Steve, Rangarajan Sreenivasan, and Vineet Kannan. 2016. “It’s All A/Bout Testing: The Netflix Experimentation Platform.” Netflix Technology Blog, April. https://netflixtechblog.com/its-all-a-bout-testing-the-netflix-experimentation-platform-4e1ca458c15.
Ushey, Kevin. 2022. renv: Project Environments. https://CRAN.R-project.org/package=renv.
van Buuren, Stef, and Karin Groothuis-Oudshoorn. 2011. “mice: Multivariate Imputation by Chained Equations in R.” Journal of Statistical Software 45 (3): 1–67. https://doi.org/10.18637/jss.v045.i03.
Van den Broeck, Jan, Solveig Argeseanu Cunningham, Roger Eeckels, and Kobus Herbst. 2005. “Data Cleaning: Detecting, Diagnosing, and Editing Data Abnormalities.” PLOS Medicine 2 (10): e267. https://doi.org/10.1371/journal.pmed.0020267.
van der Loo, Mark. 2022. The Data Validation Cookbook. https://data-cleaning.github.io/validate/.
van der Loo, Mark, and Edwin De Jonge. 2021. “Data Validation Infrastructure for R.” Journal of Statistical Software 97 (10): 1–33. https://doi.org/10.18637/jss.v097.i10.
Vanderplas, Susan, Dianne Cook, and Heike Hofmann. 2020. “Testing Statistical Charts: What Makes a Good Graph?” Annual Review of Statistics and Its Application 7: 61–88. https://doi.org/10.1146/annurev-statistics-031219-041252.
Vanhoenacker, Mark. 2015. Skyfaring: A Journey with a Pilot. 1st ed. Alfred A. Knopf.
Varin, Cristiano, Nancy Reid, and David Firth. 2011. “An Overview of Composite Likelihood Methods.” Statistica Sinica, 5–42. https://www.jstor.org/stable/24309261.
Varner, Maddy, and Aaron Sankin. 2020. “Suckers List: How Allstate’s Secret Auto Insurance Algorithm Squeezes Big Spenders.” The Markup, February. https://themarkup.org/allstates-algorithm/2020/02/25/car-insurance-suckers-list.
Vavreck, Lynn, and Chris Tausanovitch. 2021. “Democracy Fund + UCLA Nationscape Project User Guide.” https://www.voterstudygroup.org/data/nationscape.
Vickers, Andrew, and Emily Vertosick. 2016. “An Empirical Study of Race Times in Recreational Endurance Runners.” BMC Sports Science, Medicine and Rehabilitation 8 (1). https://doi.org/10.1186/s13102-016-0052-y.
Vidoni, Melina. 2021. “Evaluating Unit Testing Practices in R Packages.” In 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE), 1523–34. https://doi.org/10.1109/ICSE43902.2021.00136.
von Bergmann, Jens, Dmitry Shkolnik, and Aaron Jacobs. 2021. cancensus: R package to access, retrieve, and work with Canadian Census data and geography. https://mountainmath.github.io/cancensus/.
Walby, Kevin, and Alex Luscombe. 2019. Freedom of Information and Social Science Research Design. Routledge.
Walker, Kyle. 2022. Analyzing US Census Data. Chapman and Hall/CRC. https://walker-data.com/census-r/index.html.
Walker, Kyle, and Matt Herman. 2022. tidycensus: Load US Census Boundary and Attribute Data as “tidyverse” and “sf”-Ready Data Frames. https://CRAN.R-project.org/package=tidycensus.
Wallach, Hanna. 2018. “Computational Social Science ≠ Computer Science + Social Data.” Communications of the ACM 61 (3): 42–44. https://doi.org/10.1145/3132698.
Wan, Mengting, and Julian J. McAuley. 2018. “Item Recommendation on Monotonic Behavior Chains.” In Proceedings of the 12th ACM Conference on Recommender Systems, RecSys 2018, Vancouver, BC, Canada, October 2-7, 2018, edited by Sole Pera, Michael D. Ekstrand, Xavier Amatriain, and John O’Donovan, 86–94. ACM. https://doi.org/10.1145/3240323.3240369.
Wan, Mengting, Rishabh Misra, Ndapa Nakashole, and Julian J. McAuley. 2019. “Fine-Grained Spoiler Detection from Large-Scale Review Corpora.” In Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28-August 2, 2019, Volume 1: Long Papers, edited by Anna Korhonen, David R. Traum, and Lluís Màrquez, 2605–10. Association for Computational Linguistics. https://doi.org/10.18653/v1/p19-1248.
Wang, Wei, David Rothschild, Sharad Goel, and Andrew Gelman. 2015. “Forecasting Elections with Non-Representative Polls.” International Journal of Forecasting 31 (3): 980–91. https://doi.org/10.1016/j.ijforecast.2014.06.001.
Wang, Yilun, and Michal Kosinski. 2018. “Deep Neural Networks Are More Accurate Than Humans at Detecting Sexual Orientation from Facial Images.” Journal of Personality and Social Psychology 114 (2): 246–57. https://doi.org/10.1037/pspa0000098.
Wardrop, Robert. 1995. “Simpson’s Paradox and the Hot Hand in Basketball.” The American Statistician 49 (1): 24–28. https://doi.org/10.2307/2684806.
Ware, James. 1989. “Investigating Therapies of Potentially Great Benefit: ECMO.” Statistical Science 4 (4): 298–306. https://doi.org/10.1214/ss/1177012384.
Wasserman, Larry. 2005. All of Statistics. Springer.
Wei, Eugene. 2017. Remove the Legend to Become One. https://www.eugenewei.com/blog/2017/11/13/remove-the-legend.
Wei, LJ, and S Durham. 1978. “The Randomized Play-the-Winner Rule in Medical Trials.” Journal of the American Statistical Association 73 (364): 840–43. https://doi.org/10.2307/2286290.
Weinberg, Gerald. 1971. The Psychology of Computer Programming. New York: Van Nostrand Reinhold Company.
Weissgerber, Tracey, Natasa Milic, Stacey Winham, and Vesna Garovic. 2015. “Beyond Bar and Line Graphs: Time for a New Data Presentation Paradigm.” PLoS Biology 13 (4): e1002128. https://doi.org/10.1371/journal.pbio.1002128.
Whitby, Andrew. 2020. The Sum of the People. New York: Basic Books.
Whitelaw, James. 1805. An Essay on the Population of Dublin. Being the Result of an Actual Survey Taken in 1798, with Great Care and Precision, and Arranged in a Manner Entirely New. Graisberry and Campbell.
Wicherts, Jelte, Marjan Bakker, and Dylan Molenaar. 2011. “Willingness to Share Research Data Is Related to the Strength of the Evidence and the Quality of Reporting of Statistical Results.” PLOS ONE 6 (11): e26828. https://doi.org/10.1371/journal.pone.0026828.
Wickham, Hadley. 2009. “Manipulating Data.” In ggplot2, 157–75. Springer New York. https://doi.org/10.1007/978-0-387-98141-3_9.
———. 2011. “testthat: Get Started with Testing.” The R Journal 3: 5–10. https://journal.r-project.org/archive/2011-1/RJournal%5F2011-1%5FWickham.pdf.
———. 2014. “Tidy Data.” Journal of Statistical Software 59 (10): 1–23. https://doi.org/10.18637/jss.v059.i10.
———. 2016. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org.
———. 2017. tidyverse: Easily Install and Load the “Tidyverse”. https://CRAN.R-project.org/package=tidyverse.
———. 2018. “Whole Game.” YouTube, January. https://youtu.be/go5Au01Jrvs.
———. 2019. Advanced R. 2nd ed. Chapman and Hall/CRC. https://adv-r.hadley.nz.
———. 2020. Tidyverse. https://www.tidyverse.org/.
———. 2021a. babynames: US Baby Names 1880-2017. https://CRAN.R-project.org/package=babynames.
———. 2021b. Mastering Shiny. 1st ed. O’Reilly Media. https://mastering-shiny.org.
———. 2021c. The Tidyverse Style Guide. https://style.tidyverse.org/index.html.
———. 2022a. R Packages. 2nd ed. O’Reilly Media. https://r-pkgs.org.
———. 2022b. rvest: Easily Harvest (Scrape) Web Pages. https://CRAN.R-project.org/package=rvest.
———. 2022c. stringr: Simple, Consistent Wrappers for Common String Operations. https://CRAN.R-project.org/package=stringr.
———. 2023a. forcats: Tools for Working with Categorical Variables (Factors). https://CRAN.R-project.org/package=forcats.
———. 2023b. httr: Tools for Working with URLs and HTTP. https://CRAN.R-project.org/package=httr.
Wickham, Hadley, Mara Averick, Jenny Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the Tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.
Wickham, Hadley, and Jennifer Bryan. 2023. readxl: Read Excel Files. https://CRAN.R-project.org/package=readxl.
Wickham, Hadley, Jennifer Bryan, and Malcolm Barrett. 2022. usethis: Automate Package and Project Setup. https://CRAN.R-project.org/package=usethis.
Wickham, Hadley, Mine Çetinkaya-Rundel, and Garrett Grolemund. (2016) 2023. R for Data Science. 2nd ed. O’Reilly Media. https://r4ds.hadley.nz.
Wickham, Hadley, Romain François, Lionel Henry, and Kirill Müller. 2022. dplyr: A Grammar of Data Manipulation. https://CRAN.R-project.org/package=dplyr.
Wickham, Hadley, Maximilian Girlich, and Edgar Ruiz. 2022. dbplyr: A “dplyr” Back End for Databases. https://CRAN.R-project.org/package=dbplyr.
Wickham, Hadley, and Lionel Henry. 2022. purrr: Functional Programming Tools. https://CRAN.R-project.org/package=purrr.
Wickham, Hadley, Jim Hester, and Jenny Bryan. 2022. readr: Read Rectangular Text Data. https://CRAN.R-project.org/package=readr.
Wickham, Hadley, Jim Hester, Winston Chang, and Jenny Bryan. 2022. devtools: Tools to Make Developing R Packages Easier. https://CRAN.R-project.org/package=devtools.
Wickham, Hadley, Jim Hester, and Jeroen Ooms. 2021. xml2: Parse XML. https://CRAN.R-project.org/package=xml2.
Wickham, Hadley, Evan Miller, and Danny Smith. 2023. haven: Import and Export “SPSS” “Stata” and “SAS” Files. https://CRAN.R-project.org/package=haven.
Wickham, Hadley, and Dana Seidel. 2022. scales: Scale Functions for Visualization. https://CRAN.R-project.org/package=scales.
Wickham, Hadley, and Lisa Stryjewski. 2011. “40 Years of Boxplots,” November. https://vita.had.co.nz/papers/boxplots.pdf.
Wickham, Hadley, Davis Vaughan, and Maximilian Girlich. 2023. tidyr: Tidy Messy Data. https://CRAN.R-project.org/package=tidyr.
Wiessner, Polly. 2014. “Embers of Society: Firelight Talk Among the Ju/’hoansi Bushmen.” Proceedings of the National Academy of Sciences 111 (39): 14027–35. https://doi.org/10.1073/pnas.1404212111.
Wilde, Oscar. 1891. The Picture of Dorian Gray. https://www.gutenberg.org/files/174/174-h/174-h.htm.
Wilford, John Noble. 1977. “Wernher von Braun, Rocket Pioneer, Dies.” The New York Times, June. https://www.nytimes.com/1977/06/18/archives/wernher-von-braun-rocket-pioneer-dies-wernher-von-braun-pioneer-in.html.
Wilkinson, Leland. 2005. The Grammar of Graphics. 2nd ed. Springer.
Wilkinson, Mark, Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle Appleton, Myles Axton, Arie Baak, Niklas Blomberg, et al. 2016. “The FAIR Guiding Principles for Scientific Data Management and Stewardship.” Scientific Data 3 (1): 1–9. https://doi.org/10.1038/sdata.2016.18.
Wilson, Greg, Jenny Bryan, Karen Cranston, Justin Kitzes, Lex Nederbragt, and Tracy Teal. 2017. “Good Enough Practices in Scientific Computing.” PLOS Computational Biology 13 (6): 1–20. https://doi.org/10.1371/journal.pcbi.1005510.
Wong, Julia Carrie. 2020. “One Year Inside Trump’s Monumental Facebook Campaign.” The Guardian, January. https://www.theguardian.com/us-news/2020/jan/28/donald-trump-facebook-ad-campaign-2020-election.
Wood, Simon. 2015. Core Statistics. Cambridge University Press. https://www.maths.ed.ac.uk/%7Eswood34/core-statistics.pdf.
World Health Organization. 2019. “Trends in Maternal Mortality 2000 to 2017: Estimates by WHO, UNICEF, UNFPA, World Bank Group and the United Nations Population Division.” https://apps.who.int/iris/handle/10665/327596.
Wright, Philip. 1928. The Tariff on Animal and Vegetable Oils. New York: Macmillan Company.
Wu, Changbao, and Mary Thompson. 2020. Sampling Theory and Practice. Springer.
Xie, Yihui. 2019. “TinyTeX: A lightweight, cross-platform, and easy-to-maintain LaTeX distribution based on TeX Live.” TUGboat, no. 1: 30–32. https://tug.org/TUGboat/Contents/contents40-1.html.
———. 2023. knitr: A General-Purpose Package for Dynamic Report Generation in R. https://yihui.org/knitr/.
Xu, Ya. 2020. “Causal Inference Challenges in Industry: A Perspective from Experiences at LinkedIn.” YouTube, July. https://youtu.be/OoKsLAvyIYA.
Yoshioka, Alan. 1998. “Use of Randomisation in the Medical Research Council’s Clinical Trial of Streptomycin in Pulmonary Tuberculosis in the 1940s.” BMJ 317 (7167): 1220–23. https://doi.org/10.1136/bmj.317.7167.1220.
Zhang, Ping, XunPeng Shi, YongPing Sun, Jingbo Cui, and Shuai Shao. 2019. “Have China’s provinces achieved their targets of energy intensity reduction? Reassessment based on nighttime lighting data.” Energy Policy 128 (May): 276–83. https://doi.org/10.1016/j.enpol.2019.01.014.
Zhang, Susan, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, Shuohui Chen, Christopher Dewan, et al. 2022. “OPT: Open Pre-Trained Transformer Language Models.” arXiv. https://doi.org/10.48550/arXiv.2205.01068.
Zimmer, Michael. 2018. “Addressing Conceptual Gaps in Big Data Research Ethics: An Application of Contextual Integrity.” Social Media + Society 4 (2): 1–11. https://doi.org/10.1177/2056305118768300.
Zinsser, William. 1976. On Writing Well. New York: HarperCollins.
Zook, Matthew, Solon Barocas, danah boyd, Kate Crawford, Emily Keller, Seeta Peña Gangadharan, Alyssa Goodman, et al. 2017. “Ten Simple Rules for Responsible Big Data Research.” PLOS Computational Biology 13 (3): e1005399. https://doi.org/10.1371/journal.pcbi.1005399.