References
Abadie, Alberto, Susan Athey, Guido Imbens, and Jeffrey Wooldridge.
2017. “When Should You Adjust Standard Errors for
Clustering?” Working Paper 24003. Working Paper Series. National
Bureau of Economic Research. https://doi.org/10.3386/w24003.
Abelson, Harold, and Gerald Jay Sussman. 1996. Structure and
Interpretation of Computer Programs. Cambridge: The MIT Press.
Abeysooriya, Mandhri, Megan Soria, Mary Sravya Kasu, and Mark Ziemann.
2021. “Gene Name Errors: Lessons Not Learned.” PLOS
Computational Biology 17 (7): 1–13. https://doi.org/10.1371/journal.pcbi.1008984.
Acemoglu, Daron, Simon Johnson, and James Robinson. 2001. “The
Colonial Origins of Comparative Development: An Empirical
Investigation.” American Economic Review 91
(5): 1369–1401. https://doi.org/10.1257/aer.91.5.1369.
Achen, Christopher. 1978. “Measuring Representation.”
American Journal of Political Science 22 (3): 475–510. https://doi.org/10.2307/2110458.
Akerlof, George. 1970. “The Market for ‘Lemons’:
Quality Uncertainty and the Market Mechanism.” The Quarterly
Journal of Economics 84 (3): 488–500. https://doi.org/10.2307/1879431.
Alexander, Monica. 2019a. “Reproducibility in Demographic
Research.” https://www.monicaalexander.com/posts/2019-10-20-reproducibility/.
———. 2019b. “The Concentration and Uniqueness of Baby Names in
Australia and the US,” January. https://www.monicaalexander.com/posts/2019-20-01-babynames/.
———. 2019c. “Analyzing Name Changes After Marriage Using a
Non-Representative Survey,” August. https://www.monicaalexander.com/posts/2019-08-07-mrp/.
———. 2021. “Overcoming Barriers to Sharing Code.”
YouTube, February. https://youtu.be/yvM2C6aZ94k.
Alexander, Monica, and Leontine Alkema. 2022. “A Bayesian Cohort Component Projection Model to Estimate
Women of Reproductive Age at the Subnational Level in Data-Sparse
Settings.” Demography 59 (5): 1713–37. https://doi.org/10.1215/00703370-10216406.
Alexander, Monica, Mathew Kiang, and Magali Barbieri. 2018.
“Trends in Black and White Opioid Mortality in the United States,
1979–2015.” Epidemiology 29 (5): 707–15. https://doi.org/10.1097/EDE.0000000000000858.
Alexander, Rohan, and Monica Alexander. 2021. “The Increased
Effect of Elections and Changing Prime Ministers on Topics Discussed in
the Australian Federal Parliament Between 1901 and 2018.” https://doi.org/10.48550/arXiv.2111.09299.
Alexander, Rohan, and Paul Hodgetts. 2021.
AustralianPoliticians: Provides Datasets About Australian
Politicians. https://CRAN.R-project.org/package=AustralianPoliticians.
Alexander, Rohan, and A Mahfouz. 2021. heapsofpapers: Easily Download Heaps of PDF and CSV
Files. https://CRAN.R-project.org/package=heapsofpapers.
Alexander, Rohan, and Zachary Ward. 2018. “Age at Arrival and
Assimilation During the Age of Mass Migration.” The Journal
of Economic History 78 (3): 904–37. https://doi.org/10.1017/S0022050718000335.
Alexopoulos, Michelle, and Jon Cohen. 2015. “The Power of Print: Uncertainty Shocks, Markets, and the
Economy.” International Review of Economics
& Finance 40 (November): 8–28. https://doi.org/10.1016/j.iref.2015.02.002.
Allen, Jeff. 2021. plumberDeploy: Plumber
Deployment. https://CRAN.R-project.org/package=plumberDeploy.
Alsan, Marcella, and Amy Finkelstein. 2021. “Beyond Causality:
Additional Benefits of Randomized Controlled Trials for Improving Health
Care Delivery.” The Milbank Quarterly 99 (4): 864–81. https://doi.org/10.1111/1468-0009.12521.
Alsan, Marcella, and Marianne Wanamaker. 2018. “Tuskegee and the
Health of Black Men.” The Quarterly Journal of Economics
133 (1): 407–55. https://doi.org/10.1093/qje/qjx029.
Altman, Douglas, and Martin Bland. 1995. “Statistics Notes: The Normal Distribution.”
BMJ 310 (6975): 298–98. https://doi.org/10.1136/bmj.310.6975.298.
Amaka, Ofunne, and Amber Thomas. 2021. “The Naked Truth: How the
Names of 6,816 Complexion Products Can Reveal Bias in Beauty.”
The Pudding, March. https://pudding.cool/2021/03/foundation-names/.
American Medical Association and New York Academy of Medicine. 1848.
Code of Medical Ethics. Academy of Medicine. https://hdl.handle.net/2027/chi.57108026.
Anders, Jake, Silvan Has, John Jerrim, Nikki Shure, and Laura Zieger.
2020. “Is Canada Really an Education
Superpower? The Impact of Non-Participation on Results from PISA
2015.” Educational Assessment, Evaluation and
Accountability 33 (1): 229–49. https://doi.org/10.1007/s11092-020-09329-5.
Andersen, Robert, and David Armstrong. 2021. Presenting Statistical
Results Effectively. London: Sage.
Anderson, Margo. (1988) 2015. The American Census: A Social
History. 2nd ed. Yale University Press.
Anderson, Margo, and Stephen Fienberg. 1999. Who Counts?: The Politics of Census-Taking in
Contemporary America. Russell Sage Foundation. http://www.jstor.org/stable/10.7758/9781610440059.
Andrews, David, and Agnes Herzberg. 2012. Data: A Collection of
Problems from Many Fields for the Student and Research Worker. New
York: Springer Science & Business Media.
Angelucci, Charles, and Julia Cagé. 2019. “Newspapers in Times of
Low Advertising Revenues.” American Economic Journal:
Microeconomics 11 (3): 319–64. https://doi.org/10.1257/mic.20170306.
Angrist, Joshua, and Alan Krueger. 2001. “Instrumental Variables
and the Search for Identification: From Supply and Demand to Natural
Experiments.” Journal of Economic Perspectives 15 (4):
69–85. https://doi.org/10.1257/jep.15.4.69.
Angrist, Joshua, and Jörn-Steffen Pischke. 2010. “The Credibility
Revolution in Empirical Economics: How Better Research Design Is Taking
the Con Out of Econometrics.” Journal of Economic
Perspectives 24 (2): 3–30. https://doi.org/10.1257/jep.24.2.3.
Annas, George. 2003. “HIPAA Regulations: A New Era of
Medical-Record Privacy?” New England Journal of Medicine
348 (15): 1486–90. https://doi.org/10.1056/NEJMlim035027.
Ansolabehere, Stephen, Brian Schaffner, and Sam Luks. 2021. “Guide to the 2020 Cooperative Election
Study.” https://doi.org/10.7910/DVN/E9N6PH.
Aprameya, Lavanya. 2020. “Improving Duolingo, One Experiment at a
Time.” Duolingo Blog, January. https://blog.duolingo.com/improving-duolingo-one-experiment-at-a-time/.
Arel-Bundock, Vincent. 2021. WDI: World
Development Indicators and Other World Bank Data. https://CRAN.R-project.org/package=WDI.
———. 2022. “modelsummary: Data and
Model Summaries in R.” Journal of Statistical
Software 103 (1): 1–23. https://doi.org/10.18637/jss.v103.i01.
———. 2023. marginaleffects: Predictions,
Comparisons, Slopes, Marginal Means, and Hypothesis Tests.
https://vincentarelbundock.github.io/marginaleffects/.
———. 2024. tinytable: Simple and Configurable
Tables in “HTML,” “LaTeX,”
“Markdown,” “Word,” “PNG,”
“PDF,” and “Typst” Formats. https://vincentarelbundock.github.io/tinytable/.
Arel-Bundock, Vincent, Ryan Briggs, Hristos Doucouliagos, Marco Mendoza
Aviña, and T. D. Stanley. 2022. “Quantitative Political Science
Research Is Greatly Underpowered.” https://osf.io/bzj9y/.
Armstrong, Zan. 2022. “Stop Aggregating Away the Signal in Your
Data.” The Overflow, March. https://stackoverflow.blog/2022/03/03/stop-aggregating-away-the-signal-in-your-data/.
Arnold, Jeffrey. 2021. ggthemes: Extra Themes,
Scales and Geoms for “ggplot2”. https://CRAN.R-project.org/package=ggthemes.
Asher, Sam, Tobias Lunt, Ryu Matsuura, and Paul Novosad. 2021.
“Development Research at High Geographic Resolution: An Analysis
of Night Lights, Firms, and Poverty in India Using the SHRUG Open Data
Platform.” World Bank Economic Review 35 (4). https://shrug-assets-ddl.s3.amazonaws.com/static/main/assets/other/almn-shrug.pdf.
Athey, Susan, and Guido Imbens. 2017a. “The Econometrics of
Randomized Experiments.” In Handbook of Field
Experiments, 73–140. Elsevier. https://doi.org/10.1016/bs.hefe.2016.10.003.
———. 2017b. “The State of Applied Econometrics: Causality and
Policy Evaluation.” Journal of Economic Perspectives 31
(2): 3–32. https://doi.org/10.1257/jep.31.2.3.
Athey, Susan, Guido Imbens, Jonas Metzger, and Evan Munro. 2021.
“Using Wasserstein Generative Adversarial Networks for the Design
of Monte Carlo Simulations.” Journal of Econometrics. https://doi.org/10.1016/j.jeconom.2020.09.013.
Au, Randy. 2020. “Data Cleaning IS Analysis, Not Grunt
Work,” September. https://counting.substack.com/p/data-cleaning-is-analysis-not-grunt.
———. 2022. “Celebrating Everyone Counting Things,”
February. https://counting.substack.com/p/celebrating-everyone-counting-things.
Bååth, Rasmus. 2018. beepr: Easily Play
Notification Sounds on any Platform. https://CRAN.R-project.org/package=beepr.
Bache, Stefan Milton, and Hadley Wickham. 2022. magrittr: A Forward-Pipe Operator for R. https://CRAN.R-project.org/package=magrittr.
Backus, John. 1981. “The History of FORTRAN
I, II, and III.” In History of Programming
Languages, edited by Richard Wexelblat, 25–74. Academic Press.
Bailey, Rosemary. 2008. Design of Comparative Experiments.
Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511611483.
Baio, Gianluca, and Marta Blangiardo. 2010. “Bayesian Hierarchical
Model for the Prediction of Football Results.” Journal of
Applied Statistics 37 (2): 253–64. https://doi.org/10.1080/02664760802684177.
Baker, Dominique. 2023. “Scams Will Not Save Us (Tuition
Dollars),” February. http://www.dominiquebaker.com/blog/2023/2/16/scams-will-not-save-us-tuition-dollars.
Baker, Reg, Michael Brick, Nancy Bates, Mike Battaglia, Mick Couper,
Jill Dever, Krista Gile, and Roger Tourangeau. 2013. “Summary Report of the AAPOR Task Force on Non-Probability
Sampling.” Journal of Survey Statistics and
Methodology 1 (2): 90–143. https://doi.org/10.1093/jssam/smt008.
Bandy, John, and Nicholas Vincent. 2021. “Addressing
‘Documentation Debt’ in Machine Learning: A Retrospective
Datasheet for BookCorpus.” In Proceedings of the Neural
Information Processing Systems Track on Datasets and Benchmarks,
edited by J. Vanschoren and S. Yeung. Vol. 1. https://datasets-benchmarks-proceedings.neurips.cc/paper_files/paper/2021/file/54229abfcfa5649e7003b83dd4755294-Paper-round1.pdf.
Banerjee, Abhijit, and Esther Duflo. 2011. Poor Economics: A Radical
Rethinking of the Way to Fight Global Poverty. New York:
PublicAffairs.
Banerjee, Abhijit, Esther Duflo, Rachel Glennerster, and Cynthia Kinnan.
2015. “The Miracle of Microfinance? Evidence from a Randomized
Evaluation.” American Economic Journal: Applied
Economics 7 (1): 22–53. https://doi.org/10.1257/app.20130533.
Banes, Graham, Emily Fountain, Alyssa Karklus, Robert Fulton, Lucinda
Antonacci-Fulton, and Joanne Nelson. 2022. “Nine Out of Ten Samples Were Mistakenly Switched by The
Orang-utan Genome Consortium.” Scientific Data 9
(1). https://doi.org/10.1038/s41597-022-01602-0.
Barba, Lorena. 2018. “Terminologies for Reproducible
Research.” https://arxiv.org/abs/1802.03311.
Barrett, Malcolm. 2021a. Data Science as an Atomic Habit. https://malco.io/articles/2021-01-04-data-science-as-an-atomic-habit.
———. 2021b. ggdag: Analyze and Create Elegant
Directed Acyclic Graphs. https://CRAN.R-project.org/package=ggdag.
Barron, Alexander, Jenny Huang, Rebecca Spang, and Simon DeDeo. 2018.
“Individuals, Institutions, and Innovation in the Debates of the
French Revolution.” Proceedings of the National Academy of
Sciences 115 (18): 4607–12. https://doi.org/10.1073/pnas.1717729115.
Baumer, Benjamin, Daniel Kaplan, and Nicholas Horton. 2021.
Modern Data Science With R. 2nd ed. Chapman and
Hall/CRC. https://mdsr-book.github.io/mdsr2e/.
Baumgartner, Jason, Savvas Zannettou, Brian Keegan, Megan Squire, and
Jeremy Blackburn. 2020. “The Pushshift Reddit Dataset.”
arXiv. https://doi.org/10.48550/arxiv.2001.08435.
Baumgartner, Peter. 2021. “Ways I Use Testing
as a Data Scientist,” December. https://www.peterbaumgartner.com/blog/testing-for-data-science/.
Beaumont, Jean-Francois. 2020. “Are Probability Surveys Bound to
Disappear for the Production of Official Statistics?” Survey
Methodology 46 (1): 1–29.
Beauregard, Katrine, and Jill Sheppard. 2021. “Antiwomen but
Proquota: Disaggregating Sexism and Support for Gender Quota
Policies.” Political Psychology 42 (2): 219–37. https://doi.org/10.1111/pops.12696.
Becker, Richard, Allan Wilks, Ray Brownrigg, Thomas Minka, and Alex
Deckmyn. 2022. maps: Draw Geographical
Maps. https://CRAN.R-project.org/package=maps.
Beelen, Kaspar, Timothy Alberdingk Thim, Christopher Cochrane, Kees
Halvemaan, Graeme Hirst, Michael Kimmins, Sander Lijbrink, et al. 2017.
“Digitization of the Canadian Parliamentary Debates.”
Canadian Journal of Political Science 50 (3): 849–64.
Begley, Glenn, and Lee Ellis. 2012. “Raise Standards for
Preclinical Cancer Research.” Nature 483 (7391):
531–33. https://doi.org/10.1038/483531a.
Bender, Emily, Timnit Gebru, Angelina McMillan-Major, and Shmargaret
Shmitchell. 2021. “On the Dangers of Stochastic Parrots: Can
Language Models Be Too Big?” In Proceedings of the 2021
ACM Conference on Fairness, Accountability, and
Transparency. ACM. https://doi.org/10.1145/3442188.3445922.
Bengtsson, Henrik. 2021. “A Unifying
Framework for Parallel and Distributed Processing in R using
Futures.” The R Journal 13 (2): 208–27. https://doi.org/10.32614/RJ-2021-048.
Benoit, Kenneth. 2020. “Text as Data: An Overview.” In
The SAGE Handbook of Research Methods in Political Science and
International Relations, edited by Luigi Curini and Robert
Franzese, 461–97. London: SAGE Publishing. https://doi.org/10.4135/9781526486387.n29.
Benoit, Kenneth, and Michael Laver. 2006. Party
Policy in Modern Democracies. Routledge.
———. 2007. “Estimating Party Policy Positions: Comparing Expert
Surveys and Hand-Coded Content Analysis.” Electoral
Studies 26 (1): 90–107. https://doi.org/10.1016/j.electstud.2006.04.008.
Benoit, Kenneth, Kohei Watanabe, Haiyan Wang, Paul Nulty, Adam Obeng,
Stefan Müller, and Akitaka Matsuo. 2018. “quanteda: An R Package for the Quantitative Analysis of
Textual Data.” Journal of Open Source Software 3
(30): 774. https://doi.org/10.21105/joss.00774.
Bensinger, Greg. 2020. “Google Redraws the Borders on Maps
Depending on Who’s Looking.” The Washington Post,
February. https://www.washingtonpost.com/technology/2020/02/14/google-maps-political-borders/.
Berdine, Gilbert, Vincent Geloso, and Benjamin Powell. 2018.
“Cuban Infant Mortality and Longevity: Health Care or
Repression?” Health Policy and Planning 33 (6): 755–57.
https://doi.org/10.1093/heapol/czy033.
Berkson, Joseph. 1946. “Limitations of the Application of Fourfold
Table Analysis to Hospital Data.” Biometrics Bulletin 2
(3): 47–53. https://doi.org/10.2307/3002000.
Berners-Lee, Timothy. 1989. “Information Management: A
Proposal.” https://www.w3.org/History/1989/proposal.html.
Berry, Donald. 1989. “Comment: Ethics and ECMO.”
Statistical Science 4 (4): 306–10. https://www.jstor.org/stable/2245830.
Bertrand, Marianne, and Sendhil Mullainathan. 2004. “Are Emily and
Greg More Employable Than Lakisha and Jamal? A Field Experiment on Labor
Market Discrimination.” American Economic Review 94 (4):
991–1013. https://doi.org/10.1257/0002828042002561.
Bethlehem, R. A. I., J. Seidlitz, S. R. White, J. W. Vogel, K. M.
Anderson, C. Adamson, S. Adler, et al. 2022. “Brain Charts for the
Human Lifespan.” Nature 604 (7906): 525–33. https://doi.org/10.1038/s41586-022-04554-y.
Betz, Timm, Scott Cook, and Florian Hollenbach. 2018. “On the Use
and Abuse of Spatial Instruments.” Political Analysis 26
(4): 474–79. https://doi.org/10.1017/pan.2018.10.
Bickel, Peter, Eugene Hammel, and William O’Connell. 1975. “Sex
Bias in Graduate Admissions: Data from Berkeley: Measuring Bias Is
Harder Than Is Usually Assumed, and the Evidence Is Sometimes Contrary
to Expectation.” Science 187 (4175): 398–404. https://doi.org/10.1126/science.187.4175.398.
Biderman, Stella, Kieran Bicheno, and Leo Gao. 2022. “Datasheet
for the Pile.” https://arxiv.org/abs/2201.07311.
Birkmeyer, John, Jonathan Finks, Amanda O’Reilly, Mary Oerline, Arthur
Carlin, Andre Nunn, Justin Dimick, Mousumi Banerjee, and Nancy
Birkmeyer. 2013. “Surgical Skill and Complication Rates After
Bariatric Surgery.” New England Journal of Medicine 369
(15): 1434–42. https://doi.org/10.1056/nejmsa1300625.
Blair, Ed, Seymour Sudman, Norman M Bradburn, and Carol Stocking. 1977.
“How to Ask Questions about Drinking and Sex: Response Effects in
Measuring Consumer Behavior.” Journal of Marketing
Research 14 (3): 316–21. https://doi.org/10.2307/3150769.
Blair, Graeme, Jasper Cooper, Alexander Coppock, and Macartan Humphreys.
2019. “Declaring and Diagnosing Research Designs.”
American Political Science Review 113 (3): 838–59. https://doi.org/10.1017/S0003055419000194.
Blair, Graeme, Jasper Cooper, Alexander Coppock, Macartan Humphreys, and
Luke Sonnet. 2021. estimatr: Fast Estimators
for Design-Based Inference. https://CRAN.R-project.org/package=estimatr.
Blair, James. 2019. Democratizing R with
Plumber APIs. https://posit.co/resources/videos/democratizing-r-with-plumber-apis/.
Bland, Martin, and Douglas Altman. 1986. “Statistical Methods for
Assessing Agreement Between Two Methods of Clinical Measurement.”
The Lancet 327 (8476): 307–10. https://doi.org/10.1016/S0140-6736(86)90837-8.
Blei, David. 2012. “Probabilistic Topic Models.”
Communications of the ACM 55 (4): 77–84. https://doi.org/10.1145/2133806.2133826.
Blei, David, Andrew Ng, and Michael Jordan. 2003. “Latent
Dirichlet Allocation.” Journal of Machine Learning
Research 3 (Jan): 993–1022. https://www.jmlr.org/papers/volume3/blei03a/blei03a.pdf.
Bloom, Howard, Andrew Bell, and Kayla Reiman. 2020. “Using Data
from Randomized Trials to Assess the Likely Generalizability of
Educational Treatment-Effect Estimates from Regression Discontinuity
Designs.” Journal of Research on Educational
Effectiveness 13 (3): 488–517. https://doi.org/10.1080/19345747.2019.1634169.
Blumenthal, Mark. 2014. “Polls, Forecasts, and
Aggregators.” PS: Political Science & Politics 47
(2): 297–300. https://doi.org/10.1017/s1049096514000055.
Boland, Philip. 1984. “A Biographical Glimpse of William Sealy
Gosset.” The American Statistician 38 (3): 179–83. https://doi.org/10.2307/2683648.
Bolker, Ben, and David Robinson. 2022. broom.mixed: Tidying Methods for Mixed
Models. https://CRAN.R-project.org/package=broom.mixed.
Bolton, Ruth, and Randall Chapman. 1986. “Searching for Positive
Returns at the Track.” Management Science 32 (August):
1040–60. https://doi.org/10.1287/mnsc.32.8.1040.
Bombieri, Giulia, Vincenzo Penteriani, Kamran Almasieh, Hüseyin Ambarlı,
Mohammad Reza Ashrafzadeh, Chandan Surabhi Das, Nishith Dharaiya, et al.
2023. “A Worldwide Perspective on Large Carnivore Attacks on
Humans.” PLOS Biology 21 (1): e3001946. https://doi.org/10.1371/journal.pbio.3001946.
Bor, Jacob, Atheendar Venkataramani, David Williams, and Alexander Tsai.
2018. “Police Killings and Their Spillover Effects on the Mental
Health of Black Americans: A Population-Based, Quasi-Experimental
Study.” The Lancet 392 (10144): 302–10. https://doi.org/10.1016/s0140-6736(18)31130-9.
Borer, Elizabeth T., Eric W. Seabloom, Matthew B. Jones, and Mark
Schildhauer. 2009. “Some Simple Guidelines for Effective Data
Management.” Bulletin of the Ecological Society of
America 90 (2): 205–14. https://doi.org/10.1890/0012-9623-90.2.205.
Borghi, John, and Ana Van Gulick. 2022. “Promoting Open Science
Through Research Data Management.” Harvard Data Science
Review 4 (3). https://doi.org/10.1162/99608f92.9497f68e.
Borkin, Michelle, Zoya Bylinskii, Nam Wook Kim, Constance May
Bainbridge, Chelsea Yeh, Daniel Borkin, Hanspeter Pfister, and Aude
Oliva. 2015. “Beyond Memorability: Visualization Recognition and
Recall.” IEEE Transactions on Visualization and Computer
Graphics 22 (1): 519–28. https://doi.org/10.1109/TVCG.2015.2467732.
Bosch, Oriol, and Melanie Revilla. 2022. “When Survey Science Met Web Tracking: Presenting an Error
Framework for Metered Data.” Journal of the Royal
Statistical Society: Series A (Statistics in Society), November,
1–29. https://doi.org/10.1111/rssa.12956.
Bouguen, Adrien, Yue Huang, Michael Kremer, and Edward Miguel. 2019.
“Using Randomized Controlled Trials to Estimate Long-Run Impacts
in Development Economics.” Annual Review of Economics 11
(1): 523–61. https://doi.org/10.1146/annurev-economics-080218-030333.
Bouie, Jamelle. 2022. “We Still Can’t See American Slavery for
What It Was.” The New York Times, January. https://www.nytimes.com/2022/01/28/opinion/slavery-voyages-data-sets.html.
Bowen, Claire McKay. 2022. Protecting Your
Privacy in a Data-Driven World. 1st ed. Chapman and Hall/CRC.
https://doi.org/10.1201/9781003122043.
Bowers, Jake, and Maarten Voors. 2016. “How to Improve Your
Relationship with Your Future Self.” Revista de Ciencia
Política 36 (3): 829–48. https://doi.org/10.4067/S0718-090X2016000300011.
Bowley, Arthur Lyon. 1901. Elements of Statistics. London: P.
S. King.
———. 1913. “Working-Class Households in Reading.”
Journal of the Royal Statistical Society 76 (7): 672–701. https://doi.org/10.2307/2339708.
Box, George E. P. 1976. “Science and Statistics.”
Journal of the American Statistical Association 71 (356):
791–99. https://doi.org/10.1080/01621459.1976.10480949.
Boykis, Vicki. 2019. “A Deep Dive on Python Type Hints,”
July. https://vickiboykis.com/2019/07/08/a-deep-dive-on-python-type-hints/.
Boysel, Sam, and Davis Vaughan. 2021. fredr: An
R Client for the “FRED” API. https://CRAN.R-project.org/package=fredr.
Bradley, Valerie, Shiro Kuriwaki, Michael Isakov, Dino Sejdinovic,
Xiao-Li Meng, and Seth Flaxman. 2021. “Unrepresentative Big
Surveys Significantly Overestimated US Vaccine
Uptake.” Nature 600 (7890): 695–700. https://doi.org/10.1038/s41586-021-04198-4.
Braginsky, Mika. 2020. wordbankr: Accessing the
Wordbank Database. https://CRAN.R-project.org/package=wordbankr.
Brandt, Allan. 1978. “Racism and Research: The Case of the
Tuskegee Syphilis Study.” Hastings Center Report, 21–29.
https://doi.org/10.2307/3561468.
Breiman, Leo. 1994. “The 1991 Census Adjustment: Undercount or Bad
Data?” Statistical Science 9 (4). https://doi.org/10.1214/ss/1177010259.
———. 2001. “Statistical Modeling: The Two Cultures.”
Statistical Science 16 (3): 199–231. https://doi.org/10.1214/ss/1009213726.
Bremer, Nadieh, and Shirley Wu. 2021. Data Sketches. A K
Peters/CRC Press. https://doi.org/10.1201/9780429445019.
Brewer, Cynthia. 2015. Designing Better Maps: A Guide for GIS
Users. 2nd ed.
Brewer, Ken. 2013. “Three Controversies in the History of Survey
Sampling.” Survey Methodology 39 (2): 249–63.
Breznau, Nate, Eike Mark Rinke, Alexander Wuttke, Hung HV Nguyen, Muna
Adem, Jule Adriaans, Amalia Alvarez-Benjumea, et al. 2022.
“Observing Many Researchers Using the Same Data and Hypothesis
Reveals a Hidden Universe of Uncertainty.” Proceedings of the
National Academy of Sciences 119 (44): e2203150119. https://doi.org/10.1073/pnas.2203150119.
Briggs, Ryan. 2021. “Why Does Aid Not Target the Poorest?”
International Studies Quarterly 65 (3): 739–52. https://doi.org/10.1093/isq/sqab035.
Brodeur, Abel, Nikolai Cook, and Anthony Heyes. 2020. “Methods Matter: p-Hacking and Publication Bias in Causal
Analysis in Economics.” American Economic Review
110 (11): 3634–60. https://doi.org/10.1257/aer.20190687.
Brokowski, Carolyn, and Mazhar Adli. 2019. “CRISPR Ethics: Moral
Considerations for Applications of a Powerful Tool.” Journal
of Molecular Biology 431 (1): 88–101. https://doi.org/10.1016/j.jmb.2018.05.044.
Bronner, Laura. 2020. “Why Statistics Don’t Capture the Full
Extent of the Systemic Bias in Policing.”
FiveThirtyEight, June. https://fivethirtyeight.com/features/why-statistics-dont-capture-the-full-extent-of-the-systemic-bias-in-policing/.
———. 2021. “Quantitative Editing.” YouTube, June.
https://youtu.be/LI5m9RzJgWc.
Brontë, Charlotte. 1847. Jane Eyre. https://www.gutenberg.org/files/1260/1260-h/1260-h.htm.
———. 1857. The Professor. https://www.gutenberg.org/files/1028/1028-h/1028-h.htm.
Brook, Robert, John Ware, William Rogers, Emmett Keeler, Allyson Ross
Davies, Cathy Sherbourne, George Goldberg, Kathleen Lohr, Patricia Camp,
and Joseph Newhouse. 1984. “The Effect of Coinsurance on the
Health of Adults: Results from the RAND Health Insurance
Experiment.” https://www.rand.org/pubs/reports/R3055.html.
Brown, Zack. 2018. “A Git Origin Story.” Linux
Journal, July. https://www.linuxjournal.com/content/git-origin-story.
Bryan, Jenny. 2015. “Naming Things.” Reproducible
Science Workshop, May. https://speakerdeck.com/jennybc/how-to-name-files.
———. 2018a. “Excuse Me, Do You Have a Moment to Talk about Version
Control?” The American Statistician 72 (1): 20–27. https://doi.org/10.1080/00031305.2017.1399928.
———. 2018b. “Code Smells and Feels.” YouTube,
July. https://youtu.be/7oyiPBjLAWY.
———. 2020. Happy Git and GitHub for the
useR. https://happygitwithr.com.
Bryan, Jenny, and Jim Hester. 2020. What They
Forgot to Teach You About R. https://rstats.wtf/index.html.
Bryan, Jenny, Jim Hester, David Robinson, Hadley Wickham, and Christophe
Dervieux. 2022. reprex: Prepare Reproducible
Example Code via the Clipboard. https://CRAN.R-project.org/package=reprex.
Bryan, Jenny, and Hadley Wickham. 2021. gh:
GitHub API. https://CRAN.R-project.org/package=gh.
Buckheit, Jonathan, and David Donoho. 1995. “Wavelab and
Reproducible Research.” In Wavelets and Statistics,
55–81. Springer. https://doi.org/10.1007/978-1-4612-2544-7_5.
Bueno de Mesquita, Ethan, and Anthony Fowler. 2021. Thinking Clearly
with Data: A Guide to Quantitative Reasoning and Analysis. New
Jersey: Princeton University Press.
Buhr, Ray. 2017. Using R as a Production
Machine Learning Language (Part I). https://raybuhr.github.io/blog/posts/making-predictions-over-http/.
Buja, Andreas, Dianne Cook, Heike Hofmann, Michael Lawrence, Eun-Kyung
Lee, Deborah F. Swayne, and Hadley Wickham. 2009. “Statistical
Inference for Exploratory Data Analysis and Model Diagnostics.”
Philosophical Transactions of the Royal Society A:
Mathematical, Physical and Engineering Sciences 367 (1906):
4361–83. https://doi.org/10.1098/rsta.2009.0120.
Buja, Andreas, Dianne Cook, and Deborah Swayne. 1996. “Interactive
High-Dimensional Data Visualization.” Journal of
Computational and Graphical Statistics 5 (1): 78–99. https://doi.org/10.2307/1390754.
Buneman, Peter, Sanjeev Khanna, and Tan Wang-Chiew. 2001. “Why and
Where: A Characterization of Data Provenance.” In Database
Theory ICDT 2001, 316–30. Springer
Berlin Heidelberg. https://doi.org/10.1007/3-540-44503-x_20.
Buolamwini, Joy, and Timnit Gebru. 2018. “Gender Shades:
Intersectional Accuracy Disparities in Commercial Gender
Classification.” In Conference on Fairness, Accountability
and Transparency, 77–91.
Burch, Tyler James. 2023. “2023 NHL Playoff
Predictions,” April. https://tylerjamesburch.com/blog/misc/nhl-predictions.
Burton, Jason, Nicole Cruz, and Ulrike Hahn. 2021. “Reconsidering
Evidence of Moral Contagion in Online Social Networks.”
Nature Human Behaviour 5 (12): 1629–35. https://doi.org/10.1038/s41562-021-01133-5.
Bush, Vannevar. 1945. “As We May Think.” The Atlantic
Monthly, July. https://www.theatlantic.com/magazine/archive/1945/07/as-we-may-think/303881/.
Byrd, James Brian, Anna Greene, Deepashree Venkatesh Prasad, Xiaoqian
Jiang, and Casey Greene. 2020. “Responsible, Practical Genomic
Data Sharing That Accelerates Research.” Nature Reviews
Genetics 21 (10): 615–29. https://doi.org/10.1038/s41576-020-0257-5.
Cahill, Niamh, Michelle Weinberger, and Leontine Alkema. 2020.
“What Increase in Modern Contraceptive Use Is Needed in FP2020
Countries to Reach 75% Demand Satisfied by 2030? An Assessment Using the
Accelerated Transition Method and Family Planning Estimation
Model.” Gates Open Research 4. https://doi.org/10.12688/gatesopenres.13125.1.
Calonico, Sebastian, Matias Cattaneo, Max Farrell, and Rocio Titiunik.
2021. rdrobust: Robust Data-Driven Statistical
Inference in Regression-Discontinuity Designs. https://CRAN.R-project.org/package=rdrobust.
Cambon, Jesse, and Christopher Belanger. 2021. “tidygeocoder: Geocoding Made Easy.” Zenodo.
https://doi.org/10.5281/zenodo.3981510.
Canty, Angelo, and B. D. Ripley. 2021. boot:
Bootstrap R (S-Plus) Functions.
Cardoso, Tom. 2020. “Bias Behind Bars: A
Globe Investigation Finds a Prison System Stacked Against Black and
Indigenous Inmates.” The Globe and Mail, October.
https://www.theglobeandmail.com/canada/article-investigation-racial-bias-in-canadian-prison-risk-assessments/.
Carl, Sebastian, Ben Baldwin, Lee Sharpe, Tan Ho, and John Edwards.
2023. nflverse: Easily Install and Load the “nflverse”. https://CRAN.R-project.org/package=nflverse.
Carleton, Chris. 2021. “wccarleton/conflict-europe: Acce.” Zenodo.
https://doi.org/10.5281/zenodo.4550688.
Carleton, Chris, Dave Campbell, and Mark Collard. 2021. “A
Reassessment of the Impact of Temperature Change on European Conflict
During the Second Millennium CE Using a Bespoke Bayesian Time-Series
Model.” Climatic Change 165 (1): 1–16. https://doi.org/10.1007/s10584-021-03022-2.
Caro, Robert. 2019. Working. 1st ed. New York: Knopf.
Carpenter, Christopher, and Carlos Dobkin. 2014. “Replication Data for: The Minimum Legal Drinking Age and
Crime.” https://doi.org/10.7910/DVN/27070.
———. 2015. “The Minimum Legal Drinking Age
and Crime.” The Review of Economics and
Statistics 97 (2): 521–24. https://doi.org/10.1162/REST_a_00489.
Carroll, Lewis. 1871. Through the Looking-Glass. Macmillan. https://www.gutenberg.org/files/12/12-h/12-h.htm.
Castro, Marcia, Susie Gurzenda, Cassio Turra, Sun Kim, Theresa
Andrasfay, and Noreen Goldman. 2023. “Research Note:
COVID-19 Is Not an Independent Cause of Death.”
Demography, February. https://doi.org/10.1215/00703370-10575276.
Caughey, Devin, and Jasjeet Sekhon. 2011. “Elections and the Regression Discontinuity Design:
Lessons from Close U.S. House Races, 1942–2008.”
Political Analysis 19 (4): 385–408. https://doi.org/10.1093/pan/mpr032.
Chamberlain, Scott, Hadley Wickham, Winston Chang, and Mauricio Vargas.
2022. analogsea: Interface to “Digital Ocean”. https://CRAN.R-project.org/package=analogsea.
Chamberlin, Donald. 2012. “Early History of
SQL.” IEEE Annals of the History of
Computing 34 (4): 78–82. https://doi.org/10.1109/mahc.2012.61.
Chambliss, Daniel. 1989. “The Mundanity of Excellence: An
Ethnographic Report on Stratification and Olympic Swimmers.”
Sociological Theory 7 (1): 70–86. https://doi.org/10.2307/202063.
Chan, Duo. 2021. “Combining Statistical, Physical, and Historical
Evidence to Improve Historical Sea-Surface Temperature Records.”
Harvard Data Science Review 3 (1). https://doi.org/10.1162/99608f92.edcee38f.
Chang, Winston, Joe Cheng, JJ Allaire, Carson Sievert, Barret Schloerke,
Yihui Xie, Jeff Allen, Jonathan McPherson, Alan Dipert, and Barbara
Borges. 2021. shiny: Web Application Framework
for R. https://CRAN.R-project.org/package=shiny.
Chase, William. 2020. “The Glamour of Graphics.”
RStudio Conference, January. https://posit.co/resources/videos/the-glamour-of-graphics/.
Chawla, Dalmeet Singh. 2020. “Critiqued Coronavirus Simulation
Gets Thumbs up from Code-Checking Efforts.” Nature 582:
323–24. https://doi.org/10.1038/d41586-020-01685-y.
Chellel, Kit. 2018. “The Gambler Who Cracked the Horse-Racing
Code.” Bloomberg Businessweek, May. https://www.bloomberg.com/news/features/2018-05-03/the-gambler-who-cracked-the-horse-racing-code.
Chen, Heng, Marie-Hélène Felt, and Christopher Henry. 2018. “2017
Methods-of-Payment Survey: Sample Calibration and Variance
Estimation.” Bank of Canada. https://doi.org/10.34989/tr-114.
Chen, Wei, Xilu Chen, Chang-Tai Hsieh, and Zheng Song. 2019. “A
Forensic Examination of China’s National Accounts.” Brookings
Papers on Economic Activity, 77–127. https://www.jstor.org/stable/26798817.
Chen, Weijun, Yan Qi, Yuwen Zhang, Christina Brown, Akos Lada, and
Harivardan Jayaraman. 2022. “Notifications: Why Less Is
More,” December. https://medium.com/@AnalyticsAtMeta/notifications-why-less-is-more-how-facebook-has-been-increasing-both-user-satisfaction-and-app-9463f7325e7d.
Cheng, Joe, Bhaskar Karambelkar, and Yihui Xie. 2021. leaflet: Create Interactive Web Maps with the JavaScript
“Leaflet” Library. https://CRAN.R-project.org/package=leaflet.
Cheriet, Mohamed, Nawwaf Kharma, Cheng-Lin Liu, and Ching Suen. 2007.
Character Recognition Systems: A Guide for Students and
Practitioners. Wiley.
Chouldechova, Alexandra, Diana Benavides-Prado, Oleksandr Fialko, and
Rhema Vaithianathan. 2018. “A Case Study of Algorithm-Assisted
Decision Making in Child Maltreatment Hotline Screening
Decisions.” In Proceedings of the 1st Conference on Fairness,
Accountability and Transparency, edited by Sorelle Friedler and
Christo Wilson, 81:134–48. Proceedings of Machine Learning Research. https://proceedings.mlr.press/v81/chouldechova18a.html.
Chrétien, Jean. 2007. My Years as Prime Minister. 1st ed.
Toronto: Knopf Canada.
Christensen, Garret, Allan Dafoe, Edward Miguel, Don Moore, and Andrew
Rose. 2019. “A Study of the Impact of Data Sharing on Article
Citations Using Journal Policies as a Natural Experiment.”
PLOS ONE 14 (12): e0225883. https://doi.org/10.1371/journal.pone.0225883.
Christensen, Garret, Jeremy Freese, and Edward Miguel. 2019.
Transparent and Reproducible Social Science Research.
California: University of California Press.
Christian, Brian. 2012. “The A/B Test: Inside
the Technology That’s Changing the Rules of Business.”
Wired, April. https://www.wired.com/2012/04/ff-abtesting/.
Cirone, Alexandra, and Arthur Spirling. 2021. “Turning History
into Data: Data Collection, Measurement, and Inference in HPE.”
Journal of Historical Political Economy 1 (1): 127–54. https://doi.org/10.1561/115.00000005.
City of Toronto. 2021. 2021 Street Needs Assessment. https://www.toronto.ca/city-government/data-research-maps/research-reports/housing-and-homelessness-research-and-reports/.
Cleveland, William. (1985) 1994. The Elements of Graphing Data.
2nd ed. New Jersey: Hobart Press.
Clinton, Joshua, John Lapinski, and Marc Trussler. 2022.
“Reluctant Republicans, Eager Democrats?” Public
Opinion Quarterly 86 (2): 247–69. https://doi.org/10.1093/poq/nfac011.
Cohen, Glenn, and Michelle Mello. 2018. “HIPAA and
Protecting Health Information in the 21st Century.”
JAMA 320 (3): 231. https://doi.org/10.1001/jama.2018.5630.
Cohen, Jason, Steven Teleki, and Eric Brown. 2006. Best Kept Secrets
of Peer Code Review. Smart Bear Incorporated.
Cohn, Alain. 2019. “Data and Code for: Civic
Honesty Around the Globe.” Harvard Dataverse. https://doi.org/10.7910/dvn/ykbodn.
Cohn, Alain, Michel André Maréchal, David Tannenbaum, and Christian
Lukas Zünd. 2019a. “Civic Honesty Around the Globe.”
Science 365 (6448): 70–73. https://doi.org/10.1126/science.aau8712.
———. 2019b. “Supplementary Materials for: Civic Honesty Around the
Globe.” Science 365 (6448): 70–73.
Cohn, Nate. 2016. “We Gave Four Good Pollsters the Same Raw Data.
They Had Four Different Results.” The New York Times,
September. https://www.nytimes.com/interactive/2016/09/20/upshot/the-error-the-polling-world-rarely-talks-about.html.
Collins, Annie, and Rohan Alexander. 2022. “Reproducibility of
COVID-19 Pre-Prints.” Scientometrics 127: 4655–73. https://doi.org/10.1007/s11192-022-04418-2.
Colombo, Tommaso, Holger Fröning, Pedro Javier García, and Wainer
Vandelli. 2016. “Optimizing the Data-Collection Time of a
Large-Scale Data-Acquisition System Through a Simulation
Framework.” The Journal of Supercomputing 72 (12):
4546–72. https://doi.org/10.1007/s11227-016-1764-1.
Comer, Benjamin P., and Jason R. Ingram. 2022. “Comparing Fatal
Encounters, Mapping Police Violence, and Washington Post Fatal Police
Shooting Data from 2015-2019: A Research Note.” Criminal
Justice Review, January, 073401682110710. https://doi.org/10.1177/07340168211071014.
Congelio, Bradley. 2024. Introduction to NFL
Analytics with R. 1st ed. Chapman & Hall/CRC. https://bradcongelio.com/nfl-analytics-with-r-book/.
Cook, Dianne, Andreas Buja, Javier Cabrera, and Catherine Hurley. 1995.
“Grand Tour and Projection
Pursuit.” Journal of Computational and Graphical
Statistics 4 (3): 155–72. https://doi.org/10.1080/10618600.1995.10474674.
Cook, Dianne, Nancy Reid, and Emi Tanaka. 2021. “The Foundation Is
Available for Thinking about Data Visualization Inferentially.”
Harvard Data Science Review 3 (3). https://doi.org/10.1162/99608f92.8453435d.
Cook, Dianne, and Deborah Swayne. 2007. Interactive and Dynamic Graphics for Data Analysis: With
R and GGobi. 1st ed. Springer.
Cooley, David. 2020. mapdeck: Interactive Maps
Using “Mapbox GL JS” and
“Deck.gl”. https://CRAN.R-project.org/package=mapdeck.
Council of the European Union. 2016. “General Data Protection
Regulation 2016/679.” https://eur-lex.europa.eu/eli/reg/2016/679/oj.
Cowen, Tyler. 2021. “Episode 132: Amia Srinivasan on Utopian
Feminism.” Conversations with Tyler, September. https://conversationswithtyler.com/episodes/amia-srinivasan/.
———. 2023. “Episode 168: Katherine Rundell on the Art of
Words.” Conversations with Tyler, January. https://conversationswithtyler.com/episodes/katherine-rundell/.
Cox, David. 2018. “In Gentle Praise of Significance Tests.”
YouTube, October. https://youtu.be/txLj_P9UlCQ.
Cox, David, and Nancy Reid. 1987. “Parameter Orthogonality and
Approximate Conditional Inference.” Journal of the Royal
Statistical Society: Series B (Methodological) 49 (1): 1–18. https://doi.org/10.1111/j.2517-6161.1987.tb01422.x.
Cox, Murray. 2021. “Inside Airbnb—Toronto
Data.” http://insideairbnb.com/get-the-data.html.
Coyle, Edward, Andrew Coggan, Mari Hopper, and Thomas Walters. 1988.
“Determinants of Endurance in Well-Trained
Cyclists.” Journal of Applied Physiology 64 (6):
2622–30. https://doi.org/10.1152/jappl.1988.64.6.2622.
Craiu, Radu. 2019. “The Hiring Gambit: In Search of the Twofer
Data Scientist.” Harvard Data Science Review 1 (1). https://doi.org/10.1162/99608f92.440445cb.
Cramer, Jan Salomon. 2003. “The Origins of Logistic
Regression.” SSRN Electronic Journal. https://doi.org/10.2139/ssrn.360300.
Crane, Nicola, Stephanie Hazlitt, and Apache Arrow. 2023.
Apache Arrow R Cookbook. https://arrow.apache.org/cookbook/r/.
Crawford, Kate. 2021. Atlas of AI.
1st ed. New Haven: Yale University Press.
Crosby, Alfred. 1997. The Measure of Reality: Quantification in
Western Europe, 1250–1600. Cambridge: Cambridge University Press.
Csárdi, Gábor. 2022. gitcreds: Query
“git” Credentials from “R”. https://CRAN.R-project.org/package=gitcreds.
Csárdi, Gábor, Jim Hester, Hadley Wickham, Winston Chang, Martin Morgan,
and Dan Tenenbaum. 2021. remotes: R Package
Installation from Remote Repositories, Including
“GitHub”. https://CRAN.R-project.org/package=remotes.
Cummins, Neil. 2022. “The Hidden Wealth of English Dynasties,
1892–2016.” The Economic History Review 75 (3): 667–702.
https://doi.org/10.1111/ehr.13120.
Cunningham, Scott. 2021. Causal Inference: The Mixtape. 1st ed.
New Haven: Yale University Press. https://mixtape.scunning.com.
D’Ignazio, Catherine, and Lauren Klein. 2020. Data Feminism.
Massachusetts: The MIT Press. https://data-feminism.mitpress.mit.edu.
Dagan, Noa, Noam Barda, Eldad Kepten, Oren Miron, Shay Perchik, Mark
Katz, Miguel Hernán, Marc Lipsitch, Ben Reis, and Ran Balicer. 2021.
“BNT162b2 mRNA Covid-19 Vaccine in a Nationwide Mass Vaccination
Setting.” New England Journal of Medicine 384 (15):
1412–23. https://doi.org/10.1056/NEJMoa2101765.
Daston, Lorraine. 2000. “Why Statistics Tend Not Only to Describe
the World but to Change It.” London Review of Books 22
(8). https://www.lrb.co.uk/the-paper/v22/n08/lorraine-daston/why-statistics-tend-not-only-to-describe-the-world-but-to-change-it.
Data and Justice Criminology Lab, Institute of Criminology and Criminal
Justice, Carleton University; The Centre for Research & Innovation
for Black Survivors of Homicide Victims (The CRIB), at the
Factor-Inwentash Faculty of Social Work, University of Toronto; Canadian
Civil Liberties Association; Ethics and Technology Lab, Queen’s
University. 2022. “Tracking (in)justice: A Living Data Set
Tracking Canadian Police-Involved Deaths.” https://trackinginjustice.ca.
Dattani, Saloni. 2024. “The Rise in Reported Maternal Mortality
Rates in the US Is Largely Due to a Change in Measurement.”
Our World in Data.
Davidson, Thomas, Debasmita Bhattacharya, and Ingmar Weber. 2019.
“Racial Bias in Hate Speech and Abusive Language Detection
Datasets.” In Proceedings of the Third Workshop on Abusive
Language Online, 25–35.
Davies, Neil M., Gibran Hemani, Jenae M. Neiderhiser, Hilary C. Martin,
Melinda C. Mills, Peter M. Visscher, Loïc Yengo, Alexander Strudwick
Young, and Matthew C. Keller. 2024. “The Importance of
Family-Based Sampling for Biobanks.” Nature 634 (8035):
795–803. https://doi.org/10.1038/s41586-024-07721-5.
Davies, Rhian, Steph Locke, and Lucy D’Agostino McGowan. 2022. datasauRus: Datasets from the Datasaurus
Dozen. https://CRAN.R-project.org/package=datasauRus.
Davis, Darren. 1997. “Nonrandom Measurement Error and Race of
Interviewer Effects Among African Americans.” The Public
Opinion Quarterly 61 (1): 183–207. https://doi.org/10.1086/297792.
Davison, A. C., and D. V. Hinkley. 1997. Bootstrap Methods and Their
Application. Cambridge: Cambridge University Press. http://statwww.epfl.ch/davison/BMA/.
De Jonge, Edwin, and Mark van der Loo. 2013. An
Introduction to Data Cleaning with R. Statistics Netherlands
Heerlen. https://cran.r-project.org/doc/contrib/de_Jonge+van_der_Loo-Introduction_to_data_cleaning_with_R.pdf.
Dean, Natalie. 2022. “Tracking COVID-19 Infections:
Time for Change.” Nature 602 (7896): 185. https://doi.org/10.1038/d41586-022-00336-8.
Deaton, Angus. 2010. “Instruments, Randomization, and Learning
about Development.” Journal of Economic Literature 48
(2): 424–55. https://doi.org/10.1257/jel.48.2.424.
Denby, Lorraine, and Colin Mallows. 2009. “Variations on the
Histogram.” Journal of Computational and Graphical
Statistics 18 (1): 21–31. https://doi.org/10.1198/jcgs.2009.0002.
DeWitt, Helen. 2000. The Last Samurai. 1st ed. United States:
Talk Miramax Books.
Dillman, Don, Jolene Smyth, and Leah Christian. (1978) 2014.
Internet, Phone, Mail, and Mixed-Mode Surveys: The Tailored Design
Method. 4th ed. Wiley.
Doggers, Peter. 2021. “Carlsen Wins Game 6, Longest World Chess
Championship Game of All Time,” December. https://www.chess.com/news/view/fide-world-chess-championship-2021-game-6.
Dolatsara, Hamidreza Ahady, Ying-Ju Chen, Robert Leonard, Fadel Megahed,
and Allison Jones-Farmer. 2021. “Explaining Predictive Model
Performance: An Experimental Study of Data Preparation and Model
Choice.” Big Data, October. https://doi.org/10.1089/big.2021.0067.
Doll, Richard, and Bradford Hill. 1950. “Smoking and Carcinoma of
the Lung.” British Medical Journal 2 (4682): 739–48. https://doi.org/10.1136/bmj.2.4682.739.
Druckman, James, and Donald Green. 2021. “A New Era of
Experimental Political Science.” In Advances in Experimental
Political Science, 1–16. Cambridge: Cambridge University Press. https://doi.org/10.1017/9781108777919.002.
Du, Kai, Steven Huddart, and Xin Daniel Jiang. 2022. “Lost in
Standardization: Effects of Financial Statement Database Discrepancies
on Inference.” Journal of Accounting and Economics,
December, 101573. https://doi.org/10.1016/j.jacceco.2022.101573.
Duflo, Esther. 2020. “Field Experiments and the Practice of
Policy.” American Economic Review 110 (7): 1952–73. https://doi.org/10.1257/aer.110.7.1952.
Dwork, Cynthia, Frank McSherry, Kobbi Nissim, and Adam Smith. 2006.
“Calibrating Noise to Sensitivity in Private Data
Analysis.” In Theory of Cryptography Conference, 265–84.
Springer. https://doi.org/10.1007/11681878_14.
Dwork, Cynthia, and Aaron Roth. 2013. “The Algorithmic Foundations
of Differential Privacy.” Foundations and Trends in
Theoretical Computer Science 9 (3-4): 211–407. https://doi.org/10.1561/0400000042.
Edelman, Murray, Liberty Vittert, and Xiao-Li Meng. 2021. “An
Interview with Murray Edelman on the History of the Exit Poll.”
Harvard Data Science Review 3 (1). https://doi.org/10.1162/99608f92.3a25cd24.
Edgeworth, Francis Ysidro. 1885. “Methods of Statistics.”
Journal of the Statistical Society of London, 181–217.
Edwards, Jonathan. 2017. “PACE Team Response
Shows a Disregard for the Principles of Science.”
Journal of Health Psychology 22 (9): 1155–58. https://doi.org/10.1177/1359105317700886.
Efron, Bradley, and Carl Morris. 1977. “Stein’s Paradox in
Statistics.” Scientific American 236 (May): 119–27. https://doi.org/10.1038/scientificamerican0577-119.
Eghbal, Nadia. 2020. Working in Public: The Making and Maintenance
of Open Source Software. California: Stripe Press.
Eisenstein, Michael. 2022. “Need Web Data? Here’s How to Harvest
Them.” Nature 607: 200–201. https://doi.org/10.1038/d41586-022-01830-9.
Elliott, Michael, Brady West, Xinyu Zhang, and Stephanie Coffey. 2022.
“The Anchoring Method: Estimation of Interviewer Effects in the
Absence of Interpenetrated Sample Assignment.” Survey
Methodology 48 (1): 25–48. http://www.statcan.gc.ca/pub/12-001-x/2022001/article/00005-eng.htm.
Elson, Malte. 2018. “Question Wording and Item
Formulation.” https://doi.org/10.31234/osf.io/e4ktc.
Enns, Peter, and Jake Rothschild. 2022. “Do You Know Where Your
Survey Data Come From?” May. https://medium.com/3streams/surveys-3ec95995dde2.
Farrugia, Patricia, Bradley Petrisor, Forough Farrokhyar, and Mohit
Bhandari. 2010. “Research Questions, Hypotheses and
Objectives.” Canadian Journal of Surgery 53 (4): 278.
Feldman, Gilad. 2024. RRR Assessment Peer Review.
https://mgto.org/rrrassessmentreviewtemplate.
Finkelstein, Amy, Sarah Taubman, Bill Wright, Mira Bernstein, Jonathan
Gruber, Joseph Newhouse, Heidi Allen, Katherine Baicker, and Oregon
Health Study Group. 2012. “The Oregon Health Insurance Experiment:
Evidence from the First Year.” The Quarterly Journal of
Economics 127 (3): 1057–1106. https://doi.org/10.1093/qje/qjs020.
Firke, Sam. 2023. janitor: Simple Tools for
Examining and Cleaning Dirty Data. https://CRAN.R-project.org/package=janitor.
Fisher, Ronald. (1925) 1928. Statistical Methods for Research
Workers. 2nd ed. London: Oliver & Boyd.
———. (1935) 1949. The Design of Experiments. 5th ed. London:
Oliver & Boyd.
Fiske, Susan, and Shiro Kuriwaki. 2021. “Words to the Wise on
Writing Scientific Papers,” November. https://doi.org/10.31234/osf.io/n32qw.
Fitts, Alexis Sobel. 2014. “The King of Content: How Upworthy Aims
to Alter the Web, and Could End up Altering the World.”
Columbia Journalism Review 53: 34–38. https://archives.cjr.org/feature/the_king_of_content.php.
Flake, Jessica, and Eiko Fried. 2020. “Measurement Schmeasurement:
Questionable Measurement Practices and How to Avoid Them.”
Advances in Methods and Practices in Psychological Science 3
(4): 456–65. https://doi.org/10.1177/2515245920952393.
Flynn, Michael. 2022. troopdata: Tools for
Analyzing Cross-National Military Deployment and Basing
Data. https://CRAN.R-project.org/package=troopdata.
Ford, Paul. 2015. “What Is Code?” Bloomberg
Businessweek, June. https://www.bloomberg.com/graphics/2015-paul-ford-what-is-code/.
Forster, Edward Morgan. 1927. Aspects of the Novel. London:
Edward Arnold.
Foster, Gordon. 1968. “Computers, Statistics and Planning: Systems
or Chaos?” Geary Lecture. https://www.esri.ie/system/files/publications/GLS2.pdf.
Fourcade, Marion, and Kieran Healy. 2017. “Seeing Like a
Market.” Socio-Economic Review 15 (1): 9–29. https://doi.org/10.1093/ser/mww033.
Fowler, Martin, and Kent Beck. 2018. Refactoring: Improving the Design of Existing
Code. 2nd ed. New York: Addison-Wesley Professional.
Fox, John, and Robert Andersen. 2006. “Effect Displays for
Multinomial and Proportional-Odds Logit Models.” Sociological
Methodology 36 (1): 225–55. https://doi.org/10.1111/j.1467-9531.2006.00180.
Fox, John, Sanford Weisberg, and Brad Price. 2022. carData:
Companion to Applied Regression Data Sets. https://CRAN.R-project.org/package=carData.
Franconeri, Steven, Lace Padilla, Priti Shah, Jeffrey Zacks, and Jessica
Hullman. 2021. “The Science of Visual Data Communication: What
Works.” Psychological Science in the Public Interest 22
(3): 110–61. https://doi.org/10.1177/15291006211051956.
Frandell, Ashlee, Mary Feeney, Timothy Johnson, Eric Welch, Lesley
Michalegko, and Heyjie Jung. 2021. “The Effects of Electronic
Alert Letters for Internet Surveys of Academic Scientists.”
Scientometrics 126 (8): 7167–81. https://doi.org/10.1007/s11192-021-04029-3.
Franklin, Laura. 2005. “Exploratory Experiments.”
Philosophy of Science 72 (5): 888–99. https://doi.org/10.1086/508117.
Frei, Christoph, and Liam Welsh. 2022. “How
the Closure of a U.S. Tax Loophole May Affect Investor
Portfolios.” Journal of Risk and Financial
Management 15 (5): 209. https://doi.org/10.3390/jrfm15050209.
Frick, Hannah, Fanny Chow, Max Kuhn, Michael Mahoney, Julia Silge, and
Hadley Wickham. 2022. rsample: General
Resampling Infrastructure. https://CRAN.R-project.org/package=rsample.
Fried, Eiko, Jessica Flake, and Donald Robinaugh. 2022.
“Revisiting the Theoretical and Methodological Foundations of
Depression Measurement.” Nature Reviews Psychology 1
(6): 358–68. https://doi.org/10.1038/s44159-022-00050-2.
Friedman, Jerome, Robert Tibshirani, and Trevor Hastie. 2009. The
Elements of Statistical Learning. 2nd ed. Springer. https://hastie.su.domains/ElemStatLearn/.
Friendly, Michael. 2021. HistData: Data Sets from the History of
Statistics and Data Visualization. https://CRAN.R-project.org/package=HistData.
Friendly, Michael, and Howard Wainer. 2021. A History of Data
Visualization and Graphic Communication. 1st ed. Massachusetts:
Harvard University Press.
Fry, Hannah. 2020. “Big Tech Is Testing You.” The New
Yorker, February, 61–65. https://www.newyorker.com/magazine/2020/03/02/big-tech-is-testing-you.
Fryzlewicz, Piotr. 2024. “Telling Stories
with Data: With Applications in R.” The American
Statistician, April, 1–5. https://doi.org/10.1080/00031305.2024.2339562.
Fuller, Mark, and James Mosher. 1987. “Raptor Survey
Techniques.” In Raptor Management Techniques Manual,
edited by Beth Pendleton, Brian Millsap, Keith Cline, and David Bird,
37–65. National Wildlife Federation. https://www.sandiegocounty.gov/content/dam/sdc/pds/ceqa/JVR/AdminRecord/IncorporatedByReference/Appendices/Appendix-D---Biological-Resources-Report/Fuller%20and%20Mosher%201987.pdf.
Funkhouser, Gray. 1937. “Historical Development of the Graphical
Representation of Statistical Data.” Osiris 3: 269–404.
https://doi.org/10.1086/368480.
Gagolewski, Marek. 2022. “stringi:
Fast and Portable Character String Processing in
R.” Journal of Statistical Software 103
(2): 1–59. https://doi.org/10.18637/jss.v103.i02.
Galef, Julia. 2020. “Episode 248: Are Democrats Being Irrational?
(David Shor).” Rationally Speaking, December. http://rationallyspeakingpodcast.org/248-are-democrats-being-irrational-david-shor/.
Gao, Lucy, Jacob Bien, and Daniela Witten. 2022. “Selective
Inference for Hierarchical Clustering.” Journal of the
American Statistical Association, October, 1–11. https://doi.org/10.1080/01621459.2022.2116331.
Gao, Zheng, Christian Bird, and Earl T. Barr. 2017. “To Type or
Not to Type: Quantifying Detectable Bugs in
JavaScript.” In 2017
IEEE/ACM 39th International Conference on
Software Engineering (ICSE). IEEE. https://doi.org/10.1109/icse.2017.75.
Garfinkel, Irwin, Lee Rainwater, and Timothy Smeeding. 2006. “A
Re-Examination of Welfare States and Inequality in Rich Nations: How
in-Kind Transfers and Indirect Taxes Change the Story.”
Journal of Policy Analysis and Management 25 (4): 897–919. https://doi.org/10.1002/pam.20213.
Gargiulo, Maria. 2022. “Statistical Biases, Measurement
Challenges, and Recommendations for Studying Patterns of Femicide in
Conflict.” Peace Review 34 (2): 163–76. https://doi.org/10.1080/10402659.2022.2049002.
Garnier, Simon, Noam Ross, Robert Rudis, Antônio Camargo, Marco Sciaini,
and Cédric Scherer. 2021. viridis –
Colorblind-Friendly Color Maps for R. https://doi.org/10.5281/zenodo.4679424.
Gazeley, Ursula, Georges Reniers, Hallie Eilerts-Spinelli, Julio Romero
Prieto, Momodou Jasseh, Sammy Khagayi, and Veronique Filippi. 2022.
“Women’s Risk of Death Beyond 42 Days Post Partum: A Pooled
Analysis of Longitudinal Health and Demographic Surveillance System Data
in Sub-Saharan Africa.” The Lancet Global Health 10
(11): e1582–89. https://doi.org/10.1016/s2214-109x(22)00339-4.
Gebru, Timnit, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman
Vaughan, Hanna Wallach, Hal Daumé III, and Kate Crawford. 2021.
“Datasheets for Datasets.” Communications of the
ACM 64 (12): 86–92. https://doi.org/10.1145/3458723.
Gelfand, Sharla. 2021. “Make a ReprEx... Please.”
YouTube, February. https://youtu.be/G5Nm-GpmrLw.
———. 2022a. Astrologer: Chani Nicholas Weekly Horoscopes
(2013-2017). http://github.com/sharlagelfand/astrologer.
———. 2022b. opendatatoronto: Access the City of
Toronto Open Data Portal. https://CRAN.R-project.org/package=opendatatoronto.
Gelman, Andrew. 2016. “What Has Happened Down
Here Is the Winds Have Changed,” September. https://statmodeling.stat.columbia.edu/2016/09/21/what-has-happened-down-here-is-the-winds-have-changed/.
———. 2019. “Another Regression Discontinuity Disaster and What Can
We Learn from It,” June. https://statmodeling.stat.columbia.edu/2019/06/25/another-regression-discontinuity-disaster-and-what-can-we-learn-from-it/.
———. 2020. “Statistical Models of Election Outcomes.”
YouTube, August. https://youtu.be/7gjDnrbLQ4k.
Gelman, Andrew, John Carlin, Hal Stern, David Dunson, Aki Vehtari, and
Donald Rubin. (1995) 2014. Bayesian Data Analysis. 3rd ed.
Chapman & Hall/CRC.
Gelman, Andrew, Sharad Goel, Douglas Rivers, and David Rothschild. 2016.
“The Mythical Swing Voter.” Quarterly Journal of
Political Science 11 (1): 103–30. https://doi.org/10.1561/100.00015031.
Gelman, Andrew, and Jennifer Hill. 2007. Data Analysis Using
Regression and Multilevel/Hierarchical Models. 1st ed. Cambridge
University Press.
Gelman, Andrew, Jennifer Hill, and Aki Vehtari. 2020. Regression and
Other Stories. Cambridge University Press. https://avehtari.github.io/ROS-Examples/.
Gelman, Andrew, and Guido Imbens. 2019. “Why High-Order
Polynomials Should Not Be Used in Regression Discontinuity
Designs.” Journal of Business & Economic Statistics
37 (3): 447–56. https://doi.org/10.1080/07350015.2017.1366909.
Gelman, Andrew, and Eric Loken. 2013. “The Garden of Forking
Paths: Why Multiple Comparisons Can Be a Problem, Even When There Is No
‘Fishing Expedition’ or ‘p-Hacking’ and the
Research Hypothesis Was Posited Ahead of Time.” Department of
Statistics, Columbia University. http://www.stat.columbia.edu/~gelman/research/unpublished/p_hacking.pdf.
Gelman, Andrew, Greggor Mattson, and Daniel Simpson. 2018. “Gaydar
and the Fallacy of Decontextualized Measurement.”
Sociological Science 5 (12): 270–80. https://doi.org/10.15195/v5.a12.
Gelman, Andrew, Cristian Pasarica, and Rahul Dodhia. 2002. “Let’s
Practice What We Preach: Turning Tables into Graphs.” The
American Statistician 56 (2): 121–30. https://doi.org/10.1198/000313002317572790.
Gelman, Andrew, and Aki Vehtari. 2021. “What Are the Most
Important Statistical Ideas of the Past 50 Years?” Journal of
the American Statistical Association 116 (536): 2087–97. https://doi.org/10.1080/01621459.2021.1938081.
———. 2024. Active Statistics: Stories, Games,
Problems, and Hands-on Demonstrations for Applied Regression and Causal
Inference. Cambridge University Press. https://doi.org/10.1017/9781009436243.
Gelman, Andrew, Aki Vehtari, Daniel Simpson, Charles Margossian, Bob
Carpenter, Yuling Yao, Lauren Kennedy, Jonah Gabry, Paul-Christian
Bürkner, and Martin Modrák. 2020. “Bayesian Workflow.”
arXiv. https://doi.org/10.48550/arXiv.2011.01808.
Gentemann, Chelle Leigh, Chris Holdgraf, Ryan Abernathey, Daniel
Crichton, James Colliander, Edward Joseph Kearns, Yuvi Panda, and
Richard Signell. 2021. “Science Storms the Cloud.”
AGU Advances 2 (2). https://doi.org/10.1029/2020av000354.
Gerber, Alan, and Donald Green. 2012. Field Experiments: Design,
Analysis, and Interpretation. New York: WW Norton.
Gerring, John. 2012. “Mere Description.” British
Journal of Political Science 42 (4): 721–46. https://doi.org/10.1017/s0007123412000130.
Gertler, Paul, Sebastian Martinez, Patrick Premand, Laura Rawlings, and
Christel Vermeersch. 2016. Impact Evaluation in Practice. 2nd
ed. The World Bank. https://doi.org/10.1596/978-1-4648-0779-4.
Geuenich, Michael, Jinyu Hou, Sunyun Lee, Shanza Ayub, Hartland Jackson,
and Kieran Campbell. 2021a. “Automated Assignment of Cell Identity
from Single-Cell Multiplexed Imaging and Proteomic Data.”
Cell Systems 12 (12): 1173–86. https://doi.org/10.1016/j.cels.2021.08.012.
———. 2021b. “Replication Materials: ‘Automated Assignment of Cell
Identity from Single-Cell Multiplexed Imaging and Proteomic
Data.’” https://doi.org/10.5281/ZENODO.5156049.
Ghitza, Yair, and Andrew Gelman. 2020. “Voter Registration
Databases and MRP: Toward the Use of Large-Scale Databases in Public
Opinion Research.” Political Analysis 28 (4): 507–31. https://doi.org/10.1017/pan.2020.3.
Gibney, Elizabeth. 2022. “The Leap Second’s
Time Is Up: World Votes to Stop Pausing Clocks.”
Nature 612 (7938): 18–18. https://doi.org/10.1038/d41586-022-03783-5.
Gleick, James. 1990. “The Census: Why We Can’t Count.”
The New York Times, July. https://www.nytimes.com/1990/07/15/magazine/the-census-why-we-can-t-count.html.
Godfrey, Ernest. 1918. “History and Development of Statistics in
Canada.” In The History of Statistics–Their Development and
Progress in Many Countries, edited by John Koren, 179–98. New York:
Macmillan.
Goodman, Leo. 1961. “Snowball Sampling.” The Annals of
Mathematical Statistics 32 (1): 148–70. https://doi.org/10.1214/aoms/1177705148.
Goodrich, Ben, Jonah Gabry, Imad Ali, and Sam Brilleman. 2023.
“rstanarm: Bayesian Applied
Regression Modeling via Stan.” https://mc-stan.org/rstanarm.
Google. 2022. “What to Look for in a Code Review.” Google
Engineering Practices Documentation. https://google.github.io/eng-practices/review/reviewer/looking-for.html.
Gordon, Brett, Robert Moakler, and Florian Zettelmeyer. 2022.
“Close Enough? A Large-Scale Exploration of Non-Experimental
Approaches to Advertising Measurement.” Marketing
Science, November. https://doi.org/10.1287/mksc.2022.1413.
Gordon, Brett, Florian Zettelmeyer, Neha Bhargava, and Dan Chapsky.
2019. “A Comparison of Approaches to Advertising Measurement:
Evidence from Big Field Experiments at Facebook.” Marketing
Science 38 (2): 193–225. https://doi.org/10.1287/mksc.2018.1135.
Gould, Elliot, Hannah Fraser, Timothy Parker, Shinichi Nakagawa, Simon
Griffith, Peter Vesk, and Fiona Fidler. 2023. “Same Data,
Different Analysts: Variation in Effect Sizes Due to Analytical
Decisions in Ecology and Evolutionary Biology,” October. https://doi.org/10.32942/x2gg62.
Graham, Paul. 2020. “How to Write Usefully,” February. http://paulgraham.com/useful.html.
Gray, Charles T., and Ben Marwick. 2019. “Truth, Proof, and
Reproducibility: There’s No Counter-Attack for the Codeless.” In
Communications in Computer and Information Science, 111–29.
Springer Singapore. https://doi.org/10.1007/978-981-15-1960-4_8.
Green, Donald, Terence Leong, Holger Kern, Alan Gerber, and Christopher
Larimer. 2009. “Testing the Accuracy of Regression Discontinuity
Analysis Using Experimental Benchmarks.” Political
Analysis 17 (4): 400–417. https://doi.org/10.1093/pan/mpp018.
Green, Eric. 2020. “Nivi Research: Mister P
helps us understand vaccine hesitancy,” December. https://research.nivi.io/posts/2020-12-08-mister-p-helps-us-understand-vaccine-hesitancy/.
Greenberg, Bernard, Abdel-Latif Abul-Ela, Walt Simmons, and Daniel
Horvitz. 1969. “The Unrelated Question Randomized Response Model:
Theoretical Framework.” Journal of the American Statistical
Association 64 (326): 520–39. https://doi.org/10.1080/01621459.1969.10500991.
Greenland, Sander, Stephen Senn, Kenneth Rothman, John Carlin, Charles
Poole, Steven Goodman, and Douglas Altman. 2016. “Statistical Tests, P Values, Confidence Intervals, and
Power: A Guide to Misinterpretations.” European
Journal of Epidemiology 31 (4): 337–50. https://doi.org/10.1007/s10654-016-0149-3.
Greifer, Noah. 2021. “Why Do We Do Matching for Causal Inference
Vs Regressing on Confounders?” Cross Validated,
September. https://stats.stackexchange.com/q/544958.
Grimmer, Justin, Margaret Roberts, and Brandon Stewart. 2022. Text As Data: A New Framework for Machine Learning and
the Social Sciences. New Jersey: Princeton University Press.
Grolemund, Garrett, and Hadley Wickham. 2011. “Dates and Times
Made Easy with lubridate.”
Journal of Statistical Software 40 (3): 1–25. https://doi.org/10.18637/jss.v040.i03.
Gronsbell, Jessica, Jessica Minnier, Sheng Yu, Katherine Liao, and
Tianxi Cai. 2019. “Automated Feature Selection of Predictors in
Electronic Medical Records Data.” Biometrics 75 (1):
268–77. https://doi.org/10.1111/biom.12987.
Groves, Robert. 2011. “Three Eras of Survey Research.”
Public Opinion Quarterly 75 (5): 861–71. https://doi.org/10.1093/poq/nfr057.
Groves, Robert, and Lars Lyberg. 2010. “Total
Survey Error: Past, Present, and Future.” Public
Opinion Quarterly 74 (5): 849–79. https://doi.org/10.1093/poq/nfq065.
Grün, Bettina, and Kurt Hornik. 2011. “topicmodels: An R Package for Fitting
Topic Models.” Journal of Statistical Software 40 (13):
1–30. https://doi.org/10.18637/jss.v040.i13.
Gustafsson, Karl, and Linus Hagström. 2017. “What Is the Point?
Teaching Graduate Students How to Construct Political Science Research
Puzzles.” European Political Science 17 (4): 634–48. https://doi.org/10.1057/s41304-017-0130-y.
Gutman, Robert. 1958. “Birth and Death Registration in
Massachusetts: II. The Inauguration of a Modern System,
1800–1849.” The Milbank Memorial Fund Quarterly 36 (4):
373–402.
Hackett, Robert. 2016. “Researchers Caused an
Uproar By Publishing Data From 70,000 OkCupid Users.”
Fortune, May. https://fortune.com/2016/05/18/okcupid-data-research/.
Halberstam, David. 1972. The Best and the
Brightest. 1st ed. New York: Random House.
Hamming, Richard. (1997) 2020. The Art of Doing
Science and Engineering. 2nd ed. Stripe Press.
Hammond, Jennifer, Heidi Leister-Tebbe, Annie Gardner, Paula Abreu,
Weihang Bao, Wayne Wisemandle, MaryLynn Baniecki, et al. 2022.
“Oral Nirmatrelvir for High-Risk, Nonhospitalized Adults with
Covid-19.” New England Journal of Medicine 386 (15):
1397–1408. https://doi.org/10.1056/nejmoa2118542.
Hand, David. 2018. “Statistical Challenges of Administrative and
Transaction Data.” Journal of the Royal Statistical Society:
Series A (Statistics in Society) 181 (3): 555–605. https://doi.org/10.1111/rssa.12315.
Handcock, Mark, and Krista Gile. 2011. “Comment: On the Concept of
Snowball Sampling.” Sociological Methodology 41 (1):
367–71. https://doi.org/10.1111/j.1467-9531.2011.01243.x.
Hangartner, Dominik, Daniel Kopp, and Michael Siegenthaler. 2021.
“Monitoring Hiring Discrimination Through Online Recruitment
Platforms.” Nature 589 (7843): 572–76. https://doi.org/10.1038/s41586-020-03136-0.
Hanretty, Chris. 2020. “An Introduction to Multilevel Regression
and Post-Stratification for Estimating Constituency Opinion.”
Political Studies Review 18 (4): 630–45. https://doi.org/10.1177/1478929919864773.
Hao, Karen. 2019. “This is How AI Bias Really
Happens—And Why It’s So Hard To Fix.” MIT Technology
Review, February. https://www.technologyreview.com/2019/02/04/137602/this-is-how-ai-bias-really-happensand-why-its-so-hard-to-fix/.
Hart, Edmund, Pauline Barmby, David LeBauer, François Michonneau, Sarah
Mount, Patrick Mulrooney, Timothée Poisot, Kara Woo, Naupaka Zimmerman,
and Jeffrey Hollister. 2016. “Ten Simple Rules for Digital Data
Storage.” PLOS Computational Biology 12
(10): e1005097. https://doi.org/10.1371/journal.pcbi.1005097.
Hartocollis, Anemona. 2022. “U.S. News Ranked
Columbia No. 2, but a Math Professor Has His Doubts.”
The New York Times, March. https://www.nytimes.com/2022/03/17/us/columbia-university-rank.html.
Hassan, Mai. 2022. “New Insights on Africa’s Autocratic
Past.” African Affairs 121 (483): 321–33. https://doi.org/10.1093/afraf/adac002.
Hastie, Trevor, and Robert Tibshirani. 1990. Generalized Additive
Models. 1st ed. Boca Raton: Chapman & Hall/CRC.
Hawes, Michael. 2020. “Implementing Differential Privacy: Seven Lessons
From the 2020 United States Census.” Harvard Data Science Review 2 (2).
https://doi.org/10.1162/99608f92.353c6f99.
Hayot, Eric. 2014. The Elements of Academic Style. New York:
Columbia University Press.
Healy, Kieran. 2018. Data Visualization. New Jersey: Princeton
University Press. https://socviz.co.
———. 2020. “The Kitchen Counter Observatory,” May. https://kieranhealy.org/blog/archives/2020/05/21/the-kitchen-counter-observatory/.
———. 2022. “Unhappy in Its Own Way,” July. https://kieranhealy.org/blog/archives/2022/07/22/unhappy-in-its-own-way/.
Heckathorn, Douglas. 1997. “Respondent-Driven Sampling: A New
Approach to the Study of Hidden Populations.” Social
Problems 44 (2): 174–99. https://doi.org/10.2307/3096941.
Heil, Benjamin, Michael Hoffman, Florian Markowetz, Su-In Lee, Casey
Greene, and Stephanie Hicks. 2021. “Reproducibility Standards for
Machine Learning in the Life Sciences.” Nature Methods
18 (10): 1132–35. https://doi.org/10.1038/s41592-021-01256-7.
Heller, Jean. 2022. “AP Exposes the Tuskegee Syphilis Study: The
50th Anniversary.” AP, July. https://apnews.com/article/tuskegee-study-ap-story-investigation-syphilis-53403657e77d76f52df6c2e2892788c9.
Hermans, Felienne. 2017. “Peter Hilton on Naming.” IEEE
Software 34 (3): 117–20. https://doi.org/10.1109/MS.2017.81.
———. 2021. The Programmer’s Brain: What Every Programmer Needs to
Know about Cognition. 1st ed. New York: Simon & Schuster. https://www.manning.com/books/the-programmers-brain.
Hernán, Miguel, David Clayton, and Niels Keiding. 2011. “The
Simpson’s Paradox Unraveled.” International Journal of
Epidemiology 40 (3): 780–85. https://doi.org/10.1093/ije/dyr041.
Hernán, Miguel, and James Robins. 2023. What If. 1st ed. Boca
Raton: Chapman & Hall/CRC. https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/.
Herndon, Thomas, Michael Ash, and Robert Pollin. 2014. “Does High
Public Debt Consistently Stifle Economic Growth? A Critique of Reinhart
and Rogoff.” Cambridge Journal of Economics 38 (2):
257–79. https://doi.org/10.1093/cje/bet075.
Hester, Jim, Florent Angly, Russ Hyde, Michael Chirico, Kun Ren,
Alexander Rosenstock, and Indrajeet Patil. 2022. lintr: A “Linter” for R Code. https://CRAN.R-project.org/package=lintr.
Hester, Jim, Hadley Wickham, and Gábor Csárdi. 2021. fs: Cross-Platform File System Operations Based on
“libuv”. https://CRAN.R-project.org/package=fs.
Hill, Austin Bradford. 1965. “The Environment and Disease:
Association or Causation?” Proceedings of the Royal Society
of Medicine 58 (5): 295–300.
Hillel, Wayne. 2017. How Do We Trust Our Science Code? https://www.hillelwayne.com/how-do-we-trust-science-code/.
Ho, Daniel, Kosuke Imai, Gary King, and Elizabeth Stuart. 2011.
“MatchIt: Nonparametric Preprocessing for Parametric
Causal Inference.” Journal of Statistical Software 42
(8): 1–28. https://doi.org/10.18637/jss.v042.i08.
Hodgetts, Paul. 2022. “The Negative Space of Data,” March.
https://hodgettsp.netlify.app/post/data-negativespace/.
Hofmeister, Johannes, Janet Siegmund, and Daniel Holt. 2017.
“Shorter Identifier Names Take Longer to Comprehend.” In
2017 IEEE 24th International Conference on Software Analysis,
Evolution and Reengineering (SANER), 217–27. https://doi.org/10.1109/saner.2017.7884623.
Holland, Paul. 1986. “Statistics and Causal Inference.”
Journal of the American Statistical Association 81 (396):
945–60. https://doi.org/10.2307/2289064.
Holliday, Derek, Tyler Reny, Alex Rossell Hayes, Aaron Rudkin, Chris
Tausanovitch, and Lynn Vavreck. 2021. “Democracy Fund + UCLA Nationscape Methodology and
Representativeness Assessment.”
Hopper, Nate. 2022. “The Thorny Problem of Keeping the Internet’s
Time.” The New Yorker, September. https://www.newyorker.com/tech/annals-of-technology/the-thorny-problem-of-keeping-the-internets-time.
Horst, Allison Marie, Alison Presmanes Hill, and Kristen Gorman. 2020.
palmerpenguins: Palmer Archipelago (Antarctica)
penguin data. https://doi.org/10.5281/zenodo.3960218.
Horton, Nicholas, Rohan Alexander, Micaela Parker, Aneta Piekut, and
Colin Rundel. 2022. “The Growing Importance of Reproducibility and
Responsible Workflow in the Data Science and Statistics
Curriculum.” Journal of Statistics and Data Science
Education 30 (3): 207–8. https://doi.org/10.1080/26939169.2022.2141001.
Horton, Nicholas, and Stuart Lipsitz. 2001. “Multiple Imputation
in Practice.” The American Statistician 55 (3): 244–54.
https://doi.org/10.1198/000313001317098266.
Hotz, Joseph, Christopher Bollinger, Tatiana Komarova, Charles Manski,
Robert Moffitt, Denis Nekipelov, Aaron Sojourner, and Bruce Spencer.
2022. “Balancing Data Privacy and Usability in the Federal
Statistical System.” Proceedings of the National Academy of
Sciences 119 (31): 1–10. https://doi.org/10.1073/pnas.2104906119.
Howes, Adam. 2022. “Representing Uncertainty Using Significant
Figures,” April. https://athowes.github.io/posts/2022-04-24-representing-uncertainty-using-significant-figures/.
Hug, Lucia, Monica Alexander, Danzhen You, Leontine Alkema, and UN
Inter-agency Group for Child Mortality Estimation. 2019. “National, Regional, and
Global Levels and Trends in Neonatal Mortality Between 1990 and 2017,
with Scenario-Based Projections to 2030: A Systematic Analysis.”
Lancet Global Health 7 (6): e710–20. https://doi.org/10.1016/S2214-109X(19)30163-9.
Hughes, Nicola, and Jill Rutter. 2016. “Ministers Reflect:
Interview with Oliver Letwin,” December. https://www.instituteforgovernment.org.uk/ministers-reflect/person/oliver-letwin/.
Hulley, Stephen, Steven Cummings, Warren Browner, Deborah Grady, and
Thomas Newman. 2007. Designing Clinical Research. 3rd ed.
Lippincott Williams & Wilkins.
Hullman, Jessica, and Andrew Gelman. 2021. “Designing for
Interactive Exploratory Data Analysis Requires Theories of Graphical
Inference.” Harvard Data Science Review 3 (3). https://doi.org/10.1162/99608f92.3ab8a587.
Huntington-Klein, Nick. 2021. The Effect: An Introduction to
Research Design and Causality. 1st ed. Chapman & Hall. https://theeffectbook.net.
———. 2022. “Library of Statistical Techniques.” https://lost-stats.github.io.
Huntington-Klein, Nick, Andreu Arenas, Emily Beam, Marco Bertoni,
Jeffrey Bloem, Pralhad Burli, Naibin Chen, et al. 2021. “The
Influence of Hidden Researcher Decisions in Applied
Microeconomics.” Economic Inquiry 59: 944–60. https://doi.org/10.1111/ecin.12992.
Huyen, Chip. 2020. “Machine Learning Is Going Real-Time,”
December. https://huyenchip.com/2020/12/27/real-time-machine-learning.html.
Hvitfeldt, Emil, and Julia Silge. 2021. Supervised Machine Learning for Text Analysis in
R. 1st ed. Chapman & Hall/CRC. https://doi.org/10.1201/9781003093459.
Hyman, Michael, Luca Sartore, and Linda J Young. 2021. “Capture-Recapture Estimation of Characteristics of U.S.
Local Food Farms Using a Web-Scraped List Frame.”
Journal of Survey Statistics and Methodology 10 (4): 979–1004.
https://doi.org/10.1093/jssam/smab008.
Hyndman, Rob, Timothy Hyndman, Charles Gray, Sayani Gupta, and Jacquie
Tran. 2022. cricketdata: International Cricket
Data. https://CRAN.R-project.org/package=cricketdata.
Iannone, Richard. 2022. DiagrammeR: Graph/Network
Visualization. https://CRAN.R-project.org/package=DiagrammeR.
Iannone, Richard, and Mauricio Vargas. 2022. pointblank: Data Validation and Organization of Metadata
for Local and Remote Tables. https://CRAN.R-project.org/package=pointblank.
International Organization of Legal Metrology. 2007. International
Vocabulary of Metrology – Basic and General Concepts and Associated
Terms. 3rd ed. https://www.oiml.org/en/files/pdf%5Fv/v002-200-e07.pdf.
Ioannidis, John. 2005. “Why Most Published Research Findings Are
False.” PLOS Medicine 2 (8): e124. https://doi.org/10.1371/journal.pmed.0020124.
Irizarry, Rafael. 2020. “The Role of Academia
in Data Science Education.” Harvard Data Science
Review 2 (1). https://doi.org/10.1162/99608f92.dd363929.
Irving, Damien, Kate Hertweck, Luke Johnston, Joel Ostblom, Charlotte
Wickham, and Greg Wilson. 2021. Research Software Engineering with
Python. Chapman & Hall/CRC.
Isaacson, Walter. 2011. Steve Jobs. 1st ed. Simon &
Schuster.
Ishiguro, Kazuo. 1989. The Remains of the Day. 1st ed. Faber &
Faber.
Izrailev, Sergei. 2022. tictoc: Functions for
Timing R Scripts, as Well as Implementations of “Stack” and
“List” Structures. https://CRAN.R-project.org/package=tictoc.
James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani.
(2013) 2021. An Introduction to Statistical
Learning with Applications in R. 2nd ed. Springer. https://www.statlearning.com.
Jenkins, Jennifer, Steven Rich, Andrew Ba Tran, Paige Moody, Julie Tate,
and Ted Mellnik. 2022. “How the Washington Post Examines Police
Shootings in the United States.” https://www.washingtonpost.com/investigations/2022/12/05/washington-post-fatal-police-shootings-methodology/.
Jet Propulsion Laboratory. 2009. “JPL
Institutional Coding Standard for the C Programming
Language.” Document Number D-60411, March. https://web.archive.org/web/20111015064908/http://lars-lab.jpl.nasa.gov/JPL_Coding_Standard_C.pdf.
Johnson, Alicia, Miles Ott, and Mine Dogucu. 2022. Bayes Rules! An Introduction to Bayesian Modeling with
R. 1st ed. Chapman & Hall/CRC. https://www.bayesrulesbook.com.
Johnson, Kaneesha. 2021. “Two Regimes of Prison Data
Collection.” Harvard Data Science Review 3 (3). https://doi.org/10.1162/99608f92.72825001.
Johnston, Myfanwy, and David Robinson. 2022. gutenbergr: Download and Process Public Domain Works from
Project Gutenberg. https://CRAN.R-project.org/package=gutenbergr.
Jones, Arnold. 1953. “Census Records of the Later Roman
Empire.” The Journal of Roman Studies 43: 49–64. https://doi.org/10.2307/297781.
Jordan, Michael. 2004. “Graphical Models.” Statistical
Science 19 (1). https://doi.org/10.1214/088342304000000026.
———. 2019. “Artificial Intelligence–The
Revolution Hasn’t Happened Yet.” Harvard Data Science
Review 1 (1). https://doi.org/10.1162/99608f92.f06c6e61.
Joyner, Michael. 1991. “Modeling: Optimal Marathon Performance on
the Basis of Physiological Factors.” Journal of Applied
Physiology 70 (2): 683–87. https://doi.org/10.1152/jappl.1991.70.2.683.
Jurafsky, Dan, and James Martin. (2000) 2023. Speech and Language
Processing. 3rd ed. https://web.stanford.edu/~jurafsky/slp3/.
Kahan, Brennan, Suzie Cro, Fan Li, and Michael Harhay. 2023.
“Eliminating Ambiguous Treatment Effects Using Estimands.”
American Journal of Epidemiology, February. https://doi.org/10.1093/aje/kwad036.
Kahan, Brennan, Joanna Hindley, Mark Edwards, Suzie Cro, and Tim Morris.
2024. “The estimands framework: a primer on
the ICH E9(R1) addendum.” BMJ, January, e076316.
https://doi.org/10.1136/bmj-2023-076316.
Kahan, Brennan, Fan Li, Andrew Copas, and Michael Harhay. 2022.
“Estimands in Cluster-Randomized Trials: Choosing Analyses That
Answer the Right Question.” International Journal of
Epidemiology, July. https://doi.org/10.1093/ije/dyac131.
Kahle, David, and Hadley Wickham. 2013. “ggmap: Spatial Visualization with ggplot2.”
The R Journal 5 (1): 144–61. http://journal.r-project.org/archive/2013-1/kahle-wickham.pdf.
Kahneman, Daniel, Olivier Sibony, and Cass Sunstein. 2021. Noise: A
Flaw in Human Judgment. William Collins.
Kalamara, Eleni, Arthur Turrell, Chris Redl, George Kapetanios, and
Sujit Kapadia. 2022. “Making text count:
Economic forecasting using newspaper text.”
Journal of Applied Econometrics 37 (5): 896–919.
https://doi.org/10.1002/jae.2907.
Kalgin, Alexander. 2014. “Implementation of
Performance Management in Regional Government in Russia: Evidence of
Data Manipulation.” Public Management Review 18
(1): 110–38. https://doi.org/10.1080/14719037.2014.965271.
Kapoor, Sayash, and Arvind Narayanan. 2023. “Leakage and the
Reproducibility Crisis in Machine-Learning-Based Science.”
Patterns 4 (9): 1–12. https://doi.org/10.1016/j.patter.2023.100804.
Karsten, Karl. 1923. Charts and Graphs. New York:
Prentice-Hall.
Kasy, Maximilian, and Alexander Teytelboym. 2023. “Matching with
Semi-Bandits.” The Econometrics Journal 26 (1): 45–66.
https://doi.org/10.1093/ectj/utac021.
Katz, Lindsay, and Rohan Alexander. 2023a. “A
new, comprehensive database of all proceedings of the Australian
Parliamentary Debates (1998-2022).” Zenodo. https://doi.org/10.5281/zenodo.7799678.
———. 2023b. “Digitization of the Australian Parliamentary Debates,
1998–2022.” Scientific Data 10 (1): 1–14. https://doi.org/10.1038/s41597-023-02464-w.
Kay, Matthew. 2022. tidybayes: Tidy Data
and Geoms for Bayesian Models. https://doi.org/10.5281/zenodo.1308151.
Kennedy, Lauren, and Jonah Gabry. 2020. “MRP
with rstanarm,” July. https://mc-stan.org/rstanarm/articles/mrp.html.
Kennedy, Lauren, and Andrew Gelman. 2021. “Know Your Population
and Know Your Model: Using Model-Based Regression and Poststratification
to Generalize Findings Beyond the Observed Sample.”
Psychological Methods 26 (5): 547–58. https://doi.org/10.1037/met0000362.
Kennedy, Lauren, Katharine Khanna, Daniel Simpson, Andrew Gelman, Yajun
Jia, and Julien Teitler. 2022. “He, She, They: Using Sex and
Gender in Survey Adjustment.” https://arxiv.org/abs/2009.14401.
Kenny, Christopher T., Shiro Kuriwaki, Cory McCartan, Evan T. R.
Rosenman, Tyler Simko, and Kosuke Imai. 2021. “The use of differential privacy for census data and its
impact on redistricting: The case of the 2020 U.S.
Census.” Science Advances 7 (41). https://doi.org/10.1126/sciadv.abk3283.
———. 2023. “Comment: The Essential Role of Policy Evaluation for
the 2020 Census Disclosure Avoidance System.” Harvard Data
Science Review, no. Special Issue 2. https://doi.org/10.1162/99608f92.abc2c765.
Kent, William. 1993. “My Height: A Model for Numeric
Information.” https://www.bkent.net/Doc/myheight.htm.
Keshav, Srinivasan. 2007. “How to Read a Paper.”
ACM SIGCOMM Computer Communication
Review 37 (3): 83–84. https://doi.org/10.1145/1273445.1273458.
Keyes, Os. 2019. “Counting the Countless.” Real
Life. https://reallifemag.com/counting-the-countless/.
Kharecha, Pushker, and James Hansen. 2013. “Prevented Mortality
and Greenhouse Gas Emissions from Historical and Projected Nuclear
Power.” Environmental Science & Technology 47 (9):
4889–95. https://doi.org/10.1021/es3051197.
Kiang, Mathew, Alexander Tsai, Monica Alexander, David Rehkopf, and
Sanjay Basu. 2021. “Racial/Ethnic Disparities in Opioid-Related
Mortality in the USA, 1999–2019: The Extreme Case of Washington
DC.” Journal of Urban Health 98 (5): 589–95. https://doi.org/10.1007/s11524-021-00573-8.
King, Gary. 2006. “Publication, Publication.” PS:
Political Science & Politics 39 (1): 119–25. https://doi.org/10.1017/S1049096506060252.
King, Gary, and Richard Nielsen. 2019. “Why Propensity Scores
Should Not Be Used for Matching.” Political Analysis 27
(4): 435–54. https://doi.org/10.1017/pan.2019.11.
King, Stephen. 2000. On Writing: A Memoir of the Craft. 1st ed.
Scribner.
Kirkegaard, Emil, and Julius Bjerrekær. 2016. “The OKCupid
Dataset: A Very Large Public Dataset of Dating Site Users.”
Open Differential Psychology, 1–10. https://doi.org/10.26775/ODP.2016.11.03.
Kish, Leslie. 1959. “Some Statistical Problems in Research
Design.” American Sociological Review 24 (3): 328–38. https://doi.org/10.2307/2089381.
Kleiber, Christian, and Achim Zeileis. 2008. Applied Econometrics
with R. New York: Springer-Verlag. https://CRAN.R-project.org/package=AER.
Knuth, Donald. 1984. “Literate Programming.” The
Computer Journal 27 (2): 97–111. https://doi.org/10.1093/comjnl/27.2.97.
———. 1998. Art of Computer Programming, Volume 2: Seminumerical
Algorithms. 2nd ed.
Knutson, Victoria, Serge Aleshin-Guendel, Ariel Karlinsky, William
Msemburi, and Jon Wakefield. 2022. “Estimating Global and
Country-Specific Excess Mortality During the COVID-19 Pandemic,”
May. https://cdn.who.int/media/docs/default-source/world-health-data-platform/covid-19-excessmortality/covid-methods-paper-revision.pdf.
Koenecke, Allison, Andrew Nam, Emily Lake, Joe Nudell, Minnie Quartey,
Zion Mengesha, Connor Toups, John Rickford, Dan Jurafsky, and Sharad
Goel. 2020. “Racial Disparities in Automated Speech
Recognition.” Proceedings of the National Academy of
Sciences 117 (14): 7684–89. https://doi.org/10.1073/pnas.1915768117.
Koenecke, Allison, and Hal Varian. 2020. “Synthetic Data
Generation for Economists.” https://arxiv.org/abs/2011.01374.
Koenker, Roger, and Achim Zeileis. 2009. “On Reproducible
Econometric Research.” Journal of Applied Econometrics
24 (5): 833–47. https://doi.org/10.1002/jae.1083.
Koerner, Lisbet. 2000. Linnaeus: Nature and Nation. Cambridge:
Harvard University Press.
Kohavi, Ron, Alex Deng, Brian Frasca, Roger Longbotham, Toby Walker, and
Ya Xu. 2012. “Trustworthy Online Controlled Experiments.”
In Proceedings of the 18th ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining -
KDD 12, 1st ed. ACM Press.
https://doi.org/10.1145/2339530.2339653.
Kohavi, Ron, Diane Tang, and Ya Xu. 2020. Trustworthy Online Controlled Experiments: A Practical
Guide to A/B Testing. Cambridge University Press.
Koitsalu, Marie, Martin Eklund, Jan Adolfsson, Henrik Grönberg, and
Yvonne Brandberg. 2018. “Effects of Pre-Notification, Invitation
Length, Questionnaire Length and Reminder on Participation Rate: A
Quasi-Randomised Controlled Trial.” BMC Medical Research
Methodology 18 (3): 1–5. https://doi.org/10.1186/s12874-017-0467-5.
Krantz, Sebastian. 2023. collapse: Advanced and
Fast Data Transformation. https://CRAN.R-project.org/package=collapse.
Kuhn, Max. 2022. tune: Tidy Tuning
Tools. https://CRAN.R-project.org/package=tune.
Kuhn, Max, and Hannah Frick. 2022. poissonreg:
Model Wrappers for Poisson Regression. https://CRAN.R-project.org/package=poissonreg.
Kuhn, Max, and Davis Vaughan. 2022. parsnip: A
Common API to Modeling and Analysis Functions. https://CRAN.R-project.org/package=parsnip.
Kuhn, Max, Davis Vaughan, and Emil Hvitfeldt. 2022. yardstick: Tidy Characterizations of Model
Performance. https://CRAN.R-project.org/package=yardstick.
Kuhn, Max, and Hadley Wickham. 2020. tidymodels: a collection of packages for modeling and
machine learning using tidyverse principles. https://www.tidymodels.org.
———. 2022. recipes: Preprocessing and Feature
Engineering Steps for Modeling. https://CRAN.R-project.org/package=recipes.
Kuriwaki, Shiro, Will Beasley, and Thomas Leeper. 2023. dataverse: R Client for Dataverse 4+
Repositories.
Kuznets, Simon, Lillian Epstein, and Elizabeth Jenks. 1941. National Income and Its Composition,
1919-1938. National Bureau of Economic Research.
Lamott, Anne. 1994. Bird by Bird: Some Instructions on Writing and
Life. Anchor Books.
Landau, William Michael. 2021. “The targets R
Package: A Dynamic Make-Like Function-Oriented Pipeline Toolkit for
Reproducibility and High-Performance Computing.”
Journal of Open Source Software 6 (57): 2959. https://doi.org/10.21105/joss.02959.
Lane, Nick. 2015. “The Unseen World: Reflections on Leeuwenhoek
(1677) ‘Concerning Little Animals’.”
Philosophical Transactions of the Royal Society B: Biological
Sciences 370 (1666): 20140344. https://doi.org/10.1098/rstb.2014.0344.
Laouenan, Morgane, Palaash Bhargava, Jean-Benoît Eyméoud, Olivier
Gergaud, Guillaume Plique, and Etienne Wasmer. 2022. “A Cross-Verified Database of Notable People,
3500BC–2018AD.” Scientific Data 9 (290). https://doi.org/10.1038/s41597-022-01369-4.
Larmarange, Joseph. 2023. labelled:
Manipulating Labelled Data. https://CRAN.R-project.org/package=labelled.
Latour, Bruno. 1996. “On Actor-Network Theory: A Few
Clarifications.” Soziale Welt 47 (4): 369–81. http://www.jstor.org/stable/40878163.
Lauderdale, Benjamin, Delia Bailey, Jack Blumenau, and Douglas Rivers.
2020. “Model-Based Pre-Election Polling for National and
Sub-National Outcomes in the US and UK.” International
Journal of Forecasting 36 (2): 399–413. https://doi.org/10.1016/j.ijforecast.2019.05.012.
Laver, Michael, Kenneth Benoit, and John Garry. 2003. “Extracting
Policy Positions from Political Texts Using Words as Data.”
American Political Science Review 97 (2): 311–31. https://doi.org/10.1017/S0003055403000698.
Leek, Jeff, Blakeley McShane, Andrew Gelman, David Colquhoun, Michèle
Nuijten, and Steven Goodman. 2017. “Five Ways to Fix
Statistics.” Nature 551 (7682): 557–59. https://doi.org/10.1038/d41586-017-07522-z.
Leek, Jeff, and Roger Peng. 2020. “Advanced Data Science
2020.” http://jtleek.com/ads2020/index.html.
Leonelli, Sabina. 2020. “Learning from Data Journeys.” In
Data Journeys in the Sciences, 1–24. Springer International
Publishing. https://doi.org/10.1007/978-3-030-37177-7_1.
Leos-Barajas, Vianey, Theoni Photopoulou, Roland Langrock, Toby
Patterson, Yuuki Watanabe, Megan Murgatroyd, and Yannis Papastamatiou.
2016. “Analysis of Animal Accelerometer Data Using Hidden Markov
Models.” Methods in Ecology and Evolution 8 (2): 161–73.
https://doi.org/10.1111/2041-210x.12657.
Letterman, Clark. 2021. “Q&A: How Pew
Research Center surveyed nearly 30,000 people in India,”
July. https://medium.com/pew-research-center-decoded/q-a-how-pew-research-center-surveyed-nearly-30-000-people-in-india-7c778f6d650e.
Levay, Kevin, Jeremy Freese, and James Druckman. 2016. “The
Demographic and Political Composition of Mechanical Turk
Samples.” SAGE Open 6 (1): 1–17. https://doi.org/10.1177/2158244016636433.
Levine, Judah, Patrizia Tavella, and Martin Milton. 2022. “Towards
a Consensus on a Continuous Coordinated Universal Time.”
Metrologia 60 (1): 014001. https://doi.org/10.1088/1681-7575/ac9da5.
Lewis, Crystal. 2024. Data Management in Large-Scale Education
Research. 1st ed. Chapman & Hall/CRC. https://datamgmtinedresearch.com/index.html.
Lichand, Guilherme, and Sharon Wolf. 2022. “Measuring Child Labor:
Whom Should Be Asked, and Why It Matters,” March. https://doi.org/10.21203/rs.3.rs-1474562/v1.
Light, Richard, Judith Singer, and John Willett. 1990. By Design: Planning Research on Higher
Education. 1st ed. Cambridge: Harvard University Press.
Lima, Renato de, Oliver Phillips, Alvaro Duque, Sebastian Tello, Stuart
Davies, Alexandre Adalardo de Oliveira, Sandra Muller, et al. 2022.
“Making Forest Data Fair and Open.” Nature Ecology
& Evolution 6 (April): 656–58. https://doi.org/10.1038/s41559-022-01738-7.
Lin, Herbert. 2014. “A Proposal to Reduce Government
Overclassification of Information Related to National Security.”
Journal of National Security Law and Policy 7: 443–63.
Lin, Sarah, Ibraheem Ali, and Greg Wilson. 2021. “Ten Quick Tips
for Making Things Findable.” PLOS Computational Biology
16 (12): 1–10. https://doi.org/10.1371/journal.pcbi.1008469.
Lips, Hilary. 2020. Sex and Gender: An Introduction. 7th ed.
Illinois: Waveland Press.
Little, Roderick, and Roger Lewis. 2021. “Estimands, Estimators,
and Estimates.” JAMA 326 (10): 967. https://doi.org/10.1001/jama.2021.2886.
Liu, Emily, Lenny Bronner, and Jeremy Bowers. 2022. “What the
Washington Post Elections Engineering Team Had to Learn about Election
Data.” Washington Post Engineering, April. https://washpost.engineering/what-the-washington-post-elections-engineering-team-had-to-learn-about-election-data-a41603daf9ca.
Lockheed Martin. 2005. “Joint Strike Fighter Air Vehicle C++
Coding Standards For The System Development And Demonstration
Program.” Document Number 2RDU00001 Rev C,
December. https://www.stroustrup.com/JSF-AV-rules.pdf.
Lohr, Sharon. (1999) 2022. Sampling: Design and Analysis. 3rd
ed. Chapman & Hall/CRC.
Loken, Meredith, and Hilary Matfess. 2023. “Introducing the
Women’s Activities in Armed Rebellion (WAAR) Project, 1946-2015.”
Journal of Peace Research.
Lovelace, Robin, Jakub Nowosad, and Jannes Muenchow. 2019. Geocomputation with R. 1st ed. Chapman &
Hall/CRC. https://geocompr.robinlovelace.net.
Lucas, Jack, Reed Merrill, Kelly Blidook, Sandra Breux, Laura Conrad,
Gabriel Eidelman, Royce Koop, et al. 2020. “Canadian
Municipal Elections Database.” Scholars Portal Dataverse.
https://doi.org/10.5683/sp2/4mzjpq.
Lucas, Robert. 1978. “Asset Prices in an Exchange Economy.”
Econometrica 46 (6): 1429–45. https://doi.org/10.2307/1913837.
Luebke, David Martin, and Sybil Milton. 1994. “Locating the
Victim: An Overview of Census-Taking, Tabulation Technology, and
Persecution in Nazi Germany.” IEEE Annals of the History of
Computing 16 (3): 25–39. https://doi.org/10.1109/MAHC.1994.298418.
Lumley, Thomas. 2020. “survey: analysis of
complex survey samples.” https://cran.r-project.org/web/packages/survey/index.html.
Lundberg, Ian, Rebecca Johnson, and Brandon Stewart. 2021. “What
Is Your Estimand? Defining the Target Quantity Connects Statistical
Evidence to Theory.” American Sociological Review 86
(3): 532–65. https://doi.org/10.1177/00031224211004187.
Luscombe, Alex, Kevin Dick, and Kevin Walby. 2021. “Algorithmic
Thinking in the Public Interest: Navigating Technical, Legal, and
Ethical Hurdles to Web Scraping in the Social Sciences.”
Quality & Quantity 56 (3): 1–22. https://doi.org/10.1007/s11135-021-01164-0.
Luscombe, Alex, Jamie Duncan, and Kevin Walby. 2022. “Jumpstarting
the Justice Disciplines: A Computational-Qualitative Approach to
Collecting and Analyzing Text and Image Data in Criminology and Criminal
Justice Studies.” Journal of Criminal Justice Education
33 (2): 151–71. https://doi.org/10.1080/10511253.2022.2027477.
Luscombe, Alex, and Alexander McClelland. 2020. “Policing the
Pandemic: Tracking the Policing of Covid-19 Across Canada,”
April. https://doi.org/10.31235/osf.io/9pn27.
Lyman, Frank. 1981. “The Responsive Classroom Discussion: The
Inclusion of All Students.” Mainstreaming Digest 109:
109–13.
MacDorman, Marian, and Eugene Declercq. 2018. “The Failure of
United States Maternal Mortality Reporting and Its Impact on Women’s
Lives.” Birth 45 (2): 105–8. https://doi.org/10.1111/birt.12333.
Maher, Michael. 1982. “Modelling Association Football
Scores.” Statistica Neerlandica 36 (3): 109–18. https://doi.org/10.1111/j.1467-9574.1982.tb00782.x.
Maier, Maximilian, František Bartoš, Tom Stanley, David Shanks, Adam
Harris, and Eric-Jan Wagenmakers. 2022. “No Evidence for Nudging
After Adjusting for Publication Bias.” Proceedings of the
National Academy of Sciences 119 (31): e2200300119. https://doi.org/10.1073/pnas.2200300119.
Mammoliti, Anthony, Petr Smirnov, Minoru Nakano, Zhaleh Safikhani,
Christopher Eeles, Heewon Seo, Sisira Kadambat Nair, et al. 2021.
“Orchestrating and Sharing Large Multimodal Data for Transparent
and Reproducible Research.” Nature Communications 12
(1). https://doi.org/10.1038/s41467-021-25974-w.
Manski, Charles. 2022. “Inference with Imputed Data: The Allure of
Making Stuff Up.” arXiv. https://doi.org/10.48550/arXiv.2205.07388.
Marchese, David. 2022. “Her Discovery Changed the World. How Does
She Think We Should Use It?” The New York Times, August.
https://www.nytimes.com/interactive/2022/08/15/magazine/jennifer-doudna-crispr-interview.html.
Marlowe, Christopher. 1604. The Tragical History of Doctor
Faustus. https://www.gutenberg.org/files/779/779-h/779-h.htm.
———. 1616. The Tragical History of Doctor Faustus. https://www.gutenberg.org/cache/epub/811/pg811-images.html.
Martin, Charles, and Ben Popper. 2021. “Don’t Push That Button:
Exploring the Software That Flies SpaceX Rockets and Starships.”
The Overflow, December. https://stackoverflow.blog/2021/12/27/dont-push-that-button-exploring-the-software-that-flies-spacex-starships/.
Martínez, Luis. 2022. “How Much Should We Trust the Dictator’s
GDP Growth Estimates?” Journal of Political
Economy 130 (10): 2731–69. https://doi.org/10.1086/720458.
Matias, Nathan, Kevin Munger, Marianne Aubin Le Quere, and Charles
Ebersole. 2021. “The Upworthy Research
Archive, a time series of 32,487 experiments in U.S.
media.” Scientific Data 8 (1): 1–8. https://doi.org/10.1038/s41597-021-00934-7.
Matsumoto, Yukihiro. 2007. “Treating Code as
an Essay.” In Beautiful Code, edited by Andy Oram
and Greg Wilson, 477–81. O’Reilly.
Mattson, Greggor. 2017. “Artificial Intelligence Discovers
Gayface. Sigh.” https://greggormattson.com/2017/09/09/artificial-intelligence-discovers-gayface/amp/.
McCarthy, Fiona M., Tamsin E. M. Jones, Anne E. Kwitek, Cynthia L.
Smith, Peter D. Vize, Monte Westerfield, and Elspeth A. Bruford. 2023.
“The Case for Standardizing Gene Nomenclature in
Vertebrates.” Nature 614 (7948): E31–32. https://doi.org/10.1038/s41586-022-05633-w.
McClelland, Alexander. 2019. “‘Lock This Whore up’:
Legal Violence and Flows of Information Precipitating Personal Violence
Against People Criminalised for HIV-Related Crimes in Canada.”
European Journal of Risk Regulation 10 (1): 132–47. https://doi.org/10.1017/err.2019.20.
McElreath, Richard. (2015) 2020. Statistical
Rethinking: A Bayesian Course with Examples in R and Stan.
2nd ed. Chapman & Hall/CRC.
———. 2020. “Science as Amateur Software Development.”
YouTube, September. https://youtu.be/zwRdO9%5FGGhY.
McIlroy, Doug, Ray Brownrigg, Thomas Minka, and Roger Bivand. 2023.
mapproj: Map Projections. https://CRAN.R-project.org/package=mapproj.
McKenzie, David. 2021. “What Do You Need To
Do To Make A Matching Estimator Convincing? Rhetorical vs Statistical
Checks.” World Bank Blogs—Development Impact,
February. https://blogs.worldbank.org/impactevaluations/what-do-you-need-do-make-matching-estimator-convincing-rhetorical-vs-statistical.
McKinney, Wes. (2011) 2022. Python for Data Analysis. 3rd ed.
https://wesmckinney.com/book/.
McPhee, John. 2017. Draft No. 4. 1st ed. Farrar, Straus and
Giroux.
McQuire, Scott. 2019. “One Map to Rule Them All? Google Maps as
Digital Technical Object.” Communication and the Public
4 (2): 150–65. https://doi.org/10.1177/2057047319850192.
Mellon, Jonathan. 2024. “Rain, Rain, Go Away: 194 Potential
Exclusion-restriction Violations for Studies Using Weather as an
Instrumental Variable.” American Journal of Political
Science, 1–18. https://doi.org/10.1111/ajps.12894.
Meng, Xiao-Li. 1994. “Multiple-Imputation Inferences with
Uncongenial Sources of Input.” Statistical Science 9
(4): 538–58. https://doi.org/10.1214/ss/1177010269.
———. 2012. “You Want Me to Analyze Data I Don’t Have? Are You
Insane?” Shanghai Archives of Psychiatry 24 (5):
297–301. https://doi.org/10.3969/j.issn.1002-0829.2012.05.011.
———. 2018. “Statistical Paradises and Paradoxes in Big Data (I):
Law of Large Populations, Big Data Paradox, and the 2016 US Presidential
Election.” The Annals of Applied Statistics 12 (2):
685–726. https://doi.org/10.1214/18-AOAS1161SF.
———. 2021. “What Are the Values of Data, Data Science, or Data
Scientists?” Harvard Data Science Review 3 (1). https://doi.org/10.1162/99608f92.ee717cf7.
Merali, Zeeya. 2010. “Computational Science: ...Error.”
Nature 467 (7317): 775–77. https://doi.org/10.1038/467775a.
Miceli, Milagros, Julian Posada, and Tianling Yang. 2022.
“Studying up Machine Learning Data.” Proceedings of the
ACM on Human-Computer Interaction 6 (January): 1–14.
https://doi.org/10.1145/3492853.
Michener, William. 2015. “Ten Simple Rules for Creating a Good
Data Management Plan.” PLOS Computational Biology 11
(10): e1004525. https://doi.org/10.1371/journal.pcbi.1004525.
Mill, James. 1817. The History of British India. 1st ed. https://books.google.ca/books?id=Orw_AAAAcAAJ.
Miller, Greg. 2014. “The Cartographer Who’s
Transforming Map Design.” Wired, October. https://www.wired.com/2014/10/cindy-brewer-map-design/.
Miller, Michael, and Joseph Sutherland. 2022. “The Effect of
Gender on Interruptions at Congressional Hearings.” American
Political Science Review, 1–19. https://doi.org/10.1017/S0003055422000260.
Mills, David L. 1991. “Internet Time Synchronization: The Network
Time Protocol.” IEEE Transactions on Communications 39
(10): 1482–93.
Mindell, David. 2008. Digital Apollo: Human and
Machine in Spaceflight. 1st ed. Cambridge: The MIT Press.
Mineault, Patrick, and The Good Research Code Handbook Community. 2021.
“The Good Research Code Handbook.” https://doi.org/10.5281/zenodo.5796873.
Minsky, Yaron. 2011. “OCaml for the
Masses.” Communications of the ACM 54 (11):
53–58. https://doi.org/10.1145/2018396.2018413.
———. 2015. “Automated Trading and OCaml with Yaron Minsky.”
Hackers — Software Engineering Daily, November. https://softwareengineeringdaily.com/2015/11/09/automated-trading-and-ocaml-with-yaron-minsky/.
Mitchell, Alanna. 2022a. “Get Ready for the New, Improved
Second.” The New York Times, April. https://www.nytimes.com/2022/04/25/science/time-second-measurement.html.
———. 2022b. “Time Has Run Out for the Leap Second.” The
New York Times, November. https://www.nytimes.com/2022/11/14/science/time-leap-second.html.
Mitchell, Margaret, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy
Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and
Timnit Gebru. 2019. “Model Cards for Model Reporting.”
Proceedings of the Conference on Fairness, Accountability, and
Transparency, January. https://doi.org/10.1145/3287560.3287596.
Mitrovski, Alen, Xiaoyan Yang, and Matthew Wankiewicz. 2020. “Joe
Biden Projected to Win Popular Vote in 2020 US Election.” https://github.com/matthewwankiewicz/US_election_forecast.
Miyakawa, Tsuyoshi. 2020. “No Raw Data, No Science: Another
Possible Source of the Reproducibility Crisis.” Molecular
Brain 13 (1): 1–6. https://doi.org/10.1186/s13041-020-0552-2.
Mok, Lillio, Samuel Way, Lucas Maystre, and Ashton Anderson. 2022.
“The Dynamics of Exploration on Spotify.” In
Proceedings of the International AAAI Conference on Web and Social
Media, 16:663–74. https://doi.org/10.1609/icwsm.v16i1.19324.
Molanphy, Chris. 2012. “100 & Single: Three Rules to Define
the Term ‘One-Hit Wonder’ in 2012.” The Village
Voice, September. https://www.villagevoice.com/2012/09/10/100-single-three-rules-to-define-the-term-one-hit-wonder-in-2012/.
Morange, Michel. 2016. A History of Biology. New Jersey:
Princeton University Press.
Moyer, Brian, and Abe Dunn. 2020. “Measuring the Gross Domestic
Product (GDP): The Ultimate Data Science Project.” Harvard Data
Science Review 2 (1). https://doi.org/10.1162/99608f92.414caadb.
Mullard, Asher. 2021. “Half of Top Cancer Studies Fail
High-Profile Reproducibility Effort.” Nature 600 (7889):
368–69. https://doi.org/10.1038/d41586-021-03691-0.
Müller, Kirill. 2020. here: A Simpler Way to
Find Your Files. https://CRAN.R-project.org/package=here.
Müller, Kirill, Tobias Schieferdecker, and Patrick Schratz. 2019.
Visualization, Transformation and Reporting with the Tidyverse.
https://krlmlr.github.io/vistransrep/.
Müller, Kirill, and Lorenz Walthert. 2022. styler: Non-Invasive Pretty Printing of R
Code. https://CRAN.R-project.org/package=styler.
Müller, Kirill, and Hadley Wickham. 2022. tibble: Simple Data Frames. https://CRAN.R-project.org/package=tibble.
Murphy, Heather. 2017. “Why Stanford Researchers Tried to Create a
‘Gaydar’ Machine.” The New York Times,
October. https://www.nytimes.com/2017/10/09/science/stanford-sexual-orientation-study.html.
National Academies of Sciences, Engineering, and Medicine. 2019.
Reproducibility and Replicability in Science. 1st ed. National
Academies Press. https://doi.org/10.17226/25303.
Nelder, John. 1999. “From Statistics to Statistical
Science.” Journal of the Royal Statistical Society: Series D
(The Statistician) 48 (2): 257–69. https://doi.org/10.1111/1467-9884.00187.
Nelder, John, and Robert Wedderburn. 1972. “Generalized Linear
Models.” Journal of the Royal Statistical Society: Series A
(General) 135 (3): 370–84. https://doi.org/10.2307/2344614.
Neufeld, Anna, and Daniela Witten. 2021. “Discussion of Breiman’s
‘Two Cultures’: From Two Cultures to One.” Observational
Studies 7 (1): 171–74. https://doi.org/10.1353/obs.2021.0004.
Neufeld, Michael. 2002. “Wernher von Braun, the SS, and
Concentration Camp Labor: Questions of Moral, Political, and Criminal
Responsibility.” German Studies Review 25 (1): 57–78. https://doi.org/10.2307/1433245.
Neuwirth, Erich. 2022. RColorBrewer: ColorBrewer
Palettes. https://CRAN.R-project.org/package=RColorBrewer.
Newman, Daniel. 2014. “Missing Data: Five Practical
Guidelines.” Organizational Research Methods 17 (4):
372–411. https://doi.org/10.1177/1094428114548590.
Neyman, Jerzy. 1934. “On the Two Different Aspects of the
Representative Method: The Method of Stratified Sampling and the Method
of Purposive Selection.” Journal of the Royal Statistical
Society 97 (4): 558–625. https://doi.org/10.2307/2342192.
Nix, Justin, and M. James Lozada. 2020. “Police Killings of
Unarmed Black Americans: A Reassessment of Community Mental Health
Spillover Effects,” January. https://doi.org/10.31235/osf.io/ajz2q.
Nobles, Melissa. 2002. “Racial Categorization and
Censuses.” In Census and Identity: The Politics of Race,
Ethnicity, and Language in National Censuses, edited by David
Kertzer and Dominique Arel, 43–70. Cambridge: Cambridge University
Press. https://doi.org/10.1017/CBO9780511606045.003.
Northcutt, Curtis, Anish Athalye, and Jonas Mueller. 2021.
“Pervasive Label Errors in Test Sets Destabilize Machine Learning
Benchmarks.” In Proceedings of the 35th Conference on Neural
Information Processing Systems Track on Datasets and Benchmarks. https://doi.org/10.48550/arXiv.2103.14749.
Obermeyer, Ziad, Brian Powers, Christine Vogeli, and Sendhil
Mullainathan. 2019. “Dissecting Racial Bias in an Algorithm Used
to Manage the Health of Populations.” Science 366
(6464): 447–53. https://doi.org/10.1126/science.aax2342.
Oberski, Daniel, and Frauke Kreuter. 2020. “Differential Privacy
and Social Science: An Urgent
Puzzle.” Harvard Data Science Review 2 (1).
https://doi.org/10.1162/99608f92.63a22079.
OECD. 2014. “The Essential Macroeconomic Aggregates.” In
Understanding National Accounts, 13–46. OECD. https://doi.org/10.1787/9789264214637-2-en.
———. 2022. Quarterly GDP. https://data.oecd.org/gdp/quarterly-gdp.htm.
Ooms, Jeroen. 2014. “The jsonlite Package: A
Practical and Consistent Mapping Between JSON Data and R
Objects.” arXiv:1403.2805 [stat.CO]. https://arxiv.org/abs/1403.2805.
———. 2022a. openssl: Toolkit for Encryption,
Signatures and Certificates Based on OpenSSL. https://CRAN.R-project.org/package=openssl.
———. 2022b. pdftools: Text Extraction,
Rendering and Converting of PDF Documents. https://CRAN.R-project.org/package=pdftools.
———. 2022c. ssh: Secure Shell (SSH) Client for
R. https://CRAN.R-project.org/package=ssh.
———. 2022d. tesseract: Open Source OCR
Engine. https://CRAN.R-project.org/package=tesseract.
Open Science Collaboration. 2015. “Estimating the Reproducibility
of Psychological Science.” Science 349 (6251): aac4716.
https://doi.org/10.1126/science.aac4716.
Orwell, George. 1946. Politics and the English Language. https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/politics-and-the-english-language/.
Osborne, Jason. 2012. Best Practices in Data
Cleaning: A Complete Guide to Everything You Need to Do Before and After
Collecting Your Data. SAGE Publications.
Osgood, D. Wayne. 2000. “Poisson-Based Regression Analysis of
Aggregate Crime Rates.” Journal of Quantitative
Criminology 16 (1): 21–43. https://doi.org/10.1023/a:1007521427059.
Palmer Station Antarctica LTER, and Kristen Gorman. 2020.
“Structural Size Measurements and Isotopic Signatures of Foraging
Among Adult Male and Female Adélie Penguins (Pygoscelis Adeliae) Nesting
Along the Palmer Archipelago Near Palmer Station, 2007-2009.” https://doi.org/10.6073/PASTA/98B16D7D563F265CB52372C8CA99E60F.
Pasek, Josh. 2015. “Predicting Elections:
Considering Tools to Pool the Polls.” Public Opinion
Quarterly 79 (2): 594–619. https://doi.org/10.1093/poq/nfu060.
Patki, Neha, Roy Wedge, and Kalyan Veeramachaneni. 2016. “The
Synthetic Data Vault.” In 2016 IEEE International Conference
on Data Science and Advanced Analytics (DSAA), 399–410. https://doi.org/10.1109/DSAA.2016.49.
Paullada, Amandalynne, Inioluwa Deborah Raji, Emily Bender, Emily
Denton, and Alex Hanna. 2021. “Data and Its (Dis)contents: A
Survey of Dataset Development and Use in Machine Learning
Research.” Patterns 2 (11): 100336. https://doi.org/10.1016/j.patter.2021.100336.
Pavlik, Kaylin. 2019. “Understanding + Classifying Genres Using
Spotify Audio Features.” https://www.kaylinpavlik.com/classifying-songs-genres/.
Pedersen, Thomas Lin. 2022. patchwork: The
Composer of Plots. https://CRAN.R-project.org/package=patchwork.
Penrose, Carly. 2024. “Deadly Fires: Risk of Death, Injury Highest
in Toronto’s Poor Neighbourhoods.” CBC News, April. https://www.cbc.ca/news/canada/toronto/fatal-fires-lower-income-1.7177356.
Perepolkin, Dmytro. 2022. polite: Be Nice on
the Web. https://CRAN.R-project.org/package=polite.
Perkel, Jeffrey. 2021. “Ten Computer Codes That Transformed
Science.” Nature 589 (7842): 344–48. https://doi.org/10.1038/d41586-021-00075-2.
———. 2023. “The Sleight-of-Hand Trick That Can Simplify Scientific
Computing.” Nature 617 (7959): 212–13. https://doi.org/10.1038/d41586-023-01469-0.
Phillips, Alban. 1958. “The Relation Between Unemployment and the
Rate of Change of Money Wage Rates in the United Kingdom,
1861-1957.” Economica 25 (100): 283–99. https://doi.org/10.1111/j.1468-0335.1958.tb00003.x.
Piller, Charles. 2022. “Blots on a Field?” Science
377 (6604): 358–63. https://doi.org/10.1126/science.ade0209.
Pineau, Joelle, Philippe Vincent-Lamarre, Koustuv Sinha, Vincent
Larivière, Alina Beygelzimer, Florence d’Alché-Buc, Emily Fox, and Hugo
Larochelle. 2021. “Improving Reproducibility in Machine Learning
Research (a Report from the NeurIPS 2019 Reproducibility
Program).” Journal of Machine Learning Research 22
(164): 1–20. http://jmlr.org/papers/v22/20-303.html.
Pitman, Jim. 1993. Probability. 1st ed. New York: Springer. https://doi.org/10.1007/978-1-4612-4374-8.
Plant, Anne, and Robert Hanisch. 2020. “Reproducibility in
Science: A Metrology Perspective.” Harvard Data Science
Review 2 (4). https://doi.org/10.1162/99608f92.eb6ddee4.
Podlogar, Tim, Peter Leo, and James Spragg. 2022. “Using VO2max as a marker of training status in
athletes—Can we do better?” Journal of Applied
Physiology 133 (6): 144–47. https://doi.org/10.1152/japplphysiol.00723.2021.
Preece, Donald Arthur. 1981. “Distributions of Final Digits in
Data.” The Statistician 30 (1): 31. https://doi.org/10.2307/2987702.
Prévost, Jean-Guy, and Jean-Pierre Beaud. 2015. Statistics, Public
Debate and the State, 1800–1945: A Social, Political and Intellectual
History of Numbers. Routledge.
Python Software Foundation. 2024. Python
Language Reference, version 3.13.0. https://docs.python.org/3/index.html.
R Core Team. 2024. R: A Language and Environment for Statistical
Computing. Vienna, Austria: R Foundation for Statistical Computing.
https://www.R-project.org/.
R Special Interest Group on Databases (R-SIG-DB), Hadley Wickham, and
Kirill Müller. 2022. DBI: R Database Interface. https://CRAN.R-project.org/package=DBI.
Radcliffe, Nicholas. 2023. Test-Driven Data
Analysis (Python TDDA library). https://tdda.readthedocs.io/en/latest/index.html.
Register, Yim. 2020a. “Introduction to Sampling and
Randomization.” YouTube, November. https://youtu.be/U272FFxG8LE.
———. 2020b. “Data Science Ethics in 6 Minutes.”
YouTube, December. https://youtu.be/mA4gypAiRYU.
Rehaag, Sean. 2023. “Supreme Court of Canada Bulk Decisions
Dataset.” Refugee Law Laboratory. https://refugeelab.ca/bulk-data/scc.
Reid, Nancy. 2003. “Asymptotics and the Theory of
Inference.” The Annals of Statistics 31 (6): 1695–1731.
https://doi.org/10.1214/aos/1074290325.
Richardson, Neal, Ian Cook, Nic Crane, Dewey Dunnington, Romain
François, Jonathan Keane, Dragoș Moldovan-Grünfeld, Jeroen Ooms, and
Apache Arrow. 2023. arrow: Integration to
Apache Arrow. https://CRAN.R-project.org/package=arrow.
Riederer, Emily. 2020. “Column Names as Contracts,”
September. https://emilyriederer.netlify.app/post/column-name-contracts/.
———. 2021. “Causal Design Patterns for Data Analysts,”
January. https://emilyriederer.netlify.app/post/causal-design-patterns/.
Riffe, Tim, Enrique Acosta, Enrique José Acosta, Diego Manuel Aburto,
Anna Alburez-Gutierrez, Ainhoa Altová, Ugofilippo Alustiza, et al. 2021.
“Data Resource Profile: COVerAGE-DB: A
Global Demographic Database of COVID-19 Cases and
Deaths.” International Journal of Epidemiology 50 (2):
390–390f. https://doi.org/10.1093/ije/dyab027.
Rilke, Rainer Maria. (1929) 2014. Letters to a Young Poet.
Penguin Classics.
Roberts, Margaret, Brandon Stewart, and Dustin Tingley. 2019.
“stm: An R Package for
Structural Topic Models.” Journal of Statistical
Software 91 (2): 1–40. https://doi.org/10.18637/jss.v091.i02.
Robinson, David, Alex Hayes, and Simon Couch. 2022. broom: Convert Statistical Objects into Tidy
Tibbles. https://CRAN.R-project.org/package=broom.
Robinson, Emily, and Jacqueline Nolis. 2020. Build a Career in Data
Science. Shelter Island: Manning Publications. https://livebook.manning.com/book/build-a-career-in-data-science.
Rockoff, Hugh. 2019. “On the Controversies Behind the Origins of
the Federal Economic Statistics.” Journal of Economic
Perspectives 33 (1): 147–64. https://doi.org/10.1257/jep.33.1.147.
Romer, Paul. 2018. “Jupyter, Mathematica, and the Future of the
Research Paper,” April. https://paulromer.net/jupyter-mathematica-and-the-future-of-the-research-paper/.
Rose, Angela, Rebecca Grais, Denis Coulombier, and Helga Ritter. 2006.
“A Comparison of Cluster and Systematic Sampling Methods for
Measuring Crude Mortality.” Bulletin of the World Health
Organization 84: 290–96. https://doi.org/10.2471/blt.05.029181.
Rosenau, James N. 1999. “A Transformed Observer in a Transforming
World.” Studia Diplomatica 52 (1/2): 5–14. http://www.jstor.org/stable/44838096.
Ross, Casey. 2022. “How a Decades-Old Database Became a Hugely
Profitable Dossier on the Health of 270 Million Americans.”
Stat, February. https://www.statnews.com/2022/02/01/ibm-watson-health-marketscan-data/.
Rubinstein, Benjamin, and Francesco Alda. 2017. “Pain-Free Random
Differential Privacy with Sensitivity Sampling.” In 34th
International Conference on Machine Learning (ICML’2017).
Rudis, Bob. 2020. hrbrthemes: Additional
Themes, Theme Components and Utilities for
“ggplot2”. https://CRAN.R-project.org/package=hrbrthemes.
Ruggles, Steven, Catherine Fitch, Diana Magnuson, and Jonathan
Schroeder. 2019. “Differential Privacy and Census Data:
Implications for Social and Economic Research.” AEA Papers
and Proceedings 109 (May): 403–8. https://doi.org/10.1257/pandp.20191107.
Ruggles, Steven, Sarah Flood, Sophia Foster, Ronald Goeken, Jose Pacas,
Megan Schouweiler, and Matthew Sobek. 2021. “IPUMS USA: Version
11.0.” Minneapolis, MN: IPUMS. https://doi.org/10.18128/d010.v11.0.
Ryan, Philip. 2015. “Keeping a Lab Notebook.”
YouTube, May. https://youtu.be/-MAIuaOL64I.
Sadowski, Caitlin, Emma Söderberg, Luke Church, Michal Sipko, and
Alberto Bacchelli. 2018. “Modern Code Review: A Case Study at
Google.” In Proceedings of the 40th International Conference
on Software Engineering: Software Engineering in Practice, 181–90.
ICSE-SEIP ’18. New York, NY, USA: Association for Computing Machinery.
https://doi.org/10.1145/3183519.3183525.
Sakshaug, Joseph, Ting Yan, and Roger Tourangeau. 2010.
“Nonresponse Error, Measurement Error, and Mode of Data
Collection: Tradeoffs in a Multi-Mode Survey of Sensitive and
Non-Sensitive Items.” Public Opinion Quarterly 74 (5):
907–33. https://doi.org/10.1093/poq/nfq057.
Salganik, Matthew. 2018. Bit by Bit: Social Research in the Digital
Age. New Jersey: Princeton University Press.
Salganik, Matthew, Peter Sheridan Dodds, and Duncan Watts. 2006.
“Experimental Study of Inequality and Unpredictability in an
Artificial Cultural Market.” Science 311 (5762): 854–56.
https://doi.org/10.1126/science.1121066.
Salganik, Matthew, and Douglas Heckathorn. 2004. “Sampling and
Estimation in Hidden Populations Using Respondent-Driven
Sampling.” Sociological Methodology 34 (1): 193–240. https://doi.org/10.1111/j.0081-1750.2004.00152.x.
Sambasivan, Nithya, Shivani Kapania, Hannah Highfill, Diana Akrong,
Praveen Paritosh, and Lora Aroyo. 2021. “‘Everyone Wants to
Do the Model Work, Not the Data Work’: Data Cascades in
High-Stakes AI.” In Proceedings of the 2021
CHI Conference on Human Factors in Computing Systems.
ACM. https://doi.org/10.1145/3411764.3445518.
Samuel, Arthur. 1959. “Some Studies in Machine Learning Using the
Game of Checkers.” IBM Journal of Research and
Development 3 (3): 210–29. https://doi.org/10.1147/rd.33.0210.
Saulnier, Lucile, Siddharth Karamcheti, Hugo Laurençon, Léo Tronchon,
Thomas Wang, Victor Sanh, Amanpreet Singh, et al. 2022. “Putting
Ethical Principles at the Core of the Research Lifecycle.” https://huggingface.co/blog/ethical-charter-multimodal.
Savage, Van, and Pamela Yeh. 2019. “Novelist Cormac
McCarthy’s Tips on How to Write a Great Science
Paper.” Nature 574 (7778): 441–42. https://doi.org/10.1038/d41586-019-02918-5.
Schaffner, Brian, Stephen Ansolabehere, and Sam Luks. 2021.
“Cooperative Election Study Common Content,
2020.” Harvard Dataverse. https://doi.org/10.7910/DVN/E9N6PH.
Schloerke, Barret, and Jeff Allen. 2022. plumber: An API Generator for R. https://CRAN.R-project.org/package=plumber.
Schmertmann, Carl. 2022. “UN API Test,” July. https://bonecave.schmert.net/un-api-example.html.
Schofield, Alexandra, Måns Magnusson, and David Mimno. 2017.
“Pulling Out the Stops: Rethinking Stopword Removal for Topic
Models.” In Proceedings of the 15th Conference of the
European Chapter of the Association for Computational
Linguistics: Volume 2, Short Papers, 432–36. Valencia, Spain:
Association for Computational Linguistics. https://aclanthology.org/E17-2069.
Schofield, Alexandra, Måns Magnusson, Laure Thompson, and David Mimno.
2017. “Understanding Text Pre-Processing for Latent Dirichlet
Allocation.” In ACL Workshop for Women in NLP (WiNLP).
https://www.cs.cornell.edu/~xanda/winlp2017.pdf.
Schofield, Alexandra, Laure Thompson, and David Mimno. 2017.
“Quantifying the Effects of Text Duplication on Semantic
Models.” In Proceedings of the 2017 Conference on Empirical
Methods in Natural Language Processing, 2737–47. Copenhagen,
Denmark: Association for Computational Linguistics. https://doi.org/10.18653/v1/D17-1290.
Scott, James. 1998. Seeing Like a State. Yale University Press.
Sekhon, Jasjeet, and Rocío Titiunik. 2017. “Understanding
Regression Discontinuity Designs as Observational Studies.”
Observational Studies 3 (2): 174–82. https://doi.org/10.1353/obs.2017.0005.
Sen, Amartya. 1980. “Description as
Choice.” Oxford Economic Papers 32 (3): 353–69.
https://doi.org/10.1093/oxfordjournals.oep.a041484.
Shankar, Shreya, Rolando Garcia, Joseph Hellerstein, and Aditya
Parameswaran. 2022. “Operationalizing Machine Learning: An
Interview Study.” arXiv. https://doi.org/10.48550/ARXIV.2209.09125.
Si, Yajuan. 2020. “On the Use of Auxiliary Variables in Multilevel
Regression and Poststratification.” https://arxiv.org/abs/2011.00360.
Sides, John, Lynn Vavreck, and Christopher Warshaw. 2021. “The
Effect of Television Advertising in United States Elections.”
American Political Science Review, 1–17. https://doi.org/10.1017/s000305542100112x.
Silberzahn, Raphael, Eric Uhlmann, Daniel Martin, Pasquale Anselmi,
Frederik Aust, Eli Awtrey, Štěpán Bahnı́k, et al. 2018. “Many
Analysts, One Data Set: Making Transparent How Variations in Analytic
Choices Affect Results.” Advances in Methods and Practices in
Psychological Science 1 (3): 337–56. https://doi.org/10.1177/2515245917747646.
Silge, Julia, and David Robinson. 2016. “tidytext: Text Mining and Analysis Using Tidy Data
Principles in R.” The Journal of Open Source
Software 1 (3). https://doi.org/10.21105/joss.00037.
Silver, Nate. 2020. “We Fixed an Issue with How Our Primary
Forecast Was Calculating Candidates’ Demographic Strengths.”
FiveThirtyEight, February. https://fivethirtyeight.com/features/we-fixed-a-mistake-in-how-our-primary-forecast-was-calculating-candidates-demographic-strengths/.
Simonsohn, Uri. 2013. “Just Post It: The Lesson from Two Cases of
Fabricated Data Detected by Statistics Alone.” Psychological
Science 24 (10): 1875–88. https://doi.org/10.1177/0956797613480366.
Simpkinson, Scott. 1971. “Testing to Ensure
Mission Success.” In What Made Apollo a Success,
edited by NASA, 21–29.
Simpson, Edward. 1951. “The Interpretation of Interaction in
Contingency Tables.” Journal of the Royal Statistical
Society: Series B (Methodological) 13 (2): 238–41. https://doi.org/10.1111/j.2517-6161.1951.tb00088.x.
Smith, Jessie, Saleema Amershi, Solon Barocas, Hanna Wallach, and
Jennifer Wortman Vaughan. 2022. “REAL ML: Recognizing, Exploring,
and Articulating Limitations of Machine Learning Research.”
2022 ACM Conference on Fairness, Accountability, and Transparency
(FAccT ’22). https://doi.org/10.1145/3531146.3533122.
Smith, Matthew. 2018. “Should Milk Go in a Cup of Tea First or
Last?” July. https://yougov.co.uk/topics/consumer/articles-reports/2018/07/30/should-milk-go-cup-tea-first-or-last.
Smith, Richard. 2002. “A Statistical Assessment of Buchanan’s Vote
in Palm Beach County.” Statistical Science 17 (4):
441–57. https://doi.org/10.1214/ss/1049993203.
Sobek, Matthew, and Steven Ruggles. 1999. “The IPUMS Project: An
Update.” Historical Methods: A Journal of Quantitative and
Interdisciplinary History 32 (3): 102–10. https://doi.org/10.1080/01615449909598930.
Somers, James. 2015. “Toolkits for the
Mind.” MIT Technology Review, April. https://www.technologyreview.com/2015/04/02/168469/toolkits-for-the-mind/.
———. 2017. “Torching the Modern-Day Library of Alexandria.”
The Atlantic, April. https://www.theatlantic.com/technology/archive/2017/04/the-tragedy-of-google-books/523320/.
———. 2018. “The Scientific Paper Is Obsolete.” The
Atlantic, April. https://www.theatlantic.com/science/archive/2018/04/the-scientific-paper-is-obsolete/556676/.
Spear, Mary Eleanor. 1952. Charting Statistics. https://archive.org/details/ChartingStatistics_201801/.
Sprint, Gina, and Jason Conci. 2019. “Mining GitHub Classroom
Commit Behavior in Elective and Introductory Computer Science
Courses.” Journal of Computing Sciences in Colleges 35
(1): 76–84.
Staicu, Ana-Maria. 2017. “Interview with Nancy Reid.”
International Statistical Review 85 (3): 381–403. https://doi.org/10.1111/insr.12237.
Staniak, Mateusz, and Przemysław Biecek. 2019. “The Landscape of R Packages for Automated Exploratory
Data Analysis.” The R Journal 11
(2): 347–69. https://doi.org/10.32614/RJ-2019-033.
Stantcheva, Stefanie. 2023. “How to Run Surveys: A Guide to
Creating Your Own Identifying Variation and Revealing the
Invisible.” Annual Review of Economics 15 (1): 205–34.
https://doi.org/10.1146/annurev-economics-091622-010157.
Statistics Canada. 2020. “Sex at Birth and Gender: Technical
Report on Changes for the 2021 Census.” Statistics Canada. https://www12.statcan.gc.ca/census-recensement/2021/ref/98-20-0002/982000022020002-eng.pdf.
———. 2023. “Guide to the Census of Population, 2021.”
Statistics Canada. https://www12.statcan.gc.ca/census-recensement/2021/ref/98-304/98-304-x2021001-eng.pdf.
Steckel, Richard. 1991. “The Quality of Census Data for Historical
Inquiry: A Research Agenda.” Social Science History 15
(4): 579–99. https://doi.org/10.2307/1171470.
Steele, Fiona. 2007. “Multilevel Models for Longitudinal
Data.” Journal of the Royal Statistical Society Series
A: Statistics in Society 171 (1): 5–19. https://doi.org/10.1111/j.1467-985x.2007.00509.x.
Steele, Fiona, Anna Vignoles, and Andrew Jenkins. 2007. “The
Effect of School Resources on Pupil Attainment: A Multilevel
Simultaneous Equation Modelling Approach.” Journal of the
Royal Statistical Society Series A: Statistics in Society 170 (3):
801–24. https://doi.org/10.1111/j.1467-985x.2007.00476.x.
Stevens, Wallace. 1934. The Idea of Order at Key West. https://www.poetryfoundation.org/poems/43431/the-idea-of-order-at-key-west.
Steyvers, Mark, and Tom Griffiths. 2006. “Probabilistic Topic
Models.” In Latent Semantic Analysis: A Road to Meaning,
edited by T. Landauer, D. McNamara, S. Dennis, and W. Kintsch. https://cocosci.princeton.edu/tom/papers/SteyversGriffiths.pdf.
Stigler, Stephen. 1978. “Francis Ysidro Edgeworth,
Statistician.” Journal of the Royal Statistical
Society. Series A (General) 141 (3): 287–322. https://doi.org/10.2307/2344804.
———. 1986. The History of Statistics. Massachusetts: Belknap
Harvard.
Stock, James, and Francesco Trebbi. 2003. “Retrospectives: Who
Invented Instrumental Variable Regression?” Journal of
Economic Perspectives 17 (3): 177–94. https://doi.org/10.1257/089533003769204416.
Stolberg, Michael. 2006. “Inventing the Randomized Double-Blind
Trial: The Nuremberg Salt Test of 1835.” Journal of the Royal
Society of Medicine 99 (12): 642–43. https://doi.org/10.1177/014107680609901216.
Stoler, Ann Laura. 2002. “Colonial Archives and the Arts of
Governance.” Archival Science 2 (March): 87–109. https://doi.org/10.1007/bf02435632.
Stolley, Paul. 1991. “When Genius Errs: R. A. Fisher and the Lung
Cancer Controversy.” American Journal of Epidemiology
133 (5): 416–25. https://doi.org/10.1093/oxfordjournals.aje.a115904.
Stommes, Drew, P. M. Aronow, and Fredrik Sävje. 2023. “On the
Reliability of Published Findings Using the Regression Discontinuity
Design in Political Science.” Research & Politics 10
(2). https://doi.org/10.1177/2053168023116645.
Student. 1908. “The Probable Error of a Mean.”
Biometrika 6 (1): 1–25. https://doi.org/10.2307/2331554.
Sunstein, Cass, and Lucia Reisch. 2017. The Economics of Nudge.
Routledge.
Suriyakumar, Vinith, Nicolas Papernot, Anna Goldenberg, and Marzyeh
Ghassemi. 2021. “Chasing Your Long Tails.” In
Proceedings of the 2021 ACM Conference on Fairness,
Accountability, and Transparency. https://doi.org/10.1145/3442188.3445934.
Swain, Larry. 1985. “Basic Principles of Questionnaire
Design.” Survey Methodology 11 (2): 161–70.
Sylvester, Christine, Anastasia Ershova, Aleksandra Khokhlova, Nikoleta
Yordanova, and Zachary Greene. 2023. “ParlEE
plenary speeches V2 data set: Annotated full-text of 15.1 million
sentence-level plenary speeches of six EU legislative
chambers.” Harvard Dataverse. https://doi.org/10.7910/DVN/VOPK0E.
Szaszi, Barnabas, Anthony Higney, Aaron Charlton, Andrew Gelman, Ignazio
Ziano, Balazs Aczel, Daniel Goldstein, David Yeager, and Elizabeth
Tipton. 2022. “No Reason to Expect Large and Consistent Effects of
Nudge Interventions.” Proceedings of the National Academy of
Sciences 119 (31): e2200732119. https://doi.org/10.1073/pnas.2200732119.
Taddy, Matt. 2019. Business Data Science. 1st ed. McGraw Hill.
Taflaga, Marija, and Matthew Kerby. 2019. “Who Does What Work in a
Ministerial Office: Politically Appointed Staff and the Descriptive
Representation of Women in Australian Political Offices,
1979–2010.” Political Studies 68 (2):
463–85. https://doi.org/10.1177/0032321719853459.
Tal, Eran. 2020. “Measurement in
Science.” In The Stanford Encyclopedia of
Philosophy, edited by Edward Zalta, Fall 2020. Metaphysics Research Lab,
Stanford University. https://plato.stanford.edu/archives/fall2020/entries/measurement-science/.
Tang, John. 2015. “Pollution Havens and the
Trade in Toxic Chemicals: Evidence from U.S. Trade Flows.”
Ecological Economics 112 (April): 150–60. https://doi.org/10.1016/j.ecolecon.2015.02.022.
Tang, Jun, Aleksandra Korolova, Xiaolong Bai, Xueqiang Wang, and
Xiaofeng Wang. 2017. “Privacy Loss in Apple’s Implementation of
Differential Privacy on MacOS 10.12.” arXiv. https://doi.org/10.48550/arXiv.1709.02753.
Tausanovitch, Chris, and Lynn Vavreck. 2021. “Democracy Fund
+ UCLA Nationscape Project.” https://www.voterstudygroup.org/data/nationscape.
Taylor, Adam. 2015. “New Zealand Says No to Jedis.” The
Washington Post, September. https://www.washingtonpost.com/news/worldviews/wp/2015/09/29/new-zealand-says-no-to-jedis/.
Teate, Renée. 2022. SQL for Data Scientists. Wiley.
The Economist. 2013. “Johnson: Those Six Little Rules: George
Orwell on Writing,” July. https://www.economist.com/prospero/2013/07/29/johnson-those-six-little-rules.
———. 2022a. “What Spotify Data Show about the Decline of
English,” January. https://www.economist.com/interactives/graphic-detail/2022/01/29/what-spotify-data-show-about-the-decline-of-english.
———. 2022b. “Will Emmanuel Macron Win a Second Term?”
April. https://www.economist.com/interactive/france-2022/forecast.
———. 2022c. “France’s Presidential Election: The Second Round in
Detail,” April. https://www.economist.com/interactive/france-2022/results-round-two.
The Washington Post. 2023. “Fatal Force Database.” https://github.com/washingtonpost/data-police-shootings.
The White House. 2023. “Recommendations on the Best Practices for
the Collection of Sexual Orientation and Gender Identity Data on Federal
Statistical Survey,” January. https://www.whitehouse.gov/wp-content/uploads/2023/01/SOGI-Best-Practices.pdf.
Thieme, Nick. 2018. “R Generation.” Significance
15 (4): 14–19. https://doi.org/10.1111/j.1740-9713.2018.01169.x.
Thistlethwaite, Donald, and Donald Campbell. 1960.
“Regression-Discontinuity Analysis: An Alternative to the Ex Post
Facto Experiment.” Journal of Educational Psychology 51
(6): 309–17. https://doi.org/10.1037/h0044319.
Thompson, Charlie, Daniel Antal, Josiah Parry, Donal Phipps, and Tom
Wolff. 2022. spotifyr: R Wrapper for the
“Spotify” Web API. https://CRAN.R-project.org/package=spotifyr.
Thomson-DeVeaux, Amelia, Laura Bronner, and Damini Sharma. 2021.
“Cities Spend Millions On Police Misconduct
Every Year. Here’s Why It’s So Difficult to Hold Departments
Accountable.” FiveThirtyEight, February. https://fivethirtyeight.com/features/police-misconduct-costs-cities-millions-every-year-but-thats-where-the-accountability-ends/.
Thornhill, John. 2021. “Lunch with the FT: Mathematician Hannah
Fry.” Financial Times, July. https://www.ft.com/content/a5e33e5a-99b9-4bbc-948f-8a527c7675c3.
Tierney, Nicholas, Di Cook, Miles McBain, and Colin Fay. 2021. naniar: Data Structures, Summaries, and Visualisations
for Missing Data. https://CRAN.R-project.org/package=naniar.
Tierney, Nicholas, and Karthik Ram. 2020. “A Realistic Guide to
Making Data Available Alongside Code to Improve Reproducibility.”
https://arxiv.org/abs/2002.11626.
———. 2021. “Common-Sense Approaches to Sharing Tabular Data
Alongside Publication.” Patterns 2 (12): 100368. https://doi.org/10.1016/j.patter.2021.100368.
Timbers, Tiffany. 2020. canlang: Canadian
Census language data. https://ttimbers.github.io/canlang/.
Timbers, Tiffany, Trevor Campbell, and Melissa Lee. 2022. Data
Science: A First Introduction. Chapman & Hall/CRC. https://datasciencebook.ca.
Tolley, Erin, and Mireille Paquet. 2021. “Gender, Municipal Party
Politics, and Montreal’s First Woman Mayor.” Canadian Journal
of Urban Research 30 (1): 40–52. https://cjur.uwinnipeg.ca/index.php/cjur/article/view/323.
Tourangeau, Roger, Lance Rips, and Kenneth Rasinski. 2000. The
Psychology of Survey Response. 1st ed. Cambridge University Press.
https://doi.org/10.1017/CBO9780511819322.003.
Touvron, Hugo, Thibaut Lavril, Gautier Izacard, Xavier Martinet,
Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, et al. 2023.
“LLaMA: Open and Efficient Foundation
Language Models.” arXiv. https://doi.org/10.48550/ARXIV.2302.13971.
Trisovic, Ana, Matthew Lau, Thomas Pasquier, and Mercè Crosas. 2022.
“A Large-Scale Study on Research Code Quality and
Execution.” Scientific Data 9 (1). https://doi.org/10.1038/s41597-022-01143-6.
Tukey, John. 1962. “The Future of Data Analysis.” The
Annals of Mathematical Statistics 33 (1): 1–67. https://doi.org/10.1214/aoms/1177704711.
———. 1977. Exploratory Data Analysis.
Turcotte, Alexi, Aviral Goel, Filip Křikava, and Jan Vitek. 2020.
“Designing Types for R, Empirically.” Proceedings of
the ACM on Programming Languages 4
(OOPSLA): 1–25. https://doi.org/10.1145/3428249.
UN IGME. 2021. “Levels and Trends in Child Mortality,
2021.” https://childmortality.org/wp-content/uploads/2021/12/UNICEF-2021-Child-Mortality-Report.pdf.
Urban, Steve, Rangarajan Sreenivasan, and Vineet Kannan. 2016.
“It’s All A/Bout Testing: The Netflix
Experimentation Platform.” Netflix Technology
Blog, April. https://netflixtechblog.com/its-all-a-bout-testing-the-netflix-experimentation-platform-4e1ca458c15.
Ushey, Kevin. 2022. renv: Project
Environments. https://CRAN.R-project.org/package=renv.
van Buuren, Stef, and Karin Groothuis-Oudshoorn. 2011. “mice: Multivariate Imputation by Chained Equations in
R.” Journal of Statistical Software 45 (3): 1–67.
https://doi.org/10.18637/jss.v045.i03.
Van den Broeck, Jan, Solveig Argeseanu Cunningham, Roger Eeckels, and
Kobus Herbst. 2005. “Data Cleaning: Detecting, Diagnosing, and
Editing Data Abnormalities.” PLOS Medicine 2 (10): e267.
https://doi.org/10.1371/journal.pmed.0020267.
van der Loo, Mark. 2022. The Data Validation Cookbook. https://data-cleaning.github.io/validate/.
van der Loo, Mark, and Edwin De Jonge. 2021. “Data Validation Infrastructure for R.”
Journal of Statistical Software 97 (10): 1–33. https://doi.org/10.18637/jss.v097.i10.
Vanderplas, Susan, Dianne Cook, and Heike Hofmann. 2020. “Testing
Statistical Charts: What Makes a Good Graph?” Annual Review
of Statistics and Its Application 7: 61–88. https://doi.org/10.1146/annurev-statistics-031219-041252.
Vanhoenacker, Mark. 2015. Skyfaring: A Journey with a Pilot.
1st ed. Alfred A. Knopf.
Varin, Cristiano, Nancy Reid, and David Firth. 2011. “An Overview
of Composite Likelihood Methods.” Statistica Sinica,
5–42. https://www.jstor.org/stable/24309261.
Varner, Maddy, and Aaron Sankin. 2020. “Suckers List: How
Allstate’s Secret Auto Insurance Algorithm Squeezes Big
Spenders.” The Markup, February. https://themarkup.org/allstates-algorithm/2020/02/25/car-insurance-suckers-list.
Vavreck, Lynn, and Chris Tausanovitch. 2021. “Democracy Fund
+ UCLA Nationscape Project User Guide.” https://www.voterstudygroup.org/data/nationscape.
Vickers, Andrew, and Emily Vertosick. 2016. “An Empirical Study of
Race Times in Recreational Endurance Runners.”
BMC Sports Science, Medicine and Rehabilitation 8
(1). https://doi.org/10.1186/s13102-016-0052-y.
Vidoni, Melina. 2021. “Evaluating Unit
Testing Practices in R Packages.” In 2021 IEEE/ACM
43rd International Conference on Software Engineering (ICSE),
1523–34. https://doi.org/10.1109/ICSE43902.2021.00136.
von Bergmann, Jens, Dmitry Shkolnik, and Aaron Jacobs. 2021. cancensus: R package to access, retrieve, and work with
Canadian Census data and geography. https://mountainmath.github.io/cancensus/.
Walby, Kevin, and Alex Luscombe. 2019. Freedom of Information and
Social Science Research Design. Routledge.
Walker, Kyle. 2022. Analyzing US Census Data. Chapman and
Hall/CRC. https://walker-data.com/census-r/index.html.
Walker, Kyle, and Matt Herman. 2022. tidycensus: Load US Census Boundary and Attribute Data as
“tidyverse” and “sf”-Ready Data
Frames. https://CRAN.R-project.org/package=tidycensus.
Wallach, Hanna. 2018. “Computational Social Science ≠ Computer Science + Social Data.”
Communications of the ACM 61 (3): 42–44. https://doi.org/10.1145/3132698.
Wan, Mengting, and Julian J. McAuley. 2018. “Item Recommendation
on Monotonic Behavior Chains.” In Proceedings of the 12th
ACM Conference on Recommender Systems, RecSys 2018,
Vancouver, BC, Canada, October 2-7, 2018, edited by Sole Pera,
Michael D. Ekstrand, Xavier Amatriain, and John O’Donovan, 86–94.
ACM. https://doi.org/10.1145/3240323.3240369.
Wan, Mengting, Rishabh Misra, Ndapa Nakashole, and Julian J. McAuley.
2019. “Fine-Grained Spoiler Detection from Large-Scale Review
Corpora.” In Proceedings of the 57th Conference of the
Association for Computational Linguistics, ACL 2019,
Florence, Italy, July 28-August 2, 2019, Volume 1: Long Papers,
edited by Anna Korhonen, David R. Traum, and Lluís Màrquez, 2605–10.
Association for Computational Linguistics. https://doi.org/10.18653/v1/p19-1248.
Wang, Wei, David Rothschild, Sharad Goel, and Andrew Gelman. 2015.
“Forecasting Elections with Non-Representative Polls.”
International Journal of Forecasting 31 (3): 980–91. https://doi.org/10.1016/j.ijforecast.2014.06.001.
Wang, Yilun, and Michal Kosinski. 2018. “Deep Neural Networks Are
More Accurate Than Humans at Detecting Sexual Orientation from Facial
Images.” Journal of Personality and Social Psychology
114 (2): 246–57. https://doi.org/10.1037/pspa0000098.
Wardrop, Robert. 1995. “Simpson’s Paradox and the Hot Hand in
Basketball.” The American Statistician 49 (1): 24–28. https://doi.org/10.2307/2684806.
Ware, James. 1989. “Investigating Therapies of Potentially Great
Benefit: ECMO.” Statistical Science 4 (4): 298–306. https://doi.org/10.1214/ss/1177012384.
Wasserman, Larry. 2005. All of Statistics. Springer.
Wei, Eugene. 2017. Remove the Legend to Become One. https://www.eugenewei.com/blog/2017/11/13/remove-the-legend.
Wei, LJ, and S Durham. 1978. “The Randomized Play-the-Winner Rule
in Medical Trials.” Journal of the American Statistical
Association 73 (364): 840–43. https://doi.org/10.2307/2286290.
Weinberg, Gerald. 1971. The Psychology of Computer Programming.
New York: Van Nostrand Reinhold Company.
Weissgerber, Tracey, Natasa Milic, Stacey Winham, and Vesna Garovic.
2015. “Beyond Bar and Line Graphs: Time for a New Data
Presentation Paradigm.” PLOS Biology 13 (4): e1002128.
https://doi.org/10.1371/journal.pbio.1002128.
Whitby, Andrew. 2020. The Sum of the
People. New York: Basic Books.
Whitelaw, James. 1805. An Essay on the Population of Dublin. Being
the Result of an Actual Survey Taken in 1798, with Great Care and
Precision, and Arranged in a Manner Entirely New. Graisberry and
Campbell.
Wicherts, Jelte, Marjan Bakker, and Dylan Molenaar. 2011.
“Willingness to Share Research Data Is Related to the Strength of
the Evidence and the Quality of Reporting of Statistical
Results.” PLOS ONE 6 (11): e26828. https://doi.org/10.1371/journal.pone.0026828.
Wickham, Hadley. 2009. “Manipulating Data.” In ggplot2, 157–75. Springer New York. https://doi.org/10.1007/978-0-387-98141-3_9.
———. 2011. “testthat: Get Started with
Testing.” The R Journal 3: 5–10. https://journal.r-project.org/archive/2011-1/RJournal%5F2011-1%5FWickham.pdf.
———. 2014. “Tidy Data.” Journal of Statistical
Software 59 (10): 1–23. https://doi.org/10.18637/jss.v059.i10.
———. 2016. ggplot2: Elegant Graphics for Data
Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org.
———. 2017. tidyverse: Easily Install and Load
the “Tidyverse”. https://CRAN.R-project.org/package=tidyverse.
———. 2018. “Whole Game.” YouTube, January. https://youtu.be/go5Au01Jrvs.
———. 2019. Advanced R. 2nd ed. Chapman and Hall/CRC.
https://adv-r.hadley.nz.
———. 2020. Tidyverse. https://www.tidyverse.org/.
———. 2021a. babynames: US Baby Names
1880-2017. https://CRAN.R-project.org/package=babynames.
———. 2021b. Mastering Shiny. 1st ed. O’Reilly Media. https://mastering-shiny.org.
———. 2021c. The Tidyverse Style Guide. https://style.tidyverse.org/index.html.
———. 2022a. R Packages. 2nd ed. O’Reilly Media. https://r-pkgs.org.
———. 2022b. rvest: Easily Harvest (Scrape) Web
Pages. https://CRAN.R-project.org/package=rvest.
———. 2022c. stringr: Simple, Consistent
Wrappers for Common String Operations. https://CRAN.R-project.org/package=stringr.
———. 2023a. forcats: Tools for Working with
Categorical Variables (Factors). https://CRAN.R-project.org/package=forcats.
———. 2023b. httr: Tools for Working with URLs
and HTTP. https://CRAN.R-project.org/package=httr.
Wickham, Hadley, Mara Averick, Jenny Bryan, Winston Chang, Lucy
D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019.
“Welcome to the Tidyverse.”
Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.
Wickham, Hadley, and Jennifer Bryan. 2023. readxl: Read Excel Files. https://CRAN.R-project.org/package=readxl.
Wickham, Hadley, Jennifer Bryan, and Malcolm Barrett. 2022. usethis: Automate Package and Project Setup.
https://CRAN.R-project.org/package=usethis.
Wickham, Hadley, Mine Çetinkaya-Rundel, and Garrett Grolemund. (2016)
2023. R for Data Science. 2nd ed. O’Reilly Media. https://r4ds.hadley.nz.
Wickham, Hadley, Romain François, Lionel Henry, and Kirill Müller. 2022.
dplyr: A Grammar of Data
Manipulation. https://CRAN.R-project.org/package=dplyr.
Wickham, Hadley, Maximilian Girlich, and Edgar Ruiz. 2022. dbplyr: A “dplyr” Back End for
Databases. https://CRAN.R-project.org/package=dbplyr.
Wickham, Hadley, and Lionel Henry. 2022. purrr:
Functional Programming Tools. https://CRAN.R-project.org/package=purrr.
Wickham, Hadley, Jim Hester, and Jenny Bryan. 2022. readr: Read Rectangular Text Data. https://CRAN.R-project.org/package=readr.
Wickham, Hadley, Jim Hester, Winston Chang, and Jenny Bryan. 2022.
devtools: Tools to Make Developing R Packages
Easier. https://CRAN.R-project.org/package=devtools.
Wickham, Hadley, Jim Hester, and Jeroen Ooms. 2021. xml2: Parse XML. https://CRAN.R-project.org/package=xml2.
Wickham, Hadley, Evan Miller, and Danny Smith. 2023. haven: Import and Export “SPSS”
“Stata” and “SAS” Files. https://CRAN.R-project.org/package=haven.
Wickham, Hadley, and Dana Seidel. 2022. scales:
Scale Functions for Visualization. https://CRAN.R-project.org/package=scales.
Wickham, Hadley, and Lisa Stryjewski. 2011. “40 Years of
Boxplots,” November. https://vita.had.co.nz/papers/boxplots.pdf.
Wickham, Hadley, Davis Vaughan, and Maximilian Girlich. 2023. tidyr: Tidy Messy Data. https://CRAN.R-project.org/package=tidyr.
Wiessner, Polly. 2014. “Embers of Society: Firelight Talk Among
the Ju/’hoansi Bushmen.” Proceedings of the National Academy
of Sciences 111 (39): 14027–35. https://doi.org/10.1073/pnas.1404212111.
Wilde, Oscar. 1891. The Picture of Dorian Gray. https://www.gutenberg.org/files/174/174-h/174-h.htm.
Wilford, John Noble. 1977. “Wernher von Braun, Rocket Pioneer,
Dies.” The New York Times, June. https://www.nytimes.com/1977/06/18/archives/wernher-von-braun-rocket-pioneer-dies-wernher-von-braun-pioneer-in.html.
Wilkinson, Leland. 2005. The Grammar of Graphics. 2nd ed.
Springer.
Wilkinson, Mark, Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle
Appleton, Myles Axton, Arie Baak, Niklas Blomberg, et al. 2016.
“The FAIR Guiding Principles for Scientific Data Management and
Stewardship.” Scientific Data 3 (1): 1–9. https://doi.org/10.1038/sdata.2016.18.
Wilson, Greg, Jenny Bryan, Karen Cranston, Justin Kitzes, Lex
Nederbragt, and Tracy Teal. 2017. “Good Enough Practices in
Scientific Computing.” PLOS Computational Biology 13
(6): 1–20. https://doi.org/10.1371/journal.pcbi.1005510.
Wong, Julia Carrie. 2020. “One Year Inside Trump’s Monumental
Facebook Campaign.” The Guardian, January. https://www.theguardian.com/us-news/2020/jan/28/donald-trump-facebook-ad-campaign-2020-election.
Wood, Simon. 2015. Core Statistics. Cambridge University Press.
https://www.maths.ed.ac.uk/%7Eswood34/core-statistics.pdf.
World Health Organization. 2019. “Trends in Maternal Mortality
2000 to 2017: Estimates by WHO, UNICEF, UNFPA, World Bank Group and the
United Nations Population Division.” https://apps.who.int/iris/handle/10665/327596.
Wright, Philip. 1928. The Tariff on Animal and Vegetable Oils.
New York: Macmillan Company.
Wu, Changbao, and Mary Thompson. 2020. Sampling Theory and
Practice. Springer.
Xie, Yihui. 2019. “TinyTeX: A lightweight,
cross-platform, and easy-to-maintain LaTeX distribution based on TeX
Live.” TUGboat 40 (1): 30–32. https://tug.org/TUGboat/Contents/contents40-1.html.
———. 2023. knitr: A General-Purpose Package for
Dynamic Report Generation in R. https://yihui.org/knitr/.
Xu, Ya. 2020. “Causal Inference Challenges in Industry: A
Perspective from Experiences at LinkedIn.” YouTube,
July. https://youtu.be/OoKsLAvyIYA.
Yoshioka, Alan. 1998. “Use of Randomisation in the Medical
Research Council’s Clinical Trial of Streptomycin in Pulmonary
Tuberculosis in the 1940s.” BMJ 317 (7167): 1220–23. https://doi.org/10.1136/bmj.317.7167.1220.
Zhang, Ping, XunPeng Shi, YongPing Sun, Jingbo Cui, and Shuai Shao.
2019. “Have China’s provinces achieved their
targets of energy intensity reduction? Reassessment based on nighttime
lighting data.” Energy Policy 128 (May): 276–83.
https://doi.org/10.1016/j.enpol.2019.01.014.
Zhang, Susan, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen,
Shuohui Chen, Christopher Dewan, et al. 2022. “OPT: Open
Pre-Trained Transformer Language Models.” arXiv. https://doi.org/10.48550/arXiv.2205.01068.
Zimmer, Michael. 2018. “Addressing Conceptual Gaps in Big Data
Research Ethics: An Application of Contextual Integrity.”
Social Media + Society 4 (2): 1–11. https://doi.org/10.1177/2056305118768300.
Zinsser, William. 1976. On Writing Well. New York:
HarperCollins.
Zook, Matthew, Solon Barocas, danah boyd, Kate Crawford, Emily Keller,
Seeta Peña Gangadharan, Alyssa Goodman, et al. 2017. “Ten Simple
Rules for Responsible Big Data Research.” PLOS Computational
Biology 13 (3): e1005399. https://doi.org/10.1371/journal.pcbi.1005399.