Abadie, Alberto, Susan Athey, Guido Imbens, and Jeffrey Wooldridge. 2017. “When Should You Adjust Standard Errors for Clustering?” Working Paper 24003. Working Paper Series. National Bureau of Economic Research.
Abelson, Harold, and Gerald Jay Sussman. 1996. Structure and Interpretation of Computer Programs. Massachusetts: The MIT Press.
Abeysooriya, Mandhri, Megan Soria, Mary Sravya Kasu, and Mark Ziemann. 2021. “Gene Name Errors: Lessons Not Learned.” PLOS Computational Biology 17 (7): 1–13.
Acemoglu, Daron, Simon Johnson, and James Robinson. 2001. “The Colonial Origins of Comparative Development: An Empirical Investigation.” American Economic Review 91 (5): 1369–1401.
Achen, Christopher. 1978. “Measuring Representation.” American Journal of Political Science 22 (3): 475–510.
Akerlof, George. 1970. “The Market for ‘Lemons’: Quality Uncertainty and the Market Mechanism.” The Quarterly Journal of Economics.
Alexander, Monica. 2019a. “Reproducibility in Demographic Research.”
———. 2019b. “The Concentration and Uniqueness of Baby Names in Australia and the US,” January.
———. 2019c. “Analyzing Name Changes After Marriage Using a Non-Representative Survey,” August.
———. 2021. “Overcoming Barriers to Sharing Code.” YouTube, February.
Alexander, Monica, and Leontine Alkema. 2021. “A Bayesian Cohort Component Projection Model to Estimate Adult Populations at the Subnational Level in Data-Sparse Settings.”
Alexander, Monica, Mathew Kiang, and Magali Barbieri. 2018. “Trends in Black and White Opioid Mortality in the United States, 1979–2015.” Epidemiology 29 (5): 707–15.
Alexander, Rohan, and Monica Alexander. 2021. “The Increased Effect of Elections and Changing Prime Ministers on Topics Discussed in the Australian Federal Parliament Between 1901 and 2018.”
Alexander, Rohan, and Paul Hodgetts. 2021. AustralianPoliticians: Provides Datasets About Australian Politicians.
Alexander, Rohan, and A Mahfouz. 2021. heapsofpapers: Easily Download Heaps of PDF and CSV Files.
Alexander, Rohan, and Zachary Ward. 2018. “Age at Arrival and Assimilation During the Age of Mass Migration.” The Journal of Economic History 78 (3): 904–37.
Allaire, JJ, Rich Iannone, Alison Presmanes Hill, and Yihui Xie. 2021. distill: “R Markdown” Format for Scientific and Technical Writing.
Allen, Jeff. 2021. plumberDeploy: Plumber Deployment.
Alsan, Marcella, and Amy Finkelstein. 2021. “Beyond Causality: Additional Benefits of Randomized Controlled Trials for Improving Health Care Delivery.” The Milbank Quarterly 99 (4): 864–81.
Alsan, Marcella, and Marianne Wanamaker. 2018. “Tuskegee and the Health of Black Men.” The Quarterly Journal of Economics 133 (1): 407–55.
Amaka, Ofunne, and Amber Thomas. 2021. “The Naked Truth: How the Names of 6,816 Complexion Products Can Reveal Bias in Beauty.” The Pudding, March.
American Medical Association and New York Academy of Medicine. 1848. Code of Medical Ethics. Academy of Medicine.
Andersen, Robert, and David Armstrong II. 2021. Presenting Statistical Results Effectively. London: Sage.
Anderson, Margo, and Stephen Fienberg. 1999. Who Counts? The Politics of Census-Taking in Contemporary America. Russell Sage Foundation.
Andrews, David, and Agnes Herzberg. 2012. Data: A Collection of Problems from Many Fields for the Student and Research Worker. New York: Springer Science & Business Media.
Angelucci, Charles, and Julia Cagé. 2019. “Newspapers in Times of Low Advertising Revenues.” American Economic Journal: Microeconomics 11 (3): 319–64.
Angrist, Joshua, and Jörn-Steffen Pischke. 2010. “The Credibility Revolution in Empirical Economics: How Better Research Design Is Taking the Con Out of Econometrics.” Journal of Economic Perspectives 24 (2): 3–30.
Annas, George. 2003. “HIPAA Regulations: A New Era of Medical-Record Privacy?” New England Journal of Medicine 348: 1486–90.
Aprameya, Lavanya. 2020. “Improving Duolingo, One Experiment at a Time.” Duolingo Blog, January.
Arel-Bundock, Vincent. 2021a. modelsummary: Summary Tables and Plots for Statistical Models and Data: Beautiful, Customizable, and Publication-Ready.
———. 2021b. WDI: World Development Indicators and Other World Bank Data.
Arel-Bundock, Vincent, Ryan Briggs, Hristos Doucouliagos, Marco Mendoza Aviña, and T. D. Stanley. 2022. “Quantitative Political Science Research Is Greatly Underpowered.”
Armstrong, Zan. 2022. “Stop Aggregating Away the Signal in Your Data.” The Overflow, March.
Arnold, Jeffrey. 2021. ggthemes: Extra Themes, Scales and Geoms for “ggplot2”.
Asquith, Brian, Brad Hershbein, Tracy Kugler, Shane Reed, Steven Ruggles, Jonathan Schroeder, Steve Yesiltepe, and David Van Riper. 2022. “Assessing the Impact of Differential Privacy on Measures of Population and Racial Residential Segregation.” Harvard Data Science Review, no. Special Issue 2.
Athey, Susan, and Guido Imbens. 2017a. “The Econometrics of Randomized Experiments.” In Handbook of Field Experiments, 73–140. Elsevier.
———. 2017b. “The State of Applied Econometrics: Causality and Policy Evaluation.” Journal of Economic Perspectives 31 (2): 3–32.
Athey, Susan, Guido Imbens, Jonas Metzger, and Evan Munro. 2021. “Using Wasserstein Generative Adversarial Networks for the Design of Monte Carlo Simulations.” Journal of Econometrics.
Au, Randy. 2020. “Data Cleaning IS Analysis, Not Grunt Work,” September.
———. 2022. “Celebrating Everyone Counting Things,” February.
Bache, Stefan Milton, and Hadley Wickham. 2022. magrittr: A Forward-Pipe Operator for R.
Backus, John. 1981. “The History of FORTRAN I, II, and III.” In History of Programming Languages, edited by Richard Wexelblat, 25–74. Academic Press.
Bailey, Rosemary. 2008. Design of Comparative Experiments. Cambridge: Cambridge University Press.
Baker, Reg, Michael Brick, Nancy Bates, Mike Battaglia, Mick Couper, Jill Dever, Krista Gile, and Roger Tourangeau. 2013. “Summary Report of the AAPOR Task Force on Non-Probability Sampling.” Journal of Survey Statistics and Methodology 1 (2): 90–143.
Bandy, Jack, and Nicholas Vincent. 2021. “Addressing ‘Documentation Debt’ in Machine Learning Research: A Retrospective Datasheet for BookCorpus.” arXiv.
Banerjee, Abhijit, and Esther Duflo. 2011. Poor Economics: A Radical Rethinking of the Way to Fight Global Poverty. New York: PublicAffairs.
Banerjee, Abhijit, Esther Duflo, Rachel Glennerster, and Cynthia Kinnan. 2015. “The Miracle of Microfinance? Evidence from a Randomized Evaluation.” American Economic Journal: Applied Economics 7 (1): 22–53.
Barba, Lorena. 2018. “Terminologies for Reproducible Research.”
Barrett, Malcolm. 2021a. Data Science as an Atomic Habit.
———. 2021b. ggdag: Analyze and Create Elegant Directed Acyclic Graphs.
Barron, Alexander, Jenny Huang, Rebecca Spang, and Simon DeDeo. 2018. “Individuals, Institutions, and Innovation in the Debates of the French Revolution.” Proceedings of the National Academy of Sciences 115 (18): 4607–12.
Bates, Douglas, Martin Mächler, Ben Bolker, and Steve Walker. 2015. “Fitting Linear Mixed-Effects Models Using lme4.” Journal of Statistical Software 67 (1): 1–48.
Baumer, Benjamin, Daniel Kaplan, and Nicholas Horton. 2021. Modern Data Science With R. 2nd ed. Chapman & Hall/CRC.
Baumgartner, Jason, Savvas Zannettou, Brian Keegan, Megan Squire, and Jeremy Blackburn. 2020. “The Pushshift Reddit Dataset.” arXiv.
Baumgartner, Peter. 2021. “Ways I Use Testing as a Data Scientist,” December.
Beaumont, Jean-Francois. 2020. “Are Probability Surveys Bound to Disappear for the Production of Official Statistics?” Survey Methodology 46 (1): 1–29.
Beauregard, Katrine, and Jill Sheppard. 2021. “Antiwomen but Proquota: Disaggregating Sexism and Support for Gender Quota Policies.” Political Psychology 42 (2): 219–37.
Becker, Richard, Allan Wilks, Ray Brownrigg, Thomas Minka, and Alex Deckmyn. 2021. maps: Draw Geographical Maps.
Bender, Emily, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. 2021. “On the Dangers of Stochastic Parrots.” In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. ACM.
Bengtsson, Henrik. 2021. “A Unifying Framework for Parallel and Distributed Processing in R Using Futures.” The R Journal 13 (2): 208–27.
Bensinger, Greg. 2020. “Google Redraws the Borders on Maps Depending on Who’s Looking.” Washington Post, February.
Berkson, Joseph. 1946. “Limitations of the Application of Fourfold Table Analysis to Hospital Data.” Biometrics Bulletin 2 (3): 47–53.
Berners-Lee, Timothy. 1989. “Information Management: A Proposal.”
Berry, Donald. 1989. “Comment: Ethics and ECMO.” Statistical Science 4 (4): 306–10.
Bertrand, Marianne, and Sendhil Mullainathan. 2004. “Are Emily and Greg More Employable Than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination.” American Economic Review 94 (4): 991–1013.
Bethlehem, R. A. I., J. Seidlitz, S. R. White, J. W. Vogel, K. M. Anderson, C. Adamson, S. Adler, et al. 2022. “Brain Charts for the Human Lifespan.” Nature, April.
Bickel, Peter, Eugene Hammel, and William O’Connell. 1975. “Sex Bias in Graduate Admissions: Data from Berkeley: Measuring Bias Is Harder Than Is Usually Assumed, and the Evidence Is Sometimes Contrary to Expectation.” Science 187 (4175): 398–404.
Biderman, Stella, Kieran Bicheno, and Leo Gao. 2022. “Datasheet for the Pile.”
Birkmeyer, John, Jonathan Finks, Amanda O’Reilly, Mary Oerline, Arthur Carlin, Andre Nunn, Justin Dimick, Mousumi Banerjee, and Nancy Birkmeyer. 2013. “Surgical Skill and Complication Rates After Bariatric Surgery.” New England Journal of Medicine 369 (15): 1434–42.
Blair, Ed, Seymour Sudman, Norman M Bradburn, and Carol Stocking. 1977. “How to Ask Questions about Drinking and Sex: Response Effects in Measuring Consumer Behavior.” Journal of Marketing Research 14 (3): 316–21.
Blair, Graeme, Jasper Cooper, Alexander Coppock, and Macartan Humphreys. 2019. “Declaring and Diagnosing Research Designs.” American Political Science Review 113: 838–59.
Blair, Graeme, Jasper Cooper, Alexander Coppock, Macartan Humphreys, and Luke Sonnet. 2021. estimatr: Fast Estimators for Design-Based Inference.
Blair, James. 2019. Democratizing R with Plumber APIs.
Bland, Martin, and Douglas Altman. 1986. “Statistical Methods for Assessing Agreement Between Two Methods of Clinical Measurement.” The Lancet 327 (8476): 307–10.
Blei, David. 2012. “Probabilistic Topic Models.” Communications of the ACM 55 (4): 77–84.
Blei, David, and John Lafferty. 2009. “Topic Models.” In Text Mining, edited by Ashok Srivastava and Mehran Sahami, 101–24. Chapman & Hall/CRC.
Blei, David, Andrew Ng, and Michael Jordan. 2003. “Latent Dirichlet Allocation.” Journal of Machine Learning Research 3 (Jan): 993–1022.
Bloom, Howard, Andrew Bell, and Kayla Reiman. 2020. “Using Data from Randomized Trials to Assess the Likely Generalizability of Educational Treatment-Effect Estimates from Regression Discontinuity Designs.” Journal of Research on Educational Effectiveness 13 (3): 488–517.
Boland, Philip. 1984. “A Biographical Glimpse of William Sealy Gosset.” The American Statistician 38 (3): 179–83.
Bolton, Ruth, and Randall Chapman. 1986. “Searching for Positive Returns at the Track.” Management Science 32 (August): 1040–60.
Borghi, John, and Ana Van Gulick. 2022. “Promoting Open Science Through Research Data Management.” Harvard Data Science Review 4 (3).
Borkin, Michelle, Zoya Bylinskii, Nam Wook Kim, Constance May Bainbridge, Chelsea Yeh, Daniel Borkin, Hanspeter Pfister, and Aude Oliva. 2015. “Beyond Memorability: Visualization Recognition and Recall.” IEEE Transactions on Visualization and Computer Graphics 22 (1): 519–28.
Bouguen, Adrien, Yue Huang, Michael Kremer, and Edward Miguel. 2019. “Using Randomized Controlled Trials to Estimate Long-Run Impacts in Development Economics.” Annual Review of Economics 11 (1): 523–61.
Bouie, Jamelle. 2022. “We Still Can’t See American Slavery for What It Was.” New York Times, January.
Bowen, Claire McKay. 2022. Protecting Your Privacy in a Data-Driven World. Chapman & Hall/CRC.
Bowers, Jake, and Maarten Voors. 2016. “How to Improve Your Relationship with Your Future Self.” Revista de Ciencia Política 36 (3): 829–48.
Bowley, Arthur Lyon. 1901. Elements of Statistics. London: P. S. King.
———. 1913. “Working-Class Households in Reading.” Journal of the Royal Statistical Society 76 (7): 672–701.
Boykis, Vicki. 2019. “A Deep Dive on Python Type Hints,” July.
———. 2022. “Duo, the Push, and the Bandits.” Normcore Tech, May.
Bradley, Valerie, Shiro Kuriwaki, Michael Isakov, Dino Sejdinovic, Xiao-Li Meng, and Seth Flaxman. 2021. “Unrepresentative Big Surveys Significantly Overestimated US Vaccine Uptake.” Nature 600 (7890): 695–700.
Braginsky, Mika. 2020. wordbankr: Accessing the Wordbank Database.
Brandt, Allan. 1978. “Racism and Research: The Case of the Tuskegee Syphilis Study.” Hastings Center Report, 21–29.
Brewer, Ken. 2013. “Three Controversies in the History of Survey Sampling.” Survey Methodology 39 (2): 249–63.
Briggs, Ryan. 2021. “Why Does Aid Not Target the Poorest?” International Studies Quarterly 65 (3): 739–52.
Brokowski, Carolyn, and Mazhar Adli. 2019. “CRISPR Ethics: Moral Considerations for Applications of a Powerful Tool.” Journal of Molecular Biology 431 (1): 88–101.
Bronner, Lenny, Emily Liu, and Jeremy Bowers. 2022. “What the Washington Post Elections Engineering Team Had to Learn about Election Data.” Washington Post, April.
Brontë, Charlotte. 1847. Jane Eyre.
———. 1857. The Professor.
Brook, Robert, John Ware, William Rogers, Emmett Keeler, Allyson Ross Davies, Cathy Sherbourne, George Goldberg, Kathleen Lohr, Patricia Camp, and Joseph Newhouse. 1984. “The Effect of Coinsurance on the Health of Adults: Results from the RAND Health Insurance Experiment.”
Brown, Zack. 2018. “A Git Origin Story.” Linux Journal, July.
Bryan, Jenny. 2015. “Naming Things.” Reproducible Science Workshop, May.
———. 2018a. “Excuse Me, Do You Have a Moment to Talk about Version Control?” The American Statistician 72 (1): 20–27.
———. 2018b. “Code Smells and Feels.” YouTube, July.
———. 2020. Happy Git and GitHub for the useR.
Bryan, Jenny, and Jim Hester. 2020. What They Forgot to Teach You About R.
Bryan, Jenny, Jim Hester, David Robinson, and Hadley Wickham. 2019. reprex: Prepare Reproducible Example Code via the Clipboard.
Bryan, Jenny, and Hadley Wickham. 2021. gh: “GitHub” “API”.
Buckheit, Jonathan, and David Donoho. 1995. “Wavelab and Reproducible Research.” In Wavelets and Statistics, 55–81. Springer.
Bueno de Mesquita, Ethan, and Anthony Fowler. 2021. Thinking Clearly with Data: A Guide to Quantitative Reasoning and Analysis. New Jersey: Princeton University Press.
Buhr, Ray. 2017. Using R as a Production Machine Learning Language (Part I).
Buja, Andreas, Dianne Cook, and Deborah F Swayne. 1996. “Interactive High-Dimensional Data Visualization.” Journal of Computational and Graphical Statistics 5 (1): 78–99.
Buolamwini, Joy, and Timnit Gebru. 2018. “Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification.” In Conference on Fairness, Accountability and Transparency, 77–91. PMLR.
Burton, Jason, Nicole Cruz, and Ulrike Hahn. 2021. “Reconsidering Evidence of Moral Contagion in Online Social Networks.” Nature Human Behaviour 5 (12): 1629–35.
Bush, Vannevar. 1945. “As We May Think.” The Atlantic Monthly, July.
Byrd, James Brian, Anna Greene, Deepashree Venkatesh Prasad, Xiaoqian Jiang, and Casey Greene. 2020. “Responsible, Practical Genomic Data Sharing That Accelerates Research.” Nature Reviews Genetics 21 (10): 615–29.
Cahill, Niamh, Michelle Weinberger, and Leontine Alkema. 2020. “What Increase in Modern Contraceptive Use Is Needed in FP2020 Countries to Reach 75% Demand Satisfied by 2030? An Assessment Using the Accelerated Transition Method and Family Planning Estimation Model.” Gates Open Research 4.
Calonico, Sebastian, Matias Cattaneo, Max Farrell, and Rocio Titiunik. 2021. rdrobust: Robust Data-Driven Statistical Inference in Regression-Discontinuity Designs.
Cambon, Jesse, and Christopher Belanger. 2021. “tidygeocoder: Geocoding Made Easy.” Zenodo.
Cardoso, Tom. 2020. “Bias behind bars: A Globe investigation finds a prison system stacked against Black and Indigenous inmates.” The Globe and Mail, October.
Carle, Eric. 1969. The Very Hungry Caterpillar. World Publishing Company.
Carleton, Chris. 2021. wccarleton/conflict-europe: Acce (version v1.0.0). Zenodo.
Carleton, Chris, Dave Campbell, and Mark Collard. 2021. “A Reassessment of the Impact of Temperature Change on European Conflict During the Second Millennium CE Using a Bespoke Bayesian Time-Series Model.” Climatic Change 165 (1): 1–16.
Caro, Robert. 2019. Working. 1st ed. New York: Knopf.
Carroll, Lewis. 1865. Alice’s Adventures in Wonderland. Macmillan.
———. 1871. Through the Looking-Glass. Macmillan.
Cédric, Chambru, and Maneuvrier-Hervieu Paul. 2022. Working Paper Series, Department of Economics, University of Zurich.
Chamberlain, Scott, Hadley Wickham, and Winston Chang. 2021. analogsea: Interface to “Digital Ocean”.
Chambliss, Daniel. 1989. “The Mundanity of Excellence: An Ethnographic Report on Stratification and Olympic Swimmers.” Sociological Theory 7 (1): 70–86.
Chang, Winston, Joe Cheng, JJ Allaire, Carson Sievert, Barret Schloerke, Yihui Xie, Jeff Allen, Jonathan McPherson, Alan Dipert, and Barbara Borges. 2021. shiny: Web Application Framework for R.
Chase, William. 2020. “The Glamour of Graphics.” RStudio Conference, January.
Chawla, Dalmeet Singh. 2020. “Critiqued Coronavirus Simulation Gets Thumbs up from Code-Checking Efforts.” Nature 582: 323–24.
Chellel, Kit. 2018. “The Gambler Who Cracked the Horse-Racing Code.” Bloomberg Businessweek, May.
Chen, Heng, Marie-Hélène Felt, and Christopher Henry. 2018. “2017 Methods-of-Payment Survey: Sample Calibration and Variance Estimation.” Bank of Canada.
Chen, Wei, Xilu Chen, Chang-Tai Hsieh, and Zheng Song. 2019. “A Forensic Examination of China’s National Accounts.” Brookings Papers on Economic Activity, 77–127.
Cheng, Joe, Bhaskar Karambelkar, and Yihui Xie. 2021. leaflet: Create Interactive Web Maps with the JavaScript “Leaflet” Library.
Cheriet, Mohamed, Nawwaf Kharma, Cheng-Lin Liu, and Ching Suen. 2007. Character Recognition Systems: A Guide for Students and Practitioners. Wiley.
Chouldechova, Alexandra, Diana Benavides-Prado, Oleksandr Fialko, and Rhema Vaithianathan. 2018. “A Case Study of Algorithm-Assisted Decision Making in Child Maltreatment Hotline Screening Decisions.” In Proceedings of the 1st Conference on Fairness, Accountability and Transparency, edited by Sorelle Friedler and Christo Wilson, 81:134–48. Proceedings of Machine Learning Research. PMLR.
Chrétien, Jean. 2007. My Years as Prime Minister. 1st ed. Toronto: Knopf Canada.
Christensen, Garret, Allan Dafoe, Edward Miguel, Don Moore, and Andrew Rose. 2019. “A Study of the Impact of Data Sharing on Article Citations Using Journal Policies as a Natural Experiment.” PLoS One 14 (12): e0225883.
Christensen, Garret, Jeremy Freese, and Edward Miguel. 2019. Transparent and Reproducible Social Science Research. California: University of California Press.
Christian, Brian. 2012. “The A/B Test: Inside the Technology That’s Changing the Rules of Business.” Wired, April.
Churchill, Winston. 1956. A History of the English-Speaking Peoples. Cassell.
Cirone, Alexandra, and Arthur Spirling. 2021. “Turning History into Data: Data Collection, Measurement, and Inference in HPE.” Journal of Historical Political Economy 1 (1): 127–54.
City of Toronto. 2021. 2021 Street Needs Assessment.
Clarke, Erik, and Scott Sherrill-Mix. 2017. ggbeeswarm: Categorical Scatter (Violin Point) Plots.
Cleveland, William. 1994. The Elements of Graphing Data. 2nd ed. Hobart Press.
Cohen, Glenn, and Michelle Mello. 2018. “HIPAA and Protecting Health Information in the 21st Century.” JAMA 320 (3): 231.
Cohn, Alain. 2019. “Data and code for: Civic Honesty Around the Globe.” Harvard Dataverse.
Cohn, Alain, Michel André Maréchal, David Tannenbaum, and Christian Lukas Zünd. 2019a. “Civic Honesty Around the Globe.” Science 365 (6448): 70–73.
———. 2019b. “Supplementary Materials for: Civic Honesty Around the Globe.” Science 365 (6448): 70–73.
Cohn, Nate. 2016. “We Gave Four Good Pollsters the Same Raw Data. They Had Four Different Results.” New York Times, September.
Collins, Annie, and Rohan Alexander. 2022. “Reproducibility of COVID-19 Pre-Prints.” Scientometrics.
Colombo, Tommaso, Holger Fröning, Pedro Javier García, and Wainer Vandelli. 2016. “Optimizing the Data-Collection Time of a Large-Scale Data-Acquisition System Through a Simulation Framework.” The Journal of Supercomputing 72 (12): 4546–72.
Cook, Dianne, Nancy Reid, and Emi Tanaka. 2021. “The Foundation Is Available for Thinking about Data Visualization Inferentially.” Harvard Data Science Review 3 (3).
Cooley, David. 2020. mapdeck: Interactive Maps Using “Mapbox GL JS” and “Deck.gl”.
Council of European Union. 2016. “General Data Protection Regulation 2016/679.”
Cox, David. 2018. “In Gentle Praise of Significance Tests.” YouTube, October.
Cox, David, and Nancy Reid. 1987. “Parameter Orthogonality and Approximate Conditional Inference.” Journal of the Royal Statistical Society: Series B (Methodological) 49 (1): 1–18.
Cox, Murray. 2021. “Inside Airbnb—Toronto Data.”
Craiu, Radu. 2019. “The Hiring Gambit: In Search of the Twofer Data Scientist.” Harvard Data Science Review 1 (1).
Cramer, Jan Salomon. 2003. “The Origins of Logistic Regression.” SSRN Electronic Journal.
Crawford, Kate. 2021. Atlas of AI. New Haven: Yale University Press.
Crosby, Alfred. 1997. The Measure of Reality: Quantification in Western Europe, 1250-1600. Cambridge: Cambridge University Press.
Csárdi, Gábor. 2020. gitcreds: Query “git” Credentials from “R”.
Cummins, Neil. 2022. “The Hidden Wealth of English Dynasties, 1892–2016.” The Economic History Review 75 (3): 667–702.
Cunningham, Scott. 2021. Causal Inference: The Mixtape. New Haven: Yale University Press.
D’Ignazio, Catherine, and Lauren F Klein. 2020. Data Feminism. Massachusetts: The MIT Press.
Dagan, Noa, Noam Barda, Eldad Kepten, Oren Miron, Shay Perchik, Mark Katz, Miguel Hernán, Marc Lipsitch, Ben Reis, and Ran Balicer. 2021. “BNT162b2 mRNA Covid-19 Vaccine in a Nationwide Mass Vaccination Setting.” New England Journal of Medicine 384 (15): 1412–23.
Darling, William. 2011. “A Theoretical and Practical Implementation Tutorial on Topic Modeling and Gibbs Sampling.” In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 642–47.
Davis, Darren. 1997. “Nonrandom Measurement Error and Race of Interviewer Effects Among African Americans.” The Public Opinion Quarterly 61 (1): 183–207.
De Jonge, Edwin, and Mark Van Der Loo. 2013. An Introduction to Data Cleaning with R. Statistics Netherlands Heerlen.
Dean, Natalie. 2022. “Tracking COVID-19 Infections: Time for Change.” Nature 602 (7896): 185.
Deaton, Angus. 2010. “Instruments, Randomization, and Learning about Development.” Journal of Economic Literature 48 (2): 424–55.
DeWitt, Helen. 2000. The Last Samurai. 1st ed. United States: Talk Miramax Books.
Dillman, Don, Jolene Smyth, and Leah Christian. 2014. Internet, Phone, Mail, and Mixed-Mode Surveys: The Tailored Design Method. 4th ed. Wiley.
Dolatsara, Hamidreza Ahady, Ying-Ju Chen, Robert Leonard, Fadel Megahed, and Allison Jones-Farmer. 2021. “Explaining Predictive Model Performance: An Experimental Study of Data Preparation and Model Choice.” Big Data, October.
Doll, Richard, and Bradford Hill. 1950. “Smoking and Carcinoma of the Lung.” British Medical Journal 2 (4682): 739–48.
Druckman, James, and Donald Green. 2021. “A New Era of Experimental Political Science.” In Advances in Experimental Political Science, 1–16. Cambridge University Press.
Duflo, Esther. 2020. “Field Experiments and the Practice of Policy.” American Economic Review 110 (7): 1952–73.
Dwork, Cynthia, Frank McSherry, Kobbi Nissim, and Adam Smith. 2006. “Calibrating Noise to Sensitivity in Private Data Analysis.” In Theory of Cryptography Conference, 265–84. Springer.
Dwork, Cynthia, and Aaron Roth. 2013. “The Algorithmic Foundations of Differential Privacy.” Foundations and Trends in Theoretical Computer Science 9 (3-4): 211–407.
Edgeworth, Francis Ysidro. 1885. “Methods of Statistics.” Journal of the Statistical Society of London, 181–217.
Efron, Bradley, and Carl Morris. 1977. “Stein’s Paradox in Statistics.” Scientific American 236 (May): 119–27.
Eghbal, Nadia. 2020. Working in Public: The Making and Maintenance of Open Source Software. California: Stripe Press.
Eisenstein, Michael. 2022. “Need Web Data? Here’s How to Harvest Them.” Nature 607: 200–201.
Elliott, Michael, Brady West, Xinyu Zhang, and Stephanie Coffey. 2022. “The Anchoring Method: Estimation of Interviewer Effects in the Absence of Interpenetrated Sample Assignment.” Survey Methodology 48 (1): 25–48.
Elson, Malte. n.d. “Question Wording and Item Formulation.”
Enns, Peter, and Jake Rothschild. 2022. “Do You Know Where Your Survey Data Come From?” May.
Farrugia, Patricia, Bradley Petrisor, Forough Farrokhyar, and Mohit Bhandari. 2010. “Research Questions, Hypotheses and Objectives.” Canadian Journal of Surgery 53 (4): 278.
Finkelstein, Amy, Sarah Taubman, Bill Wright, Mira Bernstein, Jonathan Gruber, Joseph Newhouse, Heidi Allen, Katherine Baicker, and Oregon Health Study Group. 2012. “The Oregon Health Insurance Experiment: Evidence from the First Year.” The Quarterly Journal of Economics 127 (3): 1057–1106.
Firke, Sam. 2020. janitor: Simple Tools for Examining and Cleaning Dirty Data.
Fisher, Ronald. 1926. “The Arrangement of Field Experiments,” 503–15.
———. 1928. Statistical Methods for Research Workers. 2nd ed. London: Oliver & Boyd.
———. 1949. The Design of Experiments. 5th ed. London: Oliver & Boyd.
Fiske, Susan, and Shiro Kuriwaki. 2021. “Words to the Wise on Writing Scientific Papers,” November.
Fitts, Alexis Sobel. 2014. “The King of Content: How Upworthy Aims to Alter the Web, and Could End up Altering the World.” Columbia Journalism Review 53: 34–38.
Flake, Jessica, and Eiko Fried. 2020. “Measurement Schmeasurement: Questionable Measurement Practices and How to Avoid Them.” Advances in Methods and Practices in Psychological Science 3 (4): 456–65.
Flynn, Michael. 2021. troopdata: Tools for Analyzing Cross-National Military Deployment and Basing Data.
Forster, Edward Morgan. 1927. Aspects of the Novel. London: Edward Arnold.
Foster, Gordon. 1968. “Computers, Statistics and Planning: Systems or Chaos?” Geary Lecture.
Fourcade, Marion, and Kieran Healy. 2017. “Seeing Like a Market.” Socio-Economic Review 15 (1): 9–29.
Fowler, Martin, and Kent Beck. 2018. Refactoring: Improving the Design of Existing Code. 2nd ed. New York: Addison-Wesley Professional.
Fox, John, and Robert Andersen. 2006. “Effect Displays for Multinomial and Proportional-Odds Logit Models.” Sociological Methodology 36 (1): 225–55.
Franconeri, Steven, Lace Padilla, Priti Shah, Jeffrey Zacks, and Jessica Hullman. 2021. “The Science of Visual Data Communication: What Works.” Psychological Science in the Public Interest 22 (3): 110–61.
Frandell, Ashlee, Mary Feeney, Timothy Johnson, Eric Welch, Lesley Michalegko, and Heyjie Jung. 2021. “The Effects of Electronic Alert Letters for Internet Surveys of Academic Scientists.” Scientometrics 126 (8): 7167–81.
Franklin, Laura. 2005. “Exploratory Experiments.” Philosophy of Science 72 (5): 888–99.
Fried, Eiko, Jessica Flake, and Donald Robinaugh. 2022. “Revisiting the Theoretical and Methodological Foundations of Depression Measurement.” Nature Reviews Psychology, April.
Friedman, Jerome, Robert Tibshirani, and Trevor Hastie. 2009. The Elements of Statistical Learning. 2nd ed. Springer.
Friendly, Michael, and Howard Wainer. 2021. A History of Data Visualization and Graphic Communication. 1st ed. Massachusetts: Harvard University Press.
Fry, Hannah. 2020. “Big Tech Is Testing You.” The New Yorker, February, 61–65.
Fuller, Mark, and James Mosher. 1987. “Raptor Survey Techniques.” In Raptor Management Techniques Manual, edited by Beth Pendleton, Brian Millsap, Keith Cline, and David Bird, 37–65. National Wildlife Federation.
Funkhouser, Gray. 1937. “Historical Development of the Graphical Representation of Statistical Data.” Osiris 3: 269–404.
Gagolewski, Marek. 2020. R Package Stringi: Character String Processing Facilities.
Garfinkel, Irwin, Lee Rainwater, and Timothy Smeeding. 2006. “A Re-Examination of Welfare States and Inequality in Rich Nations: How in-Kind Transfers and Indirect Taxes Change the Story.” Journal of Policy Analysis and Management: The Journal of the Association for Public Policy Analysis and Management 25 (4): 897–919.
Garnier, Simon, Noam Ross, Robert Rudis, Antônio Camargo, Marco Sciaini, and Cédric Scherer. 2021. viridis - Colorblind-Friendly Color Maps for R.
Gavras, Konstantin, Jan Karem Höhne, Annelies Blom, and Harald Schoen. 2022. “Innovating the Collection of Open-Ended Answers: The Linguistic and Content Characteristics of Written and Oral Answers to Political Attitude Questions.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 185 (3): 872–90.
Gebru, Timnit, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman Vaughan, Hanna Wallach, Hal Daumé III, and Kate Crawford. 2021. “Datasheets for Datasets.” Communications of the ACM 64 (12): 86–92.
Gelfand, Sharla. 2019. Crying Sephora.
———. 2020. opendatatoronto: Access the City of Toronto Open Data Portal.
———. 2021. “Make a ReprEx... Please.” YouTube, February.
Gelman, Andrew. 2016. “What Has Happened down Here Is the Winds Have Changed,” September.
———. 2019. “Another Regression Discontinuity Disaster and What Can We Learn from It,” June.
———. 2020. “Statistical Models of Election Outcomes.” YouTube, August.
Gelman, Andrew, John Carlin, Hal Stern, David Dunson, Aki Vehtari, and Donald Rubin. 2014. Bayesian Data Analysis. 3rd ed. Chapman & Hall/CRC.
Gelman, Andrew, Sharad Goel, Douglas Rivers, and David Rothschild. 2016. “The Mythical Swing Voter.” Quarterly Journal of Political Science 11 (1): 103–30.
Gelman, Andrew, and Jennifer Hill. 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press.
Gelman, Andrew, and Eric Loken. 2013. “The Garden of Forking Paths: Why Multiple Comparisons Can Be a Problem, Even When There Is No ‘Fishing Expedition’ or ‘p-Hacking’ and the Research Hypothesis Was Posited Ahead of Time.” Department of Statistics, Columbia University.
Gelman, Andrew, Greggor Mattson, and Daniel Simpson. 2018. “Gaydar and the Fallacy of Decontextualized Measurement.” Sociological Science 5 (12): 270–80.
Gelman, Andrew, Cristian Pasarica, and Rahul Dodhia. 2002. “Let’s Practice What We Preach: Turning Tables into Graphs.” The American Statistician 56 (2): 121–30.
Gelman, Andrew, and Aki Vehtari. 2020. “What Are the Most Important Statistical Ideas of the Past 50 Years?” arXiv.
Gentemann, Chelle Leigh, Chris Holdgraf, Ryan Abernathey, Daniel Crichton, James Colliander, Edward Joseph Kearns, Yuvi Panda, and Richard Signell. 2021. “Science Storms the Cloud.” AGU Advances 2 (2).
Gerber, Alan, and Donald Green. 2012. Field Experiments: Design, Analysis, and Interpretation. W W Norton.
Gertler, Paul, Sebastian Martinez, Patrick Premand, Laura Rawlings, and Christel Vermeersch. 2016. Impact Evaluation in Practice. 2nd ed. The World Bank.
Geuenich, Michael, Jinyu Hou, Sunyun Lee, Shanza Ayub, Hartland Jackson, and Kieran Campbell. 2021a. “Automated Assignment of Cell Identity from Single-Cell Multiplexed Imaging and Proteomic Data.” Cell Systems 12 (12): 1173–86.
———. 2021b. “Automated Assignment of Cell Identity from Single-Cell Multiplexed Imaging and Proteomic Data.”
Ghitza, Yair, and Andrew Gelman. 2020. “Voter Registration Databases and MRP: Toward the Use of Large-Scale Databases in Public Opinion Research.” Political Analysis 28 (4): 507–31.
Godfrey, Ernest. 1918. “History and Development of Statistics in Canada.” In The History of Statistics–Their Development and Progress in Many Countries, edited by John Koren, 179–98. New York: Macmillan.
Goodman, Leo. 1961. “Snowball Sampling.” The Annals of Mathematical Statistics 32 (1): 148–70.
Goodrich, Ben, Jonah Gabry, Imad Ali, and Sam Brilleman. 2020. “rstanarm: Bayesian applied regression modeling via Stan.”
Google. 2022. “What to Look for in a Code Review.” Google Engineering Practices Documentation.
Gordon, Brett, Robert Moakler, and Florian Zettelmeyer. 2022. “Close Enough? A Large-Scale Exploration of Non-Experimental Approaches to Advertising Measurement.” arXiv.
Gordon, Brett, Florian Zettelmeyer, Neha Bhargava, and Dan Chapsky. 2019. “A Comparison of Approaches to Advertising Measurement: Evidence from Big Field Experiments at Facebook.” Marketing Science 38 (2): 193–225.
Graham, Paul. 2020. How to Write Usefully.
Green, Donald, Terence Leong, Holger Kern, Alan Gerber, and Christopher Larimer. 2009. “Testing the Accuracy of Regression Discontinuity Analysis Using Experimental Benchmarks.” Political Analysis 17 (4): 400–417.
Green, Eric. 2020. “Nivi Research: Mister P helps us understand vaccine hesitancy.”
Greenberg, Bernard, Abdel-Latif Abul-Ela, Walt Simmons, and Daniel Horvitz. 1969. “The Unrelated Question Randomized Response Model: Theoretical Framework.” Journal of the American Statistical Association 64 (326): 520–39.
Greenland, Sander, Stephen Senn, Kenneth Rothman, John Carlin, Charles Poole, Steven Goodman, and Douglas Altman. 2016. “Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations.” European Journal of Epidemiology 31 (4): 337–50.
Griffiths, Thomas, and Mark Steyvers. 2004. “Finding Scientific Topics.” PNAS 101: 5228–35.
Grolemund, Garrett, and Hadley Wickham. 2011. “Dates and Times Made Easy with lubridate.” Journal of Statistical Software 40 (3): 1–25.
Groves, Robert. 2011. “Three Eras of Survey Research.” Public Opinion Quarterly 75 (5): 861–71.
Groves, Robert, and Lars Lyberg. 2010. “Total Survey Error: Past, Present, and Future.” Public Opinion Quarterly 74 (5): 849–79.
Grün, Bettina, and Kurt Hornik. 2011. “topicmodels: An R Package for Fitting Topic Models.” Journal of Statistical Software 40 (13): 1–30.
Gustafsson, Karl, and Linus Hagström. 2017. “What Is the Point? Teaching Graduate Students How to Construct Political Science Research Puzzles.” European Political Science 17 (4): 634–48.
Gutman, Robert. 1958. “Birth and Death Registration in Massachusetts: II. The Inauguration of a Modern System, 1800-1849.” The Milbank Memorial Fund Quarterly 36 (4): 373–402.
Hackett, Robert. 2016. “Researchers Caused an Uproar by Publishing Data from 70,000 Okcupid Users.” Fortune, May.
Halberstam, David. 1972. The Best and the Brightest. 1st ed. New York: Random House.
Hamming, Richard. 1996. The Art of Doing Science and Engineering. Stripe Press.
Hand, David. 2018. “Statistical Challenges of Administrative and Transaction Data.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 181 (3): 555–605.
Handcock, Mark, and Krista Gile. 2011. “Comment: On the Concept of Snowball Sampling.” Sociological Methodology 41 (1): 367–71.
Hangartner, Dominik, Daniel Kopp, and Michael Siegenthaler. 2021. “Monitoring Hiring Discrimination Through Online Recruitment Platforms.” Nature 589 (7843): 572–76.
Hanretty, Chris. 2020. “An Introduction to Multilevel Regression and Post-Stratification for Estimating Constituency Opinion.” Political Studies Review 18 (4): 630–45.
Hao, Karen. 2019. “This is how AI bias really happens—and why it’s so hard to fix.” MIT Technology Review, February.
Hart, Edmund, Pauline Barmby, David LeBauer, François Michonneau, Sarah Mount, Patrick Mulrooney, Timothée Poisot, Kara Woo, Naupaka Zimmerman, and Jeffrey Hollister. 2016. “Ten Simple Rules for Digital Data Storage.” PLOS Computational Biology 12 (10): e1005097.
Hartocollis, Anemona. 2022. “U.S. News Ranked Columbia No. 2, but a Math Professor Has His Doubts.” The New York Times, March.
Hassan, Mai. 2022. “New Insights on Africa’s Autocratic Past.” African Affairs 121 (483): 321–33.
Hastie, Trevor, and Robert Tibshirani. 1990. Generalized Additive Models. Chapman & Hall/CRC.
Hawes, Michael. 2020. “Implementing Differential Privacy: Seven Lessons From the 2020 United States Census.” Harvard Data Science Review 2 (2).
Hayot, Eric. 2014. The Elements of Academic Style. Columbia University Press.
Healy, Kieran. 2018. Data Visualization. New Jersey: Princeton University Press.
———. 2020. “The Kitchen Counter Observatory,” May.
———. 2022. “Unhappy in Its Own Way,” July.
Heckathorn, Douglas. 1997. “Respondent-Driven Sampling: A New Approach to the Study of Hidden Populations.” Social Problems 44 (2): 174–99.
Heil, Benjamin, Michael Hoffman, Florian Markowetz, Su-In Lee, Casey Greene, and Stephanie Hicks. 2021. “Reproducibility Standards for Machine Learning in the Life Sciences.” Nature Methods 18 (10): 1132–35.
Heller, Jean. 2022. “AP Exposes the Tuskegee Syphilis Study: The 50th Anniversary.” AP, July.
Henry, Lionel, and Hadley Wickham. 2020. purrr: Functional Programming Tools.
Hermans, Felienne. 2017. “Peter Hilton on Naming.” IEEE Software 34 (3): 117–20.
———. 2021. The Programmer’s Brain: What Every Programmer Needs to Know about Cognition. 1st ed. Simon & Schuster.
Hernan, Miguel, and James Robins. 2020. What If. 1st ed. Boca Raton: Chapman & Hall/CRC.
Herndon, Thomas, Michael Ash, and Robert Pollin. 2014. “Does High Public Debt Consistently Stifle Economic Growth? A Critique of Reinhart and Rogoff.” Cambridge Journal of Economics 38 (2): 257–79.
Hester, Jim, Florent Angly, Russ Hyde, Michael Chirico, Kun Ren, and Alexander Rosenstock. 2022. lintr: A ‘Linter’ for R Code.
Hester, Jim, Hadley Wickham, and Gábor Csárdi. 2021. fs: Cross-Platform File System Operations Based on ‘libuv’.
Hill, Austin Bradford. 1965. “The Environment and Disease: Association or Causation?” Proceedings of the Royal Society of Medicine. Sage Publications.
Hillel, Wayne. 2017. How Do We Trust Our Science Code?
Ho, Daniel, Kosuke Imai, Gary King, and Elizabeth Stuart. 2011. “MatchIt: Nonparametric Preprocessing for Parametric Causal Inference.” Journal of Statistical Software 42 (8): 1–28.
Hodgetts, Paul. 2022. “The Negative Space of Data,” March.
Hofmeister, Johannes, Janet Siegmund, and Daniel Holt. 2017. “Shorter Identifier Names Take Longer to Comprehend.” In 2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER), 217–27.
Holland, Paul. 1986. “Statistics and Causal Inference.” Journal of the American Statistical Association 81 (396): 945–60.
Horst, Allison Marie, Alison Presmanes Hill, and Kristen Gorman. 2020. palmerpenguins: Palmer Archipelago (Antarctica) penguin data.
Hotz, Joseph, Christopher Bollinger, Tatiana Komarova, Charles Manski, Robert Moffitt, Denis Nekipelov, Aaron Sojourner, and Bruce Spencer. 2022. “Balancing Data Privacy and Usability in the Federal Statistical System.” Proceedings of the National Academy of Sciences 119 (31): 1–10.
Howes, Adam. 2022. “Representing Uncertainty Using Significant Figures,” April.
Hug, Lucia, Monica Alexander, Danzhen You, Leontine Alkema, and UN Inter-agency Group for Child. 2019. “National, Regional, and Global Levels and Trends in Neonatal Mortality Between 1990 and 2017, with Scenario-Based Projections to 2030: A Systematic Analysis.” Lancet Global Health 7 (6): e710–20.
Hughes, Nicola, and Jill Rutter. 2016. “Ministers Reflect: Interview with Oliver Letwin,” December.
Hulley, Stephen, Steven Cummings, Warren Browner, Deborah Grady, and Thomas Newman. 2007. Designing Clinical Research. 3rd ed. Lippincott Williams & Wilkins.
Hullman, Jessica, and Andrew Gelman. 2021. “Designing for Interactive Exploratory Data Analysis Requires Theories of Graphical Inference.” Harvard Data Science Review 3 (3).
Huntington-Klein, Nick. 2021. The Effect: An Introduction to Research Design and Causality. 1st ed. Chapman & Hall.
———. 2022. “Library of Statistical Techniques.”
Huntington-Klein, Nick, Andreu Arenas, Emily Beam, Marco Bertoni, Jeffrey Bloem, Pralhad Burli, Naibin Chen, et al. 2021. “The Influence of Hidden Researcher Decisions in Applied Microeconomics.” Economic Inquiry 59: 944–60.
Huyen, Chip. 2020. “Machine Learning Is Going Real-Time,” December.
Hvitfeldt, Emil, and Julia Silge. 2021. Supervised Machine Learning for Text Analysis in R. 1st ed. Chapman & Hall/CRC.
Hyndman, Rob, Timothy Hyndman, Charles Gray, Sayani Gupta, and Jacquie Tran. 2022. cricketdata: International Cricket Data.
Iannone, Richard. 2020. DiagrammeR: Graph/Network Visualization.
Iannone, Richard, Joe Cheng, and Barret Schloerke. 2020. gt: Easily Create Presentation-Ready Display Tables.
Iannone, Richard, and Mauricio Vargas. 2022. pointblank: Data Validation and Organization of Metadata for Local and Remote Tables.
Igelström, Erik. 2020. “Causal graphs in R with DiagrammeR.”
International Organization Of Legal Metrology. 2007. International Vocabulary of Metrology – Basic and General Concepts and Associated Terms. 3rd ed.
Ioannidis, John. 2005. “Why Most Published Research Findings Are False.” PLoS Medicine 2 (8): e124.
Irving, Damien, Kate Hertweck, Luke Johnston, Joel Ostblom, Charlotte Wickham, and Greg Wilson. 2021. Research Software Engineering with Python. Chapman & Hall/CRC.
Isaacson, Walter. 2011. Steve Jobs. 1st ed. Simon & Schuster.
Ishiguro, Kazuo. 1989. The Remains of the Day. 1st ed. Faber & Faber.
Izrailev, Sergei. 2014. tictoc: Functions for Timing R Scripts.
James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2021. An Introduction to Statistical Learning with Applications in R. 2nd ed. Springer.
Johnson, Alicia, Miles Ott, and Mine Dogucu. 2022. Bayes Rules! An Introduction to Bayesian Modeling with R. 1st ed. Chapman & Hall/CRC.
Johnson, Kaneesha. 2021. “Two Regimes of Prison Data Collection.” Harvard Data Science Review 3 (3).
Jones, Arnold. 1953. “Census Records of the Later Roman Empire.” The Journal of Roman Studies 43: 49–64.
Jordan, Michael. 2019. “Artificial Intelligence–the Revolution Hasn’t Happened Yet.” Harvard Data Science Review 1 (1).
Joyner, Michael. 1991. “Modeling: Optimal Marathon Performance on the Basis of Physiological Factors.” Journal of Applied Physiology 70 (2): 683–87.
Kahan, Brennan, Fan Li, Andrew Copas, and Michael Harhay. 2022. “Estimands in Cluster-Randomized Trials: Choosing Analyses That Answer the Right Question.” International Journal of Epidemiology, July.
Kahle, David, and Hadley Wickham. 2013. “ggmap: Spatial Visualization with ggplot2.” The R Journal 5 (1): 144–61.
Kahneman, Daniel, Olivier Sibony, and Cass Sunstein. 2021. Noise: A Flaw in Human Judgment. William Collins.
Karsten, Karl. 1923. Charts and Graphs. New York: Prentice-Hall.
Kastellec, Jonathan, and Eduardo Leoni. 2007. “Using Graphs Instead of Tables in Political Science.” Perspectives on Politics 5 (4): 755–71.
Kasy, Maximilian, and Alexander Teytelboym. 2022. “Matching with Semi-Bandits.” Econometrics Journal.
Kay, Matthew. 2020. tidybayes: Tidy Data and Geoms for Bayesian Models.
Kearney, Michael W. 2019. “rtweet: Collecting and analyzing Twitter data.” Journal of Open Source Software 4 (42): 1829.
Kennedy, Lauren, and Jonah Gabry. 2020. “MRP with rstanarm,” July.
Kennedy, Lauren, and Andrew Gelman. 2020. “Know Your Population and Know Your Model: Using Model-Based Regression and Poststratification to Generalize Findings Beyond the Observed Sample.”
Kennedy, Lauren, Katharine Khanna, Daniel Simpson, and Andrew Gelman. 2020. “Using Sex and Gender in Survey Adjustment.”
Kenny, Christopher, Shiro Kuriwaki, Cory McCartan, Evan Rosenman, Tyler Simko, and Kosuke Imai. 2021. “The Impact of the U.S. Census Disclosure Avoidance System on Redistricting and Voting Rights Analysis.”
Keyes, Os. 2019. “Counting the Countless.” Real Life.
Kharecha, Pushker, and James Hansen. 2013. “Prevented Mortality and Greenhouse Gas Emissions from Historical and Projected Nuclear Power.” Environmental Science & Technology 47 (9): 4889–95.
Kiang, Mathew, Alexander Tsai, Monica Alexander, David Rehkopf, and Sanjay Basu. 2021. “Racial/Ethnic Disparities in Opioid-Related Mortality in the USA, 1999–2019: The Extreme Case of Washington DC.” Journal of Urban Health 98 (5): 589–95.
Kimmerer, Robin Wall. 2013. Braiding Sweetgrass. 1st ed. Milkweed Editions.
King, Gary. 2006. “Publication, Publication.” PS: Political Science & Politics 39 (1): 119–25.
King, Gary, and Richard Nielsen. 2019. “Why Propensity Scores Should Not Be Used for Matching.” Political Analysis 27 (4): 435–54.
King, Stephen. 2000. On Writing: A Memoir of the Craft. 1st ed. Scribner.
Kirkegaard, Emil, and Julius Bjerrekær. 2016. “The OKCupid Dataset: A Very Large Public Dataset of Dating Site Users.” Open Differential Psychology, 1–10.
Kleiber, Christian, and Achim Zeileis. 2008. Applied Econometrics with R. New York: Springer-Verlag.
Knuth, Donald. 1984. “Literate Programming.” The Computer Journal 27 (2): 97–111.
———. 1998. Art of Computer Programming, Volume 2: Seminumerical Algorithms. 2nd ed.
Koenecke, Allison, and Hal Varian. 2020. “Synthetic Data Generation for Economists.”
Koenker, Roger, and Achim Zeileis. 2009. “On Reproducible Econometric Research.” Journal of Applied Econometrics 24 (5): 833–47.
Kohavi, Ron, Alex Deng, Brian Frasca, Roger Longbotham, Toby Walker, and Ya Xu. 2012. “Trustworthy Online Controlled Experiments.” In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD 12, 1st ed. ACM Press.
Kohavi, Ron, Diane Tang, and Ya Xu. 2020. Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing. Cambridge University Press.
Koitsalu, Marie, Martin Eklund, Jan Adolfsson, Henrik Grönberg, and Yvonne Brandberg. 2018. “Effects of Pre-Notification, Invitation Length, Questionnaire Length and Reminder on Participation Rate: A Quasi-Randomised Controlled Trial.” BMC Medical Research Methodology 18 (3): 1–5.
Kross, Sean. 2021. postcards: Create Beautiful, Simple Personal Websites.
Kuhn, Max. 2021. poissonreg: Model Wrappers for Poisson Regression.
Kuhn, Max, and Davis Vaughan. 2022. parsnip: A Common API to Modeling and Analysis Functions.
Kuhn, Max, and Hadley Wickham. 2020. Tidymodels: A Collection of Packages for Modeling and Machine Learning Using Tidyverse Principles.
Kuriwaki, Shiro, Will Beasley, and Thomas Leeper. 2022. dataverse: R Client for Dataverse 4+ Repositories.
Kuznets, Simon, Lillian Epstein, and Elizabeth Jenks. 1941. National Income and Its Composition, 1919-1938. National Bureau of Economic Research.
Lamott, Anne. 1994. Bird by Bird: Some Instructions on Writing and Life. Anchor Books.
Landau, William Michael. 2021. “The targets R Package: A Dynamic Make-Like Function-Oriented Pipeline Toolkit for Reproducibility and High-Performance Computing.” Journal of Open Source Software 6 (57): 2959.
Lane, Nick. 2015. “The Unseen World: Reflections on Leeuwenhoek (1677) ‘Concerning Little Animals’.” Philosophical Transactions of the Royal Society B: Biological Sciences 370 (1666): 20140344.
Laouenan, Morgane, Palaash Bhargava, Jean-Benoît Eyméoud, Olivier Gergaud, Guillaume Plique, and Etienne Wasmer. 2022. “A Cross-Verified Database of Notable People, 3500BC–2018AD.” Scientific Data 9 (1): 290.
Larmarange, Joseph. 2021. Labelled: Manipulating Labelled Data.
Latour, Bruno. 1996. “On Actor-Network Theory: A Few Clarifications.” Soziale Welt 47 (4): 369–81.
Lauderdale, Benjamin, Delia Bailey, Jack Blumenau, and Douglas Rivers. 2020. “Model-Based Pre-Election Polling for National and Sub-National Outcomes in the US and UK.” International Journal of Forecasting 36 (2): 399–413.
Lazear, Edward. 2000. “Economic Imperialism.” The Quarterly Journal of Economics 115 (1): 99–146.
Leek, Jeff, Blakeley McShane, Andrew Gelman, David Colquhoun, Michèle Nuijten, and Steven Goodman. 2017. “Five Ways to Fix Statistics.” Nature 551 (7682): 557–59.
Leek, Jeff, and Roger Peng. 2020. “Advanced Data Science 2020.”
Leonelli, Sabina. 2020. “Learning from Data Journeys.” In Data Journeys in the Sciences, 1–24. Springer International Publishing.
Leos-Barajas, Vianey, Theoni Photopoulou, Roland Langrock, Toby Patterson, Yuuki Watanabe, Megan Murgatroyd, and Yannis Papastamatiou. 2016. “Analysis of Animal Accelerometer Data Using Hidden Markov Models.” Methods in Ecology and Evolution 8 (2): 161–73.
Levay, Kevin, Jeremy Freese, and James Druckman. 2016. “The Demographic and Political Composition of Mechanical Turk Samples.” SAGE Open 6 (1): 1–17.
Lichand, Guilherme, and Sharon Wolf. 2022. “Measuring Child Labor: Whom Should Be Asked, and Why It Matters,” March.
Lima, Renato de, Oliver Phillips, Alvaro Duque, Sebastian Tello, Stuart Davies, Alexandre Adalardo de Oliveira, Sandra Muller, et al. 2022. “Making Forest Data Fair and Open.” Nature Ecology & Evolution 6 (6): 656–58.
Lin, Herbert. 2014. “A Proposal to Reduce Government Overclassification of Information Related to National Security.” Journal of National Security Law and Policy 7: 443–63.
Lin, Sarah, Ibraheem Ali, and Greg Wilson. 2021. “Ten Quick Tips for Making Things Findable.” PLOS Computational Biology 16 (12): 1–10.
Lips, Hilary. 2020. Sex and Gender: An Introduction. 7th ed. Illinois: Waveland Press.
Little, Roderick, and Roger Lewis. 2021. “Estimands, Estimators, and Estimates.” JAMA 326 (10): 967.
Locke, Steph, and Lucy D’Agostino McGowan. 2018. datasauRus: Datasets from the Datasaurus Dozen.
Lockheed Martin. 2005. “Joint Strike Fighter Air Vehicle C++ Coding Standards For The System Development And Demonstration Program.” Document Number 2rdu00001 Rev C, December.
Lohr, Sharon. 2022. Sampling: Design and Analysis. 3rd ed. Chapman & Hall/CRC.
Loo, Mark PJ van der, and Edwin de Jonge. 2021. “Data Validation Infrastructure for R.” Journal of Statistical Software 97: 1–33.
Lovelace, Robin, Jakub Nowosad, and Jannes Muenchow. 2019. Geocomputation with R. 1st ed. Chapman & Hall/CRC.
Lucas, Jack, Reed Merrill, Kelly Blidook, Sandra Breux, Laura Conrad, Gabriel Eidelman, Royce Koop, et al. 2020. “Canadian Municipal Elections Database.” Scholars Portal Dataverse.
Lucas, Robert. 1978. “Asset Prices in an Exchange Economy.” Econometrica 46 (6): 1429–45.
Luebke, David Martin, and Sybil Milton. 1994. “Locating the Victim: An Overview of Census-Taking, Tabulation Technology, and Persecution in Nazi Germany.” IEEE Annals of the History of Computing 16 (3): 25–39.
Lumley, Thomas. 2020. “Survey: Analysis of Complex Survey Samples.”
Lundberg, Ian, Rebecca Johnson, and Brandon Stewart. 2021. “What Is Your Estimand? Defining the Target Quantity Connects Statistical Evidence to Theory.” American Sociological Review 86 (3): 532–65.
Luscombe, Alex, Kevin Dick, and Kevin Walby. 2021. “Algorithmic Thinking in the Public Interest: Navigating Technical, Legal, and Ethical Hurdles to Web Scraping in the Social Sciences.” Quality & Quantity, 1–22.
Luscombe, Alex, and Alexander McClelland. 2020. “Policing the Pandemic: Tracking the Policing of COVID-19 Across Canada.”
Lyman, Frank. 1981. “The Responsive Classroom Discussion: The Inclusion of All Students.” Mainstreaming Digest 109: 109–13.
Macaulay, Thomas Babington. 1848. The History of England from the Accession of James the Second.
MacDorman, Marian, and Eugene Declercq. 2018. “The Failure of United States Maternal Mortality Reporting and Its Impact on Women’s Lives.” Birth (Berkeley, Calif.) 45 (2): 105.
Maier, Maximilian, František Bartoš, T. D. Stanley, David Shanks, Adam Harris, and Eric-Jan Wagenmakers. 2022. “No Evidence for Nudging After Adjusting for Publication Bias.” Proceedings of the National Academy of Sciences 119 (31): e2200300119.
Martin, Charles, and Ben Popper. 2021. “Don’t Push That Button: Exploring the Software That Flies SpaceX Rockets and Starships.” The Overflow, December.
Martinez, Luis. 2021. “How Much Should We Trust the Dictator’s GDP Growth Estimates?”
Matias, Nathan, Kevin Munger, Marianne Aubin Le Quere, and Charles Ebersole. 2021. “The Upworthy Research Archive, a time series of 32,487 experiments in U.S. media.” Scientific Data 8 (1): 1–8.
Mattson, Greggor. 2017. “Artificial Intelligence Discovers Gayface. Sigh.”
McClelland, Alexander. 2019. “‘Lock This Whore up’: Legal Violence and Flows of Information Precipitating Personal Violence Against People Criminalised for HIV-Related Crimes in Canada.” European Journal of Risk Regulation 10 (1): 132–47.
McElreath, Richard. 2020. Statistical Rethinking: A Bayesian Course with Examples in R and Stan. 2nd ed. Chapman & Hall/CRC.
McPhee, John. 2017. Draft No. 4. 1st ed. Farrar, Straus and Giroux.
McQuire, Scott. 2019. “One Map to Rule Them All? Google Maps as Digital Technical Object.” Communication and the Public 4 (2): 150–65.
Meng, Xiao-Li. 2018. “Statistical Paradises and Paradoxes in Big Data (i): Law of Large Populations, Big Data Paradox, and the 2016 US Presidential Election.” The Annals of Applied Statistics 12 (2): 685–726.
———. 2021. “What Are the Values of Data, Data Science, or Data Scientists?” Harvard Data Science Review 3 (1).
Merali, Zeeya. 2010. “Computational Science:... Error.” Nature 467 (7317): 775–77.
Miceli, Milagros, Julian Posada, and Tianling Yang. 2022. “Studying up Machine Learning Data.” Proceedings of the ACM on Human-Computer Interaction 6 (January): 1–14.
Michener, William. 2015. “Ten Simple Rules for Creating a Good Data Management Plan.” PLoS Computational Biology 11 (10): e1004525.
Mindell, David. 2008. Digital Apollo: Human and Machine in Spaceflight. New York: The MIT Press.
Mineault, Patrick, and The Good Research Code Handbook Community. 2021. “The Good Research Code Handbook.”
Minsky, Yaron. 2011. “OCaml for the masses.” Communications of the ACM 54 (11): 53–58.
———. 2015. “Automated Trading and OCaml with Yaron Minsky.” Hackers — Software Engineering Daily, November.
Mitchell, Alanna. 2022. “Get Ready for the New, Improved Second.” The New York Times, April.
Mitchell, Margaret, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and Timnit Gebru. 2019. “Model Cards for Model Reporting.” Proceedings of the Conference on Fairness, Accountability, and Transparency, January.
Miyakawa, Tsuyoshi. 2020. “No Raw Data, No Science: Another Possible Source of the Reproducibility Crisis.” Molecular Brain 13 (1): 1–6.
Mok, Lillio, Samuel Way, Lucas Maystre, and Ashton Anderson. 2022. “The Dynamics of Exploration on Spotify.” In Proceedings of the International AAAI Conference on Web and Social Media, 16:663–74.
Molanphy, Chris. 2012. “100 & Single: Three Rules to Define the Term ‘One-Hit Wonder’ in 2012.” The Village Voice, September.
Morange, Michel. 2016. A History of Biology. New Jersey: Princeton University Press.
Moyer, Brian, and Abe Dunn. 2020. “Measuring the Gross Domestic Product (GDP): The Ultimate Data Science Project.” Harvard Data Science Review 2 (1).
Müller, Kirill, and Lorenz Walthert. 2022. styler: Non-Invasive Pretty Printing of R Code.
Müller, Kirill, and Hadley Wickham. 2021. tibble: Simple Data Frames.
Murphy, Heather. 2017. “Why Stanford Researchers Tried to Create a ‘Gaydar’ Machine.” New York Times, October.
Nelder, John, and Robert Wedderburn. 1972. “Generalized Linear Models.” Journal of the Royal Statistical Society: Series A (General) 135 (3): 370–84.
Neufeld, Michael. 2002. “Wernher von Braun, the SS, and Concentration Camp Labor: Questions of Moral, Political, and Criminal Responsibility.” German Studies Review 25 (1): 57–78.
Neuwirth, Erich. 2014. RColorBrewer: ColorBrewer Palettes.
Newman, Daniel. 2014. “Missing Data: Five Practical Guidelines.” Organizational Research Methods 17 (4): 372–411.
Neyman, Jerzy. 1934. “On the Two Different Aspects of the Representative Method: The Method of Stratified Sampling and the Method of Purposive Selection.” Journal of the Royal Statistical Society 97 (4): 558–625.
Nobles, Melissa. 2002. “Racial Categorization and Censuses.” In Census and Identity: The Politics of Race, Ethnicity, and Language in National Censuses, edited by David Kertzer and Dominique Arel, 43–70. New York, NY: Cambridge University Press.
Northcutt, Curtis, Anish Athalye, and Jonas Mueller. 2021. “Pervasive Label Errors in Test Sets Destabilize Machine Learning Benchmarks.”
Obermeyer, Ziad, Brian Powers, Christine Vogeli, and Sendhil Mullainathan. 2019. “Dissecting Racial Bias in an Algorithm Used to Manage the Health of Populations.” Science 366 (6464): 447–53.
Oberski, Daniel L., and Frauke Kreuter. 2020. “Differential Privacy and Social Science: An Urgent Puzzle.” Harvard Data Science Review 2 (1).
OECD. 2014. “The Essential Macroeconomic Aggregates.” In Understanding National Accounts, 13–46. OECD.
———. 2022. Quarterly GDP.
Ooms, Jeroen. 2014. “The jsonlite Package: A Practical and Consistent Mapping Between JSON Data and R Objects.” arXiv:1403.2805 [Stat.CO].
———. 2019a. pdftools: Text Extraction, Rendering and Converting of PDF Documents.
———. 2019b. tesseract: Open Source OCR Engine.
———. 2021. openssl: Toolkit for Encryption, Signatures and Certificates Based on OpenSSL.
Oostrom, Tamar. 2022. “Funding of Clinical Trials and Reported Drug Efficacy.”
Orwell, George. 1946. Politics and the English Language.
Patki, Neha, Roy Wedge, and Kalyan Veeramachaneni. 2016. “The Synthetic Data Vault.” In 2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA), 399–410.
Paullada, Amandalynne, Inioluwa Deborah Raji, Emily Bender, Emily Denton, and Alex Hanna. 2021. “Data and Its (Dis)contents: A Survey of Dataset Development and Use in Machine Learning Research.” Patterns 2 (11): 100336.
Pavlik, Kaylin. 2019. “Understanding + Classifying Genres Using Spotify Audio Features.”
Pedersen, Thomas Lin. 2020. patchwork: The Composer of Plots.
Perepolkin, Dmytro. 2019. Polite: Be Nice on the Web.
Perkel, Jeffrey. 2021. “Ten Computer Codes That Transformed Science.” Nature 589 (7842): 344–48.
Pfeffer, Juergen, Angelina Mooseder, Luca Hammer, Oliver Stritzel, and David Garcia. 2022. “This Sample Seems to Be Good Enough! Assessing Coverage and Temporal Reliability of Twitter’s Academic API.” arXiv.
Phillips, Alban. 1958. “The Relation Between Unemployment and the Rate of Change of Money Wage Rates in the United Kingdom, 1861-1957.” Economica 25 (100): 283–99.
Piller, Charles. 2022. “Blots on a Field?” Science 377 (6604): 358–63.
Pineau, Joelle, Philippe Vincent-Lamarre, Koustuv Sinha, Vincent Larivière, Alina Beygelzimer, Florence d’Alché-Buc, Emily Fox, and Hugo Larochelle. 2021. “Improving Reproducibility in Machine Learning Research (a Report from the NeurIPS 2019 Reproducibility Program).” Journal of Machine Learning Research 22 (164): 1–20.
Pitman, Jim. 1993. Probability. 1st ed. New York: Springer.
Plant, Anne, and Robert Hanisch. 2020. “Reproducibility in Science: A Metrology Perspective.” Harvard Data Science Review 2 (4).
Presmanes Hill, Alison. 2021a. M-F-E-O: postcards + distill.
———. 2021b. Up & Running with Blogdown in 2021.
Prévost, Jean-Guy, and Jean-Pierre Beaud. 2015. Statistics, Public Debate and the State, 1800–1945: A Social, Political and Intellectual History of Numbers. Routledge.
R Core Team. 2022. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing.
R Special Interest Group on Databases (R-SIG-DB), Hadley Wickham, and Kirill Müller. 2022. DBI: R Database Interface.
Register, Yim. 2020a. “Introduction to Sampling and Randomization,” November.
———. 2020b. “Data Science Ethics in 6 Minutes.” YouTube, December.
Reid, Nancy. 2003. “Asymptotics and the Theory of Inference.” The Annals of Statistics 31 (6): 1695–1731.
Richardson, Neal, Ian Cook, Nic Crane, Jonathan Keane, Romain François, Jeroen Ooms, and Apache Arrow. 2022. arrow: Integration to “Apache” “Arrow”.
Riederer, Emily. 2020. “Column Names as Contracts,” September.
Riffe, Tim, Enrique Acosta, Enrique José Acosta, Diego Manuel Aburto, Anna Alburez-Gutierrez, Ainhoa Altová, Ugofilippo Alustiza, et al. 2021. “Data Resource Profile: COVerAGE-DB: A Global Demographic Database of COVID-19 Cases and Deaths.” International Journal of Epidemiology 50 (2): 390–390f.
Rilke, Rainer Maria. 1929. Letters to a Young Poet.
Robinson, David. 2021. gutenbergr: Download and Process Public Domain Works from Project Gutenberg.
Robinson, David, Alex Hayes, and Simon Couch. 2021. broom: Convert Statistical Objects into Tidy Tibbles.
Robinson, Emily, and Jacqueline Nolis. 2020. Build a Career in Data Science. Manning Publications.
Rockoff, Hugh. 2019. “On the Controversies Behind the Origins of the Federal Economic Statistics.” Journal of Economic Perspectives 33 (1): 147–64.
Rose, Angela, Rebecca Grais, Denis Coulombier, and Helga Ritter. 2006. “A Comparison of Cluster and Systematic Sampling Methods for Measuring Crude Mortality.” Bulletin of the World Health Organization 84: 290–96.
Ross, Casey. 2022. “How a Decades-Old Database Became a Hugely Profitable Dossier on the Health of 270 Million Americans.” Stat, February.
Rudis, Bob. 2020. hrbrthemes: Additional Themes, Theme Components and Utilities for “ggplot2”.
Ruggles, Steven, Catherine Fitch, Diana Magnuson, and Jonathan Schroeder. 2019. “Differential Privacy and Census Data: Implications for Social and Economic Research.” AEA Papers and Proceedings 109 (May): 403–8.
Ruggles, Steven, Sarah Flood, Sophia Foster, Ronald Goeken, Jose Pacas, Megan Schouweiler, and Matthew Sobek. 2021. “IPUMS USA: Version 11.0.” Minneapolis, MN: IPUMS.
Ryan, Philip. 2015. “Keeping a Lab Notebook.” YouTube, May.
Sadowski, Caitlin, Emma Söderberg, Luke Church, Michal Sipko, and Alberto Bacchelli. 2018b. “Modern Code Review: A Case Study at Google.” In Proceedings of the 40th International Conference on Software Engineering: Software Engineering in Practice, 181–90.
———. 2018a. “Modern Code Review: A Case Study at Google.” In Proceedings of the 40th International Conference on Software Engineering: Software Engineering in Practice, 181–90. ICSE-SEIP ’18. New York, NY, USA: Association for Computing Machinery.
Sakshaug, Joseph, Ting Yan, and Roger Tourangeau. 2010. “Nonresponse Error, Measurement Error, and Mode of Data Collection: Tradeoffs in a Multi-Mode Survey of Sensitive and Non-Sensitive Items.” Public Opinion Quarterly 74 (5): 907–33.
Salganik, Matthew. 2018. Bit by Bit: Social Research in the Digital Age. New Jersey: Princeton University Press.
Salganik, Matthew, Peter Sheridan Dodds, and Duncan Watts. 2006. “Experimental Study of Inequality and Unpredictability in an Artificial Cultural Market.” Science 311 (5762): 854–56.
Salganik, Matthew, and Douglas Heckathorn. 2004. “Sampling and Estimation in Hidden Populations Using Respondent-Driven Sampling.” Sociological Methodology 34 (1): 193–240.
Sambasivan, Nithya, Shivani Kapania, Hannah Highfill, Diana Akrong, Praveen Paritosh, and Lora Aroyo. 2021. “‘Everyone Wants to Do the Model Work, Not the Data Work’: Data Cascades in High-Stakes AI.” In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. ACM.
Samuel, Arthur. 1959. “Some Studies in Machine Learning Using the Game of Checkers.” IBM Journal of Research and Development 3 (3): 210–29.
Saulnier, Lucile, Siddharth Karamcheti, Hugo Laurençon, Léo Tronchon, Thomas Wang, Victor Sanh, Amanpreet Singh, et al. 2022. “Putting Ethical Principles at the Core of the Research Lifecycle.”
Schloerke, Barret, and Jeff Allen. 2021. plumber: An API Generator for R.
Schmertmann, Carl. 2022. “UN API Test,” July.
Scott, James. 1998. Seeing Like a State. Yale University Press.
Sekhon, Jasjeet, and Rocío Titiunik. 2017. “Understanding Regression Discontinuity Designs as Observational Studies.” Observational Studies 3 (2): 174–82.
Si, Yajuan. 2020. “On the Use of Auxiliary Variables in Multilevel Regression and Poststratification.”
Sides, John, Lynn Vavreck, and Christopher Warshaw. 2021. “The Effect of Television Advertising in United States Elections.” American Political Science Review, 1–17.
Silberzahn, Raphael, Eric Uhlmann, Daniel Martin, Pasquale Anselmi, Frederik Aust, Eli Awtrey, Štěpán Bahník, et al. 2018. “Many Analysts, One Data Set: Making Transparent How Variations in Analytic Choices Affect Results.” Advances in Methods and Practices in Psychological Science 1 (3): 337–56.
Silge, Julia. 2018. “Text Classification with Tidy Data Principles,” December.
Silge, Julia, Fanny Chow, Max Kuhn, and Hadley Wickham. 2022. rsample: General Resampling Infrastructure.
Silge, Julia, and David Robinson. 2016. “tidytext: Text Mining and Analysis Using Tidy Data Principles in R.” The Journal of Open Source Software 1 (3).
Silver, Nate. 2020. “We Fixed an Issue with How Our Primary Forecast Was Calculating Candidates’ Demographic Strengths.” FiveThirtyEight, February.
Simon, Noah, Jerome Friedman, Trevor Hastie, and Rob Tibshirani. 2011. “Regularization Paths for Cox’s Proportional Hazards Model via Coordinate Descent.” Journal of Statistical Software 39 (5): 1–13.
Simonsohn, Uri. 2013. “Just Post It: The Lesson from Two Cases of Fabricated Data Detected by Statistics Alone.” Psychological Science 24 (10): 1875–88.
Simpson, Edward. 1951. “The Interpretation of Interaction in Contingency Tables.” Journal of the Royal Statistical Society: Series B (Methodological) 13 (2): 238–41.
Smith, Jessie, Saleema Amershi, Solon Barocas, Hanna Wallach, and Jennifer Wortman Vaughan. 2022. “REAL ML: Recognizing, Exploring, and Articulating Limitations of Machine Learning Research.” 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’22).
Sobek, Matthew, and Steven Ruggles. 1999. “The IPUMS Project: An Update.” Historical Methods: A Journal of Quantitative and Interdisciplinary History 32 (3): 102–10.
Somers, James. 2015. “Toolkits for the Mind.” MIT Technology Review, April.
———. 2017. “Torching the Modern-Day Library of Alexandria.” The Atlantic, April.
Sprint, Gina, and Jason Conci. 2019. “Mining Github Classroom Commit Behavior in Elective and Introductory Computer Science Courses.” Journal of Computing Sciences in Colleges 35 (1): 76–84.
Staicu, Ana-Maria. 2017. “Interview with Nancy Reid.” International Statistical Review 85 (3): 381–403.
Staniak, Mateusz, and Przemyslaw Biecek. 2019. “The landscape of R packages for automated exploratory data analysis.” arXiv Preprint arXiv:1904.02101.
Statistics Canada. 2017. “Guide to the Census of Population, 2016.” Statistics Canada.
———. 2020. “Sex at Birth and Gender: Technical Report on Changes for the 2021 Census.” Statistics Canada.
Steckel, Richard. 1991. “The Quality of Census Data for Historical Inquiry: A Research Agenda.” Social Science History 15 (4): 579–99.
Stevens, Wallace. 1934. The Idea of Order at Key West.
Steyvers, Mark, and Tom Griffiths. 2006. “Probabilistic Topic Models.” In Latent Semantic Analysis: A Road to Meaning, edited by T. Landauer, D. McNamara, S. Dennis, and W. Kintsch.
Stigler, Stephen. 1986. The History of Statistics. Harvard University Press.
Stock, James, and Francesco Trebbi. 2003. “Retrospectives: Who Invented Instrumental Variable Regression?” Journal of Economic Perspectives 17 (3): 177–94.
Stolberg, Michael. 2006. “Inventing the Randomized Double-Blind Trial: The Nuremberg Salt Test of 1835.” Journal of the Royal Society of Medicine 99 (12): 642–43.
Stolley, Paul. 1991. “When Genius Errs: R. A. Fisher and the Lung Cancer Controversy.” American Journal of Epidemiology 133 (5): 416–25.
Student. 1908. “The Probable Error of a Mean.” Biometrika 6 (1): 1–25.
Sunstein, Cass, and Lucia Reisch. 2017. The Economics of Nudge. Routledge.
Suriyakumar, Vinith, Nicolas Papernot, Anna Goldenberg, and Marzyeh Ghassemi. 2021. “Chasing Your Long Tails.” In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. ACM.
Swain, Larry. 1985. “Basic Principles of Questionnaire Design.” Survey Methodology 11 (2): 161–70.
Taddy, Matt. 2019. Business Data Science. McGraw Hill.
Tal, Eran. 2020. “Measurement in Science.” In The Stanford Encyclopedia of Philosophy, edited by Edward Zalta, Fall 2020. Metaphysics Research Lab, Stanford University.
Tang, Jun, Aleksandra Korolova, Xiaolong Bai, Xueqiang Wang, and Xiaofeng Wang. 2017. “Privacy Loss in Apple’s Implementation of Differential Privacy on MacOS 10.12.” arXiv.
The Economist. 2013. “Johnson: Those Six Little Rules: George Orwell on Writing,” July.
———. 2022a. “What Spotify Data Show about the Decline of English,” January.
———. 2022b. “Will Emmanuel Macron Win a Second Term?” April.
———. 2022c. “France’s Presidential Election: The Second Round in Detail,” April.
The Prize in Economic Sciences. 2019. “Scientific Background: Understanding Development and Poverty Alleviation.” The Committee for the Prize in Economic Sciences in Memory of Alfred Nobel.
Thieme, Nick. 2018. “R Generation.” Significance 15 (4): 14–19.
Thistlethwaite, Donald, and Donald Campbell. 1960. “Regression-Discontinuity Analysis: An Alternative to the Ex Post Facto Experiment.” Journal of Educational Psychology 51 (6): 309.
Thompson, Charlie, Josiah Parry, Donal Phipps, and Tom Wolff. 2020. spotifyr: R Wrapper for the “Spotify” Web API.
Thornhill, John. 2021. “Lunch with the FT: Mathematician Hannah Fry.” Financial Times, July.
Tierney, Nicholas. 2017. “Visdat: Visualising Whole Data Frames.” Journal of Open Source Software 2 (16): 355.
———. 2020. R Markdown for Scientists.
———. 2022. Quarto for Scientists.
Tierney, Nicholas, Di Cook, Miles McBain, and Colin Fay. 2021. Naniar: Data Structures, Summaries, and Visualisations for Missing Data.
Tierney, Nicholas, and Karthik Ram. 2020. “A Realistic Guide to Making Data Available Alongside Code to Improve Reproducibility.”
Timbers, Tiffany. 2020. canlang: Canadian Census language data.
Timbers, Tiffany, Trevor Campbell, and Melissa Lee. 2022. Data Science: A First Introduction. Chapman and Hall/CRC.
Tolley, Erin, and Mireille Paquet. 2021. “Gender, Municipal Party Politics, and Montreal’s First Woman Mayor.” Canadian Journal of Urban Research 30 (1): 40–52.
Tourangeau, Roger, Lance Rips, and Kenneth Rasinski. 2000. The Psychology of Survey Response. 1st ed. Cambridge University Press.
Trisovic, Ana, Matthew Lau, Thomas Pasquier, and Mercè Crosas. 2022. “A Large-Scale Study on Research Code Quality and Execution.” Scientific Data 9 (1).
Tukey, John. 1962. “The Future of Data Analysis.” The Annals of Mathematical Statistics 33 (1): 1–67.
UN IGME. 2021. “Levels and Trends in Child Mortality, 2021.”
Urban, Steve, Rangarajan Sreenivasan, and Vineet Kannan. 2016. “It’s All A/Bout Testing: The Netflix Experimentation Platform.” Netflix Technology Blog, April.
Ushey, Kevin. 2022. renv: Project Environments.
Van den Broeck, Jan, Solveig Argeseanu Cunningham, Roger Eeckels, and Kobus Herbst. 2005. “Data Cleaning: Detecting, Diagnosing, and Editing Data Abnormalities.” PLoS Medicine 2 (10): e267.
van der Loo, Mark. 2022. The Data Validation Cookbook.
Vanderplas, Susan, Dianne Cook, and Heike Hofmann. 2020. “Testing Statistical Charts: What Makes a Good Graph?” Annual Review of Statistics and Its Application 7: 61–88.
Vanhoenacker, Mark. 2015. Skyfaring: A Journey with a Pilot. Alfred A. Knopf.
Varin, Cristiano, Nancy Reid, and David Firth. 2011. “An Overview of Composite Likelihood Methods.” Statistica Sinica, 5–42.
von Bergmann, Jens, Dmitry Shkolnik, and Aaron Jacobs. 2021. cancensus: R package to access, retrieve, and work with Canadian Census data and geography.
Walby, Kevin, and Alex Luscombe. 2019. Freedom of Information and Social Science Research Design. Routledge.
Walker, Kyle. 2022. Analyzing US Census Data. Chapman and Hall/CRC.
Walker, Kyle, and Matt Herman. 2022. tidycensus: Load US Census Boundary and Attribute Data as “tidyverse” and “sf”-Ready Data Frames.
Wang, Wei, David Rothschild, Sharad Goel, and Andrew Gelman. 2015. “Forecasting Elections with Non-Representative Polls.” International Journal of Forecasting 31 (3): 980–91.
Wang, Yilun, and Michal Kosinski. 2018. “Deep Neural Networks Are More Accurate Than Humans at Detecting Sexual Orientation from Facial Images.” Journal of Personality and Social Psychology 114 (2): 246–57.
Wardrop, Robert. 1995. “Simpson’s Paradox and the Hot Hand in Basketball.” The American Statistician 49 (1): 24–28.
Ware, James. 1989. “Investigating Therapies of Potentially Great Benefit: ECMO.” Statistical Science 4 (4): 298–306.
Waring, Elin, Michael Quinn, Amelia McNamara, Eduardo Arino de la Rubia, Hao Zhu, and Shannon Ellis. 2022. Skimr: Compact and Flexible Summaries of Data.
Wasserman, Larry. 2005. All of Statistics. Springer.
Wei, LJ, and S Durham. 1978. “The Randomized Play-the-Winner Rule in Medical Trials.” Journal of the American Statistical Association 73 (364): 840–43.
Weinberg, Gerald. 1971. The Psychology of Computer Programming. New York: Van Nostrand Reinhold Company.
Weissgerber, Tracey, Natasa Milic, Stacey Winham, and Vesna Garovic. 2015. “Beyond Bar and Line Graphs: Time for a New Data Presentation Paradigm.” PLoS Biology 13 (4): e1002128.
Whitby, Andrew. 2020. The Sum of the People. New York: Basic Books.
Whitelaw, James. 1805. An Essay on the Population of Dublin. Being the Result of an Actual Survey Taken in 1798, with Great Care and Precision, and Arranged in a Manner Entirely New. Graisberry and Campbell.
Wicherts, Jelte, Marjan Bakker, and Dylan Molenaar. 2011. “Willingness to Share Research Data Is Related to the Strength of the Evidence and the Quality of Reporting of Statistical Results.” PLoS ONE 6 (11): e26828.
Wickham, Hadley. 2009. “Manipulating Data.” In ggplot2, 157–75. Springer New York.
———. 2010. “A Layered Grammar of Graphics.” Journal of Computational and Graphical Statistics 19 (1): 3–28.
———. 2011. “testthat: Get Started with Testing.” The R Journal 3: 5–10.
———. 2014. “Tidy Data.” Journal of Statistical Software 59 (1): 1–23.
———. 2016. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York.
———. 2017. tidyverse: Easily Install and Load the “Tidyverse”.
———. 2019a. Advanced R. 2nd ed. Chapman and Hall/CRC.
———. 2019b. babynames: US Baby Names 1880-2017.
———. 2019c. httr: Tools for Working with URLs and HTTP.
———. 2019d. rvest: Easily Harvest (Scrape) Web Pages.
———. 2019e. stringr: Simple, Consistent Wrappers for Common String Operations.
———. 2020a. forcats: Tools for Working with Categorical Variables (Factors).
———. 2020b. Tidyverse.
———. 2021a. Mastering Shiny. 1st ed. O’Reilly Media.
———. 2021b. The Tidyverse Style Guide.
———. 2021c. tidyr: Tidy Messy Data.
———. 2022. R Packages. 2nd ed. O’Reilly Media.
Wickham, Hadley, Mara Averick, Jenny Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686.
Wickham, Hadley, and Jenny Bryan. 2020. usethis: Automate Package and Project Setup.
Wickham, Hadley, Romain François, Lionel Henry, and Kirill Müller. 2022. dplyr: A Grammar of Data Manipulation.
Wickham, Hadley, Maximilian Girlich, and Edgar Ruiz. 2022. dbplyr: A “dplyr” Back End for Databases.
Wickham, Hadley, and Garrett Grolemund. 2022. R for Data Science. 2nd ed. O’Reilly Media.
Wickham, Hadley, Jim Hester, and Jenny Bryan. 2021. readr: Read Rectangular Text Data.
Wickham, Hadley, Jim Hester, and Winston Chang. 2020. devtools: Tools to Make Developing R Packages Easier.
Wickham, Hadley, Jim Hester, and Jeroen Ooms. 2021. xml2: Parse XML.
Wickham, Hadley, and Evan Miller. 2020. haven: Import and Export “SPSS,” “Stata” and “SAS” Files.
Wickham, Hadley, and Dana Seidel. 2020. scales: Scale Functions for Visualization.
Wiessner, Polly. 2014. “Embers of Society: Firelight Talk Among the Ju/’Hoansi Bushmen.” Proceedings of the National Academy of Sciences 111 (39): 14027–35.
Wilde, Oscar. 1891. The Picture of Dorian Gray.
Wilke, Claus. 2019. Fundamentals of Data Visualization: A Primer on Making Informative and Compelling Figures. O’Reilly Media.
Wilkinson, Leland. 2005. The Grammar of Graphics. 2nd ed. Springer.
Wilkinson, Mark, Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle Appleton, Myles Axton, Arie Baak, Niklas Blomberg, et al. 2016. “The FAIR Guiding Principles for Scientific Data Management and Stewardship.” Scientific Data 3 (1): 1–9.
Wilson, Greg. 2021. Building Software Together. CRC Books.
Wilson, Greg, Jenny Bryan, Karen Cranston, Justin Kitzes, Lex Nederbragt, and Tracy Teal. 2017. “Good Enough Practices in Scientific Computing.” PLOS Computational Biology 13 (6): 1–20.
Wong, Julia Carrie. 2020. “One Year Inside Trump’s Monumental Facebook Campaign.” The Guardian, January.
World Health Organization. 2019. “Trends in Maternal Mortality 2000 to 2017: Estimates by WHO, UNICEF, UNFPA, World Bank Group and the United Nations Population Division.”
Wright, Philip. 1928. The Tariff on Animal and Vegetable Oils. New York: Macmillan Company.
Wu, Changbao, and Mary Thompson. 2020. Sampling Theory and Practice. Springer.
Xie, Yihui. 2019. “TinyTeX: A lightweight, cross-platform, and easy-to-maintain LaTeX distribution based on TeX Live.” TUGboat, no. 1: 30–32.
———. 2021. knitr: A General-Purpose Package for Dynamic Report Generation in R.
Xie, Yihui, Christophe Dervieux, and Alison Presmanes Hill. 2021. blogdown: Create Blogs and Websites with R Markdown.
Xie, Yihui, Amber Thomas, and Alison Presmanes Hill. 2021. blogdown: Creating Websites with R Markdown.
Xu, Ya. 2020. “Causal Inference Challenges in Industry: A Perspective from Experiences at LinkedIn.” YouTube, July.
Yeager, David, Jon Krosnick, LinChiat Chang, Harold Javitz, Matthew Levendusky, Alberto Simpser, and Rui Wang. 2011. “Comparing the Accuracy of RDD Telephone Surveys and Internet Surveys Conducted with Probability and Non-Probability Samples.” Public Opinion Quarterly 75 (4): 709–47.
Yoshioka, Alan. 1998. “Use of Randomisation in the Medical Research Council’s Clinical Trial of Streptomycin in Pulmonary Tuberculosis in the 1940s.” BMJ 317 (7167): 1220–23.
Zhang, Susan, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, Shuohui Chen, Christopher Dewan, et al. 2022. “OPT: Open Pre-Trained Transformer Language Models.” arXiv.
Zhu, Hao. 2020. kableExtra: Construct Complex Table with “kable” and Pipe Syntax.
Zimmer, Michael. 2018. “Addressing Conceptual Gaps in Big Data Research Ethics: An Application of Contextual Integrity.” Social Media + Society 4 (2): 1–11.
Zinsser, William. 1976. On Writing Well. New York: HarperCollins.
Zook, Matthew, Solon Barocas, danah boyd, Kate Crawford, Emily Keller, Seeta Peña Gangadharan, Alyssa Goodman, et al. 2017. “Ten Simple Rules for Responsible Big Data Research.” PLoS Computational Biology 13 (3): e1005399.