References
Abadie, Alberto, Susan Athey, Guido Imbens, and Jeffrey Wooldridge.
2017. “When Should You Adjust Standard Errors for
Clustering?” Working Paper 24003. Working Paper Series. National
Bureau of Economic Research. https://doi.org/10.3386/w24003.
Abelson, Harold, and Gerald Jay Sussman. 1996. Structure and
Interpretation of Computer Programs. Massachusetts: The MIT Press.
Abeysooriya, Mandhri, Megan Soria, Mary Sravya Kasu, and Mark Ziemann.
2021. “Gene Name Errors: Lessons Not Learned.” PLOS
Computational Biology 17 (7): 1–13. https://doi.org/10.1371/journal.pcbi.1008984.
Acemoglu, Daron, Simon Johnson, and James Robinson. 2001. “The
Colonial Origins of Comparative Development: An Empirical
Investigation.” American Economic Review 91
(5): 1369–1401. https://doi.org/10.1257/aer.91.5.1369.
Achen, Christopher. 1978. “Measuring Representation.”
American Journal of Political Science 22 (3): 475–510. https://doi.org/10.2307/2110458.
Akerlof, George. 1970. “The Market for ‘Lemons’:
Quality Uncertainty and the Market Mechanism.” The Quarterly
Journal of Economics 84 (3): 488–500. https://doi.org/10.2307/1879431.
Alexander, Monica. 2019a. “Reproducibility in Demographic
Research.” https://www.monicaalexander.com/posts/2019-10-20-reproducibility/.
———. 2019b. “The Concentration and Uniqueness of Baby Names in
Australia and the US,” January. https://www.monicaalexander.com/posts/2019-20-01-babynames/.
———. 2019c. “Analyzing Name Changes After Marriage Using a
Non-Representative Survey,” August. https://www.monicaalexander.com/posts/2019-08-07-mrp/.
———. 2021. “Overcoming Barriers to Sharing Code.”
YouTube, February. https://youtu.be/yvM2C6aZ94k.
Alexander, Monica, and Leontine Alkema. 2021. “A Bayesian Cohort
Component Projection Model to Estimate Adult Populations at the
Subnational Level in Data-Sparse Settings.” https://arxiv.org/abs/2102.06121.
Alexander, Monica, Mathew Kiang, and Magali Barbieri. 2018.
“Trends in Black and White Opioid Mortality in the United States,
1979–2015.” Epidemiology 29 (5): 707–15. https://doi.org/10.1097/EDE.0000000000000858.
Alexander, Rohan, and Monica Alexander. 2021. “The Increased
Effect of Elections and Changing Prime Ministers on Topics Discussed in
the Australian Federal Parliament Between 1901 and 2018.” https://doi.org/10.48550/arXiv.2111.09299.
Alexander, Rohan, and Paul Hodgetts. 2021.
AustralianPoliticians: Provides Datasets About Australian
Politicians. https://CRAN.R-project.org/package=AustralianPoliticians.
Alexander, Rohan, and A Mahfouz. 2021. heapsofpapers: Easily Download Heaps of PDF and CSV
Files. https://CRAN.R-project.org/package=heapsofpapers.
Alexander, Rohan, and Zachary Ward. 2018. “Age at Arrival and
Assimilation During the Age of Mass Migration.” The Journal
of Economic History 78 (3): 904–37. https://doi.org/10.1017/S0022050718000335.
Allaire, JJ, Rich Iannone, Alison Presmanes Hill, and Yihui Xie. 2021.
distill: “R Markdown” Format for
Scientific and Technical Writing. https://rstudio.github.io/distill/.
Allen, Jeff. 2021. plumberDeploy: Plumber
Deployment. https://CRAN.R-project.org/package=plumberDeploy.
Alsan, Marcella, and Amy Finkelstein. 2021. “Beyond Causality:
Additional Benefits of Randomized Controlled Trials for Improving Health
Care Delivery.” The Milbank Quarterly 99 (4): 864–81. https://doi.org/10.1111/1468-0009.12521.
Alsan, Marcella, and Marianne Wanamaker. 2018. “Tuskegee and the
Health of Black Men.” The Quarterly Journal of Economics
133 (1): 407–55. https://doi.org/10.1093/qje/qjx029.
Amaka, Ofunne, and Amber Thomas. 2021. “The Naked Truth: How the
Names of 6,816 Complexion Products Can Reveal Bias in Beauty.”
The Pudding, March. https://pudding.cool/2021/03/foundation-names/.
American Medical Association and New York Academy of Medicine. 1848.
Code of Medical Ethics. Academy of Medicine. https://hdl.handle.net/2027/chi.57108026.
Andersen, Robert, and David Armstrong II. 2021. Presenting
Statistical Results Effectively. London: Sage.
Anderson, Margo, and Stephen Fienberg. 1999. Who Counts? The Politics of Census-Taking in
Contemporary America. Russell Sage Foundation. http://www.jstor.org/stable/10.7758/9781610440059.
Andrews, David, and Agnes Herzberg. 2012. Data: A Collection of
Problems from Many Fields for the Student and Research Worker. New
York: Springer Science & Business Media.
Angelucci, Charles, and Julia Cagé. 2019. “Newspapers in Times of
Low Advertising Revenues.” American Economic Journal:
Microeconomics 11 (3): 319–64. https://doi.org/10.1257/mic.20170306.
Angrist, Joshua, and Jörn-Steffen Pischke. 2010. “The Credibility
Revolution in Empirical Economics: How Better Research Design Is Taking
the Con Out of Econometrics.” Journal of Economic
Perspectives 24 (2): 3–30. https://doi.org/10.1257/jep.24.2.3.
Annas, George. 2003. “HIPAA Regulations: A New Era of
Medical-Record Privacy?” New England Journal of Medicine
348: 1486–90. https://doi.org/10.1056/NEJMlim035027.
Aprameya, Lavanya. 2020. “Improving Duolingo, One Experiment at a
Time.” Duolingo Blog, January. https://blog.duolingo.com/improving-duolingo-one-experiment-at-a-time/.
Arel-Bundock, Vincent. 2021a. modelsummary:
Summary Tables and Plots for Statistical Models and Data: Beautiful,
Customizable, and Publication-Ready. https://CRAN.R-project.org/package=modelsummary.
———. 2021b. WDI: World Development Indicators
and Other World Bank Data. https://CRAN.R-project.org/package=WDI.
Arel-Bundock, Vincent, Ryan Briggs, Hristos Doucouliagos, Marco Mendoza
Aviña, and T. D. Stanley. 2022. “Quantitative Political Science
Research Is Greatly Underpowered.” https://osf.io/bzj9y/.
Armstrong, Zan. 2022. “Stop Aggregating Away the Signal in Your
Data.” The Overflow, March. https://stackoverflow.blog/2022/03/03/stop-aggregating-away-the-signal-in-your-data/.
Arnold, Jeffrey. 2021. ggthemes: Extra Themes,
Scales and Geoms for “ggplot2”. https://CRAN.R-project.org/package=ggthemes.
Asquith, Brian, Brad Hershbein, Tracy Kugler, Shane Reed, Steven
Ruggles, Jonathan Schroeder, Steve Yesiltepe, and David Van Riper. 2022.
“Assessing the Impact of Differential Privacy on Measures of
Population and Racial Residential Segregation.” Harvard Data
Science Review, no. Special Issue 2. https://doi.org/10.1162/99608f92.5cd8024e.
Athey, Susan, and Guido Imbens. 2017a. “The Econometrics of
Randomized Experiments.” In Handbook of Field
Experiments, 73–140. Elsevier. https://doi.org/10.1016/bs.hefe.2016.10.003.
———. 2017b. “The State of Applied Econometrics: Causality and
Policy Evaluation.” Journal of Economic Perspectives 31
(2): 3–32. https://doi.org/10.1257/jep.31.2.3.
Athey, Susan, Guido Imbens, Jonas Metzger, and Evan Munro. 2021.
“Using Wasserstein Generative Adversarial Networks for the Design
of Monte Carlo Simulations.” Journal of Econometrics. https://doi.org/10.1016/j.jeconom.2020.09.013.
Au, Randy. 2020. “Data Cleaning IS Analysis, Not Grunt
Work,” September. https://counting.substack.com/p/data-cleaning-is-analysis-not-grunt.
———. 2022. “Celebrating Everyone Counting Things,”
February. https://counting.substack.com/p/celebrating-everyone-counting-things.
Bache, Stefan Milton, and Hadley Wickham. 2022. magrittr: A Forward-Pipe Operator for R. https://CRAN.R-project.org/package=magrittr.
Backus, John. 1981. “The History of FORTRAN
I, II, and III.” In History of Programming
Languages, edited by Richard Wexelblat, 25–74. Academic Press.
Bailey, Rosemary. 2008. Design of Comparative Experiments.
Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511611483.
Baker, Reg, Michael Brick, Nancy Bates, Mike Battaglia, Mick Couper,
Jill Dever, Krista Gile, and Roger Tourangeau. 2013. “Summary Report of the AAPOR Task Force on Non-Probability
Sampling.” Journal of Survey Statistics and
Methodology 1 (2): 90–143. https://doi.org/10.1093/jssam/smt008.
Bandy, Jack, and Nicholas Vincent. 2021. “Addressing
‘Documentation Debt’ in Machine Learning Research: A
Retrospective Datasheet for BookCorpus.” arXiv. https://doi.org/10.48550/ARXIV.2105.05241.
Banerjee, Abhijit, and Esther Duflo. 2011. Poor Economics: A Radical
Rethinking of the Way to Fight Global Poverty. New York:
PublicAffairs.
Banerjee, Abhijit, Esther Duflo, Rachel Glennerster, and Cynthia Kinnan.
2015. “The Miracle of Microfinance? Evidence from a Randomized
Evaluation.” American Economic Journal: Applied
Economics 7 (1): 22–53. https://doi.org/10.1257/app.20130533.
Barba, Lorena. 2018. “Terminologies for Reproducible
Research.” https://arxiv.org/abs/1802.03311.
Barrett, Malcolm. 2021a. Data Science as an Atomic Habit. https://malco.io/2021/01/04/data-science-as-an-atomic-habit/.
———. 2021b. ggdag: Analyze and Create Elegant
Directed Acyclic Graphs. https://CRAN.R-project.org/package=ggdag.
Barron, Alexander, Jenny Huang, Rebecca Spang, and Simon DeDeo. 2018.
“Individuals, Institutions, and Innovation in the Debates of the
French Revolution.” Proceedings of the National Academy of
Sciences 115 (18): 4607–12. https://doi.org/10.1073/pnas.1717729115.
Bates, Douglas, Martin Mächler, Ben Bolker, and Steve Walker. 2015.
“Fitting Linear Mixed-Effects Models Using lme4.” Journal of Statistical
Software 67 (1): 1–48. https://doi.org/10.18637/jss.v067.i01.
Baumer, Benjamin, Daniel Kaplan, and Nicholas Horton. 2021.
Modern Data Science with R. 2nd ed. Chapman &
Hall/CRC. https://mdsr-book.github.io/mdsr2e/.
Baumgartner, Jason, Savvas Zannettou, Brian Keegan, Megan Squire, and
Jeremy Blackburn. 2020. “The Pushshift Reddit Dataset.”
arXiv. https://doi.org/10.48550/arxiv.2001.08435.
Baumgartner, Peter. 2021. “Ways I Use Testing
as a Data Scientist,” December. https://www.peterbaumgartner.com/blog/testing-for-data-science/.
Beaumont, Jean-Francois. 2020. “Are Probability Surveys Bound to
Disappear for the Production of Official Statistics?” Survey
Methodology 46 (1): 1–29.
Beauregard, Katrine, and Jill Sheppard. 2021. “Antiwomen but
Proquota: Disaggregating Sexism and Support for Gender Quota
Policies.” Political Psychology 42 (2): 219–37. https://doi.org/10.1111/pops.12696.
Becker, Richard, Allan Wilks, Ray Brownrigg, Thomas Minka, and Alex
Deckmyn. 2021. maps: Draw Geographical
Maps. https://CRAN.R-project.org/package=maps.
Bender, Emily, Timnit Gebru, Angelina McMillan-Major, and Shmargaret
Shmitchell. 2021. “On the Dangers of Stochastic Parrots.”
In Proceedings of the 2021 ACM Conference on Fairness,
Accountability, and Transparency. ACM. https://doi.org/10.1145/3442188.3445922.
Bengtsson, Henrik. 2021. “A Unifying Framework for Parallel and
Distributed Processing in R Using Futures.” The R
Journal 13 (2): 208–27. https://doi.org/10.32614/RJ-2021-048.
Bensinger, Greg. 2020. “Google Redraws the Borders on Maps
Depending on Who’s Looking.” Washington Post, February.
https://www.washingtonpost.com/technology/2020/02/14/google-maps-political-borders/.
Berkson, Joseph. 1946. “Limitations of the Application of Fourfold
Table Analysis to Hospital Data.” Biometrics Bulletin 2
(3): 47–53. https://doi.org/10.2307/3002000.
Berners-Lee, Timothy. 1989. “Information Management: A
Proposal.” https://www.w3.org/History/1989/proposal.html.
Berry, Donald. 1989. “Comment: Ethics and ECMO.”
Statistical Science 4 (4): 306–10. https://www.jstor.org/stable/2245830.
Bertrand, Marianne, and Sendhil Mullainathan. 2004. “Are Emily and
Greg More Employable Than Lakisha and Jamal? A Field Experiment on Labor
Market Discrimination.” American Economic Review 94 (4):
991–1013. https://doi.org/10.1257/0002828042002561.
Bethlehem, R. A. I., J. Seidlitz, S. R. White, J. W. Vogel, K. M.
Anderson, C. Adamson, S. Adler, et al. 2022. “Brain Charts for the
Human Lifespan.” Nature, April. https://doi.org/10.1038/s41586-022-04554-y.
Bickel, Peter, Eugene Hammel, and William O’Connell. 1975. “Sex
Bias in Graduate Admissions: Data from Berkeley: Measuring Bias Is
Harder Than Is Usually Assumed, and the Evidence Is Sometimes Contrary
to Expectation.” Science 187 (4175): 398–404. https://doi.org/10.1126/science.187.4175.398.
Biderman, Stella, Kieran Bicheno, and Leo Gao. 2022. “Datasheet
for the Pile.” https://arxiv.org/abs/2201.07311.
Birkmeyer, John, Jonathan Finks, Amanda O’Reilly, Mary Oerline, Arthur
Carlin, Andre Nunn, Justin Dimick, Mousumi Banerjee, and Nancy
Birkmeyer. 2013. “Surgical Skill and Complication Rates After
Bariatric Surgery.” New England Journal of Medicine 369
(15): 1434–42. https://doi.org/10.1056/nejmsa1300625.
Blair, Ed, Seymour Sudman, Norman M Bradburn, and Carol Stocking. 1977.
“How to Ask Questions about Drinking and Sex: Response Effects in
Measuring Consumer Behavior.” Journal of Marketing
Research 14 (3): 316–21. https://doi.org/10.2307/3150769.
Blair, Graeme, Jasper Cooper, Alexander Coppock, and Macartan Humphreys.
2019. “Declaring and Diagnosing Research Designs.”
American Political Science Review 113: 838–59. https://doi.org/10.1017/S0003055419000194.
Blair, Graeme, Jasper Cooper, Alexander Coppock, Macartan Humphreys, and
Luke Sonnet. 2021. estimatr: Fast Estimators
for Design-Based Inference. https://CRAN.R-project.org/package=estimatr.
Blair, James. 2019. Democratizing R with
Plumber APIs. https://www.rstudio.com/resources/rstudioconf-2019/democratizing-r-with-plumber-apis/.
Bland, Martin, and Douglas Altman. 1986. “Statistical Methods for
Assessing Agreement Between Two Methods of Clinical Measurement.”
The Lancet 327 (8476): 307–10. https://doi.org/10.1016/S0140-6736(86)90837-8.
Blei, David. 2012. “Probabilistic Topic Models.”
Communications of the ACM 55 (4): 77–84. https://doi.org/10.1145/2133806.2133826.
Blei, David, and John Lafferty. 2009. “Topic Models.” In
Text Mining, edited by Ashok Srivastava and Mehran Sahami,
101–24. Chapman & Hall/CRC. https://doi.org/10.1201/9781420059458.
Blei, David, Andrew Ng, and Michael Jordan. 2003. “Latent
Dirichlet Allocation.” Journal of Machine Learning
Research 3 (Jan): 993–1022.
Bloom, Howard, Andrew Bell, and Kayla Reiman. 2020. “Using Data
from Randomized Trials to Assess the Likely Generalizability of
Educational Treatment-Effect Estimates from Regression Discontinuity
Designs.” Journal of Research on Educational
Effectiveness 13 (3): 488–517. https://doi.org/10.1080/19345747.2019.1634169.
Boland, Philip. 1984. “A Biographical Glimpse of William Sealy
Gosset.” The American Statistician 38 (3): 179–83. https://doi.org/10.2307/2683648.
Bolton, Ruth, and Randall Chapman. 1986. “Searching for Positive
Returns at the Track.” Management Science 32 (August):
1040–60. https://doi.org/10.1287/mnsc.32.8.1040.
Borghi, John, and Ana Van Gulick. 2022. “Promoting Open Science
Through Research Data Management.” Harvard Data Science
Review 4 (3). https://doi.org/10.1162/99608f92.9497f68e.
Borkin, Michelle, Zoya Bylinskii, Nam Wook Kim, Constance May
Bainbridge, Chelsea Yeh, Daniel Borkin, Hanspeter Pfister, and Aude
Oliva. 2015. “Beyond Memorability: Visualization Recognition and
Recall.” IEEE Transactions on Visualization and Computer
Graphics 22 (1): 519–28. https://doi.org/10.1109/TVCG.2015.2467732.
Bouguen, Adrien, Yue Huang, Michael Kremer, and Edward Miguel. 2019.
“Using Randomized Controlled Trials to Estimate Long-Run Impacts
in Development Economics.” Annual Review of Economics 11
(1): 523–61. https://doi.org/10.1146/annurev-economics-080218-030333.
Bouie, Jamelle. 2022. “We Still Can’t See American Slavery for
What It Was.” New York Times, January. https://www.nytimes.com/2022/01/28/opinion/slavery-voyages-data-sets.html.
Bowen, Claire McKay. 2022. Protecting Your
Privacy in a Data-Driven World. Chapman & Hall/CRC.
Bowers, Jake, and Maarten Voors. 2016. “How to Improve Your
Relationship with Your Future Self.” Revista de Ciencia
Política 36 (3): 829–48. https://doi.org/10.4067/S0718-090X2016000300011.
Bowley, Arthur Lyon. 1901. Elements of Statistics. London: P.
S. King.
———. 1913. “Working-Class Households in Reading.”
Journal of the Royal Statistical Society 76 (7): 672–701. https://doi.org/10.1111/j.2397-2335.1913.tb03071.x.
Boykis, Vicki. 2019. “A Deep Dive on Python Type Hints,”
July. https://vickiboykis.com/2019/07/08/a-deep-dive-on-python-type-hints/.
———. 2022. “Duo, the Push, and the Bandits.” Normcore
Tech, May. https://vicki.substack.com/p/duo-the-push-and-the-bandits.
Bradley, Valerie, Shiro Kuriwaki, Michael Isakov, Dino Sejdinovic,
Xiao-Li Meng, and Seth Flaxman. 2021. “Unrepresentative Big
Surveys Significantly Overestimated US Vaccine
Uptake.” Nature 600 (7890): 695–700. https://doi.org/10.1038/s41586-021-04198-4.
Braginsky, Mika. 2020. wordbankr: Accessing the
Wordbank Database. https://CRAN.R-project.org/package=wordbankr.
Brandt, Allan. 1978. “Racism and Research: The Case of the
Tuskegee Syphilis Study.” Hastings Center Report, 21–29.
https://dash.harvard.edu/bitstream/handle/1/3372911/Brandt%5FRacism.pdf?sequence=1&isAllowed=y.
Brewer, Ken. 2013. “Three Controversies in the History of Survey
Sampling.” Survey Methodology 39 (2): 249–63.
Briggs, Ryan. 2021. “Why Does Aid Not Target the Poorest?”
International Studies Quarterly 65 (3): 739–52. https://doi.org/10.1093/isq/sqab035.
Brokowski, Carolyn, and Mazhar Adli. 2019. “CRISPR Ethics: Moral
Considerations for Applications of a Powerful Tool.” Journal
of Molecular Biology 431 (1): 88–101. https://doi.org/10.1016/j.jmb.2018.05.044.
Bronner, Lenny, Emily Liu, and Jeremy Bowers. 2022. “What the
Washington Post Elections Engineering Team Had to Learn about Election
Data.” Washington Post, April. https://washpost.engineering/what-the-washington-post-elections-engineering-team-had-to-learn-about-election-data-a41603daf9ca.
Brontë, Charlotte. 1847. Jane Eyre. https://www.gutenberg.org/files/1260/1260-h/1260-h.htm.
———. 1857. The Professor. https://www.gutenberg.org/files/1028/1028-h/1028-h.htm.
Brook, Robert, John Ware, William Rogers, Emmett Keeler, Allyson Ross
Davies, Cathy Sherbourne, George Goldberg, Kathleen Lohr, Patricia Camp,
and Joseph Newhouse. 1984. “The Effect of Coinsurance on the
Health of Adults: Results from the RAND Health Insurance
Experiment.” https://www.rand.org/pubs/reports/R3055.html.
Brown, Zack. 2018. “A Git Origin Story.” Linux
Journal, July. https://www.linuxjournal.com/content/git-origin-story.
Bryan, Jenny. 2015. “Naming Things.” Reproducible
Science Workshop, May. https://speakerdeck.com/jennybc/how-to-name-files.
———. 2018a. “Excuse Me, Do You Have a Moment to Talk about Version
Control?” The American Statistician 72 (1): 20–27. https://doi.org/10.1080/00031305.2017.1399928.
———. 2018b. “Code Smells and Feels.” YouTube,
July. https://youtu.be/7oyiPBjLAWY.
———. 2020. Happy Git and GitHub for the
useR. https://happygitwithr.com.
Bryan, Jenny, and Jim Hester. 2020. What They
Forgot to Teach You About R. https://rstats.wtf/index.html.
Bryan, Jenny, Jim Hester, David Robinson, and Hadley Wickham. 2019.
reprex: Prepare Reproducible Example Code via
the Clipboard. https://CRAN.R-project.org/package=reprex.
Bryan, Jenny, and Hadley Wickham. 2021. gh: “GitHub” “API”. https://CRAN.R-project.org/package=gh.
Buckheit, Jonathan, and David Donoho. 1995. “Wavelab and
Reproducible Research.” In Wavelets and Statistics,
55–81. Springer. https://doi.org/10.1007/978-1-4612-2544-7_5.
Bueno de Mesquita, Ethan, and Anthony Fowler. 2021. Thinking Clearly
with Data: A Guide to Quantitative Reasoning and Analysis. New
Jersey: Princeton University Press.
Buhr, Ray. 2017. Using R as a Production
Machine Learning Language (Part I). https://raybuhr.github.io/blog/posts/making-predictions-over-http/.
Buja, Andreas, Dianne Cook, and Deborah F Swayne. 1996.
“Interactive High-Dimensional Data Visualization.”
Journal of Computational and Graphical Statistics 5 (1): 78–99.
https://doi.org/10.2307/1390754.
Buolamwini, Joy, and Timnit Gebru. 2018. “Gender Shades:
Intersectional Accuracy Disparities in Commercial Gender
Classification.” In Conference on Fairness, Accountability
and Transparency, 77–91. PMLR.
Burton, Jason, Nicole Cruz, and Ulrike Hahn. 2021. “Reconsidering
Evidence of Moral Contagion in Online Social Networks.”
Nature Human Behaviour 5 (12): 1629–35. https://doi.org/10.1038/s41562-021-01133-5.
Bush, Vannevar. 1945. “As We May Think.” The Atlantic
Monthly, July. https://www.theatlantic.com/magazine/archive/1945/07/as-we-may-think/303881/.
Byrd, James Brian, Anna Greene, Deepashree Venkatesh Prasad, Xiaoqian
Jiang, and Casey Greene. 2020. “Responsible, Practical Genomic
Data Sharing That Accelerates Research.” Nature Reviews
Genetics 21 (10): 615–29. https://doi.org/10.1038/s41576-020-0257-5.
Cahill, Niamh, Michelle Weinberger, and Leontine Alkema. 2020.
“What Increase in Modern Contraceptive Use Is Needed in FP2020
Countries to Reach 75% Demand Satisfied by 2030? An Assessment Using the
Accelerated Transition Method and Family Planning Estimation
Model.” Gates Open Research 4. https://doi.org/10.12688/gatesopenres.13125.1.
Calonico, Sebastian, Matias Cattaneo, Max Farrell, and Rocio Titiunik.
2021. rdrobust: Robust Data-Driven Statistical
Inference in Regression-Discontinuity Designs. https://CRAN.R-project.org/package=rdrobust.
Cambon, Jesse, and Christopher Belanger. 2021. “tidygeocoder: Geocoding Made Easy.” Zenodo.
https://doi.org/10.5281/zenodo.3981510.
Cardoso, Tom. 2020. “Bias Behind Bars: A
Globe Investigation Finds a Prison System Stacked Against Black and
Indigenous Inmates.” The Globe and Mail, October.
https://www.theglobeandmail.com/canada/article-investigation-racial-bias-in-canadian-prison-risk-assessments/.
Carle, Eric. 1969. The Very Hungry Caterpillar. World
Publishing Company.
Carleton, Chris. 2021. wccarleton/conflict-europe: Acce (version
v1.0.0). Zenodo. https://doi.org/10.5281/zenodo.4550688.
Carleton, Chris, Dave Campbell, and Mark Collard. 2021. “A
Reassessment of the Impact of Temperature Change on European Conflict
During the Second Millennium CE Using a Bespoke Bayesian Time-Series
Model.” Climatic Change 165 (1): 1–16. https://doi.org/10.1007/s10584-021-03022-2.
Caro, Robert. 2019. Working. 1st ed. New York: Knopf.
Carroll, Lewis. 1865. Alice’s Adventures in Wonderland.
Macmillan. https://www.gutenberg.org/files/11/11-h/11-h.htm.
———. 1871. Through the Looking-Glass. Macmillan. https://www.gutenberg.org/files/12/12-h/12-h.htm.
Chamberlain, Scott, Hadley Wickham, and Winston Chang. 2021.
analogsea: Interface to “Digital Ocean”. https://github.com/sckott/analogsea.
Chambliss, Daniel. 1989. “The Mundanity of Excellence: An
Ethnographic Report on Stratification and Olympic Swimmers.”
Sociological Theory 7 (1): 70–86. https://doi.org/10.2307/202063.
Chang, Winston, Joe Cheng, JJ Allaire, Carson Sievert, Barret Schloerke,
Yihui Xie, Jeff Allen, Jonathan McPherson, Alan Dipert, and Barbara
Borges. 2021. shiny: Web Application Framework
for R. https://CRAN.R-project.org/package=shiny.
Chase, William. 2020. “The Glamour of Graphics.”
RStudio Conference, January. https://www.rstudio.com/resources/rstudioconf-2020/the-glamour-of-graphics/.
Chawla, Dalmeet Singh. 2020. “Critiqued Coronavirus Simulation
Gets Thumbs up from Code-Checking Efforts.” Nature 582:
323–24. https://doi.org/10.1038/d41586-020-01685-y.
Chellel, Kit. 2018. “The Gambler Who Cracked the Horse-Racing
Code.” Bloomberg Businessweek, May. https://www.bloomberg.com/news/features/2018-05-03/the-gambler-who-cracked-the-horse-racing-code.
Chen, Heng, Marie-Hélène Felt, and Christopher Henry. 2018. “2017
Methods-of-Payment Survey: Sample Calibration and Variance
Estimation.” Bank of Canada. https://doi.org/10.34989/tr-114.
Chen, Wei, Xilu Chen, Chang-Tai Hsieh, and Zheng Song. 2019. “A
Forensic Examination of China’s National Accounts.” Brookings
Papers on Economic Activity, 77–127. https://www.jstor.org/stable/26798817.
Cheng, Joe, Bhaskar Karambelkar, and Yihui Xie. 2021. leaflet: Create Interactive Web Maps with the JavaScript
“Leaflet” Library. https://CRAN.R-project.org/package=leaflet.
Cheriet, Mohamed, Nawwaf Kharma, Cheng-Lin Liu, and Ching Suen. 2007.
Character Recognition Systems: A Guide for Students and
Practitioners. Wiley.
Chouldechova, Alexandra, Diana Benavides-Prado, Oleksandr Fialko, and
Rhema Vaithianathan. 2018. “A Case Study of Algorithm-Assisted
Decision Making in Child Maltreatment Hotline Screening
Decisions.” In Proceedings of the 1st Conference on Fairness,
Accountability and Transparency, edited by Sorelle Friedler and
Christo Wilson, 81:134–48. Proceedings of Machine Learning Research.
PMLR. https://proceedings.mlr.press/v81/chouldechova18a.html.
Chrétien, Jean. 2007. My Years as Prime Minister. 1st ed.
Toronto: Knopf Canada.
Christensen, Garret, Allan Dafoe, Edward Miguel, Don Moore, and Andrew
Rose. 2019. “A Study of the Impact of Data Sharing on Article
Citations Using Journal Policies as a Natural Experiment.”
PLoS One 14 (12): e0225883. https://doi.org/10.1371/journal.pone.0225883.
Christensen, Garret, Jeremy Freese, and Edward Miguel. 2019.
Transparent and Reproducible Social Science Research.
California: University of California Press.
Christian, Brian. 2012. “The A/B Test: Inside
the Technology That’s Changing the Rules of Business.”
Wired, April. https://www.wired.com/2012/04/ff-abtesting/.
Churchill, Winston. 1956. A History of the English-Speaking
Peoples. Cassell.
Cirone, Alexandra, and Arthur Spirling. 2021. “Turning History
into Data: Data Collection, Measurement, and Inference in HPE.”
Journal of Historical Political Economy 1 (1): 127–54. https://doi.org/10.1561/115.00000005.
City of Toronto. 2021. 2021 Street Needs Assessment. https://www.toronto.ca/city-government/data-research-maps/research-reports/housing-and-homelessness-research-and-reports/.
Clarke, Erik, and Scott Sherrill-Mix. 2017. ggbeeswarm: Categorical Scatter (Violin Point)
Plots. https://CRAN.R-project.org/package=ggbeeswarm.
Cleveland, William. 1994. The Elements of Graphing Data. 2nd
ed. Hobart Press.
Cohen, Glenn, and Michelle Mello. 2018. “HIPAA and
Protecting Health Information in the 21st Century.”
JAMA 320 (3): 231. https://doi.org/10.1001/jama.2018.5630.
Cohn, Alain. 2019. “Data and Code for: Civic
Honesty Around the Globe.” Harvard Dataverse. https://doi.org/10.7910/dvn/ykbodn.
Cohn, Alain, Michel André Maréchal, David Tannenbaum, and Christian
Lukas Zünd. 2019a. “Civic Honesty Around the Globe.”
Science 365 (6448): 70–73. https://doi.org/10.1126/science.aau8712.
———. 2019b. “Supplementary Materials for: Civic Honesty Around the
Globe.” Science 365 (6448): 70–73.
Cohn, Nate. 2016. “We Gave Four Good Pollsters the Same Raw Data.
They Had Four Different Results.” New York Times,
September. https://www.nytimes.com/interactive/2016/09/20/upshot/the-error-the-polling-world-rarely-talks-about.html.
Collins, Annie, and Rohan Alexander. 2022. “Reproducibility of
COVID-19 Pre-Prints.” Scientometrics. https://doi.org/10.1007/s11192-022-04418-2.
Colombo, Tommaso, Holger Fröning, Pedro Javier García, and Wainer
Vandelli. 2016. “Optimizing the Data-Collection Time of a
Large-Scale Data-Acquisition System Through a Simulation
Framework.” The Journal of Supercomputing 72 (12):
4546–72. https://doi.org/10.1007/s11227-016-1764-1.
Cook, Dianne, Nancy Reid, and Emi Tanaka. 2021. “The Foundation Is
Available for Thinking about Data Visualization Inferentially.”
Harvard Data Science Review 3 (3). https://doi.org/10.1162/99608f92.8453435d.
Cooley, David. 2020. mapdeck: Interactive Maps
Using “Mapbox GL JS” and
“Deck.gl”. https://CRAN.R-project.org/package=mapdeck.
Council of the European Union. 2016. “General Data Protection
Regulation 2016/679.” https://eur-lex.europa.eu/eli/reg/2016/679/oj.
Cox, David. 2018. “In Gentle Praise of Significance Tests.”
YouTube, October. https://youtu.be/txLj%5FP9UlCQ.
Cox, David, and Nancy Reid. 1987. “Parameter Orthogonality and
Approximate Conditional Inference.” Journal of the Royal
Statistical Society: Series B (Methodological) 49 (1): 1–18. https://doi.org/10.1111/j.2517-6161.1987.tb01422.x.
Cox, Murray. 2021. “Inside Airbnb—Toronto
Data.” http://insideairbnb.com/get-the-data.html.
Craiu, Radu. 2019. “The Hiring Gambit: In Search of the Twofer
Data Scientist.” Harvard Data Science Review 1 (1). https://doi.org/10.1162/99608f92.440445cb.
Cramer, Jan Salomon. 2003. “The Origins of Logistic
Regression.” SSRN Electronic Journal. https://doi.org/10.2139/ssrn.360300.
Crawford, Kate. 2021. Atlas of AI.
New Haven: Yale University Press.
Crosby, Alfred. 1997. The Measure of Reality: Quantification in
Western Europe, 1250–1600. Cambridge: Cambridge University Press.
Csárdi, Gábor. 2020. gitcreds: Query
“git” Credentials from “R”. https://CRAN.R-project.org/package=gitcreds.
Cummins, Neil. 2022. “The Hidden Wealth of English Dynasties,
1892–2016.” The Economic History Review 75 (3): 667–702.
https://doi.org/10.1111/ehr.13120.
Cunningham, Scott. 2021. Causal Inference: The Mixtape. New
Haven: Yale University Press.
D’Ignazio, Catherine, and Lauren F Klein. 2020. Data Feminism.
Massachusetts: The MIT Press.
Dagan, Noa, Noam Barda, Eldad Kepten, Oren Miron, Shay Perchik, Mark
Katz, Miguel Hernán, Marc Lipsitch, Ben Reis, and Ran Balicer. 2021.
“BNT162b2 mRNA Covid-19 Vaccine in a Nationwide Mass Vaccination
Setting.” New England Journal of Medicine 384 (15):
1412–23. https://doi.org/10.1056/NEJMoa2101765.
Darling, William. 2011. “A Theoretical and Practical
Implementation Tutorial on Topic Modeling and Gibbs Sampling.” In
Proceedings of the 49th Annual Meeting of the Association for
Computational Linguistics: Human Language Technologies, 642–47.
Davis, Darren. 1997. “Nonrandom Measurement Error and Race of
Interviewer Effects Among African Americans.” The Public
Opinion Quarterly 61 (1): 183–207. https://doi.org/10.1086/297792.
De Jonge, Edwin, and Mark Van Der Loo. 2013. An Introduction to Data
Cleaning with R. Statistics Netherlands Heerlen. https://cran.r-project.org/doc/contrib/de%5FJonge+van%5Fder%5FLoo-Introduction%5Fto%5Fdata%5Fcleaning%5Fwith%5FR.pdf.
Dean, Natalie. 2022. “Tracking COVID-19 Infections:
Time for Change.” Nature 602 (7896): 185. https://doi.org/10.1038/d41586-022-00336-8.
Deaton, Angus. 2010. “Instruments, Randomization, and Learning
about Development.” Journal of Economic Literature 48
(2): 424–55. https://doi.org/10.1257/jel.48.2.424.
DeWitt, Helen. 2000. The Last Samurai. 1st ed. United States:
Talk Miramax Books.
Dillman, Don, Jolene Smyth, and Leah Christian. 2014. Internet,
Phone, Mail, and Mixed-Mode Surveys: The Tailored Design Method.
4th ed. Wiley.
Dolatsara, Hamidreza Ahady, Ying-Ju Chen, Robert Leonard, Fadel Megahed,
and Allison Jones-Farmer. 2021. “Explaining Predictive Model
Performance: An Experimental Study of Data Preparation and Model
Choice.” Big Data, October. https://doi.org/10.1089/big.2021.0067.
Doll, Richard, and Bradford Hill. 1950. “Smoking and Carcinoma of
the Lung.” British Medical Journal 2 (4682): 739–48. https://doi.org/10.1136/bmj.2.4682.739.
Druckman, James, and Donald Green. 2021. “A New Era of
Experimental Political Science.” In Advances in Experimental
Political Science, 1–16. Cambridge University Press. https://doi.org/10.1017/9781108777919.002.
Duflo, Esther. 2020. “Field Experiments and the Practice of
Policy.” American Economic Review 110 (7): 1952–73. https://doi.org/10.1257/aer.110.7.1952.
Dwork, Cynthia, Frank McSherry, Kobbi Nissim, and Adam Smith. 2006.
“Calibrating Noise to Sensitivity in Private Data
Analysis.” In Theory of Cryptography Conference, 265–84.
Springer.
Dwork, Cynthia, and Aaron Roth. 2013. “The Algorithmic Foundations
of Differential Privacy.” Foundations and Trends
in Theoretical Computer Science 9 (3–4): 211–407. https://doi.org/10.1561/0400000042.
Edgeworth, Francis Ysidro. 1885. “Methods of Statistics.”
Journal of the Statistical Society of London, 181–217. https://www.jstor.org/stable/25163974.
Efron, Bradley, and Carl Morris. 1977. “Stein’s Paradox in
Statistics.” Scientific American 236 (May):
119–27. https://doi.org/10.1038/scientificamerican0577-119.
Eghbal, Nadia. 2020. Working in Public: The Making and Maintenance
of Open Source Software. California: Stripe Press.
Eisenstein, Michael. 2022. “Need Web Data? Here’s How to Harvest
Them.” Nature 607: 200–201. https://doi.org/10.1038/d41586-022-01830-9.
Elliott, Michael, Brady West, Xinyu Zhang, and Stephanie Coffey. 2022.
“The Anchoring Method: Estimation of Interviewer Effects in the
Absence of Interpenetrated Sample Assignment.” Survey
Methodology 48 (1): 25–48. http://www.statcan.gc.ca/pub/12-001-x/2022001/article/00005-eng.htm.
Elson, Malte. n.d. “Question Wording and Item Formulation.”
https://doi.org/10.31234/osf.io/e4ktc.
Enns, Peter, and Jake Rothschild. 2022. “Do You Know Where Your
Survey Data Come From?” May. https://medium.com/3streams/surveys-3ec95995dde2.
Farrugia, Patricia, Bradley Petrisor, Forough Farrokhyar, and Mohit
Bhandari. 2010. “Research Questions, Hypotheses and
Objectives.” Canadian Journal of Surgery 53 (4): 278.
Finkelstein, Amy, Sarah Taubman, Bill Wright, Mira Bernstein, Jonathan
Gruber, Joseph Newhouse, Heidi Allen, Katherine Baicker, and Oregon
Health Study Group. 2012. “The Oregon Health Insurance Experiment:
Evidence from the First Year.” The Quarterly Journal of
Economics 127 (3): 1057–1106. https://doi.org/10.1093/qje/qjs020.
Firke, Sam. 2020. janitor: Simple Tools for
Examining and Cleaning Dirty Data. https://CRAN.R-project.org/package=janitor.
Fisher, Ronald. 1926. “The Arrangement of
Field Experiments,” 503–15. https://doi.org/10.23637/rothamsted.8v61q.
———. 1928. Statistical Methods for Research Workers. 2nd ed.
London: Oliver & Boyd.
———. 1949. The Design of Experiments. 5th ed. London: Oliver &
Boyd.
Fiske, Susan, and Shiro Kuriwaki. 2021. “Words to the Wise on
Writing Scientific Papers,” November. https://doi.org/10.31234/osf.io/n32qw.
Fitts, Alexis Sobel. 2014. “The King of Content: How Upworthy Aims
to Alter the Web, and Could End up Altering the World.”
Columbia Journalism Review 53: 34–38. https://archives.cjr.org/feature/the_king_of_content.php.
Flake, Jessica, and Eiko Fried. 2020. “Measurement Schmeasurement:
Questionable Measurement Practices and How to Avoid Them.”
Advances in Methods and Practices in Psychological Science 3
(4): 456–65. https://doi.org/10.1177/2515245920952393.
Flynn, Michael. 2021. troopdata: Tools for
Analyzing Cross-National Military Deployment and Basing
Data. https://CRAN.R-project.org/package=troopdata.
Forster, Edward Morgan. 1927. Aspects of the Novel. London:
Edward Arnold.
Foster, Gordon. 1968. “Computers, Statistics and Planning: Systems
or Chaos?” Geary Lecture. https://www.esri.ie/system/files/media/file-uploads/2016-03/GLS2.pdf.
Fourcade, Marion, and Kieran Healy. 2017. “Seeing Like a
Market.” Socio-Economic Review 15 (1): 9–29. https://doi.org/10.1093/ser/mww033.
Fowler, Martin, and Kent Beck. 2018. Refactoring: Improving the
Design of Existing Code. 2nd ed. New York: Addison-Wesley
Professional.
Fox, John, and Robert Andersen. 2006. “Effect Displays for
Multinomial and Proportional-Odds Logit Models.” Sociological
Methodology 36 (1): 225–55.
Franconeri, Steven, Lace Padilla, Priti Shah, Jeffrey Zacks, and Jessica
Hullman. 2021. “The Science of Visual Data Communication: What
Works.” Psychological Science in the Public Interest 22
(3): 110–61. https://doi.org/10.1177/15291006211051956.
Frandell, Ashlee, Mary Feeney, Timothy Johnson, Eric Welch, Lesley
Michalegko, and Heyjie Jung. 2021. “The Effects of Electronic
Alert Letters for Internet Surveys of Academic Scientists.”
Scientometrics 126 (8): 7167–81. https://doi.org/10.1007/s11192-021-04029-3.
Franklin, Laura. 2005. “Exploratory Experiments.”
Philosophy of Science 72 (5): 888–99. https://doi.org/10.1086/508117.
Fried, Eiko, Jessica Flake, and Donald Robinaugh. 2022.
“Revisiting the Theoretical and Methodological Foundations of
Depression Measurement.” Nature Reviews Psychology,
April. https://doi.org/10.1038/s44159-022-00050-2.
Friedman, Jerome, Robert Tibshirani, and Trevor Hastie. 2009. The
Elements of Statistical Learning. 2nd ed. Springer. https://hastie.su.domains/ElemStatLearn/.
Friendly, Michael, and Howard Wainer. 2021. A History of Data
Visualization and Graphic Communication. 1st ed. Massachusetts:
Harvard University Press.
Fry, Hannah. 2020. “Big Tech Is Testing You.” The New
Yorker, February, 61–65. https://www.newyorker.com/magazine/2020/03/02/big-tech-is-testing-you.
Fuller, Mark, and James Mosher. 1987. “Raptor Survey
Techniques.” In Raptor Management Techniques Manual,
edited by Beth Pendleton, Brian Millsap, Keith Cline, and David Bird,
37–65. National Wildlife Federation. https://www.sandiegocounty.gov/content/dam/sdc/pds/ceqa/JVR/AdminRecord/IncorporatedByReference/Appendices/Appendix-D---Biological-Resources-Report/Fuller%20and%20Mosher%201987.pdf.
Funkhouser, Gray. 1937. “Historical Development of the Graphical
Representation of Statistical Data.” Osiris 3: 269–404.
https://doi.org/10.1086/368480.
Gagolewski, Marek. 2020. R Package Stringi: Character String
Processing Facilities. http://www.gagolewski.com/software/stringi/.
Garfinkel, Irwin, Lee Rainwater, and Timothy Smeeding. 2006. “A
Re-Examination of Welfare States and Inequality in Rich Nations: How
in-Kind Transfers and Indirect Taxes Change the Story.”
Journal of Policy Analysis and Management: The Journal of the
Association for Public Policy Analysis and Management 25 (4):
897–919.
Garnier, Simon, Noam Ross, Robert Rudis, Antônio Camargo, Marco Sciaini,
and Cédric Scherer. 2021. viridis -
Colorblind-Friendly Color Maps for R. https://doi.org/10.5281/zenodo.4679424.
Gavras, Konstantin, Jan Karem Höhne, Annelies Blom, and Harald Schoen.
2022. “Innovating the Collection of Open-Ended Answers: The
Linguistic and Content Characteristics of Written and Oral Answers to
Political Attitude Questions.” Journal of the Royal
Statistical Society: Series A (Statistics in Society) 185 (3):
872–90. https://doi.org/10.1111/rssa.12807.
Gebru, Timnit, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman
Vaughan, Hanna Wallach, Hal Daumé III, and Kate Crawford. 2021.
“Datasheets for Datasets.” Communications of the
ACM 64 (12): 86–92. https://doi.org/10.1145/3458723.
Gelfand, Sharla. 2019. Crying Sephora. https://sharla.party/post/crying-sephora/.
———. 2020. opendatatoronto: Access the City of
Toronto Open Data Portal. https://CRAN.R-project.org/package=opendatatoronto.
———. 2021. “Make a ReprEx... Please.” YouTube,
February. https://youtu.be/G5Nm-GpmrLw.
Gelman, Andrew. 2016. “What Has Happened down Here Is the Winds
Have Changed,” September. https://statmodeling.stat.columbia.edu/2016/09/21/what-has-happened-down-here-is-the-winds-have-changed/.
———. 2019. “Another Regression Discontinuity Disaster and What Can
We Learn from It,” June. https://statmodeling.stat.columbia.edu/2019/06/25/another-regression-discontinuity-disaster-and-what-can-we-learn-from-it/.
———. 2020. “Statistical Models of Election Outcomes.”
YouTube, August. https://youtu.be/7gjDnrbLQ4k.
Gelman, Andrew, John Carlin, Hal Stern, David Dunson, Aki Vehtari, and
Donald Rubin. 2014. Bayesian Data Analysis. 3rd ed. Chapman &
Hall/CRC.
Gelman, Andrew, Sharad Goel, Douglas Rivers, and David Rothschild. 2016.
“The Mythical Swing Voter.” Quarterly Journal of
Political Science 11 (1): 103–30. https://doi.org/10.1561/100.00015031.
Gelman, Andrew, and Jennifer Hill. 2007. Data Analysis Using
Regression and Multilevel/Hierarchical Models. Cambridge University
Press.
Gelman, Andrew, and Eric Loken. 2013. “The Garden of Forking
Paths: Why Multiple Comparisons Can Be a Problem, Even When There Is No
‘Fishing Expedition’ or ‘p-Hacking’ and the
Research Hypothesis Was Posited Ahead of Time.” Department of
Statistics, Columbia University. http://www.stat.columbia.edu/~gelman/research/unpublished/p_hacking.pdf.
Gelman, Andrew, Greggor Mattson, and Daniel Simpson. 2018. “Gaydar
and the Fallacy of Decontextualized Measurement.”
Sociological Science 5 (12): 270–80. https://doi.org/10.15195/v5.a12.
Gelman, Andrew, Cristian Pasarica, and Rahul Dodhia. 2002. “Let’s
Practice What We Preach: Turning Tables into Graphs.” The
American Statistician 56 (2): 121–30. https://doi.org/10.1198/000313002317572790.
Gelman, Andrew, and Aki Vehtari. 2020. “What Are the Most
Important Statistical Ideas of the Past 50 Years?” arXiv. https://doi.org/10.48550/ARXIV.2012.00174.
Gentemann, Chelle Leigh, Chris Holdgraf, Ryan Abernathey, Daniel
Crichton, James Colliander, Edward Joseph Kearns, Yuvi Panda, and
Richard Signell. 2021. “Science Storms the Cloud.”
AGU Advances 2 (2). https://doi.org/10.1029/2020av000354.
Gerber, Alan, and Donald Green. 2012. Field Experiments: Design,
Analysis, and Interpretation. W W Norton.
Gertler, Paul, Sebastian Martinez, Patrick Premand, Laura Rawlings, and
Christel Vermeersch. 2016. Impact Evaluation in Practice. 2nd
ed. The World Bank. https://doi.org/10.1596/978-1-4648-0779-4.
Geuenich, Michael, Jinyu Hou, Sunyun Lee, Shanza Ayub, Hartland Jackson,
and Kieran Campbell. 2021a. “Automated Assignment of Cell Identity
from Single-Cell Multiplexed Imaging and Proteomic Data.”
Cell Systems 12 (12): 1173–86. https://doi.org/10.1016/j.cels.2021.08.012.
———. 2021b. “Automated Assignment of Cell Identity from
Single-Cell Multiplexed Imaging and Proteomic Data.” https://doi.org/10.5281/ZENODO.5156049.
Ghitza, Yair, and Andrew Gelman. 2020. “Voter Registration
Databases and MRP: Toward the Use of Large-Scale Databases in Public
Opinion Research.” Political Analysis 28 (4): 507–31. https://doi.org/10.1017/pan.2020.3.
Godfrey, Ernest. 1918. “History and Development of Statistics in
Canada.” In The History of Statistics: Their Development and
Progress in Many Countries, edited by John Koren, 179–98. New York:
Macmillan.
Goodman, Leo. 1961. “Snowball Sampling.” The Annals of
Mathematical Statistics 32 (1): 148–70. https://doi.org/10.1214/aoms/1177705148.
Goodrich, Ben, Jonah Gabry, Imad Ali, and Sam Brilleman. 2020.
“rstanarm: Bayesian applied
regression modeling via Stan.” https://mc-stan.org/rstanarm.
Google. 2022. “What to Look for in a Code Review.” Google
Engineering Practices Documentation. https://google.github.io/eng-practices/review/reviewer/looking-for.html.
Gordon, Brett, Robert Moakler, and Florian Zettelmeyer. 2022.
“Close Enough? A Large-Scale Exploration of Non-Experimental
Approaches to Advertising Measurement.” arXiv. https://doi.org/10.48550/ARXIV.2201.07055.
Gordon, Brett, Florian Zettelmeyer, Neha Bhargava, and Dan Chapsky.
2019. “A Comparison of Approaches to Advertising Measurement:
Evidence from Big Field Experiments at Facebook.” Marketing
Science 38 (2): 193–225. https://doi.org/10.1287/mksc.2018.1135.
Graham, Paul. 2020. How to Write Usefully. http://paulgraham.com/useful.html.
Green, Donald, Terence Leong, Holger Kern, Alan Gerber, and Christopher
Larimer. 2009. “Testing the Accuracy of Regression Discontinuity
Analysis Using Experimental Benchmarks.” Political
Analysis 17 (4): 400–417. https://doi.org/10.1093/pan/mpp018.
Green, Eric. 2020. “Nivi Research: Mister P
helps us understand vaccine hesitancy,” December 8.
Greenberg, Bernard, Abdel-Latif Abul-Ela, Walt Simmons, and Daniel
Horvitz. 1969. “The Unrelated Question Randomized Response Model:
Theoretical Framework.” Journal of the American Statistical
Association 64 (326): 520–39. https://doi.org/10.1080/01621459.1969.10500991.
Greenland, Sander, Stephen Senn, Kenneth Rothman, John Carlin, Charles
Poole, Steven Goodman, and Douglas Altman. 2016. “Statistical tests, P values, confidence intervals, and
power: a guide to misinterpretations.” European
Journal of Epidemiology 31 (4): 337–50. https://doi.org/10.1007/s10654-016-0149-3.
Griffiths, Thomas, and Mark Steyvers. 2004. “Finding Scientific
Topics.” PNAS 101: 5228–35. https://doi.org/10.1073/pnas.0307752101.
Grolemund, Garrett, and Hadley Wickham. 2011. “Dates and Times
Made Easy with lubridate.”
Journal of Statistical Software 40 (3): 1–25. http://www.jstatsoft.org/v40/i03/.
Groves, Robert. 2011. “Three Eras of Survey Research.”
Public Opinion Quarterly 75 (5): 861–71. https://doi.org/10.1093/poq/nfr057.
Groves, Robert, and Lars Lyberg. 2010. “Total
Survey Error: Past, Present, and Future.” Public
Opinion Quarterly 74 (5): 849–79. https://doi.org/10.1093/poq/nfq065.
Grün, Bettina, and Kurt Hornik. 2011. “topicmodels: An R Package for Fitting
Topic Models.” Journal of Statistical Software 40 (13):
1–30. https://doi.org/10.18637/jss.v040.i13.
Gustafsson, Karl, and Linus Hagström. 2017. “What Is the Point?
Teaching Graduate Students How to Construct Political Science Research
Puzzles.” European Political Science 17 (4): 634–48. https://doi.org/10.1057/s41304-017-0130-y.
Gutman, Robert. 1958. “Birth and Death Registration in
Massachusetts: II. The Inauguration of a Modern System,
1800-1849.” The Milbank Memorial Fund Quarterly 36 (4):
373–402.
Hackett, Robert. 2016. “Researchers Caused an Uproar by Publishing
Data from 70,000 Okcupid Users.” Fortune, May. https://fortune.com/2016/05/18/okcupid-data-research/.
Halberstam, David. 1972. The Best and the
Brightest. 1st ed. New York: Random House.
Hamming, Richard. 1996. The Art of Doing
Science and Engineering. Stripe Press.
Hand, David. 2018. “Statistical Challenges of Administrative and
Transaction Data.” Journal of the Royal Statistical Society:
Series A (Statistics in Society) 181 (3): 555–605. https://doi.org/10.1111/rssa.12315.
Handcock, Mark, and Krista Gile. 2011. “Comment: On the Concept of
Snowball Sampling.” Sociological Methodology 41 (1):
367–71.
Hangartner, Dominik, Daniel Kopp, and Michael Siegenthaler. 2021.
“Monitoring Hiring Discrimination Through Online Recruitment
Platforms.” Nature 589 (7843): 572–76. https://doi.org/10.1038/s41586-020-03136-0.
Hanretty, Chris. 2020. “An Introduction to Multilevel Regression
and Post-Stratification for Estimating Constituency Opinion.”
Political Studies Review 18 (4): 630–45. https://doi.org/10.1177/1478929919864773.
Hao, Karen. 2019. “This is how AI bias really
happens—and why it’s so hard to fix.” MIT Technology
Review, February. https://www.technologyreview.com/2019/02/04/137602/this-is-how-ai-bias-really-happensand-why-its-so-hard-to-fix/.
Hart, Edmund, Pauline Barmby, David LeBauer, François Michonneau, Sarah
Mount, Patrick Mulrooney, Timothée Poisot, Kara Woo, Naupaka Zimmerman,
and Jeffrey Hollister. 2016. “Ten Simple Rules for Digital Data
Storage.” PLOS Computational Biology 12
(10): e1005097. https://doi.org/10.1371/journal.pcbi.1005097.
Hartocollis, Anemona. 2022. “U.S. News Ranked
Columbia No. 2, but a Math Professor Has His Doubts.”
The New York Times, March. https://www.nytimes.com/2022/03/17/us/columbia-university-rank.html.
Hassan, Mai. 2022. “New Insights on Africa’s Autocratic
Past.” African Affairs 121 (483): 321–33. https://doi.org/10.1093/afraf/adac002.
Hastie, Trevor, and Robert Tibshirani. 1990. Generalized Additive
Models. Chapman & Hall/CRC.
Hawes, Michael. 2020. “Implementing Differential
Privacy: Seven Lessons From the
2020 United States
Census.” Harvard Data Science Review 2 (2).
https://doi.org/10.1162/99608f92.353c6f99.
Hayot, Eric. 2014. The Elements of Academic Style. Columbia
University Press.
Healy, Kieran. 2018. Data Visualization. New Jersey: Princeton
University Press.
———. 2020. “The Kitchen Counter Observatory,” May. https://kieranhealy.org/blog/archives/2020/05/21/the-kitchen-counter-observatory/.
———. 2022. “Unhappy in Its Own Way,” July. https://kieranhealy.org/blog/archives/2022/07/22/unhappy-in-its-own-way/.
Heckathorn, Douglas. 1997. “Respondent-Driven Sampling: A New
Approach to the Study of Hidden Populations.” Social
Problems 44 (2): 174–99. https://doi.org/10.2307/3096941.
Heil, Benjamin, Michael Hoffman, Florian Markowetz, Su-In Lee, Casey
Greene, and Stephanie Hicks. 2021. “Reproducibility Standards for
Machine Learning in the Life Sciences.” Nature Methods
18 (10): 1132–35. https://doi.org/10.1038/s41592-021-01256-7.
Heller, Jean. 2022. “AP Exposes the Tuskegee Syphilis Study: The
50th Anniversary.” AP, July. https://apnews.com/article/tuskegee-study-ap-story-investigation-syphilis-53403657e77d76f52df6c2e2892788c9.
Henry, Lionel, and Hadley Wickham. 2020. purrr:
Functional Programming Tools. https://CRAN.R-project.org/package=purrr.
Hermans, Felienne. 2017. “Peter Hilton on Naming.” IEEE
Software 34 (3): 117–20. https://doi.org/10.1109/MS.2017.81.
———. 2021. The Programmer’s Brain: What Every Programmer Needs to
Know about Cognition. 1st ed. Simon & Schuster. https://www.manning.com/books/the-programmers-brain.
Hernan, Miguel, and James Robins. 2020. What If. 1st ed. Boca
Raton: Chapman & Hall/CRC. https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/.
Herndon, Thomas, Michael Ash, and Robert Pollin. 2014. “Does High
Public Debt Consistently Stifle Economic Growth? A Critique of Reinhart
and Rogoff.” Cambridge Journal of Economics 38 (2):
257–79. https://doi.org/10.1093/cje/bet075.
Hester, Jim, Florent Angly, Russ Hyde, Michael Chirico, Kun Ren, and
Alexander Rosenstock. 2022. lintr: A ’Linter’ for R Code. https://CRAN.R-project.org/package=lintr.
Hester, Jim, Hadley Wickham, and Gábor Csárdi. 2021. fs:
Cross-Platform File System Operations Based on ’Libuv’. https://CRAN.R-project.org/package=fs.
Hill, Austin Bradford. 1965. “The Environment and Disease:
Association or Causation?” Proceedings of the Royal Society
of Medicine. Sage Publications.
Hillel, Wayne. 2017. How Do We Trust Our Science Code? https://www.hillelwayne.com/how-do-we-trust-science-code/.
Ho, Daniel, Kosuke Imai, Gary King, and Elizabeth Stuart. 2011.
“MatchIt: Nonparametric Preprocessing for Parametric
Causal Inference.” Journal of Statistical Software 42
(8): 1–28. https://doi.org/10.18637/jss.v042.i08.
Hodgetts, Paul. 2022. “The Negative Space of Data,” March.
https://hodgettsp.netlify.app/post/data-negativespace/.
Hofmeister, Johannes, Janet Siegmund, and Daniel Holt. 2017.
“Shorter Identifier Names Take Longer to Comprehend.” In
2017 IEEE 24th International Conference on Software Analysis,
Evolution and Reengineering (SANER), 217–27. https://doi.org/10.1109/saner.2017.7884623.
Holland, Paul. 1986. “Statistics and Causal Inference.”
Journal of the American Statistical Association 81 (396):
945–60. https://doi.org/10.2307/2289064.
Horst, Allison Marie, Alison Presmanes Hill, and Kristen Gorman. 2020.
palmerpenguins: Palmer Archipelago (Antarctica)
penguin data. https://allisonhorst.github.io/palmerpenguins/.
Hotz, Joseph, Christopher Bollinger, Tatiana Komarova, Charles Manski,
Robert Moffitt, Denis Nekipelov, Aaron Sojourner, and Bruce Spencer.
2022. “Balancing Data Privacy and Usability in the Federal
Statistical System.” Proceedings of the National Academy of
Sciences 119 (31): 1–10. https://doi.org/10.1073/pnas.2104906119.
Howes, Adam. 2022. “Representing Uncertainty Using Significant
Figures,” April. https://athowes.github.io/posts/2022-04-24-representing-uncertainty-using-significant-figures/.
Hug, Lucia, Monica Alexander, Danzhen You, Leontine Alkema, and UN
Inter-agency Group for Child Mortality Estimation. 2019. “National, Regional, and
Global Levels and Trends in Neonatal Mortality Between 1990 and 2017,
with Scenario-Based Projections to 2030: A Systematic Analysis.”
Lancet Global Health 7 (6): e710–20. https://doi.org/10.1016/S2214-109X(19)30163-9.
Hughes, Nicola, and Jill Rutter. 2016. “Ministers Reflect:
Interview with Oliver Letwin,” December. https://www.instituteforgovernment.org.uk/ministers-reflect/person/oliver-letwin/.
Hulley, Stephen, Steven Cummings, Warren Browner, Deborah Grady, and
Thomas Newman. 2007. Designing Clinical Research. 3rd ed.
Lippincott Williams & Wilkins.
Hullman, Jessica, and Andrew Gelman. 2021. “Designing for
Interactive Exploratory Data Analysis Requires Theories of Graphical
Inference.” Harvard Data Science Review 3 (3). https://doi.org/10.1162/99608f92.3ab8a587.
Huntington-Klein, Nick. 2021. The Effect: An Introduction to
Research Design and Causality. 1st ed. Chapman & Hall. https://theeffectbook.net.
———. 2022. “Library of Statistical Techniques.” https://lost-stats.github.io.
Huntington-Klein, Nick, Andreu Arenas, Emily Beam, Marco Bertoni,
Jeffrey Bloem, Pralhad Burli, Naibin Chen, et al. 2021. “The
Influence of Hidden Researcher Decisions in Applied
Microeconomics.” Economic Inquiry 59: 944–60. https://doi.org/10.1111/ecin.12992.
Huyen, Chip. 2020. “Machine Learning Is Going Real-Time,”
December. https://huyenchip.com/2020/12/27/real-time-machine-learning.html.
Hvitfeldt, Emil, and Julia Silge. 2021. Supervised Machine Learning for Text Analysis in
R. 1st ed. Chapman & Hall/CRC. https://doi.org/10.1201/9781003093459.
Hyndman, Rob, Timothy Hyndman, Charles Gray, Sayani Gupta, and Jacquie
Tran. 2022. cricketdata: International Cricket
Data. https://CRAN.R-project.org/package=cricketdata.
Iannone, Richard. 2020. DiagrammeR: Graph/Network
Visualization. https://CRAN.R-project.org/package=DiagrammeR.
Iannone, Richard, Joe Cheng, and Barret Schloerke. 2020. gt: Easily Create Presentation-Ready Display
Tables. https://CRAN.R-project.org/package=gt.
Iannone, Richard, and Mauricio Vargas. 2022. pointblank: Data Validation and Organization of Metadata
for Local and Remote Tables. https://CRAN.R-project.org/package=pointblank.
Igelström, Erik. 2020. “Causal graphs in R
with DiagrammeR.” https://www.erikigelstrom.com/articles/causal-graphs-in-r-with-diagrammer/.
International Organization of Legal Metrology. 2007. International
Vocabulary of Metrology – Basic and General Concepts and Associated
Terms. 3rd ed. https://www.oiml.org/en/files/pdf_v/v002-200-e07.pdf.
Ioannidis, John. 2005. “Why Most Published Research Findings Are
False.” PLoS Medicine 2 (8): e124. https://doi.org/10.1371/journal.pmed.0020124.
Irving, Damien, Kate Hertweck, Luke Johnston, Joel Ostblom, Charlotte
Wickham, and Greg Wilson. 2021. Research Software Engineering with
Python. Chapman & Hall/CRC.
Isaacson, Walter. 2011. Steve Jobs. 1st ed. Simon &
Schuster.
Ishiguro, Kazuo. 1989. The Remains of the Day. 1st ed. Faber &
Faber.
Izrailev, Sergei. 2014. tictoc: Functions for
Timing R Scripts. https://CRAN.R-project.org/package=tictoc.
James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani.
2021. An Introduction to Statistical Learning
with Applications in R. 2nd ed. Springer. https://www.statlearning.com.
Johnson, Alicia, Miles Ott, and Mine Dogucu. 2022. Bayes Rules! An Introduction to Bayesian Modeling with
R. 1st ed. Chapman & Hall/CRC. https://www.bayesrulesbook.com.
Johnson, Kaneesha. 2021. “Two Regimes of Prison Data
Collection.” Harvard Data Science Review 3 (3). https://doi.org/10.1162/99608f92.72825001.
Jones, Arnold. 1953. “Census Records of the Later Roman
Empire.” The Journal of Roman Studies 43: 49–64. https://doi.org/10.2307/297781.
Jordan, Michael. 2019. “Artificial Intelligence–the Revolution
Hasn’t Happened Yet.” Harvard Data Science Review 1 (1).
https://doi.org/10.1162/99608f92.f06c6e61.
Joyner, Michael. 1991. “Modeling: Optimal Marathon Performance on
the Basis of Physiological Factors.” Journal of Applied
Physiology 70 (2): 683–87. https://doi.org/10.1152/jappl.1991.70.2.683.
Kahan, Brennan, Fan Li, Andrew Copas, and Michael Harhay. 2022.
“Estimands in Cluster-Randomized Trials: Choosing Analyses That
Answer the Right Question.” International Journal of
Epidemiology, July. https://doi.org/10.1093/ije/dyac131.
Kahle, David, and Hadley Wickham. 2013. “ggmap: Spatial Visualization with ggplot2.”
The R Journal 5 (1): 144–61. http://journal.r-project.org/archive/2013-1/kahle-wickham.pdf.
Kahneman, Daniel, Olivier Sibony, and Cass Sunstein. 2021. Noise: A
Flaw in Human Judgment. William Collins.
Karsten, Karl. 1923. Charts and Graphs. New York:
Prentice-Hall.
Kastellec, Jonathan, and Eduardo Leoni. 2007. “Using Graphs
Instead of Tables in Political Science.” Perspectives on
Politics 5 (4): 755–71. https://doi.org/10.1017/s1537592707072209.
Kasy, Maximilian, and Alexander Teytelboym. 2022. “Matching with
Semi-Bandits.” Econometrics Journal. https://maxkasy.github.io/home/files/papers/adaptive_combinatorial.pdf.
Kay, Matthew. 2020. tidybayes: Tidy Data
and Geoms for Bayesian Models. https://doi.org/10.5281/zenodo.1308151.
Kearney, Michael W. 2019. “rtweet: Collecting
and analyzing Twitter data.” Journal of Open Source
Software 4 (42): 1829. https://doi.org/10.21105/joss.01829.
Kennedy, Lauren, and Jonah Gabry. 2020. “MRP
with rstanarm,” July. https://mc-stan.org/rstanarm/articles/mrp.html.
Kennedy, Lauren, and Andrew Gelman. 2020. “Know Your Population
and Know Your Model: Using Model-Based Regression and Poststratification
to Generalize Findings Beyond the Observed Sample.” https://arxiv.org/abs/1906.11323.
Kennedy, Lauren, Katharine Khanna, Daniel Simpson, and Andrew Gelman.
2020. “Using Sex and Gender in Survey Adjustment.” https://arxiv.org/abs/2009.14401.
Kenny, Christopher, Shiro Kuriwaki, Cory McCartan, Evan Rosenman, Tyler
Simko, and Kosuke Imai. 2021. “The Impact of
the U.S. Census Disclosure Avoidance System on Redistricting and Voting
Rights Analysis.” https://arxiv.org/abs/2105.14197.
Keyes, Os. 2019. “Counting the Countless.” Real
Life. https://reallifemag.com/counting-the-countless/.
Kharecha, Pushker, and James Hansen. 2013. “Prevented Mortality
and Greenhouse Gas Emissions from Historical and Projected Nuclear
Power.” Environmental Science & Technology 47 (9):
4889–95. https://doi.org/10.1021/es3051197.
Kiang, Mathew, Alexander Tsai, Monica Alexander, David Rehkopf, and
Sanjay Basu. 2021. “Racial/Ethnic Disparities in Opioid-Related
Mortality in the USA, 1999–2019: The Extreme Case of Washington
DC.” Journal of Urban Health 98 (5): 589–95. https://doi.org/10.1007/s11524-021-00573-8.
Kimmerer, Robin Wall. 2013. Braiding Sweetgrass. 1st ed.
Milkweed Editions.
King, Gary. 2006. “Publication, Publication.” PS:
Political Science & Politics 39 (1): 119–25. https://doi.org/10.1017/S1049096506060252.
King, Gary, and Richard Nielsen. 2019. “Why Propensity Scores
Should Not Be Used for Matching.” Political Analysis 27
(4): 435–54. https://doi.org/10.1017/pan.2019.11.
King, Stephen. 2000. On Writing: A Memoir of the Craft. 1st ed.
Scribner.
Kirkegaard, Emil, and Julius Bjerrekær. 2016. “The OKCupid
Dataset: A Very Large Public Dataset of Dating Site Users.”
Open Differential Psychology, 1–10. https://doi.org/10.26775/ODP.2016.11.03.
Kleiber, Christian, and Achim Zeileis. 2008. Applied Econometrics
with R. New York: Springer-Verlag. https://CRAN.R-project.org/package=AER.
Knuth, Donald. 1984. “Literate Programming.” The
Computer Journal 27 (2): 97–111. https://doi.org/10.1093/comjnl/27.2.97.
———. 1998. Art of Computer Programming, Volume 2: Seminumerical
Algorithms. 2nd ed.
Koenecke, Allison, and Hal Varian. 2020. “Synthetic Data
Generation for Economists.” https://arxiv.org/abs/2011.01374.
Koenker, Roger, and Achim Zeileis. 2009. “On Reproducible
Econometric Research.” Journal of Applied Econometrics
24 (5): 833–47. https://doi.org/10.1002/jae.1083.
Kohavi, Ron, Alex Deng, Brian Frasca, Roger Longbotham, Toby Walker, and
Ya Xu. 2012. “Trustworthy Online Controlled Experiments.”
In Proceedings of the 18th ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining -
KDD 12, 1st ed. ACM Press.
https://doi.org/10.1145/2339530.2339653.
Kohavi, Ron, Diane Tang, and Ya Xu. 2020. Trustworthy Online Controlled Experiments: A Practical
Guide to A/B Testing. Cambridge University Press.
Koitsalu, Marie, Martin Eklund, Jan Adolfsson, Henrik Grönberg, and
Yvonne Brandberg. 2018. “Effects of Pre-Notification, Invitation
Length, Questionnaire Length and Reminder on Participation Rate: A
Quasi-Randomised Controlled Trial.” BMC Medical Research
Methodology 18 (3): 1–5. https://doi.org/10.1186/s12874-017-0467-5.
Kross, Sean. 2021. postcards: Create Beautiful,
Simple Personal Websites. https://CRAN.R-project.org/package=postcards.
Kuhn, Max. 2021. poissonreg: Model Wrappers for
Poisson Regression. https://CRAN.R-project.org/package=poissonreg.
Kuhn, Max, and Davis Vaughan. 2022. parsnip: A
Common API to Modeling and Analysis Functions. https://CRAN.R-project.org/package=parsnip.
Kuhn, Max, and Hadley Wickham. 2020. tidymodels: A Collection of
Packages for Modeling and Machine Learning Using Tidyverse
Principles. https://www.tidymodels.org.
Kuriwaki, Shiro, Will Beasley, and Thomas Leeper. 2022. dataverse: R Client for Dataverse 4+
Repositories.
Kuznets, Simon, Lillian Epstein, and Elizabeth Jenks. 1941. National Income and Its Composition,
1919-1938. National Bureau of Economic Research.
Lamott, Anne. 1994. Bird by Bird: Some Instructions on Writing and
Life. Anchor Books.
Landau, William Michael. 2021. “The targets R
Package: A Dynamic Make-Like Function-Oriented Pipeline Toolkit for
Reproducibility and High-Performance Computing.”
Journal of Open Source Software 6 (57): 2959. https://doi.org/10.21105/joss.02959.
Lane, Nick. 2015. “The Unseen World: Reflections on Leeuwenhoek
(1677) ‘Concerning Little Animals’.”
Philosophical Transactions of the Royal Society B: Biological
Sciences 370 (1666): 20140344. https://doi.org/10.1098/rstb.2014.0344.
Laouenan, Morgane, Palaash Bhargava, Jean-Benoît Eyméoud, Olivier
Gergaud, Guillaume Plique, and Etienne Wasmer. 2022. “A Cross-Verified Database of Notable People,
3500BC–2018AD.” Scientific Data 9 (1): 290. https://doi.org/10.1038/s41597-022-01369-4.
Larmarange, Joseph. 2021. labelled: Manipulating Labelled Data.
https://CRAN.R-project.org/package=labelled.
Latour, Bruno. 1996. “On Actor-Network Theory: A Few
Clarifications.” Soziale Welt 47 (4): 369–81. http://www.jstor.org/stable/40878163.
Lauderdale, Benjamin, Delia Bailey, Jack Blumenau, and Douglas Rivers.
2020. “Model-Based Pre-Election Polling for National and
Sub-National Outcomes in the US and UK.” International
Journal of Forecasting 36 (2): 399–413. https://doi.org/10.1016/j.ijforecast.2019.05.012.
Lazear, Edward. 2000. “Economic Imperialism.” The
Quarterly Journal of Economics 115 (1): 99–146. https://doi.org/10.1162/003355300554683.
Leek, Jeff, Blakeley McShane, Andrew Gelman, David Colquhoun, Michèle
Nuijten, and Steven Goodman. 2017. “Five Ways to Fix
Statistics.” Nature 551 (7682): 557–59. https://doi.org/10.1038/d41586-017-07522-z.
Leek, Jeff, and Roger Peng. 2020. “Advanced Data Science
2020.” http://jtleek.com/ads2020/index.html.
Leonelli, Sabina. 2020. “Learning from Data Journeys.” In
Data Journeys in the Sciences, 1–24. Springer International
Publishing. https://doi.org/10.1007/978-3-030-37177-7_1.
Leos-Barajas, Vianey, Theoni Photopoulou, Roland Langrock, Toby
Patterson, Yuuki Watanabe, Megan Murgatroyd, and Yannis Papastamatiou.
2016. “Analysis of Animal Accelerometer Data Using Hidden Markov
Models.” Methods in Ecology and Evolution 8 (2): 161–73.
https://doi.org/10.1111/2041-210x.12657.
Levay, Kevin, Jeremy Freese, and James Druckman. 2016. “The
Demographic and Political Composition of Mechanical Turk
Samples.” SAGE Open 6 (1): 1–17. https://doi.org/10.1177/2158244016636433.
Lichand, Guilherme, and Sharon Wolf. 2022. “Measuring Child Labor:
Whom Should Be Asked, and Why It Matters,” March. https://doi.org/10.21203/rs.3.rs-1474562/v1.
Lima, Renato de, Oliver Phillips, Alvaro Duque, Sebastian Tello, Stuart
Davies, Alexandre Adalardo de Oliveira, Sandra Muller, et al. 2022.
“Making Forest Data Fair and Open.” Nature Ecology
& Evolution 6 (6): 656–58. https://doi.org/10.1038/s41559-022-01738-7.
Lin, Herbert. 2014. “A Proposal to Reduce Government
Overclassification of Information Related to National Security.”
Journal of National Security Law and Policy 7: 443–63.
Lin, Sarah, Ibraheem Ali, and Greg Wilson. 2021. “Ten Quick Tips
for Making Things Findable.” PLOS Computational Biology
16 (12): 1–10. https://doi.org/10.1371/journal.pcbi.1008469.
Lips, Hilary. 2020. Sex and Gender: An Introduction. 7th ed.
Illinois: Waveland Press.
Little, Roderick, and Roger Lewis. 2021. “Estimands, Estimators,
and Estimates.” JAMA 326 (10): 967. https://doi.org/10.1001/jama.2021.2886.
Locke, Steph, and Lucy D’Agostino McGowan. 2018. datasauRus: Datasets from the Datasaurus
Dozen. https://CRAN.R-project.org/package=datasauRus.
Lockheed Martin. 2005. “Joint Strike Fighter Air Vehicle C++
Coding Standards For The System Development And Demonstration
Program.” Document Number 2rdu00001 Rev C,
December. https://www.stroustrup.com/JSF-AV-rules.pdf.
Lohr, Sharon. 2022. Sampling: Design and Analysis. 3rd ed.
Chapman and Hall/CRC.
Loo, Mark PJ van der, and Edwin de Jonge. 2021. “Data Validation
Infrastructure for R.” Journal of Statistical Software
97: 1–33. https://doi.org/10.18637/jss.v097.i10.
Lovelace, Robin, Jakub Nowosad, and Jannes Muenchow. 2019. Geocomputation with R. 1st ed. Chapman and
Hall/CRC. https://geocompr.robinlovelace.net.
Lucas, Jack, Reed Merrill, Kelly Blidook, Sandra Breux, Laura Conrad,
Gabriel Eidelman, Royce Koop, et al. 2020. “Canadian
Municipal Elections Database.” Scholars Portal Dataverse.
https://doi.org/10.5683/sp2/4mzjpq.
Lucas, Robert. 1978. “Asset Prices in an Exchange Economy.”
Econometrica 46 (6): 1429–45. https://doi.org/10.2307/1913837.
Luebke, David Martin, and Sybil Milton. 1994. “Locating the
Victim: An Overview of Census-Taking, Tabulation Technology, and
Persecution in Nazi Germany.” IEEE Annals of the History of
Computing 16 (3): 25–39. https://doi.org/10.1109/MAHC.1994.298418.
Lumley, Thomas. 2020. “Survey: Analysis of Complex Survey
Samples.” https://cran.r-project.org/web/packages/survey/index.html.
Lundberg, Ian, Rebecca Johnson, and Brandon Stewart. 2021. “What
Is Your Estimand? Defining the Target Quantity Connects Statistical
Evidence to Theory.” American Sociological Review 86
(3): 532–65. https://doi.org/10.1177/00031224211004187.
Luscombe, Alex, Kevin Dick, and Kevin Walby. 2021. “Algorithmic
Thinking in the Public Interest: Navigating Technical, Legal, and
Ethical Hurdles to Web Scraping in the Social Sciences.”
Quality & Quantity, 1–22. https://doi.org/10.1007/s11135-021-01164-0.
Luscombe, Alex, and Alexander McClelland. 2020. “Policing the
Pandemic: Tracking the Policing of COVID-19 Across Canada.”
Lyman, Frank. 1981. “The Responsive Classroom Discussion: The
Inclusion of All Students.” Mainstreaming Digest 109:
109–13.
Macaulay, Thomas Babington. 1848. The History of England from the
Accession of James the Second. https://www.gutenberg.org/files/1468/1468-h/1468-h.htm.
MacDorman, Marian, and Eugene Declercq. 2018. “The Failure of
United States Maternal Mortality Reporting and Its Impact on Women’s
Lives.” Birth (Berkeley, Calif.) 45 (2): 105.
Maier, Maximilian, František Bartoš, T. D. Stanley, David Shanks, Adam
Harris, and Eric-Jan Wagenmakers. 2022. “No Evidence for Nudging
After Adjusting for Publication Bias.” Proceedings of the
National Academy of Sciences 119 (31): e2200300119. https://doi.org/10.1073/pnas.2200300119.
Martin, Charles, and Ben Popper. 2021. “Don’t Push That Button:
Exploring the Software That Flies SpaceX Rockets and Starships.”
The Overflow, December. https://stackoverflow.blog/2021/12/27/dont-push-that-button-exploring-the-software-that-flies-spacex-starships/.
Martinez, Luis. 2021. “How Much Should We Trust the Dictator’s GDP
Growth Estimates?” https://bfi.uchicago.edu/wp-content/uploads/2021/07/BFI%5FWP%5F2021-78.pdf.
Matias, Nathan, Kevin Munger, Marianne Aubin Le Quere, and Charles
Ebersole. 2021. “The Upworthy Research
Archive, a time series of 32,487 experiments in U.S.
media.” Scientific Data 8 (1): 1–8. https://doi.org/10.1038/s41597-021-00934-7.
Mattson, Greggor. 2017. “Artificial Intelligence Discovers
Gayface. Sigh.” https://greggormattson.com/2017/09/09/artificial-intelligence-discovers-gayface/amp/.
McClelland, Alexander. 2019. “‘Lock This Whore up’:
Legal Violence and Flows of Information Precipitating Personal Violence
Against People Criminalised for HIV-Related Crimes in Canada.”
European Journal of Risk Regulation 10 (1): 132–47.
McElreath, Richard. 2020. Statistical
Rethinking: A Bayesian Course with Examples in R and Stan.
2nd ed. Chapman and Hall/CRC.
McPhee, John. 2017. Draft No. 4. 1st ed. Farrar, Straus and
Giroux.
McQuire, Scott. 2019. “One Map to Rule Them All? Google Maps as
Digital Technical Object.” Communication and the Public
4 (2): 150–65. https://doi.org/10.1177/2057047319850192.
Meng, Xiao-Li. 2018. “Statistical Paradises and Paradoxes in Big
Data (I): Law of Large Populations, Big Data Paradox, and the 2016 US
Presidential Election.” The Annals of Applied Statistics
12 (2): 685–726. https://doi.org/10.1214/18-AOAS1161SF.
———. 2021. “What Are the Values of Data, Data Science, or Data
Scientists?” Harvard Data Science Review 3 (1). https://doi.org/10.1162/99608f92.ee717cf7.
Merali, Zeeya. 2010. “Computational Science:... Error.”
Nature 467 (7317): 775–77. https://doi.org/10.1038/467775a.
Miceli, Milagros, Julian Posada, and Tianling Yang. 2022.
“Studying up Machine Learning Data.” Proceedings of the
ACM on Human-Computer Interaction 6 (January): 1–14.
https://doi.org/10.1145/3492853.
Michener, William. 2015. “Ten Simple Rules for Creating a Good
Data Management Plan.” PLoS Computational Biology 11
(10): e1004525. https://doi.org/10.1371/journal.pcbi.1004525.
Mindell, David. 2008. Digital Apollo: Human and
Machine in Spaceflight. Massachusetts: The MIT Press.
Mineault, Patrick, and The Good Research Code Handbook Community. 2021.
“The Good Research Code Handbook.” https://doi.org/10.5281/zenodo.5796873.
Minsky, Yaron. 2011. “OCaml for the
masses.” Communications of the ACM 54 (11):
53–58. https://doi.org/10.1145/2018396.2018413.
———. 2015. “Automated Trading and OCaml with Yaron Minsky.”
Hackers — Software Engineering Daily, November. https://softwareengineeringdaily.com/2015/11/09/automated-trading-and-ocaml-with-yaron-minsky/.
Mitchell, Alanna. 2022. “Get Ready for the New, Improved
Second.” The New York Times, April. https://www.nytimes.com/2022/04/25/science/time-second-measurement.html.
Mitchell, Margaret, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy
Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and
Timnit Gebru. 2019. “Model Cards for Model Reporting.”
Proceedings of the Conference on Fairness, Accountability, and
Transparency, January. https://doi.org/10.1145/3287560.3287596.
Miyakawa, Tsuyoshi. 2020. “No Raw Data, No Science: Another
Possible Source of the Reproducibility Crisis.” Molecular
Brain 13 (1): 1–6. https://doi.org/10.1186/s13041-020-0552-2.
Mok, Lillio, Samuel Way, Lucas Maystre, and Ashton Anderson. 2022.
“The Dynamics of Exploration on Spotify.” In
Proceedings of the International AAAI Conference on Web and Social
Media, 16:663–74.
Molanphy, Chris. 2012. “100 & Single: Three Rules to Define
the Term ‘One-Hit Wonder’ in 2012.” The Village
Voice, September. https://www.villagevoice.com/2012/09/10/100-single-three-rules-to-define-the-term-one-hit-wonder-in-2012/.
Morange, Michel. 2016. A History of Biology. New Jersey:
Princeton University Press.
Moyer, Brian, and Abe Dunn. 2020. “Measuring the
Gross Domestic Product
(GDP): The Ultimate Data
Science Project.” Harvard Data
Science Review 2 (1). https://doi.org/10.1162/99608f92.414caadb.
Müller, Kirill, and Lorenz Walthert. 2022. styler: Non-Invasive Pretty Printing of R
Code. https://CRAN.R-project.org/package=styler.
Müller, Kirill, and Hadley Wickham. 2021. tibble: Simple Data Frames. https://CRAN.R-project.org/package=tibble.
Murphy, Heather. 2017. “Why Stanford Researchers Tried to Create a
‘Gaydar’ Machine.” The New York Times, October.
https://www.nytimes.com/2017/10/09/science/stanford-sexual-orientation-study.html.
Nelder, John, and Robert Wedderburn. 1972. “Generalized Linear
Models.” Journal of the Royal Statistical Society: Series A
(General) 135 (3): 370–84. https://doi.org/10.2307/2344614.
Neufeld, Michael. 2002. “Wernher von Braun, the SS, and
Concentration Camp Labor: Questions of Moral, Political, and Criminal
Responsibility.” German Studies Review 25 (1): 57–78. https://doi.org/10.2307/1433245.
Neuwirth, Erich. 2014. RColorBrewer: ColorBrewer
Palettes. https://CRAN.R-project.org/package=RColorBrewer.
Newman, Daniel. 2014. “Missing Data: Five Practical
Guidelines.” Organizational Research Methods 17 (4):
372–411. https://doi.org/10.1177/1094428114548590.
Neyman, Jerzy. 1934. “On the Two Different Aspects of the
Representative Method: The Method of Stratified Sampling and the Method
of Purposive Selection.” Journal of the Royal Statistical
Society 97 (4): 558–625. https://doi.org/10.2307/2342192.
Nobles, Melissa. 2002. “Racial Categorization and
Censuses.” In Census and Identity: The Politics of Race,
Ethnicity, and Language in National Censuses, edited by David
Kertzer and Dominique Arel, 43–70. New York, NY: Cambridge University
Press.
Northcutt, Curtis, Anish Athalye, and Jonas Mueller. 2021.
“Pervasive Label Errors in Test Sets Destabilize Machine Learning
Benchmarks.” https://doi.org/10.48550/ARXIV.2103.14749.
Obermeyer, Ziad, Brian Powers, Christine Vogeli, and Sendhil
Mullainathan. 2019. “Dissecting Racial Bias in an Algorithm Used
to Manage the Health of Populations.” Science 366
(6464): 447–53. https://doi.org/10.1126/science.aax2342.
Oberski, Daniel L., and Frauke Kreuter. 2020. “Differential
Privacy and Social Science: An
Urgent Puzzle.” Harvard Data
Science Review 2 (1).
OECD. 2014. “The Essential Macroeconomic Aggregates.” In
Understanding National Accounts, 13–46. OECD. https://doi.org/10.1787/9789264214637-2-en.
———. 2022. Quarterly GDP. https://data.oecd.org/gdp/quarterly-gdp.htm.
Ooms, Jeroen. 2014. “The jsonlite Package: A
Practical and Consistent Mapping Between JSON Data and R
Objects.” arXiv:1403.2805 [Stat.CO]. https://arxiv.org/abs/1403.2805.
———. 2019a. pdftools: Text Extraction,
Rendering and Converting of PDF Documents. https://CRAN.R-project.org/package=pdftools.
———. 2019b. tesseract: Open Source OCR
Engine. https://CRAN.R-project.org/package=tesseract.
———. 2021. openssl: Toolkit for Encryption,
Signatures and Certificates Based on OpenSSL. https://CRAN.R-project.org/package=openssl.
Oostrom, Tamar. 2022. “Funding of Clinical Trials and Reported
Drug Efficacy.” https://drive.google.com/file/d/1EQLCH0ns99IxYBkxPNbagcZtGgE9a8MQ/view.
Orwell, George. 1946. Politics and the English Language. https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/politics-and-the-english-language/.
Patki, Neha, Roy Wedge, and Kalyan Veeramachaneni. 2016. “The
Synthetic Data Vault.” In 2016 IEEE International Conference
on Data Science and Advanced Analytics (DSAA), 399–410. https://doi.org/10.1109/DSAA.2016.49.
Paullada, Amandalynne, Inioluwa Deborah Raji, Emily Bender, Emily
Denton, and Alex Hanna. 2021. “Data and Its (Dis)contents: A
Survey of Dataset Development and Use in Machine Learning
Research.” Patterns 2 (11): 100336. https://doi.org/10.1016/j.patter.2021.100336.
Pavlik, Kaylin. 2019. “Understanding + Classifying Genres Using
Spotify Audio Features.” https://www.kaylinpavlik.com/classifying-songs-genres/.
Pedersen, Thomas Lin. 2020. patchwork: The
Composer of Plots. https://CRAN.R-project.org/package=patchwork.
Perepolkin, Dmytro. 2019. polite: Be Nice on the Web. https://CRAN.R-project.org/package=polite.
Perkel, Jeffrey. 2021. “Ten Computer Codes That Transformed
Science.” Nature 589 (7842): 344–48. https://doi.org/10.1038/d41586-021-00075-2.
Pfeffer, Juergen, Angelina Mooseder, Luca Hammer, Oliver Stritzel, and
David Garcia. 2022. “This Sample Seems to Be Good Enough!
Assessing Coverage and Temporal Reliability of Twitter’s Academic
API.” arXiv. https://doi.org/10.48550/ARXIV.2204.02290.
Phillips, Alban. 1958. “The Relation Between Unemployment and the
Rate of Change of Money Wage Rates in the United Kingdom,
1861-1957.” Economica 25 (100): 283–99. https://doi.org/10.1111/j.1468-0335.1958.tb00003.x.
Piller, Charles. 2022. “Blots on a Field?” Science
377 (6604): 358–63. https://doi.org/10.1126/science.ade0209.
Pineau, Joelle, Philippe Vincent-Lamarre, Koustuv Sinha, Vincent
Larivière, Alina Beygelzimer, Florence d’Alché-Buc, Emily Fox, and Hugo
Larochelle. 2021. “Improving Reproducibility in Machine Learning
Research (a Report from the NeurIPS 2019 Reproducibility
Program).” Journal of Machine Learning Research 22
(164): 1–20. http://jmlr.org/papers/v22/20-303.html.
Pitman, Jim. 1993. Probability. 1st ed. New York: Springer. https://doi.org/10.1007/978-1-4612-4374-8.
Plant, Anne, and Robert Hanisch. 2020. “Reproducibility in
Science: A Metrology Perspective.” Harvard Data Science
Review 2 (4). https://doi.org/10.1162/99608f92.eb6ddee4.
Presmanes Hill, Alison. 2021a. M-F-E-O:
postcards + distill. https://alison.rbind.io/post/2020-12-22-postcards-distill/.
———. 2021b. Up & Running with Blogdown in 2021. https://alison.rbind.io/post/new-year-new-blogdown/.
Prévost, Jean-Guy, and Jean-Pierre Beaud. 2015. Statistics, Public
Debate and the State, 1800–1945: A Social, Political and Intellectual
History of Numbers. Routledge.
R Core Team. 2022. R: A Language and Environment for Statistical
Computing. Vienna, Austria: R Foundation for Statistical Computing.
https://www.R-project.org/.
R Special Interest Group on Databases (R-SIG-DB), Hadley Wickham, and
Kirill Müller. 2022. DBI: R Database Interface. https://CRAN.R-project.org/package=DBI.
Register, Yim. 2020a. “Introduction to Sampling and
Randomization,” November. https://youtu.be/U272FFxG8LE.
———. 2020b. “Data Science Ethics in 6 Minutes.”
YouTube, December. https://youtu.be/mA4gypAiRYU.
Reid, Nancy. 2003. “Asymptotics and the Theory of
Inference.” The Annals of Statistics 31 (6): 1695–1731.
https://doi.org/10.1214/aos/1074290325.
Richardson, Neal, Ian Cook, Nic Crane, Jonathan Keane, Romain François,
Jeroen Ooms, and Apache Arrow. 2022. arrow:
Integration to “Apache” “Arrow”.
https://CRAN.R-project.org/package=arrow.
Riederer, Emily. 2020. “Column Names as Contracts,”
September. https://emilyriederer.netlify.app/post/column-name-contracts/.
Riffe, Tim, Enrique Acosta, Enrique José Acosta, Diego Manuel Aburto,
Anna Alburez-Gutierrez, Ainhoa Altová, Ugofilippo Alustiza, et al. 2021.
“Data Resource Profile: COVerAGE-DB: A
Global Demographic Database of COVID-19 Cases and
Deaths.” International Journal of Epidemiology 50 (2):
390–390f. https://doi.org/10.1093/ije/dyab027.
Rilke, Rainer Maria. 1929. Letters to a Young Poet.
Robinson, David. 2021. gutenbergr: Download and
Process Public Domain Works from Project Gutenberg. https://CRAN.R-project.org/package=gutenbergr.
Robinson, David, Alex Hayes, and Simon Couch. 2021. broom: Convert Statistical Objects into Tidy
Tibbles. https://CRAN.R-project.org/package=broom.
Robinson, Emily, and Jacqueline Nolis. 2020. Build a Career in Data
Science. Manning Publications. https://livebook.manning.com/book/build-a-career-in-data-science.
Rockoff, Hugh. 2019. “On the Controversies Behind the Origins of
the Federal Economic Statistics.” Journal of Economic
Perspectives 33 (1): 147–64. https://doi.org/10.1257/jep.33.1.147.
Rose, Angela, Rebecca Grais, Denis Coulombier, and Helga Ritter. 2006.
“A Comparison of Cluster and Systematic Sampling Methods for
Measuring Crude Mortality.” Bulletin of the World Health
Organization 84: 290–96. https://doi.org/10.2471/blt.05.029181.
Ross, Casey. 2022. “How a Decades-Old Database Became a Hugely
Profitable Dossier on the Health of 270 Million Americans.”
Stat, February. https://www.statnews.com/2022/02/01/ibm-watson-health-marketscan-data/.
Rudis, Bob. 2020. hrbrthemes: Additional
Themes, Theme Components and Utilities for
“ggplot2”. https://CRAN.R-project.org/package=hrbrthemes.
Ruggles, Steven, Catherine Fitch, Diana Magnuson, and Jonathan
Schroeder. 2019. “Differential Privacy and Census Data:
Implications for Social and Economic Research.” AEA Papers
and Proceedings 109 (May): 403–8. https://doi.org/10.1257/pandp.20191107.
Ruggles, Steven, Sarah Flood, Sophia Foster, Ronald Goeken, Jose Pacas,
Megan Schouweiler, and Matthew Sobek. 2021. “IPUMS USA: Version
11.0.” Minneapolis, MN: IPUMS. https://doi.org/10.18128/d010.v11.0.
Ryan, Philip. 2015. “Keeping a Lab Notebook.”
YouTube, May. https://youtu.be/-MAIuaOL64I.
Sadowski, Caitlin, Emma Söderberg, Luke Church, Michal Sipko, and
Alberto Bacchelli. 2018. “Modern Code Review: A Case Study at
Google.” In Proceedings of the 40th International Conference
on Software Engineering: Software Engineering in Practice, 181–90.
ICSE-SEIP ’18. New York, NY, USA: Association for Computing Machinery.
https://doi.org/10.1145/3183519.3183525.
Sakshaug, Joseph, Ting Yan, and Roger Tourangeau. 2010.
“Nonresponse Error, Measurement Error, and Mode of Data
Collection: Tradeoffs in a Multi-Mode Survey of Sensitive and
Non-Sensitive Items.” Public Opinion Quarterly 74 (5):
907–33. https://doi.org/10.1093/poq/nfq057.
Salganik, Matthew. 2018. Bit by Bit: Social Research in the Digital
Age. New Jersey: Princeton University Press.
Salganik, Matthew, Peter Sheridan Dodds, and Duncan Watts. 2006.
“Experimental Study of Inequality and Unpredictability in an
Artificial Cultural Market.” Science 311 (5762): 854–56.
https://doi.org/10.1126/science.1121066.
Salganik, Matthew, and Douglas Heckathorn. 2004. “Sampling and
Estimation in Hidden Populations Using Respondent-Driven
Sampling.” Sociological Methodology 34 (1): 193–240. https://doi.org/10.1111/j.0081-1750.2004.00152.x.
Sambasivan, Nithya, Shivani Kapania, Hannah Highfill, Diana Akrong,
Praveen Paritosh, and Lora Aroyo. 2021. “‘Everyone Wants to
Do the Model Work, Not the Data Work’: Data Cascades in
High-Stakes AI.” In Proceedings of the 2021
CHI Conference on Human Factors in Computing Systems.
ACM. https://doi.org/10.1145/3411764.3445518.
Samuel, Arthur. 1959. “Some Studies in Machine Learning Using the
Game of Checkers.” IBM Journal of Research and
Development 3 (3): 210–29. https://doi.org/10.1147/rd.33.0210.
Saulnier, Lucile, Siddharth Karamcheti, Hugo Laurençon, Léo Tronchon,
Thomas Wang, Victor Sanh, Amanpreet Singh, et al. 2022. “Putting
Ethical Principles at the Core of the Research Lifecycle.” https://huggingface.co/blog/ethical-charter-multimodal.
Schloerke, Barret, and Jeff Allen. 2021. plumber: An API Generator for R. https://CRAN.R-project.org/package=plumber.
Schmertmann, Carl. 2022. “UN API Test,” July. https://bonecave.schmert.net/un-api-example.html.
Scott, James. 1998. Seeing Like a State. Yale University Press.
Sekhon, Jasjeet, and Rocío Titiunik. 2017. “Understanding
Regression Discontinuity Designs as Observational Studies.”
Observational Studies 3 (2): 174–82. https://doi.org/10.1353/obs.2017.0005.
Si, Yajuan. 2020. “On the Use of Auxiliary Variables in Multilevel
Regression and Poststratification.” https://arxiv.org/abs/2011.00360.
Sides, John, Lynn Vavreck, and Christopher Warshaw. 2021. “The
Effect of Television Advertising in United States Elections.”
American Political Science Review, 1–17. https://doi.org/10.1017/s000305542100112x.
Silberzahn, Raphael, Eric Uhlmann, Daniel Martin, Pasquale Anselmi,
Frederik Aust, Eli Awtrey, Štěpán Bahník, et al. 2018. “Many
Analysts, One Data Set: Making Transparent How Variations in Analytic
Choices Affect Results.” Advances in Methods and Practices in
Psychological Science 1 (3): 337–56. https://doi.org/10.1177/2515245917747646.
Silge, Julia. 2018. “Text Classification with Tidy Data
Principles,” December. https://juliasilge.com/blog/tidy-text-classification/.
Silge, Julia, Fanny Chow, Max Kuhn, and Hadley Wickham. 2022. rsample: General Resampling Infrastructure.
https://CRAN.R-project.org/package=rsample.
Silge, Julia, and David Robinson. 2016. “tidytext: Text Mining and Analysis Using Tidy Data
Principles in R.” The Journal of Open Source
Software 1 (3). https://doi.org/10.21105/joss.00037.
Silver, Nate. 2020. “We Fixed an Issue with How Our Primary
Forecast Was Calculating Candidates’ Demographic Strengths.”
FiveThirtyEight, February. https://fivethirtyeight.com/features/we-fixed-a-mistake-in-how-our-primary-forecast-was-calculating-candidates-demographic-strengths/.
Simon, Noah, Jerome Friedman, Trevor Hastie, and Rob Tibshirani. 2011.
“Regularization Paths for Cox’s Proportional Hazards Model via
Coordinate Descent.” Journal of Statistical Software 39
(5): 1–13. https://doi.org/10.18637/jss.v039.i05.
Simonsohn, Uri. 2013. “Just Post It: The Lesson from Two Cases of
Fabricated Data Detected by Statistics Alone.” Psychological
Science 24 (10): 1875–88. https://doi.org/10.1177/0956797613480366.
Simpson, Edward. 1951. “The Interpretation of Interaction in
Contingency Tables.” Journal of the Royal Statistical
Society: Series B (Methodological) 13 (2): 238–41.
Smith, Jessie, Saleema Amershi, Solon Barocas, Hanna Wallach, and
Jennifer Wortman Vaughan. 2022. “REAL ML: Recognizing, Exploring,
and Articulating Limitations of Machine Learning Research.”
2022 ACM Conference on Fairness, Accountability, and Transparency
(FAccT ’22). https://doi.org/10.1145/3531146.3533122.
Sobek, Matthew, and Steven Ruggles. 1999. “The IPUMS Project: An
Update.” Historical Methods: A Journal of Quantitative and
Interdisciplinary History 32 (3): 102–10. https://doi.org/10.1080/01615449909598930.
Somers, James. 2015. “Toolkits for the
Mind.” MIT Technology Review, April. https://www.technologyreview.com/2015/04/02/168469/toolkits-for-the-mind/.
———. 2017. “Torching the Modern-Day Library of Alexandria.”
The Atlantic, April. https://www.theatlantic.com/technology/archive/2017/04/the-tragedy-of-google-books/523320/.
Sprint, Gina, and Jason Conci. 2019. “Mining Github Classroom
Commit Behavior in Elective and Introductory Computer Science
Courses.” Journal of Computing Sciences in Colleges 35
(1): 76–84.
Staicu, Ana-Maria. 2017. “Interview with Nancy Reid.”
International Statistical Review 85 (3): 381–403. https://doi.org/10.1111/insr.12237.
Staniak, Mateusz, and Przemyslaw Biecek. 2019. “The landscape of R packages for automated exploratory
data analysis.” arXiv Preprint arXiv:1904.02101.
Statistics Canada. 2017. “Guide to the Census of Population,
2016.” Statistics Canada. https://www12.statcan.gc.ca/census-recensement/2016/ref/98-304/98-304-x2016001-eng.pdf.
———. 2020. “Sex at Birth and Gender: Technical Report on Changes
for the 2021 Census.” Statistics Canada. https://www12.statcan.gc.ca/census-recensement/2021/ref/98-20-0002/982000022020002-eng.pdf.
Steckel, Richard. 1991. “The Quality of Census Data for Historical
Inquiry: A Research Agenda.” Social Science History 15
(4): 579–99. https://doi.org/10.2307/1171470.
Stevens, Wallace. 1934. The Idea of Order at Key West. https://www.poetryfoundation.org/poems/43431/the-idea-of-order-at-key-west.
Steyvers, Mark, and Tom Griffiths. 2006. “Probabilistic Topic
Models.” In Latent Semantic Analysis: A Road to Meaning,
edited by T. Landauer, D. McNamara, S. Dennis, and W. Kintsch.
Stigler, Stephen. 1986. The History of Statistics. Harvard
University Press.
Stock, James, and Francesco Trebbi. 2003. “Retrospectives: Who
Invented Instrumental Variable Regression?” Journal of
Economic Perspectives 17 (3): 177–94. https://doi.org/10.1257/089533003769204416.
Stolberg, Michael. 2006. “Inventing the Randomized Double-Blind
Trial: The Nuremberg Salt Test of 1835.” Journal of the Royal
Society of Medicine 99 (12): 642–43. https://doi.org/10.1258/jrsm.99.12.642.
Stolley, Paul. 1991. “When Genius Errs: R. A. Fisher and the Lung
Cancer Controversy.” American Journal of Epidemiology
133 (5): 416–25. https://doi.org/10.1093/oxfordjournals.aje.a115904.
Student. 1908. “The Probable Error of a Mean.”
Biometrika 6 (1): 1–25. https://doi.org/10.2307/2331554.
Sunstein, Cass, and Lucia Reisch. 2017. The Economics of Nudge.
Routledge.
Suriyakumar, Vinith, Nicolas Papernot, Anna Goldenberg, and Marzyeh
Ghassemi. 2021. “Chasing Your Long Tails.” In
Proceedings of the 2021 ACM Conference on Fairness,
Accountability, and Transparency. ACM. https://doi.org/10.1145/3442188.3445934.
Swain, Larry. 1985. “Basic Principles of Questionnaire
Design.” Survey Methodology 11 (2): 161–70.
Taddy, Matt. 2019. Business Data Science. McGraw Hill.
Tal, Eran. 2020. “Measurement in
Science.” In The Stanford Encyclopedia of
Philosophy, edited by Edward Zalta, Fall 2020. Metaphysics Research Lab,
Stanford University. https://plato.stanford.edu/archives/fall2020/entries/measurement-science/.
Tang, Jun, Aleksandra Korolova, Xiaolong Bai, Xueqiang Wang, and
Xiaofeng Wang. 2017. “Privacy Loss in Apple’s Implementation of
Differential Privacy on MacOS 10.12.” arXiv. https://doi.org/10.48550/ARXIV.1709.02753.
The Economist. 2013. “Johnson: Those Six Little Rules: George
Orwell on Writing,” July. https://www.economist.com/prospero/2013/07/29/johnson-those-six-little-rules.
———. 2022a. “What Spotify Data Show about the Decline of
English,” January. https://www.economist.com/interactives/graphic-detail/2022/01/29/what-spotify-data-show-about-the-decline-of-english.
———. 2022b. “Will Emmanuel Macron Win a Second Term?”
April. https://www.economist.com/interactive/france-2022/forecast.
———. 2022c. “France’s Presidential Election: The Second Round in
Detail,” April. https://www.economist.com/interactive/france-2022/results-round-two.
The Prize in Economic Sciences. 2019. “Scientific Background:
Understanding Development and Poverty Alleviation.” The Committee
for the Prize in Economic Sciences in Memory of Alfred Nobel. https://www.nobelprize.org/uploads/2019/10/advanced-economicsciencesprize2019.pdf.
Thieme, Nick. 2018. “R Generation.” Significance
15 (4): 14–19. https://doi.org/10.1111/j.1740-9713.2018.01169.x.
Thistlethwaite, Donald, and Donald Campbell. 1960.
“Regression-Discontinuity Analysis: An Alternative to the Ex Post
Facto Experiment.” Journal of Educational Psychology 51
(6): 309. https://doi.org/10.1037/h0044319.
Thompson, Charlie, Josiah Parry, Donal Phipps, and Tom Wolff. 2020.
spotifyr: R Wrapper for the
“Spotify” Web API. http://github.com/charlie86/spotifyr.
Thornhill, John. 2021. “Lunch with the FT: Mathematician Hannah
Fry.” Financial Times, July. https://www.ft.com/content/a5e33e5a-99b9-4bbc-948f-8a527c7675c3.
Tierney, Nicholas. 2017. “Visdat: Visualising Whole Data
Frames.” Journal of Open Source Software 2 (16): 355. https://doi.org/10.21105/joss.00355.
———. 2020. R Markdown for Scientists. https://rmd4sci.njtierney.com.
———. 2022. Quarto for Scientists. https://github.com/njtierney/qmd4sci.
Tierney, Nicholas, Di Cook, Miles McBain, and Colin Fay. 2021.
naniar: Data Structures, Summaries, and Visualisations for Missing
Data. https://CRAN.R-project.org/package=naniar.
Tierney, Nicholas, and Karthik Ram. 2020. “A Realistic Guide to
Making Data Available Alongside Code to Improve Reproducibility.”
https://arxiv.org/abs/2002.11626.
Timbers, Tiffany. 2020. canlang: Canadian
Census language data. https://ttimbers.github.io/canlang/.
Timbers, Tiffany, Trevor Campbell, and Melissa Lee. 2022. Data
Science: A First Introduction. Chapman and Hall/CRC. https://datasciencebook.ca.
Tolley, Erin, and Mireille Paquet. 2021. “Gender, Municipal Party
Politics, and Montreal’s First Woman Mayor.” Canadian Journal
of Urban Research 30 (1): 40–52.
Tourangeau, Roger, Lance Rips, and Kenneth Rasinski. 2000. The
Psychology of Survey Response. 1st ed. Cambridge University Press.
https://doi.org/10.1017/CBO9780511819322.
Trisovic, Ana, Matthew Lau, Thomas Pasquier, and Mercè Crosas. 2022.
“A Large-Scale Study on Research Code Quality and
Execution.” Scientific Data 9 (1). https://doi.org/10.1038/s41597-022-01143-6.
Tukey, John. 1962. “The Future of Data Analysis.” The
Annals of Mathematical Statistics 33 (1): 1–67. https://doi.org/10.1214/aoms/1177704711.
UN IGME. 2021. “Levels and Trends in Child Mortality,
2021.” https://childmortality.org/wp-content/uploads/2021/12/UNICEF-2021-Child-Mortality-Report.pdf.
Urban, Steve, Rangarajan Sreenivasan, and Vineet Kannan. 2016.
“It’s All A/Bout Testing: The Netflix
Experimentation Platform.” Netflix Technology
Blog, April. https://netflixtechblog.com/its-all-a-bout-testing-the-netflix-experimentation-platform-4e1ca458c15.
Ushey, Kevin. 2022. renv: Project
Environments. https://CRAN.R-project.org/package=renv.
Van den Broeck, Jan, Solveig Argeseanu Cunningham, Roger Eeckels, and
Kobus Herbst. 2005. “Data Cleaning: Detecting, Diagnosing, and
Editing Data Abnormalities.” PLoS Medicine 2 (10): e267.
https://doi.org/10.1371/journal.pmed.0020267.
van der Loo, Mark. 2022. The Data Validation Cookbook. https://data-cleaning.github.io/validate/.
Vanderplas, Susan, Dianne Cook, and Heike Hofmann. 2020. “Testing
Statistical Charts: What Makes a Good Graph?” Annual Review
of Statistics and Its Application 7: 61–88. https://doi.org/10.1146/annurev-statistics-031219-041252.
Vanhoenacker, Mark. 2015. Skyfaring: A Journey with a Pilot.
Alfred A. Knopf.
Varin, Cristiano, Nancy Reid, and David Firth. 2011. “An Overview
of Composite Likelihood Methods.” Statistica Sinica,
5–42.
von Bergmann, Jens, Dmitry Shkolnik, and Aaron Jacobs. 2021. cancensus: R package to access, retrieve, and work with
Canadian Census data and geography. https://mountainmath.github.io/cancensus/.
Walby, Kevin, and Alex Luscombe. 2019. Freedom of Information and
Social Science Research Design. Routledge.
Walker, Kyle. 2022. Analyzing US Census Data. Chapman and
Hall/CRC. https://walker-data.com/census-r/index.html.
Walker, Kyle, and Matt Herman. 2022. tidycensus: Load US Census Boundary and Attribute Data as
“tidyverse” and “sf”-Ready Data
Frames. https://CRAN.R-project.org/package=tidycensus.
Wang, Wei, David Rothschild, Sharad Goel, and Andrew Gelman. 2015.
“Forecasting Elections with Non-Representative Polls.”
International Journal of Forecasting 31 (3): 980–91. https://doi.org/10.1016/j.ijforecast.2014.06.001.
Wang, Yilun, and Michal Kosinski. 2018. “Deep Neural Networks Are
More Accurate Than Humans at Detecting Sexual Orientation from Facial
Images.” Journal of Personality and Social Psychology
114 (2): 246–57. https://doi.org/10.1037/pspa0000098.
Wardrop, Robert. 1995. “Simpson’s Paradox and the Hot Hand in
Basketball.” The American Statistician 49 (1): 24–28. https://doi.org/10.2307/2684806.
Ware, James. 1989. “Investigating Therapies of Potentially Great
Benefit: ECMO.” Statistical Science 4 (4): 298–306. https://doi.org/10.1214/ss/1177012384.
Waring, Elin, Michael Quinn, Amelia McNamara, Eduardo Arino de la Rubia,
Hao Zhu, and Shannon Ellis. 2022. skimr: Compact and Flexible
Summaries of Data. https://CRAN.R-project.org/package=skimr.
Wasserman, Larry. 2005. All of Statistics. Springer.
Wei, LJ, and S Durham. 1978. “The Randomized Play-the-Winner Rule
in Medical Trials.” Journal of the American Statistical
Association 73 (364): 840–43. https://doi.org/10.2307/2286290.
Weinberg, Gerald. 1971. The Psychology of Computer Programming.
New York: Van Nostrand Reinhold Company.
Weissgerber, Tracey, Natasa Milic, Stacey Winham, and Vesna Garovic.
2015. “Beyond Bar and Line Graphs: Time for a New Data
Presentation Paradigm.” PLoS Biology 13 (4): e1002128.
https://doi.org/10.1371/journal.pbio.1002128.
Whitby, Andrew. 2020. The Sum of the
People. New York: Basic Books.
Whitelaw, James. 1805. An Essay on the Population of Dublin. Being
the Result of an Actual Survey Taken in 1798, with Great Care and
Precision, and Arranged in a Manner Entirely New. Graisberry and
Campbell.
Wicherts, Jelte, Marjan Bakker, and Dylan Molenaar. 2011.
“Willingness to Share Research Data Is Related to the Strength of
the Evidence and the Quality of Reporting of Statistical
Results.” PLoS ONE 6 (11):
e26828. https://doi.org/10.1371/journal.pone.0026828.
Wickham, Hadley. 2009. “Manipulating Data.” In ggplot2, 157–75. Springer New York. https://doi.org/10.1007/978-0-387-98141-3_9.
———. 2010. “A Layered Grammar of Graphics.” Journal of
Computational and Graphical Statistics 19 (1): 3–28. https://doi.org/10.1198/jcgs.2009.07098.
———. 2011. “testthat: Get Started with
Testing.” The R Journal 3: 5–10. https://journal.r-project.org/archive/2011-1/RJournal%5F2011-1%5FWickham.pdf.
———. 2014. “Tidy Data.” Journal of Statistical
Software 59 (1): 1–23. https://doi.org/10.18637/jss.v059.i10.
———. 2016. ggplot2: Elegant Graphics for Data
Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org.
———. 2017. tidyverse: Easily Install and Load
the “Tidyverse”. https://CRAN.R-project.org/package=tidyverse.
———. 2019a. Advanced R. 2nd ed. Chapman and Hall/CRC.
https://adv-r.hadley.nz.
———. 2019b. babynames: US Baby Names
1880-2017. https://CRAN.R-project.org/package=babynames.
———. 2019c. httr: Tools for Working with URLs
and HTTP. https://CRAN.R-project.org/package=httr.
———. 2019d. rvest: Easily Harvest (Scrape) Web
Pages. https://CRAN.R-project.org/package=rvest.
———. 2019e. stringr: Simple, Consistent
Wrappers for Common String Operations. https://CRAN.R-project.org/package=stringr.
———. 2020a. forcats: Tools for Working with
Categorical Variables (Factors). https://CRAN.R-project.org/package=forcats.
———. 2020b. Tidyverse. https://www.tidyverse.org/.
———. 2021a. Mastering Shiny. 1st ed. O’Reilly Media. https://mastering-shiny.org.
———. 2021b. The Tidyverse Style Guide. https://style.tidyverse.org/index.html.
———. 2021c. tidyr: Tidy Messy Data.
https://CRAN.R-project.org/package=tidyr.
———. 2022. R Packages. 2nd ed. O’Reilly Media. https://r-pkgs.org.
Wickham, Hadley, Mara Averick, Jenny Bryan, Winston Chang, Lucy
D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019.
“Welcome to the tidyverse.”
Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.
Wickham, Hadley, and Jenny Bryan. 2020. usethis: Automate Package and Project Setup.
https://CRAN.R-project.org/package=usethis.
Wickham, Hadley, Romain François, Lionel Henry, and Kirill Müller. 2022.
dplyr: A Grammar of Data
Manipulation. https://CRAN.R-project.org/package=dplyr.
Wickham, Hadley, Maximilian Girlich, and Edgar Ruiz. 2022. dbplyr: A “dplyr” Back End for
Databases. https://CRAN.R-project.org/package=dbplyr.
Wickham, Hadley, and Garrett Grolemund. 2022. R for Data
Science. 2nd ed. O’Reilly Media. https://r4ds.hadley.nz.
Wickham, Hadley, Jim Hester, and Jenny Bryan. 2021. readr: Read Rectangular Text Data. https://CRAN.R-project.org/package=readr.
Wickham, Hadley, Jim Hester, and Winston Chang. 2020. devtools: Tools to Make Developing R Packages
Easier. https://CRAN.R-project.org/package=devtools.
Wickham, Hadley, Jim Hester, and Jeroen Ooms. 2021. xml2: Parse XML. https://CRAN.R-project.org/package=xml2.
Wickham, Hadley, and Evan Miller. 2020. haven:
Import and Export “SPSS,” “Stata” and
“SAS” Files. https://CRAN.R-project.org/package=haven.
Wickham, Hadley, and Dana Seidel. 2020. scales:
Scale Functions for Visualization. https://CRAN.R-project.org/package=scales.
Wiessner, Polly. 2014. “Embers of Society: Firelight Talk Among
the Ju/’Hoansi Bushmen.” Proceedings of the National Academy
of Sciences 111 (39): 14027–35. https://doi.org/10.1073/pnas.1404212111.
Wilde, Oscar. 1891. The Picture of Dorian Gray. https://www.gutenberg.org/files/174/174-h/174-h.htm.
Wilke, Claus. 2019. Fundamentals of Data Visualization: A Primer on
Making Informative and Compelling Figures. O’Reilly Media.
Wilkinson, Leland. 2005. The Grammar of Graphics. 2nd ed.
Springer.
Wilkinson, Mark, Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle
Appleton, Myles Axton, Arie Baak, Niklas Blomberg, et al. 2016.
“The FAIR Guiding Principles for Scientific Data Management and
Stewardship.” Scientific Data 3 (1): 1–9. https://doi.org/10.1038/sdata.2016.18.
Wilson, Greg. 2021. Building Software Together. CRC Books. https://buildtogether.tech.
Wilson, Greg, Jenny Bryan, Karen Cranston, Justin Kitzes, Lex
Nederbragt, and Tracy Teal. 2017. “Good Enough Practices in
Scientific Computing.” PLOS Computational Biology 13
(6): 1–20. https://doi.org/10.1371/journal.pcbi.1005510.
Wong, Julia Carrie. 2020. “One Year Inside Trump’s Monumental
Facebook Campaign.” The Guardian, January. https://www.theguardian.com/us-news/2020/jan/28/donald-trump-facebook-ad-campaign-2020-election.
World Health Organization. 2019. “Trends in Maternal Mortality
2000 to 2017: Estimates by WHO, UNICEF, UNFPA, World Bank Group and the
United Nations Population Division.” https://www.who.int/reproductivehealth/publications/maternal-mortality-2000-2017/en/.
Wright, Philip. 1928. The Tariff on Animal and Vegetable Oils.
New York: Macmillan Company.
Wu, Changbao, and Mary Thompson. 2020. Sampling Theory and
Practice. Springer.
Xie, Yihui. 2019. “TinyTeX: A lightweight,
cross-platform, and easy-to-maintain LaTeX distribution based on TeX
Live.” TUGboat, no. 1: 30–32. https://tug.org/TUGboat/Contents/contents40-1.html.
———. 2021. knitr: A General-Purpose Package for
Dynamic Report Generation in R. https://yihui.org/knitr/.
Xie, Yihui, Christophe Dervieux, and Alison Presmanes Hill. 2021.
blogdown: Create Blogs and Websites with R
Markdown. https://github.com/rstudio/blogdown.
Xie, Yihui, Amber Thomas, and Alison Presmanes Hill. 2021. blogdown: Creating Websites with R Markdown.
Xu, Ya. 2020. “Causal Inference Challenges in Industry: A
Perspective from Experiences at LinkedIn.” YouTube,
July. https://youtu.be/OoKsLAvyIYA.
Yeager, David, Jon Krosnick, LinChiat Chang, Harold Javitz, Matthew
Levendusky, Alberto Simpser, and Rui Wang. 2011. “Comparing the
Accuracy of RDD Telephone Surveys and Internet Surveys
Conducted with Probability and Non-Probability Samples.”
Public Opinion Quarterly 75 (4): 709–47. https://doi.org/10.1093/poq/nfr020.
Yoshioka, Alan. 1998. “Use of Randomisation in the Medical
Research Council’s Clinical Trial of Streptomycin in Pulmonary
Tuberculosis in the 1940s.” BMJ 317 (7167): 1220–23. https://doi.org/10.1136/bmj.317.7167.1220.
Zhang, Susan, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen,
Shuohui Chen, Christopher Dewan, et al. 2022. “OPT: Open
Pre-Trained Transformer Language Models.” arXiv. https://doi.org/10.48550/ARXIV.2205.01068.
Zhu, Hao. 2020. kableExtra: Construct Complex
Table with “kable” and Pipe Syntax. https://CRAN.R-project.org/package=kableExtra.
Zimmer, Michael. 2018. “Addressing Conceptual Gaps in Big Data
Research Ethics: An Application of Contextual Integrity.”
Social Media + Society 4 (2): 1–11. https://doi.org/10.1177/2056305118768300.
Zinsser, William. 1976. On Writing Well. New York:
HarperCollins.
Zook, Matthew, Solon Barocas, danah boyd, Kate Crawford, Emily Keller,
Seeta Peña Gangadharan, Alyssa Goodman, et al. 2017. “Ten Simple
Rules for Responsible Big Data Research.” PLOS Computational
Biology 13 (3): e1005399. https://doi.org/10.1371/journal.pcbi.1005399.