References
Abeysooriya, Mandhri, Megan Soria, Mary Sravya Kasu, and Mark Ziemann.
2021. “Gene Name Errors: Lessons Not Learned.” PLOS
Computational Biology 17 (7): 1–13. https://doi.org/10.1371/journal.pcbi.1008984.
Alexander, Monica. 2019a. “Reproducibility in Demographic
Research.” https://www.monicaalexander.com/posts/2019-10-20-reproducibility/.
———. 2019b. “The Concentration and Uniqueness of Baby Names in
Australia and the US,” January. https://www.monicaalexander.com/posts/2019-20-01-babynames/.
———. 2019c. “Analyzing Name Changes After Marriage Using a
Non-Representative Survey,” August. https://www.monicaalexander.com/posts/2019-08-07-mrp/.
———. 2021. “Overcoming Barriers to Sharing Code.”
YouTube, February. https://youtu.be/yvM2C6aZ94k.
Alexander, Monica J, Mathew V Kiang, and Magali Barbieri. 2018.
“Trends in Black and White Opioid Mortality in the United States,
1979–2015.” Epidemiology (Cambridge, Mass.) 29 (5): 707.
Alexander, Monica, and Leontine Alkema. 2018. “Global Estimation
of Neonatal Mortality Using a Bayesian Hierarchical Splines Regression
Model.” Demographic Research 38: 335–72.
———. 2021. “A Bayesian Cohort Component Projection Model to
Estimate Adult Populations at the Subnational Level in Data-Sparse
Settings.” https://arxiv.org/abs/2102.06121.
Alexander, Rohan, and Monica Alexander. 2021. “The Increased
Effect of Elections and Changing Prime Ministers on Topics Discussed in
the Australian Federal Parliament Between 1901 and 2018.” https://arxiv.org/abs/2111.09299.
Alexander, Rohan, and Paul A. Hodgetts. 2021. AustralianPoliticians:
Provides Datasets about Australian Politicians. https://CRAN.R-project.org/package=AustralianPoliticians.
Alexander, Rohan, and Zachary Ward. 2018. “Age at Arrival and
Assimilation During the Age of Mass Migration.” The Journal
of Economic History 78 (3): 904–37.
Allaire, JJ, Rich Iannone, Alison Presmanes Hill, and Yihui Xie. 2021.
Distill: ’R Markdown’ Format for Scientific and Technical
Writing.
Allen, Jeff. 2021. plumberDeploy: Plumber Deployment. https://CRAN.R-project.org/package=plumberDeploy.
Alsan, Marcella, and Marianne Wanamaker. 2018. “Tuskegee and the
Health of Black Men.” The Quarterly Journal of Economics
133 (1): 407–55.
Amaka, Ofunne, and Amber Thomas. 2021. The Naked Truth: How the
Names of 6,816 Complexion Products Can Reveal Bias in Beauty. https://pudding.cool/2021/03/foundation-names/.
Andrews, David F, and Agnes M Herzberg. 2012. Data: A Collection of
Problems from Many Fields for the Student and Research Worker.
Springer Science & Business Media.
Angelucci, Charles, and Julia Cagé. 2019. “Newspapers in Times of
Low Advertising Revenues.” American Economic Journal:
Microeconomics 11 (3): 319–64.
Angrist, Joshua D, and Jörn-Steffen Pischke. 2010. “The
Credibility Revolution in Empirical Economics: How Better Research
Design Is Taking the Con Out of Econometrics.” Journal of
Economic Perspectives 24 (2): 3–30.
Annas, George J. 2003. “HIPAA Regulations: A New Era of
Medical-Record Privacy?” New England Journal of Medicine
348: 1486.
Arel-Bundock, Vincent. 2021a. Modelsummary: Summary Tables and Plots
for Statistical Models and Data: Beautiful, Customizable, and
Publication-Ready. https://CRAN.R-project.org/package=modelsummary.
———. 2021b. WDI: World Development Indicators and Other World Bank
Data. https://CRAN.R-project.org/package=WDI.
Arnold, Jeffrey B. 2021. Ggthemes: Extra Themes, Scales and Geoms
for ’Ggplot2’. https://CRAN.R-project.org/package=ggthemes.
Association, American Medical, and New York Academy of Medicine. 1848.
Code of Medical Ethics. Academy of Medicine.
Athey, Susan, and Guido W Imbens. 2017. “The State of Applied
Econometrics: Causality and Policy Evaluation.” Journal of
Economic Perspectives 31 (2): 3–32.
Athey, Susan, Guido W Imbens, Jonas Metzger, and Evan Munro. 2021.
“Using Wasserstein Generative Adversarial Networks for the Design
of Monte Carlo Simulations.” Journal of Econometrics.
Au, Randy. 2020. “Data Cleaning IS Analysis, Not Grunt
Work.” Counting Stuff. https://counting.substack.com/p/data-cleaning-is-analysis-not-grunt.
Bache, Stefan Milton, and Hadley Wickham. 2022. Magrittr: A
Forward-Pipe Operator for R. https://CRAN.R-project.org/package=magrittr.
Baker, Reg, J. Michael Brick, Nancy A. Bates, Mike Battaglia, Mick P.
Couper, Jill A. Dever, Krista J. Gile, and Roger Tourangeau. 2013.
“Summary Report of the AAPOR Task Force on
Non-probability Sampling.” Journal of Survey
Statistics and Methodology 1 (2): 90–143. https://doi.org/10.1093/jssam/smt008.
Bandy, Jack, and Nicholas Vincent. 2021. “Addressing
"Documentation Debt" in Machine Learning Research: A Retrospective
Datasheet for BookCorpus.” https://arxiv.org/abs/2105.05241.
Barba, Lorena A. 2018. “Terminologies for Reproducible
Research.” https://arxiv.org/abs/1802.03311.
Barrett, Malcolm. 2021a. Data Science as an Atomic Habit. https://malco.io/2021/01/04/data-science-as-an-atomic-habit/.
———. 2021b. Ggdag: Analyze and Create Elegant Directed Acyclic
Graphs. https://CRAN.R-project.org/package=ggdag.
Barron, Alexander TJ, Jenny Huang, Rebecca L Spang, and Simon DeDeo.
2018. “Individuals, Institutions, and Innovation in the Debates of
the French Revolution.” Proceedings of the National Academy
of Sciences 115 (18): 4607–12.
Bates, Douglas, Martin Mächler, Ben Bolker, and Steve Walker. 2015.
“Fitting Linear Mixed-Effects Models Using lme4.” Journal of Statistical
Software 67 (1): 1–48. https://doi.org/10.18637/jss.v067.i01.
Baumer, Benjamin, Daniel Kaplan, and Nicholas Horton. 2021. Modern
Data Science with R. 2nd ed. CRC Press.
Beauregard, Katrine, and Jill Sheppard. 2021. “Antiwomen but
Proquota: Disaggregating Sexism and Support for Gender Quota
Policies.” Political Psychology 42 (2): 219–37.
Bensinger, Greg. 2020. Google Redraws the Borders on Maps Depending
on Who’s Looking. Washington Post.
Berkson, Joseph. 1946. “Limitations of the Application of Fourfold
Table Analysis to Hospital Data.” Biometrics Bulletin 2
(3): 47–53. http://www.jstor.org/stable/3002000.
Berners-Lee, Timothy J. 1989. “Information Management: A
Proposal.”
Berry, Donald A. 1989. “[Investigating Therapies of Potentially
Great Benefit: ECMO]: Comment: Ethics and ECMO.” Statistical
Science 4 (4): 306–10.
Bertrand, Marianne, and Sendhil Mullainathan. 2004. “Are Emily and
Greg More Employable Than Lakisha and Jamal? A Field Experiment on Labor
Market Discrimination.” American Economic Review 94 (4):
991–1013.
Bethlehem, R. A. I., J. Seidlitz, S. R. White, J. W. Vogel, K. M.
Anderson, C. Adamson, S. Adler, et al. 2022. “Brain Charts for the
Human Lifespan.” Nature, April. https://doi.org/10.1038/s41586-022-04554-y.
Bickel, Peter J, Eugene A Hammel, and J William O’Connell. 1975.
“Sex Bias in Graduate Admissions: Data from Berkeley: Measuring
Bias Is Harder Than Is Usually Assumed, and the Evidence Is Sometimes
Contrary to Expectation.” Science 187 (4175): 398–404.
Biderman, Stella, Kieran Bicheno, and Leo Gao. 2022. “Datasheet
for the Pile.” https://arxiv.org/abs/2201.07311.
Blair, Graeme, Jasper Cooper, Alexander Coppock, Macartan Humphreys, and
Luke Sonnet. 2021. Estimatr: Fast Estimators for Design-Based
Inference. https://CRAN.R-project.org/package=estimatr.
Blair, James. 2019. Democratizing R with Plumber APIs. https://www.rstudio.com/resources/rstudioconf-2019/democratizing-r-with-plumber-apis/.
Bland, J Martin, and Douglas G Altman. 1986. “Statistical Methods
for Assessing Agreement Between Two Methods of Clinical
Measurement.” The Lancet 327 (8476): 307–10.
Blei, David M. 2012. “Probabilistic Topic Models.”
Communications of the ACM 55 (4): 77–84.
Blei, David M, and John D Lafferty. 2009. “Topic Models.”
In Text Mining, 101–24. Chapman and Hall/CRC.
Blei, David M, Andrew Y Ng, and Michael I Jordan. 2003. “Latent
Dirichlet Allocation.” Journal of Machine Learning
Research 3 (Jan): 993–1022.
Bloom, Howard, Andrew Bell, and Kayla Reiman. 2020. “Using Data
from Randomized Trials to Assess the Likely Generalizability of
Educational Treatment-Effect Estimates from Regression Discontinuity
Designs.” Journal of Research on Educational
Effectiveness, 1–30. https://doi.org/10.1080/19345747.2019.1634169.
Boland, Philip J. 1984. “A Biographical Glimpse of William Sealy
Gosset.” The American Statistician 38 (3): 179–83.
Bolton, Ruth, and Randall Chapman. 1986. “Searching for Positive
Returns at the Track.” Management Science 32 (August):
1040–60. https://doi.org/10.1287/mnsc.32.8.1040.
Borkin, Michelle A, Zoya Bylinskii, Nam Wook Kim, Constance May
Bainbridge, Chelsea S Yeh, Daniel Borkin, Hanspeter Pfister, and Aude
Oliva. 2015. “Beyond Memorability: Visualization Recognition and
Recall.” IEEE Transactions on Visualization and Computer
Graphics 22 (1): 519–28.
Bouie, Jamelle. 2022. We Still Can’t See American Slavery for What
It Was.
Bowers, Jake. 2011. “Six Steps to a Better Relationship with Your
Future Self.” The Political Methodologist 18 (2): 2–8.
Bowley, Arthur Lyon. 1901. Elements of Statistics. P. S. King.
———. 1913. “Working-Class Households in Reading.”
Journal of the Royal Statistical Society 76 (7): 672–701.
Braginsky, Mika. 2020. Wordbankr: Accessing the Wordbank
Database. https://CRAN.R-project.org/package=wordbankr.
Brandt, Allan M. 1978. “Racism and Research: The Case of the
Tuskegee Syphilis Study.” Hastings Center Report, 21–29.
Briggs, Ryan C. 2021. “Why Does Aid Not Target the
Poorest?” International Studies Quarterly 65 (3):
739–52.
Brokowski, Carolyn, and Mazhar Adli. 2019. “CRISPR Ethics: Moral
Considerations for Applications of a Powerful Tool.” Journal
of Molecular Biology 431 (1): 88–101.
Brontë, Charlotte. 1847. Jane Eyre. https://www.gutenberg.org/files/1260/1260-h/1260-h.htm.
Brontë, Charlotte. 1857. The Professor.
Brook, Robert H, John E Ware, William H Rogers, Emmett B Keeler, Allyson
Ross Davies, Cathy D Sherbourne, George A Goldberg, Kathleen N Lohr,
Patricia Camp, and Joseph P Newhouse. 1984. “The Effect of
Coinsurance on the Health of Adults: Results from the RAND Health
Insurance Experiment.”
Bryan, Jennifer, Jim Hester, David Robinson, and Hadley Wickham. 2019.
Reprex: Prepare Reproducible Example Code via the Clipboard. https://CRAN.R-project.org/package=reprex.
Bryan, Jenny. 2018a. “Excuse Me, Do You Have a Moment to Talk
about Version Control?” The American Statistician 72
(1): 20–27. https://doi.org/10.1080/00031305.2017.1399928.
———. 2018b. “Code Smells and Feels.” YouTube,
July. https://youtu.be/7oyiPBjLAWY.
———. 2020. Happy Git and GitHub for the
useR. https://happygitwithr.com.
Bryan, Jenny, and Jim Hester. 2020. What They Forgot to Teach You
about R. https://rstats.wtf/index.html.
Buckheit, Jonathan B, and David L Donoho. 1995. “Wavelab and
Reproducible Research.” In Wavelets and Statistics,
55–81. Springer.
Bueno de Mesquita, Ethan, and Anthony Fowler. 2021. Thinking Clearly
with Data: A Guide to Quantitative Reasoning and Analysis.
Princeton University Press.
Buhr, Ray. 2017. Using R as a Production Machine Learning Language
(Part I). https://raybuhr.github.io/blog/posts/making-predictions-over-http/.
Buja, Andreas, Dianne Cook, and Deborah F Swayne. 1996.
“Interactive High-Dimensional Data Visualization.”
Journal of Computational and Graphical Statistics 5 (1): 78–99.
Bush, Vannevar. 1945. “As We May Think.” The Atlantic
Monthly 176 (1): 101–8.
Cahill, Niamh, Michelle Weinberger, and Leontine Alkema. 2020.
“What Increase in Modern Contraceptive Use Is Needed in FP2020
Countries to Reach 75% Demand Satisfied by 2030? An Assessment Using the
Accelerated Transition Method and Family Planning Estimation
Model.” Gates Open Research 4.
Calonico, Sebastian, Matias D. Cattaneo, Max H. Farrell, and Rocio
Titiunik. 2021. Rdrobust: Robust Data-Driven Statistical Inference
in Regression-Discontinuity Designs. https://CRAN.R-project.org/package=rdrobust.
Cambon, Jesse, and Christopher Belanger. 2021. “Tidygeocoder:
Geocoding Made Easy.” Zenodo. https://doi.org/10.5281/zenodo.3981510.
Carle, Eric. 1969. The Very Hungry Caterpillar. World
Publishing Company.
Carleton, Chris. 2021. Wccarleton/Conflict-Europe: Acce
(version v1.0.0). Zenodo. https://doi.org/10.5281/zenodo.4550688.
Carleton, W Christopher, Dave Campbell, and Mark Collard. 2021. “A
Reassessment of the Impact of Temperature Change on European Conflict
During the Second Millennium CE Using a Bespoke Bayesian Time-Series
Model.” Climatic Change 165 (1): 1–16.
Caro, Robert. 2019. Working. 1st ed. Knopf.
Carroll, Lewis. 1865. Alice’s Adventures in Wonderland.
Macmillan.
———. 1871. Through the Looking-Glass. Macmillan.
Chamberlain, Scott, Hadley Wickham, and Winston Chang. 2021.
Analogsea: Interface to ’Digital Ocean’.
Chambliss, Daniel F. 1989. “The Mundanity of Excellence: An
Ethnographic Report on Stratification and Olympic Swimmers.”
Sociological Theory 7 (1): 70–86.
Chang, Winston, Joe Cheng, JJ Allaire, Carson Sievert, Barret Schloerke,
Yihui Xie, Jeff Allen, Jonathan McPherson, Alan Dipert, and Barbara
Borges. 2021. Shiny: Web Application Framework for R. https://CRAN.R-project.org/package=shiny.
Chellel, Kit. 2018. “The Gambler Who Cracked the Horse-Racing
Code.” Bloomberg Businessweek (May 2018). Featured in
Bloomberg Businessweek, May 14.
Chen, Wei, Xilu Chen, Chang-Tai Hsieh, and Zheng Song. 2019. “A
Forensic Examination of China’s National Accounts.” National
Bureau of Economic Research.
Cheng, Joe, Bhaskar Karambelkar, and Yihui Xie. 2021. Leaflet:
Create Interactive Web Maps with the JavaScript ’Leaflet’ Library.
https://CRAN.R-project.org/package=leaflet.
Chouldechova, Alexandra, Diana Benavides-Prado, Oleksandr Fialko, and
Rhema Vaithianathan. 2018. “A Case Study of Algorithm-Assisted
Decision Making in Child Maltreatment Hotline Screening
Decisions.” In Proceedings of the 1st Conference on Fairness,
Accountability and Transparency, edited by Sorelle A. Friedler and
Christo Wilson, 81:134–48. Proceedings of Machine Learning Research.
PMLR. https://proceedings.mlr.press/v81/chouldechova18a.html.
Chrétien, Jean. 2007. My Years as Prime Minister. Knopf Canada.
Christensen, Garret, Allan Dafoe, Edward Miguel, Don A Moore, and Andrew
K Rose. 2019. “A Study of the Impact of Data Sharing on Article
Citations Using Journal Policies as a Natural Experiment.”
PLoS One 14 (12): e0225883.
Christensen, Garret, Jeremy Freese, and Edward Miguel. 2019.
Transparent and Reproducible Social Science Research.
University of California Press.
Churchill, Winston. 1956. A History of the English-Speaking
Peoples.
Cirone, Alexandra, and Arthur Spirling. 2021. “Turning History
into Data: Data Collection, Measurement, and Inference in HPE.”
Journal of Historical Political Economy 1 (1): 127–54. https://doi.org/10.1561/115.00000005.
City of Toronto. 2021. 2021 Street Needs Assessment. https://www.toronto.ca/city-government/data-research-maps/research-reports/housing-and-homelessness-research-and-reports/.
Cleveland, William. 1994. The Elements of Graphing Data. 2nd
ed. Hobart Press.
Cohen, I. Glenn, and Michelle M. Mello. 2018. “HIPAA
and Protecting Health Information in the 21st Century.”
JAMA 320 (3): 231. https://doi.org/10.1001/jama.2018.5630.
Cohn, Nate. 2016. We Gave Four Good Pollsters the Same Raw Data.
They Had Four Different Results.
Cook, Dianne, Nancy Reid, and Emi Tanaka. 2021. “The Foundation Is
Available for Thinking about Data Visualization Inferentially.”
Harvard Data Science Review, July. https://doi.org/10.1162/99608f92.8453435d.
Cooley, David. 2020. Mapdeck: Interactive Maps Using ’Mapbox GL JS’
and ’Deck.gl’. https://CRAN.R-project.org/package=mapdeck.
Council of European Union. 2016. “General Data Protection
Regulation 2016/679.”
Cox, David. 2018. “In Gentle Praise of Significance Tests.”
YouTube, October. https://youtu.be/txLj_P9UlCQ.
Cox, David Roxbee, and Nancy Reid. 1987. “Parameter Orthogonality
and Approximate Conditional Inference.” Journal of the Royal
Statistical Society: Series B (Methodological) 49 (1): 1–18.
Cox, Murray. 2021. “Inside Airbnb - Toronto
Data.” http://insideairbnb.com/get-the-data.html.
Craiu, Radu V. 2019. “The Hiring Gambit: In Search of the Twofer
Data Scientist.” Harvard Data Science Review 1 (1).
Cramer, Jan Salomon. 2002. “The Origins of Logistic
Regression.”
Crawford, Kate. 2021. Atlas of AI.
Yale University Press.
Csárdi, Gábor. 2020. Gitcreds: Query ’Git’ Credentials from
’R’. https://CRAN.R-project.org/package=gitcreds.
Cunningham, Scott. 2021. Causal Inference: The Mixtape. Yale
University Press.
D’Ignazio, Catherine, and Lauren F Klein. 2020. Data Feminism.
MIT Press.
Dagan, Noa, Noam Barda, Eldad Kepten, Oren Miron, Shay Perchik, Mark A
Katz, Miguel A Hernán, Marc Lipsitch, Ben Reis, and Ran D Balicer. 2021.
“BNT162b2 mRNA Covid-19 Vaccine in a Nationwide Mass Vaccination
Setting.” New England Journal of Medicine.
Darling, William M. 2011. “A Theoretical and Practical
Implementation Tutorial on Topic Modeling and Gibbs Sampling.” In
Proceedings of the 49th Annual Meeting of the Association for
Computational Linguistics: Human Language Technologies, 642–47.
DeWitt, Helen. 2000. The Last Samurai. Talk Miramax Books.
Doll, Richard, and A Bradford Hill. 1950. “Smoking and Carcinoma
of the Lung.” British Medical Journal 2 (4682): 739.
Edgeworth, Francis Ysidro. 1885. “Methods of Statistics.”
Journal of the Statistical Society of London, 181–217.
Eghbal, Nadia. 2020. Working in Public: The Making and Maintenance
of Open Source Software. Stripe Press.
Farrugia, Patricia, Bradley A Petrisor, Forough Farrokhyar, and Mohit
Bhandari. 2010. “Research Questions, Hypotheses and
Objectives.” Canadian Journal of Surgery 53 (4): 278.
Finkelstein, Amy, Sarah Taubman, Bill Wright, Mira Bernstein, Jonathan
Gruber, Joseph P Newhouse, Heidi Allen, Katherine Baicker, and Oregon
Health Study Group. 2012. “The Oregon Health Insurance Experiment:
Evidence from the First Year.” The Quarterly Journal of
Economics 127 (3): 1057–1106.
Firke, Sam. 2020. Janitor: Simple Tools for Examining and Cleaning
Dirty Data. https://CRAN.R-project.org/package=janitor.
Fisher, Ronald. 1935. The Design of Experiments. Oliver and Boyd.
Fisher, Ronald Aylmer. 1926. “The Arrangement
of Field Experiments,” 503–15. https://doi.org/10.23637/ROTHAMSTED.8V61Q.
Fiske, Susan T, and Shiro Kuriwaki. 2021. “Words to the Wise on
Writing Scientific Papers.”
Fitts, Alexis Sobel. 2014. “The King of Content: How Upworthy Aims
to Alter the Web, and Could End up Altering the World.”
Columbia Journalism Review. https://archives.cjr.org/feature/the_king_of_content.php.
Flynn, Michael. 2021. Troopdata: Tools for Analyzing Cross-National
Military Deployment and Basing Data. https://CRAN.R-project.org/package=troopdata.
Forster, E M. 1927. Aspects of the Novel. Edward Arnold.
Foster, Gordon. 1968. “Computers, Statistics and Planning: Systems
or Chaos?” Geary Lecture. https://www.esri.ie/system/files/media/file-uploads/2016-03/GLS2.pdf.
Fourcade, Marion, and Kieran Healy. 2017. “Seeing Like a
Market.” Socio-Economic Review 15 (1): 9–29.
Fox, John, and Robert Andersen. 2006. “Effect Displays for
Multinomial and Proportional-Odds Logit Models.” Sociological
Methodology 36 (1): 225–55.
Franconeri, Steven L, Lace M Padilla, Priti Shah, Jeffrey M Zacks, and
Jessica Hullman. 2021. “The Science of Visual Data Communication:
What Works.” Psychological Science in the Public
Interest 22 (3): 110–61.
Franklin, Laura R. 2005. “Exploratory Experiments.”
Philosophy of Science 72 (5): 888–99.
Friedman, Jerome H., Robert Tibshirani, and Trevor Hastie. 2009. The
Elements of Statistical Learning. Springer.
Friendly, Michael, and Howard Wainer. 2021. A History of Data
Visualization and Graphic Communication. 1st ed. Harvard University
Press.
Fry, Hannah. 2020. “Big Tech Is Testing You.” The New
Yorker, 61–65.
Funkhouser, H Gray. 1937. “Historical Development of the Graphical
Representation of Statistical Data.” Osiris 3: 269–404.
Gagolewski, Marek. 2020. R Package Stringi: Character String
Processing Facilities. http://www.gagolewski.com/software/stringi/.
Garnier, Simon, Noam Ross, Robert Rudis, Antônio Pedro Camargo, et al. 2021.
viridis - Colorblind-Friendly Color Maps
for R. https://doi.org/10.5281/zenodo.4679424.
Gebru, Timnit, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman
Vaughan, Hanna Wallach, Hal Daumé III, and Kate Crawford. 2021.
“Datasheets for Datasets.” Communications of the
ACM 64 (12): 86–92.
Gelfand, Sharla. 2019. Crying @ Sephora. https://sharla.party/post/crying-sephora/.
———. 2020. Opendatatoronto: Access the City of Toronto Open Data
Portal. https://CRAN.R-project.org/package=opendatatoronto.
———. 2021. “Make a ReprEx... Please.” YouTube,
February. https://youtu.be/G5Nm-GpmrLw.
Gelman, Andrew. 2016. “What Has Happened down Here Is the Winds
Have Changed.” https://statmodeling.stat.columbia.edu/2016/09/21/what-has-happened-down-here-is-the-winds-have-changed/.
———. 2019. Another Regression Discontinuity Disaster and What Can We
Learn from It. https://statmodeling.stat.columbia.edu/2019/06/25/another-regression-discontinuity-disaster-and-what-can-we-learn-from-it/.
———. 2020. “Statistical Models of Election Outcomes.”
YouTube, August. https://youtu.be/7gjDnrbLQ4k.
———. 2021. “Wrong Again! 30+ Years of Statistical
Mistakes.” YouTube, October. https://youtu.be/mB9Q26uptao.
Gelman, Andrew, John Carlin, Hal Stern, David Dunson, Aki Vehtari, and
Donald Rubin. 2014. Bayesian Data Analysis. 3rd ed. CRC Press.
Gelman, Andrew, and Jennifer Hill. 2007. Data Analysis Using
Regression and Multilevel/Hierarchical Models.
Gelman, Andrew, and Eric Loken. 2013. “The Garden of Forking
Paths: Why Multiple Comparisons Can Be a Problem, Even When There Is No
‘Fishing Expedition’ or ‘p-Hacking’ and the
Research Hypothesis Was Posited Ahead of Time.” Department of
Statistics, Columbia University 348.
Gelman, Andrew, Greggor Mattson, and Daniel Simpson. 2018. “Gaydar
and the Fallacy of Decontextualized Measurement.”
Sociological Science 5 (12): 270–80. https://doi.org/10.15195/v5.a12.
Gelman, Andrew, Cristian Pasarica, and Rahul Dodhia. 2002. “Let’s
Practice What We Preach: Turning Tables into Graphs.” The
American Statistician 56 (2): 121–30.
Gelman, Andrew, and Aki Vehtari. 2020. “What Are the Most
Important Statistical Ideas of the Past 50 Years?” arXiv
Preprint arXiv:2012.00174.
Gentemann, C. L., C. Holdgraf, R. Abernathey, D. Crichton, J.
Colliander, E. J. Kearns, Y. Panda, and R. P. Signell. 2021.
“Science Storms the Cloud.” AGU
Advances 2 (2). https://doi.org/10.1029/2020av000354.
Gerber, Alan, and Donald Green. 2012. Field Experiments: Design,
Analysis, and Interpretation. W W Norton.
Gertler, Paul J, Sebastian Martinez, Patrick Premand, Laura B Rawlings,
and Christel MJ Vermeersch. 2016. Impact Evaluation in
Practice. The World Bank.
Geuenich, Michael J, Jinyu Hou, Sunyun Lee, Shanza Ayub, Hartland W
Jackson, and Kieran R Campbell. 2021. “Automated Assignment of
Cell Identity from Single-Cell Multiplexed Imaging and Proteomic
Data.” Cell Systems 12 (12): 1173–86.
Ghitza, Yair, and Andrew Gelman. 2020. “Voter Registration
Databases and MRP: Toward the Use of Large-Scale Databases in Public
Opinion Research.” Political Analysis 28 (4): 507–31.
Goodman, Leo A. 1961. “Snowball Sampling.” The Annals
of Mathematical Statistics, 148–70.
Goodrich, Ben, Jonah Gabry, Imad Ali, and Sam Brilleman. 2020.
“Rstanarm: Bayesian Applied Regression Modeling via
Stan.” https://mc-stan.org/rstanarm.
Gould, Stephen Jay. 2013. “The Median Isn’t the Message.”
AMA Journal of Ethics 15 (1): 77–81.
Graham, Paul. 2020. How to Write Usefully. http://paulgraham.com/useful.html.
Green, Donald P, Terence Y Leong, Holger L Kern, Alan S Gerber, and
Christopher W Larimer. 2009. “Testing the Accuracy of Regression
Discontinuity Analysis Using Experimental Benchmarks.”
Political Analysis 17 (4): 400–417.
Green, Eric. 2020. “Nivi Research: Mister P Helps Us Understand
Vaccine Hesitancy.” https://research.nivi.io/posts/2020-12-08-mister-p-helps-us-understand-vaccine-hesitancy/.
Greenland, Sander, Stephen J Senn, Kenneth J Rothman, John B Carlin,
Charles Poole, Steven N Goodman, and Douglas G Altman. 2016.
“Statistical Tests, P Values, Confidence Intervals, and Power: A
Guide to Misinterpretations.” European Journal of
Epidemiology 31 (4): 337–50.
Griffiths, Thomas, and Mark Steyvers. 2004. “Finding Scientific
Topics.” PNAS 101: 5228–35.
Grolemund, Garrett, and Hadley Wickham. 2011. “Dates and Times
Made Easy with lubridate.”
Journal of Statistical Software 40 (3): 1–25. http://www.jstatsoft.org/v40/i03/.
Grün, Bettina, and Kurt Hornik. 2011. “topicmodels: An R Package for Fitting
Topic Models.” Journal of Statistical Software 40 (13):
1–30. https://doi.org/10.18637/jss.v040.i13.
Halberstam, David. 1972. The Best and the
Brightest. Random House.
Hamming, Richard W. 1996. The Art of Doing
Science and Engineering. Stripe Press.
Handcock, Mark S, and Krista J Gile. 2011. “Comment: On the
Concept of Snowball Sampling.” Sociological Methodology
41 (1): 367–71.
Hangartner, Dominik, Daniel Kopp, and Michael Siegenthaler. 2021.
“Monitoring Hiring Discrimination Through Online Recruitment
Platforms.” Nature 589 (7843): 572–76.
Hanretty, Chris. 2020. “An Introduction to Multilevel Regression
and Post-Stratification for Estimating Constituency Opinion.”
Political Studies Review 18 (4): 630–45.
Hao, Karen. 2019. “This is how AI bias really
happens—and why it’s so hard to fix.” MIT Technology
Review.
Hart, Edmund M, Pauline Barmby, David LeBauer, François Michonneau,
Sarah Mount, Patrick Mulrooney, Timothée Poisot, Kara H Woo, Naupaka B
Zimmerman, and Jeffrey W Hollister. 2016. “Ten Simple Rules for
Digital Data Storage.” Public Library of Science San Francisco,
CA USA.
Hastie, Trevor J, and Robert J Tibshirani. 1990. Generalized
Additive Models. Vol. 43. CRC Press.
Hayot, Eric. 2014. The Elements of Academic Style. Columbia
University Press.
Healy, Kieran. 2018. Data Visualization. Princeton University
Press.
———. 2020. The Kitchen Counter Observatory. https://kieranhealy.org/blog/archives/2020/05/21/the-kitchen-counter-observatory/.
Heckathorn, Douglas D. 1997. “Respondent-Driven Sampling: A New
Approach to the Study of Hidden Populations.” Social
Problems 44 (2): 174–99.
Heil, Benjamin J, Michael M Hoffman, Florian Markowetz, Su-In Lee, Casey
S Greene, and Stephanie C Hicks. 2021. “Reproducibility Standards
for Machine Learning in the Life Sciences.” Nature
Methods 18 (10): 1132–35.
Henry, Lionel, and Hadley Wickham. 2020. Purrr: Functional
Programming Tools. https://CRAN.R-project.org/package=purrr.
Hernán, Miguel A, and James M Robins. 2020. What If. CRC Press.
Herndon, Thomas, Michael Ash, and Robert Pollin. 2014. “Does High
Public Debt Consistently Stifle Economic Growth? A Critique of Reinhart
and Rogoff.” Cambridge Journal of Economics 38 (2):
257–79.
Hillel, Wayne. 2017. How Do We Trust Our Science Code? https://www.hillelwayne.com/how-do-we-trust-science-code/.
Ho, Daniel E., Kosuke Imai, Gary King, and Elizabeth A. Stuart. 2011.
“MatchIt: Nonparametric Preprocessing for Parametric
Causal Inference.” Journal of Statistical Software 42
(8): 1–28. https://doi.org/10.18637/jss.v042.i08.
Hofmeister, Johannes, Janet Siegmund, and Daniel V. Holt. 2017.
“Shorter Identifier Names Take Longer to Comprehend.” In
2017 IEEE 24th International Conference on Software Analysis,
Evolution and Reengineering (SANER), 217–27. https://doi.org/10.1109/SANER.2017.7884623.
Holland, Paul W. 1986. “Statistics and Causal Inference.”
Journal of the American Statistical Association 81 (396):
945–60.
Horst, Allison Marie, Alison Presmanes Hill, and Kristen B Gorman. 2020.
Palmerpenguins: Palmer Archipelago (Antarctica) Penguin Data.
https://allisonhorst.github.io/palmerpenguins/.
Hug, Lucia, Monica Alexander, Danzhen You, Leontine Alkema, and UN
Inter-agency Group for Child. 2019. “National, Regional, and
Global Levels and Trends in Neonatal Mortality Between 1990 and 2017,
with Scenario-Based Projections to 2030: A Systematic Analysis.”
The Lancet Global Health 7 (6): e710–20.
Hughes, Nicola, and Jill Rutter. 2016. Oliver Letwin. https://www.instituteforgovernment.org.uk/ministers-reflect/person/oliver-letwin/.
Hulley, Stephen B. 2007. Designing Clinical Research.
Lippincott Williams & Wilkins.
Hullman, Jessica, and Andrew Gelman. 2021. “Designing for
Interactive Exploratory Data Analysis Requires Theories of Graphical
Inference.” Harvard Data Science Review, July. https://doi.org/10.1162/99608f92.3ab8a587.
Huntington-Klein, Nick. 2021. The Effect: An Introduction to
Research Design and Causality. Chapman & Hall.
———. 2022. “Library of Statistical Techniques.” https://lost-stats.github.io.
Huntington-Klein, Nick, Andreu Arenas, Emily Beam, Marco Bertoni,
Jeffrey R Bloem, Pralhad Burli, Naibin Chen, et al. 2021. “The
Influence of Hidden Researcher Decisions in Applied
Microeconomics.” Economic Inquiry.
Huntington-Klein, Nick, Andreu Arenas, Emily Beam, Marco Bertoni,
Jeffrey Bloem, Pralhad H Burli, Naibin Chen, et al. 2020. “The
Influence of Hidden Researcher Decisions in Applied
Microeconomics.”
Huyen, Chip. 2020. Machine Learning Is Going Real-Time. https://huyenchip.com/2020/12/27/real-time-machine-learning.html.
———. 2022. Real-Time Machine Learning: Challenges and
Solutions. https://huyenchip.com/2022/01/02/real-time-machine-learning-challenges-and-solutions.html.
Hvitfeldt, Emil, and Julia Silge. 2021. Supervised Machine Learning
for Text Analysis in R. Chapman and Hall/CRC.
Iannone, Richard. 2020. DiagrammeR: Graph/Network
Visualization. https://CRAN.R-project.org/package=DiagrammeR.
Iannone, Richard, Joe Cheng, and Barret Schloerke. 2020. Gt: Easily
Create Presentation-Ready Display Tables. https://CRAN.R-project.org/package=gt.
Iannone, Richard, and Mauricio Vargas. 2022. Pointblank: Data
Validation and Organization of Metadata for Local and Remote
Tables. https://CRAN.R-project.org/package=pointblank.
Igelström, Erik. 2020. “Causal Graphs in R with
DiagrammeR.” https://www.erikigelstrom.com/articles/causal-graphs-in-r-with-diagrammer/.
Ioannidis, John PA. 2005. “Why Most Published Research Findings
Are False.” PLoS Medicine 2 (8): e124.
Isaacson, Walter. 2011. Steve Jobs. Simon & Schuster.
Ishiguro, Kazuo. 1989. The Remains of the Day. Faber and Faber.
Izrailev, Sergei. 2014. Tictoc: Functions for Timing R Scripts.
https://CRAN.R-project.org/package=tictoc.
James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani.
2017. An Introduction to Statistical Learning with Applications in
R.
Johnson, Alicia A., Miles Ott, and Mine Dogucu. 2022. Bayes Rules!
An Introduction to Bayesian Modeling with R. CRC Press.
Johnson, Kaneesha R. 2021. “Two Regimes of Prison Data
Collection.” Harvard Data Science Review, July. https://doi.org/10.1162/99608f92.72825001.
Jones, Arnold HM. 1953. “Census Records of the Later Roman
Empire.” The Journal of Roman Studies 43 (1-2): 49–64.
Jordan, Michael I. 2019. “Artificial Intelligence—the Revolution
Hasn’t Happened Yet.” Harvard Data Science Review 1 (1).
https://doi.org/10.1162/99608f92.f06c6e61.
Joyner, Michael J. 1991. “Modeling: Optimal Marathon Performance
on the Basis of Physiological Factors.” Journal of Applied
Physiology 70 (2): 683–87.
Kahle, David, and Hadley Wickham. 2013. “Ggmap: Spatial
Visualization with Ggplot2.” The R Journal 5 (1):
144–61. http://journal.r-project.org/archive/2013-1/kahle-wickham.pdf.
Kahneman, Daniel, Olivier Sibony, and Cass Sunstein. 2021. Noise: A
Flaw in Human Judgment. William Collins.
Kay, Matthew. 2020. tidybayes: Tidy Data
and Geoms for Bayesian Models. https://doi.org/10.5281/zenodo.1308151.
Kearney, Michael W. 2019. “Rtweet: Collecting and Analyzing
Twitter Data.” Journal of Open Source Software 4 (42):
1829. https://doi.org/10.21105/joss.01829.
Kennedy, Lauren, and Jonah Gabry. 2020. “MRP with
Rstanarm.” https://mc-stan.org/rstanarm/articles/mrp.html.
Kennedy, Lauren, and Andrew Gelman. 2020. “Know Your Population
and Know Your Model: Using Model-Based Regression and Poststratification
to Generalize Findings Beyond the Observed Sample.” https://arxiv.org/abs/1906.11323.
Kennedy, Lauren, Katharine Khanna, Daniel Simpson, and Andrew Gelman.
2020. “Using Sex and Gender in Survey Adjustment.” https://arxiv.org/abs/2009.14401.
Kenny, Christopher T., Shiro Kuriwaki, Cory McCartan, Evan Rosenman,
Tyler Simko, and Kosuke Imai. 2021. “The Impact of the U.S. Census
Disclosure Avoidance System on Redistricting and Voting Rights
Analysis.” https://arxiv.org/abs/2105.14197.
Keyes, Os. 2019. “Counting the Countless.” Real
Life. https://reallifemag.com/counting-the-countless/.
Kharecha, Pushker A, and James E Hansen. 2013. “Prevented
Mortality and Greenhouse Gas Emissions from Historical and Projected
Nuclear Power.” Environmental Science & Technology
47 (9): 4889–95.
Kiang, Mathew V, Alexander C Tsai, Monica J Alexander, David H Rehkopf,
and Sanjay Basu. 2021. “Racial/Ethnic Disparities in
Opioid-Related Mortality in the USA, 1999–2019: The Extreme Case of
Washington DC.” Journal of Urban Health 98 (5): 589–95.
Kimmerer, Robin Wall. 2012. Braiding Sweetgrass. Milkweed
Editions.
King, Gary. 2006. “Publication, Publication.” PS:
Political Science & Politics 39 (1): 119–25.
King, Gary, and Richard Nielsen. 2019. “Why Propensity Scores
Should Not Be Used for Matching.” Political Analysis 27
(4): 435–54.
King, Stephen. 2000. On Writing: A Memoir of the Craft.
Scribner.
Kleiber, Christian, and Achim Zeileis. 2008. Applied Econometrics
with R. New York: Springer-Verlag. https://CRAN.R-project.org/package=AER.
Knuth, D. E. 1984. “Literate Programming.” The Computer
Journal 27 (2): 97–111. https://doi.org/10.1093/comjnl/27.2.97.
Knuth, Donald E. 1998. Art of Computer Programming, Volume 2:
Seminumerical Algorithms. 2nd ed.
Koenecke, Allison, and Hal Varian. 2020. “Synthetic Data
Generation for Economists.” https://arxiv.org/abs/2011.01374.
Koenker, Roger, and Achim Zeileis. 2009. “On Reproducible
Econometric Research.” Journal of Applied Econometrics
24 (5): 833–47.
Kohavi, Ron, Diane Tang, and Ya Xu. 2020. Trustworthy Online
Controlled Experiments: A Practical Guide to A/B Testing. Cambridge
University Press.
Kröger, Jacob Leon, Milagros Miceli, and Florian Müller. 2021.
“How Data Can Be Used Against People: A Classification of Personal
Data Misuses.” Available at SSRN 3887097.
Kross, Sean. 2021. Postcards: Create Beautiful, Simple Personal
Websites. https://CRAN.R-project.org/package=postcards.
Kuhn, Max. 2021. Poissonreg: Model Wrappers for Poisson
Regression. https://CRAN.R-project.org/package=poissonreg.
Kuhn, Max, and Hadley Wickham. 2020. Tidymodels: A Collection of
Packages for Modeling and Machine Learning Using Tidyverse
Principles. https://www.tidymodels.org.
Kuriwaki, Shiro, Will Beasley, and Thomas J. Leeper. 2022.
Dataverse: R Client for Dataverse 4+ Repositories.
Kuznets, Simon. 1941. National Income and Its
Composition, 1919-1938. National Bureau of Economic
Research.
Lamott, Anne. 1994. Bird by Bird: Some Instructions on Writing and
Life. Anchor Books.
Larmarange, Joseph. 2021. Labelled: Manipulating Labelled Data.
https://CRAN.R-project.org/package=labelled.
Latour, Bruno. 1996. “On Actor-Network Theory: A Few
Clarifications.” Soziale Welt 47 (4): 369–81. http://www.jstor.org/stable/40878163.
Lauderdale, Benjamin E, Delia Bailey, Jack Blumenau, and Douglas Rivers.
2020. “Model-Based Pre-Election Polling for National and
Sub-National Outcomes in the US and UK.” International
Journal of Forecasting 36 (2): 399–413.
Lazear, Edward P. 2000. “Economic Imperialism.” The
Quarterly Journal of Economics 115 (1): 99–146.
Lee, Benjamin D. 2018. “Ten Simple Rules for Documenting
Scientific Software.” Public Library of Science San Francisco, CA
USA.
Leek, Jeff, Blakeley B. McShane, Andrew Gelman, David Colquhoun, Michèle
B. Nuijten, and Steven N. Goodman. 2017. “Five Ways to Fix
Statistics.” Nature 551 (7682): 557–59. https://doi.org/10.1038/d41586-017-07522-z.
Leek, Jeff, and Roger D. Peng. 2020. “Advanced Data Science
2020.” http://jtleek.com/ads2020/index.html.
Lin, Sarah, Ibraheem Ali, and Greg Wilson. 2020. “Ten Quick Tips
for Making Things Findable.” PLoS Computational Biology
16 (12): e1008469.
Little, Roderick J., and Roger J. Lewis. 2021. “Estimands,
Estimators, and Estimates.” JAMA 326 (10):
967. https://doi.org/10.1001/jama.2021.2886.
Locke, Steph, and Lucy D’Agostino McGowan. 2018. datasauRus:
Datasets from the Datasaurus Dozen. https://CRAN.R-project.org/package=datasauRus.
Lohr, Sharon L. 2019. Sampling: Design and Analysis. CRC Press.
Lovelace, Robin, Jakub Nowosad, and Jannes Muenchow. 2019.
Geocomputation with R. CRC Press.
Lucas Jr, Robert E. 1978. “Asset Prices in an Exchange
Economy.” Econometrica: Journal of the Econometric
Society, 1429–45.
Luebke, David Martin, and Sybil Milton. 1994. “Locating the
Victim: An Overview of Census-Taking, Tabulation Technology, and
Persecution in Nazi Germany.” IEEE Annals of the History of
Computing 16 (3): 25.
Lundberg, Ian, Rebecca Johnson, and Brandon M. Stewart. 2021.
“What Is Your Estimand? Defining the Target Quantity Connects
Statistical Evidence to Theory.” American Sociological
Review 86 (3): 532–65. https://doi.org/10.1177/00031224211004187.
Luscombe, Alex, Kevin Dick, and Kevin Walby. 2021. “Algorithmic
Thinking in the Public Interest: Navigating Technical, Legal, and
Ethical Hurdles to Web Scraping in the Social Sciences.”
Quality & Quantity, 1–22.
Luscombe, Alex, and Alexander McClelland. 2020. “Policing the
Pandemic: Tracking the Policing of COVID-19 Across Canada.”
Macaulay, Thomas Babington. 1848. The History of England from the
Accession of James the Second.
MacDorman, Marian F, and Eugene Declercq. 2018. “The Failure of
United States Maternal Mortality Reporting and Its Impact on Women’s
Lives.” Birth (Berkeley, Calif.) 45 (2): 105.
Martinez, Luis R. 2019. “How Much Should We Trust the Dictator’s
GDP Growth Estimates?” Available at SSRN 3093296.
Matias, J. Nathan, Kevin Munger, Marianne Aubin Le Quere, and Charles
Ebersole. 2019. “The Upworthy Research Archive.” https://upworthy.natematias.com.
Mattson, Greggor. 2017. “Artificial Intelligence Discovers
Gayface. Sigh.” https://greggormattson.com/2017/09/09/artificial-intelligence-discovers-gayface/amp/.
McClelland, Alexander. 2019. “‘Lock This Whore Up’: Legal Violence
and Flows of Information Precipitating Personal Violence Against People
Criminalised for HIV-Related Crimes in Canada.” European
Journal of Risk Regulation 10 (1): 132–47.
McElreath, Richard. 2020. Statistical
Rethinking: A Bayesian Course with Examples in R and Stan.
CRC Press.
McPhee, John. 2017. Draft No. 4. Farrar, Straus and Giroux.
McQuire, Scott. 2019. “One Map to Rule Them All? Google Maps as
Digital Technical Object.” Communication and the Public
4 (2): 150–65.
Meng, Xiao-Li. 2018. “Statistical Paradises and Paradoxes in Big
Data (I): Law of Large Populations, Big Data Paradox, and the 2016 US
Presidential Election.” The Annals of Applied Statistics
12 (2): 685–726.
———. 2021. “What Are the Values of Data, Data Science, or Data
Scientists?” Harvard Data Science Review, January. https://doi.org/10.1162/99608f92.ee717cf7.
Merali, Zeeya. 2010. “Computational Science: ...Error.”
Nature 467 (7317): 775–77.
Geuenich, Michael, Jinyu Hou, Sunyun Lee, Shanza Ayub, Hartland Jackson,
and Kieran Campbell. 2021. “Automated Assignment of Cell Identity from
Single-Cell Multiplexed Imaging and Proteomic Data.” Zenodo. https://doi.org/10.5281/zenodo.5156049.
Michener, William K. 2015. “Ten Simple Rules for Creating a Good
Data Management Plan.” PLoS Computational Biology 11
(10): e1004525. https://doi.org/10.1371/journal.pcbi.1004525.
Mineault, Patrick, and The Good Research Code Handbook Community. 2021.
“The Good Research Code Handbook.” https://doi.org/10.5281/ZENODO.5796873.
Mitchell, Margaret, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy
Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and
Timnit Gebru. 2019. “Model Cards for Model Reporting.”
Proceedings of the Conference on Fairness, Accountability, and
Transparency, January. https://doi.org/10.1145/3287560.3287596.
Miyakawa, Tsuyoshi. 2020. “No Raw Data, No Science: Another
Possible Source of the Reproducibility Crisis.” Molecular
Brain. Springer.
Müller, Kirill, and Hadley Wickham. 2021. Tibble: Simple Data
Frames. https://CRAN.R-project.org/package=tibble.
Murphy, Heather. 2017. Why Stanford Researchers Tried to Create a
’Gaydar’ Machine.
Nelder, John Ashworth, and Robert WM Wedderburn. 1972.
“Generalized Linear Models.” Journal of the Royal
Statistical Society: Series A (General) 135 (3): 370–84.
Neufeld, Michael J. 2002. “Wernher von Braun, the SS, and
Concentration Camp Labor: Questions of Moral, Political, and Criminal
Responsibility.” German Studies Review 25 (1): 57–78.
Neuwirth, Erich. 2014. RColorBrewer: ColorBrewer Palettes. https://CRAN.R-project.org/package=RColorBrewer.
Neyman, Jerzy. 1934. “On the Two Different Aspects of the
Representative Method: The Method of Stratified Sampling and the Method
of Purposive Selection.” Journal of the Royal Statistical
Society 97 (4): 558–625.
Obermeyer, Ziad, Brian Powers, Christine Vogeli, and Sendhil
Mullainathan. 2019. “Dissecting Racial Bias in an Algorithm Used
to Manage the Health of Populations.” Science 366
(6464): 447–53.
OECD. 2014. “The Essential Macroeconomic Aggregates.” In
Understanding National Accounts, 13–46. OECD. https://doi.org/10.1787/9789264214637-2-en.
———. 2022. Quarterly GDP. https://data.oecd.org/gdp/quarterly-gdp.htm.
Ooms, Jeroen. 2014. “The Jsonlite Package: A Practical and
Consistent Mapping Between JSON Data and R Objects.”
arXiv:1403.2805 [Stat.CO]. https://arxiv.org/abs/1403.2805.
———. 2018a. Pdftools: Text Extraction, Rendering and Converting of
PDF Documents. https://CRAN.R-project.org/package=pdftools.
———. 2018b. Tesseract: Open Source OCR Engine. https://CRAN.R-project.org/package=tesseract.
———. 2019a. Pdftools: Text Extraction, Rendering and Converting of
PDF Documents. https://CRAN.R-project.org/package=pdftools.
———. 2019b. Pdftools: Text Extraction, Rendering and Converting of
PDF Documents. https://CRAN.R-project.org/package=pdftools.
———. 2019c. Tesseract: Open Source OCR Engine. https://CRAN.R-project.org/package=tesseract.
———. 2021. Openssl: Toolkit for Encryption, Signatures and
Certificates Based on OpenSSL. https://CRAN.R-project.org/package=openssl.
Oostrom, Tamar. 2021. “Funding of Clinical Trials and Reported
Drug Efficacy.” https://drive.google.com/file/d/1EQLCH0ns99IxYBkxPNbagcZtGgE9a8MQ/view.
Orwell, George. 1946. Politics and the English Language. https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/politics-and-the-english-language/.
Patki, Neha, Roy Wedge, and Kalyan Veeramachaneni. 2016. “The
Synthetic Data Vault.” In 2016 IEEE International Conference
on Data Science and Advanced Analytics (DSAA), 399–410. IEEE.
Paullada, Amandalynne, Inioluwa Deborah Raji, Emily M. Bender, Emily
Denton, and Alex Hanna. 2021. “Data and Its (Dis)contents: A
Survey of Dataset Development and Use in Machine Learning
Research.” Patterns 2 (11): 100336. https://doi.org/10.1016/j.patter.2021.100336.
Pavlik, Kaylin. 2019. “Understanding + Classifying Genres Using
Spotify Audio Features.” https://www.kaylinpavlik.com/classifying-songs-genres/.
Pedersen, Thomas Lin. 2020. Patchwork: The Composer of Plots.
https://CRAN.R-project.org/package=patchwork.
Phillips, Alban W. 1958. “The Relation Between Unemployment and
the Rate of Change of Money Wage Rates in the United Kingdom,
1861-1957.” Economica 25 (100): 283–99.
Pineau, Joelle, Philippe Vincent-Lamarre, Koustuv Sinha, Vincent
Larivière, Alina Beygelzimer, Florence d’Alché-Buc, Emily Fox, and Hugo
Larochelle. 2021. “Improving Reproducibility in Machine Learning
Research: A Report from the NeurIPS 2019 Reproducibility
Program.” Journal of Machine Learning Research 22.
Pitman, Jim. 1993. Probability.
Presmanes Hill, Alison. 2021a. M-F-E-O:
postcards + distill. https://alison.rbind.io/post/2020-12-22-postcards-distill/.
———. 2021b. Up & Running with Blogdown in 2021.
R Core Team. 2021. R: A Language and Environment for Statistical
Computing. Vienna, Austria: R Foundation for Statistical Computing.
https://www.R-project.org/.
Register, Yim. 2020. “Data Science Ethics in 6 Minutes.” https://youtu.be/mA4gypAiRYU.
Reid, Nancy. 2003. “Asymptotics and the Theory of
Inference.” The Annals of Statistics 31 (6): 1695–1731.
Richardson, Neal, Ian Cook, Nic Crane, Jonathan Keane, Romain François,
Jeroen Ooms, and Apache Arrow. 2022. Arrow: Integration to ’Apache’
’Arrow’. https://CRAN.R-project.org/package=arrow.
Riederer, Emily. 2020. “Column Names as Contracts.” https://emilyriederer.netlify.app/post/column-name-contracts/.
———. 2022. Convo: Enables Conversations and Contracts Through
Controlled Vocabulary Naming Conventions. https://github.com/emilyriederer/convo.
Rilke, Rainer Maria. 1929. Letters to a Young Poet.
Robinson, David. 2021. Gutenbergr: Download and Process Public
Domain Works from Project Gutenberg. https://CRAN.R-project.org/package=gutenbergr.
Robinson, David, Alex Hayes, and Simon Couch. 2021. Broom: Convert
Statistical Objects into Tidy Tibbles. https://CRAN.R-project.org/package=broom.
Robinson, Emily, and Jacqueline Nolis. 2020. Build a Career in Data
Science. https://livebook.manning.com/book/build-a-career-in-data-science?origin=product-look-inside.
Rockoff, Hugh. 2019. “On the Controversies Behind the Origins of
the Federal Economic Statistics.” Journal of Economic
Perspectives 33 (1): 147–64.
Ross, Casey. 2022. “How a Decades-Old Database Became a Hugely
Profitable Dossier on the Health of 270 Million Americans.”
Stat, February. https://www.statnews.com/2022/02/01/ibm-watson-health-marketscan-data/.
Rudis, Bob. 2020. Hrbrthemes: Additional Themes, Theme Components
and Utilities for ’Ggplot2’. https://CRAN.R-project.org/package=hrbrthemes.
Ruggles, Steven, Catherine Fitch, Diana Magnuson, and Jonathan
Schroeder. 2019. “Differential Privacy and Census Data:
Implications for Social and Economic Research.” In AEA Papers
and Proceedings, 109:403–8.
Ruggles, Steven, Sarah Flood, Sophia Foster, Ronald Goeken, Jose Pacas,
Megan Schouweiler, and Matthew Sobek. 2021. “IPUMS USA: Version
11.0.” Minneapolis, MN: IPUMS. https://doi.org/10.18128/D010.V11.0.
Salganik, Matthew. 2018. Bit by Bit: Social Research in the Digital
Age. Princeton University Press.
Salganik, Matthew J, Peter Sheridan Dodds, and Duncan J Watts. 2006.
“Experimental Study of Inequality and Unpredictability in an
Artificial Cultural Market.” Science 311 (5762): 854–56.
Salganik, Matthew J, and Douglas D Heckathorn. 2004. “Sampling and
Estimation in Hidden Populations Using Respondent-Driven
Sampling.” Sociological Methodology 34 (1): 193–240.
Samuel, Arthur L. 1959. “Some Studies in Machine Learning Using
the Game of Checkers.” IBM Journal of Research and
Development 3 (3): 210–29.
Schloerke, Barret, and Jeff Allen. 2021. Plumber: An API Generator
for R. https://CRAN.R-project.org/package=plumber.
Sekhon, Jasjeet S, and Rocio Titiunik. 2017. “Understanding
Regression Discontinuity Designs as Observational Studies.”
Observational Studies 3 (2): 174–82.
Si, Yajuan. 2020. “On the Use of Auxiliary Variables in Multilevel
Regression and Poststratification.” https://arxiv.org/abs/2011.00360.
Sides, John, Lynn Vavreck, and Christopher Warshaw. 2021. “The
Effect of Television Advertising in United States Elections.”
American Political Science Review, 1–17. https://doi.org/10.1017/S000305542100112X.
Silberzahn, Raphael, Eric L Uhlmann, Daniel P Martin, Pasquale Anselmi,
Frederik Aust, Eli Awtrey, Štěpán Bahník, et al. 2018. “Many
Analysts, One Data Set: Making Transparent How Variations in Analytic
Choices Affect Results.” Advances in Methods and Practices in
Psychological Science 1 (3): 337–56.
Silge, Julia. 2018. Text Classification with Tidy Data
Principles. https://juliasilge.com/blog/tidy-text-classification/.
Silge, Julia, and David Robinson. 2016. “Tidytext: Text Mining and
Analysis Using Tidy Data Principles in R.” Journal of Open Source Software 1 (3).
https://doi.org/10.21105/joss.00037.
Silver, Nate. 2020. We Fixed an Issue with How Our Primary Forecast
Was Calculating Candidates’ Demographic Strengths. https://fivethirtyeight.com/features/we-fixed-a-mistake-in-how-our-primary-forecast-was-calculating-candidates-demographic-strengths/.
Simonsohn, Uri. 2013. “Just Post It: The Lesson from Two Cases of
Fabricated Data Detected by Statistics Alone.” Psychological
Science 24 (10): 1875–88.
Simpson, Edward H. 1951. “The Interpretation of Interaction in
Contingency Tables.” Journal of the Royal Statistical
Society: Series B (Methodological) 13 (2): 238–41.
Somers, James. 2017. “Torching the Modern-Day Library of
Alexandria.” The Atlantic 20.
Sprint, Gina, and Jason Conci. 2019. “Mining Github Classroom
Commit Behavior in Elective and Introductory Computer Science
Courses.” The Journal of Computing Sciences in Colleges
35 (1).
Staicu, Ana-Maria. 2017. “Interview with Nancy Reid.”
International Statistical Review 85 (3): 381–403. https://doi.org/10.1111/insr.12237.
Staniak, Mateusz, and Przemyslaw Biecek. 2019. “The Landscape of R
Packages for Automated Exploratory Data Analysis.” arXiv
Preprint arXiv:1904.02101.
Statistics Canada. 2017. “Guide to the Census of Population,
2016.” Statistics Canada. https://www12.statcan.gc.ca/census-recensement/2016/ref/98-304/98-304-x2016001-eng.pdf.
———. 2020. “Sex at Birth and Gender: Technical Report on Changes
for the 2021 Census.” Statistics Canada. https://www12.statcan.gc.ca/census-recensement/2021/ref/98-20-0002/982000022020002-eng.pdf.
Stevens, Wallace. 1934. The Idea of Order at Key West. https://www.poetryfoundation.org/poems/43431/the-idea-of-order-at-key-west.
Steyvers, Mark, and Tom Griffiths. 2006. “Probabilistic Topic
Models.” In Latent Semantic Analysis: A Road to Meaning,
edited by T. Landauer, D McNamara, S. Dennis, and W. Kintsch.
Stigler, Stephen. 1986. The History of Statistics. Harvard
University Press.
Stock, James H, and Francesco Trebbi. 2003. “Retrospectives: Who
Invented Instrumental Variable Regression?” Journal of
Economic Perspectives 17 (3): 177–94.
Stolberg, Michael. 2006. “Inventing the Randomized Double-Blind
Trial: The Nuremberg Salt Test of 1835.” Journal of the Royal
Society of Medicine 99 (12): 642–43.
Student. 1908. “The Probable Error of a Mean.”
Biometrika, 1–25.
Sunstein, Cass R, and Lucia A Reisch. 2017. The Economics of
Nudge. Routledge.
Suriyakumar, Vinith M., Nicolas Papernot, Anna Goldenberg, and Marzyeh
Ghassemi. 2021. “Chasing Your Long Tails.” In
Proceedings of the 2021 ACM Conference on Fairness,
Accountability, and Transparency. ACM. https://doi.org/10.1145/3442188.3445934.
Taddy, Matt. 2019. Business Data Science. McGraw Hill.
Teece, David J. 2018. “Tesla and the Reshaping of the Auto
Industry.” Management and Organization Review 14 (3):
501–12.
The Economist. 2013. Johnson: Those Six Little Rules: George Orwell
on Writing. https://www.economist.com/prospero/2013/07/29/johnson-those-six-little-rules.
———. 2022. What Spotify Data Show about the Decline of English.
https://www.economist.com/interactives/graphic-detail/2022/01/29/what-spotify-data-show-about-the-decline-of-english.
Thieme, Nick. 2018. “R Generation.” Significance
15 (4): 14–19. https://doi.org/10.1111/j.1740-9713.2018.01169.x.
Thistlethwaite, Donald L, and Donald T Campbell. 1960.
“Regression-Discontinuity Analysis: An Alternative to the Ex Post
Facto Experiment.” Journal of Educational Psychology 51
(6): 309.
Thompson, Charlie, Josiah Parry, Donal Phipps, and Tom Wolff. 2020.
Spotifyr: R Wrapper for the ’Spotify’ Web API. http://github.com/charlie86/spotifyr.
Thornhill, John. 2021. “Lunch with the FT: Mathematician Hannah
Fry.” Financial Times.
Tierney, Nicholas. 2017. “Visdat: Visualising Whole Data
Frames.” Journal of Open Source Software 2 (16): 355. https://doi.org/10.21105/joss.00355.
———. 2022. Quarto for Scientists. https://qmd4sci.njtierney.com.
Tierney, Nicholas J, and Karthik Ram. 2020. “A Realistic Guide to
Making Data Available Alongside Code to Improve Reproducibility.”
https://arxiv.org/abs/2002.11626.
Timbers, Tiffany. 2020. Canlang: Canadian Census Language Data.
https://ttimbers.github.io/canlang/.
Timbers, Tiffany-Anne, Trevor Campbell, and Melissa Lee. 2022. Data
Science: A First Introduction. CRC Press.
Tolley, Erin, and Mireille Paquet. 2021. “Gender, Municipal Party
Politics, and Montreal’s First Woman Mayor.” Canadian Journal
of Urban Research 30 (1): 40–52.
Trisovic, Ana, Matthew K. Lau, Thomas Pasquier, and Mercè Crosas. 2022.
“A Large-Scale Study on Research Code Quality and
Execution.” Scientific Data 9 (1). https://doi.org/10.1038/s41597-022-01143-6.
Tukey, John W. 1962. “The Future of Data Analysis.” The
Annals of Mathematical Statistics 33 (1): 1–67.
UN IGME. 2021. “Levels and Trends in Child Mortality,
2021.” https://childmortality.org/wp-content/uploads/2021/12/UNICEF-2021-Child-Mortality-Report.pdf.
Van den Broeck, Jan, Solveig Argeseanu Cunningham, Roger Eeckels, and
Kobus Herbst. 2005. “Data Cleaning: Detecting, Diagnosing, and
Editing Data Abnormalities.” PLoS Medicine 2 (10): e267.
Vanderplas, Susan, Dianne Cook, and Heike Hofmann. 2020. “Testing
Statistical Charts: What Makes a Good Graph?” Annual Review
of Statistics and Its Application 7: 61–88.
Varin, Cristiano, Nancy Reid, and David Firth. 2011. “An Overview
of Composite Likelihood Methods.” Statistica Sinica,
5–42.
von Bergmann, Jens, Dmitry Shkolnik, and Aaron Jacobs. 2021.
Cancensus: R Package to Access, Retrieve, and Work with Canadian
Census Data and Geography. https://mountainmath.github.io/cancensus/.
Wang, Wei, David Rothschild, Sharad Goel, and Andrew Gelman. 2015.
“Forecasting Elections with Non-Representative Polls.”
International Journal of Forecasting 31 (3): 980–91.
Wang, Yilun, and Michal Kosinski. 2018. “Deep Neural Networks Are
More Accurate Than Humans at Detecting Sexual Orientation from Facial
Images.” Journal of Personality and Social Psychology
114 (2): 246.
Wardrop, Robert L. 1995. “Simpson’s Paradox and the Hot Hand in
Basketball.” The American Statistician 49 (1): 24–28.
Ware, James H. 1989. “Investigating Therapies of Potentially Great
Benefit: ECMO.” Statistical Science 4 (4): 298–306.
Wasserman, Larry. 2005. All of Statistics. Springer.
Wei, LJ, and S Durham. 1978. “The Randomized Play-the-Winner Rule
in Medical Trials.” Journal of the American Statistical
Association 73 (364): 840–43.
Weissgerber, Tracey L, Natasa M Milic, Stacey J Winham, and Vesna D
Garovic. 2015. “Beyond Bar and Line Graphs: Time for a New Data
Presentation Paradigm.” PLoS Biology 13 (4): e1002128.
Whitby, Andrew. 2020. The Sum of the
People. Basic Books.
Whitelaw, James. 1905. An Essay on the Population of Dublin. Being
the Result of an Actual Survey Taken in 1798, with Great Care and
Precision, and Arranged in a Manner Entirely New. Graisberry;
Campbell.
WHO. 2019. “Trends in Maternal Mortality 2000 to 2017: Estimates
by WHO, UNICEF, UNFPA, World Bank Group and the United Nations
Population Division.” https://www.who.int/reproductivehealth/publications/maternal-mortality-2000-2017/en/.
Wickham, Hadley. 2009. “Manipulating Data.” In
Ggplot2, 157–75. Springer New York. https://doi.org/10.1007/978-0-387-98141-3_9.
———. 2010. “A Layered Grammar of Graphics.” Journal of
Computational and Graphical Statistics 19 (1): 3–28.
———. 2011. “Testthat: Get Started with Testing.” The R
Journal 3: 5–10. https://journal.r-project.org/archive/2011-1/RJournal_2011-1_Wickham.pdf.
———. 2014. “Tidy Data.” Journal of Statistical
Software 59 (1): 1–23.
———. 2016. Ggplot2: Elegant Graphics for Data Analysis.
Springer-Verlag New York. https://ggplot2.tidyverse.org.
———. 2017. Tidyverse: Easily Install and Load the ’Tidyverse’.
https://CRAN.R-project.org/package=tidyverse.
———. 2019a. Advanced R. CRC Press.
———. 2019b. Babynames: US Baby Names 1880-2017. https://CRAN.R-project.org/package=babynames.
———. 2019c. Httr: Tools for Working with URLs and HTTP. https://CRAN.R-project.org/package=httr.
———. 2019d. Rvest: Easily Harvest (Scrape) Web Pages. https://CRAN.R-project.org/package=rvest.
———. 2019e. Stringr: Simple, Consistent Wrappers for Common String
Operations. https://CRAN.R-project.org/package=stringr.
———. 2020a. Forcats: Tools for Working with Categorical Variables
(Factors). https://CRAN.R-project.org/package=forcats.
———. 2020b. Tidyverse. https://www.tidyverse.org/.
———. 2021a. Mastering Shiny.
———. 2021b. The Tidyverse Style Guide. https://style.tidyverse.org/index.html.
———. 2021c. Tidyr: Tidy Messy Data. https://CRAN.R-project.org/package=tidyr.
Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy
D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019a.
“Welcome to the Tidyverse.”
Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.
———, et al. 2019b. “Welcome to the Tidyverse.” Journal of Open Source
Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.
Wickham, Hadley, and Jennifer Bryan. 2020. Usethis: Automate Package
and Project Setup. https://CRAN.R-project.org/package=usethis.
Wickham, Hadley, Romain François, Lionel Henry, and Kirill Müller. 2022.
Dplyr: A Grammar of Data Manipulation. https://CRAN.R-project.org/package=dplyr.
Wickham, Hadley, and Garrett Grolemund. 2017. R for Data
Science. https://r4ds.had.co.nz/.
Wickham, Hadley, Jim Hester, and Jennifer Bryan. 2021. Readr: Read
Rectangular Text Data. https://CRAN.R-project.org/package=readr.
Wickham, Hadley, Jim Hester, and Winston Chang. 2020. Devtools:
Tools to Make Developing R Packages Easier. https://CRAN.R-project.org/package=devtools.
Wickham, Hadley, Jim Hester, and Jeroen Ooms. 2021. Xml2: Parse
XML. https://CRAN.R-project.org/package=xml2.
Wickham, Hadley, and Evan Miller. 2020. Haven: Import and Export
’SPSS’, ’Stata’ and ’SAS’ Files. https://CRAN.R-project.org/package=haven.
Wickham, Hadley, and Dana Seidel. 2020. Scales: Scale Functions for
Visualization. https://CRAN.R-project.org/package=scales.
Wiessner, Polly W. 2014. “Embers of Society: Firelight Talk Among
the Ju/’Hoansi Bushmen.” Proceedings of the National Academy
of Sciences 111 (39): 14027–35.
Wilde, Oscar. 1891. The Picture of Dorian Gray.
Wilkinson, Leland. 2005. The Grammar of Graphics. 2nd ed.
Springer.
Wilkinson, Mark D, Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle
Appleton, Myles Axton, Arie Baak, Niklas Blomberg, et al. 2016.
“The FAIR Guiding Principles for Scientific Data Management and
Stewardship.” Scientific Data 3 (1): 1–9.
Wilson, Greg. 2021. Building Software Together. CRC Books.
Wilson, Greg, Jennifer Bryan, Karen Cranston, Justin Kitzes, Lex
Nederbragt, and Tracy K. Teal. 2017. “Good Enough Practices in
Scientific Computing.” PLOS Computational Biology 13
(6): 1–20. https://doi.org/10.1371/journal.pcbi.1005510.
Wong, Julia Carrie. 2020. One Year Inside Trump’s Monumental
Facebook Campaign.
Wright, Philip G. 1928. The Tariff on Animal and Vegetable
Oils. Macmillan Company.
Wu, Changbao, and Mary E Thompson. 2020. Sampling Theory and
Practice. Springer.
Xie, Yihui. 2019. “TinyTeX: A Lightweight, Cross-Platform, and
Easy-to-Maintain LaTeX Distribution Based on TeX Live.”
TUGboat, no. 1: 30–32. https://tug.org/TUGboat/Contents/contents40-1.html.
———. 2021. Knitr: A General-Purpose Package for Dynamic Report
Generation in R. https://yihui.org/knitr/.
Xie, Yihui, Christophe Dervieux, and Alison Presmanes Hill. 2021.
Blogdown: Create Blogs and Websites with R Markdown. https://github.com/rstudio/blogdown.
Xie, Yihui, Amber Thomas, and Alison Presmanes Hill. 2021. Blogdown:
Creating Websites with R Markdown.
Zhu, Hao. 2020. kableExtra: Construct Complex Table with ’Kable’ and
Pipe Syntax. https://CRAN.R-project.org/package=kableExtra.
Zinsser, William. 1976. On Writing Well.
Zook, Matthew, Solon Barocas, Danah Boyd, Kate Crawford, Emily Keller,
Seeta Peña Gangadharan, Alyssa Goodman, et al. 2017. “Ten Simple
Rules for Responsible Big Data Research.” PLoS Computational
Biology. Public Library of Science San Francisco, CA USA.