At about this point, Paper Four (Appendix @ref(paper-four)) would be appropriate.