INVESTIGATIVE REPORT

Investigation Plan: Deepfake Submissions on Federal Public Comment Servers

Max Weiss

Investigation Plan Summary

American Society versus the U.S. Government.
The issue is bot submissions to federal public comment websites.

Federal comment periods are an important way that federal agencies incorporate public input into policy decisions, but, being online, they are vulnerable to attacks at Internet scale. For example, in 2017, more than 21 million (96 percent of the 22 million) public comments submitted regarding the Federal Communications Commission’s proposal to repeal net neutrality were discernible as having been generated using search-and-replace techniques [1]. Worse, publicly available artificial intelligence methods can now generate “deepfake text” that allows computers to closely mimic original human speech. How vulnerable are federal comment processes to automated deepfake submissions that seem indistinguishable from human submissions, and what can be done about them?

Studies to Investigate:

  1. A study might generate and submit topical deepfake comments to a federal public comment website to demonstrate that such submissions would be accepted in volume, and then withdraw them from the comment process.
    • This was completed by Max Weiss as a research study published at Weiss M. Deepfake Bot Submissions to Federal Public Comment Websites Cannot Be Distinguished from Human Submissions. Technology Science. 2019121801. December 18, 2019. https://techscience.org/a/2019121801
  2. A study might generate topical deepfake comments for submission to a federal public comment website but, instead of submitting them, run a test on Amazon Mechanical Turk to see how well humans can distinguish the deepfake comments from other submitted comments.
    • This was completed by Max Weiss as a research study published at Weiss M. Deepfake Bot Submissions to Federal Public Comment Websites Cannot Be Distinguished from Human Submissions. Technology Science. 2019121801. December 18, 2019. https://techscience.org/a/2019121801
  3. (Related) Generate topical deepfake comments for submission to a federal comment server, as in Studies 1 and 2, using a training dataset of prior comments on the same topic. Then, compare the deepfake and submitted comments to the original training data to see whether the deepfake comments can be identified as deepfakes based on knowledge of the training data used.
  4. (Related) Google reCAPTCHA can be added to federal public comment websites in order to help prevent bots from making massive submissions. The latest version uses the history of the computer’s browser to determine whether the submission is likely from a bot. Write some cloud simulation programs to see if a bot that establishes a browsing history can convince Google’s reCAPTCHA that it is a human.
  5. (Related) The idea of outside verification for a comment submitter involves the federal public comments website sending a private code to an email address or phone number and requiring the sent code be entered with the submission. Survey the availability of email addresses and Internet phone numbers that could be used by a bot to automate submissions that required outside verification.
  6. (Related) The idea of outside verification (see Study 5 above) would require members of the public who submit comments to provide personally identifying information, namely, an email address or a phone number. Perform a review of the legal requirements for federal public comment websites and for the privacy of personal information collected by federal agencies to see whether federal public comment servers can require personally identifying information, and if so, what additional requirements exist for federal agencies to collect personally identifying information at websites.
  7. (Related) Many websites allow visitors to authenticate themselves to the website using their credentials at other established websites such as Google or Facebook. If the government were to do something similar for federal public comment websites, then the website could know that authenticated (known) people submitted those comments. Build a website that uses authentication from Google and Facebook and analyze whether the authentication can be provided without the host of the website actually learning or knowing the person’s Facebook or Google identity.
  8. (Related) A variant of Study 7 above in which a website is built that allows people to register at the website, and after confirming their identity, use the website to authenticate themselves to other websites. This study would involve constructing the two websites (authenticator and a website that uses the authentication) and then performing a security analysis.
  9. (Related) A variant of Study 8 above that conducts a legal analysis of barriers to and requirements for the federal government to host its own authentication server.
  10. (Related) A variant of Study 8 above that examines the pros and cons of the federal government providing its own authentication server and includes a survey of the identity problems that could be solved by the federal government having its own authentication server.

Introduction

From April to October of 2017, the Federal Communications Commission (FCC) received a record-breaking 22 million submissions on its public comment website, offering public input on the regulatory proposal to repeal net neutrality protections under Title II of the Communications Act [1][2]. Net neutrality refers to Internet service providers giving access to all Internet content without favoring or blocking particular websites. Federal law requires comments from potentially affected individuals, businesses, and organizations to be taken into account by federal agencies in decision-making (though the degree to which any agency has to consider public comments is not clear) [2]. The 22 million submitted comments about net neutrality included about 5 million comments that supported net neutrality and about 17 million that supported its repeal, and, in a December 2017 decision, the FCC voted 3-2 to repeal net neutrality [3].

On its face, the FCC comment period represented an effective exercise in democratic accountability: a federal agency asked for public input, the agency received a large number of public comments, and the agency seemingly took the public comments into account to make its final decision.

However, soon after the end of the public comment period, researchers alleged that hundreds of thousands of the comments were submitted under fake names, some stolen and others completely fabricated [2]. Comments came from email addresses, street addresses, and postal codes stolen from unwitting victims, constituting countless instances of identity theft [4].

Subsequent text analysis found that only 800,000 (less than 4%) of the comments submitted were likely to be truly unique and authentic [1][5]. Computer programs that automated Internet tasks (bots) using simple search-and-replace techniques generated and submitted the overwhelming majority of the 22 million comments, and those comments were lopsided, most belonging to coordinated campaigns supporting net neutrality repeal; twenty comment duplication campaigns alone accounted for 17 million of the 22 million comments [1].

Most of the fake comments submitted were easily detectable based on content. Figure 1 shows five bot-generated comments that include five parallel sentences. These comments were created through synonym replacement, a relatively unsophisticated method of text generation using search-and-replace.

Figure 1 shows five examples of sentences built from eight sentence components, each with three near-term options; the full method built comments that were many sentences long. The interchangeable near-terms can be found in the upper panel of Figure 1. For example, each sentence from this model begins with “I strongly”, “I want to”, or “I’d like to”. The near-term options used to build combinations for one sentence in Figure 1 were taken directly from just one FCC commenting campaign (comprising 1.3 million comments) discovered and dissected by Jeff Kao [1]. Given only the near-term options in Figure 1, 3^8 = 6,561 variations of this same sentence could be created.
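
To make the combinatorics concrete, here is a minimal Python sketch of slot-based synonym replacement. Only the first slot’s options come from Figure 1; the remaining option lists are illustrative stand-ins, not the actual near-terms from the campaign Kao dissected.

    # Illustrative sketch of synonym-replacement text generation.
    # Eight slots with three options each yield 3**8 = 6,561 sentences.
    import itertools

    slots = [
        ["I strongly", "I want to", "I'd like to"],       # from Figure 1
        ["urge", "encourage", "recommend"],                # hypothetical
        ["the FCC", "the commission", "Chairman Pai"],     # hypothetical
        ["to reverse", "to undo", "to repeal"],            # hypothetical
        ["the Obama-era", "the 2015", "the previous"],     # hypothetical
        ["decision", "policy", "order"],                   # hypothetical
        ["to regulate", "to control", "to take over"],     # hypothetical
        ["the Internet.", "broadband.", "Internet access."],  # hypothetical
    ]

    variants = [" ".join(choice) for choice in itertools.product(*slots)]
    print(len(variants))  # 6561
    print(variants[0])    # "I strongly urge the FCC to reverse ..."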

Synonym replacement generated the largest clusters of bot-submitted comments during the FCC public comment period on repeal of net neutrality regulations [1]. Because of this telltale structure, the comments generated by a bot could be identified retroactively.

Figure 1. Example Synonym Replacement Used to Build Sentences in Large FCC Public Comment Campaign. The figure shows five examples of sentences (bottom panel) built from eight sentence components, each with three near-term options (top panel). The near-term options used to build combinations for one sentence were taken directly from just one FCC commenting campaign (comprising 1.3 million comments) discovered and dissected by Jeff Kao [1]. Given only the near-term options shown, 3^8 = 6,561 variations of this same sentence could be created.

In reality, it seems that more than 99% of the likely unique, authentic comments supported net neutrality and protested its repeal [1], and if so, public sentiment was diluted and reversed by the multitude of fake comments submitted.

Afterwards, researchers were able to use text analysis to plausibly discern original comments from fake comments, but what if text analysis were unable to distinguish them? What if fake comments could mimic original human speech so closely that millions of these deepfake comments could not be discerned?

For several years now, artificial intelligence (AI) has enabled bots to generate speech convincingly enough to deceive humans into believing that another human, rather than a computer, actually wrote the text [6]. These methods have continued to improve, and more powerful models are publicly available for personal use (e.g., [7]). The expansion of highly convincing natural-language generation, or “deepfake text,” makes it nearly impossible to distinguish whether online speech originated from a human or a computer.

Many public comment websites, such as regulations.gov, simply provide a text box for the comment, an option to upload a file attachment, and a submit button (Figure 2). This simplicity makes it easy for members of the public to provide input, but does it leave the public comment process susceptible to automated attack and influence? Can deepfake comments be submitted at scale and accepted as human comments?

Figure 2. Typical public comment submission form on regulations.gov provides a text box for the comment, an ability to attach an optional file, and a submit (or continue) button.

Background

The Public Comment Process

Most federal agencies have rule-making authority to establish regulations that determine how legislation is executed. This authority comes from the 1946 Administrative Procedure Act (APA). After notice for a proposed rule is published by a federal agency, under the APA, “the agency shall give interested persons an opportunity to participate in the rule making through submission of written data, views, or arguments” [8]. Following a public comment period of no less than thirty days, the agency must consider each relevant comment. The agency is not required to take specific regulatory action because of any one comment; however, along with the final rule, the agency must publish analysis of relevant materials and justification of decisions made in light of comments received [9].

As explained by Attorney General Frank Murphy in a 1941 report that laid a foundation for the APA, “knowledge is rarely complete, and it [an agency] must always learn the frequently clashing viewpoints of those whom its regulations will affect…Participation by these groups [of people affected by regulations] in the rule-making process is essential in order to permit administrative agencies to inform themselves and to afford adequate safeguards to private interests” [10].

The E-Government Act of 2002 now requires that public comment periods be accessible for submission online [11], and Executive Order 13563, issued under the Obama Administration, directs each agency to provide a meaningful opportunity to comment on proposed regulation through the Internet, with a comment period generally at least sixty days in length [12].

Today, notices of proposed rulemaking appear on FederalRegister.gov or other public-facing platforms and direct commenters to appropriate websites for online public comment submission. Several agencies maintain their own platforms, but the majority of the 221 federal agencies and agency subdivisions solicit online public comment through Regulations.gov [10].

The online comment submission process on regulations.gov is relatively simple and user-friendly. Figure 2 shows the standard form for public comment submission. As described earlier, commenters must fill in a comment box and have the option to attach files for longer comments. Agencies may ask for personally identifying information (not shown), may offer the option to submit anonymously, or both. After clicking “Continue,” commenters see a preview of their comment and must check a box affirming they have read and understand this statement: “You are filing a document into an official docket. Any personal information included in your comment and/or uploaded attachment(s) may be publicly viewable on the web.” After clicking a “Submit Comment” button, the comment becomes an official part of the public record.

On October 24, 2019, the Permanent Subcommittee on Investigations in the U.S. Senate released a comprehensive report detailing its findings from an investigation of over a dozen federal agencies into the problem of federal public comment abuse [10].

Though there is no way to know the role bots currently play in online public comment periods across federal agencies, there have been clear instances of bot interference in the past and other forms of abuse that constitute identity theft, reduce public comment efficacy, waste agency time and resources, and disrupt rulemaking [10]. As an example, Elvis Presley commented on proposed FCC regulations ten times posthumously (Figure 3).

Figure 3. Five of Ten Comments Submitted During FCC Public Comment Period Under the Name “Elvis Presley” [10].

The APA and E-Government Act give agencies some leeway. A federal agency can disregard clearly abusive comments, such as comments that were clearly submitted under a fake name or that are clearly inappropriate, irrelevant, nonsensical, or duplicative. But how could a federal agency disregard bot comments that are believably human and relevant to the proposed rule?

Bots, Turing Tests and Deepfake Text

“Bot” is a colloquial term for any software application that automates tasks on the Internet [13]. Bots are now relatively cheap and easy to build and run. In 2018, bots comprised 37.9% of all Internet traffic, and over half of this activity originated from bots conducting improper or malicious tasks [14]. With the continued improvement of accessible, sophisticated AI methods, bots are increasingly able to simulate human activity online. Operators apply bots to automate an expansive range of tasks, ranging from helpful to malicious: purchasing concert tickets, scraping content and aggregating data, committing e-commerce fraud, posting on social media, and innumerable others.

In 1950, considering whether computers could behave like humans, Alan Turing introduced the notion of a “Turing Test” [15]. In its most general form, a computer program passes the Turing Test if it can convincingly perform like a human in conversation. Early instantiations of the test limited the scope and means of conversation, and some computer programs passed. In 2014, a computer chat program by the name of “Eugene Goostman” convinced 33% of judges that it was a human 13-year-old boy in a more generalized version of the test [16]. Since then, computer chat programs have continued to improve, and commercial products, including Google Duplex, have emerged that mimic human speech over the phone [17].

The Turing Test has been adapted to work in reverse for online bots. Because bots comprise so much Internet traffic, websites that only want human participation challenge visitors to a kind of reverse Turing Test: the website asks a visitor to complete a task or answer a question that a human can easily do but a bot cannot. In 2003, Luis von Ahn and his colleagues termed this idea a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) and launched image CAPTCHAs specifically for websites because computers performed poorly at image understanding [18]. Figure 4 shows a sample of CAPTCHAs used on state government websites. Google purchased the rights to the now-popular image CAPTCHA system in 2009 [19] and has further developed CAPTCHA technology. Today, Google offers CAPTCHA services at no charge, and the latest version uses web browsing history as part of the test to decide whether an image challenge will even commence.

Figure 4. Examples of CAPTCHAs found on state government websites (a) Connecticut [20], (b) Delaware [21], (c) Ohio [22], and (d) Nebraska [23].

“Deepfake text” is a term I introduced earlier in this writing for an advanced form of text generation. Perhaps the earliest form of text generation was mail merge. Given a template of a letter and a list of addresses, a computer program generated individualized letters by replacing addresses in the template. The more general approach of search-and-replace is also easy to do, generating new text by replacing occurrences of strings in structured text with equivalent substitutes. Systems based on this approach have written weather reports, jokes, and more.

In February 2019, an AI research lab named OpenAI introduced a text generation system that writes convincing fake reviews, fake news articles, and even poetry [24]. The approach is simple: the researchers trained a neural network on phrase and writing-structure associations using over eight million documents, totaling 40GB of content found on the Internet [25][26]. Now, given a few prompt words, a snippet of text, a passage from some writing, or something similar, the system will predict (or generate) the words that follow, at length, into a news article, short story, or poem (see a demonstration [27]). Researchers can use a publicly available version to further train the model on provided text so that the system generates new versions of similar text [26].

Methods

1. Federal public comment website

An example of a public comment website is one for the Idaho Medicaid Reform Waiver. It accepted comments from October 3 to November 2, 2019 for the Centers for Medicare and Medicaid Services (CMS) at Medicaid.gov (https://public.medicaid.gov/connect.ti/public.comments/answerQuestionnaire?qid=1902435) [28].

Medicaid is a public health insurance program that provides health care coverage for low-income individuals and families as well as individuals with disabilities. It covers more than seventy million Americans yearly [29]. Medicaid is structured as a jointly financed partnership between states and CMS [29].

If a given state would like to transform the structure of its Medicaid program in a way that departs from federal guidelines, it must submit a waiver application to CMS requesting approval for a state-led demonstration, experiment, or pilot to test the given innovation. The procedures and parameters for state-led Medicaid demonstration waivers are outlined in Section 1115 of the Social Security Act [29]. The CMS decision on a Section 1115 waiver requires separate state and federal public comment periods, and the state must include in its final waiver application a report on issues raised by the public and how public comments were taken into account [30].

If a Section 1115 demonstration would not advance the objectives of Medicaid, then CMS is required to reject the application. As such, if public comments offer relevant evidence that a Section 1115 demonstration would not further the objectives of Medicaid, CMS is required to reject the application. In fact, failure to consider comments exposing subversion of Medicaid objectives has already led courts to block CMS decisions in three states [31][32][33].

The State of Idaho submitted a waiver to CMS to enact provisions similar to those struck down in Kentucky, Arkansas, and New Hampshire [28]. The public comment submission form notes the following disclaimer: “We reserve the discretion to delete comments that are generally understood as any of the following: obscene, profane, threatening, contain personal identifiable information, or otherwise inappropriate” [34].

2. OpenAI’s GPT-2 natural language processing framework

For the computational architecture, the smallest GPT-2 model, with 124 million parameters, is available. The exact code used for retrieving and finetuning the model was published publicly and freely by Max Woolf in Colab, a Jupyter Notebook environment hosted by Google [7].
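
For illustration, the following minimal sketch shows the same pipeline using Woolf’s gpt-2-simple Python library, which backs the cited Colab notebook; the dataset filename and step count are assumptions, not values from any completed study.

    # Sketch: finetune the 124M-parameter GPT-2 model on prior comments
    # and sample new ones, using Max Woolf's gpt-2-simple library [7].
    import gpt_2_simple as gpt2

    gpt2.download_gpt2(model_name="124M")  # smallest released GPT-2 model

    sess = gpt2.start_tf_sess()
    gpt2.finetune(sess,
                  dataset="medicaid_comments.txt",  # hypothetical corpus file
                  model_name="124M",
                  steps=1000)                       # assumed training length

    # Sample candidate deepfake comments from the finetuned model.
    samples = gpt2.generate(sess, nsamples=5, length=200,
                            temperature=0.7, return_as_list=True)
    for text in samples:
        print(text, "\n---")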

3. Prior submitted comments

In order for GPT-2 to produce text relevant to a Medicaid reform waiver, it needs prior samples to use as training data for finetuning the model. Public comments that had been submitted in response to prior Section 1115 Medicaid reform waivers can be used. CMS publishes each public comment it receives on Medicaid.gov and allows download of all submitted comments by waiver. Directly from Medicaid.gov, every comment can be downloaded from twenty-one public comment periods across waivers in seventeen states (Arkansas, Alabama, Arizona, Indiana, Kentucky, Michigan, Mississippi, Montana, New Hampshire, Ohio, Oklahoma, South Carolina, South Dakota, Tennessee, Utah, Virginia, and Wisconsin).
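
Assuming the downloaded comments are saved locally as one plain-text file per comment (the actual download format may differ and would first need conversion), assembling a finetuning corpus for the sketch above takes only a few lines:

    # Hypothetical layout: downloaded_comments/*.txt, one comment per file.
    from pathlib import Path

    with open("medicaid_comments.txt", "w", encoding="utf-8") as out:
        for path in sorted(Path("downloaded_comments").glob("*.txt")):
            text = path.read_text(encoding="utf-8").strip()
            # GPT-2's end-of-text token marks comment boundaries in training.
            out.write(text + "\n<|endoftext|>\n")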

Studies and Predicted Events

Desired Outcome

The envisioned result is for the managers of federal public comment websites to construct the websites in such a manner that bot submissions either do not occur or are identified once submitted. As a design statement, the goal is for these federal managers to:

Construct a federal public comments website
such that bot submissions are thwarted.

Two primary studies can inform decision-makers about the current situation with federal public comment websites, and a multitude of related studies can help navigate toward possible solutions.

Study 1. Deepfake Submissions

A study might generate and submit topical deepfake comments to a federal public comment website to demonstrate that such submissions would be accepted in volume, and then withdraw them from the comment process.

Study 2. Deepfake Comments Turing Test

A study might generate topical deepfake comments for submission to a federal public comment website but, instead of submitting them, run a test on Amazon Mechanical Turk to see how well humans can distinguish the deepfake comments from other submitted comments.
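
One natural way to analyze such a test, sketched below with hypothetical numbers, is to treat each worker judgment as a Bernoulli trial and ask whether classification accuracy differs from the 50% expected by pure guessing:

    # Hypothetical analysis of Study 2 responses (two-sided binomial test).
    from scipy.stats import binomtest

    n_judgments = 1000  # assumed: total human-vs-deepfake classifications
    n_correct = 512     # assumed: judgments that labeled a comment correctly

    result = binomtest(n_correct, n_judgments, p=0.5, alternative="two-sided")
    print(f"Observed accuracy: {n_correct / n_judgments:.1%}")
    print(f"p-value against 50% chance: {result.pvalue:.3f}")
    # A large p-value means workers do no better than coin flipping, i.e.,
    # the deepfake comments pass this human Turing test.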

Study 3. (Related) Detecting Deepfake Comments Using Its Training Data

Generate topical deepfake comments for submission to a federal comment server, as in Studies 1 and 2, using a training dataset of prior comments on the same topic. Then, compare the deepfake and submitted comments to the original training data to see whether the deepfake comments can be identified as deepfakes based on knowledge of the training data used.
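
One way to operationalize the comparison, assuming both comment sets are available as lists of strings, is to compute each deepfake comment’s maximum TF-IDF cosine similarity to the training set; generated text that leans too heavily on its training data may score suspiciously high. A minimal sketch:

    # Hypothetical similarity analysis for Study 3 (scikit-learn).
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    # Assumed inputs; in practice these would be loaded from files.
    training_comments = [
        "Work requirements will cause eligible people to lose coverage.",
        "This waiver undermines the purpose of Medicaid.",
    ]
    deepfake_comments = [
        "This waiver will cause eligible people to lose their coverage.",
        "I support expanding access to care for low-income families.",
    ]

    vectorizer = TfidfVectorizer(ngram_range=(1, 3))
    train_matrix = vectorizer.fit_transform(training_comments)
    fake_matrix = vectorizer.transform(deepfake_comments)

    # For each deepfake comment, its highest similarity to any training comment.
    max_sims = cosine_similarity(fake_matrix, train_matrix).max(axis=1)
    for text, sim in zip(deepfake_comments, max_sims):
        print(f"{sim:.2f}  {text[:60]}")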

Study 4. (Related) Bots Challenge reCAPTCHA

Google reCAPTCHA can be added to federal public comment websites to help prevent bots from making massive submissions. The latest version uses the history of the computer’s browser to determine whether the submission is likely from a bot. Reportedly, bots can fool reCAPTCHA, and browsing history is one piece of this [35]. If so, how can this be done with few resources? Write some cloud simulation programs to see if a bot that establishes a browsing history can convince Google’s reCAPTCHA that it is a human, running each simulated bot from its own IP address on a host that has multiple IP addresses.
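
For reference, the defensive half of this setup is simple to implement; the sketch below verifies a submitted token against Google’s documented siteverify endpoint (the secret key and the score threshold are placeholders). The bot-side simulation is the open research question and is not shown.

    # Server-side reCAPTCHA verification against Google's siteverify endpoint.
    import requests

    RECAPTCHA_SECRET = "your-secret-key"  # placeholder

    def verify_recaptcha(token, remote_ip=None):
        resp = requests.post(
            "https://www.google.com/recaptcha/api/siteverify",
            data={"secret": RECAPTCHA_SECRET,
                  "response": token,
                  "remoteip": remote_ip},
            timeout=10,
        )
        result = resp.json()
        # reCAPTCHA v3 also returns a 0.0-1.0 "score"; v2 returns only "success".
        return result.get("success", False) and result.get("score", 1.0) >= 0.5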

Study 5. (Related) Outside Verification Vulnerability to Bots

The idea of outside verification for a comment submitter involves the federal public comments website sending a private code to an email address or phone number and requiring the sent code be entered with the submission. Survey the availability of email addresses and Internet phone numbers that could be used by a bot to automate submissions that required outside verification.

The study may also need to show that the email account or phone number creation and use steps can be automated. For example, Gmail accounts require two-step verification (and likely present a reCAPTCHA at account creation), so a bot could not easily create one million accounts in order to submit 1,000,000 different comments.

Study 6. (Related) Legal Requirements for Government Collection of Personal Information

The idea of outside verification (see Study 5 above) would require members of the public who submit comments to provide personally identifying information, namely, an email address or a phone number. Perform a review of the legal requirements for federal public comment websites and for the privacy of personal information collected by federal agencies to see whether federal public comment servers can require personally identifying information, and if so, what additional requirements exist for federal agencies to collect personally identifying information at websites.

Study 7. (Related) Google and Facebook Authentication Demonstration

Many websites allow visitors to authenticate themselves to the website using their credentials at other established websites such as Google or Facebook. If the government were to do something similar for federal public comment websites, then the website could know that authenticated (known) people submitted those comments. Build a website that uses authentication from Google and Facebook and analyze whether the authentication can be provided without the host of the website actually learning or knowing the person’s Facebook or Google identity.

This study might also incorporate a modified Study 5 to be convincing: if someone can simply create many dummy Facebook and Google accounts, then bots could pass this kind of authentication.
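
For a sense of scale, the relying-site half of this study is small. Below is a minimal “Sign in with Google” sketch using the Flask-Dance library, with placeholder credentials; note that the site receives the user’s email address, which is exactly the privacy question this study asks.

    # Minimal "Sign in with Google" relying site using Flask-Dance.
    from flask import Flask, redirect, url_for
    from flask_dance.contrib.google import make_google_blueprint, google

    app = Flask(__name__)
    app.secret_key = "replace-with-a-random-secret"  # placeholder

    blueprint = make_google_blueprint(
        client_id="YOUR-CLIENT-ID",          # placeholder OAuth credentials
        client_secret="YOUR-CLIENT-SECRET",
        scope=["openid",
               "https://www.googleapis.com/auth/userinfo.email"],
    )
    app.register_blueprint(blueprint, url_prefix="/login")

    @app.route("/")
    def index():
        if not google.authorized:
            return redirect(url_for("google.login"))
        info = google.get("/oauth2/v2/userinfo").json()
        # The site now knows a real Google account authenticated, but it
        # also learns the email address: the privacy question of Study 7.
        return "Authenticated as " + info.get("email", "unknown")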

Study 8. (Related) Federal Authentication Website Demonstration

A variant of Study 7 above in which a website is built that allows people to register at the website and, after confirming their identity, use it to authenticate themselves to other websites. This study would involve constructing the two websites (an authenticator and a website that uses the authentication) and then performing a security analysis.
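
To show that issuing and checking tokens is not the hard part of such an authenticator, here is a minimal sketch using PyJWT; the identity proofing that happens at registration, which is the hard part, is not shown, and all names and keys are hypothetical.

    # Token issuance and verification for a hypothetical authenticator (PyJWT).
    import datetime
    import jwt

    SIGNING_KEY = "replace-with-a-real-secret"  # placeholder shared secret
    ISSUER = "federal-authenticator.example"    # hypothetical issuer name

    def issue_token(user_id):
        """The authenticator signs a short-lived assertion for a verified user."""
        now = datetime.datetime.now(datetime.timezone.utc)
        payload = {"sub": user_id, "iss": ISSUER,
                   "exp": now + datetime.timedelta(minutes=5)}
        return jwt.encode(payload, SIGNING_KEY, algorithm="HS256")

    def verify_token(token):
        """A relying site (e.g., a comment server) checks the assertion."""
        return jwt.decode(token, SIGNING_KEY, algorithms=["HS256"], issuer=ISSUER)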

A federal authentication website could have many other uses in addition to public comment. Perhaps people would need to authenticate before completing the census? The government interacts with members of the public (and civil servants) online in many different ways, and this could become a sort of “soft” national identifier for the Internet. De facto, Google and Facebook currently play this role for many private websites (as noted in Study 7). Why not have the government run one for public life as well?

Study 9. (Related) Legal Review for a Federal Government Authentication Server

A variant of Study 8 above that conducts a legal analysis of barriers to and requirements for the federal government to host its own authentication server.

Study 10. (Related) Analysis of Identity Problems with Using a Federal Authentication Server

A variant of Study 8 above that examines the pros and cons of the federal government providing its own authentication server and includes a survey of the identity problems that could be solved by the federal government having its own authentication server.

Predicted Events

Suppose studies (Study 1 and Study 2) were done that showed that deepfake comments could be submitted to a federal public comment server in volume and, once submitted, could not be distinguished from human submissions. Such studies would raise the question of how to design a federal public comment website that can thwart bot submissions while remaining practically useful and legally permissible (Study 6, Study 8, Study 9, and Study 10).

The decision-makers most likely to respond to such a study are the managers in federal agencies who are responsible for federal public comment websites, along with the journalists, advocacy groups, and members of the U.S. Senate who are already engaged on this topic. (See Appendix B.)

Journalists primarily want to write attention-grabbing stories and are already writing about this topic, so the studies would likely garner media attention. Federal agency managers want journalists to write stories about their good work, so agencies will be quick to host hearings and hold workshops on ways to improve public comment servers once the media attention is unavoidable. Similarly, U.S. Senators want journalists to write stories that help with re-election, so it is not surprising that media did write stories about the recently released Senate report on federal public comment servers.

The media attention spawned from the studies would further motivate members of the U.S. Senate and the federal managers to continue actions underway but would expand coverage to include deepfake comments. (See Appendix B.) Further, if some of the related studies are done, their results could help shape the nature of solutions.

Eventually, the managers will propose revisions to federal public comment servers, even if none of the proposed studies are done. Sufficient political will and attention already exist to motivate action, unless managers cannot decide what to do. For these reasons, the proposed and related studies are timely in helping understand the nature of the problem and the space of possible solutions.

Discussion

In summary, the proposed and related studies could further the opportunity for change that currently exists. Study 1 and Study 2 could help broaden the understanding that deepfake comments are possible and, once submitted, cannot be distinguished from human submissions. The proposed related studies help shape and improve understanding of possible solutions.

Of course, a scientific study (Study 1 and Study 2) may reveal the opposite: that after the FCC incident, federal public comment websites now thwart bots from making volumes of submissions, or that bot submissions, even deepfake ones, can be detected once submitted. In that case, media attention would still likely appear, promoting how federal managers solved the problem so quickly and so effectively that even deepfake submissions are not allowed. However, in that case, the proposed related studies would be of limited value, mostly academic in nature, if sufficient solutions already exist.

References

  1. Kao J. More than a Million Pro-Repeal Net Neutrality Comments were Likely Faked. Hacker Noon. November 22, 2017. https://hackernoon.com/more-than-a-million-pro-repeal-net-neutrality-comments-were-likely-faked-e9f0e3ed36a6
  2. Shneiderman ET. An Open Letter to the FCC. Medium. November 21, 2017. https://medium.com/@NewYorkStateAG/an-open-letter-to-the-fcc-b867a763850a
  3. Fung B. The FCC just voted to repeal its net neutrality rules, in a sweeping act of deregulation. Washington Post. December 14, 2017. https://www.washingtonpost.com/news/the-switch/wp/2017/12/14/the-fcc-is-expected-to-repeal-its-net-neutrality-rules-today-in-a-sweeping-act-of-deregulation/
  4. Lecher C, Robertson A, Brandom R. Anti-net neutrality spammers are impersonating real people to flood FCC comments. The Verge. May 10, 2017. https://www.theverge.com/2017/5/10/15610744/anti-net-neutrality-fake-comments-identities
  5. Singel R. Filtering Out the Bots: What Americans Actually Told the FCC about Net Neutrality Repeal. The Center for Internet and Society at Stanford Law School. October 2018. https://cyberlaw.stanford.edu/files/blogs/FilteringOutTheBotsUnique2017NetNeutralityComments1024Update.pdf
  6. Warwick K, Shah H. Can machines think? A report on Turing test experiments at the Royal Society. Journal of Experimental & Theoretical Artificial Intelligence. June 29, 2015. https://www.tandfonline.com/doi/pdf/10.1080/0952813X.2015.1055826?needAccess=true
  7. Woolf M. Train a GPT-2 Text-Generating Model w/ GPU For Free. August 28, 2019. https://colab.research.google.com/drive/1VLG8e7YSEwypxU-noRNhsv5dW4NfTGce#scrollTo=0-LRex8lfv1g&forceEdit=true&sandboxMode=true
  8. Administrative Procedure Act. National Archives. June 11, 1946. https://www.archives.gov/federal-register/laws/administrative-procedure
  9. Notice and Comment. Justia. April 2018. https://www.justia.com/administrative-law/rulemaking-writing-agency-regulations/notice-and-comment/
  10. Abuses of the Federal Notice-and-Comment Rulemaking Process. United States Senate Permanent Subcommittee on Investigations. October 24, 2019. https://www.portman.senate.gov/sites/default/files/2019-10/2019.10.24%20PSI%20Report%20-%20Abuses%20of%20the%20Federal%20Notice-and-Comment%20Rulemaking%20Process.pdf
  11. H.R.2458 - E-Government Act of 2002. Congressional Record. November 15, 2002. https://www.congress.gov/bill/107th-congress/house-bill/02458
  12. President Obama B. Executive Order 13563 -- Improving Regulation and Regulatory Review. White House Archives. January 18, 2011. https://obamawhitehouse.archives.gov/the-press-office/2011/01/18/executive-order-13563-improving-regulation-and-regulatory-review
  13. Dunham K, Melnick J. Malicious Bots: An Inside Look into the Cyber-Criminal Underground of the Internet. August 6, 2008.
  14. 2019 Bad Bot Report. Distil Networks. 2019. https://resources.distilnetworks.com/white-paper-reports/bad-bot-report-2019
  15. Turing AM. Computing Machinery and Intelligence. Mind. October 1, 1950. https://academic.oup.com/mind/article/LIX/236/433/986238
  16. Turing Test Success Marks Milestone in Computing History. University of Reading. June 8, 2014. http://www.reading.ac.uk/news-archive/press-releases/pr583836.html
  17. Chen BX, Metz C. Google’s Duplex Uses A.I. to Mimic Humans (Sometimes). The New York Times. May 22, 2019. https://www.nytimes.com/2019/05/22/technology/personaltech/ai-google-duplex.html
  18. Von Ahn L, Blum M, Hopper N and Langford J. CAPTCHA: Using Hard AI Problems for Security. International Conference on the Theory and Applications of Cryptographic Techniques. Eurocrypt 2003. http://www.captcha.net/captcha_crypt.pdf
  19. Google acquires reCAPTCHA. Google Official Blog. September 16, 2009. https://googleblog.blogspot.com/2009/09/teaching-computers-to-read-google.html
  20. Connecticut. State website for changing voter information online. Accessed 2016. https://voterregistration.ct.gov/OLVR/registerDetails.do (See also https://techscience.org/a/2017090601/)
  21. Delaware. State website for changing voter information online. Accessed 2016. https://ivote.de.gov/voterlogin.aspx (See also https://techscience.org/a/2017090601/)
  22. Ohio. State website for changing voter information online. Accessed 2016. https://olvr.sos.state.oh.us/ovru/Modify.aspx (See also https://techscience.org/a/2017090601/)
  23. Nebraska. State website for changing voter information online. Accessed 2016. https://www.nebraska.gov/apps-sos-voter-registration/ (See also https://techscience.org/a/2017090601/)
  24. Geitgey A. Faking the News with Natural Language Processing and GPT-2. Medium. September 27, 2019. https://medium.com/@ageitgey/deepfaking-the-news-with-nlp-and-transformer-models-5e057ebd697d
  25. Radford A, Wu J, Child R, Luan D, Amodei D, Sutskever I. Language Models are Unsupervised Multitask Learners. OpenAI. February 2019. https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf
  26. Solaiman I, Clark J, Brundage M. GPT-2: 1.5B Release. OpenAI. November 5, 2019. https://openai.com/blog/gpt-2-1-5b-release/
  27. King A. Talk to Transformer. (Demonstration of OpenAI’s model GPT-2) https://talktotransformer.com/
  28. Idaho Medicaid Reform Waiver. Idaho Department of Health and Welfare. September 27, 2019. https://www.medicaid.gov/Medicaid-CHIP-Program-Information/By-Topics/Waivers/1115/downloads/id/id-medicaid-reform-pa.pdf
  29. About Section 1115 Demonstrations. Centers for Medicare and Medicaid Services. Accessed October 31, 2019. https://www.medicaid.gov/medicaid/section-1115-demo/about-1115/index.html
  30. 1115 Transparency Requirements. Centers for Medicare and Medicaid Services. Accessed October 31, 2019. https://www.medicaid.gov/medicaid/section-1115-demo/transparency/index.html
  31. Judge Boasberg JE. Ronnie Maurice Stewart, et al. v. Alex M. Azar II, et al. June 29, 2018. United States District Court for the District of Columbia. https://ecf.dcd.uscourts.gov/cgi-bin/show_public_doc?2018cv0152-74
  32. Judge Boasberg JE. Charles Gresham, et al. v. Alex M. Azar II, et al. March 27, 2018. United States District Court for the District of Columbia. https://ecf.dcd.uscourts.gov/cgi-bin/show_public_doc?2018cv1900-58
  33. Judge Boasberg JE. Samuel Philbrick, et al. v. Alex M. Azar II, et al. July 29, 2019. United States District Court for the District of Columbia. https://ecf.dcd.uscourts.gov/cgi-bin/show_public_doc?2019cv0773-47
  34. Summary of Responses: Idaho Medicaid Reform Waiver. Centers for Medicare & Medicaid Services. October 3, 2019. https://public.medicaid.gov/connect.ti/public.comments/questionnaireResults?qid=1902435
  35. Sivakorn S, Polakis J and Keromytis A. I’m not a human: Breaking the Google reCAPTCHA. Black Hat ASIA. 2016. https://www.blackhat.com/docs/asia-16/materials/asia-16-Sivakorn-Im-Not-a-Human-Breaking-the-Google-reCAPTCHA-wp.pdf
  36. Regulations.gov Beta Website. November 2019. https://beta.regulations.gov/faq?type=beta

Appendix

Appendix A

Key Conflict in this Study

American Society versus the U.S. Government is the key conflict in this study.

*The public comment process also cannot overburden the government; i.e., agencies are not to be overburdened (in time, resources, or money) by the public comment process “to the extent feasible” (Executive Order 13563).

Appendix B

Decision-Makers Projected Response to the Proposed Study

A. Journalists versus U.S. Senate

*A new-and-improved Regulations.gov is also just being launched. From what I can tell, cybersecurity has not been a consideration in the reconstruction of the website; the changes appear aimed more at looking prettier and being more usable. The redesign may be a useless indirect countermeasure meant to look like they are “improving tech” [36].

B. Journalists versus Managers

Appendix C

Project Response Timeline

Projected responses are below. Time flows downwards. Events up to the proposed study have already occurred.

* Managers refers to the government officials responsible for federal public comment websites.

Appendix D

Helper Model Opportunity for Change

(a) If the predicted events all occur, change is foretold, but it is not guaranteed that the change will address deepfakes or be among the most optimal ways to maintain usability.

(b) The opportunity for change and engagement with decision-makers already existed before the proposed studies are conducted. Even without the proposed studies, the opportunity and likelihood for change exist. So the proposed studies can help define and shape the scope and nature of change.

(c) If change does not happen, it would be because the managers do not know how to make effective change. So the proposed studies address the scope and nature of the change needed.

 

Citation

Weiss, Max. Investigation Plan: Deepfake Submissions on Federal Public Comment Servers. Public Interest Investigations. 2020013001. January 29, 2020. Version . https://techscience.org/researchnetwork/investigations/2020013001

 
