Tech Science to Save the World Poster Fair

Tuesday, 5/5/2020. 10 AM – 12 PM

Zoom

Join us on Tuesday 5/5 to see presentations from students in Harvard's Tech Science to Save the World that answer:

  • According to cellphone GPS data, who had to wait in line to vote in person on April 7 in the Wisconsin Primary?
  • How can states decide to implement vote-by-mail policies to respond to COVID-19 given their existing laws?
  • Is Apple and Google's Bluetooth-based exposure notification system for COVID-19 compliant with the EU's GDPR?
  • What factors affect how much people change their behavior after their state's governor issues or lifts a stay-at-home order?
  • How much privacy are people willing to sacrifice for better contact tracing technologies?
  • Can people tell fake images created by machine learning apart from images of real humans?
  • How well can you simulate demand for bike share given past data?
  • Does Google Translate give gender-biased responses?
  • How accessible are voter registration websites in every state?
  • How can bots disrupt federal public comment websites by using machine learning to create deepfake text that is indistinguishable from human-written speech?
  • And more!

Abstracts

Project 1. Analyzing the Impact of Coronavirus on Voter Wait Times during the Wisconsin Primary Election: A GPS-Based Approach

Sam Lurye

On April 6, 2020, the governor of Wisconsin issued an executive order in a last-minute attempt to delay Wisconsin’s primary election. The order was intended to allow citizens in Wisconsin to exercise their right to vote while avoiding the inherent perils of voting in person during a pandemic. Nonetheless, later that same day, the state Supreme Court ruled that the election would go on as planned. On April 7, with the electoral process thrown into chaos and just a fraction of the normal polling places open, thousands of Wisconsinites were forced to brave hours-long lines in order to vote.

Prior work has shown that long lines decrease voter turnout in future elections, and at least one study used mobile GPS data to show that people who live in predominantly white areas tend to face shorter voting lines than those who do not. Given the distinct possibility of needing to hold more elections before the end of this pandemic, it is imperative to understand how the coronavirus affected the election process differently across communities in Wisconsin. This study analyzes mobile phone GPS data from a sample of 10% of Wisconsin’s population on the day of the primary election in order to determine voter wait times across the state.

Preliminary results corroborate reports of substantially increased wait times, with some larger cities facing median wait times of over 1 hour. Next steps will focus on breaking down wait times by demographic group, as well as examining which communities had the highest ratio of in-person to absentee voters.
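
As an illustration of how a wait time might be estimated from GPS pings, the sketch below treats a device's wait as the span between its first and last ping within a fixed radius of a polling place, then takes the median across devices. The 100-meter radius, the ping format, the function names, and the sample coordinates are all assumptions made for this example; the abstract does not describe the study's actual procedure.

```python
import math
from statistics import median

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def dwell_minutes(pings, site, radius_m=100.0):
    """Minutes between a device's first and last ping within radius_m of site.

    pings: list of (unix_seconds, lat, lon) for one device (hypothetical format).
    site: (lat, lon) of a polling place.
    Returns None if fewer than two pings fall inside the radius.
    """
    inside = [t for t, lat, lon in pings
              if haversine_m(lat, lon, site[0], site[1]) <= radius_m]
    if len(inside) < 2:
        return None
    return (max(inside) - min(inside)) / 60.0


def median_wait(devices, site, radius_m=100.0):
    """Median dwell time in minutes across devices with a measurable visit."""
    waits = [w for w in (dwell_minutes(p, site, radius_m) for p in devices)
             if w is not None]
    return median(waits) if waits else None


# Made-up pings for three devices near a made-up polling place.
polling_place = (43.0, -89.0)
device_a = [(0, 43.0, -89.0), (1800, 43.0, -89.0), (3600, 43.0, -89.0), (4000, 43.1, -89.0)]
device_b = [(0, 43.0, -89.0), (1200, 43.0, -89.0)]
device_c = [(0, 43.0, -89.0)]  # a single ping gives no measurable dwell time
print(median_wait([device_a, device_b, device_c], polling_place))
```

A dwell-time estimate like this overstates waits for poll workers and nearby residents and understates them when pings are sparse, so a real analysis would need filtering rules that this sketch omits.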

Project 2. Vote-By-Mail Pandemic Decision Making

Diego Garcia and Kiera O’Brien

Voting by mail, while politically controversial before the coronavirus, is poised to be a major issue in the November 2020 elections. Five states currently conduct all elections entirely by mail: Colorado, Hawaii, Oregon, Washington, and Utah. Three others, California, Nebraska, and North Dakota, offer no-excuse opt-in vote by mail. These changes have all taken place in the last 10 years and were enacted by state legislatures.

A presidential election taking place in the middle of a global pandemic is an unprecedented situation that will leave many Americans concerned about their voting rights and safety. Even if authorities ease social distancing restrictions or the United States manages to avoid a second spike in infections in the fall altogether, we can still reasonably expect that the public will be inclined to avoid in-person voting to the extent possible. The increase in voting by mail will likely result in delays in reporting results. Because of these delays, many pundits may be eager to call elections based on exit polls, but based on a survey of American states’ legal frameworks, we anticipate voting patterns that break from the norm, meaning exit polls will be extremely unreliable.

Because of this, it will be crucial that Secretaries of State use best practices from past crises to ensure Americans are not disenfranchised by the pandemic. Elections have moved forward during 9/11, natural disasters, and a variety of other unprecedented situations. While these situations are different in scale, they are also different in terms of time to prepare.

We compare the legal frameworks for voting by mail to describe the landscape from recent elections. In parallel, we examine past crises that have impacted elections within states to see what best practices can be learned from these experiences. We propose a series of franchise-ensuring and disinformation-preventing recommendations to Secretaries of State based on their state’s existing legal framework for voting and previous crisis-time voting efforts by states with similar frameworks.

While the political discourse will likely fixate on voting during a pandemic in the lead-up to federal elections in November, we recognize that the state officials best positioned to implement changes that can ensure the franchise are the Secretaries of State. Additionally, we think it’s crucial in a time of “unprecedented” situations to base our recommendations on the history of states that have navigated elections amidst crises in the past. By organizing our recommendations into a logically flowing decision map, we hope to make a compelling case to Secretaries of State to ensure American voting rights to the best of their abilities while operating within their existing frameworks.

Project 3. GDPR as a Metric of Privacy in the Context of Contact Tracing

Ryan Chung and Jasmine Hyppolite

COVID-19 has created a global health crisis in which technology seems necessary to help control the spread. Because the virus can be transmitted by hosts who are asymptomatic or who do not show physical symptoms immediately, contact tracing has been deemed the best way to understand how the virus spreads within communities and between individuals. In response, technology companies have announced efforts to build applications and make use of data that could help the government track and mitigate the spread of the virus, although the government itself has not released any information about building an application without the help of private companies. Yet how secure are the various implementations of this system?

To evaluate and categorize the privacy precautions being taken, this paper interprets GDPR’s guidelines as they apply to contact tracing. It looks at two implementations, Apple and Google’s decentralized proposal and Australia’s centralized approach, with the ultimate goal of producing an evaluation framework that applies GDPR’s checklist to contact tracing and can serve as a new set of guidelines for contact tracing technologies to come.

Project 4. Mobility and stay-at-home orders: a compliance analysis

David Netter

The impact of COVID-19 on our day-to-day lives is palpable. Within only a few months, countries started issuing stay-at-home orders to prevent and slow the spread of the virus. In the United States, most states, and in some cases individual counties, implemented these new rules. As this global health crisis emerged, some countries decided to track their citizens’ locations in an attempt to trace chains of contamination.

Inspired by these methods, this paper uses county-level mobile phone geospatial data from the days before and after movement restrictions were implemented. Using regression analysis, it examines which metrics correlate most strongly with compliance with stay-at-home or shelter-in-place orders.

The next steps will focus on gathering more data and looking at what happens once the orders are lifted.
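
The simplest version of this kind of analysis can be sketched as follows: compute the Pearson correlation between each county-level metric and a compliance measure, rank the metrics by strength of correlation, and fit a one-variable least-squares line. The metric names, the compliance measure (percentage drop in mobility), and every number below are hypothetical; the abstract does not state which variables or which regression specification the study uses.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ols_fit(xs, ys):
    """Slope and intercept of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


def rank_metrics(metrics, compliance):
    """Sort (name, r) pairs by absolute correlation with compliance, strongest first."""
    scores = {name: pearson_r(vals, compliance) for name, vals in metrics.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Made-up data for five counties: percentage drop in mobility after an order.
mobility_drop = [10, 20, 30, 40, 50]
county_metrics = {
    "population_density": [1, 2, 3, 4, 5],  # hypothetical metric
    "median_age": [40, 38, 41, 39, 40],     # hypothetical metric
}

for name, r in rank_metrics(county_metrics, mobility_drop):
    print(f"{name}: r = {r:.2f}")
print(ols_fit(county_metrics["population_density"], mobility_drop))
```

Ranking by pairwise correlation ignores confounding between metrics; a multivariate regression with all metrics entered together would be the natural next step.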

Project 5. Privacy vs. Tracking System during COVID-19 Pandemic

Bruna Saraiva

During times of crisis, new opportunities arise. This has been a “mantra” followed by many institutions around the world during the COVID-19 pandemic. Moreover, the pandemic has given technology ample room to reinvent itself and create innovations to help fight and control the outbreak. The use of tracking systems, for example, is one of the most debated topics of these times. Large companies and governments around the world are studying and working to develop systems that will help control the spread of the virus and, ideally, also preserve individuals' privacy. However, will individuals accept this? Will they trust these institutions? How much and for what reasons are individuals willing to forgo their privacy?

This study analyzes individuals’ willingness to forgo their privacy to fight the COVID-19 pandemic and get back to their “normal” lives. To test this, a survey was designed and fielded to over 300 individuals around the United States. The survey included demographic questions (age, gender, household income, region, and device type) and specific questions about privacy (“What personal information would you be willing to provide for government agencies/private institutions/health institutions for tracking purposes?”). In addition, the survey aimed to understand which entities people trust most with their information if needed: government, private companies, or health organizations. This project plays an important role since the popularity and usage of tracking systems might help fight the spread of the virus and, possibly, lower the economic impact of social distancing actions and allow people to slowly return to work and a “normal” life. Companies and governments would benefit from understanding the population’s concerns about contact tracing technologies, since this understanding might guide the development of technologies and truthful marketing messages that address individuals’ concerns.

Results Summary: Preliminary results demonstrate that people would be most willing to provide their private information to protect themselves and their families (43.17%). Moreover, a significant number of individuals would also be willing to provide their private information in order to control the pandemic (38.80%). However, compared to the latter, more individuals would not provide their private information at all (42.08%). Also, the entities that individuals considered the most trustworthy to access and utilize their private information are health organizations/institutions, with a rank score of 3.09, while others had scores below 2.5.

Project 6. Fake versus Real Faces: Can you tell the difference?

Paul Marino, Aidan Keenan, and Kaitlyn Greta

Over the past several years, AI face-generation technology has improved to the point where many of its products are indistinguishable from real faces. We surveyed 200 people to see whether educating them about common characteristics of computer-generated faces improves their ability to identify an image as fake. Our survey helps determine the potential scope of this new AI technology: if people are sufficiently good at identifying AI-generated faces on their own, then enhanced security measures to protect against fake image exploitation are not needed. Half of our respondents received direction on identifying fake images, and the other half did not.

Results Summary: Out of the 100 respondents who did not receive any direction on common characteristics of fake images, around half could distinguish fake images from real images. Additionally, our survey found that our directive efforts significantly increased people’s ability to correctly distinguish fake images from real ones. Thus, our experiment shows that, without guidance, only 1 in 2 people can identify fake images, a proportion that merits increased security measures to protect against the exploitation of fake images. However, our results demonstrate that people’s ability to identify fake images dramatically increases when they are given guidance on common characteristics of fake images. With some simple training, people can distinguish real images from fake ones.

Project 7. Bike-Sharing Is Transit: Building Tools to Plan and Optimize Bike-Sharing Networks

Dhruv Gupta

Over 30 million Americans lack reliable access to a car. Many live in “transit deserts”, where transportation demand far exceeds transit supply. Bike-sharing has emerged as an eco-friendly, healthy, and congestion-limiting transit option. This thesis presents BikePath: a novel simulation tool to model bike-sharing, letting planners simulate the impacts of different ridership-demand and weather scenarios. In Chapter 1, I interviewed 26 bike-sharing operators and transit officials in 16 major American cities, finding that interest groups and political stakeholders have conflicting goals that limit the availability of bike-sharing. My interviews highlighted the need for efficient bike-sharing planning and optimization tools to maximize ridership and minimize operating costs. In Chapter 2, as a case study, I used BikePath to simulate a policy giving all Boston Public Schools teachers Bluebikes memberships. Even with the additional ridership-demand, BikePath found a way to remove a third of the 3000 bikes in the network and still meet transportation demand. In Chapter 3, I analyzed cellphone GPS data for 1.5 million trips to identify locations for new bike-sharing stations based on forecasted demand. This thesis offers new tools for bike-sharing planners and operators for improving access to efficient, reliable transit options for those who need them most.

Project 8. Google Translate: Reinforcing Gender Bias

Pernilla Hamren and Colin McGinn

Laws are meant to protect individuals from bias based on gender, but discrimination in society persists. A bias in how a particular gender is represented instills a distorted stereotype of gender roles, and women generally tend to be under-represented in male-dominated occupations. Gender bias shapes our world view and can limit the ability of individuals to select professions that may suit them. It also helps explain why women still lag behind men in certain careers, which impedes fair practices, pay equity, and equality. Google has taken many measures to thwart gender bias in its services; for example, Google’s AI tool no longer uses gendered labels for images of people. In fact, Google AI’s second principle is “Avoid creating or reinforcing unfair bias”. In April 2020, Google released a new model for its NMT (Neural Machine Translation) system that is supposedly 99% accurate in producing the requested masculine or feminine rewrites. Understanding the patterns and prevalence of algorithmic bias can help challenge and break gender stereotypes. In this research, we investigate whether Google Translate has successfully eliminated gender bias across occupations by developing an automated script to rapidly enter and save inputted translations using Google Translate’s storage service. This study attempts to detect gender inequality in translations across occupations and will include Google Translate’s most common language pairs, English to Spanish and English to French.

Preliminary Results: Our initial trials demonstrate that Google’s NMT system exhibits gender biases across occupations.

Project 9. State Voter Websites & Accessibility Policy: Many are Incompliant

Alyx van der Vorm

In this study we evaluated how many state voter registration websites and voter look-up websites follow federal accessibility laws, in order to assess how accessible the online voting system is to people with disabilities. Up to 19.4% of the American population have disabilities, and government functions are increasingly migrating online; given the approaching 2020 election and census, state website accessibility has never been more crucial. We define ‘inaccessible’ as a website where it is not possible for a voter using assistive technologies to use the site for its intended purpose. Currently, Section 508 of the Rehabilitation Act serves as the federal guideline for website accessibility, though individual states are not bound by this. Instead, states can either adopt Section 508 into their own guidelines, follow alternate international guidelines such as those set by the World Wide Web Consortium (W3C), create their own legislation and guidelines, or have none. In this study we ran all the state voter registration and voter registration look-up websites through WAVE (Web Accessibility Evaluation Tool) in order to count how many errors each website had. The existence of errors renders sites less accessible and may mean they do not follow accessibility laws. We then tested accessibility using VoiceOver on Mac to see how the WAVE-reported errors actually impacted user experience. Finally, we researched and ranked the accessibility policy of every state to see whether these policies were potentially being violated in any case.

Results Summary: When testing with WAVE, we found that the majority (79%) of state voter registration websites, many of which are supposed to follow federal accessibility laws, do not follow regulations and thus are inaccessible to many disabled citizens. Similarly, we found that the majority (72%) of state voter look-up websites also do not follow federal accessibility regulations. When testing with VoiceOver, we categorized 11 states’ voter registration websites as completely inaccessible: Alaska, Colorado, Delaware, Indiana, Minnesota, Nebraska, New Jersey, Oregon, Kansas, Mississippi and West Virginia. Additionally, we categorized 12 states’ voter registration look-up websites as completely inaccessible: Alaska, Connecticut, Georgia, Mississippi, New Hampshire, New Jersey, North Dakota, Oregon, South Dakota, Texas, Vermont and Wisconsin. Thus, our online voting system is plagued with inaccessibility errors, blocking disabled Americans from participating in the democratic process. Four states completely failed all 4 tests (i.e., both phase 1 and 2 for both website types): Alaska, Mississippi, New Jersey and Oregon. While Alaska, Mississippi and Oregon all fall in rank 2 (i.e., these states only have soft guidelines, not strict requirements, for web accessibility standards), New Jersey falls in rank 1 (i.e., strict accessibility policy). A further 13 states had either their voter registration or voter registration look-up site fail both WAVE testing (phase 1) and VoiceOver testing (phase 2). Out of those states, all but two fall in rank 1, i.e., have strict accessibility requirements of the highest level (WCAG 2.0 or Section 508). Thus, Colorado, Kansas, Minnesota, Nebraska, Connecticut, Georgia, New Hampshire, North Dakota, Texas, Vermont and Wisconsin fail to fully comply with their states’ web accessibility policy requirements. These states should be contacted to alert them to these non-compliant websites.

Project 10. Deepfake Bot Submissions to Federal Public Comment Websites Cannot be Distinguished from Human Submissions

Max Weiss

The federal comment period is an important way that federal agencies incorporate public input into policy decisions. Now that comments are accepted online, public comment periods are vulnerable to attacks at Internet scale. For example, in 2017, more than 21 million (96% of the 22 million) public comments submitted regarding the FCC’s proposal to repeal net neutrality were discernible as being generated using search-and-replace techniques [1]. Publicly available artificial intelligence methods can now generate “Deepfake Text,” computer-generated text that closely mimics original human speech. In this study, I tested whether federal comment processes are vulnerable to automated, unique deepfake submissions that may be indistinguishable from human submissions. I created an autonomous computer program (a bot) that successfully generated and submitted a high volume of human-like comments during October 26-30, 2019 to the federal public comment website for the Section 1115 Idaho Medicaid Reform Waiver.

Results Summary: The bot generated and submitted 1,001 deepfake comments to the public comment website at Medicaid.gov over a period of four days. These comments comprised 55.3% (1,001 out of 1,810) of the total public comments submitted. Comments generated by the bot were often highly relevant to the Idaho Medicaid waiver application, including discussion of the proposed waiver’s consequences on coverage numbers, its impact on government costs, unnecessary administrative burdens, and relevant personal experience. Finally, in order to test whether humans can distinguish deepfake comments from other comments submitted, I conducted a survey of 108 respondents on Amazon’s Mechanical Turk. Survey respondents, who were trained and assessed through exercises in which they distinguished more obvious bot versus human comments, were only able to correctly classify the submitted deepfake comments half (49.63%) of the time, which is comparable to the expected result of random guesses or coin flips. This study demonstrates that federal public comment websites are highly vulnerable to massive submissions of deepfake comments from bots and suggests that technological remedies (e.g., CAPTCHAs) should be used to limit the potential for abuse.
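
The arithmetic behind these figures can be checked with a short sketch. The first calculation uses the counts stated above (1,001 of 1,810 comments). The second shows one way to compare a classification accuracy against coin-flip guessing using a normal-approximation z-test; the abstract does not report how many classifications were made, so the counts used there (536 correct out of 1,080) are purely hypothetical values chosen to match 49.63%, and the test treats every classification as independent.

```python
import math


def share(part, total):
    """Fraction of total accounted for by part."""
    return part / total


def z_vs_chance(correct, trials, p0=0.5):
    """Normal-approximation z statistic for an observed proportion against p0."""
    p_hat = correct / trials
    standard_error = math.sqrt(p0 * (1 - p0) / trials)
    return (p_hat - p0) / standard_error


def two_sided_p(z):
    """Two-sided p-value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


# Counts stated in the abstract: bot comments as a share of all comments.
print(f"bot share: {100 * share(1001, 1810):.1f}%")

# Hypothetical counts: the number of classifications is NOT given in the abstract.
assumed_correct, assumed_trials = 536, 1080
z = z_vs_chance(assumed_correct, assumed_trials)
print(f"z = {z:.2f}, two-sided p = {two_sided_p(z):.2f}")
```

Under these assumed counts the z statistic falls well inside the conventional ±1.96 range, which is what "comparable to random guesses" means in statistical terms; with the study's real counts the exact values would differ.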