The New Wildcats: High-Risk Banking From Worst-Case Certiﬁcate Practices Online

Zheng Dong; Kevin Kane; Siyu Chen; L. Jean Camp

Abstract

Phishing attacks against bank websites occur when imposters masquerade as official bank websites. The idea is to convince the victim that the imposter is actually from a known, familiar institution, in order to fool him or her into providing passwords and other personal information. A solution requires the ability to distinguish legitimate banking institutions from other sites. The current core security designed to thwart these attacks relies on certificates that cryptographically certify the connection between a website and a user. However, such certificates are often used incorrectly, and even when implemented properly, they have weaknesses that can be exploited for attack against online banking sites. We implemented a large-scale examination of certiﬁcates, downloading some 4 million certiﬁcates over two years using machines on three continents as a baseline for comparison against a second set of bank certificates from the Federal Deposit Insurance Corporation (FDIC)’s list of 27,000 federally insured depository institutions.

Results summary: We found that the use of certificates and the rest of the core authentication and transmission security infrastructure is weak for online banking, with a greater share of bank sites having at least one of the PKI vulnerabilities analyzed when compared against a group of popular general interest websites. Long-lived certificates (which exacerbate the risk of breach) are used 45 percent of the time by banks, but only 24 percent of the time by general websites. For FDIC-insured banks, only 50 percent have a certificate that reflects the bank or domain name, and only 23 percent of banks had official domains at all. Even when the banks have both domains and certificates, 41 percent of those do not match. Since certificates are intended to verify the identity of an online entity, the lack of widely available verification is problematic.

In response to these weaknesses, we present a set of technical best practices, and show how rarely these standards are met in practice. The failures we identify mean that banks are not correctly identified to their customers, and traffic between banks and customers is often insecure. We close with a specific regulatory and technology policy solution: creating an authenticated official banking website indicator that would reduce the vulnerability of banking websites to phishing and related attacks, and that would require a structural change neither in the certificates themselves nor in the larger public key infrastructure.

Background

What is PKI?

PKI comprises a set of standards, the organizations that implement those standards, and the devices that use the resulting standardized documents. PKI for websites deﬁnes a hierarchy of issuers (i.e., those who can authenticate) and the structure of the certiﬁcates themselves (i.e., what data are authenticated). The existence of PKI enables consistent issuance of public key certiﬁcates.

Understanding certificates

Certificates are at the core of PKI. A certificate is a set of assertions, often about the identity of a website’s owner, that is cryptographically signed by a trusted third-party organization; the signature provides mathematically verifiable evidence of the assertions’ validity.

The underlying mathematical structure of a certiﬁcate relies on public key cryptography, which uses a set of complementary mathematically based keys, one secret and one public, each of which can decrypt what the other encrypts. Information such as a digital signature is encrypted by the secret key and can be decrypted only with the public key. This means anyone can obtain the public key and conﬁrm that the information was encrypted with the secret key. The secret key and public key are linked to an identiﬁer, and that identiﬁer corresponds to a Certiﬁcate Authority (CA), usually a trusted third-party organization, which issues the certiﬁcate and attests to certain facts by signing the certiﬁcate.
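The sign-with-the-secret-key, check-with-the-public-key relationship described above can be illustrated with a toy example. This is a minimal sketch using textbook RSA with tiny primes chosen only for illustration; the key values, the helper names, and the sample assertion string are assumptions, and real certificates use keys of 2048 bits or more together with standardized padding.

```python
import hashlib

# Textbook RSA with tiny, illustrative primes. Never use such keys in practice.
P, Q = 61, 53
N = P * Q   # public modulus (3233)
E = 17      # public exponent
D = 2753    # secret exponent: (E * D) % ((P - 1) * (Q - 1)) == 1


def digest(message: bytes) -> int:
    """Hash the message and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Only the holder of the secret exponent D can compute this value."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone holding the public pair (N, E) can check the signature."""
    return pow(signature, E, N) == digest(message)


# Hypothetical assertion of the kind a CA would sign.
assertion = b"CN=iu.edu, O=Indiana University"
signature = sign(assertion)
print(verify(assertion, signature))
```

A signature that has been altered no longer verifies, which is what lets a relying party detect tampering without ever holding the secret key.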

The certificates serve two main purposes.

The first is to conﬁrm that a website is what it claims to be, as a form of identiﬁcation. Therefore, domain names and the common name of the party responsible for the domain name are in a certiﬁcate. For example, if “IU.edu” is the domain name, then that domain name should be listed either as the subject’s common name or in the subject alternative name extension. The owner of the website (in this case, Indiana University) should be listed as the subject’s organization name in the certiﬁcate.
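The identification check described here amounts to comparing the requested host name with the names listed in the certificate. The following is a simplified sketch of that comparison, not the full matching procedure browsers implement; the function names are hypothetical, and the wildcard rule (a `*` standing for exactly one leftmost label) is an assumption that mirrors common practice.

```python
def name_matches(pattern: str, hostname: str) -> bool:
    """Compare one certificate name against the requested host name."""
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if pattern == hostname:
        return True
    if pattern.startswith("*."):
        # The wildcard covers exactly one leftmost label of the host name.
        head, sep, tail = hostname.partition(".")
        return bool(head) and sep == "." and "." + tail == pattern[1:]
    return False


def certificate_covers(hostname: str, common_name: str, alt_names: list) -> bool:
    """True if the host appears as the common name or in the alternative names."""
    return any(name_matches(name, hostname) for name in [common_name, *alt_names])


print(certificate_covers("iu.edu", "iu.edu", []))
```

Under this check, a certificate issued for a bank's own domain does not cover a look-alike host that merely embeds the bank's name in a longer domain.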

The second purpose is to enable encrypting communication between the domain name and anyone who communicates with that domain. In other words, a certiﬁcate should conﬁrm to whom you are speaking and then prevent anyone else from listening in on the conversation. In technical terms, once veriﬁcation of the presented certiﬁcate is complete, the public key encrypts a random pre-master secret, which in turn generates a master secret key and a session key. A secure communication channel is then established between the user’s computer and the website. The (symmetric) session key now protects future communication against eavesdropping or modiﬁcation by a third party.

While there has been considerable work on how users interact with certiﬁcate warnings and notiﬁcations from their browsers when a website has problems implementing certificates [3, 4], this study focuses on understanding how often these problematic implementations occur today on the web, especially for banking sites.

Potential vulnerabilities

Certiﬁcates are mathematically secure and elegant, but improperly implemented certificates can make users vulnerable to attackers.

Lack of authentication results in masquerade attacks, in which individuals trustingly give personal information to a website controlled by an attacker. Masquerade attacks include phishing, pharming, and man-in-the-middle attacks.

Phishing attacks trick the victim into entering information on a false site with an incorrect and possibly misleading domain name. Often, the false site looks extremely similar in design to the legitimate site. The absence of a certiﬁcate is common, but so is the use of misleading but apparently trustworthy certiﬁcates. For example, an attacker can obtain a legitimate certiﬁcate for an obfuscated domain name (e.g., amazon.com.payment.gerin.net) or by hosting the attack in the cloud (thereby leveraging the trusted cloud certiﬁcate).

Pharming is a more sophisticated attack, in which the attacker manipulates the victim’s software to direct him or her to a website that is incorrect but nonetheless shows the correct domain name. This is done by changing the IP address from that of the actual website to that of the attacker’s website in one of the victim’s devices. In this case, only certiﬁcates can distinguish the two sites. The most common form of attack requires adding incorrect information to the victim’s local device (e.g., a laptop or phone), but home routers are also quite vulnerable.

Man-in-the-middle (MITM) attacks occur when an attacker inserts him- or herself into the initial authentication. The attacker pretends to the website to be the user, and pretends to the user to be the website. This is detected by matching the certificate presented by the connected website to the domain requested by the user. Certificate warnings from browsers are often seen when connecting through public networks at airports, hotels, or coffee shops, where the network provider’s site interrupts the connection to the website the user originally requested (e.g., showing starbucks.com first on a user’s device after connecting to a Starbucks network). A malicious party can intercept traffic in the same way with a MITM attack, without being visible to the user or the web server. The solution to this attack requires a functional, semantically meaningful PKI.

Even when certificates are implemented, certiﬁcates can fail in four ways.

(1) The set of facts embedded in the signature is somehow incorrect, either because of changes over time or incorrect issuance.
(2) The cryptography could be flawed [5].
(3) The software that is supposed to confirm the authenticity of the certificate is flawed, and authenticates flawed or falsified certificates.
(4) Individuals could perceive that the certificate means something quite different from the intended issuance and implications.

In our examination of bank certiﬁcates, we found problems of the ﬁrst and second type. Other researchers have documented serious problems in terms of the third type [6, 7, 11]. The fourth type is a focus of ongoing research in industry, in the academy, and indeed in our research group.

Certificate structure and practices

Certificates issued for the World Wide Web follow the X.509 format, which contains the following ﬁelds:

Certiﬁcate version. This ﬁeld indicates the format of other certiﬁcate ﬁelds.
Serial number. This is a unique certiﬁcate identiﬁer assigned by the CA.
Signature algorithm. This is the algorithm used to generate and verify the digital signature.
Message authentication algorithm. This algorithm generates the message digest, which is the compressed form of the entire certiﬁcate information and what is technically veriﬁed in the certiﬁcate signature.
Issuer. This ﬁeld contains information about the CA that issues the certiﬁcate. Common name, organization, physical address (city, state, country), and email are typically but not universally included.
Validity. The start and end dates of the certiﬁcate’s validity period.
Subject. This contains information about the entity to which the certiﬁcate is issued. Typical components are common name, organization, physical address (city, state, country), and email.
Certiﬁcate extensions. Depending on the certiﬁcate version, several optional but important certiﬁcate ﬁelds may exist. For example, “basic constraints” can indicate whether a certiﬁcate can be used as an intermediate certiﬁcate, which can sign subordinate certiﬁcates. “Extended key usage” restricts the use of the public key to a list of speciﬁc purposes enumerated upon issuance.

The ability to sign subordinates is particularly important for security because any CA can issue a certificate to any website. Since the actual operations of CAs can vary significantly, it may be possible for an attacker to obtain a valid certificate from a less-diligent CA, and that certificate will receive the same trust from web browsers as any other certificate.

One approach to address this is the use of Extended Validation (EV) certificates to create more trustworthy certificates. EV certificates are explicit assertions by the CA that there was a higher-than-normal level of due diligence in the certificate issuance. By contrast, for ordinary certificates CAs typically do not perform strict verification of the actual association between an entity requesting a certificate and the corresponding website, since there is no such standard practice in the industry.

However, EV certiﬁcates are not widely used. In addition, there is no research indicating that average users actually notice visual cues used to distinguish EV from non-EV certificates in browsers [4]. Thus, due to the signiﬁcantly higher cost and diﬃculty of obtaining an EV certiﬁcate, the majority of websites still use non-EV certiﬁcates.

Finally, practices related to issuing certificates vary widely and change slowly. Reasons for this include the large number of CAs, each of which has its own operational processes and the burden of legacy requirements. Technical practices, expertise levels, and jurisdictional practices vary signiﬁcantly across certiﬁcate authorities.

Therefore, while the existing technical structure of the certiﬁcates enables identiﬁcation of ﬁnancial institutions, it is the current marketplace dynamics that create disincentives to greater adoption.

Challenges with PKI

Four major categories of failures in PKI include (1) weak cryptography, (2) disorganized revocation, (3) inadequate information, and (4) ﬂawed evaluation software.

Weak cryptography
For those unfamiliar with basic cryptography, this simply means that there are stronger and weaker signatures. Weaker signatures have a greater risk of falsiﬁcation, just as weakly designed banknotes are at higher risk for forgery.
The current consensus among the cryptography community is that 1024-bit RSA keys offer insufficient security for the typical validity periods of end-entity X.509 certiﬁcates, as attacks against RSA have become increasingly sophisticated. Since 2011, the common recommendation has been for at least a 2048-bit key length for these certiﬁcates [8]. Yet in 2014, CAs continued to allow issuance of certiﬁcates with 1024-bit RSA keys for validity periods of at least one year. Some argue that this is due to legacy platforms whose software cannot use keys longer than 1024 bits and resource-constrained platforms that expend more processing time and battery power to do public key operations on longer keys. While these factors may constrain key length, they do not constrain certiﬁcate lifetime. Thus, there is no justiﬁcation, particularly for high-value certiﬁcates, for the use of lifetimes longer than recommended for a given key length.
Another element of weak cryptography is the choice of hash algorithm. The hash algorithm MD5 was standard for certificate signatures before SHA-1, and it continues in use despite increasingly effective attacks. The Flame malware attack in 2012 took advantage of a collision in MD5 to create a fraudulent certificate [9]. Recognition of the increasingly severe weaknesses in MD5 helped generally eliminate its use in new issuance, but older certificates that use MD5 were still in use as of 2014. Although the overall trend is positive, certificates that downgrade from SHA1RSA to MD5RSA and from SHA256RSA to SHA1RSA continue to be observed; that is, entities that get new certificates may downgrade as well as upgrade.
In both cases, the use of weak cryptography is compounded by long validity periods, sometimes 3, 5, or 7 years or more for end-entity certificates. A validity period limits the exposure from a cryptographic break by rendering a certificate useless by the time an attacker could brute-force the key. When the validity period exceeds this safe duration because of advances in cryptanalysis, these certificates become vulnerable but continue to be accepted.
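The three weaknesses discussed in this subsection (short RSA keys, broken hash algorithms, and long validity periods) can be expressed as a simple audit over a certificate's fields. The sketch below is illustrative only: the function name and finding labels are hypothetical, the 2048-bit floor follows the recommendation cited above, and the three-year validity cutoff is an assumption rather than a threshold stated in this section.

```python
from datetime import datetime

MIN_RSA_BITS = 2048              # minimum key length cited in the text
WEAK_HASHES = {"md5", "sha1"}    # hash algorithms the text treats as weak
MAX_VALIDITY_DAYS = 3 * 365      # assumed cutoff for "long-lived"


def audit(key_bits: int, hash_name: str,
          not_before: datetime, not_after: datetime) -> list:
    """Return the list of weak-cryptography findings for one certificate."""
    findings = []
    if key_bits < MIN_RSA_BITS:
        findings.append("weak-key")
    if hash_name.lower() in WEAK_HASHES:
        findings.append("weak-hash")
    if (not_after - not_before).days > MAX_VALIDITY_DAYS:
        findings.append("long-lived")
    return findings


# Hypothetical certificate: 1024-bit key, MD5 signature, five-year lifetime.
print(audit(1024, "MD5", datetime(2012, 1, 1), datetime(2017, 1, 1)))
```

Keeping the three checks separate makes it easy to see that a certificate can be unremarkable on any single dimension and still be risky in combination, as when a short key is paired with a long lifetime.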
Disorganized revocation
Two standards for revocation, certificate revocation lists (CRLs) and the Online Certificate Status Protocol (OCSP), are in common use. CRLs are lists of serial numbers of certificates that the CA has revoked, for example because their cryptography appears unreliable; relying software can check the lists before accepting a certificate. The advantage of a CRL is that updated lists can be downloaded periodically. One CRL file can include multiple revoked certificates, saving time when checking several certificates from the same CA simultaneously. OCSP obtains a certificate’s real-time revocation status from a server, which the relying party must query before using the certificate. OCSP is more responsive to changes in certificate status, but CRLs are less affected by network delays or slow connections.
Practices amongst CAs vary, with some issuing certiﬁcates with CRL information, some with OCSP, some with both, and some with neither. Even if the CA implements best practices in its certiﬁcate issuance, this problem is further complicated by the irregular behavior of browsers and Web-application clients in checking revocation status. In 2014, Mozilla Firefox decided to use OCSP exclusively, meaning that all certiﬁcates with only CRL information in the certiﬁcate become effectively irrevocable to Firefox clients [10]. The use of CRL requires a substantial data download compared with the smaller traffic required for OCSP. Clients on constrained data connections, such as cellular connections, may use only OCSP, if they do any revocation checking at all. Apps and other non-browser web clients that use SSL frequently do no revocation checking at all, making it practically impossible to effectively revoke the certiﬁcates of servers to which they connect.
Inadequate information
Failures to include appropriate or necessary ﬁelds to limit the use and valid applications of a certiﬁcate are a recurring problem. In the past, CAs issued certiﬁcates with poorly chosen Extended Key Usages (EKUs). The EKU is what restricts a certiﬁcate to use only for particular purposes, such as authenticating an SSL server, authenticating a client, signing code, or providing a trusted timestamp. The Flame malware attack also took advantage of an intermediate CA that had an unused but valid code-signing EKU, allowing rogue certiﬁcates issued from it to be used to sign code.
Flawed evaluation software
The reality of certiﬁcate-checking is a source of serious and legitimate concern [11]. Both Apple [12] and Microsoft [13] have long-lived ﬂaws in software that evaluates certiﬁcates. Apple’s software practices were relatively more grounded in the use of open code, with the code available to all to review. Yet a signiﬁcant certiﬁcate-authenticating error stood for months. In contrast, Microsoft had internal software engineering and formal code review, and errors in its code lasted even longer. While these are serious issues, they are beyond the scope of this work.

Methods

Approach

We document the current state of bank certiﬁcates. We compare these with general-purpose certiﬁcates (i.e., the top 1 million websites). We survey the various proposals for the certiﬁcate market writ large, including pinning and notaries. We identify how those ﬁt and fail to ﬁt the unique problem of banking certiﬁcates.

Having identiﬁed the systematic failures in certiﬁcates, we discuss the proposals in the technical community for addressing them. None of these resolve the problems that plague online banking. What is needed is a policy solution. We close with a policy proposal, including technical and implementation recommendations, to ensure certiﬁcates can be a valid basis for consumer trust.

Collecting certificates

Evaluating the state of certiﬁcates in the wild requires large-scale analysis of the certiﬁcates. We wanted to be able to answer two questions: First, what is the state of banking certiﬁcates? Second, are they more reliable than general-purpose certiﬁcates, that is, those used by the top million websites?

We also compare this collection to other efforts. The fundamental difference is that we have the only certiﬁcate compilation focused on ﬁnancial-sector analysis. We also illustrate that our certiﬁcate compilation is at least as complete as other approaches, as we complement our daily scans with geographical diversity and multiple data sources.

The dataset we compiled used the PlanetLab research platform [14], allowing us to view certificates from different locations on the globe. Specifically, our scripts ran on servers in the United States, in both East and West coast time zones. We also ran the scripts on Asian and European servers through PlanetLab. Some certificates are hosted on content distribution networks, and thus will be the same from every vantage point. Other certificates are linked to a specific device, so that the same domain will result in different certificates when visited from different places.

We implemented each of the following search and compilation strategies daily from December 18, 2012, until March 2014:

The top one million websites from the ranking of the previous day. Our script obtains the website list from Alexa every morning and tries to connect to each website on the list via HTTPS. We download a certiﬁcate from the website if it is different from our previous observation.
FDIC-insured bank official websites. The FDIC maintains an official list of its member institutions. For each member, our script retrieves the name, physical address, and official web domain of the bank (if any). The script then removes invalid URLs (e.g., email addresses) from the list, and tries to download a certiﬁcate from each valid website on the official list.
For each FDIC website without a certiﬁcate matching the listed domain name, we download the homepage of the website and search for HTTPS links. We download additional certiﬁcates by following these hyperlinks. We ﬁlter out popular and common links (e.g., “Like Us on Facebook”) from the observations.
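The daily collection step described in the list above (connect over HTTPS, keep a certificate only if it differs from earlier observations) might be sketched as follows. This is not the authors' script: the function names and the in-memory `seen` store are assumptions, and the fingerprint is taken over the PEM text for simplicity. `ssl.get_server_certificate` is the Python standard-library call that retrieves a server's certificate without validating it.

```python
import hashlib
import ssl


def fetch_pem(domain: str, port: int = 443) -> str:
    """Network call: return the server's certificate as PEM text."""
    return ssl.get_server_certificate((domain, port))


def scan(domains, seen, fetch=fetch_pem):
    """Return {domain: pem} for certificates not observed on earlier scans.

    `seen` maps each domain to the set of fingerprints already recorded
    and is updated in place, so it can be reused from one day to the next.
    """
    new = {}
    for domain in domains:
        try:
            pem = fetch(domain)
        except OSError:
            continue  # no HTTPS service, timeout, or failed handshake
        fingerprint = hashlib.sha256(pem.encode()).hexdigest()
        known = seen.setdefault(domain, set())
        if fingerprint not in known:
            known.add(fingerprint)
            new[domain] = pem
    return new
```

Passing the fetch function as a parameter keeps the comparison logic separate from the network access, so the same loop could be driven from any of the domain lists described above.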

Certiﬁcates from the two data sources enable us to conduct a thorough analysis on the current status of banking certiﬁcates in the United States. By August 20, 2014, we had observed 1.1 million distinct certiﬁcates from 3.8 million popular general websites. Note that our geographically distributed exploration results in a far broader view of the PKI than the average user would experience. One study of browser histories illustrated that for a speciﬁc individual, some 90% of all root certificates would not be encountered at all [15].

Similar projects also collect certificates. The Electronic Frontier Foundation (EFF) actively scans the IPv4 address space for certificates and continuously augments its TLS Observatory [16]. With the EFF browser extension, users can submit their observed certificates and receive warnings from the EFF server if there is a discrepancy between a certificate the user observes and the previously observed certificates stored in the Observatory. Another centralized certificate notary is maintained by the International Computer Science Institute (ICSI) from live HTTPS traffic passively collected at its participating organizations [17]. Based on this dataset, Amann et al. performed a data analysis on the structural differences between benign certificates and rogue certificates observed in previous CA compromises [18]. Finally, similar to EFF, Durumeric et al. regularly scanned the entire IPv4 address space for certificates and made several recommendations for the PKI ecosystem [19].

None of these organizations made their datasets available for evaluation as a whole; they support only individual queries. Our results are being made available via the Protected Repository for the Defense of Infrastructure Against Cyber Threats (PREDICT) for reproduction or further investigation by others. The ICSI dataset includes roughly one year of our own compilation that we made available to that project. The larger ICSI dataset was not available for our analysis. We also encountered challenges in accessing EFF’s data. Thus, building a dataset for analysis of general certificates was necessary. In addition, while others have evaluated phishing sites and associated certificates, to our knowledge no other group has a dataset of banking certificates.

Analysis of banking certiﬁcates

We investigated two approaches for analyzing FDIC-insured bank certiﬁcates: direct observation and machine-learning classiﬁcation. We started by making several direct observations of problems in the banking certiﬁcates collected. We then supplemented these insights with machine learning. This led to the discovery of distinct patterns in and between the categories of certiﬁcates and systematic diﬀerences between non-banking and banking certiﬁcates.

Machine learning

We examined the classiﬁcation performance with three diﬀerent machine-learning algorithms: J48, NBTree, and Random Forest.

J48 is a Java implementation of a traditional decision-tree algorithm, C4.5. This algorithm builds the decision tree based on information gains of each member in the feature set.
NBTree is a hybrid of a Naive Bayes classifier and a decision tree.
Random Forest is an ensemble algorithm that builds several decision trees and makes the ﬁnal decision based on a majority vote of all decision trees. For each tree in the forest, it uses only a subset of randomly selected features.
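The information gain that drives the decision-tree split can be computed directly from labeled examples. The sketch below is an illustration only: the feature names and the four sample rows are hypothetical, and C4.5 itself normalizes this quantity into a gain ratio, a step omitted here.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Reduction in label entropy obtained by splitting on one feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical certificate features and class labels.
rows = [
    {"long_lived": True, "has_ocsp": True},
    {"long_lived": True, "has_ocsp": True},
    {"long_lived": False, "has_ocsp": True},
    {"long_lived": False, "has_ocsp": True},
]
labels = ["bank", "bank", "general", "general"]
print(information_gain(rows, labels, "long_lived"))
print(information_gain(rows, labels, "has_ocsp"))
```

In this toy data the first feature separates the two classes perfectly while the second carries no information, so a tree builder would split on the first.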

We used machine-learning models to classify certificates into two categories: FDIC-insured banks and general websites. The set of certificates available to be classified as “banks” was less than one-quarter of all FDIC-insured entities. Table 1 below lists the classification performance. For each algorithm, we report the overall percentages of certificates correctly and incorrectly identified. For each category, we record the true positive and false positive rates, indicating the percentages of correct and incorrect instances for that category. As noted in the table, all three algorithms had an overall accuracy rate above 99.4%. The true positive rate for the bank category is 96% for all three algorithms, and the false positive rates (i.e., certificates categorized as banks that are actually in the general category) are as low as 0.1%. For the general website category, the true positive rates are as high as 99.9%, while only 3.7% of the bank certificates were ever misclassified as general. The correctness of classification can improve even further by combining the results of all three machine-learning algorithms.
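For readers less familiar with these measures, the following sketch shows how accuracy, true positive rate, and false positive rate relate to the four cells of a two-class confusion matrix, taking "bank" as the positive class. The counts in the example are hypothetical and are not the study's data.

```python
def rates(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Summary rates for the positive class of a two-class confusion matrix.

    tp: banks classified as banks        fn: banks classified as general
    fp: general classified as banks      tn: general classified as general
    """
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "true_positive_rate": tp / (tp + fn),
        "false_negative_rate": fn / (tp + fn),
        "false_positive_rate": fp / (fp + tn),
    }


# Hypothetical counts chosen only to illustrate the arithmetic.
print(rates(tp=96, fn=4, fp=1, tn=999))
```

Because general websites far outnumber banks, overall accuracy is dominated by the general class; the per-class rates are what show how well bank certificates themselves are recognized.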

Table 1. Classiﬁcation performance summary.

Results

Banks without valid websites or certificates

Many banks lacked domains, and thus appropriate certiﬁcates. Among all the 27,000 records in the oﬃcial FDIC list, only 6,000 had valid domains. We tried to connect to every web domain on the list, but we could establish HTTPS connections with only 4,000 of them (Figure 1).

We found no domain or certificate for 20,000 banks. The lack of association of domain name and certiﬁcate is problematic for two reasons. First, it means that it would be feasible for an attacker to register a domain for a bank, obtain a certiﬁcate for the domain name, and have that be the sole certiﬁcate. Second, as banks close, merge, or simply change branding, it would be quite feasible for an attacker to obtain a domain name similar to an expired bank domain, and then obtain a certiﬁcate. As no certiﬁcate ever would have been issued previously for that domain, none of the proposed changes to the certiﬁcate architecture would address such an attack.

Figure 1. Breakdown of banks by having valid websites and by having certificates.

Issues seen in certificates

(1) Mismatch of the web domain and subject entries in the certificate. A certificate can be used only in the specific web domains indicated in the subject’s common name and alternative name extension fields. Among all downloaded certificates, we discovered 498 domain name mismatches in bank certificates, affecting 41% of bank websites at some point in our data collection (Table 2). For comparison, 84% of general websites had mismatched certificates.
(2) Certificate sharing by multiple domains. Some web domains shared the same mismatched certificate. This occurs when a single entity hosts online banking for multiple organizations. However, and at greater risk, many of the shared certificates were provided as part of a default server configuration that had not been changed by the website administrators. As one extreme example, one certificate for sinkdns.org was observed in use by 51 different HTTPS bank domains. A certificate of webaccess1.com was used by 43 different banks. Certificates of the virtualization company Parallels were shared by 37 financial websites. In total, 5% of bank websites used shared certificates.
(3) Period of certificate validity. With any key, cryptographic or physical, the longer it is unchanged, the greater the risk of it being subverted. Unlike physical keys, cryptographic keys cannot be tracked, making subversion undetectable. Software vulnerabilities, such as web clients that misconfigure TLS, can expose keys to risk. This is exacerbated by weak encryption keys (see item 5 below).
(4) Missing EKU. The EKU limits the use of a certificate to its intended purpose. The most common use is to indicate that a certificate cannot be used to issue other certificates. A certificate that was issued legitimately but later subverted can be used to create new certificates under the control of an attacker.
(5) Weak encryption standard used. Not all encryption standards are equally strong. There is no cost-based reason against using best cryptographic practices and obtaining a stronger key with a superior algorithm. Certificates using algorithms with well-known weaknesses continue to be issued, presumably for keys that have little commercial value. However, depository institutions are not in that category of customers.
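The certificate-sharing observation in the list above can be reproduced by grouping observed domains by certificate fingerprint. This is a minimal sketch under stated assumptions: the function name, the `(domain, fingerprint)` input format, and the sample domains are all hypothetical.

```python
from collections import defaultdict


def shared_certificates(observations, min_domains=2):
    """Group domains by certificate and keep certificates seen on several.

    `observations` is an iterable of (domain, fingerprint) pairs.
    """
    by_certificate = defaultdict(set)
    for domain, fingerprint in observations:
        by_certificate[fingerprint].add(domain)
    return {
        fingerprint: sorted(domains)
        for fingerprint, domains in by_certificate.items()
        if len(domains) >= min_domains
    }


# Hypothetical observations: two bank domains present the same certificate.
observed = [
    ("bank-one.example", "fp-default"),
    ("bank-two.example", "fp-default"),
    ("bank-three.example", "fp-own"),
]
print(shared_certificates(observed))
```

A certificate that appears on many unrelated bank domains is a candidate for the default-configuration case described above and would warrant manual review.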

Table 2. Potential vulnerabilities uncovered in certificates from bank and general websites. The percentage of bank websites and percentage of general websites for each vulnerability reflects the share of sites with that attribute at some point during the data collection from December 2012 to March 2014.

Figure 2. Banks have fewer domain name mismatches, half as many as popular general interest sites, but are much more risk-seeking when it comes to certificate lifetimes.

Vulnerability in the cloud

The issue of cloud computing also makes the lack of consistent, identiﬁable ﬁnancial certiﬁcates problematic. Botnets provide a platform where there is no constraint on criminal activity. Cloud provider services are also misused by attackers, including attackers who engage in masquerade attacks such as phishing. In our research, 22 sites PhishTank identifies as phishing sites were hosted on Google Drive. In addition, there are reports of criminal use of Microsoft’s Azure [20].

Discussion

We showed that there are signiﬁcant problems with ﬁnancial certiﬁcates. We propose how these might be at least mitigated.

Banks, citizens, customers, creators of web browsers, and other legitimate businesses all have a shared interest in having identifiable and secure bank websites. Creating a mechanism for distinguishing and recognizing banks encourages online banking and online trust.

The technical entities understand the requirements for certificates and the regulatory authorities understand the nature of systematic risk. A collaboration that consists of major cloud providers, banking regulators, cryptographic and interaction experts, browser manufacturers, and selected banks could feasibly create and support adoption of best practices suitable for depository institutions.

Recommended technical best practices

Preventing masquerade fraud against financial institutions requires differentiating legitimate financial sites from other sites. Rather than trying to identify every phishing site against every bank, a valid cryptographic mechanism could exist for identification of banks only. The simple model of “good versus bad” in PKI fails to provide adequate information. If such a mechanism were combined with targeted password reuse identification or other mechanisms to flag input into websites, it could make masquerading as a bank far more difficult [21]. Yet any such solution requires a reliable and correct implementation of PKI for banks.

Here we enumerate some basic best practices. None of these proposals are particularly innovative in and of themselves, but combined, they create a list of feasible requirements for high-value certiﬁcates, such as for the ﬁnancial industry.

The X.509 standard itself sets a very low bar for what constitutes a valid certiﬁcate. As a result, industry consortiums mandate further requirements, and many of these are obligatory for inclusion in the trusted root certiﬁcate list of web browsers. Several issuance best practices can be added on top of these requirements. Although legacy requirements are chieﬂy why these best practices are not yet required, they should eventually become so.

Strong cryptography
The ﬁrst best practice is the use of strong cryptography. RSA remains the dominant public key algorithm for certiﬁcates, and the cryptographic community recommends at least 2048-bit keys for end-entity certiﬁcates. MD5 has been shown to be vulnerable, and new research is exposing vulnerabilities in SHA-1 as well [21]. Therefore, the SHA-2 family of hash algorithms should be employed as part of the signature algorithm as much as possible. The use of strong cryptography then can be augmented by applying reasonable validity periods to the certiﬁcates, such as one to three years for end-entity certiﬁcates. This limits exposure from any future attacks.
Where possible, elliptic curve algorithms should be considered instead of RSA as support for them becomes more widely deployed. Elliptic curve (EC) keys should have at least 256 bits of length for end-entity certificates. Because the EC standard is still under discussion in both the Internet Engineering Task Force (IETF) and World Wide Web Consortium (W3C), requiring it would be premature. The issues with the National Institute of Standards and Technology (NIST)’s dual EC deterministic random bit generator (DRBG), specifically the potential back door [23] and the Bullrun decryption program [24], reasonably resulted in decreased trust in this standard. While the challenges of operational risk can be handled in part at a national level, cryptographic standards for browsers and interoperability cannot be based on untrusted curves. Thus, we recommend the use of RSA with a key size of 2048 bits, as market acceptance will not be problematic. Similarly, requiring SHA-2 is a reasonable and arguably necessary step. Maximum validity periods could be determined empirically. A single year would be ideal; however, two is not beyond the pale. The longest lifetime we saw in our compilation is 40 years (happily, not from a bank). Clearly, this is not reasonable.
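The thresholds above can be expressed as a simple policy check. The following is a minimal sketch, not a definitive implementation: `CertSummary` is a hypothetical record of fields that a real X.509 parser would extract, and the three-year ceiling is an assumption taken from the upper end of the "one to three years" range mentioned above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Thresholds taken from the recommendations in the text; names are illustrative.
MIN_RSA_BITS = 2048
MIN_EC_BITS = 256
WEAK_HASHES = {"md5", "sha1"}
MAX_VALIDITY = timedelta(days=3 * 365)  # assumed ceiling: "one to three years"


@dataclass
class CertSummary:
    """Hypothetical summary of fields extracted from an end-entity certificate."""
    key_algorithm: str    # "rsa" or "ec"
    key_bits: int
    signature_hash: str   # e.g. "sha256"
    not_before: datetime
    not_after: datetime


def strength_violations(cert: CertSummary) -> list[str]:
    """Return human-readable policy violations; an empty list means none found."""
    problems = []
    algorithm = cert.key_algorithm.lower()
    if algorithm == "rsa" and cert.key_bits < MIN_RSA_BITS:
        problems.append(f"RSA key of {cert.key_bits} bits is below {MIN_RSA_BITS}")
    elif algorithm == "ec" and cert.key_bits < MIN_EC_BITS:
        problems.append(f"EC key of {cert.key_bits} bits is below {MIN_EC_BITS}")
    elif algorithm not in ("rsa", "ec"):
        problems.append(f"unrecognized key algorithm: {cert.key_algorithm}")
    if cert.signature_hash.lower() in WEAK_HASHES:
        problems.append(f"weak signature hash: {cert.signature_hash}")
    if cert.not_after - cert.not_before > MAX_VALIDITY:
        problems.append("validity period exceeds the assumed three-year maximum")
    return problems
```

A certificate with a 1024-bit RSA key, a SHA-1 signature, and a 40-year lifetime would be flagged on all three counts, whereas a 2048-bit RSA certificate signed with SHA-256 and valid for one year would pass.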
Usable revocation information in certificates
For cases where there is either compromise of a particular certiﬁcate or an attack against an entire class, CAs should include usable revocation information in every certiﬁcate. Usable means that every major browser and web app that supports any kind of revocation checking can use this revocation information. Although the particulars of revocation checking are beyond the scope of this document, there can be none at all if the CAs do not participate.
Discouraging wildcard certificates
The purpose of the certiﬁcate is not only to enable a key exchange to occur, but also, to bind the server’s identity to a particular principal, such as a person or a corporate entity, with the authority to use that domain. Wildcard certiﬁcates arose with the expectation that all servers under a particular domain name would belong to the same principal. Therefore, it was an acceptable optimization to use a single certiﬁcate for a larger set of server names, given that each individual certiﬁcate incurs a certain cost.
The advent of multi-tenant environments turned this expectation on its head. Hosting providers that use load-balancing SSL terminators may deploy the same certiﬁcates with multiple domain names used by many diﬀerent customers. For example, the hosting company godaddy.com may host the domain 123456789.com. However, because of the structure of an X.509 certiﬁcate, only a single subject name is present, namely that of the hosting company (godaddy.com). The registered owner of the domain exists as a point of contact, but the SSL certiﬁcate itself does not correctly identify the site’s owner. Yet if the hosting provider allocates hostnames from its own domain name but uses a wildcard certiﬁcate, not even that information identifying the site’s owner is available. For example, if the site is 123456789.godaddy.com, the certificate may not provide any information about Company 123456789. The use of a wildcard certiﬁcate in this case, while expedient, breaks a fundamental assumption of the certiﬁcate-based identity model. Therefore, for each site operated by a different entity, CAs should issue unique certiﬁcates as much as possible. In situations where this is not possible, such as the SSL terminator scenario mentioned previously, the CAs should maintain records of attestations from the hosting provider that the domain owners authorize this use.
Wildcards should be discouraged, with a unified certificate issuer being an ideal practice for larger multi-domain entities. Wildcards should be prohibited in multi-tenant environments in the case of hosting services for a depository institution. A federally insured bank with a domain name should reasonably be expected to have the corresponding certificate, even if that certificate is associated with the domain name as a second-level instead of first-level certificate. Multi-tenant environments can support a unified certificate issuer but may be unable to support domain-specific certificates.
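To illustrate how a wildcard hides the tenant's identity, the sketch below classifies whether a bank's hostname is covered by an exact certificate name, only by a wildcard, or not at all. It assumes the common convention that a wildcard stands for exactly one leftmost label; the function names are hypothetical, and the input is a plain list of names that a real parser would read from the certificate.

```python
def is_wildcard(name: str) -> bool:
    """True if the certificate name uses a leftmost-label wildcard."""
    return name.startswith("*.")


def name_matches(pattern: str, hostname: str) -> bool:
    """Match a certificate name against a hostname.

    Assumption: a wildcard covers exactly one leftmost label, so
    "*.example.com" matches "a.example.com" but neither "example.com"
    nor "a.b.example.com".
    """
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if not is_wildcard(pattern):
        return pattern == hostname
    suffix = pattern[1:]  # e.g. ".example.com"
    if not hostname.endswith(suffix):
        return False
    label = hostname[: -len(suffix)]
    return label != "" and "." not in label


def classify_coverage(cert_names: list[str], bank_hostname: str) -> str:
    """Return "exact", "wildcard", or "none" for the bank's hostname."""
    exact = [n for n in cert_names if not is_wildcard(n)]
    wild = [n for n in cert_names if is_wildcard(n)]
    if any(name_matches(n, bank_hostname) for n in exact):
        return "exact"
    if any(name_matches(n, bank_hostname) for n in wild):
        return "wildcard"
    return "none"
```

Using the example from the text, `classify_coverage(["*.godaddy.com"], "123456789.godaddy.com")` returns `"wildcard"`: the connection validates, yet the certificate says nothing about Company 123456789.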
Limit EKU per certificate
Recall the Extended Key Usage (EKU) extension that indicates the purpose or valid use of a certificate. A best practice is for CAs to issue separate certificates for separate purposes and not combine multiple unrelated EKUs in a single certificate. In practice, not all certificate chain engines check “transitive EKUs,” where not only must the end-entity certificate possess a certain EKU, but all CAs along the path to the root must as well. However, it is still a best practice for a CA to segregate its intermediate CAs by intended purpose, such as server authentication or code signing. Further, it is best practice for a CA to embed EKUs in the certificates of those CAs as well, so that a compromised CA is still limited to its original purposes.
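The two EKU practices described above can be sketched as follows. The object identifiers are the standard ones for these purposes; everything else is an assumption made for illustration: the grouping of "related" purposes, the rule that a CA certificate without an EKU extension is treated as unrestricted, and the function names.

```python
from typing import Optional

# Standard EKU object identifiers.
SERVER_AUTH = "1.3.6.1.5.5.7.3.1"
CLIENT_AUTH = "1.3.6.1.5.5.7.3.2"
CODE_SIGNING = "1.3.6.1.5.5.7.3.3"
EMAIL_PROTECTION = "1.3.6.1.5.5.7.3.4"

# Assumed grouping: TLS server and client authentication count as related.
PURPOSE_GROUPS = [{SERVER_AUTH, CLIENT_AUTH}, {CODE_SIGNING}, {EMAIL_PROTECTION}]


def mixes_unrelated_purposes(ekus: set[str]) -> bool:
    """True if one certificate carries EKUs from more than one purpose group."""
    touched = [group for group in PURPOSE_GROUPS if ekus & group]
    return len(touched) > 1


def chain_permits(required_eku: str, chain_ekus: list[Optional[set[str]]]) -> bool:
    """Transitive EKU check over a chain ordered end-entity first.

    Each entry is the EKU set of one certificate, or None when the extension
    is absent. Assumption: the end-entity certificate must list the EKU
    explicitly, while a CA certificate with no EKU extension is unrestricted.
    """
    if not chain_ekus:
        return False
    end_entity, *cas = chain_ekus
    if end_entity is None or required_eku not in end_entity:
        return False
    return all(ca is None or required_eku in ca for ca in cas)
```

Under these assumptions, a server certificate issued beneath an intermediate CA that is restricted to code signing fails the transitive check, which is the containment property the segregation of intermediate CAs is meant to provide.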

If certiﬁcates are a part of operational risk for an individual institution, then systematic weaknesses in the PKI protecting depository interactions are part of the systematic risk for the banking system. Thus, there should be at least a minimal standard. The best practices above are a solid starting point for depository institutions.

Inadequacies of relying only on technical changes

The current situation carries avoidable risks that none of the proposed technical best practices for improving the PKI address. Consider primarily the lack of association between domain names and certificates for 21,011 banks (Figure 1). This lack of association, which leads to the potential for an attacker to create a masquerade site, would not be resolved by any of the current technical proposals.

Phishing is now a race that defenders cannot win. A phishing domain can be detected only after it is used in an attack. Thus, barring a change in policy, there will always be a window of opportunity for phishers. The attack site must further be labeled as malicious, then associated with a warning. Takedowns usually occur within a week or so [25]. The implication of this cycle is that new phishing domains have no history to analyze for blacklisting, so history-based proposals for solving the challenges in PKI would fail. Revocation mechanisms such as Certificate Revocation Lists (CRLs) and the Online Certificate Status Protocol (OCSP) likewise suffer from a lag between the time when a bogus certificate first appears and when it becomes blacklisted. In the extreme case when a CA is compromised, the CRL and OCSP may become untrusted altogether.

Whitelists such as the Electronic Frontier Foundation certificate observatory, which issues warnings when certiﬁcates are inconsistent with the observatory, are also vulnerable. New certiﬁcates are not ﬂagged when they ﬁrst appear. Thus, this common attack could in fact be exacerbated by the existence of the observatory if it were to become trusted.

Another disadvantage of using whitelists and revocation is that these approaches are inherently centralized. In contrast, the use of certificate notaries represents a distributed approach to validating certificates. The Perspectives Project offers a tool that relies on a comparison between the user-submitted certificate hashes and observations made by geographically distributed notary servers [26]. Convergence [27] is a Firefox browser extension that lets users control which data sources (e.g., notaries) to trust without disclosing their network addresses to the data source. However, an average online user may not be able to evaluate the trustworthiness of online notaries. An attack may become much easier if an adversary runs a notary and can trick other users into trusting it.
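
The notary comparison can be illustrated with a minimal sketch. This is not the Perspectives or Convergence protocol or API; the function names, the dictionary of notary reports, and the 75 percent quorum are all assumptions chosen for the example.

```python
import hashlib


def fingerprint(cert_der: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, as a hex string."""
    return hashlib.sha256(cert_der).hexdigest()


def notary_agreement(observed_der: bytes,
                     notary_reports: dict[str, str],
                     quorum: float = 0.75) -> bool:
    """Check whether enough notaries saw the same certificate as the client.

    notary_reports maps a (hypothetical) notary name to the hex fingerprint
    that notary observed for the site. The quorum fraction is an assumption.
    """
    if not notary_reports:
        return False  # no independent observations, so nothing to compare
    seen = fingerprint(observed_der)
    agreeing = sum(1 for fp in notary_reports.values() if fp == seen)
    return agreeing / len(notary_reports) >= quorum
```

The sketch also makes the weakness noted above concrete: every entry in `notary_reports` counts equally, so a client that has been tricked into trusting adversary-run notaries will reach quorum on a rogue certificate.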

Certiﬁcate pinning associates each website with a small whitelist stored by the local browser. The list is updated upon ﬁrst visit, as originally proposed in Tsow, Viecco, and Camp [28]. Google Chrome implemented this approach and protected several Google-owned domains against the use of rogue certiﬁcates. One weakness of this approach is the long tail in browsing, given the sheer scope of the problem of authenticating everyone.
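
A first-visit pin store of the kind described above might look like the following sketch. It follows the first-visit description in the text rather than any browser's actual implementation, and it pins a hash of the whole certificate, which is an assumption; the class and return values are hypothetical.

```python
import hashlib


class PinStore:
    """Trust-on-first-use store mapping each host to its pinned fingerprints."""

    def __init__(self) -> None:
        self._pins: dict[str, set[str]] = {}

    def check(self, host: str, cert_der: bytes) -> str:
        """Pin on the first visit; afterwards report "match" or "mismatch"."""
        fp = hashlib.sha256(cert_der).hexdigest()
        pins = self._pins.get(host.lower())
        if pins is None:
            # First visit: nothing to compare against, so the certificate
            # presented now is trusted and recorded.
            self._pins[host.lower()] = {fp}
            return "pinned-on-first-visit"
        return "match" if fp in pins else "mismatch"
```

The first call for any host always succeeds, so a site the user has never visited before gets no protection. This is the long-tail weakness: most of the many sites a user might visit have no pin on record yet.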

Under DNS-based Authentication of Named Entities (DANE) [29], Domain Name System Security Extensions (DNSSEC) bind a domain name to its legitimate certificate. DANE requires universal adoption of both DANE itself and DNSSEC, on which it relies. We know of no research or evidence that points to a realistic expectation of the global, universal adoption of DNSSEC in the near term. That a specific domain under DANE can be associated with only one issuer solves only a very narrow and unusual class of attacks. It does not solve the problem of attackers with a legitimate domain name masquerading as a bank’s official site, including through cloud misuse.
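
The binding that DANE provides can be sketched as a comparison between a TLSA record and the certificate a server presents. The selector and matching-type values below follow the TLSA record definition; the function name is hypothetical, the caller is assumed to supply DER-encoded bytes, and the sketch performs no DNSSEC validation and ignores the certificate usage field.

```python
import hashlib
import hmac


def tlsa_matches(selector: int, matching_type: int, association_data: bytes,
                 cert_der: bytes, spki_der: bytes) -> bool:
    """Compare a TLSA record's association data with a presented certificate.

    selector: 0 = full certificate, 1 = SubjectPublicKeyInfo.
    matching_type: 0 = exact bytes, 1 = SHA-256 digest, 2 = SHA-512 digest.
    """
    if selector == 0:
        selected = cert_der
    elif selector == 1:
        selected = spki_der
    else:
        return False  # unknown selector

    if matching_type == 0:
        candidate = selected
    elif matching_type == 1:
        candidate = hashlib.sha256(selected).digest()
    elif matching_type == 2:
        candidate = hashlib.sha512(selected).digest()
    else:
        return False  # unknown matching type

    return hmac.compare_digest(candidate, association_data)
```

A match shows only that the certificate belongs to that domain name. It says nothing about whether the domain belongs to a bank, which is the gap identified above.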

If certificate and domain name providers could reliably refuse to issue certificates and domain names to malware providers, botnet controllers, and other malicious parties, these threats would be a lesser issue. However, certificate and domain providers are not always so scrupulous, and thus are not appropriate gatekeepers. It has been documented that six CAs have issued rogue certificates in recent years: Comodo [30], DigiNotar [31], DigiCert [32], TurkTrust [33], French Government CA [34], and India CCA [35]. Nor is this only a recent problem. Perhaps most famously, VeriSign issued two certificates in Microsoft’s name in 2001 [36], for which Microsoft could only issue a security bulletin (MS01-017), as removing VeriSign as a trusted CA was clearly infeasible.

Finally, DANE’s reliance on DNSSEC results in all the problems of DNSSEC being a component of certiﬁcate risk. The problems of DNSSEC are both well documented [37, 38, 39] and beyond the scope of this work.

Regulatory and technology policy recommendations

Our policy proposal solves problems that lead to attacks specifically against banks, and does so with no changes to the current technical standards or to the competition among certificate providers. We propose the use of only the best standards and the creation of a mandatory certificate extension for FDIC-insured entities. This could be used to validate a certificate regardless of where it is hosted. Without the ability to identify a remote entity as a bank, masquerade attacks on the financial system will continue. An extension signed by a single authority, one that is constant across all FDIC-insured entities, can easily be integrated with the current authentication practices in Firefox, Chrome, and Internet Explorer.

Advocating for identification of specific categories of sites is not new. The W3C standard Web Security Context: User Interface Guidelines recommends “prior designation of high-value sites” [40], yet this has not been implemented. While the proposal is long-standing, a policy to implement it has been lacking.

The core of our proposal is that a federal entity, such as the U.S. Department of the Treasury or the FDIC itself, take two actions.

First, we propose the development of technical requirements for certificate issuance. Minimal requirements to control operational risk are not in any way a banking regulatory innovation, and specifying best practices in this domain is straightforward. Defining maximum lifetimes and minimal cryptographic strength and recommending extensions are a feasible, reasonable way forward.
Second, we propose the cooperative development of a third-party certificate notarization authority that applies only to banks and possibly other important financial institutions. Notice that while this would not be a CA, it would provide cryptographic notarization of an extension for certificates provided by current CAs. Such a notarization could provide proof that a legitimate federally insured bank operated the specific domain name. Rather than having every domain name reseller attempt to prevent any misleading domain name, our proposal would distinguish legitimate banking sites from other sites.
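
One way to picture the notarization is as a second signature over a small record that ties an institution to a domain and a certificate. The sketch below is purely illustrative and is not a specification of the proposal: the record fields, the JSON encoding, the function names, and the `verify` callable are all hypothetical, and `verify` stands in for a real public-key signature check against the notarization authority's key.

```python
import hashlib
import json
from typing import Callable


def attestation_payload(domain: str, institution_id: str, cert_der: bytes) -> bytes:
    """Build the canonical bytes a notarization authority might sign.

    Hypothetical format: a JSON object with sorted keys that binds the
    domain, an institution identifier, and the certificate's SHA-256 hash.
    """
    record = {
        "cert_sha256": hashlib.sha256(cert_der).hexdigest(),
        "domain": domain.lower(),
        "institution_id": institution_id,
    }
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def is_notarized_bank(domain: str, institution_id: str, cert_der: bytes,
                      signature: bytes,
                      verify: Callable[[bytes, bytes], bool]) -> bool:
    """Recompute the payload locally and check the authority's signature.

    verify(payload, signature) is supplied by the caller and is assumed to
    check the signature against the authority's public key.
    """
    payload = attestation_payload(domain, institution_id, cert_der)
    return verify(payload, signature)
```

Because the client recomputes the payload from the domain and certificate it actually sees, a signature issued for one bank's site cannot be replayed on a different domain or with a different certificate. The check also runs entirely on the client, so no browsing history needs to be reported to a third party.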

Of course, this proposal could also provide value to cloud service providers, which are currently challenged in that every customer, including masquerading attackers, has an equal capability to use the infrastructure of the cloud. By distinguishing ﬁnancial institutions from other institutions, our proposal has the potential to decrease the need for cloud providers to invest in providing certiﬁcates to every hosted site. By making this a second signature, rather than a whitelist or a blacklist approach, citizens can use this method without reporting their banking or browsing habits to any third party.

We argue that coordinating and setting up this plan is feasible due to the small number of major browser providers and cloud providers. More-secure interactions serve all parties’ interests. Furthermore, augmenting rather than replacing certificate authorities does not displace or decrease business. In fact, limiting the lifetime of certificates aligns with CA incentives.

Conclusion

A functioning public key infrastructure requires certiﬁcates that authenticate a website to a user before the person authenticates to the website. The current PKI is well established. Yet the challenge of online certiﬁcation of banks is not solved. The lack of a solution enables tens of thousands of attacks on ﬁnancial institutions every year. It also enables snooping, allowing eavesdroppers to observe the content of communications with ﬁnancial institutions. Our policy proposal offers a way to answer the basic query “Is this a bank?”, and further to support the conﬁdentiality of connections to banks. Of course, answering that question enables the solution but does not solve the challenges of human factors.

The current policy of relying entirely on competition in the certiﬁcate authority market to set standards is inadequate. We illustrated that the current practice of purchasing certiﬁcates with neither best practices nor regulatory minimums badly fails consumers, particularly in the ﬁnancial sector.

The lack of security is widespread. Certificates with incorrect names, incorrectly structured certificates, and cryptographically weak or shared certificates all plague online banking. We show that the vast majority of banks (88%) apparently lack the expertise, support, or incentive to implement certificates correctly.

We conclude by arguing for a change in the regulation of certiﬁcates for the ﬁnancial sector. We describe and recommend the adoption of commonly accepted best practices. We propose the creation of a readily identiﬁable official banking website indicator that requires neither a structural change in the certiﬁcates themselves nor in the larger public key infrastructure. Yet our proposal will address the failure of banks to authenticate or secure communications. With the recognition of the indicator in browsers and on cell phones, our proposal would leave phishers who target FDIC-insured institutions high and dry.

The adoption and widespread use of our proposed solutions would counter the concerns that public key certiﬁcates, while critical, are “signifying nothing” [36].

Finally, we believe that our proposal can be extended to other important consumer financial institutions beyond FDIC-insured banks. For example, researchers have shown that the tens of thousands of credit unions governed by the National Association of Federal Credit Unions are generally less secure than service providers for online banking, with problems that include scripting weaknesses and certiﬁcate reuse [41].