Vintage Protocol Nonsense: Annoying the TCP Stack to Uncover Tunneled VPN Connections

Virtual Private Networks (VPNs) are often advertised as a means to provide enhanced privacy for online browsing. VPN protocols, however, were not designed for this purpose—they have been retrofitted to do so. Our research reveals this retrofitting creates critical vulnerabilities which can easily be exploited by third-party attackers. Given this, we recommend users avoid using VPNs if they are doing so in an effort to increase their online browsing security. Other tools, such as Tor Browser, should be used instead. If users insist on using a VPN, we believe WireGuard is the best option.

This post provides an in-depth explanation of our research and assessment of the VPN vulnerabilities disclosed in CVE-2019-9461 and CVE-2019-14899. A more succinct version (with less technical detail) is available here. Note: users who only want to check whether their VPN vendor or operating system has addressed the issue can skip to the User Mitigation and Vendor Response section below.

In a future post, we’ll discuss our assessment of the current mitigations and whether they fix the underlying vulnerability. We also plan to address the misinformation surrounding our disclosure more directly in a separate post.

Introduction

A modern VPN creates an encrypted tunnel from the user to the VPN server. The server then acts as a middleman, in effect, retrieving the content requested by the user without revealing the user's identity. In a standard configuration, when a user connects to a VPN the VPN software on the user's machine creates a tunneling interface (tun/tap) to direct all of the user's internet traffic. The traffic sent through this TUN interface is encrypted using security protocols, typically based on SSL/TLS, before being sent on to the VPN server. The server retrieves the requested traffic, and then reverses the process—sending the traffic back to the user in encrypted form.

Notably, this configuration was not designed to provide the type of security VPNs are advertised to offer today. To the contrary, VPNs (which may refer to a multitude of different services and protocols) were only designed to allow a remote user point-to-point access to a private network in order to access on-site resources. This legacy is most apparent in how TUN interfaces are addressed (in which the client and server TUN interfaces communicate like a local, private network by using one of the three blocks of IPs reserved for private networks).

As noted at the outset, our research is motivated by the fact that VPNs have become a fundamental part of the conversation surrounding online user privacy and anonymity, but the risks involved in using them are not fully understood. To better understand these risks, we develop attacks that exploit aspects of a typical VPN configuration. Our focus is not on otherwise vulnerable VPNs, but on properly configured commercial VPNs that are used for privacy and anonymity purposes by privacy enthusiasts, activists, and dissidents. Special attention has been paid to vulnerable communities that use VPNs to avoid persecution and communicate with otherwise inaccessible communities.

Threat Model

Many people in the global West associate VPNs with either avoiding cease-and-desist letters from their ISP for pirating movies, music, games, and software, or spoofing their location to access websites or features in apps that aren't available in their physical location. In many parts of the world, however, VPNs provide access to far more than region-restricted TV shows. At times, they provide the only real access users have to the international community and restricted information. In areas where the act of visiting banned websites can be severely punished, VPNs (along with Tor and other privacy-enhancing tools) create gateways to the rest of the world because they promise user identities and browsing habits will be hidden from unwanted third-party observation.

Yet for vulnerable populations concerned about the repercussions for accessing banned information or communicating with dissident groups, using a VPN at home still evokes a strong sense of paranoia. Many of these individuals therefore choose to use public networks at coffee shops or restaurants to add an additional layer of security. And, even in western nations, people who are concerned about cyber criminals and people snooping on their browsing habits are encouraged to use a VPN while on a public network. Indeed, virtually all VPN providers list an untrusted, public network as one of the primary use cases for VPNs. The message is: if you are on an untrusted network, you should use a VPN. Our research was based on challenging this assumption.

In the threat model we employed, the attacker must be able to actively view ("sniff") all of the encrypted traffic being sent between the VPN client and the VPN server. The attacker also needs to be on the same local network as the victim, either as the access point or another local user connected to the same access point as the victim (attackers are the ninjas in the figure above). If the attacker is not the access point, then they need to have a wireless card capable of sniffing other client traffic on the same network, or they must perform a cache-poisoning attack to become the first hop for the victim's traffic.

In a real world scenario, this threat model would typically involve the attacker acting as a third-party public access point such as a cafe, airport, or mobile network controlled by a telecommunications company (with potentially some nation-state involvement). Even with a properly configured VPN on the victim's device, the attack we describe in the following section allows an attacker to infer existing connections and potentially reset or inject malicious payloads into those connections.

It is important to note that the injection of payloads is possible on all unsecured TCP connections, but does not work on protocols that implement application-level security like HTTPS (which has fortunately become significantly more common across the web in the past decade). Nonetheless, in the underdeveloped parts of the Internet that at-risk users typically operate in, there is a significant amount of traffic that is not protected by HTTPS. We used Selenium to scrape all the websites from Citizen Lab’s list of potentially blocked websites in China and Brazil, and found that 26.03% and 50.92% of these websites, respectively, included some unencrypted element. Unencrypted ads are commonplace in China. Consider China’s attack on GreatFire’s GitHub account, a DDoS carried out by recruiting millions of users around the world through JavaScript injected into their browsers. The attack specifically targeted connections to one of Baidu’s ad servers that served plaintext JavaScript for advertisements. Furthermore, even if the victim is careful to only navigate to secure HTTPS sites, their communication with HTTPS addresses can still be inferred and reset by the attacker in this threat model.

Vulnerability Overview

The attack we developed to expose this new class of VPN vulnerability involves three phases. Although the explanation of how and why it works is nuanced, the attack is actually fairly simple in implementation and can be completed in under 30 seconds.

PHASE ONE:

The attacker, after seeing that the IP address you are communicating with is associated with a commercial VPN provider, determines the IP address assigned to your VPN’s TUN interface. This takes roughly 3 seconds.

The VPN tunnel interface is assigned an address on the VPN server's subnet, much in the same way each device in your home is assigned an address on your local network. For instance, your modem may have an address of 192.168.1.1, and your phone may have the address 192.168.1.12. VPNs work in much the same way, where your device is connected to the VPN's local network. By default, OpenVPN (the most popular VPN client) uses 10.8.0.1 for the server, and the VPN interface on your device may be assigned 10.8.0.12 (for example).

Even though this address is private, the default behavior on many operating systems allows attackers to infer it with ease by spoofing traffic to a victim's device. In the default OpenVPN configuration, and all the configurations we tested in the wild, this can be determined in under 3 seconds.

PHASE TWO:

The attacker determines if you are visiting a website of their choosing (such as 64.106.46.56) by spoofing internet traffic that appears to come from that website's address and is sent to your VPN interface address. This takes roughly 5 seconds.

This phase of the attack requires a little more information than the prior one because TCP/IP connections have both an IP and a PORT. Each active connection you make to a website or service uses a PORT, typically assigned at random, and this "four-tuple" (your IP and PORT, as well as the server's IP and PORT) is used by your operating system to determine where to send your internet traffic. Most operating systems have around 30,000 ports available for TCP/IP connections, while most web servers use 80 (http) and 443 (https). The attacker will need to scan all of them on the client's machine, but this phase can still be completed in around 5 seconds.

For each website an attacker wants to check (for example, if they are going through a blocklist), they can determine if you are visiting it in under 10 seconds. Notably, an attacker will not need to repeat phase one for each subsequent site they check. This attack works regardless of whether TLS/SSL is being used. For many users, this alone is a significant enough threat. If a victim is visiting a website that doesn't use SSL, however, an attacker can also hijack the connection.

PHASE THREE:

The attacker exploits the behavior of the SEQUENCE and ACKNOWLEDGEMENT numbers in the TCP protocol to obtain a number in the correct range, allowing them to spoof packets with a malicious payload and inject data into the connection. This takes roughly 20 seconds.

Sequence and acknowledgement numbers are how a user's device and web server keep track of the information that has been exchanged. The sequence number indicates the current segment in the connection. The acknowledgement number indicates the next expected segment in the connection.

As with any traffic, delays and detours will occur. Sometimes traffic is delivered out of order or much later than is expected. When this happens, the client or server will send a different response. If the numbers are not close to the current sequence and acknowledgement numbers at all, the device receiving them will not respond. If the numbers are incorrect—but still within an acceptable range—the device will respond with a CHALLENGE ACK.

Again, in practice these three phases are easily accomplished by an informed third party in less than 30 seconds. Readers interested in a more technical explanation of the attack should review the following section.

Vulnerability In-Depth

This section provides an in-depth explanation of the three-phase attack used to exploit the VPN vulnerability. Each phase is addressed below.

1. Find the victim's internal VPN client IP:

When a VPN is configured properly on the client device, each connection that is established will set the source as the private IP address assigned by the VPN server (instead of the address it is usually assigned by the local access point). In order for the attacker to infer existing connections on the victim device, they must first find the private client address in use by the target device.

An attacker can use the source address of the local network gateway to spoof SYN-ACK packets to each private address the client could be using. The destination and source ports of the probing packets do not matter in this phase. The probes also need to include the external MAC address of the victim or else the probes will not be routable by the attacker. This requirement limits the attack to the local network. The response is easy for the attacker to identify in this phase since it will not be tunneled and will include the source IP address used by the tunnel interface (instead of the normal public-facing address it should be using). On a typical Linux device, a routing table entry ensures that any packet destined for the IP of the local gateway will use the default external interface instead of first reaching the tun/tap virtual device as intended for a VPN client.

For this initial phase, the attacker only needs to send one packet to each private address to see if the victim is using it. Accordingly, 254 packets need to be sent for each /24 subnet the attacker is probing. Using our test scripts, we were able to consistently scan a /16 subnet in under 8 seconds. This threat model also allows the attacker to observe the single IP address the victim is always talking to (the VPN server). The attacker could therefore easily connect to the same VPN server to determine the range of private addresses being served to the clients.
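The probe counts above can be checked with a short Python sketch. The 10.8.0.0 ranges are illustrative assumptions based on the OpenVPN default mentioned earlier, and the implied rate is simply the /16 count divided by the 8-second figure reported above, not a measured value.

```python
import ipaddress

def candidate_addresses(cidr):
    """Return the usable host addresses in a subnet (network and broadcast excluded)."""
    return list(ipaddress.ip_network(cidr).hosts())

# Assumed example ranges; a real VPN provider may hand out a different private block.
small = candidate_addresses("10.8.0.0/24")
large = candidate_addresses("10.8.0.0/16")

print(len(small))  # 254: one probe per address in a /24
print(len(large))  # 65534: one probe per address in a /16

# Rough probe rate needed to cover the /16 within the 8 seconds quoted above.
probes_per_second = len(large) / 8
print(round(probes_per_second))  # 8192
```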

Example nping command that would trigger the appropriate challenge ACK response if the victim VPN client was indeed using the internal IP of 10.8.2.16:

nping --tcp --flags SA --dest-ip 10.8.2.16 -e ap0 --dest-mac Ma:Ca:Dd:rE:Ss:Xx

2. Determine if a connection exists:

Once the attacker knows the internal IP address, they can probe the victim for existing TCP connections. The goal of phase 2 is to determine if the victim is communicating with a given website (64.106.46.56, in our example) and if so, determine the exact port they are using for this connection. As stated in our threat model, we are considering a targeted attack, where a government in control of the access point has a specific list of polarizing websites they want to check.

For phase 2, the attacker repeatedly spoofs SYN-ACKs to the victim where the destination is the internal VPN client IP found in phase 1. The source address is the IP of whatever site the attacker wants to see if the victim is connected to, while the source port is usually held constant as either port 80 or 443 depending on HTTP vs HTTPS. The attacker cycles through the entire ephemeral port range of the client, using each as the destination port of the SYN-ACK probes.

nping --tcp --flags SA --source-ip 64.106.46.56 -g 80 --dest-ip 10.8.2.16 -p [32768-60999] -e ap0 --dest-mac Ma:Ca:Dd:rE:Ss:Xx

As the attacker scans the victim's port range for a matching four-tuple connection, the victim will respond in one of two ways.

If the four-tuple in the SYN-ACK does not exist, the victim will send a RST out the tun interface.
If there is an existing connection for the four-tuple, the victim will send a challenge ACK out the tun interface.

TCP RST packets do not contain the time stamp field, making the length of the packet 12 bytes fewer than a challenge ACK. The attacker can use this simple size difference to reliably determine the tunnel response that includes the challenge ACK.
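Because the tunnel traffic is encrypted, the attacker only sees packet lengths. A minimal sketch of the size-based classification follows. It assumes that RST-carrying packets are the most common length observed during a scan and that the encryption preserves the 12-byte difference; the function name and the sample lengths are hypothetical.

```python
from collections import Counter

TIMESTAMP_DELTA = 12  # bytes by which a challenge ACK exceeds a RST, per the text above

def find_challenge_acks(lengths):
    """Return indices of packets whose length is the RST baseline plus 12 bytes.

    The baseline is taken to be the most common length, on the assumption
    that nearly every probe hits a non-existent four-tuple and draws a RST.
    """
    baseline, _ = Counter(lengths).most_common(1)[0]
    target = baseline + TIMESTAMP_DELTA
    return [i for i, n in enumerate(lengths) if n == target]

# Hypothetical encrypted packet sizes sniffed during a port scan.
observed = [93, 93, 93, 105, 93, 93]
print(find_challenge_acks(observed))  # [3]
```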

For the initial port scan, the attacker only needs to send a single SYN-ACK to each possible port the victim could use. On Linux, the normal ephemeral port range is 32768 to 60999. There will be a delay between the time the victim receives the spoofed packet and the time the attacker sniffs the appropriate response. In our script, the first round reliably places the attacker within ~300 of the port in use. The script then probes a second round at a slower rate within that range to narrow the estimate down to the exact port in use. Finally, the attacker can verify that they found the exact port as many times as needed by spoofing the same packet and expecting to sniff the same number of challenge-ACK responses. The script we used for testing took no more than 6 seconds to reliably determine if a victim was connected to a given website.
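The two-round search can be illustrated with a small simulation that involves no network traffic. Everything here is an assumption made for illustration: `lag` stands in for the number of probes sent between the matching probe and the moment its response is sniffed, and the second round is modeled as slow enough that each response can be tied to its own probe.

```python
PORT_MIN, PORT_MAX = 32768, 60999  # default Linux ephemeral range cited above

def coarse_round(actual_port, lag):
    """Fast sweep: the response is noticed `lag` probes late, so the
    attacker's estimate overshoots the real port by that amount."""
    for port in range(PORT_MIN, PORT_MAX + 1):
        if port - lag == actual_port:
            return port  # port being probed when the response was sniffed
    # The match was near the top of the range, so the sweep ended first.
    return PORT_MAX

def fine_round(actual_port, estimate, margin=300):
    """Slow sweep of the neighbourhood; responses map one-to-one to probes."""
    low = max(PORT_MIN, estimate - margin)
    high = min(PORT_MAX, estimate + margin)
    for port in range(low, high + 1):
        if port == actual_port:
            return port
    return None

print(PORT_MAX - PORT_MIN + 1)                 # 28232 ports in the first round
estimate = coarse_round(actual_port=40404, lag=250)
print(estimate, fine_round(40404, estimate))   # 40654 40404
```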

Example nping command that triggers an encrypted challenge-ACK from the victim, meaning the connection on port 40404 does exist:

nping --tcp --flags SA --source-ip 64.106.46.56 -g 80 --dest-ip 10.8.2.16 -p 40404 -e ap0 --dest-mac Ma:Ca:Dd:rE:Ss:Xx

Example nping command that triggers an encrypted RST from the victim, meaning the connection on a given port (40403, in this example) does NOT exist:

nping --tcp --flags SA --source-ip 64.106.46.56 -g 80 --dest-ip 10.8.2.16 -p 40403 -e ap0 --dest-mac Ma:Ca:Dd:rE:Ss:Xx

3. Injecting into the connection:

The third phase involves sending (and sniffing) the most packets. It is the most complicated of the three phases and ends up taking about 80% of the total attack time. At this stage the attacker knows about a specific existing TCP connection (source IP, source port, destination IP, destination port) and attempts to infer the exact sequence number and in-window ACK needed for the client to accept the malicious payload. All of these inference methods have been used before in similar off-path exploits (e.g., Cao). Mitigations introduced through the IETF and its RFCs to address these attacks, such as significantly narrowing the acceptable ACK window and rate-limiting challenge ACKs, have made the inference more difficult but still possible.

This phase involves three steps:

3a. Infer an in-window sequence number:

Spoof RSTs to the existing victim connection while incrementing the sequence number in large blocks. The victim will respond with an encrypted challenge ACK (always the same size) if the spoofed sequence number was in-window.
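The victim's behavior in this step matches the RST handling described in RFC 5961. A simplified model is sketched below. It ignores everything except the sequence check, and the names `rcv_nxt` and `rcv_wnd` follow the RFC's notation, not any particular kernel's code; the connection state is hypothetical.

```python
MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits

def rst_response(seq, rcv_nxt, rcv_wnd):
    """Classify a received RST by its sequence number (RFC 5961, section 3.2)."""
    offset = (seq - rcv_nxt) % MOD  # distance from the left edge, wrap-safe
    if offset == 0:
        return "reset"          # exact match: the connection is torn down
    if offset < rcv_wnd:
        return "challenge-ack"  # in window but not exact
    return "drop"               # out of window: silently discarded

# Hypothetical connection state.
print(rst_response(1000, rcv_nxt=1000, rcv_wnd=65535))   # reset
print(rst_response(30000, rcv_nxt=1000, rcv_wnd=65535))  # challenge-ack
print(rst_response(999, rcv_nxt=1000, rcv_wnd=65535))    # drop
```

Since only in-window guesses draw a response, stepping through the sequence space in blocks smaller than the window is enough to guarantee that some probe lands inside it.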

3b. Infer an in-window ACK for the connection:

Use the in-window sequence number found in step 3a to continually spoof empty PSH-ACKs to the victim while decrementing the ACK number in large blocks. The victim will respond with an encrypted challenge ACK once the ACK number guessed by the attacker goes just below the one in use. In practice, when testing on a typical Linux device (Ubuntu 19.04, kernel 5.0) the ACK number had to be less than the one in use by at most 20k to be accepted.

3c. Infer the exact sequence number needed to inject:

Finally, use both the in-window sequence and in-window ACK found in the previous steps to find the exact sequence number. The attacker can continually spoof empty PSH-ACKs with the previously found values while decrementing the sequence number by one every send. As soon as the sequence number goes below the one in use, the victim will start responding with challenge ACKs. Our script took extra time in step 3a to make sure the in-window sequence number found was already within about 200 of the left edge of the window.
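Steps 3a and 3c amount to a coarse search followed by a linear one. The sketch below simulates that search against stand-in oracles in place of a real TCP stack. The oracle functions, the window size, and the starting values are all hypothetical, and the ACK inference of step 3b is omitted.

```python
MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits

def make_victim(rcv_nxt, rcv_wnd):
    """Build two toy oracles that mimic the responses described above."""
    def in_window(seq):
        # Step 3a: a spoofed RST inside the window draws a challenge ACK.
        return 0 < (seq - rcv_nxt) % MOD < rcv_wnd
    def below_edge(seq):
        # Step 3c: a probe just below the sequence number in use draws one.
        return 1 <= (rcv_nxt - seq) % MOD < rcv_wnd
    return in_window, below_edge

def infer_sequence(in_window, below_edge, block):
    # Coarse pass: jump through the sequence space one block at a time.
    seq = 0
    while not in_window(seq):
        seq = (seq + block) % MOD
    # Fine pass: step down by one until the probe falls below the left edge.
    while not below_edge(seq):
        seq = (seq - 1) % MOD
    return (seq + 1) % MOD  # the value just above the first "below" probe

in_window, below_edge = make_victim(rcv_nxt=123456789, rcv_wnd=65535)
# The block must be smaller than the window so the coarse pass cannot skip it.
print(infer_sequence(in_window, below_edge, block=32767))  # 123456789
```

In this toy version the fine pass may take up to one block of probes; starting close to the left edge, as our script did, keeps that pass short.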

In phase 3, the attacker can continually sniff for encrypted packets of the same constant size, which is the size produced when a single challenge ACK is triggered. The attacker can find this size by either monitoring the victim's traffic or connecting to the VPN server themselves to compare their own traffic. There is not as much noise as in phase 2 since the victim is only responding to a small number of probes (instead of every single one). The attacker can also re-check that inferred values (e.g., the in-window sequence number from step 3a) are indeed correct by resending the same packets that may have triggered the response. If the attacker sniffs the same number of responses as probes that are re-sent, then they know the value is correct.

The number of packets needed for this phase and the time it takes for the complete inference depend mostly on the size of the TCP window and how long it takes the attacker to sniff a response from the victim. Additionally, against a Linux victim the attacker has to wait a half-second after each triggered challenge ACK due to the rate limit added to mitigate Cao's off-path attack. Our script went through three rounds during the initial in-window sequence scan to get a closer estimate of the exact one in use. Each round decreased the size of the block we skipped after each send and increased the amount of time we waited between each scan.

Real World Example

To show what a real-world attack using the strategies outlined above would look like on the Internet, we used our test script to create a video demonstration based on a specific scenario. In it, the attacker is in control of the access point that the victim is using to connect to the website of the Democracy Party of China (a party which has been banned in China since 1998). The victim is using a standard Linux machine running Ubuntu 18.04 with a properly configured NordVPN connection. The attacker in the demo is attempting to see if the victim is connected to the DPC website and, if so, inject a malicious HTTP payload when the victim interacts with the page.

Differences in OSes

There are a few small variances in how different client operating systems implement the TCP protocol and respond to the attacker's probes. On BSD-based systems, unlike on current Linux systems, there is no per-connection challenge ACK rate limiting. The amount of time it takes to perform phases 2 and 3 is therefore significantly reduced because the attacker does not need to wait a half second after each triggered response. On FreeBSD, the client machine does not care what the ACK number is for the injected payload. This means the attacker can completely skip step 3b and go straight to inferring the exact sequence number in use by the victim. On Android systems, the first phase is a bit simpler since it does not matter if the source address is the local gateway.

On the Apple devices we tested, including macOS Mojave and iOS 13.1.2, it was more difficult to perform the very first phase of the attack. Unlike BSD, Linux, and Android, these systems do not respond in plaintext with the internal VPN address as the source. Instead, the attacker has to use phase 2 to infer the tun IP address. We found that almost all Apple devices establish a TCP connection in the background to the Apple Notification Service on port 5223, which only seems to use ~10 different IPs to serve clients. We also found that because the connection is re-established as soon as the client connects to the VPN server, most Mojave devices will use a source port very close to the one chosen for the original TCP connection being used to talk to the VPN server. As the attacker can see the port being used to communicate with the VPN server, they have a much better idea of the port the victim would be using for the notification service.
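One way to use that observation is to order the candidate ports by their distance from the port seen on the VPN connection, so the most likely ones are probed first. The sketch below is only an illustration: the function name and the observed port are hypothetical, and the default range is the IANA dynamic port range (49152 to 65535), assumed here and not taken from our measurements.

```python
def ports_by_proximity(observed_port, low=49152, high=65535):
    """Order candidate source ports so those nearest the observed one come first."""
    return sorted(range(low, high + 1), key=lambda p: abs(p - observed_port))

# Hypothetical source port observed on the connection to the VPN server.
order = ports_by_proximity(51000)
print(order[:5])  # [51000, 50999, 51001, 50998, 51002]
```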

Security Disclosures

On December 4, 2019, we reported this vulnerability to the public oss-security mailing list after first reporting it to Linux Security, Android, Apple, and the private oss-security distros mailing list, and allowing the maximum embargo period for each to expire. We made no other efforts to publicly disclose the vulnerability ourselves. After the vulnerability was made public, however, it was published in several places. In turn, misinformation about what the attack is and how it works was also spread. It is our hope that this blog post clears up any confusion about the attack, while also offering practical advice for affected individuals.

User Mitigation and Vendor Response

User Mitigation:

Users on Linux clients can fix the vulnerability themselves by turning reverse path filtering on with a sysctl command or an iptables command. To check and see if reverse path filtering is enabled on your machine, you can use the following command:

sysctl net.ipv4.conf.all.rp_filter

If this command returns 0 or 2, your device is susceptible to the attack, but it can be easily mitigated with the following command:

sysctl net.ipv4.conf.all.rp_filter=1
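The same check can be scripted. The sketch below reads the rp_filter values that the kernel exposes under /proc/sys and reports, per interface, the larger of the `all` value and the interface's own value, which is what the kernel documentation describes as the value used for source validation. The function name and the `base` parameter are illustrative, and the script only reads values; it changes nothing.

```python
from pathlib import Path

def effective_rp_filter(base="/proc/sys/net/ipv4/conf"):
    """Map each interface to max(conf/all, conf/<iface>) for rp_filter."""
    root = Path(base)
    all_file = root / "all" / "rp_filter"
    if not all_file.is_file():
        return {}  # not Linux, or the path is unavailable
    all_value = int(all_file.read_text().strip())
    result = {}
    for entry in sorted(root.iterdir()):
        value_file = entry / "rp_filter"
        if entry.name in ("all", "default") or not value_file.is_file():
            continue
        result[entry.name] = max(all_value, int(value_file.read_text().strip()))
    return result

for iface, value in effective_rp_filter().items():
    # 1 is strict mode; 0 (off) and 2 (loose) are the values flagged above.
    status = "strict" if value == 1 else "susceptible per the check above"
    print(f"{iface}: rp_filter={value} ({status})")
```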

To turn reverse path filtering on permanently, you should add the following lines to /usr/lib/sysctl.d/50-default.conf:

net.ipv4.conf.default.rp_filter=1
net.ipv4.conf.all.rp_filter=1

Or if you prefer, it can be mitigated by an iptables rule (or an equivalent nftables rule) such as the following:

iptables -t raw -A PREROUTING \! -i tun0 -d 10.0.0.0/8 -j DROP

Vendor Response:

OpenVPN published a security advisory in response to our report, in which they stated that the vulnerability was in the way Unix-based systems are configured and not a flaw in OpenVPN's software. Although this is true, and it would be difficult for OpenVPN to anticipate the way in which each of their users will configure their software, the vulnerability nevertheless affects anyone using their software without modification. Considering that OpenVPN is currently the most popular VPN software used for commercial VPNs, this is a concerning prospect for at-risk individuals. Their decision not to address the issue with mitigations in the client-side software leaves anyone using their vanilla client vulnerable to this attack.

WireGuard actively participated in the development of a solution from the first day the vulnerability was privately disclosed and issued a fix on the day the vulnerability was publicly disclosed. Like OpenVPN, they also acknowledged that this is not a vulnerability in their software, but unlike OpenVPN they "are in the business of properly configuring people's networking stacks." You can follow the conversation in their mailing list archive here.

Private Internet Access says they have addressed the issue in a post on their blog shortly after the disclosure. They have not yet, however, published the details of the work they did to accomplish this.

Mullvad released a statement saying that only the first phase of the attack worked against their app, and they patched this the day after the public disclosure. Mullvad is one of the VPN services that we use personally. We have been more than satisfied with its quality and with the transparency that comes from its adoption of the open-source model. Despite this, we do think that their post might be unintentionally misleading since it talks about the app specifically, and not the service in general. Their app may be patched to mitigate this vulnerability, but if a user is using OpenVPN with Mullvad's servers, the user will need to incorporate another mitigation.

ProtonVPN released a statement notifying their users that they patched the vulnerability using the iptables rule above, but also acknowledged that Android would require a phone that has been rooted, and that iOS and macOS are unlikely to change their policy of multihoming to address the issue.

Linux developers are considering a mitigation by binding interfaces. We will update this blog if there are any developments.

OpenBSD was patched shortly after the disclosure. You can follow the conversation here.

As far as we can tell, neither Google nor Apple has addressed the issue, but we will update this post when and if they do (this post was last updated on 05-25-2020).

Media Coverage and Clarification

The following websites covered the basics of the attack and our disclosure.

https://www.zdnet.com/article/new-vulnerability-lets-attackers-sniff-or-hijack-vpn-connections/

http://www.circleid.com/posts/20200225_five_security_blind_spots_from_prolonged_implementation_of_bcp/

https://threatpost.com/linux-bug-vpns-hijacking/150891/

There are some interesting discussions on Hacker News and Slashdot, but also a few misunderstandings that we hope this post has cleared up:

Hacker News

Slashdot

A few videos were created to explain the attack in varying levels of detail. One of the more notable examples is Tom Lawrence's video, which is really impressive considering he made it the day after the public disclosure. The AT&T Tech Channel also produced a shorter, quality video explaining the important aspects of the attack.

Lawrence Systems - Vulnerability Lets Attackers Sniff or Hijack VPN on *nix based Systems

AT&T Tech Channel - VPN Hijack Vulnerability

TL;DR

We have discovered a vulnerability that allows a malicious actor connected to the same network as a victim to determine if the victim is visiting a specific website and either reset or hijack that connection. By virtue of sharing the same network, attackers can exploit the vulnerability to quickly scan a list of banned or targeted websites and determine if someone on the network is accessing them via a VPN. Even if a victim is connected using SSL/TLS, attackers can then exploit the vulnerability to deny service. And if a victim is only connected with HTTP, an attacker can go so far as to completely hijack the connection. This attack shows a fundamental flaw in the security claims made by VPN proponents who claim that using a VPN on a public network prevents malicious actors from knowing which websites you are visiting, blocking your access to specific sites, or spoofing these websites to steal your information or spy on you.

Even though several VPN vendors and operating systems have implemented a fix for this particular vulnerability, our research has unveiled a fundamental problem with the internet protocols used by VPNs. If the threats you are concerned about include accessing banned websites or restricted communications, you may be better served by a technology that isn't vulnerable to this class of attack, such as Tor.

About the Authors

This work was completed by William Tolley, as part of his OTF information control fellowship at the University of California, Berkeley, and Beau Kujath, as part of an NSF-funded project at the University of New Mexico. William and Beau are both PhD students at the University of New Mexico, where they are advised by Jedidiah Crandall. This project was overseen by Jedidiah Crandall, and Narseo Vallina-Rodriguez, who served as William's OTF advisor. Mohammad Taha Khan contributed to the development of the attack code and helped test various providers, operating systems, and VPN platforms.

The authors would like to thank Adam Lynn for his insight and direction and John Stith for his copyediting that helped punctuate the end of this project.

Funding Information

This project received funding from the following sources:

OTF's Information Controls Fellowship Program (ICFP) supports examination into how governments in countries, regions, or areas of OTF's core focus are restricting the free flow of information, impeding access to the open internet, and implementing censorship mechanisms, thereby threatening the ability of global citizens to exercise basic human rights and democracy. The program supports fellows to work within host organizations that are established centers of expertise by offering competitively paid fellowships for three, six, nine, or twelve months in duration.

This material is based upon work supported by the U.S. National Science Foundation under Grant Nos. 1518878, 1518523, and 1801613. Any opinions, findings and conclusions or recommendations expressed in this material do not necessarily reflect the views of the National Science Foundation.

Blind In/On-Path Attack Disclosure FAQ