2 or 3 things about OpenVPN

Created in 2001, OpenVPN is a widely used open source VPN protocol, and many VPN providers (such as those who sponsor YouTubers!) use it in their VPN clients.
The internals of this protocol were relatively obscure to me, so I decided to take a closer look at it.

For this, I had a look at the OpenVPN documentation (!) and the code itself.

I used the OpenVPN code from GitHub, commit a1cb1b47b138b9f654cd0bca5de6d08dbca61888.

Two authentication modes

OpenVPN supports two authentication modes: static key and TLS.

When static key mode is used, several static preshared keys are used for HMAC computation and encryption/decryption.
While relatively simple, this mode has the major flaw of not offering forward secrecy: someone who manages to steal these preshared keys can decrypt previously collected VPN traffic.
Static key mode should therefore be avoided, and we won’t talk about it anymore in this article.

When TLS mode is used, the OpenVPN peers establish a TLS session used to negotiate several tunnel options (such as the crypto primitives for the data channel, or whether compression will be used) and cryptographic material.

The TLS messages (handshake and application data) are not exchanged directly; instead, OpenVPN peers exchange OpenVPN messages embedding these TLS messages.

This exchange of TLS messages nested in OpenVPN messages is the OpenVPN control channel.

As the name suggests, this channel is used to control the configuration of the OpenVPN session, and to negotiate cryptographic material for the data channel, which is the channel used to exchange data.

Two channels

As we’ve just explained, OpenVPN uses two separate channels:

A data channel: This is the channel used to exchange application data.
A control channel: This channel is used to establish the tunnel, negotiate various options and derive cryptographic material for the data channel.

The control channel uses TLS but the data channel doesn’t.

Sequence of an OpenVPN session

| ---- P_CONTROL_HARD_RESET_CLIENT_V2 ----> |
| <---- P_CONTROL_HARD_RESET_SERVER_V2 ---- |
| --------------- P_ACK_V1 ---------------> |
| ------- P_CONTROL_V1 ClientHello -------> |
| <-- P_CONTROL_V1 ServerHello, etc ------- |
| <-- P_CONTROL_V1, encrypted handshake --> |
| <-- P_CONTROL_V1, encrypted handshake --> |
| <--------- P_CONTROL_V1, TLS app -------- |
| <--------- P_CONTROL_V1, TLS app -------- |
| --------------- P_DATA_V2 --------------> | 
| <-------------- P_DATA_V2 --------------- |
|                   (...)                   |
v                                           v

As we can see, several packet types are involved in this diagram.

Let’s enumerate them!

OpenVPN packet types

P_CONTROL_HARD_RESET_CLIENT_V2 (value 0x07) – Sent by the client to forget the previous state and initialize a new key exchange,
P_CONTROL_HARD_RESET_SERVER_V2 (value 0x08) – Sent by the server to forget the previous state and initialize a new key exchange,
P_CONTROL_V1 (value 0x04) – Control channel packet, embeds TLS packets,
P_ACK_V1 (value 0x05) – Acknowledges a control packet,
P_DATA_V2 (value 0x09) – Data channel packet, it’s the packet type used to transport the user data exchanged through the OpenVPN tunnel.

Packet structure

The structure of an OpenVPN packet depends on its type:

P_CONTROL_HARD_RESET_CLIENT_V2:

+------+------------+------+-----------+------+-----------------------------+-------------------------+
| Type | session-id | HMAC | Packet ID | Time | Message Packet ID Array Len | Message Packet ID Array |
+------+------------+------+-----------+------+-----------------------------+-------------------------+
<--1---><-----8-----><-----><----4----><--4---><--------------1-------------><----------var----------->

P_CONTROL_HARD_RESET_SERVER_V2:

+------+------------+------+-----------+------+-----------------------------+-------------------------+-------------------+-------------------+
| Type | session-id | HMAC | Packet ID | Time | Message Packet ID Array Len | Message Packet ID Array | Remote session-id | Message Packet ID |
+------+------------+------+-----------+------+-----------------------------+-------------------------+-------------------+-------------------+
<--1---><-----8-----><-----><----4----><--4---><--------------1-------------><----------var-----------><--------8---------><---------4-------->

P_CONTROL_V1:

+------+------------+------+-----------+------+-----------------------------+-------------------------+-------------------+-------------+
| Type | session-id | HMAC | Packet ID | Time | Message Packet ID Array Len | Message Packet ID Array | Message Packet ID | TLS message |
+------+------------+------+-----------+------+-----------------------------+-------------------------+-------------------+-------------+
<--1---><-----8-----><-----><----4----><--4---><--------------1-------------><----------var-----------><---------4-------->

P_ACK_V1:

+------+------------+------+-----------+------+-----------------------------+-------------------------+-------------------+
| Type | session-id | HMAC | Packet ID | Time | Message Packet ID Array Len | Message Packet ID Array | Remote session-id |
+------+------------+------+-----------+------+-----------------------------+-------------------------+-------------------+
<--1---><-----8-----><-----><----4----><--4---><--------------1-------------><----------var-----------><--------8--------->

P_DATA_V2, if HMAC is used:

+------+---------+------+----+------------+
| Type | peer-id | HMAC | IV | ciphertext |
+------+---------+------+----+------------+
<--1---><---3---->

P_DATA_V2, if GCM mode is used:

+------+---------+-----------+---------+------------+
| Type | peer-id | GCM count | GCM Tag | ciphertext |
+------+---------+-----------+---------+------------+
<--1---><---3----><----4-----><---16--->

When GCM mode is used, the GCM additional data is the concatenation

Type + Peer ID + GCM Count.
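As a rough illustration of the layout above, the following Python sketch splits a GCM-mode P_DATA_V2 packet into its fields and returns the additional data. The function name and the returned dictionary are made up for this example, and reading peer-id as a big-endian integer is an assumption; only the field widths come from the diagram.

```python
def split_gcm_data_packet(packet: bytes) -> dict:
    """Split a GCM-mode P_DATA_V2 packet using the widths from the diagram.

    Hypothetical helper: the name and return format are illustrative only.
    """
    header_len = 1 + 3 + 4 + 16  # Type + peer-id + GCM count + GCM tag
    if len(packet) < header_len:
        raise ValueError("packet too short for a GCM-mode P_DATA_V2 header")
    return {
        "type": packet[0],                              # 1 byte
        # Assumption: peer-id is stored in network (big-endian) byte order.
        "peer_id": int.from_bytes(packet[1:4], "big"),  # 3 bytes
        "gcm_count": packet[4:8],                       # 4 bytes
        "aad": packet[:8],       # Type + Peer ID + GCM count
        "tag": packet[8:24],     # 16 bytes
        "ciphertext": packet[24:],
    }
```

The "aad" entry is simply the first 8 bytes of the packet, which is why the header can be authenticated without being encrypted.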

The packet fields are the following:

Type = (Opcode << 3) | key_id, where key_id is the index of the control channel TLS session.
peer-id (in P_DATA_V2 messages): Identifies the peer.
session-id: Identifies the control messages of a given session.
Packet ID: ID of the message, used as an anti-replay mechanism.
Message Packet ID Array Len and Message Packet ID Array: An array of acknowledged messages.
Message Packet ID: Identifier of the embedded message.
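To make the Type field more concrete, here is a small Python sketch that packs and unpacks it, assuming the opcode occupies the five high bits of the byte and key_id the three low bits. The opcode values are the ones listed earlier in this article; the function names and the OPCODES table are only illustrative.

```python
# Opcode values as listed in the packet types section.
OPCODES = {
    0x04: "P_CONTROL_V1",
    0x05: "P_ACK_V1",
    0x07: "P_CONTROL_HARD_RESET_CLIENT_V2",
    0x08: "P_CONTROL_HARD_RESET_SERVER_V2",
    0x09: "P_DATA_V2",
}


def make_type_byte(opcode: int, key_id: int) -> int:
    """Build the Type byte: opcode in the high 5 bits, key_id in the low 3."""
    if not 0 <= opcode < 32 or not 0 <= key_id < 8:
        raise ValueError("opcode must fit in 5 bits and key_id in 3 bits")
    return (opcode << 3) | key_id


def parse_type_byte(type_byte: int) -> tuple:
    """Return (opcode name, key_id) for the first byte of a packet."""
    opcode = type_byte >> 3
    key_id = type_byte & 0x07
    return OPCODES.get(opcode, "unknown"), key_id
```

With these definitions, a first byte of 0x38 decodes to P_CONTROL_HARD_RESET_CLIENT_V2 with key_id 0, and 0x48 decodes to P_DATA_V2 with key_id 0.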

HMAC firewall and tls-auth option

This option is activated via the tls-auth directive in the configuration file, or on the command line.

It adds an HMAC, computed with a preshared key, to the control channel packets, and therefore protects the TLS messages exchanged before peer authentication. Thus, an attacker who does not know this key

cannot exploit a vulnerability in the code that handles the ClientHello (for instance) in the TLS stack on the server side,
cannot interact with the OpenVPN server and therefore cannot DoS it,
cannot fingerprint the TLS part of the OpenVPN server (for instance a malicious client cannot enumerate the TLS ciphersuites accepted by the server).

Data channel key generation

The cryptographic keys for the data channel can be generated:

via key-method-1: Each peer generates its own random material, sends it to the remote peer and uses it as keys for encryption and HMAC computation. This key-method-1 is no longer recommended.
via key-method-2: Unlike key-method-1, this key generation method generates keys using randomness from both peers. Derivation is done using the PRF specified in TLS (in section 5 of RFC 2246). This “old-fashioned OpenVPN key establishment” for the data channel uses client and server seeds exchanged through the TLS application messages.

In more recent versions of OpenVPN, data key generation is done through the exporter mechanism: a specific key (the exporter key), generated during the TLS handshake, is expanded through an HKDF (described in RFC 5869) into cryptographic material for each side.

OpenvpnPRF

This key derivation function uses the following inputs:

client_secret: Comes from P_CONTROL_V1 TLS payload sent by the client,
client_seed_1 & client_seed_2: Come from P_CONTROL_V1 TLS payload sent by the client,
server_seed_1 & server_seed_2: Come from P_CONTROL_V1 TLS payload sent by the server.

The derivation involves the following operations:

Define seed such that seed = client_seed_1 + server_seed_1,
Define label such that label = b'OpenVPN master secret',
openvpn_master_secret = PRF(client_secret, label, seed),
Take the first 48 bytes of openvpn_master_secret: openvpn_master_secret = openvpn_master_secret[:48],
seed = client_seed_2 + server_seed_2 + session_id_client + session_id_server,
Define label such that label = b'OpenVPN key expansion',
openvpn_keys = PRF(openvpn_master_secret, label, seed).
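The steps above can be sketched in Python with the standard library only. This is a minimal sketch, not the OpenVPN implementation: it assumes PRF is the TLS 1.0 PRF from section 5 of RFC 2246 (P_MD5 XOR P_SHA-1), and it adds an explicit output length parameter (48 bytes for the master secret, 256 bytes for the key material, the latter chosen to cover the slices used for key extraction). The function names are made up for the example.

```python
import hashlib
import hmac


def p_hash(hash_name: str, secret: bytes, seed: bytes, out_len: int) -> bytes:
    """P_hash from RFC 2246: A(0) = seed, A(i) = HMAC(secret, A(i-1))."""
    out = b""
    a = seed
    while len(out) < out_len:
        a = hmac.new(secret, a, hash_name).digest()
        out += hmac.new(secret, a + seed, hash_name).digest()
    return out[:out_len]


def tls1_prf(secret: bytes, label: bytes, seed: bytes, out_len: int) -> bytes:
    """TLS 1.0 PRF: P_MD5(S1, label + seed) XOR P_SHA-1(S2, label + seed)."""
    half = (len(secret) + 1) // 2          # halves overlap if length is odd
    s1, s2 = secret[:half], secret[len(secret) - half:]
    md5_part = p_hash("md5", s1, label + seed, out_len)
    sha1_part = p_hash("sha1", s2, label + seed, out_len)
    return bytes(x ^ y for x, y in zip(md5_part, sha1_part))


def openvpn_prf(client_secret, client_seed_1, client_seed_2,
                server_seed_1, server_seed_2,
                session_id_client, session_id_server) -> bytes:
    """Follow the derivation steps listed above (output lengths assumed)."""
    master = tls1_prf(client_secret, b"OpenVPN master secret",
                      client_seed_1 + server_seed_1, 48)
    seed = (client_seed_2 + server_seed_2
            + session_id_client + session_id_server)
    return tls1_prf(master, b"OpenVPN key expansion", seed, 256)
```

Note that MD5 must be available in the local hashlib build for this sketch to run.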

The cryptographic keys are then extracted from openvpn_keys:

client_data_cipher_key = openvpn_keys[ : 32].
If the chosen encryption algorithm uses a 16-byte key (such as AES128), only the first 16 bytes of client_data_cipher_key are used.
server_data_cipher_key = openvpn_keys[128 : 160].

If data packet integrity is provided via an HMAC algorithm,

client_data_hmac_key = openvpn_keys[64 : 128],
server_data_hmac_key = openvpn_keys[192 : 256].

If data packet integrity is provided via the GCM tag,

client_gcm_counter = openvpn_keys[64 : 72],
server_gcm_counter = openvpn_keys[192 : 200].

These two GCM counter values, also known as the implicit IV, are 8 bytes long, while GCM mode uses a 12-byte IV (nonce).

The 4 missing bytes are the explicit part of the counter, defined via the GCM count field of the data packet.
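The slicing and the counter assembly can be summarized with a short Python sketch. The offsets are the ones listed above; the function names are illustrative, and the order in which the explicit and implicit parts are concatenated (explicit 4 bytes first, then the 8 implicit bytes) is an assumption of this sketch.

```python
def split_data_keys(openvpn_keys: bytes, cipher_key_len: int = 32) -> dict:
    """Cut the 256-byte key material into the slices listed above."""
    if len(openvpn_keys) < 256:
        raise ValueError("expected at least 256 bytes of key material")
    return {
        # Only the first cipher_key_len bytes are used (16 for AES128).
        "client_data_cipher_key": openvpn_keys[0:32][:cipher_key_len],
        "server_data_cipher_key": openvpn_keys[128:160][:cipher_key_len],
        # Used when integrity is provided by an HMAC algorithm.
        "client_data_hmac_key": openvpn_keys[64:128],
        "server_data_hmac_key": openvpn_keys[192:256],
        # Used when integrity is provided by the GCM tag (implicit IV).
        "client_gcm_counter": openvpn_keys[64:72],
        "server_gcm_counter": openvpn_keys[192:200],
    }


def gcm_nonce(gcm_count: bytes, implicit_iv: bytes) -> bytes:
    """Assemble the 12-byte GCM IV from its explicit and implicit parts."""
    if len(gcm_count) != 4 or len(implicit_iv) != 8:
        raise ValueError("expected a 4-byte count and an 8-byte implicit IV")
    # Assumption: explicit part (GCM count field) first, implicit part last.
    return gcm_count + implicit_iv
```

For a packet sent by the client, the nonce would then be built from the GCM count field of that packet and client_gcm_counter.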

The TLS PRF

The TLS PRF is described in section 5 of RFC 2246 (this is the RFC that describes version 1.0 of TLS).

Exporter mechanism

If configured to use this mechanism, one of the TLS messages sent by the client embeds an IV_PROTO token whose value iv_proto is such that

iv_proto & IV_PROTO_TLS_KEY_EXPORT != 0,

with IV_PROTO_TLS_KEY_EXPORT = 1 << 3.

If it wants to use the exporter mechanism too, the server embeds the tls-ekm keyword in its answer message.
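The negotiation can be sketched as two small checks in Python. The bit test comes straight from the condition above; everything about the message formats is an assumption made for this example: the client information is taken to be a list of KEY=value lines containing IV_PROTO, and the server answer is taken to be a string of options separated by commas or spaces. The function names are hypothetical.

```python
import re

IV_PROTO_TLS_KEY_EXPORT = 1 << 3


def client_offers_key_export(peer_info: str) -> bool:
    """Return True if the IV_PROTO value has the key export bit set.

    Assumption: peer_info is made of KEY=value lines, one per line.
    """
    for line in peer_info.splitlines():
        key, _, value = line.partition("=")
        if key.strip() == "IV_PROTO":
            try:
                iv_proto = int(value.strip())
            except ValueError:
                return False
            return (iv_proto & IV_PROTO_TLS_KEY_EXPORT) != 0
    return False


def server_selects_key_export(answer: str) -> bool:
    """Return True if the tls-ekm keyword appears in the server answer.

    Assumption: options are separated by commas and/or whitespace.
    """
    tokens = re.split(r"[,\s]+", answer.strip())
    return "tls-ekm" in tokens
```

The exporter mechanism would then be used only when both checks succeed.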

HKDF-Expand() is defined in RFC 5869 and HKDF-Expand-Label() in section 7.1 of RFC 8446 (which specifies version 1.3 of TLS).

Key derivation is done in the following way (we suppose that SHA384 and TLSv1.3 are used):

Let label = b'EXPORTER-OpenVPN-datakeys',
Let data = SHA384(b''), i.e. data is the SHA384 hash of the empty string,
exportsecret = HKDF-Expand-Label(exporter_secret, label, data, 48, SHA384), where exporter_secret is computed during the handshake,
Let exporterlabel = b'exporter',
openvpn_crypto_material = HKDF-Expand-Label(exportsecret, exporterlabel, data, 256, SHA384).

Client and server secrets for the data channel are then defined as substrings of openvpn_crypto_material.
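The derivation above can be reproduced with the Python standard library. This is a sketch under the same assumptions as the text (SHA384 and TLSv1.3), with HKDF-Expand written from RFC 5869 and HKDF-Expand-Label from section 7.1 of RFC 8446 (which prefixes the label with "tls13 "). The function names are illustrative, and exporter_secret is assumed to have been obtained elsewhere, for instance from a keylog file.

```python
import hashlib
import hmac
import struct


def hkdf_expand(prk: bytes, info: bytes, length: int,
                hash_name: str = "sha384") -> bytes:
    """HKDF-Expand (RFC 5869): T(i) = HMAC(PRK, T(i-1) | info | i)."""
    hash_len = hashlib.new(hash_name).digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long for HKDF-Expand")
    out, block, counter = b"", b"", 1
    while len(out) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hash_name).digest()
        out += block
        counter += 1
    return out[:length]


def hkdf_expand_label(secret: bytes, label: bytes, context: bytes,
                      length: int, hash_name: str = "sha384") -> bytes:
    """HKDF-Expand-Label (RFC 8446, section 7.1)."""
    full_label = b"tls13 " + label
    info = (struct.pack("!H", length)
            + bytes([len(full_label)]) + full_label
            + bytes([len(context)]) + context)
    return hkdf_expand(secret, info, length, hash_name)


def openvpn_export_keys(exporter_secret: bytes) -> bytes:
    """Derive the 256 bytes of data channel material as described above."""
    data = hashlib.sha384(b"").digest()
    exportsecret = hkdf_expand_label(
        exporter_secret, b"EXPORTER-OpenVPN-datakeys", data, 48)
    return hkdf_expand_label(exportsecret, b"exporter", data, 256)
```

The two calls mirror the two HKDF-Expand-Label steps of the list, with the hash function passed as a default argument instead of an explicit fifth parameter.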

What’s inside a data packet?

The plaintext inside a data packet has the following structure:

plaintext = sequence number (4 bytes) + data + PKCS#5 padding.

If GCM mode is used, there is neither sequence number nor padding:

padding is not needed because GCM, unlike CBC, behaves like a stream cipher, and sequence numbers are replaced by GCM counters.
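For the CBC case, the plaintext layout can be illustrated with the following Python sketch. It assumes a 16-byte block size (as with AES) and a big-endian sequence number; both function names are made up for the example.

```python
import struct


def build_cbc_plaintext(sequence_number: int, data: bytes,
                        block_size: int = 16) -> bytes:
    """Build sequence number (4 bytes) + data + PKCS#5 padding."""
    # Assumption: the sequence number is encoded in network byte order.
    body = struct.pack("!I", sequence_number) + data
    # PKCS#5 padding always adds between 1 and block_size bytes.
    pad_len = block_size - (len(body) % block_size)
    return body + bytes([pad_len]) * pad_len


def parse_cbc_plaintext(plaintext: bytes, block_size: int = 16) -> tuple:
    """Strip the padding and return (sequence number, data)."""
    if not plaintext or len(plaintext) % block_size != 0:
        raise ValueError("plaintext length is not a multiple of the block size")
    pad_len = plaintext[-1]
    if not 1 <= pad_len <= block_size:
        raise ValueError("invalid padding length")
    if plaintext[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding bytes")
    body = plaintext[:-pad_len]
    if len(body) < 4:
        raise ValueError("missing sequence number")
    (sequence_number,) = struct.unpack("!I", body[:4])
    return sequence_number, body[4:]
```

In GCM mode neither function is needed, since the ciphertext has the same length as the data.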

OpenVPN dissector

It seems that Wireshark doesn’t offer the possibility to decrypt OpenVPN traffic.

To overcome this, the tool at https://github.com/T0lva/openvpn-dissector can dissect and decrypt OpenVPN traffic.
To do this, it dissects the TLS traffic to extract the client and server seeds (key-method-2 without the exporter mechanism) and detects whether the exporter mechanism is used instead.

Obviously nothing magic happens: a TLS keylog file containing the TLS keys negotiated during the handshake is required to decrypt the TLS traffic and, in the end, the OpenVPN traffic.

I generated this keylog file by using a patched version of OpenSSL (keys can be dumped in various functions of ssl/tls13_enc.c), but more sophisticated tools can be used,
such as peetch (https://github.com/quarkslab/peetch) or a gdb-based script.

tls-crypt and tls-crypt-v2

In addition to the tls-auth option, OpenVPN supports the tls-crypt and tls-crypt-v2 options.

When one of these options is used, the control channel messages are encrypted.

References

https://openvpn.net/about/
https://build.openvpn.net/doxygen/key_generation.html
https://community.openvpn.net/openvpn/wiki/SecurityOverview
