Created in 2001, OpenVPN is a widely used open source VPN protocol, and many VPN providers (such as those who sponsor YouTubers!) use it in their VPN clients.
The internals of this protocol were relatively obscure to me, so I decided to take a closer look at them.
For this, I had a look at the OpenVPN documentation (!) and the code itself.
I used the OpenVPN code from GitHub, commit a1cb1b47b138b9f654cd0bca5de6d08dbca61888.
Two authentication modes
OpenVPN supports two authentication modes: static key and TLS.
When static key mode is used, several static preshared keys are used for HMAC computation and encryption/decryption.
While relatively simple, this mode has the major flaw of not offering forward secrecy: someone who manages to steal these preshared keys can decrypt previously collected VPN traffic.
Static key mode should therefore be avoided, and we won’t talk about it anymore in this article.
When TLS mode is used, the OpenVPN peers establish a TLS session used to negotiate several tunnel options (such as the crypto primitives for the data channel or whether compression is used) and cryptographic material.
The TLS messages (handshake and application data) are not exchanged directly; instead, OpenVPN peers exchange OpenVPN messages embedding these TLS messages.
This exchange of TLS messages nested in OpenVPN messages is the OpenVPN control channel.
As the name suggests, this channel is used to control the configuration of the OpenVPN session and to negotiate cryptographic material for the data channel, which is the channel used to exchange data.
Two channels
As we’ve just explained, OpenVPN uses two separate channels:
- A data channel: This is the channel used to exchange application data.
- A control channel: This channel is used to establish the tunnel, negotiate various options and derive cryptographic material for the data channel.
The control channel uses TLS but the data channel doesn’t.
Timeline of an OpenVPN session
| ---- P_CONTROL_HARD_RESET_CLIENT_V2 ----> |
| <---- P_CONTROL_HARD_RESET_SERVER_V2 ---- |
| --------------- P_ACK_V1 ---------------> |
| ------- P_CONTROL_V1 ClientHello -------> |
| <-- P_CONTROL_V1 ServerHello, etc ------- |
| <-- P_CONTROL_V1, encrypted handshake --> |
| <-- P_CONTROL_V1, encrypted handshake --> |
| <--------- P_CONTROL_V1, TLS app -------- |
| <--------- P_CONTROL_V1, TLS app -------- |
| --------------- P_DATA_V2 --------------> |
| <-------------- P_DATA_V2 --------------- |
| (...) |
v v
As we can see, several packet types are involved in this diagram.
Let’s enumerate them!
OpenVPN packet types
- P_CONTROL_HARD_RESET_CLIENT_V2 (value 0x07): Sent by the client to forget the previous state and initialize a new key exchange,
- P_CONTROL_HARD_RESET_SERVER_V2 (value 0x08): Sent by the server to forget the previous state and initialize a new key exchange,
- P_CONTROL_V1 (value 0x04): Control channel packet, embeds TLS packets,
- P_ACK_V1 (value 0x05): Acknowledges a control packet,
- P_DATA_V2 (value 0x09): Data channel packet, the packet type used to transport the user data exchanged through the OpenVPN tunnel.
Packet structure
The structure of an OpenVPN packet depends on its type:
P_CONTROL_HARD_RESET_CLIENT_V2:
+------+------------+------+-----------+------+-----------------------------+-------------------------+
| Type | session-id | HMAC | Packet ID | Time | Message Packet ID Array Len | Message Packet ID Array |
+------+------------+------+-----------+------+-----------------------------+-------------------------+
<--1---><-----8-----><-----><----4----><--4---><--------------1------------->
P_CONTROL_HARD_RESET_SERVER_V2:
+------+------------+------+-----------+------+-----------------------------+-------------------------+-------------------+-------------------+
| Type | session-id | HMAC | Packet ID | Time | Message Packet ID Array Len | Message Packet ID Array | Remote session-id | Message Packet ID |
+------+------------+------+-----------+------+-----------------------------+-------------------------+-------------------+-------------------+
<--1---><-----8-----><-----><----4----><--4---><--------------1-------------><----------var-----------><--------8---------><---------4-------->
P_CONTROL_V1:
+------+------------+------+-----------+------+-----------------------------+-------------------------+-------------------+-------------+
| Type | session-id | HMAC | Packet ID | Time | Message Packet ID Array Len | Message Packet ID Array | Message Packet ID | TLS message |
+------+------------+------+-----------+------+-----------------------------+-------------------------+-------------------+-------------+
<--1---><-----8-----><-----><----4----><--4---><--------------1-------------><----------var-----------><--------4--------->
P_ACK_V1:
+------+------------+------+-----------+------+-----------------------------+-------------------------+-------------------+
| Type | session-id | HMAC | Packet ID | Time | Message Packet ID Array Len | Message Packet ID Array | Remote session-id |
+------+------------+------+-----------+------+-----------------------------+-------------------------+-------------------+
<--1---><-----8-----><-----><----4----><--4---><--------------1-------------><----------var-----------><--------8--------->
P_DATA_V2, if HMAC is used:
+------+---------+------+----+------------+
| Type | peer-id | HMAC | IV | ciphertext |
+------+---------+------+----+------------+
<--1---><---3---->
P_DATA_V2, if GCM mode is used:
+------+---------+-----------+---------+------------+
| Type | peer-id | GCM count | GCM Tag | ciphertext |
+------+---------+-----------+---------+------------+
<--1---><---3----><----4-----><---16--->
When GCM mode is used, the GCM additional data is the concatenation Type + Peer ID + GCM Count.
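As a rough illustration of the GCM layout above, here is a minimal Python sketch that splits a P_DATA_V2 packet into its fields and rebuilds the additional data. The function name parse_data_v2_gcm and the dictionary keys are hypothetical, and the sketch assumes big-endian fields and the 16-byte tag shown in the diagram.

```python
def parse_data_v2_gcm(packet: bytes) -> dict:
    """Split a P_DATA_V2 packet (GCM layout) into its fields."""
    header_len = 1 + 3 + 4 + 16  # Type + peer-id + GCM count + GCM tag
    if len(packet) < header_len:
        raise ValueError("packet too short for a P_DATA_V2 GCM header")
    return {
        "type": packet[0],
        "peer_id": int.from_bytes(packet[1:4], "big"),
        "gcm_count": packet[4:8],
        "gcm_tag": packet[8:24],
        "ciphertext": packet[24:],
        # Additional data: Type + Peer ID + GCM Count, i.e. the first 8 bytes.
        "aad": packet[:8],
    }


# Hand-made example packet (not captured traffic).
example = (bytes([0x48]) + b"\x00\x00\x01" + (7).to_bytes(4, "big")
           + b"T" * 16 + b"payload")
fields = parse_data_v2_gcm(example)
print(fields["peer_id"], fields["aad"].hex())
```

Since the additional data is just the first 8 bytes of the packet, a dissector can pass that slice straight to its GCM implementation without rebuilding anything.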
The packet fields are the following:
- Type = (Opcode << 3) | key_id, where key_id is the index of the control channel TLS session.
- peer-id (in P_DATA_V2 messages): Identifies the peer.
- session-id: Identifies the control messages of a given session.
- Packet ID: ID of the message, used as an anti-replay mechanism.
- Packet ID Array Len and Packet ID Array: An array of acknowledged messages.
- Message Packet ID: Identifier of the embedded message.
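To make the Type field concrete, here is a small Python sketch that packs and unpacks it. The opcode values are the ones listed earlier in this article; the helper names make_type and split_type are mine and do not come from the OpenVPN code.

```python
OPCODE_NAMES = {
    0x04: "P_CONTROL_V1",
    0x05: "P_ACK_V1",
    0x07: "P_CONTROL_HARD_RESET_CLIENT_V2",
    0x08: "P_CONTROL_HARD_RESET_SERVER_V2",
    0x09: "P_DATA_V2",
}


def make_type(opcode: int, key_id: int) -> int:
    """Pack a 5-bit opcode and a 3-bit key_id into the Type byte."""
    if not 0 <= opcode <= 0x1F or not 0 <= key_id <= 0x07:
        raise ValueError("opcode or key_id out of range")
    return (opcode << 3) | key_id


def split_type(type_byte: int) -> tuple:
    """Return (opcode, key_id) from the first byte of a packet."""
    return type_byte >> 3, type_byte & 0x07


opcode, key_id = split_type(0x38)
print(OPCODE_NAMES.get(opcode, "unknown"), key_id)
```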
HMAC firewall and tls-auth option
This option is enabled with the tls-auth directive, either in the configuration file or on the command line.
It adds an HMAC, computed with a preshared key, to every control channel packet, which protects the TLS messages exchanged before the peers are authenticated. Packets with an invalid HMAC are dropped, so an attacker who does not know the preshared key:
- cannot exploit a vulnerability in the code that handles the ClientHello (for instance) in the TLS stack on the server side,
- cannot interact with the OpenVPN server beyond this HMAC check and therefore cannot easily DoS it,
- cannot fingerprint the TLS part of the OpenVPN server (for instance, a malicious client cannot enumerate the TLS ciphersuites accepted by the server).
Data channel key generation
The cryptographic keys for the data channel can be generated:
- via key-method-1: Each peer generates its own random material, sends it to the remote peer and uses it as keys for encryption and HMAC computation. key-method-1 is no longer recommended.
- via key-method-2: Unlike key-method-1, this method derives the keys from randomness contributed by both peers. The derivation uses the PRF specified for TLS (section 5 of RFC 2246). This "old-fashioned OpenVPN key establishment" for the data channel uses client and server seeds exchanged through TLS application messages.
In more recent versions of OpenVPN, data key generation is done through the exporter mechanism: a specific key (the exporter key), generated during the TLS handshake, is expanded through an HKDF (described in RFC 5869) into cryptographic material for each side.
OpenVPN PRF
This key derivation function uses the following inputs:
- client_secret: Comes from the P_CONTROL_V1 TLS payload sent by the client,
- client_seed_1 & client_seed_2: Come from the P_CONTROL_V1 TLS payload sent by the client,
- server_seed_1 & server_seed_2: Come from the P_CONTROL_V1 TLS payload sent by the server.
The derivation involves the following operations (+ denotes concatenation):
- Define seed such that seed = client_seed_1 + server_seed_1,
- Define label such that label = b'OpenVPN master secret',
- openvpn_master_secret = PRF(client_secret, label, seed),
- Take the first 48 bytes of openvpn_master_secret: openvpn_master_secret = openvpn_master_secret[:48],
- seed = client_seed_2 + server_seed_2 + session_id_client + session_id_server,
- Define label such that label = b'OpenVPN key expansion',
- openvpn_keys = PRF(openvpn_master_secret, label, seed).
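The steps above can be sketched in Python as follows. This is only a sketch under a few assumptions: PRF is taken to be the TLS 1.0 PRF of RFC 2246 (P_MD5 XOR P_SHA1), the output length is passed explicitly (asking for 48 bytes is the same as keeping the first 48), and 256 bytes of key material are requested because the key offsets used in this article go up to 256. The function names are mine, and MD5 must be available in the local Python build.

```python
import hmac


def p_hash(hash_name: str, secret: bytes, seed: bytes, length: int) -> bytes:
    """P_hash expansion from RFC 2246, section 5."""
    out = b""
    a = seed  # A(0) = seed
    while len(out) < length:
        a = hmac.new(secret, a, hash_name).digest()  # A(i) = HMAC(secret, A(i-1))
        out += hmac.new(secret, a + seed, hash_name).digest()
    return out[:length]


def tls10_prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    """TLS 1.0 PRF: P_MD5(S1, label + seed) XOR P_SHA1(S2, label + seed)."""
    half = (len(secret) + 1) // 2  # the halves share one byte if the length is odd
    s1, s2 = secret[:half], secret[len(secret) - half:]
    md5_part = p_hash("md5", s1, label + seed, length)
    sha1_part = p_hash("sha1", s2, label + seed, length)
    return bytes(x ^ y for x, y in zip(md5_part, sha1_part))


def derive_openvpn_keys(client_secret, client_seed_1, client_seed_2,
                        server_seed_1, server_seed_2,
                        session_id_client, session_id_server) -> bytes:
    """Follow the two PRF steps: master secret, then key expansion."""
    master = tls10_prf(client_secret, b"OpenVPN master secret",
                       client_seed_1 + server_seed_1, 48)
    seed = client_seed_2 + server_seed_2 + session_id_client + session_id_server
    return tls10_prf(master, b"OpenVPN key expansion", seed, 256)


# Dummy inputs, only to show the call shape (sizes are assumptions).
openvpn_keys = derive_openvpn_keys(b"\x01" * 48, b"\x02" * 32, b"\x03" * 32,
                                   b"\x04" * 32, b"\x05" * 32,
                                   b"\x06" * 8, b"\x07" * 8)
print(len(openvpn_keys))
```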
The cryptographic keys are then extracted from openvpn_keys:
- client_data_cipher_key = openvpn_keys[ : 32]. If the chosen encryption algorithm uses a 16-byte key (such as AES-128), only the first 16 bytes of client_data_cipher_key are used.
- server_data_cipher_key = openvpn_keys[128 : 160].
If data packet integrity is provided by an HMAC algorithm:
- client_data_hmac_key = openvpn_keys[64 : 128],
- server_data_hmac_key = openvpn_keys[192 : 256].
If data packet integrity is provided by the GCM tag:
- client_gcm_counter = openvpn_keys[64 : 72],
- server_gcm_counter = openvpn_keys[192 : 200].
These two GCM counter values (also known as the implicit IV) are 8 bytes long, while the GCM nonce is 12 bytes long.
The 4 missing bytes are the explicit part of the counter, carried in the GCM count field of the data packet.
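Here is a short Python sketch of this key slicing and of the nonce assembly. The offsets are the ones given above; the helper names are hypothetical, and the byte order of the nonce (explicit GCM count first, implicit part second) is an assumption that should be checked against the OpenVPN source before relying on it.

```python
def split_data_keys(openvpn_keys: bytes, cipher_key_len: int = 32) -> dict:
    """Slice the 256 bytes of key material into per-direction keys."""
    if len(openvpn_keys) < 256:
        raise ValueError("expected at least 256 bytes of key material")
    return {
        # Shorter ciphers (e.g. 16-byte keys) only use the first bytes.
        "client_data_cipher_key": openvpn_keys[0:32][:cipher_key_len],
        "server_data_cipher_key": openvpn_keys[128:160][:cipher_key_len],
        # HMAC case.
        "client_data_hmac_key": openvpn_keys[64:128],
        "server_data_hmac_key": openvpn_keys[192:256],
        # GCM case: 8-byte implicit part of the nonce.
        "client_gcm_counter": openvpn_keys[64:72],
        "server_gcm_counter": openvpn_keys[192:200],
    }


def gcm_nonce(gcm_count: bytes, implicit_iv: bytes) -> bytes:
    """Build the 12-byte nonce (assumed order: explicit part, then implicit)."""
    if len(gcm_count) != 4 or len(implicit_iv) != 8:
        raise ValueError("expected a 4-byte count and an 8-byte implicit IV")
    return gcm_count + implicit_iv


# Recognisable dummy key material: byte i has value i.
keys = split_data_keys(bytes(range(256)), cipher_key_len=16)
nonce = gcm_nonce((1).to_bytes(4, "big"), keys["client_gcm_counter"])
print(nonce.hex())
```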
The TLS PRF
The TLS PRF is described in section 5 of RFC 2246 (the RFC that describes version 1.0 of TLS).
Exporter mechanism
If configured to use this mechanism, one of the TLS messages sent by the client embeds an IV_PROTO token whose value iv_proto is such that (iv_proto & IV_PROTO_TLS_KEY_EXPORT) != 0, with IV_PROTO_TLS_KEY_EXPORT = 1 << 3.
If it wants to use the exporter mechanism too, the server embeds the tls-ekm keyword in its answer message.
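The client-side flag check can be sketched in Python like this. I assume here that the client options arrive as newline-separated KEY=value lines; the function name and the sample strings are made up for the example.

```python
IV_PROTO_TLS_KEY_EXPORT = 1 << 3


def client_supports_key_export(peer_info: str) -> bool:
    """Return True if the IV_PROTO value has the key export bit set."""
    for line in peer_info.splitlines():
        key, _, value = line.partition("=")
        if key == "IV_PROTO":
            return (int(value) & IV_PROTO_TLS_KEY_EXPORT) != 0
    return False  # no IV_PROTO token at all


# Made-up sample values: 14 has bit 3 set, 6 does not.
print(client_supports_key_export("IV_PROTO=14"))
print(client_supports_key_export("IV_PROTO=6"))
```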
HKDF-Expand() is defined in RFC 5869 and HKDF-Expand-Label() in section 7.1 of RFC 8446 (which specifies version 1.3 of TLS).
Key derivation is done in the following way (assuming that SHA384 and TLSv1.3 are used):
- Let label = b'EXPORTER-OpenVPN-datakeys',
- Let data = SHA384(b''), i.e. data is the SHA384 hash of the empty string,
- exportsecret = HKDF-Expand-Label(exporter_secret, label, data, 48, SHA384), where exporter_secret is computed during the handshake,
- Let exporterlabel = b'exporter',
- openvpn_crypto_material = HKDF-Expand-Label(exportsecret, exporterlabel, data, 256, SHA384).
Client and server secrets for the data channel are then defined as substrings of openvpn_crypto_material.
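The exporter derivation above can be sketched in pure Python with the standard hmac and hashlib modules. HKDF-Expand and HKDF-Expand-Label are written from their definitions in RFC 5869 and RFC 8446; the function names are mine, the exporter_secret used in the example is a dummy value, and SHA384 is assumed as in the text.

```python
import hashlib
import hmac


def hkdf_expand(prk: bytes, info: bytes, length: int,
                hash_name: str = "sha384") -> bytes:
    """HKDF-Expand from RFC 5869: T(i) = HMAC(PRK, T(i-1) + info + i)."""
    out, block, counter = b"", b"", 1
    while len(out) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hash_name).digest()
        out += block
        counter += 1
    return out[:length]


def hkdf_expand_label(secret: bytes, label: bytes, context: bytes,
                      length: int, hash_name: str = "sha384") -> bytes:
    """HKDF-Expand-Label from RFC 8446, section 7.1."""
    full_label = b"tls13 " + label
    hkdf_label = (length.to_bytes(2, "big")
                  + bytes([len(full_label)]) + full_label
                  + bytes([len(context)]) + context)
    return hkdf_expand(secret, hkdf_label, length, hash_name)


def openvpn_exporter(exporter_secret: bytes) -> bytes:
    """Derive the 256 bytes of data channel material described above."""
    data = hashlib.sha384(b"").digest()
    exportsecret = hkdf_expand_label(exporter_secret,
                                     b"EXPORTER-OpenVPN-datakeys", data, 48)
    return hkdf_expand_label(exportsecret, b"exporter", data, 256)


# Dummy exporter secret; the real one comes from the TLS 1.3 handshake.
openvpn_crypto_material = openvpn_exporter(b"\x11" * 48)
print(len(openvpn_crypto_material))
```

In practice the exporter_secret would come from a TLS keylogfile, or the whole computation would be delegated to the TLS library's exporter API.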
What’s inside a data packet?
The plaintext inside a data packet has the following structure: plaintext = sequence number (4 bytes) + data + PKCS#5 padding.
If GCM mode is used, there is neither sequence number nor padding: padding is not needed because GCM, unlike CBC, behaves like a stream mode, and sequence numbers are replaced by GCM counters.
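For the CBC case, the plaintext layout can be sketched in Python as below. The helper names are hypothetical, and the sketch assumes a 16-byte block size (as with AES) and a big-endian sequence number.

```python
def build_cbc_plaintext(sequence_number: int, data: bytes,
                        block_size: int = 16) -> bytes:
    """Return sequence number (4 bytes) + data + PKCS#5-style padding."""
    body = sequence_number.to_bytes(4, "big") + data
    pad_len = block_size - len(body) % block_size  # always between 1 and block_size
    return body + bytes([pad_len]) * pad_len


def parse_cbc_plaintext(plaintext: bytes, block_size: int = 16) -> tuple:
    """Check and strip the padding, then return (sequence number, data)."""
    if not plaintext or len(plaintext) % block_size != 0:
        raise ValueError("plaintext length is not a multiple of the block size")
    pad_len = plaintext[-1]
    if not 1 <= pad_len <= block_size:
        raise ValueError("invalid padding length")
    if plaintext[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding bytes")
    body = plaintext[:-pad_len]
    return int.from_bytes(body[:4], "big"), body[4:]


padded = build_cbc_plaintext(1, b"hello")
print(len(padded), parse_cbc_plaintext(padded))
```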
OpenVPN dissector
It seems that Wireshark doesn’t offer the possibility to decrypt OpenVPN traffic.
To overcome this, the tool at https://github.com/T0lva/openvpn-dissector can dissect and decrypt OpenVPN traffic.
To do this, it dissects the TLS traffic to extract the client & server seeds (key-method-2 without the exporter mechanism) and detects whether the exporter mechanism is used instead.
Obviously nothing magic happens: a TLS keylogfile containing the TLS keys negotiated during the handshake is required to decrypt the TLS traffic and, in the end, the OpenVPN traffic.
I generated this keylogfile by using a patched version of OpenSSL (keys can be dumped in various functions of ssl/tls13_enc.c), but more sophisticated tools can be used, such as peetch (https://github.com/quarkslab/peetch) or a gdb-based script.
tls-crypt and tls-crypt-v2
In addition to the tls-auth option, OpenVPN supports the tls-crypt and tls-crypt-v2 options.
When one of these options is used, the control channel messages are encrypted.
Reference
- https://openvpn.net/about/
- https://build.openvpn.net/doxygen/key_generation.html
- https://community.openvpn.net/openvpn/wiki/SecurityOverview