Implementing Web Push with NaviServer

Introduction

Web Push notifications are a relatively recent technology. It allows one to send messages to the user out of context of your web application, i.e. even when the respective browsertab is closed. This can be very useful for time-sensitive information or for alerting the user of some event. At the time of writing (2018) this feature is supported by Google Chrome and Mozilla Firefox. For mobile devices, currently, only the Android operation system supports it.

Libraries in various languages exist that hide the implementation details of Web Push, however, as far as we know, no open source Tcl implementation exists so far. In this article we describe how we filled that gap using NaviServer and created a simple API open for anyone to use.

The Web Push Protocol is a standard that specifies how the delivery of such messages works. Both Chrome and Firefox implement this protocol, so your application does not have to distinguish which browser your user is working with. Before we get into details how it works we are going to define some terms.

Web Push Terminology

A notification is a message to the user that appears outside of the normal user interface, e.g. in the bottom left of the desktop instead of inside the browser.
A push message is the message that is sent over the internet from the server to the client. These messages often follow the fire-and-forget pattern, however, it is also possible to specify a lifetime up to 28 days, i.e. they are delivered on the next start of the browser.
Push notifications are notifications that are created due to a push message.
Browsers implement a push service which is responsible for delivering a push message to the correct client.

The Web Push Protocol

In essence, the Web Push Protocol works as follows: The client subscribes to your push notifications using a public key which identifies your application. This is done using the Push API implemented by the browser (based on the Service Worker API, see as well Push API, W3C Working Draft 15 Dec 2017). The result of this process is a subscription that usually looks something like this:

    {
      "endpoint": "https://fcm.googleapis.com/fcm/send/fKx5MYaL80w:APA91bF3ssm...",
      "keys": {
         "p256dh": "BP0n1og3aSVEUu1dlovLK_wIa9qkqJpsJ…",
         "auth": "By49Q9zIMeIt376ycxbq-w"
      }
    }

This subscription uniquely identifies the client device and browser. Your application server must collect and save these subscriptions since you need them every time you want to send a push message. The endpoint is the URL where you will send your post request to in order to trigger a push notification. The keys field is needed for data bearing push messages, which we explain more detailed in a later section.

VAPID

For a push service to forward your message to the client it needs to be certain that it actually came from you. This is ensured through a JSON Web Token (jwt) that must be included in the header of every push message. This jwt must be created according to the VAPID specification (see also: RFC 8292: Voluntary Application Server Identification (VAPID) for Web Push).

Generating VAPID keys

The first thing you will need is a private key that is used for identification. The Web Push and VAPID encryption mechanisms use the elliptic-curve Diffie-Hellman (ECDH) protocol for key agreement. We will not go into detail here how it works, just note that this algorithm uses geometric curves to derive keys.
Many different curves exist; for Web Push we need a curve called "prime 256v1" (abbreviated“P-256”). To create a new private key in Tcl we can use the NaviServer crypto API:

    ns_crypto::eckey generate \
        -name prime256v1 \
        -pem MyPrivateKey.pem

This call generates a new private key based on the curve "prime256v1" in PEM file format with the specified file name. This PEM file should be kept at a secure location, since anyone who has access to it could send notifications to your users. Behind the scenes NaviServer uses OpenSSL for cryptography so this is essentially the same as:

    openssl ecparam \
        -name prime256v1 \
        -genkey \
        -noout \
        -out MyPrivateKey.pem

In the next step, we derive a public key based on the private key (which is possible for elliptic curves):

    ns_crypto::eckey pub \
        -pem MyPrivateKey.pem \
        -encoding base64url

This function returns as result the corresponding public key in DER format base64url encoded. Note that this is not exactly the same output as OpenSSL’s openssl ecparam ... -pubout ... function. The first few bytes in the standard PEM format define which curve is used. Since this is clear for Web Push by specification this is left out for public keys. Public keys are 65 bytes long. Hence, with the OpenSSL command line itnerface, it is necessary to strip the unneeded content via unix shell commands:

    openssl ecparam \
        -in MyPrivateKey.pem \
        -pubout \
        -outform DER \
        | tail -c 65 | base64 | tr -d '=' | tr '/+' '-_'

This key pair identifies your application to the push service. The client uses the public key in the subscription process and you prove your identity to the push service by signing a jwt with your private key for each request.

Signing a jwt

A jwt consists of three parts: header, payload and signature. These are separated by dots, so a typical jwt looks like this:

xx.yy.zz

For Web Push the first part is always the same. It contains the string {typ:"JWT",alg:"ES256"} base64url encoded which is eyJ0eXAiOiJKV1QiLCJhbGciOiJFUzI1NiJ9
The payload consists of a JSON object containing a set of claims. For the above token the following claims were used:
```
{"sub":"mailto:georg@test.com","aud":"https://fcm.googleapis.com","exp":"1531846615"}
      
```
These three fields are mandatory. They are specified as follows:
1. sub contains an address to contact your administrators. This is usually a mailto or a website. The push service providers will use this if something goes wrong with your feed.
2. aud is the audience of the jwt. This must contain the scheme and host of the push service. For the endpoint in our example (https://fcm.googleapis.com/fcm/send/fKx5MYaL80w:APA91bF3ssm...) this field has to contain https://fcm.googleapis.com. Beware no not add a trailing slash, Google’s push service rejects requests if you do.
3. exp This is the Unix time in seconds when the VAPID header expires. This must not be more than 24 hours from the time of the request. Because of the risk of a replay attack long-lived headers are not recommended.
To finish the body of the jwt you must first create a JSON formatted string of the claim without any whitespaces and newlines. The base64url encoded version of this string constitutes the body of the jwt.
The header and the body of the jwt separated by a “.” make up the input for the signature. NaviServer offers a function for this:
```
    ns_crypto::md vapidsign \
        -digest sha256 \
        -encoding base64url \
        -pem MyPrivateKey.pem \
        INPUT
```
This uses your private key to sign the jwt. In our example INPUT is:

eyJ0eXAiOiJKV1QiLCJhbGciOiJFUzI1NiJ9.eyJzdWIiOiJtYWlsdG86Z2VvcmdAdGVzdC5jb20iLCJhdWQiOiJodHRwczovL2ZjbS5nb29nbGVhcGlzLmNvbSIsImV4cCI6IjE1MzE4NDY2MTUifQ

The result is the final part of the jwt which is appended after another “.”. Our final jwt looks as follows:

eyJ0eXAiOiJKV1QiLCJhbGciOiJFUzI1NiJ9.eyJzdWIiOiJtYWlsdG86Z2VvcmdAdGVzdC5jb20iLCJhdWQiOiJodHRwczovL2ZjbS5nb29nbGVhcGlzLmNvbSIsImV4cCI6IjE1MzE4NDY2MTUifQ.0hnVHb0ylGKqT6vTSm95V2nvVbeSRRFnqHA2xlOmdBPRL_C40R9QurRxt9p0pOakeEqQo1Ge0qC00XoOVTB_Eg
The signature will look different for you since you will be using your own private key to sign it.

Sending a push message

To send a push message you send a post request to the endpoint of a subscription. For the push service to accept your request you need to set the authorization header appropriately. It consists of a string in the form “vapid t=JWT,k=MyPublicKey” where JWT is the signed jwt and MyPublicKey is the public part of the key the jwt was signed with in base64url encoding (the one we created with ns_crypto::eckey pub …). For our example it looks as follows:

vapid t=eyJ0eXAiOiJKV1QiLCJhbGciOiJFUzI1NiJ9.eyJzdWIiOiJtYWlsdG86Z2VvcmdAdGVzdC5jb20iLCJhdWQiOiJodHRwczovL2ZjbS5nb29nbGVhcGlzLmNvbSIsImV4cCI6IjE1MzE4NDY2MTUifQ.0hnVHb0ylGKqT6vTSm95V2nvVbeSRRFnqHA2xlOmdBPRL_C40R9QurRxt9p0pOakeEqQo1Ge0qC00XoOVTB_Eg,k=BFzhXP5G5Pp5xmEfESPsd7L6N2oQZZypGd2tUR5diW9spzJFs5DXaUuM1iMVfZGunUhtHkyYjqPfcQ2bfzKzbeY

Note that there are no newlines in this string; they are inserted here just for readability purposes. The last thing we need is to set the TTL header. This the time-to-live in seconds of the push message. If the user is not online the push service will retain the message for that amount of time, delivering it as soon as the user comes back. The maximum TTL value varies between push services but is usually about one month. In the response of the post request you receive the TTL value that was set by the push service. If the return value is lower than the value you sent you know that you reached the maximum.
We are now able to send push notifications to a user. The following curl command shows an example post request:

    curl -v -X POST \
      -H "Authorization: vapid t=eyJ0eXAiOiJKV1QiLCJhbGciOiJFUzI1NiJ9.eyJzdW..." \
      -H "TTL: 0" \
      "https://fcm.googleapis.com/fcm/send/fKx5MYaL80w:APA91bF3ssm..."

Note that the body of the post request is empty. This is a push message without payload, i.e. the client receives a push event but without any added data. Push messages with payload require more sophisticated encryption which we are going to describe in the next section.

Payload Encryption for Web Push

To send a push message with payload proper encryption is required. This ensures that your data can only be read by the client that receives the message, i.e. also the push service has no access. Here is a rough overview of the necessary steps:

Generate a new set of keys for each message.
Derive a shared secret using the Diffie-Hellman-algorithm and compute encryption parameters
Encrypt the payload
Format a post request according to a given mode (we cover "aesgcm" and the newer "aes128gcm")

For every data bearing push message it is necessary to generate a new pair of keys. We do this the same way we did for VAPID:

    ns_crypto::eckey generate \
        -name prime256v1 \
        -pem TempPrivateKey.pem
    ns_crypto::eckey pub \
        -pem TempPrivateKey.pem \
        -encoding base64url

These keys are temporary, i.e. they should be used only once for a single push message to a single client.
The next step is deriving the shared secret between your server and the client. The “p256dh” field in the subscription contains the public key of the client. We use this field and our temporary private key to come to the shared secret. Using NaviServer this can be done using the following command:

    ns_crypto::eckey sharedsecret \
        -pem TempPrivateKey \
        -encoding base64url \
        p256dh

Note that this function expects binary input. The “p256dh” field in a subscription is usually in base64url format so it needs to be converted to binary first, e.g. using ns_base64urldecode.
Now that we have the shared secret we can derive the encryption key and nonce. To that end a HMAC-based key derivation function (HKDF) is used repeatedly. What this function essentially does is taking a cryptographically weak input, usually a string, and making it cryptographically strong in a deterministic way. NaviServer offers this function:

    ns_crypto::md hkdf -digest sha256 \
        -salt     randomValues \
        -secret   inputKeyMaterial \
        -info     infostring \
        -encoding binary \
        32

Salt is a random value that adds cryptographic strength to the output. The secret parameter contains the input key material. Info is optional and contains application specific information. HKDF can create outputs of arbitrary length up to 255 times as long as the hash used (SHA-256 for Web Push). It is specified as 32 in the above example.
How exactly the input key material and infostring are created and HKDF is used varies between the GCM modes which is why we are going to look at them separately.

Encryption parameters in aesgcm

At the time of writing, aesgcm is the most common GCM mode for Web Push. The first step is to create the input key material (ikm). This is the first use of the HKDF function. The parameters for it must be set as follows:

salt: The “auth” field of the subscription is used here. Note that ns_crypto::md hkdf expects this parameter in binary format.
secret: The shared secret derived from the client public key (“p256dh” of subscription) and our temporary private key again in binary format
info: In aesgcm the info string consists of the string “Content-Encoding: auth” followed by one null byte.
length: The length of the input key material must be 32 bytes

In code it looks something like this:

    ns_crypto::md hkdf -digest sha256 \
        -salt   auth \
        -secret sharedSecret \
        -info   “Content-Encoding auth\x00” \
        -encoding binary \
         32

Creating encryption key and nonce

To create the encryption key and nonce we use again the HKDF. The first thing we need is salt. For Web Push we need to create salt containing 16 random bytes. Salt should be unique for every single push message. Using NaviServer this can be done as follows:

    ns_crypto::randombytes -encoding binary 16

Next we need to generate two info strings. One for the encryption key and one for the nonce. In aesgcm these are constructed as follows:

“Content-Encoding: “ followed by either “aesgcm” or “nonce” followed by one null byte followed by “P-256” followed by one null byte followed by the length of the client public key as a 16-bit big-endian integer (this integer should always have the value 65) followed by the client public key (contents of the “p256dh” field in binary format) followed by the length of the temporary server public key as a 16-bit big-endian integer (again 65) followed by the temporary server public key in binary format.

Hence, this string contains the public key we created previously solely for encrypting a single push message. Now we have all ingredients to construct the encryption key and nonce:

    set key [ns_crypto::md hkdf -digest sha256 \
        -salt salt \
        -secret ikm \
        -info keyInfoString \
        -encoding binary \
        16]

    set nonce [ns_crypto::md hkdf -digest sha256 \
        -salt salt \
        -secret ikm \
        -info nonceInfoString \
        -encoding binary \
        12]

The length of the encryption key is 16 bytes whereas the length of the nonce is 12 bytes. The secret parameter ikm refers to the input key material created in the previous step.

Padding

Padding is used to avoid potential attackers to distinguish different types of messages based on the length of the payload. In aesgcm padding consists of a 16-bit big-endian integer (two bytes) indicating how many bytes of padding there is followed by that number of null bytes. Padding is prepended to the data before encryption. Padding is optional, so you do not have to add any padding if you do not want to; however, even if you add no padding the first two bytes must then contain the integer 0 encoded in 16 bits.

Encrypting the payload

To encrypt our payload we use the cipher aes-128-gcm. The nonce serves as an initialisation vector (iv). Our post request also needs to contain the authentication tag which is 16 bytes long for Web Push. NaviServer provides encryption functionality as follows:

    ns_crypto::aead::encrypt string \
        -cipher aes-128-gcm \
        -iv nonce \
        -key encryptionKey \
        -encoding binary
         paddedData

This returns a dictionary containing the ciphertext (“bytes”) and the authentication tag (“tag”). The body of the post request must contain the ciphertext followed by the tag. For one request the maximum payload size is 4096 bytes. Hence, you have 4078 bytes for your data: 4096 - 16 for the tag - 2 for mandatory padding length. Note that for some endpoints, e.g. on various mobile devices, the maximum payload size might actually be lower than these 4096 bytes.

Formatting headers in aesgcm

The post request for a push message with payload must contain these headers:

Encryption: salt=mySalt
Crypto-Key: dh=tempPubKey
Content-Encoding: aesgcm
Content-Type: application/octet-stream

mySalt is the salt we used in the encryption process. tempPubKey is the public key of the keypair we created for this one message. Both parameters need to be encoded in base64url format. Since we send the data in raw binary we set the content-type to application/octet-stream. The body of the post request contains our encrypted data (ciphertext plus authentication tag).
To identify yourself you also need to add the authorization header containing your VAPID details. As in requests without payload the TTL header needs to be set as well.

Encryption parameters in aes128gcm

Aes128gcm is the newest GCM mode that is used for Web Push. It is as well the standard proposed by RFC 8188 (Encrypted Content-Encoding for HTTP). This standard was published in 2017, so some push services might not support it yet (most prominently, current versions of Chrome and Firefox support it). We will only highlight the differences to aesgcm:

The infostring for the initial HKDF call that creates the initial key material differs from aesgcm. In aesgcm this string was “Content-Encoding: auth\x00”. For aes128gcm this string also contains the client and server public keys in binary format. It is constructed as follows:

“WebPush: info” followed by a null byte followed by the client public key (“p256dh”) in binary format followed by the temporary server public key in binary format.
The infostring for the second and third HKDF calls that create the encryption key and nonce is static in aes128gcm: It contains the string “Content-Encoding: “ followed by “aes128gcm” respectively “nonce” followed by a null byte.
Padding is appended (in contrast to prepended in aesgcm) to the data in aes128gcm. A delimiter octet “\x02” is expected between the data and the padding. The padding itself again consists of null bytes.
The salt and temporary server public key are prepended as a header to the ciphertext and tag in the body of the post request. The header consists of the following:

salt (16bytes) | rs (4 bytes) | pubKeyLen (1 byte) | pubKey (65 bytes)
The record size (rs) is a 32-bit big endian integer. The final payload must be less than this number of bytes. It is allowed to simply set this to the maximum payload size of 4096. Next is the length of key that follows as 8-bit integer. Since the length of public keys for Web Push is 65 this will always be the number 65 encoded in 8 bits. The last part is the temporary public key that was created for this push message. These 86 bytes need to be the first bytes in the payload of an aes128gcm request. Hence, the final form of the body should look like this: aes128gcmheader | ciphertext | authenticationtag
Encryption and crypto-key headers are not used for aes128gcm since this information is stored in the body of the request.

Tcl API for Web Push and NaviServer

We created a simple to use Tcl API that builds on NaviServer which handles all of the VAPID and encryption details for you. As a user of this API all you need to do is call a single function to send a push notification to a client:

    webpush::send \
       -subscription /subscription/ \
       -data /data/ \
       -claim /claim/ \
       -privateKeyPem /privateKeyPem/ \
       -localKeyPath /localKeyPath/ \
       ?-mode /mode/? \
       ?-timeout /timeout/?\
       ?-ttl /ttl/?

The subscription parameter is a dict that contains at least an “endpoint”. If you want to send push notifications with payload this dict also needs to contain an “auth” and a “p256dh” field. Note that this is not a nested dictionary as the standard JSON subscription object, i.e. no “keys” field is expected.
data contains the data you want to send as a payload. Make sure not to exceed that maximum payload size, otherwise an exception will be thrown.
claim is a dict containing at least a “sub” field. If “aud” and “exp” fields exist they will be validated, otherwise the will be extracted from the endpoint respectively set to a default.
privateKeyPem is the path to a pem file containing the private key you want to use for VAPID.
localKeyPath is a path to a directory with write access to create temporary keys for encryption.
mode GCM mode, can be set to “aesgcm” which is the default or “aes128gcm”.
timeout Is the timeout value for the post request in seconds.
ttl is the time to live in seconds for the push message.
And that is all you need to do to successfully send a push message to a client. Of course, the public key the client subscribed to must match the private key used here.

Encryption

In case you want to assemble your post requests yourself you can just use our API for encrypting your data. This is done using the following function:

    webpush::encrypt \
      -data /data/ \
      -privateKeyPem /privateKeyPem/ \
      -auth /auth/ \
      -p256dh /p256dh/ \
      -salt /salt/ \
      -mode /mode/

This function encrypts the data using a private key in form of a pem file, the auth secret, the client public key (p256dh) and a salt value. auth, p256dh, salt are expected in binary format. Currently "aesgcm" and "aes128gcm" are supportet content encryption modes. The encrypted data is returned in binary format.

Decryption

If you want to write your own client or want to use this encryption scheme for other purposes you can also use our API to decrypt messages using the following function:

    webpush::decrypt \
      -encrData /encrData/ \
      -privateKeyPem /privateKeyPem/ \
      -auth /auth/ \
      -mode /mode/ \
      ?-serverPubKey /serverPubKey/? \
      ?-salt /salt/?

As input you need the encrypted data, the private key matching the public key that was used for encryption, i.e. the private key matching the public key in “p256dh” and the auth as well as the salt used for encryption. Currently supported GCM modes are “aesgcm” and “aes128gcm”.

Conclusion

Hopefully we could shed some light on how the Web Push protocol as well as payload encryption works. Of course, one could simply use on of the available libraries and be done with it, however, it is often good to understand what is actually happening in order to evaluate all risks of a new technology appropriately.

References

W3C: Push API, W3C Working Draft 15 Dec 2017
RFC 8030: Generic Event Delivery Using HTTP Push
RFC 8291: Message Encryption for Web Push
RFC 5288: AES Galois Counter Mode (GCM) Cipher Suites for TLS
RFC 8188: Encrypted Content-Encoding for HTTP
RFC 8292: Voluntary Application Server Identification (VAPID) for Web Push
Mozilla: Push API
Mozilla: Service Worker API