Keeping it secret: VoIP call encryption

In my last column (21st Century Phone Tapping), I discussed the problem of unauthorised eavesdropping on VoIP calls. This column looks at some solutions.

There are many different VoIP protocols, including the Session Initiation Protocol (SIP), H.323 and a number of proprietary protocols such as Cisco’s Skinny.

All of these have one thing in common: they are signalling protocols that manage call control but do not carry the media stream (voice or video). Media is carried by a separate protocol, the Real-time Transport Protocol (RTP).

To protect any VoIP call from eavesdropping, all you have to do is encrypt the media stream. A published standard, Secure RTP (SRTP), defines how to do this using the AES algorithm.
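To make the "media stream" a little more concrete, here is a minimal Python sketch that splits an RTP packet into its fixed 12-byte header and its payload. SRTP encrypts the payload and leaves the header readable so that the network can still route and order packets. The function name `split_rtp_packet` is my own, and the sketch ignores header extensions, so treat it as an illustration rather than a complete parser.

```python
import struct


def split_rtp_packet(packet: bytes):
    """Split an RTP packet into (header fields, payload).

    Simplification: header extensions (the X bit) are not handled.
    """
    if len(packet) < 12:
        raise ValueError("an RTP packet has at least a 12-byte header")

    # Fixed header: two flag bytes, sequence number, timestamp, SSRC.
    b0, b1, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])

    csrc_count = b0 & 0x0F               # number of 4-byte CSRC entries
    header_len = 12 + 4 * csrc_count

    header = {
        "version": b0 >> 6,              # 2 for RTP
        "payload_type": b1 & 0x7F,       # codec identifier
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,                    # identifies the sending source
    }
    # Everything after the header is the voice or video data,
    # which is the part SRTP encrypts.
    return header, packet[header_len:]
```

Anyone on the path can read the header fields either way; without SRTP they can read the payload as well.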

Unfortunately, it is not quite that simple. SRTP details how to encrypt a media stream but does not define how the encryption keys are generated. Each voice call needs two keys (one for each direction), and these must be agreed using a separate key exchange protocol.

In the SIP world there are many key exchange protocols to choose from. The two most widely deployed are SDES, which exchanges encryption keys as part of the call setup, and ZRTP, designed by Phil Zimmermann, the creator of PGP.

The problem with SDES is that because it uses the signalling stream for key exchange, any intermediate device that has to process signalling will also be able to read the encryption keys. This means that the operator of those devices could also capture and later decrypt your “secure” call.
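To see why, consider how SDES carries the key: as a base64 string in an `a=crypto` line of the call's session description. The Python sketch below builds such a line and then recovers the key exactly as an intermediate device could. The helper names are hypothetical and the parsing covers only the simple inline case, assuming the common AES_CM_128_HMAC_SHA1_80 suite with a 16-byte master key and a 14-byte master salt.

```python
import base64


def build_crypto_line(master_key: bytes, master_salt: bytes, tag: int = 1) -> str:
    """Build an SDES crypto attribute carrying the key and salt in base64."""
    key_salt = base64.b64encode(master_key + master_salt).decode("ascii")
    return f"a=crypto:{tag} AES_CM_128_HMAC_SHA1_80 inline:{key_salt}"


def extract_key_material(sdp: str):
    """Return (suite, master_key, master_salt) from the first crypto line.

    This is all an on-path signalling device needs to do.
    """
    for line in sdp.splitlines():
        if not line.startswith("a=crypto:"):
            continue
        _tag, suite, key_params = line[len("a=crypto:"):].split(" ", 2)
        method, _, key_info = key_params.partition(":")
        if method != "inline":
            continue
        # Drop any optional lifetime/MKI fields and session parameters.
        encoded = key_info.split()[0].split("|")[0]
        raw = base64.b64decode(encoded)
        return suite, raw[:16], raw[16:30]
    return None
```

No cryptography is broken here; the key is simply read out of the signalling. SDES is only as private as every hop that handles the call setup.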

ZRTP avoids this problem by negotiating the keys in the media stream using a Diffie-Hellman key exchange, which prevents intermediate devices from capturing the keys. ZRTP is available in a number of devices, including many cell phones.
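ZRTP's key agreement is built on a Diffie-Hellman exchange, and the idea is easy to sketch in Python. Each side sends only a public value, and both arrive at the same secret without it ever crossing the network. The prime and generator below are toy values chosen for readability, not the parameters ZRTP actually uses, and the final hash is a stand-in for ZRTP's real key derivation.

```python
import hashlib
import secrets

# Toy parameters for illustration only; far too small for real use.
P = 2**127 - 1   # a Mersenne prime
G = 3


def keypair():
    """Return (private, public) for one endpoint."""
    private = secrets.randbelow(P - 3) + 2   # random value in [2, P - 2]
    public = pow(G, private, P)              # only this value is sent
    return private, public


def shared_key(own_private: int, peer_public: int) -> bytes:
    """Combine our private value with the peer's public value."""
    secret = pow(peer_public, own_private, P)
    # Stand-in key derivation: hash the shared secret to 32 bytes.
    return hashlib.sha256(secret.to_bytes(16, "big")).digest()


# Each phone generates its own pair and swaps public values.
alice_private, alice_public = keypair()
bob_private, bob_public = keypair()

# Both ends compute the same key; an observer saw only the public values.
alice_key = shared_key(alice_private, bob_public)
bob_key = shared_key(bob_private, alice_public)
```

A device in the middle that merely relays the packets sees `alice_public` and `bob_public` but has neither private value, so it cannot compute the key.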

This means that VoIP and cell phone users are now able to make secure calls back to their office, safe in the knowledge that the call cannot be monitored by any fixed or mobile network operator.