[guardian-dev] Blog post: VoIP Security Architecture

Tom Ritter tom at ritter.vg
Sat Nov 23 00:02:09 EST 2013


On 22 November 2013 15:40, elijah <elijah at riseup.net> wrote:

> On 11/22/2013 12:03 PM, Lee Azzarello wrote:
>
>> How could the RTP channel be encrypted prior to key agreement
>> through a verbal SAS confirmation?
>>
>
> None of the session encryption keys are derived from the SAS. It is the
> other way around. Both the session encryption keys and the SAS are
> derived from the result of the unauthenticated DH exchange (in the case
> of the first time contact) or preshared key (in some cases where you
> have talked previously).
>
> Failing the SAS doesn't do anything except alert you that you are
> probably being MiTM'ed.
>
> From
> http://zfoneproject.com/docs/ietf/rfc6189.html#SASVerifiedFlag:
>
>> A user interface element (i.e., a checkbox or button) is needed to
>> allow the user to tell the software the SAS verify was successful,
>> causing the software to set the SAS Verified flag (V), which
>> (together with our cached shared secret) obviates the need to perform
>> the SAS procedure in the next call. An additional user interface
>> element can be provided to let the user tell the software he detected
>> an actual SAS mismatch, which indicates a MiTM attack. The software
>> can then take appropriate action, clearing the SAS Verified flag, and
>> erase the cached shared secret from this session. It is up to the
>> implementer to decide if this added user interface complexity is
>> warranted.
>>
>
> In effect, the SAS is just an after the fact authentication of the
> unauthenticated DH, and it is up to the application to decide how to handle
> a SAS mismatch.
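
To make the direction of derivation concrete, here is a rough sketch of
the idea, not the actual ZRTP key derivation from RFC 6189. The function
names, labels, and output lengths are made up for illustration, and plain
HMAC-SHA256 stands in for the real KDF. The only point is that the
session key and the SAS are both computed from the same shared secret,
and neither is computed from the other.

```python
import hashlib
import hmac


def derive(shared_secret: bytes, label: bytes, length: int) -> bytes:
    """Simplified stand-in KDF (an assumption, not the RFC 6189 KDF):
    HMAC-SHA256 keyed by the shared secret over a purpose label."""
    return hmac.new(shared_secret, label, hashlib.sha256).digest()[:length]


def session_key(shared_secret: bytes) -> bytes:
    # The media encryption key comes from the shared secret alone,
    # so it exists before anyone has read a SAS aloud.
    return derive(shared_secret, b"session key", 16)


def sas(shared_secret: bytes) -> str:
    # The SAS is a short digest of the same secret under a different
    # label. It is rendered as hex here purely for readability.
    return derive(shared_secret, b"SAS", 4).hex()


if __name__ == "__main__":
    secret = b"result of the DH exchange (placeholder)"
    print("session key:", session_key(secret).hex())
    print("SAS:        ", sas(secret))
```

Both endpoints that hold the same secret compute the same SAS, so reading
it aloud only checks after the fact that the two ends agree on that
secret.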



In addition to what Elijah said, I think it's worth noting that the SAS can
ONLY authenticate the channel if you *recognize the other person's voice*.
If you don't, you can't have any confidence that there isn't a very
sophisticated attacker in the middle who is relaying the call through human
speakers with a delay of a second or so.

The attacker would complete a separate ZRTP exchange with each party and
use two native speakers, one listening to Alice and one listening to Bob,
each repeating exactly what they hear, *except* when the SAS is spoken. At
that point each speaker says the correct SAS for the ZRTP session they are
impersonating.  The non-malicious parties to the call would only notice
this if they knew the voice of their communication partner and realized
that this was not who they were talking to.
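
A toy model of why the attacker has to re-voice the SAS is sketched
below. Everything in it is an assumption for illustration: the tiny DH
group, the name "Mallory", and the HMAC-based SAS are not what ZRTP
actually uses. It only shows that the two legs of the call end up with
different secrets, and therefore different SAS values.

```python
import hashlib
import hmac
import secrets

# Toy DH parameters: 2**127 - 1 is prime, but far too small for real use.
P = 2**127 - 1
G = 3


def keypair():
    """Return a random (private, public) pair in the toy group."""
    private = secrets.randbelow(P - 3) + 2
    return private, pow(G, private, P)


def shared_secret(my_private: int, their_public: int) -> bytes:
    return pow(their_public, my_private, P).to_bytes(16, "big")


def sas(secret: bytes) -> str:
    # Simplified stand-in for the SAS derivation, rendered as hex.
    return hmac.new(secret, b"SAS", hashlib.sha256).digest()[:4].hex()


def mitm_call() -> dict:
    """Mallory runs one exchange with Alice and a separate one with Bob."""
    alice_priv, alice_pub = keypair()
    bob_priv, bob_pub = keypair()
    to_alice_priv, to_alice_pub = keypair()  # Mallory's key facing Alice
    to_bob_priv, to_bob_pub = keypair()      # Mallory's key facing Bob
    return {
        "alice": sas(shared_secret(alice_priv, to_alice_pub)),
        "mallory_to_alice": sas(shared_secret(to_alice_priv, alice_pub)),
        "mallory_to_bob": sas(shared_secret(to_bob_priv, bob_pub)),
        "bob": sas(shared_secret(bob_priv, to_bob_pub)),
    }


if __name__ == "__main__":
    for name, value in mitm_call().items():
        print(f"{name:17} {value}")
```

Alice's SAS matches the one Mallory holds for her leg, and Bob's matches
the one for his leg, but the two legs disagree. If Alice heard Bob's real
voice read his SAS, the mismatch would be obvious; the relay speakers
hide it by substituting the value that belongs to each leg.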

The security of ZRTP relies on recognizing the other party and/or the
difficulty and expense of performing such an attack.

-tom