Bad/not unique SIP Call-ID GUIDs

July 31, 2010

Hi everyone,

Am I the only one to run into an unusually high share of duplicate 
Call-IDs?

A SIP Call-ID is supposed to be a GUID (Globally Unique Identifier), 
and RFC 3261 is quite clear on just how unique it needs to be.  From 
8.1.1.4:

    In a new request created by a UAC outside of any dialog, the
    Call-ID header field MUST be selected by the UAC as a globally
    unique identifier over space and time unless overridden by
    method-specific behavior. All SIP UAs must have a means to
    guarantee that the Call-ID header fields they produce will
    not be inadvertently generated by any other UA.

    [...]

    Use of cryptographically random identifiers (RFC 1750 [12]) in
    the generation of Call-IDs is RECOMMENDED.  Implementations
    MAY use the form "localid@host".  Call-IDs are case-sensitive
    and are simply compared byte-by-byte.

       Using cryptographically random identifiers provides some
       protection against session hijacking and reduces the
       likelihood of unintentional Call-ID collisions.

In theory, this means that no other series of SIP messages initiated by 
any other UA, anywhere, ever, should carry that same identifier.

In practice, that's a tall order to guarantee absolutely by algorithmic 
means.  However, it's certainly possible to get very close in 
probabilistic terms.  A strong cryptographic digest function applied to 
a composite of time and/or RTC-seeded pseudorandom values and some sort 
of host-specific value should yield an identifier that is unique for 
all practical purposes.
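
As a rough illustration of how close one can get, here is a minimal 
Python sketch of a generator in the "localid@host" form, drawing the 
local part from the operating system's cryptographic random source as 
the RFC recommends.  The names new_call_id and collision_probability 
are hypothetical, and the 128-bit length is an assumption rather than 
anything the RFC mandates.  The second function is only the usual 
birthday-bound approximation, and it holds only if the random source is 
actually good.

```python
import math
import secrets
import socket


def new_call_id(host=None):
    """Return a Call-ID of the form localid@host (hypothetical helper)."""
    # 128 bits from the OS CSPRNG; uniqueness rests on this part alone.
    local_id = secrets.token_hex(16)
    # The host part is an assumption for readability; it adds no entropy.
    return f"{local_id}@{host or socket.gethostname()}"


def collision_probability(n_calls, bits=128):
    """Birthday-bound estimate: p ~= 1 - exp(-n^2 / 2^(bits + 1))."""
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-(n_calls ** 2) / 2.0 ** (bits + 1))
```

Under those assumptions, even a trillion calls leave the estimated 
chance of any collision far below one in a trillion, which is why a 
noticeable collision rate points at the generator rather than at bad 
luck.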

It is with this understanding in mind that pretty much all CDRs in our 
least cost routing and other SIP service delivery solutions are keyed 
off of the Call-ID.  The Call-ID is the search key on our CDR tables, 
and for a variety of logical and performance reasons has a unique 
constraint on the index.  Call-IDs are also used in any situation 
requiring the organisation of calls into a single-key hash or tree 
structure of some description.
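
To make the setup concrete, here is a minimal sketch using Python's 
built-in sqlite3 module.  The table layout, the column names, and the 
insert_cdr helper are all hypothetical and are not our actual schema; 
the only point is the unique constraint on the Call-ID column and what 
happens when a second record arrives with the same value.

```python
import sqlite3

# In-memory database purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE cdr (
        call_id    TEXT NOT NULL UNIQUE,  -- search key, unique index
        caller     TEXT,
        callee     TEXT,
        duration_s INTEGER
    )
    """
)


def insert_cdr(conn, call_id, caller, callee, duration_s):
    """Insert one CDR; return False if the Call-ID already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO cdr VALUES (?, ?, ?, ?)",
                (call_id, caller, callee, duration_s),
            )
        return True
    except sqlite3.IntegrityError:
        # The unique constraint rejected a duplicate Call-ID.
        return False
```

With this layout, a second call that reuses an existing Call-ID is 
rejected outright and its record never reaches the table.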

We use them because they're the only "unique" element that 
conveniently persists across all SIP messages that we would want to 
logically group together, irrespectively of whether the message is a 
request, a reply, whether it's an initial or an in-dialog request, 
etc.  They're easy to extract from the message headers, and just about 
anything that operates on SIP messages at a relatively low level 
exposes some interface by which one can extract the Call-ID.  Anything 
else would rely on implementing an additional, uniform composite hash 
function of some kind in many different, unrelated software components 
that all utilise the same database backing, IPC, etc.
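
For a sense of how little is involved, here is a minimal Python sketch 
that pulls the Call-ID out of a raw SIP message.  The function name 
extract_call_id is hypothetical, and the sketch assumes CRLF line 
endings and no folded header lines; a real SIP stack does considerably 
more.  It accepts both the full header name and the compact form "i" 
defined in RFC 3261, and it works the same way on requests and replies.

```python
def extract_call_id(message):
    """Return the Call-ID value from a raw SIP message, or None."""
    # Headers end at the first empty line; ignore the body entirely.
    head = message.split("\r\n\r\n", 1)[0]
    # Skip the first line (request line or status line).
    for line in head.split("\r\n")[1:]:
        name, sep, value = line.partition(":")
        # Header names are case-insensitive; "i" is the compact form.
        if sep and name.strip().lower() in ("call-id", "i"):
            return value.strip()
    return None
```

The value is returned untouched, since Call-IDs are case-sensitive and 
compared byte-by-byte.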

That's what they're supposed to be used for.  Call-IDs are GUIDs, and 
GUIDs make great keys.

Long story short, the problem I've been running into recently is that 
a surprisingly high number of CDRs generated by certain customers 
using our solutions are bounced by the database because of Call-ID 
collisions.  I think the quotation from the scrolls above quite 
clearly spells out that this is singularly the fault of the UACs 
involved, and while I won't name any names publicly, there are 
certainly some vendors in particular on whom I have my eye.

Nevertheless, it is presenting an accounting problem for us and some 
of our ITSP customers.  I'd be curious to know if anyone else has run 
into it, and what solutions you may have adopted to deal with it.

Thanks!

-- 
Alex Balashov - Principal
Evariste Systems LLC
1170 Peachtree Street
12th Floor, Suite 1200
Atlanta, GA 30309
Tel: +1-678-954-0670
Fax: +1-404-961-1892
Web: http://www.evaristesys.com/

abalashov＠evaristesys.com
