Bad/not unique SIP Call-ID GUIDs

Hi everyone,

Am I the only one to run into an unusually high share of duplicate Call-IDs?

A SIP Call-ID is supposed to be a GUID (Globally Unique Identifier), and RFC 3261 is quite clear on just how unique it needs to be. From 8.1.1.4:

  In a new request created by a UAC outside of any dialog, the Call-ID header field MUST be selected by the UAC as a globally unique identifier over space and time unless overridden by method-specific behavior. All SIP UAs must have a means to guarantee that the Call-ID header fields they produce will not be inadvertently generated by any other UA. [...]

  Use of cryptographically random identifiers (RFC 1750 [12]) in the generation of Call-IDs is RECOMMENDED. Implementations MAY use the form "localid@host". Call-IDs are case-sensitive and are simply compared byte-by-byte.

  Using cryptographically random identifiers provides some protection against session hijacking and reduces the likelihood of unintentional Call-ID collisions.

In theory, this means that no other SIP message series initiated by any other UA, anywhere, ever, should have that same identifier. In practice, that's a pretty tall order to guarantee absolutely by algorithmic means. However, it's certainly possible to get pretty close from a probabilistic perspective. A strong cryptographic digest function, combined with time and/or RTC-seeded pseudorandom values and some sort of host-specific value, should yield a composite that is unique for all practical purposes.

It is with this understanding in mind that pretty much all CDRs in our least cost routing and other SIP service delivery solutions are keyed off of the Call-ID. The Call-ID is the search key on our CDR tables and, for a variety of logical and performance reasons, has a unique constraint on the index. Call-IDs are also used in any situation requiring the organisation of calls into a single-key hash or tree structure of some description.

We use them because they're the only "unique" element that conveniently persists across all SIP messages that we would want to logically group together, irrespective of whether the message is a request or a reply, an initial or an in-dialog request, etc. They're easy to extract from the message, and just about anything that operates on SIP messages at a relatively low level exposes some interface by which one can extract the Call-ID. Anything else would rely on implementing an additional, uniform composite hash function of some kind in many different, unrelated software components that all utilise the same database backing, IPC, etc.

That's what they're supposed to be used for. Call-IDs are GUIDs, and GUIDs make great keys.

Long story short, the problem I've been running into recently is that a surprisingly high number of CDRs generated by certain customers using our solutions are bounced by the database because of Call-ID collisions. I think the quotation from the scrolls above quite clearly spells out that this is singularly the fault of the UACs involved, and while I won't name any names publicly, there are certainly some vendors in particular on whom I have my eye.

Nevertheless, it is presenting an accounting problem for us and some of our ITSP customers. I'd be curious to know if anyone else has run into it, and what solutions you may have adopted to deal with it.

Thanks!

--
Alex Balashov - Principal
Evariste Systems LLC
1170 Peachtree Street 12th Floor, Suite 1200
Atlanta, GA 30309
Tel: +1-678-954-0670
Fax: +1-404-961-1892
Web: http://www.evaristesys.com/
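For reference, a minimal sketch of what a well-behaved UAC could do when generating a Call-ID along the lines RFC 3261 recommends. This is an illustration only, not any particular vendor's implementation: the function name new_call_id and the choice of 128 random bits are assumptions made for the example.

```python
import secrets
import socket
from typing import Optional


def new_call_id(host: Optional[str] = None) -> str:
    """Return a Call-ID of the form localid@host (hypothetical helper)."""
    # 16 bytes (128 bits) from the OS CSPRNG, hex-encoded to 32 characters.
    local_id = secrets.token_hex(16)
    # The host part is optional in RFC 3261; this sketch defaults to the
    # local hostname when the caller does not supply one.
    return f"{local_id}@{host or socket.gethostname()}"


if __name__ == "__main__":
    print(new_call_id("ua.example.com"))
```

With a random local part of this size, collisions between independent UAs should be vanishingly unlikely. A UA that builds the local part from a small counter or a coarse timestamp would be expected to produce exactly the duplicates described above.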

I've been yammering at vendors for years for some way to influence the callID composition. It's never caused us much trouble, but I can see where it would.

If you need uniqueness, you probably need to do it yourself. If it bugged us, we'd throw a mediator in front of the db to add a customer identifier to the callID or something. There's no problem that you can't solve with one more level of abstraction :)

<RANT> As long as "rough consensus and running code" is enough to thrill the IETF and SIP Forum, their beloved MAYs and SHOULDs will continue to drain much of the value from their product. </RANT>

David Hiers
CCIE (R/S, V), CISSP
ADP Dealer Services
2525 SW 1st Ave. Suite 300W
Portland, OR 97201
o: 503-205-4467 f: 503-402-3277
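A rough sketch of the mediator idea, assuming the mediator already knows which customer a CDR belongs to. The names store_cdr and CdrStore are hypothetical, and the dictionary is only a stand-in for the CDR table. Keying on the pair (customer_id, call_id) avoids the separator ambiguity that string concatenation would introduce.

```python
from typing import Dict, Tuple

# Hypothetical in-memory stand-in for the CDR table, keyed on
# (customer_id, call_id) instead of the bare Call-ID.
CdrStore = Dict[Tuple[str, str], dict]


def store_cdr(store: CdrStore, customer_id: str, call_id: str, cdr: dict) -> bool:
    """Insert a CDR under the composite key; return False on a collision."""
    # Call-IDs are case-sensitive and compared byte-by-byte, so the value
    # is used exactly as received, with no normalisation.
    key = (customer_id, call_id)
    if key in store:
        return False
    store[key] = cdr
    return True


if __name__ == "__main__":
    store: CdrStore = {}
    print(store_cdr(store, "cust-a", "1234@10.0.0.1", {"duration": 10}))
    print(store_cdr(store, "cust-b", "1234@10.0.0.1", {"duration": 20}))
```

This only separates collisions between different customers. Two calls from the same customer that reuse a Call-ID would still collide.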

On 08/02/2010 11:15 AM, Hiers, David wrote:
> If you need uniqueness, you probably need to do it yourself.
Agreed, and that would be my preferred solution, except that we are integrating disparate network elements that all share the requirement to discern individual message series, but whose "programmatic" environments vary substantially in their capabilities. Rolling my own uniqueness works in a general-purpose programming environment where certain data structure primitives and library functions can be counted on, but not in more narrowly conceived, domain-specific runtimes.
> If it bugged us, we'd throw a mediator in front of the db to add a customer identifier to the callID or something. There's no problem that you can't solve with one more level of abstraction :)
Not always possible, especially in platforms we don't fully control. This may be an issue somewhat particular to what we do.

-- Alex Balashov

Not something I'd ever come across, but the best solution I would think would be to create an auto-increment key that is unique (based on incrementing values) within the database.

You can also look into what system it is that is producing the non-unique Call-IDs and try to work with that vendor. In theory, non-unique Call-IDs could really mess up the state of a UA. Tags are normally meant to help combat that, but this kind of thing is where SIP implementations (and the underlying interpretations of the RFCs) really vary.

-Scott
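The auto-increment suggestion could look roughly like the following sketch, which uses SQLite purely for illustration. The table and column names (cdr, call_id, duration) and the helper functions are assumptions, not anyone's actual schema. The Call-ID stays indexed for lookups, but the index is no longer unique, so a duplicate no longer bounces the insert.

```python
import sqlite3


def make_cdr_db() -> sqlite3.Connection:
    """Create an in-memory CDR table with a surrogate primary key."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE cdr (
            id       INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
            call_id  TEXT NOT NULL,                      -- as received; may repeat
            duration INTEGER NOT NULL
        );
        CREATE INDEX cdr_call_id_idx ON cdr (call_id);   -- non-unique lookup index
        """
    )
    return conn


def insert_cdr(conn: sqlite3.Connection, call_id: str, duration: int) -> int:
    """Insert one CDR and return its surrogate id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO cdr (call_id, duration) VALUES (?, ?)",
            (call_id, duration),
        )
    return cur.lastrowid


if __name__ == "__main__":
    db = make_cdr_db()
    insert_cdr(db, "dup@host", 30)
    insert_cdr(db, "dup@host", 45)
    print(db.execute("SELECT id, call_id, duration FROM cdr").fetchall())
```

The trade-off is that a lookup by Call-ID may now return more than one row, so anything that correlates SIP messages to CDRs by Call-ID alone has to cope with that ambiguity.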

On 08/02/2010 12:56 PM, Scott Berkman wrote:
> Not something I'd ever come across, but the best solution I would think would be to create an auto-increment key that is unique (based on incrementing values) within the database.
Well, yes, unique primary keys are great. :-) That's what I'd ordinarily do with a Call-ID in many situations, and that's what's causing the problem; when there is a primary key collision, the entire transaction is rolled back. The problem, which I mentioned but perhaps underemphasised, is that I need a value that is not only unique but occurs straightforwardly within the SIP message itself, so that it is possible to easily associate messages at a higher level of abstraction. That's the purpose that a Call-ID is intended to serve.
> You can also look into what system it is that is producing the non-unique call IDs and try to work with that Vendor. In theory, non-unique call ID's could really mess up the state of a UA. Tags are normally meant to help combat that, but this kind of thing is where SIP implementations (and the underlying interpretations of the RFC's) really vary.
Tags are dialog-bound identifiers. The beauty of the Call-ID is that it unites a superset of messages, including many sequential interactions which do not form a dialog.

-- Alex Balashov
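To illustrate the difference between the two groupings, here is a small sketch under simplifying assumptions. Msg is a hypothetical, already-parsed message record, and a dialog is approximated as the Call-ID plus the unordered pair of tags. A forked INVITE is used as the example: all three messages share one Call-ID, but the two responses belong to different dialogs and the initial request belongs to none.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, List, NamedTuple, Optional, Tuple


class Msg(NamedTuple):
    """Hypothetical parsed SIP message, reduced to the fields used here."""
    call_id: str
    from_tag: str
    to_tag: Optional[str]  # None on an initial request, before any dialog exists
    summary: str


def group_by_call_id(msgs: Iterable[Msg]) -> Dict[str, List[str]]:
    """Group every message, dialog or not, under its Call-ID."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for m in msgs:
        groups[m.call_id].append(m.summary)
    return dict(groups)


def group_by_dialog(msgs: Iterable[Msg]) -> Dict[Tuple[str, FrozenSet[str]], List[str]]:
    """Group only messages that carry both tags, keyed per dialog."""
    groups: Dict[Tuple[str, FrozenSet[str]], List[str]] = defaultdict(list)
    for m in msgs:
        if m.to_tag is None:
            continue  # no To tag yet, so the message is not part of a dialog
        # The tag pair is unordered so that requests sent by either side
        # of the dialog land in the same group.
        groups[(m.call_id, frozenset({m.from_tag, m.to_tag}))].append(m.summary)
    return dict(groups)


if __name__ == "__main__":
    msgs = [
        Msg("c1@host", "a", None, "INVITE"),
        Msg("c1@host", "a", "b1", "180 Ringing"),
        Msg("c1@host", "a", "b2", "200 OK"),
    ]
    print(group_by_call_id(msgs))
    print(group_by_dialog(msgs))
```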
participants (3)
- abalashov@evaristesys.com
- David_Hiers@adp.com
- scott@sberkman.net