
I use this extensively from a lot of network elements.

For check frequency as a keep-alive, anywhere from 30 to 60 seconds is generally good, BUT I strongly urge that it be made configurable and not hard-coded: if it is used for client trunk polling, it can become overwhelming at scale when every trunk is on the same timer. If used in this fashion, I also recommend adding a random 1-1000 ms delay to each ping to prevent 10,000 SIP messages from hitting your border element all at once every 30 seconds.

As far as details go, I also recommend the following:

1: Make the From, To and Request-URI user part configurable. Metaswitch has these statically coded to be "metaswitch", which can cause two issues. The first is that if you are pinging out of your network you are revealing information about your infrastructure, which is bad (but not so serious here). The second is that if you are pinging a strict proxy like OpenSIPS, it will try to route the ping instead of responding to it locally (I am aware you can configure OpenSIPS to fix this, but it's a silly workaround).

2: Allow the hop count (Max-Forwards) to be set, to prevent unnecessary forwarding.

3: If used for device heartbeats, a failed ping should take a device down. A successful ping OR a successful call should bring a device up. Waiting for a successful ping to bring a device up can dramatically slow network convergence after an interruption.

4: Existing media sessions should not be interrupted by a failed OPTIONS ping; instead, you should be using session timers or some other orphaned-call detection to do cleanup there. Failed OPTIONS pings should stop new dialogs from being established, but existing dialogs should survive or fail per dialog based on RFC 4028 or some other dialog-centric method.

This is all based on nothing more than my own meandering experience and the readings from my sack of voodoo chicken bones.

-Ryan

On 06/25/2012 07:32 AM, Frank Bulk wrote:
Does anyone have some real-world experience with SIP OPTIONS PING? Our softswitch vendor is looking to implement support and is seeking some input on check frequency, response timeouts, how quickly to recheck when a trunk is down, and how long to wait after it's up before restoring the SIP trunk group. And should there be a ramp-up period through some kind of rate-limiting?
If you have a product that does it well, what parameters does it use?
How do you think the software should handle existing calls that may or may not have active media flows? Should it tear down calls immediately, use some kind of active media detection, or wait x minutes before tearing down calls?
Any input here or offline would be appreciated.
Kind regards,
Frank
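
For anyone who wants something concrete, the recommendations in Ryan's reply (configurable interval, 1-1000 ms of random jitter, configurable user part and hop count, and the up/down rules) could be sketched roughly as below. This is only an illustration under assumptions: `PingConfig`, `next_ping_delay`, `build_options` and `DeviceHealth` are hypothetical names, not any vendor's API, the defaults are just the numbers mentioned in the thread, and the sketch does no actual network I/O or response-timeout handling.

```python
import random
from dataclasses import dataclass


@dataclass
class PingConfig:
    # Hypothetical settings; every value is meant to be operator-configurable.
    interval_s: float = 30.0   # keep-alive interval (30 to 60 s suggested)
    max_jitter_ms: int = 1000  # upper bound of the random per-ping delay
    user_part: str = "ping"    # user part for From, To and Request-URI
    max_forwards: int = 0      # hop count (assumed default; small values limit forwarding)


def next_ping_delay(cfg, rng=random):
    """Seconds until the next ping: base interval plus 1..max_jitter_ms of jitter."""
    return cfg.interval_s + rng.randint(1, cfg.max_jitter_ms) / 1000.0


def build_options(cfg, target_host, local_host, call_id, branch, tag, cseq=1):
    """Build a minimal SIP OPTIONS request as text (UDP transport assumed)."""
    uri = f"sip:{cfg.user_part}@{target_host}"
    lines = [
        f"OPTIONS {uri} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_host};branch=z9hG4bK{branch}",
        f"Max-Forwards: {cfg.max_forwards}",
        f"From: <sip:{cfg.user_part}@{local_host}>;tag={tag}",
        f"To: <{uri}>",
        f"Call-ID: {call_id}",
        f"CSeq: {cseq} OPTIONS",
        "Content-Length: 0",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


class DeviceHealth:
    """Tracks whether NEW dialogs may be sent to a device.

    Existing dialogs are deliberately not touched here; their cleanup is
    left to session timers (RFC 4028) or other per-dialog detection.
    """

    def __init__(self):
        self.up = True

    def on_ping_result(self, ok):
        # A failed ping takes the device down; a successful ping brings it up.
        self.up = ok

    def on_call_success(self):
        # A successful call also brings the device up, without waiting for a ping.
        self.up = True

    def allow_new_dialog(self):
        return self.up
```

A scheduler would sleep for `next_ping_delay(cfg)` between pings per trunk, so that 10,000 trunks on the same 30-second interval drift apart instead of firing together, and would consult `allow_new_dialog()` only when routing new calls.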
_______________________________________________ VoiceOps mailing list VoiceOps at voiceops.org https://puck.nether.net/mailman/listinfo/voiceops