
Does anyone have some real-world experience with SIP OPTIONS PING? Our softswitch vendor is looking to implement support and is seeking input on check frequency, response timeouts, how quickly to recheck after a peer goes down, and how long to wait after it comes back up before restoring the SIP trunk group. And should there be a ramp-up period through some kind of rate-limiting?

If you have a product that does it well, what parameters does it use?

How do you think the software should handle existing calls that may or may not have active media flows? Should it tear down calls immediately, use some kind of active media detection, or wait x minutes before tearing down calls?

Any input here or offline would be appreciated.

Kind regards,
Frank

On 06/25/12 09:32 -0500, Frank Bulk wrote:
> Does anyone have some real-world experience with SIP OPTIONS PING? [...]

We poll once every 60 seconds from our Metaswitch when communicating with unauthenticated (or authenticated by IP address) SIP trunk peers. I don't recall whether that value was suggested to us or whether we picked it ourselves.

Without digging into our switch documentation, I do not know what happens if a single OPTIONS response is not received, but most likely two (or more) responses would need to be missed before the SIP binding goes into alarm and is taken down. I suspect that the SIP binding comes back up on the next response.

In my observations, the contents of the SIP OPTIONS response have no bearing. In fact, an error response (such as "not supported") is sufficient to keep the SIP binding up, if I recall correctly.

-- Dan White
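
The behaviour described above (several missed responses take the binding down, the next response of any kind brings it back) can be sketched roughly as follows. This is an illustration only, not Metaswitch's actual logic; the class name `PeerHealth` and the threshold of 2 are assumptions.

```python
from dataclasses import dataclass


@dataclass
class PeerHealth:
    """Tracks one SIP peer's up/down state from OPTIONS poll results."""
    miss_threshold: int = 2  # assumed value, standing in for "two (or more)"
    misses: int = 0
    up: bool = True

    def record(self, got_response: bool) -> bool:
        """Feed one poll result and return the resulting up/down state."""
        if got_response:
            # Any response counts, including an error such as 501.
            self.misses = 0
            self.up = True
        else:
            self.misses += 1
            if self.misses >= self.miss_threshold:
                self.up = False
        return self.up


peer = PeerHealth()
# One answered poll, three unanswered polls, then an answered poll.
states = [peer.record(r) for r in (True, False, False, False, True)]
print(states)  # [True, True, False, False, True]
```

A single lost datagram does not flap the peer here, which is the point of requiring more than one consecutive miss.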

I poll every 30 seconds. The failure threshold is 2 unanswered OPTIONS; after that the peer is declared failed. Depending on how critical the device you are polling is, you may want to poll a lot more often than every 30 seconds.

Christian Pena

-----Original Message-----
From: voiceops-bounces at voiceops.org [mailto:voiceops-bounces at voiceops.org] On Behalf Of Dan White
Sent: Monday, June 25, 2012 11:35 AM
To: Frank Bulk
Cc: voiceops at voiceops.org
Subject: Re: [VoiceOps] SIP OPTIONS PING

We poll once every 60 seconds from our Metaswitch [...]

_______________________________________________
VoiceOps mailing list
VoiceOps at voiceops.org
https://puck.nether.net/mailman/listinfo/voiceops
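
To see why a critical peer might warrant faster polling, it helps to work out how long an outage can go unnoticed for a given interval and miss threshold. The sketch below assumes polls go out on a fixed timer, that a poll counts as missed once a response timeout expires, and that the timeout is shorter than the interval. The 4-second timeout is an invented figure, not one quoted in this thread.

```python
def detection_window(interval_s: float, miss_threshold: int,
                     timeout_s: float) -> tuple:
    """Return (best, worst) seconds between a peer failing and being marked down.

    Best case: the peer dies just before a poll is sent.
    Worst case: the peer dies just after answering a poll.
    """
    best = (miss_threshold - 1) * interval_s + timeout_s
    worst = miss_threshold * interval_s + timeout_s
    return best, worst


# 30 s and 60 s are the intervals mentioned in the thread; 4 s is assumed.
for interval in (30, 60):
    best, worst = detection_window(interval, miss_threshold=2, timeout_s=4)
    print(f"interval={interval}s: down after {best}s to {worst}s")
```

With a threshold of 2, halving the interval roughly halves the time during which calls are still sent to a dead peer.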

> In my observations, the contents of the SIP options response has no bearing. In fact, an error response (such as "not supported") is sufficient to keep the sip binding up, if I recall correctly.

You will also have the joy of making sure your providers support SIP OPTIONS. Depending on the provider, I get any one of these back: 501, 403, 484, 404, 405, 408, 502. Some just plain don't respond, but the big guys usually do.
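
Given that spread of answers, a probe probably should not insist on a 200. Below is a minimal sketch that treats any parsable SIP status line as proof that something is listening. The function names and the `down_codes` knob are hypothetical. Whether codes that an intermediate proxy can generate on its own (408 or 502, for instance) should still count as "up" is a policy choice left to the operator.

```python
import re
from typing import Optional

# A SIP status line looks like "SIP/2.0 501 Not Implemented".
STATUS_LINE = re.compile(r"^SIP/2\.0 (\d{3})(?: |$)")


def parse_status(datagram: Optional[bytes]) -> Optional[int]:
    """Return the status code of a SIP response, or None if there is none."""
    if not datagram:
        return None  # timeout or empty read
    first_line = datagram.split(b"\r\n", 1)[0].decode("ascii", errors="replace")
    match = STATUS_LINE.match(first_line)
    return int(match.group(1)) if match else None


def peer_reachable(datagram: Optional[bytes],
                   down_codes: frozenset = frozenset()) -> bool:
    """Any SIP response counts as reachable unless its code is in down_codes."""
    status = parse_status(datagram)
    return status is not None and status not in down_codes


print(peer_reachable(b"SIP/2.0 405 Method Not Allowed\r\n\r\n"))  # True
print(peer_reachable(None))                                       # False
print(peer_reachable(b"SIP/2.0 502 Bad Gateway\r\n\r\n",
                     down_codes=frozenset({408, 502})))           # False
```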

We use SIP OPTIONS pings in our smokeping setup (using a probe that has sipsak send the OPTIONS request) to measure latency and packet loss to remote endpoints.

Also, the SONUS gear will blacklist an IP address if it doesn't reply to an OPTIONS request, and will not send any more traffic to that endpoint until it starts replying again.
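
For the measurement use case, the per-probe results reduce to a loss percentage and a few latency figures. The sketch below shows only that arithmetic. It does not call sipsak or smokeping, and the sample values are made up for illustration.

```python
from statistics import median
from typing import Optional, Sequence


def summarize(rtts_ms: Sequence[Optional[float]]) -> dict:
    """Summarize OPTIONS round-trip times; None marks an unanswered probe."""
    answered = [r for r in rtts_ms if r is not None]
    sent = len(rtts_ms)
    loss_pct = 100.0 * (sent - len(answered)) / sent if sent else 0.0
    return {
        "sent": sent,
        "loss_pct": loss_pct,
        "median_ms": median(answered) if answered else None,
        "max_ms": max(answered) if answered else None,
    }


samples = [21.4, 19.8, None, 22.1, 20.3]  # made-up values, one probe lost
summary = summarize(samples)
print(summary)
```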

I use this extensively from a lot of network elements.

For check frequency as a keep-alive, anywhere from 30 to 60 seconds is generally good, BUT I strongly urge that it be made configurable and not hard-coded, since if used for client trunk polling this can become overwhelming at scale if they are all on the same timer. If used in this fashion, I also recommend adding a random 1-1000 ms delay to prevent 10,000 SIP messages from hitting your border element all at once every 30 seconds.

As far as details go, I also recommend the following:

1: Make the From, To and Request-URI user part configurable. Metaswitch has these statically coded to be "metaswitch", which can cause two issues. The first is that if you are pinging out of your network you are revealing information about your infrastructure, which is bad (but not so serious here). The second is that if you are pinging a strict proxy like OpenSIPS, it will try to route the ping instead of responding to it locally (I am aware you can configure OpenSIPS to fix this, but it's a silly workaround).

2: Allow the hop count (Max-Forwards) to be set, to prevent unnecessary forwarding.

3: If used for device heartbeats, a failed ping should take a device down. A successful ping OR a successful call should bring a device up. Waiting for a successful ping to bring a device up can dramatically slow network convergence in the event of an interruption.

4: Existing media sessions should not be interrupted by a failed OPTIONS ping; instead you should be using session timers or some other orphaned-call detection to do cleanup there. Failed OPTIONS pings should stop new dialogues from being established, but existing dialogues should survive or fail per-dialogue based on RFC 4028 or some other dialogue-centric method.

This is all based on nothing more than my own meandering experience and the readings from my sack of voodoo chicken bones.

-Ryan
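
The timer jitter and the configurable user part and hop count might look something like the sketch below. The function names, defaults and example hostnames are all hypothetical, and the request carries only the headers RFC 3261 requires. Setting Max-Forwards to 0 asks the first hop not to forward the request.

```python
import random
import uuid


def next_poll_delay(interval_s: float, rng: random.Random,
                    max_jitter_ms: int = 1000) -> float:
    """Base interval plus a random 1..max_jitter_ms millisecond offset."""
    return interval_s + rng.randint(1, max_jitter_ms) / 1000.0


def build_options(peer_host: str, local_host: str, user: str = "ping",
                  max_forwards: int = 70, local_port: int = 5060) -> bytes:
    """Build a bare OPTIONS request with a configurable user part and hop count."""
    branch = "z9hG4bK" + uuid.uuid4().hex  # RFC 3261 branch magic cookie
    lines = [
        f"OPTIONS sip:{user}@{peer_host} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_host}:{local_port};branch={branch}",
        f"Max-Forwards: {max_forwards}",
        f"From: <sip:{user}@{local_host}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <sip:{user}@{peer_host}>",
        f"Call-ID: {uuid.uuid4().hex}@{local_host}",
        "CSeq: 1 OPTIONS",
        "Content-Length: 0",
        "",
        "",  # the two empty entries produce the terminating blank line
    ]
    return "\r\n".join(lines).encode("ascii")


rng = random.Random()
delay = next_poll_delay(30.0, rng)
request = build_options("peer.example.net", "sbc.example.com", max_forwards=0)
print(f"next poll in {delay:.3f}s")
print(request.decode("ascii"))
```

Drawing a fresh delay for every cycle, rather than once at startup, keeps peers from drifting back into lockstep.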

This all depends on what you want from it. In my experience, OPTIONS pings are useful for identifying possible issues that may or may not require investigation, but they are not a good indicator of actual issues that require action to be taken without human intervention.

Asterisk, for example, will take a peer out of service if it doesn't respond within the amount of time specified in the "qualify" setting. While this sounds like a good thing, it requires knowing your peers and setting the "qualify" value appropriately: set it too low and you may accidentally take a peer out of service; set it too high and you may leave a peer in service too long.

As some have mentioned already, some vendors do not even respond to OPTIONS. Other times you are getting a response directly from a proxy, which may not be indicative of the services behind it. And many other times the service delivering media is not the same as (or even on the same subnet as) the one delivering SIP messaging, so you wouldn't want to drop calls in progress.

In the end you know your setup and (hopefully) your vendors. Use this for what will work for you. In general, the more information you can gather the better, but knowing when to take action or not is specific to your individual needs.

Jesse

________________________________
This e-mail and any files transmitted with it are ShoreTel property, are confidential, and are intended solely for the use of the individual or entity to whom this e-mail is addressed. If you are not one of the named recipient(s) or otherwise have reason to believe that you have received this message in error, please notify the sender and delete this message immediately from your computer. Any other use, retention, dissemination, forwarding, printing, or copying of this e-mail is strictly prohibited
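
One way to act on a failed probe without dropping calls in progress is to let the probe gate only new calls. Existing calls are then cleaned up by a separate per-call mechanism such as session timers. The sketch below also lets a completed call bring the trunk back up, as suggested earlier in the thread. The `Trunk` class and its method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Trunk:
    """Probe results gate new calls only; existing calls are never torn down here."""
    accepting_new_calls: bool = True
    active_calls: set = field(default_factory=set)

    def on_ping_failed(self) -> None:
        # Stop routing new calls; calls already up are left alone.
        self.accepting_new_calls = False

    def on_ping_ok(self) -> None:
        self.accepting_new_calls = True

    def on_call_answered(self, call_id: str) -> None:
        # A completed call (for example, one arriving from the peer) is also
        # evidence that the peer works, so it restores the trunk.
        self.active_calls.add(call_id)
        self.accepting_new_calls = True

    def on_session_expired(self, call_id: str) -> None:
        # Per-call cleanup, driven by session timers or a similar mechanism.
        self.active_calls.discard(call_id)


trunk = Trunk()
trunk.on_call_answered("call-a")
trunk.on_ping_failed()
print(trunk.accepting_new_calls, sorted(trunk.active_calls))  # False ['call-a']
trunk.on_call_answered("call-b")
print(trunk.accepting_new_calls, sorted(trunk.active_calls))  # True ['call-a', 'call-b']
```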

Jesse:

Thanks for that feedback. There's no doubt that SIP OPTIONS PING is not a silver bullet, but today our softswitch has no indication that the control plane is out of service, and therefore no alarming, etc. At least if we don't get any response, we know there is some kind of connectivity failure. Of course, you can work through the different scenarios from there.

Frank

-----Original Message-----
From: voiceops-bounces at voiceops.org [mailto:voiceops-bounces at voiceops.org] On Behalf Of Jesse Howard
Sent: Tuesday, June 26, 2012 1:33 PM
To: voiceops at voiceops.org
Subject: Re: [VoiceOps] SIP OPTIONS PING

This all depends on what you want from it. [...]

participants (7)
- Christian.Pena@corp.earthlink.com
- dwhite@olp.net
- frnkblk@iname.com
- jared@compuwizz.net
- jhoward@ShoreTel.com
- jjackson@aninetworks.net
- ryandelgrosso@gmail.com