Exceptionally Long Queue Length & Timesync error

Postby dgroth02 » Tue Aug 07, 2018 10:25 am

OK, this is now happening to me on multiple systems (all hosted in datacenters), but I'll give the details of one of them here:

System Configuration:
Multi-Server System, 1 DB/Web Server, 7 dialing servers
ViciBox v.7.0.4-170113
VERSION: 2.14-667a
BUILD: 180331-1715
Asterisk 11.25.1-Vici on all servers

Dual quad-core Xeon minimum in each server (slightly varying speeds). DB server with 128GB of RAM; each dialing server has 16GB of RAM. SSD drives in all servers.

This particular system has 50 agents spread across the servers, dialing at approximately 7 to 1.

The problem is essentially that occasionally, at various dialing levels, a server will go "Red" in the summary screen. When that happens, all the agents with their phones on that server get the dreaded Timesync Error and are unable to log back in. The Asterisk command line shows hundreds of "channel.c:1310 __ast_queue_frame: Exceptionally long voice queue length queuing" warnings.

After a reboot, the "Red" is cleared and phones can re-register, agents can log in, and things will continue as normal.

Here is what I have noticed that does/does not affect the problem, and what I have done to date:
  • Seems to happen most often when a remote client has a particularly bad Internet connection, but not always; the connection can be great and it will still happen
  • All servers are in time synchronization with each other
  • All servers whitelist only the remote client IPs for HTTP, SIP, and RTP
  • The system uses SIP trunks, and their IPs are whitelisted
  • All other access to the system on all ports is closed
  • Trunks are balanced across all dialing servers

Yes, I have read the manual and searched the forums, and while I see others mentioning this issue, no real solution was presented (apart from an ntpdate sync in the crontab, which doesn't work), and that was 2 years ago.

For the clients I recommend Vicidial to as a dialing server, this is pretty much the ONLY issue they ever run into, but it is quickly becoming the issue that causes them to leave Vicidial.

I would really appreciate ANY help or direction.

Thank you.
dgroth02
 
Posts: 8
Joined: Wed May 06, 2015 2:24 pm

Re: Exceptionally Long Queue Length & Timesync error

Postby uncapped_shady » Tue Aug 07, 2018 12:47 pm

Hi, what timing devices are you using in your servers? Vicidial recommends the Amfeltec PCI Express cards.
uncapped_shady
 
Posts: 24
Joined: Sat Jan 20, 2018 5:51 pm
Location: South Africa Gauteng

Re: Exceptionally Long Queue Length & Timesync error

Postby dgroth02 » Tue Aug 07, 2018 2:17 pm

Just the standard dahdi_dummy internal timing. I've been told by a friend that this may have to do with Asterisk 11.25.1 and that I should upgrade to 11.25.3. Has anyone else had this issue and resolved it that way?
dgroth02
 
Posts: 8
Joined: Wed May 06, 2015 2:24 pm

Re: Exceptionally Long Queue Length & Timesync error

Postby dgroth02 » Thu Aug 09, 2018 6:10 pm

I have upgraded Asterisk and am still having this issue. I still suspect Internet-related timeouts. Any suggestions on making Vicidial close connections faster, or on making it less sensitive?
dgroth02
 
Posts: 8
Joined: Wed May 06, 2015 2:24 pm

Re: Exceptionally Long Queue Length & Timesync error

Postby williamconley » Thu Aug 09, 2018 6:26 pm

The time sync error is tossed for two basic categories of fault:

1) Time is out of sync between servers. Solution: Sync all the servers to ONE of the servers in the cluster (with the NTP iburst option) so they are always in sync. Use NTP, don't "set the time" periodically. Note that this is not something that would affect individual agents; it would take down an entire server, and the clocks would have to be off by about 6 seconds to cause the issue. This is the problem about 5% of the time.

2) The time field of the agent's session is not updated at all (instead of being 'out of sync', it's just 'not updating', but it tosses the same error!). This can be caused by a bad connection between the agent's web screen and the agent's web server, although it can also happen if the processes on the agent's dialer (the server to which the agent has registered his phone) fail. There are screens (screen -list) running which must update various fields continually. If one of those terminates OR Just Stops working even though it's still listed, the results can be deadly to the cluster. The one that ordinarily causes a "red server" in the "Reports" page is the update script. Often just killing that script (and allowing it to regenerate on its own, at the 1 minute mark, from the keepalive script) will remove the redness and bring the server back online.

Sounds like you have more than one of the category 2) problems going on if you have bad internet for some agents and red servers.
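
For anyone who wants to automate the "is a screen missing?" part of category 2), here is a minimal Python sketch that parses the text printed by screen -list and reports which expected sessions are not listed. The session names in EXPECTED_SCREENS are an assumption for illustration; compare against screen -list on a healthy dialer for the real set on your install. It only catches a screen that has terminated, not one that is still listed but has stopped working.

```python
import re

# Assumed session names, for illustration only; check `screen -list`
# on a healthy dialer for the names your install actually uses.
EXPECTED_SCREENS = {"ASTupdate", "ASTsend", "ASTlisten", "ASTVDauto"}

# Matches session lines such as "\t2345.ASTupdate\t(Detached)".
SCREEN_LINE = re.compile(r"^\s*(\d+)\.(\S+)\s+\(", re.MULTILINE)


def parse_screens(screen_list_output):
    """Return {session_name: [pid, ...]} parsed from `screen -list` text."""
    sessions = {}
    for pid, name in SCREEN_LINE.findall(screen_list_output):
        sessions.setdefault(name, []).append(int(pid))
    return sessions


def missing_screens(screen_list_output, expected=EXPECTED_SCREENS):
    """Return the expected session names that are not listed, sorted."""
    return sorted(set(expected) - set(parse_screens(screen_list_output)))
```

On a dialer, the input could come from subprocess.run(["screen", "-list"], capture_output=True, text=True).stdout, run as the user that owns the screens.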
Vicidial Installation and Repair, plus Hosting and Colocation
SugarCRM integration - Customization and Add-ons - We Bring It All Together.
http://www.PoundTeam.com # 352-269-0000 # +44 (203) 769-2294 # +506 4001-8914
williamconley
 
Posts: 17236
Joined: Wed Oct 31, 2007 4:17 pm
Location: Davenport, FL (By Disney!)

Re: Exceptionally Long Queue Length & Timesync error

Postby dgroth02 » Fri Aug 10, 2018 9:32 am

I understand timesync, but this still doesn't explain the "Exceptionally long voice queue length" warnings on the Asterisk console. Would the update script cause this to happen?

If the time field is sensitive to updates, is there a way to make it less sensitive?
dgroth02
 
Posts: 8
Joined: Wed May 06, 2015 2:24 pm

Re: Exceptionally Long Queue Length & Timesync error

Postby williamconley » Fri Aug 10, 2018 2:35 pm

It's not "sensitive", it's a purpose-built "symptom" indicating "you have a problem, you must fix this". Connectivity or time, one of them is broken. If you make it "not sensitive", then the system will stop working and you'll never know. Data will be incorrect. Calls won't transfer. All sorts of bad things will happen. But you will receive NO warnings.

Long data queues have their own cause, unrelated to Vicidial code. Likely also due to connectivity issues. Another symptom. Treat the cause.
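
One way to chase the cause is to see which channels the warnings are actually being queued to. Below is a minimal Python sketch that tallies the long-queue warnings per peer from Asterisk log lines. It assumes each warning ends with "queuing to <channel>" and that channel names carry a per-call suffix after the last hyphen (for example SIP/8001-0000a1b2, where 8001 is a made-up extension); adjust the pattern if your log lines differ.

```python
import re
from collections import Counter

# Assumes the warning text ends with "queuing to <channel>", e.g.
# "... Exceptionally long voice queue length queuing to SIP/8001-0000a1b2".
WARNING = re.compile(
    r"Exceptionally long (?:voice )?queue length queuing to (\S+)"
)


def peer_of(channel):
    """Strip the per-call suffix: 'SIP/8001-0000a1b2' -> 'SIP/8001'."""
    return channel.rsplit("-", 1)[0]


def count_by_peer(lines):
    """Count long-queue warnings per peer across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = WARNING.search(line)
        if match:
            counts[peer_of(match.group(1))] += 1
    return counts
```

Feeding it an open log file and calling counts.most_common(5) shows whether the warnings cluster on a handful of agent phones (pointing at those connections) or are spread across everything on the server.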
williamconley
 
Posts: 17236
Joined: Wed Oct 31, 2007 4:17 pm
Location: Davenport, FL (By Disney!)

