
stuck on transfer can't leave 3 way

Posted: Wed Feb 21, 2018 1:43 am
by alo
I hope this isn't a duplicate post. I searched pretty thoroughly and didn't see the exact post.

Recently we have run into issues with leaving a three-way call. It occurs when an agent transfers a call. We have seen it happen when transferring both externally and internally to another agent, whether using the local closer button, using the custom transfer button, or manually typing the number.

Basically the agent remains on the call with the customer and third party even after pressing leave three way.

I have made sure the Asterisk version matches what's in the server section.

I adjusted my Apache server-tuning because I heard that might cause it. No luck there.

StartServers 450
MinSpareServers 250
MaxSpareServers 500
ServerLimit 768
MaxClients 768

Anybody seeing this or have any recommendations?

I thought I saw that using an internal local IP when installing Vicidial on ViciBox sometimes had something to do with this, but I'm not sure what it would be.

Vicibox 8.0.1
VERSION: 2.14-656a
BUILD: 180215-1318
SVN 2918
DB Schema Version: 1536
Asterisk 11.25.3-vici
Single server DB/Web/dialer
25 agents
webrtc Viciphone

Re: stuck on transfer can't leave 3 way

Posted: Wed Feb 21, 2018 6:03 am
by mflorell
How frequently does this issue happen?

What does the web browser's console show when this happens?

Re: stuck on transfer can't leave 3 way

Posted: Wed Feb 21, 2018 11:21 am
by ed123
Please post your carrier settings. We encountered something like this before.

Re: stuck on transfer can't leave 3 way

Posted: Wed Feb 21, 2018 11:49 am
by williamconley
Also post your asterisk version from astguiclient.conf.

Was this system upgraded from a prior version?

Re: stuck on transfer can't leave 3 way

Posted: Thu Feb 22, 2018 12:04 am
by alo
When it's happening, it happens randomly but frequently. Perhaps when the system's under more load?
I will be checking the web browser's (Chrome) console and will update when I can.
It's hard to distinguish user error from the actual problem, so I need to catch an agent while I am watching to make sure they did it correctly :)
Also, I need to make sure it's not the external number failing to connect. That's a different deal: the agents get confused if the transfer just isn't answering.

Upgraded from Asterisk 11.25.1-vici to Asterisk 11.25.3-vici
I did run install.pl --copy_sample_conf_files --no-prompt --asterisk_version=11.25 after upgrade. Maybe I should have done --asterisk_version=11.25.3?

# Asterisk version VICIDIAL is installed for
VARasterisk_version => 11.25


[ACCT1]
allow=ulaw
type=friend
host=XXX.XXX.XX.XXX
dtmfmode=rfc2833
context=trunkinbound
port=5060

exten => _91NXXNXXXXXX,1,AGI(agi://127.0.0.1:4577/call_log)
exten => _91NXXNXXXXXX,2,Dial(${ACCT1}/${EXTEN:1},,To)
exten => _91NXXNXXXXXX,3,Hangup

Re: stuck on transfer can't leave 3 way

Posted: Thu Feb 22, 2018 12:49 am
by williamconley
Code:
Enter the Asterisk version that you are installing VICIDIAL for
(value should be only one of the options below:)
 1.2
 1.4
 1.8
 11.X
 13.X
Enter asterisk version or press enter for default: [11]


I do not see "11.25" in the list of "should be only one of the options below". Do you? 8-)

Note that when you change to 11.X, you will need to reboot (or kill any screens running with "11.25", which likely won't match) and probably force your dialplan to reload. (System Settings has an option to force a reload ... or you can just change something like a phone or carrier.)
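
For illustration only, the check being described can be sketched in a few lines of Python. This is not what install.pl does internally; the function name is_supported_option is made up, and the option list is simply copied from the prompt quoted above. Whether the installer tolerates other values is not something this sketch can tell you.

```python
# Option strings copied from the install.pl prompt quoted above.
SUPPORTED_OPTIONS = ("1.2", "1.4", "1.8", "11.X", "13.X")


def is_supported_option(value: str) -> bool:
    """Return True only if value is exactly one of the listed options."""
    return value.strip() in SUPPORTED_OPTIONS


if __name__ == "__main__":
    # Values mentioned in this thread.
    for candidate in ("11.25", "11.25.3", "11.X"):
        verdict = "ok" if is_supported_option(candidate) else "not in the list"
        print(f"{candidate}: {verdict}")
```

Of the three values tried here, only 11.X is in the list.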

Re: stuck on transfer can't leave 3 way

Posted: Thu Mar 22, 2018 11:18 am
by alo
Okay, I have been troubleshooting this for a bit now.

1. @william Changing to 11.X didn't seem to affect this specific issue. I even tried re-running the install script with it, rebooting, etc.
2. @matt I spent some time watching the agent's screen and didn't see anything jump out at me. I'm not sure what specifically I am looking for, but the only warning I saw in the Chrome console was about the favicon not loading. I did get a better idea of what the agents are seeing, though. Basically, the leave three-way button can be pressed but nothing seems to happen. I tried pressing it over and over; the button visually looks like it is being pressed, but nothing happens. The hangup button then hangs up both lines.

This is happening across a couple of different clusters, especially when more agents log in, so I am wondering if it's load related. I have the database on its own separate drives and that did not seem to help.

Anything anyone can think of for me to check? Specific logs, etc.?

Re: stuck on transfer can't leave 3 way

Posted: Thu Mar 22, 2018 11:52 am
by williamconley
1) Agent Screen (Web Browser) JavaScript console log. Best during a test scenario if you can reliably reproduce the error situation. If you have access to both a functional and a non-functional case to compare the ajax calls, this may allow digging through to find the cause. Be prepared for a red herring or two. Vicidial is complex. 8-)

2) Asterisk console (screen -r asterisk). You may benefit from turning on agi debugging and/or other debugging views. The screen -r asterisk is necessary to check for agi errors (which ONLY show in that one screen); for all other debugging the log files and/or "asterisk -R" are suitable.

When you did the reinstall ... did you "install sample conf files"? If not ... do it again with that option active. Those sample conf files contain version-specific code. Working with the old ones will leave the flaw in place.

Re: stuck on transfer can't leave 3 way

Posted: Tue Apr 17, 2018 3:07 pm
by alo
Okay, back to this again. I think this mostly has to do with the agents being considered in DEAD status even though they are on a live call.

My theory here is that either the webserver or the database is overloaded, and on this cluster I have the webserver and the database on the same server.

Basically, every so often agents get put into dead status even though they are still on a call. Any suggestions here?

By the way, here are the answers to the last replies.

I see nothing that pops out to me in the web browser console log or the Asterisk console.
As for running it with sample files, I did have this: --copy_sample_conf_files

Re: stuck on transfer can't leave 3 way

Posted: Tue Apr 17, 2018 3:25 pm
by williamconley
alo wrote: every so often agents get put into dead status


Please confirm that you are asserting the agents in DEAD status are the only agents who can not transfer.

Confirm also that you've verified this (if you have not personally verified this, if you're working on hearsay or "memory", please personally verify several fresh cases before continuing, it's important).

Re: stuck on transfer can't leave 3 way

Posted: Thu Apr 19, 2018 4:44 pm
by alo
I was able to log in as an agent and it did seem to be the case. I can't speak for every occurrence, but in my testing so far that's what happened.

And now, not sure if I am just noticing it more or if it's getting far worse, but I have been seeing far more dead calls on the realtime report where the agents are still speaking with a customer. I believe this may be my actual issue!

Not sure what's causing this, but I am thinking possibly a webserver issue or perhaps table locks. vicidial_agent_logs, vicidial_manager, vicidial_live_calls?

I remembered seeing somewhere that with high usage, running the flush_DBqueue script might help.
I noticed select count(*) from vicidial_manager; giving over 30k rows, so I run /usr/share/astguiclient/AST_flush_DBqueue.pl --seconds=900 -q
As far as I can tell it doesn't really help.

I also remember hearing about this: netstat -n | grep TIME_WAIT | wc -l
It usually gives me below 100, never more than 200.

I added a slave report database because they tend to run reports like agent stats, and assigned all reports to use the slave database, but that hasn't seemed to help.
I run the archive daily script and the archive months=1 script at night. This seems to maybe help lessen the number of dead calls, but that could be my imagination. It's been proving hard to track.

Any ideas what might cause the dead calls?
If we think it's locked tables or database lag, maybe moving a table to a memory table could help?

Re: stuck on transfer can't leave 3 way

Posted: Thu Apr 19, 2018 10:04 pm
by williamconley
You're just all over the place. lol.

How many agents and servers are you running in total and "agents per server"?

Does this happen all the time across the board, or more during heavier usage?

Does this happen to all agents on all servers?

Have you tried reducing the load on one server and seeing if that server "never" has dead calls?

Server load (1/5/10 minutes from the cli) for the server(s) involved?

Are your servers dedicated for web/dialer roles? Or do they share both web and dialer?

If I recall correctly, dropping packets between the agent web browser and the vicidial web server can cause dead call misfire. But it's been a while since we had one. There have been many other reasons over the years, though.

Try googling "vicidial dead call"

Re: stuck on transfer can't leave 3 way

Posted: Thu Apr 26, 2018 8:05 pm
by alo
Still chasing this down! Get ready for this... I am starting to think the dead calls are a separate but possibly related issue. Trying to get as much info for you as I can so we might be able to crack this puzzle!

This seems to have started once I moved to ViciBox 8, along with some other weird issues destined for other posts, no matter how many times I reinstall. (I know you are a big proponent of practicing on reinstalls :)) We have a couple of separate clusters for our different offices. The ones still on ViciBox 7 seem fine, but the ones on ViciBox 8 seem to have this issue.

Here's the info on the two separate clusters experiencing this:

All Dell R610, LSI MegaRAID cards, RAID 1 SSD drives, 32 GB RAM
Update: as I put these down on paper and into words, I am starting to wonder if my problem is overloaded servers.

Office 1
1 Database/web/agent server
1 Dialer
30 agents log in to the main database server and the dialer helps with the call volume.

Office 2
Database/webserver raid 10
3 asterisk servers
- load balance phones 25 per server

The servers connect to each other using the internal zone and have less than a second ping time to each other.
I have tried Asterisk 11.25.3-vici and Asterisk 13.20.0-vici, same deal. I use the Viciphone WebRTC with the Google Chrome browser.
I have tried in-office and at-home agents.
Two different carriers
Updating SVN (2973)

I'll be trying to add more resources next. Do you recommend the webserver not be on the database server? The reason I have always done web/DB together is that back before I used the internal network to connect servers, too much traffic was hitting the yast2 firewall and messing with the conntrack max, and I just haven't had a reason to switch how I have been doing it. Maybe you will give me one.

Server load never looks bad
1.75, 1.80, 1.78; io wait is always 0
Dialers less than 1, 1, 1

Some of these don't get rebooted every night. Could that cause an issue?

I believe it's all agents on all servers, sporadically, 1 out of 10 transfers on average.

Any other pieces for this puzzle?

Re: stuck on transfer can't leave 3 way

Posted: Fri May 04, 2018 1:51 pm
by kashinc
Did you ever end up figuring this out? I recently just started having the same exact issue.

Re: stuck on transfer can't leave 3 way

Posted: Fri May 04, 2018 2:01 pm
by williamconley
Verify your Vicidial Version matches on all servers and is the correct one for your db schema.

Verify your Asterisk version is set properly both in astguiclient.conf and in admin->servers

Verify your Asterisk sample conf files were installed for the proper version during install.pl (often requires a re-install and validation that you chose an actual option for Asterisk version during install.pl; it's very touchy sometimes that you choose a supplied option and don't just make up something that seems similar or correct).

Re: stuck on transfer can't leave 3 way

Posted: Wed Jun 13, 2018 12:05 pm
by alo
Still seem to be having this issue.
Adding more Asterisk servers seems to help, but we have added way more resources than we ever needed before.

I have tried changing carriers, server locations (different Internet), reinstalling, Asterisk 11, Asterisk 13.

This seems to have started happening in revisions after maybe SVN 2900, maybe a little before then.
Do we know if any code changes happened around then that could be causing this?

Re: stuck on transfer can't leave 3 way

Posted: Fri Aug 24, 2018 4:08 pm
by emel_punk
I am having the same issue, my agents got stuck between 2 calls and can't hangup. I've updated today but no luck either.

Re: stuck on transfer can't leave 3 way

Posted: Sat Aug 25, 2018 5:25 pm
by alo
Emel_punk are you using a webphone or something like Xlite or zoiper?

Re: stuck on transfer can't leave 3 way

Posted: Mon Aug 27, 2018 11:59 am
by emel_punk
Yes, I am using X-lite

Re: stuck on transfer can't leave 3 way

Posted: Fri Nov 30, 2018 10:25 pm
by alo
Oh boy! I really invested the time on this and I think I figured out what is happening here!

Now to figure out why....
I started looking at the Chrome developer tools and found that when I press the leave three-way call button, it sends a request to manager_send.php and gets this response.

Code:
One of these variables is not valid:
Channel  must be greater than 2 characters
ExtraChannel SIP/<<My_IP_address>>-00000022 must be greater than 2 characters
queryCID VXvdcW15436321923075307530753075 must be greater than 14 characters
exten NEXTAVAILABLE must be set
ext_context default must be set
ext_priority 1 must be set
session_id 8600051 must be set
Redirect Action not sent


So I checked the form data and saw it's not sending anything for channel. Excerpt below.
Code:
server_ip=192.168.0.211&session_name=1543632170_3007513223134&ACTION=RedirectXtraNeW&format=text&channel=&call_server_ip=192.168.0.225&queryCID=VXvdcW15436321923075307530753075&exten=NEXTAVAILABLE&ext_context=default&ext_priority=1&extrachannel=SIP/<<My_carrier_ip>>-00000022&lead_id=17963616


This happens after a few hours of heavy production. After a reboot of the Asterisk dialing server, everything works correctly and the response looks like this:

Code:
NeWSessioN|8600051|
|INSERT INTO vicidial_manager values('','','2018-11-30 18:58:44','NEW','N','192.168.0.211','','Redirect','CXAR2320181130185844','Channel: SIP/30075-00000027','Context: default','Exten: 8600051','Priority: 1','CallerID: CXAR2320181130185844','','','','','');|


And I see the form data is sending the channel.
Code:
server_ip=192.168.0.211&session_name=1543633045_3007510968804&ACTION=RedirectXtraNeW&format=text&channel=SIP/Inphxin-00000001&call_server_ip=192.168.0.225&queryCID=VXvdcW15436331223075307530753075&exten=NEXTAVAILABLE&ext_context=default&ext_priority=1&extrachannel=SIP/<<My_carrier_ip>>-00000028&lead_id=2342570


So my theory here is that it's not processing because the channel is not being sent. So the question is WHY??? And why is it resolved after rebooting the dialer (not rebooting the database or webserver or anything)?
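
To make the comparison between the two captures less error-prone, here is a minimal Python sketch that parses a captured form-data string and reports which fields would trip the length rules named in the error text. The rules are read off the error message only; the actual check in manager_send.php may differ. The function name short_fields is made up, and the extrachannel values are placeholders for the redacted ones.

```python
from urllib.parse import parse_qs

# Minimum lengths taken from the error text quoted above
# ("must be greater than N characters"); the real PHP check may differ.
MIN_LENGTHS = {"channel": 2, "extrachannel": 2, "queryCID": 14}


def short_fields(form_data: str) -> list:
    """Return the names of fields that are missing, empty or too short."""
    # keep_blank_values=True keeps "channel=" as an empty string
    # instead of silently dropping it.
    fields = parse_qs(form_data, keep_blank_values=True)
    bad = []
    for name, minimum in MIN_LENGTHS.items():
        value = fields.get(name, [""])[0]
        if len(value) <= minimum:
            bad.append(name)
    return bad


# Shortened versions of the two captures above; "SIP/carrier-..." is a
# placeholder for the redacted extrachannel value.
failing = ("ACTION=RedirectXtraNeW&channel="
           "&queryCID=VXvdcW15436321923075307530753075"
           "&extrachannel=SIP/carrier-00000022")
working = ("ACTION=RedirectXtraNeW&channel=SIP/Inphxin-00000001"
           "&queryCID=VXvdcW15436331223075307530753075"
           "&extrachannel=SIP/carrier-00000028")

print(short_fields(failing))  # ['channel']
print(short_fields(working))  # []
```

Pasting the full form data from the browser's network tab into failing or working should work the same way, and it confirms that the empty channel is the only field that differs between the two requests.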

So here's my theory on that: I noticed that after a reboot the channel always seems to reset to SIP/Inphxin-00000001, so maybe they are sending so many transfers that the number gets too high or something??

And this is where I am stuck.

Has anyone experienced this, or does anyone know why it wouldn't send the channel info in the manager_send.php request?

Thanks for reading:)

As a reminder, here's my stuff.
Asterisk 13.21.1-vici
SVN 3052
Installed using ViciBox v.8.0.1

VERSION: 2.14-694a
BUILD: 181005-1738
DB Schema Version: 1561
Using webrtc Viciphone
Dell R610s Raid 1 H700 with SSDs
1DB, 1 Web, 3 Dialers 15 agents per server

Re: stuck on transfer can't leave 3 way

Posted: Sun Dec 02, 2018 10:30 pm
by alo
So I tested it with both a softphone (X-Lite) and WebRTC (Viciphone) after the point where the leave three-way button stops working.

With both methods I still get the error "One of these variables is not valid: Channel must be greater than 2 characters."

I checked the live_calls table and saw that the channel when it stopped working looked like SIP/Inphxin-000098ab. After a reboot, when it's working correctly, it's SIP/Inphxin-00000001.

Could it have something to do with the letters?
Where does the system pull the channel data that it puts in the manager_send.php call from?
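
On the letters: they look like ordinary hexadecimal digits (a to f), which would fit the suffix being a per-restart channel counter written in hex. That is an assumption on my part, not something confirmed in this thread. A minimal Python sketch, with a made-up helper name channel_sequence, converts the two suffixes seen above:

```python
def channel_sequence(channel: str) -> int:
    """Return the numeric value of the suffix after the last '-' in a channel name."""
    suffix = channel.rsplit("-", 1)[1]
    # Assumption: the suffix is a hexadecimal counter.
    return int(suffix, 16)


print(channel_sequence("SIP/Inphxin-00000001"))  # 1
print(channel_sequence("SIP/Inphxin-000098ab"))  # 39083
```

If that reading is right, 000098ab just means roughly 39 thousand channels since the last restart. On its own that does not explain why the channel field ends up empty, but it gives a number to correlate against when the problem starts.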

Re: stuck on transfer can't leave 3 way

Posted: Sun Dec 02, 2018 10:52 pm
by williamconley
Add some debug code to dump the channel variable at all stages and see where it gets lost.

Re: stuck on transfer can't leave 3 way

Posted: Tue Dec 04, 2018 3:06 am
by alo
I am not sure where to add the debug code, but I am looking into that.

In the meantime I have one more point of discovery.

The issue takes place on the server the call went out or came in on, not the server the agent's phone is on.
Example: I had a call go out on a different dialing server that had the issue. I rebooted that server without rebooting the agent's phone server, and the issue was resolved.
Then I sent a call out on the agent's phone server, which had not been rebooted, and it showed the issue. So based on that, I have decided the issue is on the server the call was placed on.

Re: stuck on transfer can't leave 3 way

Posted: Tue Dec 04, 2018 11:21 am
by williamconley
Next time try just restarting asterisk and/or each individual "screen" on the asterisk server. Most of those screens can be terminated and they will regenerate in about a minute (less than a reboot). Find out which screen regeneration (or combination) fixes the problem.

You may also just find obvious code on the screen in question without having to restart it. Or (as in overloaded inbound servers which have turned red), you may find the screen frozen/stuck (no more activity). In that case, the solution is to kill it and it will auto-regenerate and you can then add more debug code to find your problem. Unfortunately if it's "overload", you can't fix that with a bug fix, but only with an extra server OR by spreading the inbound calls among more servers. Hold queues take a lot of memory and RAM.

Re: stuck on transfer can't leave 3 way

Posted: Tue Dec 04, 2018 8:12 pm
by alo
So Helpful!! Thank you.

I think we are getting somewhere.
Killing the asterisk screen fixes the issue. I terminated the other screens one by one first and nothing resolved.

I still don't see anything obvious in the asterisk console. (or the asterisk console that was open in the asterisk screen)


I did see a few errors on the other screens:
3640.ASTupdate
Code:
ERRMSG: |pattern match timed-out|2018-12-04 16:19:56||1|37988|Command: Action: CoreShowChannels
ActionID: 1543969197.3366|Match String: /Event: CoreShowChannelsComplete\nActionID: 1543969197.3366\nEventList: Complete\nListItems: \d+/|


3655.ASTfastlog
Code:
DBD::mysql::db do failed: Duplicate entry '1543957935.34514' for key 'PRIMARY' at /usr/share/astguiclient/FastAGI_log.pl line 1500, <STDIN> line 23.
DBD::mysql::db do failed: Duplicate entry '1543958098.37643' for key 'PRIMARY' at /usr/share/astguiclient/FastAGI_log.pl line 554, <STDIN> line 23.


screen -r 3646.ASTlisten
Code:
----- AMI Version 2.10.4 -----

Re: stuck on transfer can't leave 3 way

Posted: Tue Dec 04, 2018 9:05 pm
by williamconley
alo wrote:3640.ASTupdate
Code:
ERRMSG: |pattern match timed-out|2018-12-04 16:19:56||1|37988|Command: Action: CoreShowChannels
ActionID: 1543969197.3366|Match String: /Event: CoreShowChannelsComplete\nActionID: 1543969197.3366\nEventList: Complete\nListItems: \d+/|

Not sure about this one, you should dig in. May be a red herring, but it could be something.
alo wrote:3655.ASTfastlog
Code:
DBD::mysql::db do failed: Duplicate entry '1543957935.34514' for key 'PRIMARY' at /usr/share/astguiclient/FastAGI_log.pl line 1500, <STDIN> line 23.
DBD::mysql::db do failed: Duplicate entry '1543958098.37643' for key 'PRIMARY' at /usr/share/astguiclient/FastAGI_log.pl line 554, <STDIN> line 23.

Safe to ignore. This query should have an ignore errors function added to it. Probably.
alo wrote:screen -r 3646.ASTlisten
Code:
----- AMI Version 2.10.4 -----

What are you saying? That's all that was on this screen? Check it during normal operation.

Now that you've narrowed it down to asterisk ... dig a little deeper. Try NOT killing asterisk, just typing "reload" or reload for individual modules. See if any specific errors can be captured at or before the moment of the problem being visible. Consider checking the file logs (and consider turning on logging on the server in question in admin->servers). I believe the asterisk screen writes to a file for troubleshooting, but I'm not certain it writes everything (this console exists because it contains perl errors NOT visible anywhere else, even with other local asterisk connections).

Re: stuck on transfer can't leave 3 way

Posted: Tue Dec 04, 2018 10:21 pm
by alo
Reload does not resolve it.

"core restart now" does. So does "systemctl restart vicidial" (I think as a matter of course that restarts Asterisk).

I am wondering if it might have to do with the ASTupdate screen error. If it timed out at CoreShowChannels, maybe it couldn't get the channel information and stopped working. However, I tried restarting that screen, and also tried killing the AST update script and starting it with debug, and it doesn't seem to have a problem, yet the leave three-way issue still happens until I restart Asterisk.
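
For reference, this is roughly what the ASTupdate match string from the earlier ERRMSG is waiting for, sketched in Python rather than the script's own Perl. The regular expression is copied from the logged pattern with the ActionID made generic; the two sample buffers are invented for illustration and are not real output from this system.

```python
import re

# Match string copied from the ERRMSG line, with the ActionID left generic.
# The "\n" separators are taken as-is from the logged pattern.
COMPLETE_RE = re.compile(
    r"Event: CoreShowChannelsComplete\n"
    r"ActionID: (?P<action_id>[\d.]+)\n"
    r"EventList: Complete\n"
    r"ListItems: (?P<items>\d+)"
)


def list_items(buffer: str):
    """Return the ListItems count if the completion event is in buffer, else None."""
    match = COMPLETE_RE.search(buffer)
    return int(match.group("items")) if match else None


# Hypothetical buffers, made up for illustration only.
complete = (
    "Event: CoreShowChannelsComplete\n"
    "ActionID: 1543969197.3366\n"
    "EventList: Complete\n"
    "ListItems: 12\n"
)
partial = "Event: CoreShowChannel\nChannel: SIP/example-00000001\n"

print(list_items(complete))  # 12
print(list_items(partial))   # None
```

A "pattern match timed-out" would then mean the completion event never showed up in the buffer within the wait time, which is consistent with Asterisk being slow or stuck answering CoreShowChannels, but does not prove it.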

Re: stuck on transfer can't leave 3 way

Posted: Wed Dec 05, 2018 4:55 pm
by alo
I have been doing a core restart now on each server as I receive complaints.

I decided to add more dialing servers to test a few different things.
1. dialing server without agents but with asterisk 13 or webrtc - Still broken
2. dialing server without agents with asterisk 11 - unknown
3. dialing server with asterisk 13, webrtc agents, but only set to 10 max trunks - Not broken!

So it seems unrelated to agents and related to the amount of calls going out.

Re: stuck on transfer can't leave 3 way

Posted: Wed Dec 05, 2018 5:10 pm
by williamconley
alo wrote:ASTupdate screen error. if it timed out at CoreShowChannels maybe it couldn't get the channel information and stopped working.

Is that same error ONLY but ALWAYS present during an "issue"?

alo wrote:"core restart now" does.

Core restart now kills all calls, and the newly generating calls work ... for a while? right?

alo wrote:3. dialing server with asterisk 13, webrtc agents, but only set to 10 max trunks - Not broken!

So it seems unrelated to agents and related to the amount of calls going out.


Amount of calls ... Have you tried slowly increasing the channel limit until it breaks?

what about average server load? (from "uptime")

What hardware is this on? How much memory? Have you checked all the mysql/system logs for errors that pop up during these events (preferably right at or just before the issue becomes visible ...)?