question from newbie - speech recognition in call-center

Postby johnyjj2 » Thu Oct 15, 2009 12:45 pm

Hello :-)!

I need to create an application that behaves as follows:
1. User calls a special number.
2. User talks to the server, giving some numbers and additional information: only digits plus about five control commands. The language I'd like to use is not very popular, so it may be difficult to find a proper acoustic model for it; however, I can train an acoustic model myself, or at least try to use an English model. I want the user to say twelve digits, and the server to recognize those digits, calculate a checksum and answer the user with "code is proper" or "code is improper". So it should be a call center that uses speech recognition instead of humans.
3. Server saves some text data on its disk, based on speech recognition and communication with the user.
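
The verification in step 2 could look roughly like the sketch below. The post does not say which checksum is used, so the Luhn algorithm here is purely an assumption standing in for the real scheme; the function names are hypothetical as well.

```python
def luhn_is_valid(code: str) -> bool:
    """Return True if the digit string passes the Luhn check.

    Luhn is only an assumed stand-in; the thread never names the checksum.
    """
    if not (code.isascii() and code.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for position, char in enumerate(reversed(code)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def verdict(code: str, expected_length: int = 12) -> str:
    """Map a recognized code to the phrase the server should answer with."""
    ok = len(code) == expected_length and luhn_is_valid(code)
    return "code is proper" if ok else "code is improper"
```

The same function works whether the twelve digits come from speech recognition or from the keypad, so the checksum logic can be tested before any recognizer is in place.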

Can it be done with the use of ViciDial?

Thanks very much for your help in advance!
Greetings :-)!
johnyjj2
 
Posts: 3
Joined: Thu Oct 15, 2009 12:43 pm

Postby mflorell » Sat Oct 17, 2009 7:08 pm

What language do you need?

We would usually do something like this in a custom AGI script.
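
For readers unfamiliar with AGI, here is a minimal sketch of what such a script might do, written in Python for illustration. It relies only on the basic AGI conventions (header lines on stdin ended by a blank line, commands written to stdout, replies of the form `200 result=...`) and the standard `GET DATA` command. The prompt name `enter-code`, the timeout and the function names are assumptions, not anything taken from ViciDial.

```python
import sys


def read_agi_environment(stream):
    """Read the 'agi_key: value' header lines Asterisk sends at startup."""
    env = {}
    while True:
        line = stream.readline()
        if not line.strip():  # a blank line (or EOF) ends the header block
            break
        key, _, value = line.partition(":")
        env[key.strip()] = value.strip()
    return env


def agi_command(command, stdin, stdout):
    """Send one AGI command and return the text right after 'result='."""
    stdout.write(command + "\n")
    stdout.flush()
    reply = stdin.readline().strip()  # e.g. "200 result=123456789012"
    if not reply.startswith("200 result="):
        raise RuntimeError(f"unexpected AGI reply: {reply!r}")
    # Drop trailing annotations such as "(timeout)".
    return reply[len("200 result="):].split(" ", 1)[0]


def collect_code(stdin, stdout, prompt="enter-code", digits=12):
    """Play a prompt and gather up to `digits` DTMF digits from the caller."""
    # GET DATA <sound file> <timeout in ms> <max digits>
    return agi_command(f"GET DATA {prompt} 10000 {digits}", stdin, stdout)


def main():
    """Entry point when Asterisk launches the script via AGI()."""
    read_agi_environment(sys.stdin)
    code = collect_code(sys.stdin, sys.stdout)
    sys.stderr.write(f"caller entered {code}\n")  # stderr is not parsed as AGI commands
```

When run from the dialplan, the script would call `main()`; passing the streams in as parameters keeps the functions testable without a live call.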
mflorell
Site Admin
 
Posts: 18339
Joined: Wed Jun 07, 2006 2:45 pm
Location: Florida

Postby johnyjj2 » Sun Oct 18, 2009 4:14 am

Thanks for your answer :-)!

It is Polish.

Greetings:)!
johnyjj2
 

Postby mflorell » Mon Oct 19, 2009 2:57 pm

I think that you will first need to find a speech recognition app that will take your model. I don't have too much experience with this except for Sphinx which is free but extremely CPU hungry.
mflorell

Postby williamconley » Mon Oct 19, 2009 9:45 pm

Your first version should attempt it without speech recognition if possible. We've looked into speech recognition and found it to be intense, but doable. Is there a need to avoid having clients enter this on the keypad? Are the entries uniformly shaped (numbers and characters always in the same place)? If so, you could have the user enter the first XX digits as numbers, and press a key a specific number of times for letters (5 = J, 55 = K, 555 = L). If there are a limited number of "letters" and they are in predictable places, this would work. Then the rest of the script becomes much simpler, and in fact "doable". Voice recognition can be added later.
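
The repeated-keypress idea (5 = J, 55 = K, 555 = L) can be decoded with a few lines of code. The sketch below assumes the standard telephone keypad lettering and, as an extra assumption not stated in the post, that the caller presses `#` after each letter so the groups can be told apart.

```python
# Standard telephone keypad lettering.
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}


def decode_multitap(dtmf: str, separator: str = "#") -> str:
    """Turn a DTMF string such as '5#55#555#' into letters ('JKL')."""
    letters = []
    for group in dtmf.split(separator):
        if not group:  # skip the empty piece after a trailing separator
            continue
        key = group[0]
        # Every press in a group must be the same lettered key.
        if key not in KEYPAD or group != key * len(group):
            raise ValueError(f"invalid key group: {group!r}")
        options = KEYPAD[key]
        if len(group) > len(options):
            raise ValueError(f"too many presses of {key!r}: {group!r}")
        letters.append(options[len(group) - 1])
    return "".join(letters)
```

For example, `decode_multitap("5#55#555#")` yields the three letters from the post. A timeout between key groups would work instead of `#`, but that needs timing information the DTMF string alone does not carry.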

On the other hand, if the recognition is only for the alphabet, then it could be done. Sphinx is the place to begin.
Vicidial Installation and Repair, plus Hosting and Colocation
Newest Product: Vicidial Agent Only Beep - Beta
http://www.PoundTeam.com # 352-269-0000 # +44(203) 769-2294
williamconley
 
Posts: 20019
Joined: Wed Oct 31, 2007 4:17 pm
Location: Davenport, FL (By Disney!)

Postby johnyjj2 » Tue Oct 20, 2009 12:05 am

Thanks for your answers :-)!

If not Sphinx, then what? Julius, HTK?

Yes, I can try using the keypad, but the most difficult thing for me is to establish the connection between the mobile phone and the server :-P.

What kind of device do I need on the server side? I thought about a SIP trunk or a normal analog phone line using a Sangoma or Digium card.

How would I connect (in the case of speech) from a mobile phone? By calling a special number or by opening a special application on the mobile?

Greetings :-)!
johnyjj2
 

Postby williamconley » Tue Oct 20, 2009 1:02 am

Not even remotely difficult to establish the connection. It's just a phone call and a dial plan entry. Most of it could be handled quite easily with a survey within Vicidial.

SIP trunk is a good place to start (cheapest test version). And DTMF from cell phone to Vicidial is quite reliable. Better than voice recognition. Do it with ulaw or alaw to begin with.

It could be done with limewire modifications or with other survey applications. We've written our own survey applications as well. You play the greeting, then ask for the information you require (in pieces if you can, so that no single entry is too long, especially if text is involved), then run your "verification" algorithm, give the client the response, and either loop or terminate.
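
The flow described above (greeting, ask in pieces, verify, respond, loop or terminate) can be sketched as a small loop. This is only an illustration: `ask`, `say` and `verify` are hypothetical callables that a real script would back with AGI commands and the actual checksum, and the prompt names and three-attempt limit are made up.

```python
from typing import Callable, Sequence


def run_survey(
    ask: Callable[[str], str],      # plays a prompt, returns the digits entered
    say: Callable[[str], None],     # plays a message to the caller
    verify: Callable[[str], bool],  # the "verification" algorithm
    piece_prompts: Sequence[str],   # one prompt per piece of the code
    max_attempts: int = 3,
) -> bool:
    """Greet, collect the code in pieces, verify, and loop or terminate."""
    say("greeting")
    for _ in range(max_attempts):
        # Ask for the code in short pieces so no single entry is too long.
        code = "".join(ask(prompt) for prompt in piece_prompts)
        if verify(code):
            say("code-is-proper")
            return True
        say("code-is-improper")  # loop and let the caller try again
    say("goodbye")               # terminate after too many failures
    return False
```

Keeping the telephony calls behind `ask` and `say` means the keypad version and a later speech-recognition version can share the same loop.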

Trust me, though, you want to do the first version without voice just to get it running. If you can. Is there a "pattern" to the digits you are verifying? (text in specific places always?)
williamconley
 

