Improving Recognition Accuracy of Urdu Weather Service by Identifying Out-of-Vocabulary Words

Saad Irtza, Aneek Anwar, Sarmad Hussain

Abstract


Mobile-based dialogue systems in local languages provide a well-suited information delivery channel. Many tasks can be addressed with small-vocabulary systems. However, because a small-vocabulary system generally tries to match each input word to one of the words in its vocabulary, any out-of-vocabulary (OOV) words spoken inadvertently are also mapped onto the closed vocabulary, which reduces accuracy. The current work addresses this issue. We present the development of a mobile-based dialogue system in a local language (Urdu) that provides weather information to urban and rural populations. The performance of its speaker-independent automatic speech recognition (ASR) system is evaluated through offline and online testing. In offline testing, on an unseen dataset limited to the speakers used for training the system, 100% accuracy is achieved. In online testing, 74.79% accuracy is achieved. Analysis shows that a significant reduction in accuracy is caused by OOV words spoken by users. A phone-based model is then added to detect and reject OOV words, and system accuracy improves to 88.24%.

Copyright (c) 2016 Saad Irtza, Aneek Anwar, Sarmad Hussain
