Multi-Font and Multi-Size Printed Sindhi Character Recognition using Convolutional Neural Networks

Authors

  • Asghar Ali Chandio, School of Engineering and Information Technology, University of New South Wales, Australia; Information Technology Department, Quaid-e-Awam University of Engineering, Science & Technology, Nawabshah, Pakistan
  • Mehwish Leghari
  • Mehjabeen Leghari
  • Akhtar Hussain Jalbani

Abstract

This paper addresses the problem of multi-font, multi-color, and multi-size printed character recognition for the Sindhi language. Although previous studies on offline handwritten isolated Sindhi character recognition with a single font and size have achieved satisfactory results, recognizing characters across multiple fonts, sizes, and colors remains a major challenge because of the wide variation in the shape, style, and layout of the characters. A synthetic dataset of Sindhi character images with colored backgrounds, covering multiple fonts, sizes, and colors, is created. Three separate experiments with Convolutional Neural Networks (CNNs) are performed. The first network uses a max-pooling layer after every two convolutional layers, the second applies multiple max-pooling layers after the last convolutional layer, and the third uses no max-pooling layers. The experimental results demonstrate that a CNN with max-pooling layers improves the performance significantly. Recognition rates of 99.96%, 97.94%, and 98.72% are achieved with the first, second, and third networks respectively, which shows that a CNN with pooling layers is more effective.
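
The three pooling placements described in the abstract can be sketched as plain layer sequences. The following is a minimal illustration, not the authors' implementation: the 32x32 input, the four convolutional layers, the 3x3 "same"-padded convolutions, the 2x2 pooling window, and the function names are all assumptions, and "multiple max-pooling layers" is read here as stacked layers, which is only one possible reading.

```python
# Illustrative sketch only. Layer counts, the 32x32 input, 'same' padding and
# the 2x2 pooling window are assumptions, not values taken from the paper.

def conv_same(size):
    """A convolution with 'same' padding leaves the spatial size unchanged."""
    return size


def max_pool(size, window=2):
    """Non-overlapping max-pooling divides the spatial size by the window."""
    return size // window


def build_layout(kind, n_conv=4, n_pool=2):
    """Return the layer sequence for one of the three pooling placements."""
    if kind == "pool_every_two":
        layers = []
        for i in range(1, n_conv + 1):
            layers.append("conv")
            if i % 2 == 0:  # pool after every second convolution
                layers.append("pool")
        return layers
    if kind == "pool_after_last":
        # All convolutions first, then the pooling layers (assumed stacked).
        return ["conv"] * n_conv + ["pool"] * n_pool
    if kind == "no_pool":
        return ["conv"] * n_conv
    raise ValueError(f"unknown layout: {kind}")


def output_size(layers, size=32):
    """Trace the feature-map side length through a layer sequence."""
    for layer in layers:
        size = conv_same(size) if layer == "conv" else max_pool(size)
    return size


if __name__ == "__main__":
    for kind in ("pool_every_two", "pool_after_last", "no_pool"):
        layers = build_layout(kind)
        print(kind, layers, output_size(layers))
```

Under these assumptions the first two layouts end with the same feature-map size, but the first one pools earlier, so its later convolutions operate on smaller maps; the third layout keeps the full input resolution throughout.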

Author Biography

Asghar Ali Chandio, School of Engineering and Information Technology, University of New South Wales, Australia; Information Technology Department, Quaid-e-Awam University of Engineering, Science & Technology, Nawabshah, Pakistan

PhD Scholar

Assistant Professor, QUEST, Nawabshah

Published

2019-10-14

Section

Computer Science