Urdu Named Entity Recognition System using Hidden Markov Model

Muhammad Kamran Malik; Syed Mansoor Sarwar

Published: 2017-08-21

Muhammad Kamran Malik

PUCIT, University of the Punjab

Syed Mansoor Sarwar

PUCIT, University of the Punjab

Abstract

Named Entity Recognition (NER) is the process of identifying person, organization, and location names, along with miscellaneous information such as numbers, dates, and measures, in text. In this paper, we describe the development of an NER system for the Urdu language using a Hidden Markov Model (HMM). We first compare the IOB2 and IOE2 tagging schemes. We then describe how Urdu text is preprocessed before it is fed to the HMM for training with the IOE2 tagging scheme. Finally, we use Part of Speech (POS) information, gazetteers, and rules to improve the accuracy of the system. Our system yields 66.71%, 71.70%, and 69.12% for precision, recall, and F-measure, respectively.
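As a rough illustration of the two tagging schemes named in the abstract, the sketch below labels the same entity spans under IOB2 (the first token of each entity is marked `B-`) and IOE2 (the last token of each entity is marked `E-`). It also checks that the reported F-measure is the harmonic mean of the reported precision and recall. The helper names `tag_spans` and `f_measure`, the English example sentence, and the `PER`/`ORG` labels are assumptions made for illustration; they are not taken from the paper's implementation or tag set.

```python
from typing import List, Tuple

# (start index, end index exclusive, entity type); hypothetical span format
Span = Tuple[int, int, str]


def tag_spans(n_tokens: int, spans: List[Span], scheme: str) -> List[str]:
    """Label tokens under IOB2 or IOE2, given non-overlapping entity spans."""
    if scheme not in ("IOB2", "IOE2"):
        raise ValueError(f"unsupported scheme: {scheme}")
    tags = ["O"] * n_tokens
    for start, end, etype in spans:
        # Every token inside an entity starts out as I-<type>.
        for i in range(start, end):
            tags[i] = f"I-{etype}"
        if scheme == "IOB2":
            # IOB2: the first token of every entity is B-<type>.
            tags[start] = f"B-{etype}"
        else:
            # IOE2: the last token of every entity is E-<type>.
            tags[end - 1] = f"E-{etype}"
    return tags


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Illustrative English sentence, not data from the paper.
    tokens = ["Syed", "Mansoor", "Sarwar", "works", "at", "PUCIT"]
    spans = [(0, 3, "PER"), (5, 6, "ORG")]
    for scheme in ("IOB2", "IOE2"):
        print(scheme, list(zip(tokens, tag_spans(len(tokens), spans, scheme))))
    # Precision and recall as reported in the abstract, in percent.
    print(round(f_measure(66.71, 71.70), 2))
```

Note that a single-token entity such as the last token above receives `B-` under IOB2 and `E-` under IOE2, so the two schemes differ only in which boundary of the entity they mark explicitly. With the reported precision and recall, the harmonic mean rounds to the reported 69.12%.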

Pakistan Journal of Engineering and Applied Sciences
