Recognition System for Libyan Entity Names
Named Entity Recognition (NER) is a task in computational linguistics that finds and classifies proper nouns in a text, such as person names, geographical locations, and organizations. It is fundamental to natural language processing. In Libya, many private and public institutions struggle to translate entity names correctly from Arabic into English. In this paper, we therefore analyze Arabic articles to extract and recognize entity names. We develop a recognition system for names of persons, academic institutions, and cities in Libya. First, a training corpus and dictionaries are built for the targeted entity names. Next, the characteristics of these entity names are studied, and patterns and rules are designed for them. The system is then implemented using the NooJ linguistic development environment. Recognition of person names, Libyan cities, and academic institutions was carried out, and statistics were computed on how frequently each entity type appears in our training corpus. The obtained results are promising and meet the research goals for tackling the problem of Arabic named entity recognition.
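The dictionary-and-rule idea summarized above can be illustrated with a minimal Python sketch. This is not the NooJ implementation used in this work: the tiny dictionaries, the trigger words, the rule that a title is followed by a two-token name, and the sample sentence are all hypothetical and chosen only to show how dictionaries, patterns, and frequency counts might fit together.

```python
from collections import Counter

# Hypothetical mini-dictionaries; the actual dictionaries of the paper are not shown.
CITIES = {"طرابلس", "بنغازي", "مصراتة"}          # Tripoli, Benghazi, Misrata
INSTITUTION_TRIGGERS = {"جامعة"}                  # "university"
PERSON_TITLES = {"الدكتور", "السيد"}              # "Dr.", "Mr."


def recognize(text):
    """Return a list of (label, surface form) pairs found in whitespace-split text."""
    tokens = text.split()
    entities = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if tok in INSTITUTION_TRIGGERS and nxt in CITIES:
            # Assumed pattern: institution trigger followed by a city name.
            entities.append(("INSTITUTION", f"{tok} {nxt}"))
            i += 2
        elif tok in PERSON_TITLES and i + 2 < len(tokens):
            # Assumed pattern: a title followed by a two-token personal name.
            entities.append(("PERSON", " ".join(tokens[i + 1:i + 3])))
            i += 3
        elif tok in CITIES:
            # A city name found directly in the dictionary.
            entities.append(("CITY", tok))
            i += 1
        else:
            i += 1
    return entities


# Hypothetical sentence: "Dr. Mohamed Ali visited the University of Tripoli,
# then travelled to Benghazi."
SAMPLE = "زار الدكتور محمد علي جامعة طرابلس ثم سافر إلى بنغازي"

found = recognize(SAMPLE)
# Frequency of each entity type, analogous to the corpus statistics in the abstract.
counts = Counter(label for label, _ in found)
```

Checking the institution pattern before the plain city lookup keeps a city that is part of an institution name from being counted a second time as a standalone city.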