Building a Named Entity Recognition model for Ethiopian Languages: a comparative analysis of composite feature embedding

Authors

  • Sintayehu Hirpassa ASTU

DOI:

https://doi.org/10.20372/mwu.jessd.2025.15671

Keywords:

Named Entity Recognition, Amharic, Conditional random field, Recurrent neural network, Long short-term memory, Convolutional neural network

Abstract

Named Entity Recognition (NER) is a crucial step in information extraction, machine translation, and question answering systems across many languages. The selection and encoding of input features strongly influence NER quality, because they determine the semantic and grammatical representation vectors the model works with. However, existing NER models are insufficient when it comes to handling new and unseen entity types in the expanding body of Amharic digital data, and extensive research is therefore focused on developing more effective and accurate NER models. In this context, we propose a deep learning NER model that represents word tokens through a combinatorial feature embedding design, and we compare it with existing models for Ethiopian languages. Word vectors created for all tokens with an unsupervised learning algorithm are merged with a set of language-independent features developed specifically for this purpose. The combined features are then fed into a neural network model to predict word classes. Empirical results on the Ethiopian language datasets show that incorporating character-level word embeddings along with the other features in BiLSTM-CRF models yields state-of-the-art performance. To show the model's ability to generalize to different languages, we evaluated it on two datasets and achieved accuracy rates of 92.88% on AM_NER and 82.35% on Oro_NER.
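
The merging step described in the abstract, where a token's word vector is joined with language-independent features before being passed to the network, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the toy `EMBEDDINGS` table, the three example features, and the helper functions are hypothetical and are not taken from the paper, which does not list its feature set or embedding dimensions on this page.

```python
from typing import Dict, List

# Hypothetical toy embedding table; in practice these vectors would come
# from an unsupervised model trained on a large corpus.
EMBEDDINGS: Dict[str, List[float]] = {
    "addis": [0.12, -0.40, 0.33],
    "ababa": [0.10, -0.35, 0.30],
    "in": [-0.20, 0.05, 0.01],
}
UNKNOWN = [0.0, 0.0, 0.0]  # fallback for out-of-vocabulary tokens


def handcrafted_features(tokens: List[str], i: int) -> List[float]:
    """Illustrative language-independent features (assumed, not from the paper)."""
    token = tokens[i]
    return [
        1.0 if i == 0 else 0.0,                            # sentence-initial position
        1.0 if any(c.isdigit() for c in token) else 0.0,   # contains a digit
        min(len(token), 10) / 10.0,                        # capped, normalized length
    ]


def composite_vector(tokens: List[str], i: int) -> List[float]:
    """Concatenate the word embedding with the handcrafted features."""
    embedding = EMBEDDINGS.get(tokens[i].lower(), UNKNOWN)
    return embedding + handcrafted_features(tokens, i)


sentence = ["Addis", "Ababa", "in", "2025"]
vectors = [composite_vector(sentence, i) for i in range(len(sentence))]
```

Each entry of `vectors` is one composite input of fixed length (here 3 embedding dimensions plus 3 features). In a model of the kind the abstract describes, a sequence of such vectors would be the input to the BiLSTM-CRF tagger, typically alongside a character-level representation of each token.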

Published

2025-02-13

How to Cite

Hirpassa, S. (2025). Building a Named Entity Recognition model for Ethiopian Languages: a comparative analysis of composite feature embedding. Journal of Equity in Sciences and Sustainable Development, 8(1), 93–105. https://doi.org/10.20372/mwu.jessd.2025.15671