Basic Article Information

  • Title: Improving Rule-Based Method for Arabic POS Tagging Using HMM Technique
  • Authors: Meryeme Hadni; Said Alaoui Ouatik; Abdelmonaime Lachkar
  • Journal: Computer Science & Information Technology
  • Electronic ISSN: 2231-5403
  • Publication year: 2013
  • Volume: 3
  • Issue: 8
  • Pages: 257-269
  • DOI: 10.5121/csit.2013.3821
  • Publisher: Academy & Industry Research Collaboration Center (AIRCC)
  • Abstract: A part-of-speech (POS) tagger plays an important role in natural language applications such as speech recognition, natural language parsing, information retrieval, and multi-word term extraction. This study proposes an efficient and accurate POS tagging technique for the Arabic language using a statistical approach. The Arabic rule-based method suffers from misclassified and unanalyzed words because of ambiguity. To overcome these two problems, we propose a Hidden Markov Model (HMM) integrated with the Arabic rule-based method. Our POS tagger generates a set of 4 POS tags: Noun, Verb, Particle, and Quranic Initial (INL). The proposed technique uses the contextual information of the words together with a variety of features that help predict the POS classes. To evaluate its accuracy, the proposed method has been trained and tested on the Holy Quran Corpus, which contains 77,430 terms of undiacritized Classical Arabic. The experimental results demonstrate the efficiency of our method for Arabic POS tagging: the obtained accuracies are 97.6% for our method and 94.4% for the rule-based tagger.
  • Keywords: Natural Language Applications; Natural Language Parsing; Part-of-Speech Tagger; Hidden Markov Model; Speech Recognition