
Article Information

  • Title: Unstructured Documents Categorization: A Study
  • Authors: Debnath Bhattacharyya; Poulami Das; Debashis Ganguly
  • Journal: International Journal of Signal Processing, Image Processing and Pattern Recognition
  • Print ISSN: 2005-4254
  • Year: 2008
  • Volume: 1
  • Issue: 1
  • Publisher: SERSC
  • Abstract: The main purpose of communication is to transfer information from one part of the world to another. This information is stored as documents or files created as requirements arise, and the ad hoc way in which they are created and stored leaves them unstructured. As a consequence, data retrieval and modification become difficult. Data that is required frequently should follow a consistent pattern; otherwise, problems such as retrieval of erroneous data, anomalies during modification, and slow retrieval become more likely. Unstructured document categorization addresses this problem: the collected unstructured documents are categorized according to given constraints. This paper reviews techniques that have appeared in the literature for achieving unstructured document categorization, including text and data mining, genetic algorithms, lexical chaining, and binarization.
  • Keywords: Unstructured Documents; Categorization; Text and Data Mining; Genetic Algorithm; Lexical Chaining; Binarization