Journal title:International Journal of Computer Science Issues
Print ISSN:1694-0784
Electronic ISSN:1694-0814
Publication year:2012
Volume:9
Issue:2
Publisher:IJCSI Press
Abstract:This paper describes a plane sweep algorithm that makes optimal use of tandem replicated data in a document. It starts from the observation that the original plane sweep algorithm is used to search documents but offers no fast way to find words that are repeated in tandem. Using a set of effective parameters, we develop a technique that detects the number of tandem replicated words in a document and reduces the number of comparisons; reducing the number of keywords in a document in this way speeds up the search. This requires a link between words and documents, and the proposed algorithm, WPSR, provides a solution similar to that of the original plane sweep algorithm. Given the volume of data, we derive a canonical form that improves the recognition of duplicate words at large scale. The proposed algorithm has a lower-order time complexity than the basic algorithm, and in terms of reliability a WPSR-based system is considered the best choice for web search. Finally, system efficiency and performance are higher than those of the previous similar algorithm in this field.
Keywords:Plane sweep algorithm; Replicated data; String matching; Optimized algorithm; Web search; Text retrieval; Proximity search.
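
The abstract's central idea, detecting words repeated in tandem so that fewer keyword occurrences need to be compared, can be illustrated with a minimal sketch. This is not the WPSR algorithm itself, whose details the abstract does not give; it only collapses runs of identical adjacent words into a single entry, assuming whitespace tokenization and exact, case-sensitive word matching. The function names `collapse_tandem_repeats` and `tandem_repeats` are hypothetical.

```python
from itertools import groupby


def collapse_tandem_repeats(words):
    """Collapse each run of identical adjacent words into (word, start, count).

    Illustrative assumption: a "tandem repeat" is a run of the same word
    occurring back to back. The abstract does not define the term precisely.
    """
    runs = []
    position = 0  # index of the first word of the current run
    for word, group in groupby(words):
        count = sum(1 for _ in group)
        runs.append((word, position, count))
        position += count
    return runs


def tandem_repeats(words):
    """Return only the runs in which a word is repeated at least twice."""
    return [run for run in collapse_tandem_repeats(words) if run[2] > 1]


text = "the the plane sweep sweep sweep algorithm"
runs = collapse_tandem_repeats(text.split())
repeats = tandem_repeats(text.split())
```

In this example the seven input words collapse into four runs, so a later proximity search would compare four entries instead of seven; the stored start index and count keep the original positions recoverable.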