Article Information

  • Title: Big Data Analysis Using Fuzzy Clustering Algorithms Implemented on Spark Framework
  • Authors: Divyashree. V; Deepika. N
  • Journal: International Journal of Innovative Research in Computer and Communication Engineering
  • Print ISSN: 2320-9798
  • Electronic ISSN: 2320-9801
  • Publication year: 2017
  • Volume: 5
  • Issue: 3
  • Page: 5971
  • DOI: 10.15680/IJIRCCE.2017.0503347
  • Publisher: S&S Publications
  • Abstract: A huge amount of data containing useful information, called Big Data, is generated on a daily basis. Processing such a tremendous volume of data requires Big Data frameworks such as Hadoop MapReduce and Apache Spark. Among these, Apache Spark performs up to 100 times faster than conventional frameworks like Hadoop MapReduce. We focus on the design of a partitional clustering algorithm and its implementation on Apache Spark. In this paper, we propose a partition-based clustering algorithm called Scalable Random Sampling with Iterative Optimization Fuzzy c-Means (SRSIO-FCM), which is implemented on Apache Spark to handle the challenges associated with Big Data clustering. Experiments on several big datasets show the effectiveness of SRSIO-FCM in comparison with a proposed scalable version of Literal Fuzzy c-Means (LFCM), called SLFCM, implemented on Apache Spark.
  • Keywords: Apache Spark; Big Data; SRSIO-FCM; LFCM; SLFCM
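
For readers unfamiliar with the fuzzy c-means family named in the abstract, the following is a minimal single-machine sketch of the textbook fuzzy c-means iteration (alternating center and membership updates). It is not the SRSIO-FCM or SLFCM algorithm from the paper and does not use Apache Spark; the function name `fuzzy_c_means`, its parameters, and the defaults (fuzzifier `m=2.0`, tolerance, seed) are assumptions chosen for illustration.

```python
import random


def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Textbook fuzzy c-means on a list of equal-length numeric tuples.

    Illustrative only: a plain in-memory version, not the paper's
    Spark-based SRSIO-FCM or SLFCM.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-12 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(max_iter):
        # Center update: mean of the points weighted by u_ij ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / total
                for d in range(dim)
            ))

        # Membership update from squared distances to every center.
        shift = 0.0
        for i in range(n):
            d2 = [sum((points[i][d] - ctr[d]) ** 2 for d in range(dim))
                  for ctr in centers]
            if min(d2) == 0.0:
                # The point sits exactly on one or more centers:
                # split its membership evenly among those centers.
                hits = sum(1 for v in d2 if v == 0.0)
                new = [1.0 / hits if v == 0.0 else 0.0 for v in d2]
            else:
                inv = [v ** (-1.0 / (m - 1.0)) for v in d2]
                total = sum(inv)
                new = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new

        # Stop once no membership changed by more than tol.
        if shift < tol:
            break

    return centers, u
```

Calling `fuzzy_c_means(points, c=2)` returns the cluster centers and a membership matrix whose rows sum to 1. Each iteration touches every point, which is the kind of cost that motivates sampling-based and distributed variants for large datasets.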