Open Access Peer Reviewed DOI Prefix: 10.20431

International Journal of Research Studies in Computer Science and Engineering
Volume 6, Issue 1, 2019, Page No: 6-15

A Comparison-Based Soft Clustering Algorithm for Documents

Ganesh Yadav1, Vipul Kumar Verma2

1. Assistant Professor, Department of CSE, IIMT Greater Noida, India.
2. Assistant Professor, Department of CSE, IIMT Greater Noida, India.

Citation: Ganesh Yadav, Vipul Kumar Verma, A Comparison-Based Soft Clustering Algorithm for Documents. International Journal of Research Studies in Computer Science and Engineering 2019, 6(1): 6-15.

Abstract

Document clustering is an important tool for document search applications such as Web search engines. Clustering documents gives the user a good overall view of the information contained in the documents at hand. However, existing clustering algorithms have drawbacks: hard clustering algorithms (where each document belongs to exactly one cluster) cannot detect the multiple themes of a document, while soft clustering algorithms (where each document can belong to multiple clusters) are usually inefficient. We propose CSCA (Comparison-based Soft Clustering Algorithm), an efficient soft clustering algorithm based on a given similarity measure. CSCA requires only a similarity measure for clustering and uses randomization to make the clustering efficient. Comparison with existing hard clustering algorithms such as K-means and its variants shows that CSCA is both effective and efficient.
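
The distinction between hard and soft assignment under a given similarity measure can be illustrated with a minimal sketch. This is not CSCA itself, whose steps the abstract does not describe. The cosine measure, the fixed threshold of 0.25, the toy centroids, and the function names `cosine`, `hard_assign`, and `soft_assign` are all assumptions made for illustration only.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hard_assign(doc, centroids, sim=cosine):
    """Hard clustering: the document goes to exactly one cluster, the most similar."""
    return max(centroids, key=lambda name: sim(doc, centroids[name]))


def soft_assign(doc, centroids, sim=cosine, threshold=0.25):
    """Soft clustering: the document joins every cluster that is similar enough.

    The threshold value is an illustrative assumption, not taken from the paper.
    """
    return sorted(name for name, c in centroids.items() if sim(doc, c) >= threshold)


# Toy cluster centroids and a document that touches two themes.
centroids = {
    "sports": Counter("match team score goal team".split()),
    "finance": Counter("market stock price bank stock".split()),
}
doc = Counter("team owner sells stock after match".split())

print(hard_assign(doc, centroids))   # single theme only
print(soft_assign(doc, centroids))   # every theme above the threshold
```

In this toy case the hard assignment keeps only the sports theme, while the soft assignment also retains the finance theme, which is the multi-theme behavior the abstract attributes to soft clustering. Any similarity function with the same two-argument shape could be passed as `sim`.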

