A comparative analysis of similarity metrics on sparse data for clustering in recommender systems

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

This work examines how similarity metrics behave on sparse data in recommender systems (RS). Clustering is an important technique in RS for forming groups of users or items in order to personalize and optimize recommendations. Most clustering techniques try to minimize the Euclidean distance between the samples and their centroid, but this approach has a drawback on sparse data because it treats a missing value as zero. We propose a comparative analysis of similarity metrics, namely Pearson Correlation, Jaccard, Mean Square Difference, Jaccard Mean Square Difference and Mean Jaccard Difference, as alternatives to Euclidean distance. We report results for the FilmTrust and MovieLens 100K datasets, both of which are free, public and highly sparse. We show that using similarity measures gives better accuracy on sparse data in terms of Mean Absolute Error and Within-Cluster.
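The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the formulas below are the forms commonly used in the recommender-systems literature (Jaccard over rated items, MSD on ratings scaled to [0, 1], JMSD as their product), and the function names, the NaN encoding of missing ratings, the 1-5 rating scale and the two example users are all assumptions made for illustration. Mean Jaccard Difference is left out because its definition is not given here.

```python
import numpy as np

def co_rated(u, v):
    """Boolean mask of items rated by both users (NaN marks a missing rating)."""
    return ~np.isnan(u) & ~np.isnan(v)

def euclidean_zero_filled(u, v):
    """Euclidean distance after replacing every missing rating with 0."""
    return float(np.linalg.norm(np.nan_to_num(u) - np.nan_to_num(v)))

def jaccard(u, v):
    """Co-rated items divided by items rated by at least one of the two users."""
    union = np.sum(~np.isnan(u) | ~np.isnan(v))
    return float(np.sum(co_rated(u, v)) / union) if union else 0.0

def msd_similarity(u, v, r_min=1.0, r_max=5.0):
    """1 minus the mean squared difference of co-rated ratings scaled to [0, 1]."""
    mask = co_rated(u, v)
    if not mask.any():
        return 0.0
    diff = (u[mask] - v[mask]) / (r_max - r_min)
    return float(1.0 - np.mean(diff ** 2))

def jmsd(u, v, r_min=1.0, r_max=5.0):
    """Product of Jaccard and MSD similarity (assumed form of JMSD)."""
    return jaccard(u, v) * msd_similarity(u, v, r_min, r_max)

def pearson(u, v):
    """Pearson correlation over co-rated items, centred on the co-rated means."""
    mask = co_rated(u, v)
    if mask.sum() < 2:
        return 0.0
    a = u[mask] - u[mask].mean()
    b = v[mask] - v[mask].mean()
    denom = np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))
    return float(a @ b / denom) if denom else 0.0

# Two hypothetical users who agree on every item they both rated.
alice = np.array([5.0, np.nan, 4.0, np.nan, 1.0, np.nan])
bob = np.array([5.0, 3.0, 4.0, np.nan, np.nan, np.nan])

print("zero-filled Euclidean:", euclidean_zero_filled(alice, bob))
print("Jaccard:", jaccard(alice, bob))
print("MSD similarity:", msd_similarity(alice, bob))
print("JMSD:", jmsd(alice, bob))
print("Pearson:", pearson(alice, bob))
```

In this toy case the zero-filled Euclidean distance is driven entirely by the items that only one user rated, while the MSD and Pearson values depend only on the two co-rated items and Jaccard records how small that overlap is.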

Original language: English (US)
Title of host publication: A comparative analysis of similarity metrics on sparse data for clustering in recommender systems
Editors: Tareq Z. Ahram
Pages: 291-299
Number of pages: 9
ISBN (Electronic): 9783319942285
State: Published - 1 Jan 2019
Event: Advances in Intelligent Systems and Computing, Germany
Duration: 1 Jan 2015 → …

Publication series

Name: Advances in Intelligent Systems and Computing
Volume: 787
ISSN (Print): 2194-5357

Conference

Conference: Advances in Intelligent Systems and Computing
Country/Territory: Germany
Period: 1/01/15 → …

Keywords

  • Clustering
  • Recommender systems
  • Similarity measures
