A Clustering Approach for Outliers Detection in a Big Point-of-Sales Database

Research output: Chapter in book/report/conference proceedings › Conference article › Scientific › peer-reviewed

Abstract

Finding outliers, that is, rare events in a collection of patterns, has become an emerging issue in the area of machine learning concerned with detecting and eventually removing anomalous objects in data. A key challenge in outlier/anomaly detection is that the problem is not well formulated. Outliers are defined as extreme values that deviate from the overall patterns in the data; they may indicate experimental errors, variability in measurement, or a novelty. Detecting outliers in large databases can lead to the discovery of hidden knowledge. In addition, identifying and removing outliers often helps to ensure that the observations represent the problem correctly. Although there are several techniques for detecting outliers/anomalies in a given database, no single technique has been proven to be the standard universal choice. Depending on the nature of the target application, different implementations require different outlier detection methods. Clustering is a very powerful method in machine learning, and it defines outliers in terms of their distance to the cluster centers. In this study, we propose a clustering-based approach to identifying outliers in a retail point-of-sales dataset. To select the best clustering algorithm for the purpose, two algorithms are applied: K-means for hard (crisp) clustering and Fuzzy C-means (FCM) for soft clustering. The experimental results show that the K-means algorithm outperforms the FCM algorithm in terms of outlier detection efficiency and that it is an effective outlier detection solution.
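The distance-to-center idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic two-cluster data, the rough initial centers, the helper names (`assign`, `kmeans`, `flag_outliers`), and the "mean plus three standard deviations" threshold are all assumptions made for illustration, since the record does not state how the paper sets its cut-off.

```python
import numpy as np


def assign(X, centers):
    """Return the index of the nearest center for every row of X."""
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(X, k, init=None, n_iter=100, seed=0):
    """Plain Lloyd's K-means; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    if init is None:
        # Random initialization from the data points.
        centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    else:
        centers = np.asarray(init, dtype=float).copy()
    for _ in range(n_iter):
        labels = assign(X, centers)
        # Recompute each center; keep the old one if its cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, assign(X, centers)


def flag_outliers(X, centers, labels, n_std=3.0):
    """Flag points far from their own cluster center.

    The threshold (mean + n_std * std of the distances) is an assumed rule,
    not one taken from the paper.
    """
    dist = np.linalg.norm(X - centers[labels], axis=1)
    threshold = dist.mean() + n_std * dist.std()
    return dist > threshold, dist


# Hypothetical data: two tight groups plus three injected anomalies.
rng = np.random.default_rng(42)
inliers = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.5, size=(200, 2)),
    rng.normal(loc=(10.0, 10.0), scale=0.5, size=(200, 2)),
])
injected = np.array([[5.0, 30.0], [-20.0, 0.0], [30.0, 10.0]])
X = np.vstack([inliers, injected])

# Rough initial guesses keep this small demo deterministic.
centers, labels = kmeans(X, k=2, init=[[0.0, 0.0], [10.0, 10.0]])
mask, dist = flag_outliers(X, centers, labels)
print("flagged row indices:", np.flatnonzero(mask))
```

With a soft method such as Fuzzy C-means, each point would instead carry a membership degree for every cluster, so the flagging rule would have to be defined on memberships or weighted distances rather than on a single nearest center.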

Original language: English
Title of host publication: 2019 International Conference on Machine Learning and Data Engineering (iCMLDE)
Editors: Phill Kyu Rhee, Kuo-Yuan Hwa, Tun-Wen Pai, Daniel Howard, Md Rezaul Bashar
Publisher: IEEE
Pages: 65–71
ISBN (print): 978-1-7281-0404-1
DOI (permanent links)
Status: Published - 2019
MoEC publication type: A4 Article in conference proceedings
Event: International Conference on Machine Learning and Data Engineering (iCMLDE) - 2019 International Conference on Machine Learning and Data Engineering (iCMLDE)
Duration: 2 December 2019 to 4 December 2019

Conference

Conference: International Conference on Machine Learning and Data Engineering (iCMLDE)
Period: 02/12/19 to 04/12/19

Keywords

  • Clustering
  • Noise
  • Outlier detection
  • Point-of-sales analysis
