Keywords

Business Strategies, Competitive Advantages, Data-driven K-means Algorithm, K-means Algorithm, Pre-processing Techniques, Strategic Planning

Abstract

In today’s dynamic business environment, machine learning (ML) and other algorithm-based, data-driven models are essential for competitive advantage and strategic planning. This study aims to demonstrate the effectiveness of ML models, specifically the standard K-means clustering algorithm, in identifying patterns that can inform strategic business decisions. A synthetic dataset was generated to simulate real-world business data scenarios, and the K-means algorithm was applied both with and without data pre-processing techniques such as scaling. The results indicate that although K-means remains a powerful and widely applicable clustering method, its performance improves significantly with proper data scaling and identification of the optimal number of clusters. The findings offer valuable insight into how to develop business strategies for complex business scenarios.
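The comparison described in the abstract (K-means with and without scaling on synthetic data) can be illustrated with a minimal sketch. It is not the study's code: the dataset, the feature meanings (revenue, satisfaction, retention), every function name, and the choice of random initial centroids with ten restarts are assumptions made for illustration. It uses only the Python standard library and a plain Lloyd-style K-means.

```python
import random
from statistics import fmean, pstdev

def make_data(n_per_group=100, seed=0):
    """Hypothetical data: two customer segments. Revenue is pure noise on a
    large scale; satisfaction and retention carry the segment signal (0-1)."""
    rng = random.Random(seed)
    rows, labels = [], []
    for label, centre in enumerate((0.2, 0.8)):
        for _ in range(n_per_group):
            rows.append([rng.uniform(20_000, 80_000),
                         rng.gauss(centre, 0.05),
                         rng.gauss(centre, 0.05)])
            labels.append(label)
    return rows, labels

def standardize(rows):
    """Z-score each column: subtract its mean, divide by its std deviation."""
    cols = list(zip(*rows))
    mus = [fmean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(rows, k, seed=0, restarts=10, max_iter=100):
    """Lloyd's algorithm from random data points; returns the labels and
    inertia (within-cluster sum of squares) of the best restart."""
    rng = random.Random(seed)
    best_inertia, best_assign = float("inf"), None
    for _ in range(restarts):
        centroids = [list(r) for r in rng.sample(rows, k)]
        assign = None
        for _ in range(max_iter):
            new_assign = [min(range(k), key=lambda j: sq_dist(r, centroids[j]))
                          for r in rows]
            if new_assign == assign:  # assignments stable: converged
                break
            assign = new_assign
            for j in range(k):
                members = [r for r, a in zip(rows, assign) if a == j]
                if members:  # keep the old centroid if a cluster empties
                    centroids[j] = [fmean(c) for c in zip(*members)]
        inertia = sum(sq_dist(r, centroids[a]) for r, a in zip(rows, assign))
        if inertia < best_inertia:
            best_inertia, best_assign = inertia, assign
    return best_assign, best_inertia

def agreement(pred, truth):
    """Share of points matching the true segment (two clusters only),
    taking the better of the two possible label assignments."""
    same = sum(p == t for p, t in zip(pred, truth)) / len(truth)
    return max(same, 1 - same)

rows, truth = make_data()
raw_labels, _ = kmeans(rows, k=2)
scaled_labels, _ = kmeans(standardize(rows), k=2)
raw_agreement = agreement(raw_labels, truth)
scaled_agreement = agreement(scaled_labels, truth)
print(f"raw: {raw_agreement:.2f}  scaled: {scaled_agreement:.2f}")
```

On data built this way, the unscaled run tends to split on revenue alone, because that feature dominates the Euclidean distance, while the standardized run should recover the two segments. Calling `kmeans` for several values of `k` and comparing the returned inertia is one simple way to look for an elbow when choosing the number of clusters.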


