Behavioral Patterns of Interprovincial Transport Users in Ecuador Using Machine Learning Techniques
Abstract
This study aims to analyze and predict the behavioral patterns of interprovincial transport users in Ecuador using machine learning techniques. It draws on a dataset provided by the Unión de Cooperativas de Transporte Interprovincial de Ecuador covering trips made between 2022 and 2024. The methodology applied K-means for user segmentation and PCA for dimensionality reduction. K-means initially identified four clusters, but the overlap between groups motivated the application of PCA, which improved their separation. The results revealed four groups: Daily Rhythm, Weekend Explorers, Event Nomads, and Flexible Travelers. This segmentation offers key information for optimizing transport services and improving the user experience by matching resources to the needs of each group.
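The pipeline described in the abstract (standardize, reduce with PCA, then segment with K-means) can be illustrated with a minimal sketch. This is not the authors' implementation: the three feature columns, the synthetic data, the choice of two principal components, and the function names are all assumptions made for illustration. Only the use of K-means with four clusters and PCA comes from the abstract.

```python
import numpy as np


def standardize(x):
    """Scale each column to zero mean and unit variance (assumes no constant column)."""
    return (x - x.mean(axis=0)) / x.std(axis=0)


def pca_project(x, n_components):
    """Project the data onto its leading principal components using an SVD."""
    centered = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def kmeans(x, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with centroids initialized from random data points."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; keep the previous one if a cluster is empty.
        updated = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return labels, centroids


# Synthetic stand-in for per-user features. The columns are hypothetical:
# trips per month, share of weekend trips, mean trip distance in km.
rng = np.random.default_rng(0)
group_means = np.array([
    [20.0, 0.1, 50.0],
    [4.0, 0.9, 150.0],
    [1.0, 0.5, 300.0],
    [6.0, 0.4, 100.0],
])
features = np.vstack([
    mean + rng.normal(scale=[1.0, 0.05, 10.0], size=(50, 3))
    for mean in group_means
])

scaled = standardize(features)
reduced = pca_project(scaled, n_components=2)
labels, centroids = kmeans(reduced, k=4)
print(np.bincount(labels, minlength=4))  # number of users per cluster
```

Clustering in the reduced space rather than on the raw features mirrors the order of steps the abstract reports. Whether it improves separation on real data would need to be checked with a validity measure such as the silhouette coefficient.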

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.