International Society of Science and Applied Technologies
A Comprehensive Approach to Tabular Data Classification: FT-Transformer Enhanced by KNN Imputation and ICA in Diabetes Classification
Author | Tho Nguyen
Co-Author(s) | Hao Mai Xuan; Quoc-Thong Nguyen; Kim Duc Tran; Ludovic Koehl; Kim Phuc Tran
Abstract | Diabetes is a major global public health challenge that imposes a significant burden on healthcare systems and socio-economic development. Its impact is expected to grow, with prevalence projected to rise by 59.7% from 2021 to 2050, affecting over 1.31 billion people. There is a growing emphasis on processing medical data and evaluating machine learning models to harness the full potential of artificial intelligence in diabetes diagnosis. The FT-Transformer demonstrated good classification performance, with an accuracy of 98.07%, a precision of 95.76%, and a recall of 98.26%, when evaluated alongside models such as Random Forest, XGBoost, and LGBM. Although it lags slightly behind these models, it holds promise for future applications in more complex database systems. The data processing approach, which combines KNN imputation, ICA, and Isolation Forest, simplifies and optimizes the model effectively. These findings mark a significant step forward in streamlining data processing and give future researchers insight into identifying suitable models, particularly for medical datasets such as those used in diabetes diagnosis.
Keywords | FT-Transformer, tabular data, ICA, KNN imputation, explainable artificial intelligence, diabetes
Article #: DSBFI25-30
January 6-8, 2025 - Da Nang, Vietnam
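
The abstract names KNN imputation as part of the data processing approach but gives no implementation details. As a rough illustration of the general idea only, the sketch below fills each missing value with the mean of that feature over the k nearest rows, using only the Python standard library. The function name `knn_impute`, the parameter `k`, the distance rescaling, and the toy values are all assumptions made for this example; this is not the authors' code, and a real study would more likely rely on an established library.

```python
import math


def knn_impute(rows, k=2):
    """Return a copy of rows with None entries filled by a k-nearest-neighbour mean.

    Simplified, hypothetical sketch: distance is Euclidean over the features
    that both rows have observed, rescaled by the share of features compared.
    """
    n_features = len(rows[0])

    def distance(a, b):
        # Compare only the features observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        squared = sum((x - y) ** 2 for x, y in shared)
        return math.sqrt(squared * n_features / len(shared))

    imputed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows where feature j is observed.
            candidates = sorted(
                (distance(row, other), other[j])
                for idx, other in enumerate(rows)
                if idx != i and other[j] is not None
            )
            donors = [v for d, v in candidates[:k] if d != math.inf]
            if donors:
                imputed[i][j] = sum(donors) / len(donors)
    return imputed


# Toy values invented for illustration; they are not taken from the study.
rows = [
    [1.0, 2.0],
    [1.1, None],  # second feature is missing
    [5.0, 8.0],
    [0.9, 2.4],
]
filled = knn_impute(rows, k=2)
print(filled[1])
```

In this toy case the two rows closest to the incomplete one on the first feature supply the replacement value, while the distant third row is ignored. The input list is left unchanged, so the imputed copy can be compared against the original.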