International Society of Science and Applied Technologies
The STRIDE Database: Workers’ Compensation Claims and Adjuster Notes
Author | Chelsea M. Zuvieta
Co-Author(s) | Richard Bauder; Brian White; Taghi M. Khoshgoftaar
Abstract | The Structured and Textual Records for Injury Data Exploration (STRIDE) database consists of two datasets, one tabular and one text, detailing workers’ compensation claims that were compiled, cleaned, and anonymized for the analysis of workplace injuries. The database provides insights into work-related injury types, costs, and outcomes, with the potential to inform safety improvements, cost reduction strategies, and fair workers’ compensation practices. The tabular data encompasses 230,833 workers’ compensation claims, and the text data contains notes documented by insurance adjusters for 25,691 of those claims. We present the baseline results of a medical cost classification model, which shows promise for future machine learning experiments. We expect primary research uses of the STRIDE database to include predicting various costs, injury severity, claim length, and legal involvement.
Keywords | Workplace Injury, Workers’ Compensation, Machine Learning, Dataset, Data Analysis, Natural Language Processing
Article #: RQD2025-177
Proceedings of 30th ISSAT International Conference on Reliability & Quality in Design