ABSTRACT
PURPOSE: The cure rate in Hodgkin lymphoma is high, but the response to treatment remains unpredictable and highly variable among patients. Detecting at an early stage those patients who do not respond could improve their treatment. This research aims to identify the main biological prognostic variables currently gathered at diagnosis and to design a simple machine learning methodology that helps physicians improve the assessment of treatment response.

METHODS: We carried out a retrospective analysis of the response to treatment in a cohort of 263 Caucasian patients diagnosed with Hodgkin lymphoma in Asturias (Spain). For that purpose, we used a list of 35 clinical and biological variables that are currently measured at diagnosis, before any treatment begins. To establish the list of the most discriminatory prognostic variables for treatment response, we designed a machine learning approach based on two different feature selection methods (Fisher's ratio and maximum percentile distance) and backward recursive feature elimination using a nearest-neighbor classifier (k-NN). The weights of the k-NN classifier were optimized using different terms of the confusion matrix (true- and false-positive rates) to minimize the risk in the decisions.

RESULTS AND CONCLUSIONS: We found that the optimum strategy for predicting treatment response in Hodgkin lymphoma consists of solving two different binary classification problems: first determining whether the patient has progressive disease and, if not, then distinguishing between complete and partial remission. Serum ferritin turned out to be the most discriminatory variable in predicting treatment response, followed by alanine aminotransferase and alkaline phosphatase. The importance of these prognostic variables suggests a close relationship between inflammation, iron overload, liver damage and the extent of the disease.
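As a rough illustration of the feature selection step described in METHODS, the following minimal sketch ranks variables by Fisher's ratio for one binary problem and then removes the least discriminatory variables one at a time, scoring each subset with a leave-one-out nearest-neighbor classifier. It is not the study's implementation. The function names are hypothetical, the Fisher's ratio is assumed to take the common form (difference of class means)² / (sum of class variances), the classifier is a plain unweighted 1-NN, and neither the maximum percentile distance ranking nor the confusion-matrix-based weight optimization is reproduced.

```python
import numpy as np

def fisher_ratio(X, y):
    """Per-feature Fisher's ratio for a binary problem (assumed form):
    (mean_0 - mean_1)^2 / (var_0 + var_1)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    a, b = X[y == 0], X[y == 1]
    num = (a.mean(axis=0) - b.mean(axis=0)) ** 2
    den = a.var(axis=0) + b.var(axis=0)
    # Guard against constant features (zero variance in both classes).
    return num / np.where(den == 0, np.finfo(float).eps, den)

def rank_features(X, y):
    """Feature indices sorted from most to least discriminatory."""
    return np.argsort(fisher_ratio(X, y))[::-1]

def loo_1nn_accuracy(X, y):
    """Leave-one-out accuracy of an unweighted 1-NN classifier (Euclidean)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a sample may not be its own neighbor
    return float(np.mean(y[d.argmin(axis=1)] == y))

def backward_elimination(X, y):
    """Drop the lowest-ranked feature one at a time and keep the smallest
    subset that reaches the best leave-one-out accuracy."""
    X = np.asarray(X, dtype=float)
    order = rank_features(X, y)
    best_acc, best_set = -1.0, order
    for n in range(len(order), 0, -1):
        subset = order[:n]
        acc = loo_1nn_accuracy(X[:, subset], y)
        if acc >= best_acc:  # ties favor the shorter list
            best_acc, best_set = acc, subset
    return best_set, best_acc
```

Under the two-stage strategy reported in RESULTS AND CONCLUSIONS, such a procedure would be run once per binary problem: first with labels encoding progressive disease versus the rest, and then, on the remaining patients, with labels encoding complete versus partial remission. In practice the variables would likely need to be standardized beforehand, since nearest-neighbor distances are sensitive to measurement units.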