Classification and Regression Models for Breast Cancer Predictive Analysis
This paper analyzes the Breast Cancer Wisconsin (Diagnostic) dataset with two goals: predicting whether a breast tumor is benign or malignant, and estimating the mean radius of malignant tumors. Breast cancer is one of the most prevalent cancers among women, so understanding tumor characteristics is crucial for effective diagnosis and treatment. The study applies classification and regression models that exploit correlations between the dataset's features to distinguish benign (non-cancerous) from malignant (cancerous) tumors. The methodology comprises data preprocessing, exploratory data analysis, feature selection, and machine learning techniques such as logistic regression and linear discriminant analysis (LDA). The results indicate that these models predict tumor class accurately and can support clinicians in making informed, data-driven decisions about breast cancer diagnosis and management. By identifying the attributes most strongly correlated with tumor severity, the analysis contributes to ongoing efforts to improve predictive accuracy in breast cancer assessment.
The ZIP file contains the dataset, the R code, the resulting plot, and a report.
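
The R code itself is not reproduced here. As a rough illustration of the kind of pipeline the abstract describes, the following is a minimal sketch, not the authors' implementation. It runs on synthetic stand-in data rather than the Wisconsin dataset, and the column names (diagnosis, radius_mean, texture_mean, concavity_mean), the 70/30 split, the 0.5 threshold, and the use of backward stepwise AIC for feature selection are all assumptions made for the example.

```r
# Sketch only: synthetic stand-in data, NOT the Wisconsin dataset.
library(MASS)  # provides lda(); MASS ships with standard R installations

set.seed(42)
n <- 200

# Hypothetical columns, loosely named after typical tumor measurements.
diagnosis <- factor(rep(c("B", "M"), each = n / 2), levels = c("B", "M"))
shift <- ifelse(diagnosis == "M", 1, 0)  # malignant cases get larger means
tumors <- data.frame(
  diagnosis      = diagnosis,
  radius_mean    = rnorm(n, mean = 12 + 3 * shift, sd = 2),
  texture_mean   = rnorm(n, mean = 18 + 2 * shift, sd = 4),
  concavity_mean = rnorm(n, mean = 0.05 + 0.05 * shift, sd = 0.03)
)

# Assumed 70/30 train/test split.
test_idx <- sample(n, size = round(0.3 * n))
train <- tumors[-test_idx, ]
test  <- tumors[test_idx, ]

# Logistic regression: models P(diagnosis == "M"), the second factor level.
logit_fit <- glm(diagnosis ~ radius_mean + texture_mean + concavity_mean,
                 data = train, family = binomial)
logit_prob <- predict(logit_fit, newdata = test, type = "response")
logit_pred <- factor(ifelse(logit_prob > 0.5, "M", "B"), levels = c("B", "M"))

# Linear discriminant analysis on the same predictors.
lda_fit  <- lda(diagnosis ~ radius_mean + texture_mean + concavity_mean,
                data = train)
lda_pred <- predict(lda_fit, newdata = test)$class

# Feature selection stand-in: backward stepwise search by AIC.
selected <- step(logit_fit, direction = "backward", trace = 0)

# Regression task: linear model for mean radius among malignant cases.
# The synthetic predictors are independent, so this fit is illustrative only.
radius_fit <- lm(radius_mean ~ texture_mean + concavity_mean,
                 data = subset(train, diagnosis == "M"))

# Held-out accuracy and a confusion matrix for the LDA classifier.
logit_acc <- mean(logit_pred == test$diagnosis)
lda_acc   <- mean(lda_pred == test$diagnosis)
print(c(logistic = logit_acc, lda = lda_acc))
print(table(predicted = lda_pred, actual = test$diagnosis))
print(formula(selected))
```

To try the sketch on the real data, the synthetic data frame would be replaced by the dataset file from the ZIP (for example via read.csv), with the formulas adjusted to whatever column names that file actually uses.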