Upload your data and get a complete breast cancer classification report built with PCA + logistic regression. Free.
PCA reduces 30 cell nucleus measurements to principal components for visualization, while logistic regression on scaled features provides interpretable binary classification of malignant vs. benign tumors with ROC/AUC diagnostics.
Use this when you have labeled binary outcome data with many numeric features and want both dimensionality reduction visualization and interpretable linear classification
Do not use if the classes are not linearly separable, if you need probability calibration, or if the dataset has fewer than 50 observations per class
Built for: Clinical researchers, bioinformatics analysts, medical data scientists, oncology research fellows, pathology informatics specialists
Typical data source: CSV with 30 cell nucleus measurements from fine needle aspirate (FNA) biopsies plus a diagnosis label column (M=malignant, B=benign)
Dataset with 32 columns
Minimum 50 rows
569 tumors with 30 numeric features (mean, standard error, and worst value for each of 10 measurements) and a binary diagnosis (M=malignant, B=benign). About 63% are benign. PCA reduces the 30 features to 2-3 components for visualization, and logistic regression identifies the most discriminating features.
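As a rough illustration of how PCA condenses correlated measurements into a few components, the sketch below standardizes a small table and estimates the share of variance captured by the first two components. It is a minimal standard-library sketch, not the module's own R implementation: the six rows, the three column meanings (radius, perimeter, smoothness), and the helper names are hypothetical, and power iteration with deflation stands in for a full eigendecomposition.

```python
import math

def standardize(rows):
    """Z-score every column: subtract the mean, divide by the sample SD."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]

def covariance(rows):
    """Sample covariance matrix of the columns."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / (n - 1)
             for b in range(p)] for a in range(p)]

def top_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and its unit eigenvector, found by power iteration."""
    p = len(matrix)
    v = [float(j + 1) for j in range(p)]  # non-uniform start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # nothing left to explain
            return 0.0, v
        v = [x / norm for x in w]
    mv = [sum(matrix[a][b] * v[b] for b in range(p)) for a in range(p)]
    eigenvalue = sum(v[a] * mv[a] for a in range(p))  # Rayleigh quotient
    return eigenvalue, v

def explained_variance_ratios(rows, n_components=2):
    """Share of total variance captured by each leading principal component."""
    cov = covariance(standardize(rows))
    p = len(cov)
    total = sum(cov[j][j] for j in range(p))
    ratios = []
    for _ in range(n_components):
        lam, v = top_eigenpair(cov)
        ratios.append(lam / total)
        # Deflation: remove the component just found before searching again.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(p)]
               for a in range(p)]
    return ratios

# Hypothetical toy rows: radius, perimeter, smoothness (not real patient data).
rows = [
    [12.0, 78.0, 0.10],
    [13.5, 87.0, 0.09],
    [11.0, 71.0, 0.11],
    [18.0, 119.0, 0.12],
    [20.5, 135.0, 0.10],
    [17.0, 111.0, 0.13],
]
ratios = explained_variance_ratios(rows, n_components=2)
print([round(r, 3) for r in ratios])
```

Because radius and perimeter move almost in lockstep in this toy table, the first component absorbs most of the variance, which is the same effect that lets 30 correlated nucleus measurements be plotted in two or three dimensions.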
Class balance between malignant and benign cases and the resulting class imbalance risk
PCA 2D visualization showing cluster separation between malignant and benign tumors
Scree plot showing variance explained by each principal component
Logistic regression coefficients showing feature direction and magnitude
ROC curve with AUC score for model discrimination ability
Confusion matrix showing false negatives and the sensitivity vs. specificity trade-off
Top discriminating features ranked by absolute logistic regression coefficient
Complete model performance metrics including accuracy, sensitivity, specificity, F1, and AUC
Plain-English interpretation — what the numbers mean, what's significant, and what to do next.
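The performance metrics listed above all derive from the confusion matrix and the predicted scores. The following is a minimal sketch of those calculations, assuming malignant (M) is the positive class, a 0.5 decision threshold, and at least one case of each class. The labels and scores are hypothetical toy values, and the function names are illustrative rather than part of the report.

```python
def classification_metrics(y_true, y_pred, positive="M"):
    """Accuracy, sensitivity, specificity, and F1 from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    sensitivity = tp / (tp + fn)   # share of malignant cases caught
    specificity = tn / (tn + fp)   # share of benign cases cleared
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "false_negatives": fn,
    }

def auc(y_true, scores, positive="M"):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and predicted malignancy probabilities.
y_true = ["M", "M", "M", "M", "B", "B", "B", "B", "B", "B"]
scores = [0.95, 0.80, 0.60, 0.30, 0.70, 0.20, 0.10, 0.40, 0.05, 0.15]
y_pred = ["M" if s >= 0.5 else "B" for s in scores]

metrics = classification_metrics(y_true, y_pred)
auc_value = auc(y_true, scores)
print(metrics["accuracy"], metrics["sensitivity"], auc_value)  # 0.8 0.75 0.875
```

In this toy example the one missed malignant case (score 0.30) is the false negative. Lowering the threshold would catch it at the cost of more benign cases flagged, which is the sensitivity vs. specificity trade-off the confusion matrix makes visible.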
Need something simpler? Diabetes Risk Drivers — When you only need to identify which health risk factors drive a clinical outcome, without building a full binary classifier or generating ROC/AUC performance diagnostics
Need more power? Fraud Anomaly — When your classes are heavily imbalanced or partially unlabeled and you need anomaly detection rather than supervised binary classification with labeled training data
Similar: Churn Drivers, Attrition Drivers
Breast Tumor Malignancy Classification
Upload a dataset with 30 cell nucleus measurements and a diagnosis label. The module builds a PCA-reduced feature space, fits a logistic regression classifier, and outputs accuracy, AUC, sensitivity, specificity, feature importance, and a full confusion matrix.
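To make the classification step more concrete, here is a minimal sketch of logistic regression on scaled features, with features then ranked by absolute coefficient. It is an illustration under stated assumptions rather than the module's actual R code: the eight rows are hypothetical toy values, the column names are placeholders, and plain gradient descent with a small L2 penalty stands in for whatever fitting routine the module uses.

```python
import math

def standardize(rows):
    """Z-score every column so coefficients are comparable across features."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(x, weights, bias):
    """Predicted probability of the positive (malignant) class for one row."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))

def fit_logistic(X, y, lr=0.1, epochs=2000, l2=0.01):
    """Batch gradient descent on the log-loss with a small L2 penalty."""
    n, p = len(X), len(X[0])
    weights, bias = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(xi, weights, bias) - yi
            for j in range(p):
                grad_w[j] += err * xi[j]
            grad_b += err
        weights = [w - lr * (g / n + l2 * w) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

# Hypothetical toy data: 1 = malignant, 0 = benign (not real patient data).
feature_names = ["radius_mean", "texture_mean", "concavity_mean"]
raw = [
    [18.0, 20.0, 0.20], [20.5, 18.0, 0.25], [17.0, 22.0, 0.15], [19.0, 19.0, 0.30],
    [12.0, 19.0, 0.05], [13.5, 21.0, 0.03], [11.0, 18.5, 0.06], [12.5, 20.5, 0.04],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

X = standardize(raw)
weights, bias = fit_logistic(X, y)
predictions = [predict_proba(x, weights, bias) >= 0.5 for x in X]

# Rank features by absolute coefficient; the sign gives the direction of effect.
ranked = sorted(zip(feature_names, weights), key=lambda t: abs(t[1]), reverse=True)
for name, coef in ranked:
    print(f"{name}: {coef:+.3f}")
```

Scaling matters here: without it, a feature measured in large units would receive a small coefficient regardless of how well it discriminates, and the ranking would reflect units rather than importance.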
See our FAQ for details on pricing, data privacy, and how the analysis works. Every report includes a Methodology section showing the statistical test, assumptions checked, and diagnostics run.
Run any analysis on your own data — validated R analyses, interactive reports, AI insights, and PDF export.
Try Free — No Credit Card