A 12-Fold ED Risk Gap Hidden in Your DNA

Deep Learning on Genome-Wide Association Studies to Predict the Patient-Specific Risk of Radiation-Induced Erectile Dysfunction

A deep learning model trained on 810 genetic variants predicted post-radiation ED with an AUC of 0.75 and an 11.8-fold difference in odds between the highest- and lowest-risk thirds of patients.

Journal: Radiotherapy and Oncology | Published: 2026-04-11 | Type: Journal Article | PMID: 41967608
Authors: Oh Jung Hun, Auer Paul, Hall William, Rosenstein Barry S, Deasy Joseph O, Kerns Sarah (Memorial Sloan Kettering Cancer Center; Medical College of Wisconsin; Icahn School of Medicine at Mount Sinai)
Funding/COI: NCI/NIH. No competing financial interests declared.

Summary

Radiation-induced erectile dysfunction (RIED) affects a substantial portion of prostate cancer patients after radiotherapy, but there has been little reliable basis for predicting which patients will develop it. This study trained a biologically informed deep learning model, BioDeepGWAS, on 810 single-nucleotide polymorphisms (SNPs) plus two clinical variables (age and androgen deprivation therapy (ADT) use) drawn from 387 prostate cancer patients with no pre-existing ED. On the held-out test set, the model achieved an AUC of 0.75 and separated the highest-risk third from the lowest-risk third with an odds ratio of 11.8. That gap is clinically meaningful on paper, but a single-cohort deep learning result without external validation warrants skepticism.

Claims

AUC of 0.75 on the held-out test set (~77 patients, 20% of 387 evaluable participants)
Odds ratio of 11.8 (p=0.0002) between the top and bottom tertiles of predicted risk
Good calibration across six predicted risk bins (calibration p=0.9531, i.e., no statistically detectable divergence from observed rates)
810 lead SNPs selected at a univariate association threshold of p<0.001, combined with age and ADT use
Post-hoc pathway analysis flagged neurophysiological processes, gonadotropin regulation, and blood vessel morphogenesis as biologically relevant drivers
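
For readers less familiar with the two headline metrics, here is a minimal sketch of how an AUC and a top-versus-bottom-tertile odds ratio can be computed from predicted risks. The function names and the 12-patient toy data are made up for illustration and are not from the paper, and the authors may well have computed their odds ratio differently (for example, with logistic regression).

```python
from itertools import product


def auc(scores, labels):
    """Chance that a random case outscores a random control (ties count as 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c, k in product(cases, controls)
    )
    return wins / (len(cases) * len(controls))


def tertile_odds_ratio(scores, labels):
    """Odds of the outcome in the top third of predicted risk over odds in the bottom third."""
    ranked = sorted(zip(scores, labels))
    third = len(ranked) // 3
    bottom, top = ranked[:third], ranked[-third:]

    def odds(group):
        events = sum(y for _, y in group)
        non_events = len(group) - events
        # Assumption: a 0.5 (Haldane-Anscombe) correction is always added so the
        # ratio stays finite when a cell is empty; this shrinks the estimate slightly.
        return (events + 0.5) / (non_events + 0.5)

    return odds(top) / odds(bottom)


# Hypothetical toy data: predicted risk and observed outcome (1 = RIED, 0 = no RIED).
toy_scores = [0.10, 0.15, 0.20, 0.30, 0.35, 0.45, 0.50, 0.60, 0.70, 0.80, 0.85, 0.90]
toy_labels = [0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1]

print(f"AUC: {auc(toy_scores, toy_labels):.3f}")
print(f"Tertile odds ratio: {tertile_odds_ratio(toy_scores, toy_labels):.2f}")
```

Note that the odds ratio compares only the two extreme thirds and ignores the middle third entirely, so it tends to look more dramatic than the AUC computed on the same predictions.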

Study Quality

The GenePARE cohort provided germline DNA from 668 prostate cancer patients, of whom 387 had no pre-existing ED and were evaluable, yielding 221 RIED cases and 166 controls. The 70/10/20 train/validation/test split is standard, but it leaves roughly 270 training samples for a model with 810 genetic features, or about three features per training patient. The authors report calibration statistics alongside discrimination metrics, which is better practice than AUC alone. However, the SNP selection threshold of p<0.001 is far more permissive than the conventional GWAS significance threshold of p<5×10⁻⁸, meaning many of the 810 input features likely represent noise rather than true signal. There is no external validation cohort, so generalizability to other patient populations is unknown.
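
The arithmetic behind these concerns can be checked in a few lines. This is a back-of-the-envelope sketch: the cohort and feature counts come from the study, but the exact per-split sizes are an assumption (simple rounding of 70/10/20), and the "per million tests" figure is a generic rate, not the number of SNPs the authors actually screened.

```python
n_evaluable = 387   # evaluable patients (from the study)
n_snps = 810        # selected SNPs (from the study)
n_clinical = 2      # age and ADT use (from the study)

# Assumption: split sizes follow from rounding 70/10/20 of the evaluable cohort.
splits = {"train": 0.70, "validation": 0.10, "test": 0.20}
sizes = {name: round(n_evaluable * frac) for name, frac in splits.items()}

features_per_training_patient = (n_snps + n_clinical) / sizes["train"]

# How much looser is p < 0.001 than the genome-wide threshold of p < 5e-8?
selection_threshold = 1e-3
gwas_threshold = 5e-8
threshold_ratio = selection_threshold / gwas_threshold

# Expected chance hits per one million independent SNPs with no true effect.
null_hits_selection = 1_000_000 * selection_threshold
null_hits_gwas = 1_000_000 * gwas_threshold

print(sizes)
print(f"Features per training patient: {features_per_training_patient:.1f}")
print(f"Selection threshold is {threshold_ratio:,.0f}x looser than genome-wide significance")
print(f"Expected null hits per million tests: {null_hits_selection:.0f} vs {null_hits_gwas:.2f}")
```

Under this rounding the training set comes to 271 patients, consistent with the "roughly 270" above. The threshold comparison is the more striking number: at p<0.001, every million null SNPs screened would be expected to contribute about a thousand chance hits.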

Red Flags

A cohort of 387 evaluable patients with ~77 in the test set is small for deep learning; metrics estimated on so few patients can shift substantially with sampling variation
SNP selection at p<0.001 (not genome-wide significance) inflates the feature space with probable false positives
No external validation — a single-cohort result is insufficient to claim clinical readiness
Fitting 810 predictors to ~270 training samples carries genuine overfitting risk, even with regularization
Post-hoc pathway analysis is exploratory and hypothesis-generating only, not confirmatory
Abstract and excerpts do not describe missing data handling or imputation strategy
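
To put a rough number on the small-test-set concern, the sketch below applies the Hanley-McNeil approximation for the standard error of an AUC. The key assumption, which is not stated in the source, is that the ~77-patient test set mirrors the cohort's 221:166 case/control ratio; the function name is also illustrative. Treat the result as an order-of-magnitude estimate, not the paper's reported interval.

```python
import math


def hanley_mcneil_se(auc, n_cases, n_controls):
    """Approximate standard error of an AUC (Hanley and McNeil, 1982)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc**2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_cases - 1) * (q1 - auc**2)
        + (n_controls - 1) * (q2 - auc**2)
    ) / (n_cases * n_controls)
    return math.sqrt(variance)


reported_auc = 0.75   # from the study
n_test = 77           # approximate test-set size (20% of 387)

# ASSUMPTION: test-set cases and controls follow the cohort's 221:166 ratio.
n_cases = round(n_test * 221 / 387)
n_controls = n_test - n_cases

se = hanley_mcneil_se(reported_auc, n_cases, n_controls)
low, high = reported_auc - 1.96 * se, reported_auc + 1.96 * se

print(f"Assumed test set: {n_cases} cases, {n_controls} controls")
print(f"Approximate SE: {se:.3f}")
print(f"Approximate 95% interval: {low:.2f} to {high:.2f}")
```

Under these assumptions the approximate 95% interval runs from about 0.64 to 0.86, so the same test set is compatible with anything from modest to strong discrimination.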

Strengths

Germline DNA from actual blood samples (not imputed from GWAS summary statistics)
BioDeepGWAS incorporates known biological pathway structure rather than treating all SNPs as equivalent
Formal calibration testing reported — not just AUC
NCI-funded; no declared conflicts of interest
Only patients without pre-existing ED included, which sharpens the outcome definition
Reasonably balanced case/control ratio (221:166)

Verdict

An AUC of 0.75 and an odds ratio of 11.8 between risk groups are numbers worth noticing, but this model has not been tested outside the cohort it was developed in, and the feature-selection threshold is loose enough that a meaningful fraction of those 810 SNPs may be noise. The pathway findings (vascular, neurological, hormonal) are biologically coherent and plausible, but coherence isn't validation. The real test is whether BioDeepGWAS holds up in an independent European or diverse-ancestry cohort. Until that replication exists, this is a promising proof-of-concept from a well-credentialed group, not a clinical tool.