INTRODUCTION
Chronic kidney disease (CKD) represents a significant global health challenge [1]. The 2016 United States Renal Data System annual report underscores the rising incidence of treated end-stage renal disease, with an annual increase of 2–4% in nearly one-third of countries between 2003 and 2016 [2]. Similarly, South Korea experienced a 2.7-fold increase in CKD patients between 2006 and 2015 [3]. Considering the pivotal role of the kidneys in waste filtration, fluid and electrolyte balance, and blood pressure regulation, regular screening of renal function is essential in the general population and particularly in high-risk groups, including those with diabetes mellitus and hypertension [4,5].
Glomerular filtration rate (GFR), a key indicator of renal function, measures the ability of the kidneys to filter waste products from the blood. It is an essential measure for assessing kidney health, staging CKD, predicting patient outcomes, and guiding medication dosages in acute or chronic renal failure [6]. As direct GFR measurement is challenging, estimated GFR (eGFR) offers a practical alternative, evaluating the efficiency of kidney filtration using endogenous (originating within the body) or exogenous (introduced externally) markers. The most common clinical formulas for calculating eGFR are the Modification of Diet in Renal Disease (MDRD) and Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) formulas [6], which incorporate factors such as serum levels of creatinine, age, sex, and race.
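As a minimal sketch of how the two formulas combine these factors, the Python functions below implement commonly published forms of each equation. The specific versions are assumptions, since the text does not state them: the four-variable, IDMS-traceable MDRD equation (coefficient 175) and the 2009 CKD-EPI creatinine equation. Function and parameter names are illustrative, serum creatinine is assumed to be in mg/dL, and the result is in mL/min/1.73 m².

```python
def egfr_mdrd(scr_mg_dl: float, age: float, female: bool, black: bool = False) -> float:
    """Four-variable MDRD estimate (assumed IDMS-traceable form, coefficient 175)."""
    egfr = 175.0 * scr_mg_dl ** -1.154 * age ** -0.203
    if female:
        egfr *= 0.742  # fixed sex multiplier
    if black:
        egfr *= 1.212  # race multiplier
    return egfr


def egfr_ckd_epi_2009(scr_mg_dl: float, age: float, female: bool, black: bool = False) -> float:
    """CKD-EPI creatinine estimate (assumed 2009 form, which includes a race term)."""
    kappa = 0.7 if female else 0.9        # sex-specific creatinine knot
    alpha = -0.329 if female else -0.411  # sex-specific exponent below the knot
    ratio = scr_mg_dl / kappa
    egfr = 141.0 * min(ratio, 1.0) ** alpha * max(ratio, 1.0) ** -1.209 * 0.993 ** age
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159
    return egfr


if __name__ == "__main__":
    # Hypothetical patient: 50-year-old man, serum creatinine 1.0 mg/dL.
    print(round(egfr_mdrd(1.0, 50, female=False), 1))
    print(round(egfr_ckd_epi_2009(1.0, 50, female=False), 1))
```

In this form, MDRD applies sex as a single fixed multiplier, whereas CKD-EPI uses sex-specific knots and exponents, so the two formulas can assign different eGFR values to the same patient.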
The dense vascular network in the kidneys is essential for filtration within the glomeruli. Notably, the kidneys and eyes share structural, developmental, and functional parallels, including their role in the renin-angiotensin-aldosterone system. Moreover, both organs are vulnerable to inflammation, oxidative stress, endothelial dysfunction, and microangiopathy [7]. Renal microvascular pathology significantly contributes to the development of renal insufficiency. The unique accessibility of the retina for direct, noninvasive visualization of its microvasculature makes the use of fundus photographs a promising approach for screening kidney diseases [8,9].
Deep neural networks (DNNs) applied to imaging data have demonstrated superior diagnostic performance, surpassing traditional image interpretation under various conditions. Previous studies have demonstrated the ability of DNNs to predict systemic parameters such as age [10–13] and hemoglobin levels, and ocular parameters such as visual acuity [14] and intraocular pressure [15], using fundus photographs. Notably, DNNs have identified CKD from retinal images and have predicted related details such as blood levels of creatinine and CKD classification [16–19].
Considering the lack of research directly comparing the MDRD and CKD-EPI formulas in this setting, we conducted a detailed comparison to assess their influence on DNN performance. This approach helps establish the potential of fundus photographs as an effective screening tool for CKD.
DISCUSSION
Our study demonstrated the ability of fundus photographs to predict eGFR. Furthermore, our results emphasize the superior performance of the CKD-EPI formula over the MDRD formula. This performance difference may be attributed to sex-related information acting as a confounding factor in the analysis of fundus photographs.
Previous studies have demonstrated a significant association between nephropathy and retinopathy. This correlation is attributed to the shared dense capillary network in the retina and kidneys. Pathological conditions affecting these capillaries could damage both organs. This association is observed in systemic conditions, such as diabetes mellitus and hypertension, and in patients without systemic diseases [24]. For instance, 45% of CKD patients exhibit retinal abnormalities detectable by ophthalmologists [24]. These abnormalities include vascular pathologies, such as diabetic retinopathy and hypertensive retinopathy, and other conditions, such as glaucoma and macular degeneration [24,25]. Notably, the risk of retinopathy increases 3-fold at eGFR < 30 mL/min/1.73 m² [24].
DNNs excel at recognizing complex patterns, potentially uncovering information in fundus photographs previously undetectable to humans. Consequently, multiple studies have used DNNs and fundus photographs to predict renal function. These studies have demonstrated a remarkable ability to predict CKD, with areas under the receiver operating characteristic curve ranging from 0.81 to 0.93 [16,19,26]. Furthermore, accuracy improves when information on systemic diseases such as diabetes and hypertension is incorporated [16,17,19,26]. Several studies have indicated the potential to predict serum levels of creatinine using fundus photographs [17,18]. Therefore, we designed this study to predict renal function accurately using such photographs.
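To make the reported metric concrete, the sketch below computes the area under the receiver operating characteristic curve as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as half. This is a generic illustration, not the evaluation code of the cited studies; the function name, the labels (1 for CKD, 0 for no CKD), and the toy scores are hypothetical.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical model scores for four eyes; 1 = CKD, 0 = no CKD.
    print(roc_auc([0, 0, 1, 1], [0.10, 0.40, 0.35, 0.80]))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives context for the reported range of 0.81 to 0.93.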
In clinical settings, the eGFR is calculated using the CKD-EPI and MDRD formulas. Although both formulas consider creatinine levels, age, and sex, they differ in their coefficients [6]. Studies that have used DNNs and fundus photographs for eGFR prediction have used either the MDRD or the CKD-EPI formula [16,19]. We calculated eGFR using both the CKD-EPI and MDRD formulas to evaluate the variation in their influence on the learning and prediction capabilities of DNNs.
This study showed superior performance of the CKD-EPI formula compared to the MDRD formula. The DNN trained using eGFR_CKD-EPI exhibited a higher area under the curve (AUC) and more interpretable class activation map (CAM) patterns. Furthermore, scatter plots revealed a coefficient closer to 1, indicating a well-clustered distribution around the central line. Conversely, the DNN trained using eGFR_MDRD demonstrated lower accuracy and uninterpretable CAM emphasis patterns, indicating inadequate model training. Our findings are consistent with previous studies: those that predicted CKD using the MDRD formula reported an AUC of 0.81 [26], while those that used the CKD-EPI formula reported higher AUCs ranging from 0.85 to 0.93 [16,19].
The MDRD and CKD-EPI formulas use creatinine levels, age, and sex to calculate eGFR. As we used identical fundus photographs, hyperparameters, and model structures, the input data for each model were ultimately the same. The differently distributed results from the two models may therefore be attributed to complex interactions between the inputs.
We considered sex a confounding variable because the histograms of the MDRD and CKD-EPI formulas differed in how they separated by sex (Fig. 2A). Although eGFR_CKD-EPI exhibited nearly identical distributions for both sexes, eGFR_MDRD exhibited a distribution that differed by sex. This indicates an interaction between sex and eGFR_MDRD. Females exhibited a lower kurtosis and a higher mean eGFR_MDRD, inducing covariance.
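The kind of per-sex comparison described above can be sketched as follows. This is an illustrative summary of mean and excess kurtosis by group, not the analysis code used in the study; the function names are hypothetical and the values in the example are toy numbers, not study data.

```python
from statistics import fmean
from typing import Dict, Sequence


def excess_kurtosis(values: Sequence[float]) -> float:
    """Population excess kurtosis (0 for a normal distribution)."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3.0  # undefined (division by zero) if all values are equal


def summarize_by_sex(egfr_values: Sequence[float], sexes: Sequence[str]) -> Dict[str, dict]:
    """Return sample size, mean, and excess kurtosis of eGFR for each sex label."""
    summary = {}
    for sex in sorted(set(sexes)):
        group = [v for v, s in zip(egfr_values, sexes) if s == sex]
        summary[sex] = {
            "n": len(group),
            "mean": fmean(group),
            "excess_kurtosis": excess_kurtosis(group),
        }
    return summary


if __name__ == "__main__":
    # Toy eGFR values (mL/min/1.73 m²), invented for illustration only.
    egfr = [95.0, 102.0, 88.0, 110.0, 97.0, 84.0, 86.0, 85.0, 90.0, 83.0]
    sex = ["F"] * 5 + ["M"] * 5
    for label, stats in summarize_by_sex(egfr, sex).items():
        print(label, stats)
```

Running the same summary on eGFR computed with each formula would show whether one target separates the sexes more strongly than the other.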
Fundus photographs include age and sex information (Fig. 3). Previous studies have demonstrated high accuracy in predicting age and sex from such photographs [10–13]. Consequently, age and sex could act as confounding factors when predicting target variables. This effect was particularly evident in our creatinine level prediction model, where the levels significantly differed by sex (Fig. 5A). As the DNN could easily distinguish sex using fundus photographs, it may have prioritized sex identification over interpreting subtle features related to creatinine levels, because doing so already reduced the loss. This resulted in scatter plots of creatinine levels segregated by sex and suboptimal CAM results.
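The shortcut described above can be illustrated with a toy calculation: when the target differs by sex, a predictor that only identifies sex and outputs the sex-specific mean already achieves a lower mean squared error than one that outputs the overall mean. The sketch below is an assumption-laden illustration, not the study's model; the function names are hypothetical and the creatinine values are invented.

```python
from statistics import fmean
from typing import List, Sequence


def mse(targets: Sequence[float], predictions: Sequence[float]) -> float:
    """Mean squared error between targets and predictions."""
    return fmean((t - p) ** 2 for t, p in zip(targets, predictions))


def global_mean_predictions(targets: Sequence[float]) -> List[float]:
    """Baseline that ignores all inputs and predicts the overall mean."""
    overall = fmean(targets)
    return [overall] * len(targets)


def sex_mean_predictions(targets: Sequence[float], sexes: Sequence[str]) -> List[float]:
    """Shortcut predictor: uses only sex and outputs that group's mean."""
    group_means = {
        sex: fmean(t for t, s in zip(targets, sexes) if s == sex)
        for sex in set(sexes)
    }
    return [group_means[s] for s in sexes]


if __name__ == "__main__":
    # Toy serum creatinine values (mg/dL), invented for illustration only.
    creatinine = [1.0, 0.9, 1.1, 1.0, 0.7, 0.8, 0.6, 0.7]
    sex = ["M"] * 4 + ["F"] * 4
    print("global-mean MSE:", mse(creatinine, global_mean_predictions(creatinine)))
    print("sex-only MSE:   ", mse(creatinine, sex_mean_predictions(creatinine, sex)))
```

A model that settles on this shortcut produces predictions clustered around two sex-specific values, which matches the sex-segregated scatter plots described above.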
Furthermore, the prediction of renal function using fundus photographs presents a high degree of complexity. Consider the simpler example of predicting hemoglobin levels using fundus photographs. As with renal function, hemoglobin levels are influenced by age and sex. However, hemoglobin prediction using fundus photographs is highly accurate [27–29]; this is because hemoglobin in capillaries can be directly observed in fundus photographs, requiring minimal inference.
By contrast, predicting renal function involves multiple inferential stages. Notably, vascular pathologies can simultaneously affect the kidneys and retina; therefore, the prediction of renal function involves identifying these pathologies using fundus photographs and subsequently inferring renal function. As each stage introduces potential confounders, accuracy decreases with the increasing complexity of the reasoning process.
Our finding is consistent with previous reports of the superior accuracy of the CKD-EPI formula in eGFR predictions compared to the MDRD formula, particularly in diverse populations, including kidney transplant recipients and those with an eGFR > 60 mL/min/1.73 m² [6,30–33]. Furthermore, the CKD-EPI formula is a better predictor of mortality and end-stage renal failure risk than the MDRD formula [34].
This study had several limitations, including its single-center design and focus on an East Asian population. However, our findings align with those from multi-ethnic studies, suggesting a limited impact of ethnicity. Fundus photographs, rich in vascular and neural structures, reflect systemic health, rendering them potential biomarkers. Our study emphasizes the importance of considering confounders in similar studies. In elderly populations or those with low muscle mass, serum creatinine levels may remain low despite impaired renal function, leading to overestimation of eGFR and reduced accuracy of the predictive model.
The results of this study are consistent with those of previous research based on convolutional neural networks (CNNs) [16,19,26]. We tested several models in addition to EfficientNet but obtained largely consistent results. It appears that CNNs may have limitations in predicting kidney function from fundus photographs. Recently, the superior performance of attention mechanisms used in large language models has received significant interest, and their application in image analysis is increasing. We plan to conduct further research using these new mechanisms alongside CNNs.
Fundus photography has the potential to be used as a biomarker in medical evaluations, assessing vascular and neural functions. It is fast and noninvasive, causing minimal discomfort to patients while allowing direct visualization of blood vessels and nerves. As more clinical data accumulate and automated interpretation methods become more widespread, the barriers to using fundus photography for screening are expected to decrease, making it increasingly accessible for many physicians.
In conclusion, we developed DNN models to predict renal function using fundus photographs; however, these models are susceptible to the influence of sex, a potential confounding factor. Therefore, the CKD-EPI formula, less susceptible to sex bias compared to the MDRD formula, is recommended to obtain more reliable results. Furthermore, careful consideration of such confounders is essential in future DNN studies using fundus photographs. This emphasizes the need for further studies to enhance the accuracy of artificial intelligence technologies in the prompt diagnosis and management of kidney diseases, thereby optimizing patient outcomes.