Causal effect of education on type 2 diabetes: A network Mendelian randomization study
Li-Zhen Liao, Zhi-Chong Chen, Wei-Dong Li, Xiao-Dong Zhuang, Xin-Xue Liao
Li-Zhen Liao, Wei-Dong Li, Department of Health, Guangdong Pharmaceutical University, Guangzhou 510275, Guangdong Province, China
Li-Zhen Liao, Wei-Dong Li, Guangdong Provincial Key Laboratory of Pharmaceutical Bioactive Substances, Guangdong Pharmaceutical University, Guangzhou 510006, Guangdong Province, China
Zhi-Chong Chen, Department of Cardiology, The Sixth Affiliated Hospital of Sun Yat-Sen University, Guangzhou 510080, Guangdong Province, China
Xiao-Dong Zhuang, Xin-Xue Liao, Department of Cardiology, The First Affiliated Hospital of Sun Yat-Sen University, Guangzhou 510080, Guangdong Province, China
Author contributions: Liao LZ and Chen ZC conceived the study and contributed equally to this study; Li WD completed the analyses; Zhuang XD led the writing; Liao XX supervised the study.
Corresponding author: Xin-Xue Liao, PhD, Chief Physician, Department of Cardiology, The First Affiliated Hospital of Sun Yat-Sen University, No. 58 Zhongshan 2nd Road, Yuexiu District, Guangzhou 510080, Guangdong Province, China.
The causality between education and type 2 diabetes (T2DM) remains unclear.


To identify the causality between education and T2DM, and the potential metabolic risk factors involved [coronary heart disease (CHD), total cholesterol, low-density lipoprotein, triglycerides (TG), body mass index (BMI), waist circumference (WC), waist-to-hip ratio (WHR), fasting insulin, fasting glucose, and glycated hemoglobin], from summarized genome-wide association study (GWAS) data using network Mendelian randomization (MR).


Two-sample MR and network MR were performed to obtain the causality between education-T2DM, education-mediator, and mediator-T2DM. Summary statistics from the Social Science Genetic Association Consortium (discovery data) and Neale Lab consortium (replication data) were used for education, and DIAGRAM plus Metabochip data were used for T2DM.


The odds ratio for T2DM was 0.392 (95%CI: 0.263-0.583) per standard deviation increase (3.6 years) in education by the inverse variance weighted method, without heterogeneity or horizontal pleiotropy. Education was genetically associated with CHD, TG, BMI, WC, and WHR in the discovery phase, yet only the results for CHD, BMI, and WC were replicated in the replication data. Moreover, BMI was genetically associated with T2DM.


Short education was found to be associated with an increased T2DM risk. BMI might serve as a potential mediator between them.

Key Words: Mendelian randomization, Education, Type 2 diabetes mellitus, Genome-wide association study, Coronary heart disease, Body mass index

Core Tip: Genetically predicted education was negatively causally associated with type 2 diabetes (T2DM). The odds ratio for T2DM was 0.392 (95%CI: 0.263-0.583) per standard deviation increase (3.6 years) in education. Body mass index might serve as a potential mediator.


Type 2 diabetes mellitus (T2DM), affecting approximately 415 million people, has become a major global health problem. This number is projected to increase to 642 million by 2040[1]. Environmental, genetic, and metabolic risk factors all contribute to T2DM[2]. Controlling the modifiable risk factors among these to reduce T2DM morbidity has become a key public health priority[3]. A previous systematic review and meta-analysis reported that diabetes self-management education can reduce all-cause mortality risk in T2DM patients[4], indicating that education may help T2DM management. However, the causality between primary education (years of schooling) and T2DM remains unclear, and the potential pathways that might link education to T2DM, should a causal association exist, have not yet been studied.

Mendelian randomization (MR) serves as a strategy for assessing the causality between disease and common risk factors[5]. Since the genetically determined risk factors occur before the disease onset, MR can avoid the potential bias of reverse causation in retrospective studies. The risk of confounding is also reduced[5]. In this study, the causality between education and T2DM was analyzed and potential metabolic risk factors were explored from summarized genome-wide association study (GWAS) data. The potential metabolic risk factors included coronary heart disease (CHD), total cholesterol (TC), low-density lipoprotein (LDL), triglycerides (TG), body mass index (BMI), waist circumference (WC), waist-to-hip ratio (WHR), fasting insulin, fasting glucose, and glycated hemoglobin (HbA1c).

GWAS data summary

Summary data from array-based analysis for single nucleotide polymorphisms (SNPs) were included. We selected all SNPs associated with education at genome-wide significance (P < 5 × 10⁻⁸) in the available GWAS (Supplementary Table 1). SNP quality was assessed on the following metrics: strong evidence of between-study heterogeneity in the SNP-trait association (P ≤ 0.005), an imputation quality metric (info or r²) ≤ 0.90, and Hardy-Weinberg disequilibrium (P ≤ 0.001). Summary statistics from the Social Science Genetic Association Consortium (SSGAC)[6] and the Neale Lab consortium were used for education. Because several large sets of summary statistics from different consortia were available for the same trait, we used the earliest and largest summary data (SSGAC) as the discovery data set and the most recent data (Neale Lab) as the replication data set. DIAGRAM plus Metabochip consortium data were used for T2DM[7], CARDIoGRAM plus C4D consortium data were used for CHD[8], GLGC consortium data were used for TC[9], LDL[9], and TG[9], Genetic Investigation of Anthropometric Traits (GIANT) consortium data were used for BMI[10], WC[11], and WHR[12], and Meta-Analyses of Glucose and Insulin-related traits Consortium (MAGIC) data were used for fasting insulin[13], fasting glucose[14], and HbA1c[15]. The datasets are presented in Table 1.
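The instrument selection described above can be sketched as a simple filter. This is a minimal illustration only: it assumes that the three quality metrics act as exclusion flags, and the record fields (`p_exposure`, `p_heterogeneity`, `info`, `p_hwe`) and the SNP records themselves are hypothetical, not taken from the study.

```python
GENOME_WIDE_P = 5e-8  # genome-wide significance threshold stated in the text


def passes_qc(snp):
    """Return True if a SNP record is kept as an instrument.

    Assumption: a SNP is dropped when any quality metric in the text is met
    (heterogeneity P <= 0.005, imputation quality <= 0.90, HWE P <= 0.001).
    """
    return (
        snp["p_exposure"] < GENOME_WIDE_P
        and snp["p_heterogeneity"] > 0.005
        and snp["info"] > 0.90
        and snp["p_hwe"] > 0.001
    )


# Hypothetical records for illustration only.
snps = [
    {"rsid": "snp_a", "p_exposure": 1e-9, "p_heterogeneity": 0.40, "info": 0.98, "p_hwe": 0.30},
    {"rsid": "snp_b", "p_exposure": 2e-10, "p_heterogeneity": 0.001, "info": 0.99, "p_hwe": 0.50},
    {"rsid": "snp_c", "p_exposure": 3e-6, "p_heterogeneity": 0.60, "info": 0.97, "p_hwe": 0.20},
    {"rsid": "snp_d", "p_exposure": 1e-12, "p_heterogeneity": 0.20, "info": 0.85, "p_hwe": 0.40},
]

kept = [s["rsid"] for s in snps if passes_qc(s)]
print(kept)
```

In this toy input, only the first record satisfies every condition; the others fail on heterogeneity, significance, and imputation quality, respectively.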

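Table 1 Details of studies and datasets used for analyses.

| Trait | Cases, n | Controls, n | Sample size | PubMed ID | First author | Consortium | Year | Unit |
|---|---|---|---|---|---|---|---|---|
| Education (discovery data) | NA | NA | 293723 | 27225129 | Okbay | SSGAC | 2016 | SD (yr) |
| Education 2 (replication data) | NA | NA | 226899 | UKB-a:505 | Neale | Neale Lab | 2017 | NA |
| T2DM | 34840 | 114981 | 77312 | 22885922 | Morris | DIAGRAM plus Metabochip | 2012 | Log odds |
| CHD | 60801 | 123504 | 184305 | 26343387 | Nikpay | CARDIoGRAM plus C4D | 2015 | Log odds |
| TG | NA | NA | 108514 | 24097068 | Willer | GLGC | 2013 | SD (mg/dL) |
| LDL | NA | NA | 99073 | 24097068 | Willer | GLGC | 2013 | SD (mg/dL) |
| TC | NA | NA | 108363 | 24097068 | Willer | GLGC | 2013 | SD (mg/dL) |
| BMI | NA | NA | 152893 | 25673413 | Locke | GIANT | 2015 | SD (kg/m²) |
| WC | NA | NA | 232101 | 25673412 | Shungin | GIANT | 2015 | SD (cm) |
| WHR | NA | NA | 60586 | 23754948 | Randall | GIANT | 2013 | SD (ratio) |
| Fasting insulin | NA | NA | 38238 | 20081858 | Dupuis | MAGIC | 2010 | Log pmol/L |
| Fasting glucose | NA | NA | 58074 | 22581228 | Manning | MAGIC | 2012 | mmol/L |
| HbA1c | NA | NA | 46368 | 20858683 | Soranzo | MAGIC | 2010 | % |
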
Two-sample MR and causality evaluation

Our MR study was conducted on the online MR-Base platform[16]. We explored the associations as follows: (1) Causality: The conventional MR approach [inverse variance weighted (IVW) method], the MR Egger method, and the weighted median method were used to assess: (a) Causality between genetically determined education and T2DM; (b) Causality between education and metabolic risk factors (CHD, TG, LDL, TC, BMI, WC, WHR, fasting insulin, fasting glucose, and HbA1c); and (c) Causality between metabolic risk factors and T2DM; (2) Heterogeneity: We conducted heterogeneity tests in MR analyses using IVW and MR Egger; (3) Horizontal pleiotropy: The MR Egger intercept was assessed; (4) Leave-one-out analysis; and (5) Funnel plots.
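For readers unfamiliar with these estimators, the sketch below shows the textbook fixed-effect IVW estimate, Cochran's Q for heterogeneity, and the MR Egger regression (a weighted regression with an intercept) computed from per-SNP summary statistics. It is not the MR-Base implementation used in this study; the function names and the input values are hypothetical, and the SNP-exposure effects are assumed to be already oriented to be positive.

```python
import math


def ivw(bx, by, se_y):
    """Fixed-effect IVW estimate, its standard error, and Cochran's Q.

    bx: SNP-exposure effects, by: SNP-outcome effects, se_y: SE of by.
    """
    weights = [(x / s) ** 2 for x, s in zip(bx, se_y)]  # inverse variance of each Wald ratio
    ratios = [y / x for x, y in zip(bx, by)]            # per-SNP Wald ratios
    total = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total
    se = math.sqrt(1.0 / total)
    q = sum(w * (r - beta) ** 2 for w, r in zip(weights, ratios))  # df = n - 1
    return beta, se, q


def mr_egger(bx, by, se_y):
    """Weighted least squares of by on bx with an intercept (weights 1/se_y^2).

    A non-zero intercept suggests directional horizontal pleiotropy.
    """
    weights = [1.0 / s ** 2 for s in se_y]
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, bx)) / total
    mean_y = sum(w * y for w, y in zip(weights, by)) / total
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, bx))
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, bx, by))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical summary statistics for four SNPs (illustration only).
bx = [0.020, 0.030, 0.025, 0.040]
by = [-0.018, -0.030, -0.022, -0.037]
se_y = [0.010, 0.012, 0.011, 0.015]

beta, se, q = ivw(bx, by, se_y)
intercept, slope = mr_egger(bx, by, se_y)
print(f"IVW beta={beta:.3f}, SE={se:.3f}, Q={q:.3f}")
print(f"MR Egger intercept={intercept:.4f}, slope={slope:.3f}")
```

For a binary outcome such as T2DM, exponentiating the IVW beta gives the odds ratio per unit (here, per standard deviation) of the exposure.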

Network MR for “education-mediator-T2DM” analyses

Two-sample MR and network MR were performed to obtain the causality between education-T2DM, education-mediator, and mediator-T2DM[18]. A network MR analysis consisted of three two-sample MR tests: (1) The causality between education and T2DM; (2) The causality between education and the potential mediators; and (3) The causality between the potential mediators and T2DM.

We could conclude that the specific metabolic risk factor might serve as a mediator between education and T2DM if the causality was estimated in all three steps.

Statistical analysis

All statistical tests were two-sided, and MR results were considered statistically significant at P < 0.05. All calculations were conducted using Stata (College Station, TX, United States) and R.

Causality between genetically determined education and T2DM

Characteristics of the SNPs are shown in Supplementary Table 1. We used the SSGAC consortium data for education to explore the causal association between education and T2DM. With the IVW method, the odds ratio [OR (95%CI)] for T2DM was 0.392 (0.263-0.583) per standard deviation increase (3.6 years) in education (Table 2 and Figure 1A). Results were consistent with the weighted median method (OR: 0.406, 95%CI: 0.246-0.672; P < 0.001) (Table 2). Both IVW and MR Egger estimates indicated no heterogeneity among the 17 SNPs (P = 0.163 and P = 0.124, respectively) (Table 2). There was no directional horizontal pleiotropy (MR Egger intercept P = 0.979) (Table 2). In a leave-one-out analysis, no single instrument was strongly driving the overall effect of education on T2DM (Figure 1C). In addition, there was no funnel plot asymmetry (Figure 1D). Both the leave-one-out analysis and the funnel plot further suggested that no SNPs exhibited horizontal pleiotropy.
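As a reading aid, the reported IVW odds ratio and confidence interval can be converted back to the log-odds scale. The sketch below recovers the effect estimate, an approximate standard error, and a two-sided P value under a normal approximation. The function name is hypothetical, and the per-year odds ratio additionally assumes that the effect is log-linear across the 3.6-year standard deviation, which the study does not test.

```python
import math


def summarize_or(odds_ratio, ci_low, ci_high, sd_years=3.6):
    """Back-calculate log-odds statistics from an OR and its 95% CI."""
    beta = math.log(odds_ratio)                                    # log odds per SD of education
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)       # CI width on the log scale
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))                           # two-sided normal P value
    or_per_year = math.exp(beta / sd_years)                        # assumes log-linearity
    return {"beta": beta, "se": se, "z": z, "p": p, "or_per_year": or_per_year}


# IVW result reported in Table 2: OR 0.392 (95%CI: 0.263-0.583).
result = summarize_or(0.392, 0.263, 0.583)
for name, value in result.items():
    print(f"{name}: {value:.4g}")
```

The recovered P value falls well below 0.001, in line with the value reported as 0.000 in Table 2.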

Figure 1 Mendelian randomization study of the effect of education on type 2 diabetes mellitus. A: Mendelian randomization estimate for education on type 2 diabetes mellitus (T2DM); B: Effect sizes of the single-nucleotide polymorphism (SNP)-education associations [x-axis, SD (3.6 years) units] and the SNP-T2DM associations (y-axis, log odds of T2DM); C: Leave-one-out sensitivity analysis; D: Funnel plot of the causality between education and T2DM. MR: Mendelian randomization; OR: Odds ratio.
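Table 2 Causal associations between genetically determined education and type 2 diabetes mellitus.

| Exposure-outcome | Method | SNPs, n | OR | 95%CI | P value | Heterogeneity P | MR Egger intercept P |
|---|---|---|---|---|---|---|---|
| Education-T2DM | MR Egger | 17 | 0.381 | 0.047-3.093 | 0.381 | 0.124 | 0.979 |
| | Weighted median | 17 | 0.406 | 0.246-0.672 | 0.000 | | |
| | Inverse variance weighted | 17 | 0.392 | 0.263-0.583 | 0.000 | 0.163 | |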

In summary, genetically predicted education was negatively causally associated with T2DM.

Causality between education and metabolic risk factors

The causality between education and metabolic risk factors, including CHD, TG, LDL, TC, BMI, WC, WHR, fasting insulin, fasting glucose, and HbA1c, is shown in Table 3. Education was causally associated with CHD, TG, BMI, WC, and WHR in the discovery phase. In the replication data, however, only the results for CHD, BMI, and WC were replicated.

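Table 3 Causal association between genetically determined education and metabolic risk factors.

Discovery (education-metabolic risk factors)

| Outcome | Method | SNPs, n | Beta | SE | P value | Heterogeneity P | MR Egger intercept P |
|---|---|---|---|---|---|---|---|
| CHD | MR Egger | 71 | 0.145 | 0.417 | 0.729 | 0.123 | 0.263 |
| | Weighted median | 71 | -0.238 | 0.107 | 0.026 | | |
| | Inverse variance weighted | 71 | -0.318 | 0.079 | 0.000 | 0.117 | |
| TG | MR Egger | 57 | -0.183 | 0.263 | 0.489 | 0.112 | 0.937 |
| | Weighted median | 57 | -0.192 | 0.057 | 0.001 | | |
| | Inverse variance weighted | 57 | -0.204 | 0.043 | 0.000 | 0.130 | |
| LDL | MR Egger | 57 | -0.274 | 0.275 | 0.324 | 0.316 | 0.415 |
| | Weighted median | 57 | 0.007 | 0.063 | 0.909 | | |
| | Inverse variance weighted | 57 | -0.051 | 0.045 | 0.261 | 0.326 | |
| TC | MR Egger | 57 | -0.087 | 0.277 | 0.755 | 0.197 | 0.838 |
| | Weighted median | 57 | -0.018 | 0.064 | 0.775 | | |
| | Inverse variance weighted | 57 | -0.031 | 0.045 | 0.500 | 0.223 | |
| BMI | MR Egger | 59 | 0.013 | 0.353 | 0.970 | 0.000 | 0.415 |
| | Weighted median | 59 | -0.209 | 0.048 | 0.000 | | |
| | Inverse variance weighted | 59 | -0.272 | 0.058 | 0.000 | 0.000 | |
| WC | MR Egger | 59 | -0.002 | 0.404 | 0.996 | 0.000 | 0.429 |
| | Weighted median | 59 | -0.269 | 0.056 | 0.000 | | |
| | Inverse variance weighted | 59 | -0.319 | 0.066 | 0.000 | 0.000 | |
| WHR | MR Egger | 58 | -0.743 | 0.461 | 0.113 | 0.028 | 0.346 |
| | Weighted median | 58 | -0.298 | 0.097 | 0.002 | | |
| | Inverse variance weighted | 58 | -0.311 | 0.076 | 0.000 | 0.027 | |
| Fasting insulin | MR Egger | 58 | -0.269 | 0.215 | 0.215 | 0.010 | 0.312 |
| | Weighted median | 58 | -0.039 | 0.048 | 0.419 | | |
| | Inverse variance weighted | 58 | -0.053 | 0.035 | 0.129 | 0.014 | |
| Fasting glucose | MR Egger | 58 | -0.185 | 0.172 | 0.287 | 0.167 | 0.398 |
| | Weighted median | 58 | -0.044 | 0.039 | 0.257 | | |
| | Inverse variance weighted | 58 | -0.041 | 0.028 | 0.152 | 0.172 | |
| HbA1c | MR Egger | 58 | 0.004077 | 0.1779 | 0.9818 | 0.5674 | 0.889 |
| | Weighted median | 58 | -0.009323 | 0.04094 | 0.8199 | | |
| | Inverse variance weighted | 58 | -0.02056 | 0.02908 | 0.4796 | 0.6038 | |

Replication (education-metabolic risk factors)

| Outcome | Method | SNPs, n | Beta | SE | P value | Heterogeneity P | MR Egger intercept P |
|---|---|---|---|---|---|---|---|
| CHD | MR Egger | 20 | -0.985 | 0.772 | 0.218 | 0.196 | 0.537 |
| | Weighted median | 20 | -0.474 | 0.209 | 0.024 | | |
| | Inverse variance weighted | 20 | -0.511 | 0.164 | 0.002 | 0.222 | |
| TG | MR Egger | 13 | 0.099 | 1.076 | 0.928 | 0.000 | 0.702 |
| | Weighted median | 13 | -0.161 | 0.134 | 0.229 | | |
| | Inverse variance weighted | 13 | -0.319 | 0.163 | 0.051 | 0.000 | |
| BMI | MR Egger | 14 | 0.830 | 0.582 | 0.179 | 0.079 | 0.071 |
| | Weighted median | 14 | -0.381 | 0.106 | 0.000 | | |
| | Inverse variance weighted | 14 | -0.308 | 0.095 | 0.001 | 0.018 | |
| WC | MR Egger | 14 | 0.669 | 0.637 | 0.314 | 0.179 | 0.169 |
| | Weighted median | 14 | -0.246 | 0.113 | 0.030 | | |
| | Inverse variance weighted | 14 | -0.252 | 0.096 | 0.009 | 0.118 | |
| WHR | MR Egger | 15 | 0.235 | 1.375 | 0.867 | 0.010 | 0.661 |
| | Weighted median | 15 | -0.511 | 0.222 | 0.021 | | |
| | Inverse variance weighted | 15 | -0.375 | 0.213 | 0.079 | 0.014 | |
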
Causality between metabolic risk factors and T2DM

Based on the above results, CHD, BMI, and WC might serve as potential mediators between education and T2DM. Thus, we further evaluated whether these three potential mediators were associated with T2DM using MR analyses. Only BMI (but not CHD or WC) was positively associated with T2DM (Table 4). Hence, BMI might serve as a potential mediator between education and T2DM.

Table 4 Causal association between genetically determined metabolic risk factors and type 2 diabetes mellitus.

| Exposure-outcome | Method | SNPs, n | OR | 95%CI | P value | Heterogeneity P | MR Egger intercept P |
|---|---|---|---|---|---|---|---|
| CHD-T2DM | MR Egger | 28 | 0.947 | 0.697-1.288 | 0.733 | 0.000 | 0.815 |
| | Weighted median | 28 | 1.111 | 0.996-1.239 | 0.059 | | |
| | Inverse variance weighted | 28 | 1.073 | 0.954-1.207 | 0.243 | 0.000 | |
| BMI-T2DM | MR Egger | 72 | 3.370 | 1.328-8.556 | 0.013 | 0.000 | 0.240 |
| | Weighted median | 72 | 2.622 | 2.164-3.178 | 0.000 | | |
| | Inverse variance weighted | 72 | 2.046 | 1.374-3.048 | 0.000 | 0.000 | |
| WC-T2DM | MR Egger | 39 | 11.670 | 1.590-85.654 | 0.021 | 0.000 | 0.049 |
| | Weighted median | 39 | 2.371 | 1.788-3.145 | 0.000 | | |
| | Inverse variance weighted | 39 | 1.607 | 0.879-2.938 | 0.123 | 0.000 | |
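
The three-step network MR screening can be traced with the IVW P values reported in Tables 2 to 4. The sketch below is only an illustration of the decision logic: it assumes that IVW P < 0.05 is the criterion at each step and that an education-mediator association must hold in both the discovery and replication data. P values printed as 0.000 in the tables are entered as 0.0, and the function and variable names are hypothetical.

```python
ALPHA = 0.05  # significance threshold stated in the statistical analysis section

# Step 1: education -> T2DM (IVW P from Table 2, reported as 0.000).
p_education_t2dm = 0.0

# Step 2: education -> mediator, IVW P as (discovery, replication) from Table 3.
# None means the trait was not carried into the replication analysis.
p_education_mediator = {
    "CHD": (0.0, 0.002),
    "TG": (0.0, 0.051),
    "LDL": (0.261, None),
    "TC": (0.500, None),
    "BMI": (0.0, 0.001),
    "WC": (0.0, 0.009),
    "WHR": (0.0, 0.079),
    "Fasting insulin": (0.129, None),
    "Fasting glucose": (0.152, None),
    "HbA1c": (0.4796, None),
}

# Step 3: mediator -> T2DM, IVW P from Table 4 (only three traits were tested).
p_mediator_t2dm = {"CHD": 0.243, "BMI": 0.0, "WC": 0.123}


def replicated_candidates(step2):
    """Traits associated with education in both discovery and replication data."""
    return [
        trait
        for trait, (p_disc, p_rep) in step2.items()
        if p_disc < ALPHA and p_rep is not None and p_rep < ALPHA
    ]


def potential_mediators(p_step1, step2, step3):
    """Traits for which all three network MR steps are significant."""
    if p_step1 >= ALPHA:
        return []
    return [
        trait
        for trait in replicated_candidates(step2)
        if trait in step3 and step3[trait] < ALPHA
    ]


candidates = replicated_candidates(p_education_mediator)
mediators = potential_mediators(p_education_t2dm, p_education_mediator, p_mediator_t2dm)
print("Replicated education-mediator associations:", candidates)
print("Potential mediators:", mediators)
```

Under these assumptions, CHD, BMI, and WC pass the second step and only BMI passes all three, which matches the conclusion drawn in the text.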

Diabetes has rapidly become a global epidemic and a significant public health concern. Identifying the high-risk populations and addressing the risk factors for diabetes might prove to be an effective diabetes prevention strategy. Traditional risk factors for T2DM include obesity, diet, physical activity, socioeconomic status, etc.[19], some of which are modifiable through behavioral or pharmacological intervention (e.g., obesity, diet, and physical activity). It is of great importance to determine whether traditional risk factors have a causal role in T2DM or are merely bystanders.

Previous studies have reported that education might help T2DM management. A prospective, randomized, single-center study revealed that pharmacotherapeutic education of patients with T2DM could significantly improve 30-day post-discharge medication adherence without a significant reduction in adverse clinical outcomes[20]. A systematic review and meta-analysis also demonstrated that educational interventions improved medication adherence among adult patients diagnosed with diabetes[21]. Moreover, structured education had a positive impact on glucose control and hypoglycemia in T2DM[22]. However, no study has directly explored the causality between formal education (years of schooling) and T2DM.

Interestingly, previous findings indicated a causal association between low educational attainment and increased risk of smoking[23]. Observational studies have also suggested an association between smoking and risk of T2DM[24,25]. Based on these findings, we hypothesized that education might be negatively causally associated with T2DM. As expected, our MR results revealed that the OR (95%CI) for T2DM was 0.392 (0.263-0.583) per standard deviation increase (3.6 years) in education, indicating that genetically predicted education was negatively causally associated with T2DM. Since education appears to be a modifiable protective factor against T2DM, more years of education in the population are recommended for T2DM prevention.

Because the mechanisms that might connect education to T2DM have not been fully described, we investigated potential mediators of this relationship. Ten representative modifiable metabolic risk factors were chosen for further MR analysis. Our results indicated that education was causally associated with CHD, TG, BMI, WC, and WHR in the discovery phase. In the replication data set, however, only the results for CHD, BMI, and WC were replicated. Therefore, there were negative causal associations between genetically determined education and CHD, BMI, and WC, which were considered to be the potential mediators between education and T2DM.

A previous study revealed that genetic predisposition towards 3.6 years of additional education was associated with a one-third lower risk of CHD, indicating that low education was a causal risk factor in the development of CHD[26], which is in accord with our finding. Another MR analysis also suggested that there might be a negative causal effect of education on BMI[27]. Our results provide further evidence that more years of education might help the public better control BMI, which is beneficial for blood glucose regulation and T2DM prevention. No causal study has directly explored the effect of education on WC. A randomized clinical trial revealed that nutrition therapy and a multimedia diabetes education program had a positive impact on achieving metabolic control goals in T2DM, including reductions in HbA1c, glucose, TG, and weight, yet the change in WC was not statistically significant[28]. Another study reported that diet-related and lifestyle-related school-based education could reduce central adiposity in pre-teenagers[29]. The differing effects of education on WC might be due to the various educational programs or the diverse populations used in these studies.

Since our MR investigation indicated that CHD, BMI, and WC might be potential mediators linking fewer years of education to an increased risk of T2DM, we further evaluated whether these three potential mediators were associated with T2DM by MR analysis. Only BMI was positively associated with T2DM. In summary, BMI might serve as a mediator between education and T2DM.

BMI is a well-described risk factor for T2DM[30], and several large-scale MR studies have addressed its positive causal association with T2DM[31-33]. Our results also demonstrated that higher BMI could lead to a higher risk of T2DM. Given the effects of education on BMI and T2DM, for most developing countries, where the majority of the population receives few years of education, we recommend more years of education and a BMI control program for public T2DM prevention.


Short education was found to be associated with an increased T2DM risk. BMI might serve as a potential mediator between them.

Research background

The causality between education and type 2 diabetes mellitus (T2DM) remains unclear.

Research motivation

In this study, a network Mendelian randomization (MR) framework was applied to determine the causality between education and T2DM from summarized genome-wide association study data.

Research objectives

We used a network MR to identify the causality between education and T2DM and the potential metabolic risk factors [coronary heart disease (CHD), total cholesterol, low-density lipoprotein, triglycerides, body mass index (BMI), waist circumference (WC), waist-to-hip ratio, fasting insulin, fasting glucose, and glycated hemoglobin] from summarized genome-wide association study data.

Research methods

Two-sample MR and network MR were performed to obtain the causality between education-T2DM, education-mediator, and mediator-T2DM. Summary statistics from the Social Science Genetic Association Consortium (discovery data) and Neale Lab consortium (replication data) were used for education. DIAGRAM plus Metabochip consortium data were utilized for T2DM.

Research results

In the IVW method, the odds ratio (95%CI) for T2DM was 0.392 (0.263-0.583) per standard deviation increase (3.6 years) in education, without heterogeneity or horizontal pleiotropy. Education was genetically associated with CHD, triglycerides, BMI, WC, and waist-to-hip ratio in the discovery phase, yet only the results for CHD, BMI, and WC were confirmed in the replication data. Moreover, BMI was positively associated with T2DM.

Research conclusions

Short education was found to be associated with increased T2DM risk. BMI might serve as a potential mediator between them.

Research perspectives

In most developing countries, the majority of the population receives few years of education. More years of education and a BMI control program are recommended for public T2DM prevention.


