Model

In this project, I used simple linear regression to model life expectancy in 2021 (le_2021), the outcome variable, as a function of three socio-economic and environmental predictors: the Human Development Index (hdi_2021), a composite index of human development; annual carbon dioxide emissions per person (co2_prod_2021); and gross national income per capita (gnipc_2021), a measure of the average income of a country's residents. Each predictor was fitted in its own model, so every slope describes an unadjusted association. I chose linear regression because it quantifies the relationship between a continuous predictor and life expectancy and gives interpretable estimates: each slope is the expected difference in life expectancy, in years, associated with a one-unit difference in the predictor across countries. The models therefore show how differences in human development, per-capita carbon dioxide emissions, and average income are associated with life expectancy. Because each model contains a single predictor measured in the same year as the outcome, the estimates describe associations rather than causal effects, but they still give a first indication of how socio-economic and environmental conditions relate to public health outcomes.
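A minimal sketch of this setup in R is shown below. The column names match those in the model calls in this section, but the data frame is simulated placeholder data (the project's clean_data is not reproduced here), so the printed numbers will not match the reported results. The objects predictors, models, and results are hypothetical names introduced only for illustration.

```r
# Hypothetical stand-in for clean_data: simulated values, NOT the project's data.
set.seed(1)
n <- 102  # same number of observations as reported in this section
clean_data <- data.frame(
  hdi_2021      = runif(n, 0.4, 0.95),
  co2_prod_2021 = rexp(n, rate = 0.25),
  gnipc_2021    = rlnorm(n, meanlog = 9, sdlog = 1)
)
# Simulated outcome, used only so that the sketch runs end to end.
clean_data$le_2021 <- 40 + 44 * clean_data$hdi_2021 + rnorm(n, sd = 3)

# One simple linear regression per predictor, all with the same outcome.
predictors <- c("hdi_2021", "co2_prod_2021", "gnipc_2021")
models <- lapply(predictors, function(p) {
  lm(reformulate(p, response = "le_2021"), data = clean_data)
})
names(models) <- predictors

# Slope estimate, 95% confidence interval, and R-squared for each model.
results <- do.call(rbind, lapply(predictors, function(p) {
  fit <- models[[p]]
  ci  <- confint(fit, level = 0.95)[p, ]
  data.frame(
    predictor = p,
    estimate  = unname(coef(fit)[p]),
    ci_low    = unname(ci[1]),
    ci_high   = unname(ci[2]),
    r_squared = summary(fit)$r.squared
  )
}))
print(results)
```

Calling summary(models$hdi_2021) on any of the fitted objects prints output in the same layout as the summaries that follow.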


Call:
lm(formula = le_2021 ~ hdi_2021, data = clean_data)

Residuals:
    Min      1Q  Median      3Q     Max 
-8.9936 -1.2675  0.3369  2.1408  7.2580 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   39.627      1.753   22.60   <2e-16 ***
hdi_2021      43.635      2.568   16.99   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.261 on 100 degrees of freedom
Multiple R-squared:  0.7428,    Adjusted R-squared:  0.7402 
F-statistic: 288.8 on 1 and 100 DF,  p-value: < 2.2e-16

Call:
lm(formula = le_2021 ~ co2_prod_2021, data = clean_data)

Residuals:
    Min      1Q  Median      3Q     Max 
-14.624  -4.525   1.196   4.457  12.013 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)   67.22466    0.65425  102.75  < 2e-16 ***
co2_prod_2021  0.45221    0.08797    5.14 1.36e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 5.718 on 100 degrees of freedom
Multiple R-squared:  0.209, Adjusted R-squared:  0.2011 
F-statistic: 26.42 on 1 and 100 DF,  p-value: 1.364e-06

Call:
lm(formula = le_2021 ~ gnipc_2021, data = clean_data)

Residuals:
     Min       1Q   Median       3Q      Max 
-13.2975  -3.9292   0.7402   3.5701   8.8432 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 6.569e+01  6.456e-01 101.752  < 2e-16 ***
gnipc_2021  2.466e-04  3.131e-05   7.877 4.18e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 5.051 on 100 degrees of freedom
Multiple R-squared:  0.3829,    Adjusted R-squared:  0.3767 
F-statistic: 62.04 on 1 and 100 DF,  p-value: 4.181e-12
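As a rough consistency check on the three summaries, and assuming the standard identity for a regression with a single predictor, each F-statistic can be recovered from the reported R² and the 100 residual degrees of freedom. Small discrepancies come from rounding in the printed output.

\[ F = \frac{R^2}{1 - R^2} \times \text{df}_{\text{residual}} \]

\[\begin{aligned} \text{HDI:} \quad & \frac{0.7428}{0.2572} \times 100 \approx 288.8 \\ \text{CO}_2\text{:} \quad & \frac{0.209}{0.791} \times 100 \approx 26.4 \\ \text{GNI per capita:} \quad & \frac{0.3829}{0.6171} \times 100 \approx 62.0 \end{aligned}\]

Read this way, HDI alone accounts for about 74% of the variation in life expectancy across the 102 observations, compared with about 38% for GNI per capita and about 21% for CO2 emissions per person.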

| Characteristic | Beta | 95% CI | p-value |
|----------------|------|--------|---------|
| (Intercept)    | 40   | 36, 43 | <0.001  |
| hdi_2021       | 44   | 39, 49 | <0.001  |

R² = 0.743; Adjusted R² = 0.740; Sigma = 3.26; Statistic = 289; p-value = <0.001; df = 1; Log-likelihood = -264; AIC = 535; BIC = 542; Deviance = 1,063; Residual df = 100; No. Obs. = 102

CI = Confidence Interval

| Characteristic | Beta | 95% CI     | p-value |
|----------------|------|------------|---------|
| (Intercept)    | 67   | 66, 69     | <0.001  |
| co2_prod_2021  | 0.45 | 0.28, 0.63 | <0.001  |

R² = 0.209; Adjusted R² = 0.201; Sigma = 5.72; Statistic = 26.4; p-value = <0.001; df = 1; Log-likelihood = -322; AIC = 649; BIC = 657; Deviance = 3,270; Residual df = 100; No. Obs. = 102

CI = Confidence Interval

| Characteristic | Beta | 95% CI     | p-value |
|----------------|------|------------|---------|
| (Intercept)    | 66   | 64, 67     | <0.001  |
| gnipc_2021     | 0.00 | 0.00, 0.00 | <0.001  |

R² = 0.383; Adjusted R² = 0.377; Sigma = 5.05; Statistic = 62.0; p-value = <0.001; df = 1; Log-likelihood = -309; AIC = 624; BIC = 632; Deviance = 2,551; Residual df = 100; No. Obs. = 102

CI = Confidence Interval
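The 0.00 entries for gnipc_2021 in the last table are a rounding effect, not a zero slope. As a sketch, the interval can be reconstructed from the estimate and standard error in the gnipc_2021 model summary, assuming the usual t-based interval with 100 residual degrees of freedom (critical value of about 1.984):

\[\begin{aligned} \hat{\beta}_1 \pm t_{0.975,\,100} \times \text{SE}(\hat{\beta}_1) &= 2.466 \times 10^{-4} \pm 1.984 \times 3.131 \times 10^{-5} \\ &\approx 2.466 \times 10^{-4} \pm 0.621 \times 10^{-4} \\ &\approx (1.84 \times 10^{-4},\ 3.09 \times 10^{-4}) \end{aligned}\]

Expressed per 1,000 units of GNI per capita, this is a slope of about 0.25 years with an interval of roughly 0.18 to 0.31 years. The same calculation reproduces the intervals shown for hdi_2021 (39, 49) and co2_prod_2021 (0.28, 0.63).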

\[\begin{equation} \text{Life Expectancy} = \beta_0 + \beta_1 \times \text{Human Development Index} + \epsilon \end{equation}\]

\[\begin{equation} \text{Life Expectancy} = \beta_0 + \beta_1 \times \text{CO}_2\ \text{Production} + \epsilon \end{equation}\]

\[\begin{equation} \text{Life Expectancy} = \beta_0 + \beta_1 \times \text{Gross National Income per Capita} + \epsilon \end{equation}\]
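Substituting the estimates from the model summaries into these equations gives the fitted lines below. The hat notation and the abbreviations LE, HDI, and GNI are shorthand introduced here, and the coefficients are rounded from the summaries.

\[\begin{aligned} \widehat{\text{LE}} &= 39.63 + 43.64 \times \text{HDI} \\ \widehat{\text{LE}} &= 67.22 + 0.452 \times \text{CO}_2 \\ \widehat{\text{LE}} &= 65.69 + 0.0002466 \times \text{GNI} \end{aligned}\]

As a worked reading of the slopes: a 0.1 difference in HDI corresponds to about \(0.1 \times 43.64 \approx 4.4\) years of life expectancy, and a difference of 10,000 in GNI per capita (in whatever currency units gnipc_2021 is recorded in) corresponds to about \(10{,}000 \times 0.0002466 \approx 2.5\) years. The CO2 slope is positive: each additional unit of emissions per person is associated with about 0.45 more years, not fewer.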