-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.Rmd
609 lines (445 loc) · 24.5 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
---
output: github_document
bibliography: references.bib
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# RegrCoeffsExplorer
<!-- badges: start -->
![Build Status](https://img.shields.io/badge/build-passing-brightgreen)
![Lifecycle:Stable](https://img.shields.io/badge/Lifecycle-Stable-97ca00)
[![R-CMD-check](https://github.com/r-lib/rcmdcheck/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/r-lib/rcmdcheck/actions/workflows/R-CMD-check.yaml)
[![CRAN Status](https://www.r-pkg.org/badges/version/RegrCoeffsExplorer)](https://CRAN.R-project.org/package=RegrCoeffsExplorer)
<!-- badges: end -->
_Always present effect sizes for primary outcomes_ [@Wilkinson1999Statistical].
_The size of the regression weight depends on the other predictor variables included in the equation and is, therefore, prone to change across situations involving different combinations of predictors_ [@Dunlap1998Interpretations].
_Any interpretations of weights must be considered context-specific_ [@Thompson1998].
_Dissemination of complex information derived from sophisticated statistical models requires diligence_ [@Chen2003].
The **goal** of `RegrCoeffsExplorer` is to **enhance the interpretation** of
regression results by providing **visualizations** that integrate empirical data
distributions. This approach facilitates a more thorough understanding of the
impacts of changes exceeding one unit in the independent variables on the
dependent variable for models fitted within Linear Models (LM),
Generalized Linear Models (GLM), and
Elastic-Net Regularized Generalized Linear Models (GLMNET) frameworks.
## Installation
CRAN version [@R-RegrCoeffsExplorer] can be installed with:
```{r, eval=F, echo=T, results="hide", warning=F, message=F }
install.packages("RegrCoeffsExplorer")
```
You can install the development version of `RegrCoeffsExplorer` from
[GitHub](https://github.com/vadimtyuryaev/RegrCoeffsExplorer) with:
```{r, eval=F, echo=T, results="hide", warning=F, message=F }
library(devtools)
devtools::install_github("vadimtyuryaev/RegrCoeffsExplorer",
ref = "main",
dependencies = TRUE,
build_vignettes = TRUE)
```
Alternatively, you can use the `remotes` library with the following command:
`remotes::install_github()`.
## A Very Brief Recap on Logistic Regression
Logistic regression [@mccullagh1989] is a statistical method used for binary classification.
Unlike linear regression, which predicts continuous outcomes, logistic
regression predicts the probability of a binary outcome (1 or 0, Yes or No,
True or False). The core function in logistic regression is the
logistic function, also known as the **sigmoid function**, which maps any input
into the range (0, 1), making it interpretable as a **probability**. In logistic
regression, we model the **log odds** (logarithm of Odds Ratio) of the probability
as a linear function of the input features.
Sigmoid function is defined as:
$$\sigma(z)=\frac{1}{1+\exp(-z)}$$
Probability of **success** is calculated in the following manner:
$$P(Y=1|\mathbf{X}) = \pi(\mathbf{X}) = \frac{\exp(\beta_{0}+\beta_{1}X_{1}+\ldots+\beta_{n}X_{n})}{1+\exp(\beta_{0}+\beta_{1}X_{1}+\ldots+\beta_{n}X_{n})} =\frac{\exp(\mathbf{X}\beta)}{1+\exp(\mathbf{X}\beta)} = \frac{1}{1+\exp(-\mathbf{X}\beta)}$$
Odds Ratio is:
$$OR = \frac{\pi(\mathbf{X})}{1-\pi(\mathbf{X})} =\frac{\frac{\exp(\mathbf{X}\beta)}{1+\exp(\mathbf{X}\beta)}}{1- \frac{\exp(\mathbf{X}\beta)}{1+\exp(\mathbf{X}\beta)}} =\exp(\beta_{0}+\beta_{1}X_{1}+\ldots+\beta_{n}X_{n})$$
Log Odds or the Logistic Transformation of the probability of success is:
$$\log{OR}=\log{\frac{\pi(\mathbf{X})}{1-\pi(\mathbf{X})}} =\beta_{0}+\beta_{1}X_{1}+\ldots+\beta_{n}X_{n}$$
Change in log odds when one predictor variable ($X_{1}$) increases by one unit,
while **all other variables remain unchanged**:
$$\log{\frac{P(Y=1|\mathbf{X_{X_1=X_1+1}})}{P(Y=0|\mathbf{X_{X_1=X_1+1}})}} -\log{\frac{P(Y=1|\mathbf{X_{X_1=X_1}})}{P(Y=0|\mathbf{X_{X_1=X_1}})}} \overset{1}{=}$$
$$\overset{1}{=} \beta_{0}+\beta_{1}(X_{1}+1)+\ldots+\beta_{n}X_{n} - (\beta_{0}+\beta_{1}X_{1}+\ldots+\beta_{n}X_{n}) =\beta_{1}$$
Therefore, coefficient $\beta_{1}$ shows expected change in the Log Odds for a
one unit increase in $X_1$. Thus, expected change in the Odds Ratio is
$\exp(\beta_{1})$. Finally, expected change in the Odds Ratio if $X_1$ changes
by k units whilst all other variables remain unchanged is $[\exp(\beta_{1})]^k$
or $\exp(\beta_{1} \times k)$.
## Examples
### Consider implications of a change exceeding one unit on the Odds Ratio
To generate a dataset for logistic regression analysis, we simulate four
continuous predictor variables and one categorical predictor variable.
The continuous predictors are sampled from a normal distribution, each with
distinct means and standard deviations. The categorical predictor is generated
as a dummy variable with two levels. The binary response variable is calculated
by applying a logistic function to a linear combination of the predictors,
thereby transforming the latent variable to a probability scale, which serves
as the basis for generating binary outcomes through a binomial sampling process.
```{r example, fig.width=12, fig.height=6}
library(RegrCoeffsExplorer)
library(gridExtra)
# Set seed for reproducibility
set.seed(1945)
# Set the number of observations
n = 1000
# Random means and SDs
r_means = sample(1:5, 4, replace = TRUE)
r_sd = sample(1:2, 4, replace = TRUE)
# Generate predictor variables
X1 = rnorm(n, mean = r_means[1], sd = r_sd[1])
X2 = rnorm(n, mean = r_means[2], sd = r_sd[2])
X3 = rnorm(n, mean = r_means[3], sd = r_sd[3])
X4 = rnorm(n, mean = r_means[4], sd = r_sd[4])
# Create a dummy variable
F_dummy=sample(1:2, n, replace = TRUE) - 1
# Convert to factor
Factor_var=factor(F_dummy)
# Define coefficients for each predictor
beta_0 = -0.45
beta_1 = -0.35
beta_2 = 1.05
beta_3 = -0.7
beta_4 = 0.55
beta_5 = 1.25
# Generate the latent variable
latent_variable = beta_0 + beta_1*X1 + beta_2*X2 + beta_3*X3 + beta_4*X4 +beta_5*F_dummy
# Convert the latent variable to probabilities using the logistic function
p = exp(latent_variable) / (1 + exp(latent_variable))
# Generate binomial outcomes based on these probabilities
y = rbinom(n, size = 1, prob = p)
# Fit a GLM with a logistic link, including the factor variable
glm_model = glm(y ~ X1 + X2 + X3 + X4 + Factor_var,
family = binomial(link = "logit"),
data = data.frame(y, X1, X2, X3, X4, Factor_var))
grid.arrange(vis_reg(glm_model, CI = TRUE, intercept = TRUE,
palette = c("dodgerblue", "gold"))$"SidebySide")
```
Note that upon consideration of the empirical distribution of
data, particularly concerning the influence on the response variable, `y`,
attributable to the interquartile change (Q3-Q1) in the
dependent variables, there is a discernible enlargement in the magnitudes of
coefficients `X2` and `X4`.
Let us delve further into the underlying reasons for this phenomenon.
### Check estimated coefficients
```{r glm-summary}
summary(glm_model)
```
### Obtain Odds Ratio (OR)
```{r Odds-Ratio}
exp(summary(glm_model)$coefficients[,1])
```
The coefficients for `X1` through `X4` represent the change in the OR
for a **one-unit shift** in each respective coefficient, while the coefficient
for `Factor_var1` signifies the variation in OR resulting from a transition from
the reference level of 0 to a level 1 in the factor variable. At first glance,
it may seem that the factor variable exerts the most significant impact on the
odds ratio.Yet, this interpretation can often be deceptive, as it fails to take
into account the distribution of empirical data.
### Real data differences
```{r real-data-differences}
# Calculate all possible differences (1000 choose 2)
all_diffs <- combn(X2, 2, function(x) abs(x[1] - x[2]))
# Count differences that are exactly 1 units
num_diffs_exactly_one = sum(abs(all_diffs) == 1)
# Count the proportion of differences that more or equal to 2 units
num_diffs_2_or_more = sum(abs(all_diffs)>=2)/sum(abs(all_diffs))
print("Number of differences of exactly 1 unit:")
num_diffs_exactly_one
print("Proportion of differences of two or more units:")
num_diffs_2_or_more
```
**None** of the differences observed within the values of the variable `X2` equate
to a single unit. Furthermore, **in excess of 15 percent** of these differences
are equal or surpass a magnitude of two units.Therefore, when analyzing standard
regression output displaying per-unit interpretations, we, in a sense, comment
on difference that might not exist in the real data.Consequently, when engaging
in the analysis of standard regression outputs that provide interpretations on a
per-unit basis, there is an implicit commentary on disparities that may not be
present within the actual data. A more realistic approach is to utilize an
actual observable difference, for example `Q3`-`Q1`, to calculate the OR.
### Plot changes in OR and empirical data distribution for the `X2` variable
```{r plot-Odds-Ratio,fig.width=12, fig.height=8}
plot_OR(glm_model,
data.frame(y, X1, X2, X3, X4, Factor_var),
var_name="X2",
color_filling=c("#008000", "#FFFF00","#FFA500","#FF0000"))$"SidebySide"
```
The top plot delineates the variations in the OR corresponding to data
differentials spanning from the minimum to the first quartile (Q1),
the median (Q2), the third quartile (Q3), and the maximum.The bottom plot
depicts a boxplot with a notch to display a confidence interval around the
median and jitters to add random noise to data points preventing overlap and
revealing the underlying data distribution more clearly. Substantial changes in
the OR progressing alone the empirical data are clearly observed.
### Customize plots - 1/2
```{r, warning=F, message=F}
require(ggplot2)
vis_reg(glm_model, CI = TRUE, intercept = TRUE,
palette = c("dodgerblue", "gold"))$PerUnitVis+
ggtitle("Visualization of Log Odds Model Results (per unit change)")+
ylim(0,6)+
xlab("Predictors")+
ylab("Estimated OR")+
theme_bw()+
scale_fill_manual(values = c("red","whitesmoke" ))+
theme(plot.title = element_text(hjust = 0.5))
```
As observed, when returning individual plots, the resulting entities are `ggplot`
objects. Consequently, any operation that is compatible with ggplot can be
applied to these plots using the `+` operator.
### Customize plots - 2/2
```{r, warning=F, message=F}
vis_reg(glm_model, CI = TRUE, intercept = TRUE,
palette = c("dodgerblue", "gold"))$RealizedEffectVis+
scale_fill_manual(values = c("#DC143C","#DCDCDC" ))+
geom_hline(yintercept=exp(summary(glm_model)$coefficients[,1][3]*IQR(X2)), # note the calculation
linetype="dashed", color = "orange", size=1)
```
## Vignettes
Would you like to know more? Please, check out the in-depth vignettes below.
```{r vignettes, eval=F, echo=T, results="hide", warning=F, message=F}
vignette("BetaVisualizer",
package = "RegrCoeffsExplorer") # To visualize realized effect sizes
vignette("OddsRatioVisualizer", # To visualize Odds Ratios
package = "RegrCoeffsExplorer")
```
## A note on estimation of Confidence Intervals for objects fitted using regularized regression
It is imperative to to gain a comprehensive understanding of the post-selection
inference [@Hastie2015] rationale and methodologies prior to the generation and
graphical representation of confidence intervals for objects fitted via the
Elastic-Net Regularized Generalized Linear Models. Please, kindly
consult the hyperlinks below containing the designated literature
and the `BetaVisualizer` vignette.
1. [Statistical Learning with Sparsity](https://www.ime.unicamp.br/~dias/SLS.pdf)
2. [Recent Advances in Post-Selection Statistical Inference](https://www.math.wustl.edu/~kuffner/TibshiraniSlides.pdf)
3. [Tools for Post-Selection Inference](https://CRAN.R-project.org/package=selectiveInference)
## A cautionary note on intepretation of interaction effects in Generalized Linear Models (GLM)
A frequently misrepresented and misunderstood concept is that coefficients
for interaction terms in GLMs **do not** have straightforward slope
interpretations. This implies, among other considerations,
that in models including interaction terms, the ORs derived from
coefficients might not be meaningful [@Chen2003]. Many situations demand
recalculation of correct ORs, and the interpretation of interaction terms depends
on other predictors in the model due to inherent non-linearity. Consequently,
researchers should exercise caution when including and interpreting interaction
terms in GLM models.
In the following, we adopt the approach by @McCabe2021 to demonstrate
computationally how interactions may depend on all predictors in the model and
how these interactions can be estimated and interpreted on the probability scale.
### Theoretical considerations regarding the interpretation of interaction terms
Consider a Linear Model with two continuous predictors and an interaction term:
$$E[Y|\textbf{X}] = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2$$
Take the second order cross-partial derivative of $E[Y|\textbf{X}]$ with
respect to both $x_1$ and $x_2$:
$$\gamma_{12}^2 = \frac{\partial^2 E[Y| \textbf{X}]}{\partial x_1 \partial x_2} = \beta_{12}$$
The interaction term $\beta_{12}$ shows how effect of $x_1$ on $E[Y|\textbf{X}]$
changes for every one unit increase in $x_2$ and vice versa.
Now consider a logistic regression model with a non-linear link function $g(\cdot)$,
two continuous predictors and an interaction term:
$$g(E[Y|\textbf{X}])=\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2$$
Converting GLM to a natural scale using the inverse link function:
$$E[Y|\textbf{X}]=g^{-1}(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2)$$
Note that the relationship is **no longer linear**.
As an example, consider logistic regression:
$$\log(\frac{E[Y|\textbf{X}]}{1-E[Y|\textbf{X}]})=\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2=\eta$$
Transformation leads to:
$$E[Y|\textbf{X}]=\frac{1}{1+exp(-\{\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2\})}=\frac{1}{1+exp(-\eta)}=\frac{exp(\eta)}{1+exp(\eta)}$$
Let's take the second order cross-partial derivative.
Using the chain rule:
$$\gamma_{12}^2 = \frac{\partial^2 E[Y|\textbf{X}]}{\partial x_1 \partial x_2} = \frac{\partial^2 g^{-1}(\eta)}{\partial x_1 \partial x_2} = \frac{\partial }{\partial x_1} \left[ \frac{\partial g^{-1}(\eta)}{\partial x_2} \right] \overset{2}{=}$$
$$\overset{2}{=} \frac{\partial}{\partial x_1} \left[ \frac{\partial g^{-1}(\eta)}{\partial \eta} \frac{\partial \eta}{ \partial x_2} \right] = \frac{\partial}{\partial x_1} [(\beta_2+\beta_{12} x_1)\dot{g}^{-1}(\eta)]$$
Utilizing the product rule followed by the chain rule:
$$\frac{\partial}{\partial x_1} \left[(\beta_2+\beta_{12} x_1)\dot{g}^{-1}(\eta) \right] =\frac{\partial}{\partial x_1} [(\beta_2+\beta_{12} x_1)]\dot{g}^{-1}(\eta) + [(\beta_2+\beta_{12} x_1)]\frac{\partial}{\partial x_1}[\dot{g}^{-1}(\eta)] \overset{3}{=}$$
$$\overset{3}{=} \beta_{12} \dot{g}^{-1}(\eta)+(\beta_2+\beta_{12}x_1)(\beta_1+\beta_{12}x_2)\ddot{g}^{-1}(\eta)$$
First and second derivative of the inverse link function are:
$$\dot{g}^{-1}(\eta)=\frac{exp(\eta)}{(1+exp(\eta))^2}$$
$$\ddot{g}^{-1}(\eta)=\frac{exp(\eta)(1-exp(\eta))}{(1+exp(\eta))^3}$$
Therefore:
$$\gamma_{12}^2=\beta_{12} \frac{e^{\eta}}{(1+e^{\eta})^2}+(\beta_1+\beta_{12}x_2)(\beta_2+\beta_{12}x_1)\frac{e^{\eta}(1-e^{\eta})}{(1+e^{\eta})^3}$$
Calculation above show that an interaction term in GLMs depends on all predictors
within the model.This implies that the coefficient $\beta_{12}$ alone does not
adequately describe how the effect of variable $x_1$ on $E[Y|\textbf{X}]$ changes for
each one-unit increase in the variable $x_2$, and vice versa.
### Computational perspectives on interaction terms in Generalized Linear Models
We sample two moderately correlated predictors `X1b` and `X2b` from a standard
bivariate normal distribution and use them to simulate a logistic regression
model. By fitting the model, we obtain the estimated coefficients and
calculate the values of $\hat{\gamma}^2_{12}$. Subsequently, we visualize
the slopes of $\hat{E}[Y|\mathbf{X}]$ calculated for several combinations of
`X1b` and `X2b`.
#### Sample two predictors `X1b` and `X2b` from a standard bivariate normal distribution:
$$\left( \begin{array}{c}
X_1 \\
X_2
\end{array} \right)
\sim \mathcal{N}\left(
\begin{array}{c}
\begin{pmatrix}
\mu_1 \\
\mu_2
\end{pmatrix}
\end{array},
\begin{pmatrix}
\sigma_1^2 & \rho \sigma_1 \sigma_2 \\
\rho \sigma_1 \sigma_2 & \sigma_2^2
\end{pmatrix}
\right)$$
Recall that the the pdf of the bivariate normal distribution has the following
form [@PSUStat505Bivariate]:
$$\phi(x_1, x_2) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}} \exp\left(-\frac{1}{2(1 - \rho^2)} \left[ \left(\frac{x_1 - \mu_1}{\sigma_1}\right)^2 - 2\rho \left(\frac{x_1 - \mu_1}{\sigma_1}\right) \left(\frac{x_2 - \mu_2}{\sigma_2}\right) + \left(\frac{x_2 - \mu_2}{\sigma_2}\right)^2 \right] \right)$$
```{r}
# Load necessary library
library(MASS) # for sampling from a multivariate normal distribution
library(ggplot2)
library(reshape2) # for melting data frames
# Set parameters
n_samples = 1000 # Number of samples
mean_vector = c(0, 0) # Mean vector for X1 and X2
std_devs = c(1, 1) # Standard deviations for X1 and X2
correlation = 0.6 # Correlation between X1 and X2
# Generate the covariance matrix
cov_matrix = matrix(c(std_devs[1]^2,
std_devs[1]*std_devs[2]*correlation,
std_devs[1]*std_devs[2]*correlation,
std_devs[2]^2),
nrow = 2)
# Generate samples from the bivariate normal distribution
set.seed(2020)
samples = mvrnorm(n = n_samples, mu = mean_vector, Sigma = cov_matrix)
# Convert samples to a data frame
bivariate_sample_df = data.frame(X1 = samples[, 1], X2 = samples[, 2])
X1b = bivariate_sample_df$X1
X2b = bivariate_sample_df$X2
```
#### Fit logistic model with an interaction term:
```{r}
# Set parameters
n_samples = 1000
beta_0 = -1
beta_1 = 2
beta_2 = -1.5
beta_3 = 0.5 # Coefficient for the interaction term
# Calculate probabilities including interaction term
log_odds = beta_0 + beta_1 * X1b + beta_2 * X2b + beta_3 * X1b * X2b
prob = 1 / (1 + exp(-log_odds))
# Generate response variable
Y = rbinom(n_samples, size = 1, prob = prob)
# Fit logistic regression
data = data.frame(Y = Y, X1 = X1b, X2 = X2b, X1X2 = X1b * X2b)
model = glm(Y ~ X1 + X2 + X1X2, family = binomial(link = "logit"), data = data)
# Print estimated coefficients
summary(model)
```
#### Check distribution of interaction term values at different values of `X1b` and`X2b`
```{r, fig.height=6, fig.width=6, dpi=1200}
# Define x1 and x2 ranges
x1_range = seq(-3, 3, length.out = 100)
x2_quantiles = quantile(X2b, probs = c(0.25, 0.50, 0.75))
# Define the function to calculate gamma squared d[E[Y|X]]/dx1dx2
second_derivative = function(x1, x2, beta) {
z = beta[1] + beta[2]*x1 + beta[3]*x2 + beta[4]*x1*x2
g = exp(z) / (1 + exp(z))
term1 = beta[4] * exp(z) / (1 + exp(z))^2
term2 = (beta[2] + beta[4]*x2) * (beta[3] + beta[4]*x1) * exp(z) * (1 - exp(z)) / (1 + exp(z))^3
gamma_squared = term1 + term2
return(gamma_squared)
}
# Calculate gamma squared for each combination
gamma_squared_values = outer(x1_range, x2_quantiles, Vectorize(function(x1, x2) {
second_derivative(x1, x2, coef(model))
}))
# Create a data frame from the matrix
gamma_df = as.data.frame(gamma_squared_values)
# Add X1 values as a column for identification
gamma_df$X1 = x1_range
# Melt the data frame for use with ggplot
long_gamma_df = melt(gamma_df, id.vars = "X1",
variable.name = "X2_Quantile",
value.name = "GammaSquared")
# Convert X2_Quantile to a factor
levels(long_gamma_df$X2_Quantile) = c("25%", "50%", "75%")
# Plot the boxplot
ggplot(long_gamma_df, aes(x = X2_Quantile, y = GammaSquared)) +
geom_boxplot() +
labs(title = "Boxplot of Gamma Squared Values",
x = "X2b Percentile",
y = "Gamma Squared") +
theme_minimal()+
theme(plot.title = element_text(hjust = 0.5))
```
Note that the estimate of the interaction term is positive ($0.68345$). Yet,
significant number of the gamma squared values are negative. Moreover,
the magnitude and sign of $\hat{\gamma}_{12}^2$ are contingent upon the specific
combination of `X1b` and `X2b` variables utilized in the analysis.
#### Visualize changes in $\hat{E}[Y|\textbf{x}]$ associated with one unit increase in `X1b` at different quantiles of `X2b`.
```{r,fig.height=8, fig.width=8, dpi=1200}
# Generate a sequence of x1 values for plotting
x1_values = seq(-1.5, 1.5, length.out = 100)
# Calculate predicted probabilities for each combination of x1 and quantile x2
predictions = lapply(x2_quantiles, function(x2) {
predicted_probs = 1 / (1 + exp(-(coef(model)[1] + coef(model)[2] * x1_values + coef(model)[3] * x2 + coef(model)[4] * x1_values * x2)))
return(predicted_probs)
})
# Calculate values for plotting slopes
x1_values_lines_1 = c(-1,0)
x1_values_lines_2 = c(0.5,1.5)
x2_quantiles_lines = quantile(X2b)[c(2,4)]
predictions_lines_1 = lapply(x2_quantiles_lines, function(x2) {
predicted_probs = 1 / (1 + exp(-(coef(model)[1] + coef(model)[2] * x1_values_lines_1 + coef(model)[3] * x2 + coef(model)[4] * x1_values_lines_1 * x2)))
return(predicted_probs)
})
predictions_lines_2 = lapply(x2_quantiles_lines, function(x2) {
predicted_probs = 1 / (1 + exp(-(coef(model)[1] + coef(model)[2] * x1_values_lines_2 + coef(model)[3] * x2 + coef(model)[4] * x1_values_lines_2 * x2)))
return(predicted_probs)
})
# Plot the results
plot(x1_values, predictions[[1]], type = 'l', lwd = 2, ylim = c(0, 1),
ylab = "Estimated Probability", xlab = "X1b", main = "Interaction Effects")
segments(-1, predictions_lines_1[[2]][1],0, predictions_lines_1[[2]][2],
col="red4", lwd=2.5,lty="dotdash")
segments(-1, predictions_lines_1[[2]][1],0, predictions_lines_1[[2]][1],
col="red4", lwd=2.5,lty="dotdash")
segments( 0, predictions_lines_1[[2]][1],0, predictions_lines_1[[2]][2],
col="red4", lwd=2.5,lty="dotdash")
segments(-1, predictions_lines_1[[1]][1],0, predictions_lines_1[[1]][2],
col="red4", lwd=2.5,lty="dotdash")
segments(-1, predictions_lines_1[[1]][1],0, predictions_lines_1[[1]][1],
col="red4", lwd=2.5,lty="dotdash")
segments( 0, predictions_lines_1[[1]][1],0, predictions_lines_1[[1]][2],
col="red4", lwd=2.5,lty="dotdash")
segments(0.5, predictions_lines_2[[2]][1],1.5, predictions_lines_2[[2]][2],
col="springgreen4", lwd=3,lty="twodash")
segments(0.5, predictions_lines_2[[2]][1],1.5, predictions_lines_2[[2]][1],
col="springgreen4", lwd=3,lty="twodash")
segments(1.5, predictions_lines_2[[2]][1],1.5, predictions_lines_2[[2]][2],
col="springgreen4", lwd=3,lty="twodash")
segments(0.5, predictions_lines_2[[1]][1],1.5, predictions_lines_2[[1]][2],
col="springgreen4", lwd=3,lty="solid")
segments(0.5, predictions_lines_2[[1]][1],1.5, predictions_lines_2[[1]][1],
col="springgreen4", lwd=3,lty="solid")
segments(1.5, predictions_lines_2[[1]][1],1.5, predictions_lines_2[[1]][2],
col="springgreen4", lwd=3,lty="solid")
lines(x1_values, predictions[[2]], lty = 2, lwd = 2)
lines(x1_values, predictions[[3]], lty = 3, lwd = 2)
legend("topleft", legend = c("25% X2b", "50% X2b", "75% X2b"), lty = 1:3, lwd = 2)
```
The alterations in $\hat{E}[Y|\textbf{x}]$ associated with
one-unit increments in `X1b` at the first and third quartiles of `X2b` demonstrate
significant disparities. Observe the variations in $\hat{E}[Y|\textbf{x}]$ for a
unit change in `X1b` from $-1$ to $0$ (red) compared to the changes in
$\hat{E}[Y|\textbf{x}]$ for a unit change in X1b from $0.5$ to $1.5$ (green).
Observe how the magnitudes of changes in the estimated probability associated
with a unit increment in `X1b` vary depending on the position along the number
line where the increment occurs. This observation substantiates the preliminary
assertion regarding the inadequacy of $\hat{\beta}_{12}$ in capturing the
interaction effects on the probability scale within the GLM framework, which
incorporates two continuous predictors and an interaction term. Consequently,
meticulous attention is required when incorporating interaction terms and
interpreting the outcomes of models that include such terms.
# References