vignettes_gamlj_methods.Rmd
In this example we run a logistic model using the file
data(manymodels)
and show the standard R commands such as
summary()
, print()
and some additional
commands that can be applied to a GAMLj object to extract relevant
information.
We use GAMLj gamlj_glm()
function to estimate a logistic
model with variable manymodels$ybin
as dependent variable,
a continuous covariate manymodels$x
and two categorical
factors,manymodels$cat2
and manymodels$cat3
.
We add also the independent variables interaction.
manymodels$ybin<-factor(manymodels$ybin)
manymodels$cat2<-factor(manymodels$cat2)
manymodels$cat3<-factor(manymodels$cat3)
manymodels$x<-as.numeric(manymodels$x)
mod1<-gamlj_glm(formula = ybin~x*cat3*cat2, data=manymodels,model_type = "logistic")
mod1
GENERALIZED LINEAR MODEL
Model Info
─────────────────────────────────────────────────────────────────────────────────────────────────────────────
Info
─────────────────────────────────────────────────────────────────────────────────────────────────────────────
Model Type Logistic Model Model for binary y
Model stats::glm ybin ~ 1 + x + cat3 + cat2 + x:cat3 + x:cat2 + cat3:cat2 + x:cat3:cat2
Distribution Binomial Dichotomous event distribution of y
Link function Logit Log of the odd of y
Direction P(y=1)/P(y=0) P( ybin = 1 ) / P( ybin = 0 )
Sample size 120
Converged yes
C.I. method Wald
─────────────────────────────────────────────────────────────────────────────────────────────────────────────
Note. All covariates are centered to the mean
MODEL RESULTS
Model Fit
────────────────────────────────────────────
R² Adj. R² df X² p
────────────────────────────────────────────
0.303 0.235 11 49.2 < .001
────────────────────────────────────────────
Additional indices
─────────────────────────────────────────────────────────────
Info Model Value Comment
─────────────────────────────────────────────────────────────
LogLikelihood -56.561
AIC 137.121 Less is better
BIC 170.571 Less is better
Deviance 113.121 Less is better
Residual DF 108.000
Chi-squared/DF 0.938 Overdispersion indicator
─────────────────────────────────────────────────────────────
Omnibus tests
──────────────────────────────────────────
X² df p
──────────────────────────────────────────
x 25.4267 1 < .001
cat3 1.8625 2 0.394
cat2 8.7612 1 0.003
x:cat3 1.6693 2 0.434
x:cat2 3.5686 1 0.059
cat3:cat2 0.0553 2 0.973
x:cat3:cat2 0.5152 2 0.773
──────────────────────────────────────────
Parameter Estimates (Coefficients)
───────────────────────────────────────────────────────────────────────────────────────────────────────────────
Name Effect Estimate SE Exp(B) Lower Upper z p
───────────────────────────────────────────────────────────────────────────────────────────────────────────────
(Intercept) (Intercept) 0.668 0.279 1.951 1.1298 3.37 2.398 0.016
x x 1.653 0.432 5.224 2.2395 12.19 3.826 < .001
cat31 0 - -1 0.820 0.644 2.270 0.6424 8.02 1.273 0.203
cat32 1 - -1 0.657 0.687 1.929 0.5023 7.41 0.957 0.339
cat21 1 - -1 1.549 0.557 4.706 1.5786 14.03 2.779 0.005
x:cat31 x:(0 - -1) 0.247 0.865 1.280 0.2347 6.98 0.285 0.776
x:cat32 x:(1 - -1) 1.323 1.155 3.754 0.3903 36.10 1.145 0.252
x:cat21 x:(1 - -1) 1.482 0.864 4.401 0.8087 23.95 1.714 0.086
cat31:cat21 (0 - -1):(1 - -1) -0.229 1.288 0.796 0.0637 9.93 -0.177 0.859
cat32:cat21 (1 - -1):(1 - -1) -0.296 1.373 0.744 0.0504 10.97 -0.216 0.829
x:cat31:cat21 x:(0 - -1):(1 - -1) 0.677 1.731 1.967 0.0662 58.46 0.391 0.696
x:cat32:cat21 x:(1 - -1):(1 - -1) 1.570 2.310 4.809 0.0520 444.77 0.680 0.497
───────────────────────────────────────────────────────────────────────────────────────────────────────────────
The model is pretty printed, and the returned object is a R6 class
object containing all tables and functions required to manipulate the
results. For instance, the Omnibus tests
table can be
assessed with
mod1$main$anova
Omnibus tests
──────────────────────────────────────────
X² df p
──────────────────────────────────────────
x 25.4267 1 < .001
cat3 1.8625 2 0.394
cat2 8.7612 1 0.003
x:cat3 1.6693 2 0.434
x:cat2 3.5686 1 0.059
cat3:cat2 0.0553 2 0.973
x:cat3:cat2 0.5152 2 0.773
──────────────────────────────────────────
If a table is required to be in R dara.frame format (for using in other code, for instance), one can obtain it as follows:
df<-mod1$main$anova$asDF
class(df)
[1] "data.frame"
df
source test df p
1 x 25.4267 1 4.60e-07
2 cat3 1.8625 2 3.94e-01
3 cat2 8.7612 1 3.08e-03
4 x:cat3 1.6693 2 4.34e-01
5 x:cat2 3.5686 1 5.89e-02
6 cat3:cat2 0.0553 2 9.73e-01
7 x:cat3:cat2 0.5152 2 7.73e-01
If all tables are required as R data.frame, one can use the command
summary()
summary(mod1)
Model Info
info value
1 Model Type Logistic Model
2 Model stats::glm
3 Distribution Binomial
4 Link function Logit
5 Direction P(y=1)/P(y=0)
6 Sample size 120
7 Converged yes
8 C.I. method Wald
specs
1 Model for binary y
2 ybin ~ 1 + x + cat3 + cat2 + x:cat3 + x:cat2 + cat3:cat2 + x:cat3:cat2
3 Dichotomous event distribution of y
4 Log of the odd of y
5 P( ybin = 1 ) / P( ybin = 0 )
6
7
8
Model Fit
r2 ar2 df1 test p
"1" 0.303 0.235 11 49.2 8.79e-07
Additional indices
info value specs
1 LogLikelihood -56.561
2 AIC 137.121 Less is better
3 BIC 170.571 Less is better
4 Deviance 113.121 Less is better
5 Residual DF 108.000
6 Chi-squared/DF 0.938 Overdispersion indicator
Omnibus tests
source test df p
1 x 25.4267 1 4.60e-07
2 cat3 1.8625 2 3.94e-01
3 cat2 8.7612 1 3.08e-03
4 x:cat3 1.6693 2 4.34e-01
5 x:cat2 3.5686 1 5.89e-02
6 cat3:cat2 0.0553 2 9.73e-01
7 x:cat3:cat2 0.5152 2 7.73e-01
Parameter Estimates (Coefficients)
source label estimate se expb expb.ci.lower
1 (Intercept) (Intercept) 0.668 0.279 1.951 1.1298
2 x x 1.653 0.432 5.224 2.2395
3 cat31 0 - -1 0.820 0.644 2.270 0.6424
4 cat32 1 - -1 0.657 0.687 1.929 0.5023
5 cat21 1 - -1 1.549 0.557 4.706 1.5786
6 x:cat31 x:(0 - -1) 0.247 0.865 1.280 0.2347
7 x:cat32 x:(1 - -1) 1.323 1.155 3.754 0.3903
8 x:cat21 x:(1 - -1) 1.482 0.864 4.401 0.8087
9 cat31:cat21 (0 - -1):(1 - -1) -0.229 1.288 0.796 0.0637
10 cat32:cat21 (1 - -1):(1 - -1) -0.296 1.373 0.744 0.0504
11 x:cat31:cat21 x:(0 - -1):(1 - -1) 0.677 1.731 1.967 0.0662
12 x:cat32:cat21 x:(1 - -1):(1 - -1) 1.570 2.310 4.809 0.0520
expb.ci.upper test p
1 3.37 2.398 0.01649
2 12.19 3.826 0.00013
3 8.02 1.273 0.20313
4 7.41 0.957 0.33858
5 14.03 2.779 0.00545
6 6.98 0.285 0.77566
7 36.10 1.145 0.25205
8 23.95 1.714 0.08646
9 9.93 -0.177 0.85917
10 10.97 -0.216 0.82919
11 58.46 0.391 0.69585
12 444.77 0.680 0.49656
Summary returns a list of data.frame, so the data.frame of particular table can be easely accessed by its position in the list.
mysum<-summary(mod1)
mysum[[4]]
Omnibus tests
source test df p
1 x 25.4267 1 4.60e-07
2 cat3 1.8625 2 3.94e-01
3 cat2 8.7612 1 3.08e-03
4 x:cat3 1.6693 2 4.34e-01
5 x:cat2 3.5686 1 5.89e-02
6 cat3:cat2 0.0553 2 9.73e-01
7 x:cat3:cat2 0.5152 2 7.73e-01
GAMLj operates a series of data transformations before estimating the
models, such as setting the contrasts for factors, centering or
standardizing the variables, and so on. The command
get_data()
allows the users to access the processed
data.
ybin x cat3 cat2
1 1 -1.123 1 -1
2 1 0.457 0 -1
3 0 1.297 -1 -1
4 0 -2.323 -1 -1
5 0 0.617 0 -1
6 1 0.697 1 -1
contrasts(manymodels$cat3)
0 1
-1 0 0
0 1 0
1 0 1
contrasts(df$cat3)
1 2
-1 -0.333 -0.333
0 0.667 -0.333
1 -0.333 0.667
GAMLj results objects respond to a series of R standard commands
coef(mod1)
Parameter Estimates (Coefficients)
───────────────────────────────────────────────────────────────────────────────────────────────────────────────
Name Effect Estimate SE Exp(B) Lower Upper z p
───────────────────────────────────────────────────────────────────────────────────────────────────────────────
(Intercept) (Intercept) 0.668 0.279 1.951 1.1298 3.37 2.398 0.016
x x 1.653 0.432 5.224 2.2395 12.19 3.826 < .001
cat31 0 - -1 0.820 0.644 2.270 0.6424 8.02 1.273 0.203
cat32 1 - -1 0.657 0.687 1.929 0.5023 7.41 0.957 0.339
cat21 1 - -1 1.549 0.557 4.706 1.5786 14.03 2.779 0.005
x:cat31 x:(0 - -1) 0.247 0.865 1.280 0.2347 6.98 0.285 0.776
x:cat32 x:(1 - -1) 1.323 1.155 3.754 0.3903 36.10 1.145 0.252
x:cat21 x:(1 - -1) 1.482 0.864 4.401 0.8087 23.95 1.714 0.086
cat31:cat21 (0 - -1):(1 - -1) -0.229 1.288 0.796 0.0637 9.93 -0.177 0.859
cat32:cat21 (1 - -1):(1 - -1) -0.296 1.373 0.744 0.0504 10.97 -0.216 0.829
x:cat31:cat21 x:(0 - -1):(1 - -1) 0.677 1.731 1.967 0.0662 58.46 0.391 0.696
x:cat32:cat21 x:(1 - -1):(1 - -1) 1.570 2.310 4.809 0.0520 444.77 0.680 0.497
───────────────────────────────────────────────────────────────────────────────────────────────────────────────
Other commands are available. For all of them, if the command is
passed without arguments, the command search the required results in the
GAMLj object. If the results are not available, FALSE is returned. For
instance, the model we estimated has results tables regarding the model
fit, so the command fit()
returns them.
fit(mod1)
[[1]]
Model Fit
────────────────────────────────────────────
R² Adj. R² df X² p
────────────────────────────────────────────
0.303 0.235 11 49.2 < .001
────────────────────────────────────────────
[[2]]
Additional indices
─────────────────────────────────────────────────────────────
Info Model Value Comment
─────────────────────────────────────────────────────────────
LogLikelihood -56.561
AIC 137.121 Less is better
BIC 170.571 Less is better
Deviance 113.121 Less is better
Residual DF 108.000
Chi-squared/DF 0.938 Overdispersion indicator
─────────────────────────────────────────────────────────────
On the other hand, the model does not have results regarding post-hoc
tests (they were not required in the first run), so the command
posthoc()
returns FALSE.
ph<-posthoc(mod1)
ph
[1] FALSE
However, one can pass the options required to obtain the results to the command as one would do to the estimation command. For instance, to obtain the post-hoc tests in the example, one can issue:
posthoc(mod1,posthoc= ~cat3)
POST HOC TESTS
──────────────────────────────────────────────────────────────────
cat3 vs cat3 OR SE z p-bonferroni
──────────────────────────────────────────────────────────────────
-1 - 0 0.441 0.284 -1.273 0.609
-1 - 1 0.518 0.356 -0.957 1.000
0 - 1 1.177 0.841 0.227 1.000
──────────────────────────────────────────────────────────────────
or equivalently
posthoc(mod1,posthoc= "cat3")
POST HOC TESTS
──────────────────────────────────────────────────────────────────
cat3 vs cat3 OR SE z p-bonferroni
──────────────────────────────────────────────────────────────────
-1 - 0 0.441 0.284 -1.273 0.609
-1 - 1 0.518 0.356 -0.957 1.000
0 - 1 1.177 0.841 0.227 1.000
──────────────────────────────────────────────────────────────────
Notice that the GAMLj R commands accept their arguments (variables or terms involved) either as a formula or as a list. Furthermore, any other option accepted by the estimation command can be passed to the commands.
Here are the specific commands:
Accepts:
~cat3
, or cat3:cat4
for
combinations of levels.posthoc="cat3"
) or a list of vectors, where vectors of
length > 1 referes to interactions, such as
posthoc(mod1,posthoc=list(c("cat3","cat2")))
.
simple_effects(mod1,formula = ~x:cat3)
SIMPLE EFFECTS
ANOVA for Simple Effects of x
───────────────────────────────
cat3 χ² df p
───────────────────────────────
-1 3.18 1 0.075
0 5.46 1 0.019
1 6.46 1 0.011
───────────────────────────────
Parameter Estimates for simple effects of x
────────────────────────────────────────────────────────────────────────────────────
cat3 Effect Estimate SE Exp(B) Lower Upper z p
────────────────────────────────────────────────────────────────────────────────────
-1 x 1.13 0.634 3.10 0.894 10.7 1.78 0.075
0 x 1.38 0.589 3.96 1.249 12.6 2.34 0.019
1 x 2.45 0.965 11.62 1.752 77.1 2.54 0.011
────────────────────────────────────────────────────────────────────────────────────
Accepts:
~x:cat3
, where the first term is the simple
effect variable and the second is the moderator.
anova(mod1)
[[1]]
Model Fit
────────────────────────────────────────────
R² Adj. R² df X² p
────────────────────────────────────────────
0.303 0.235 11 49.2 < .001
────────────────────────────────────────────
[[2]]
Additional indices
─────────────────────────────────────────────────────────────
Info Model Value Comment
─────────────────────────────────────────────────────────────
LogLikelihood -56.561
AIC 137.121 Less is better
BIC 170.571 Less is better
Deviance 113.121 Less is better
Residual DF 108.000
Chi-squared/DF 0.938 Overdispersion indicator
─────────────────────────────────────────────────────────────
[[3]]
Omnibus tests
──────────────────────────────────────────
X² df p
──────────────────────────────────────────
x 25.4267 1 < .001
cat3 1.8625 2 0.394
cat2 8.7612 1 0.003
x:cat3 1.6693 2 0.434
x:cat2 3.5686 1 0.059
cat3:cat2 0.0553 2 0.973
x:cat3:cat2 0.5152 2 0.773
──────────────────────────────────────────
When a single model is passed, it returns the model fit tables. If two models are passed as arguments, the model comparison tests are returned.
mod1<-gamlj_glm(formula = ybin~x*cat3*cat2, data=manymodels,model_type = "logistic")
mod2<-gamlj_glm(formula = ybin~x+cat3+cat2, data=manymodels,model_type = "logistic")
anova(mod1,mod2)
[[1]]
Model Fit
─────────────────────────────────────────────────────────
Model R² Adj. R² df X² p
─────────────────────────────────────────────────────────
Full 0.3030 0.23523 11 49.18 < .001
Nested 0.2698 0.24519 4 43.79 < .001
ΔR² 0.0332 -0.00996 7 5.38 0.613
─────────────────────────────────────────────────────────
[[2]]
Additional indices
──────────────────────────────────────────────────────────────────────────────────────
Info Model Value Nested Model Δ Comment
──────────────────────────────────────────────────────────────────────────────────────
LogLikelihood -56.561 -59.252 2.69
AIC 137.121 128.504 8.62 Less is better
BIC 170.571 142.442 28.13 Less is better
Deviance 113.121 118.504 -5.38 Less is better
Residual DF 108.000 115.000 7.00
Chi-squared/DF 0.938 0.923 Overdispersion indicator
──────────────────────────────────────────────────────────────────────────────────────