Examples of GAMLj Methods

library(GAMLj3)
data("manymodels")

Linear models in GAMLj

In this example we run a logistic model using the file data(manymodels) and show the standard R commands such as summary(), print() and some additional commands that can be applied to a GAMLj object to extract relevant information.

Estimating the model

We use GAMLj gamlj_glm() function to estimate a logistic model with variable manymodels$ybin as dependent variable, a continuous covariate manymodels$x and two categorical factors,manymodels$cat2 and manymodels$cat3. We add also the independent variables interaction.

manymodels$ybin<-factor(manymodels$ybin)
manymodels$cat2<-factor(manymodels$cat2)
manymodels$cat3<-factor(manymodels$cat3)
manymodels$x<-as.numeric(manymodels$x)

mod1<-gamlj_glm(formula = ybin~x*cat3*cat2, data=manymodels,model_type = "logistic")

mod1


 GENERALIZED LINEAR MODEL

 Model Info                                                                                                    
 ───────────────────────────────────────────────────────────────────────────────────────────────────────────── 
   Info                                                                                                        
 ───────────────────────────────────────────────────────────────────────────────────────────────────────────── 
   Model Type       Logistic Model    Model for binary y                                                       
   Model            stats::glm        ybin ~ 1 + x + cat3 + cat2 + x:cat3 + x:cat2 + cat3:cat2 + x:cat3:cat2   
   Distribution     Binomial          Dichotomous event distribution of y                                      
   Link function    Logit             Log of the odd of y                                                      
   Direction        P(y=1)/P(y=0)     P( ybin = 1 ) / P( ybin = 0 )                                            
   Sample size                 120                                                                             
   Converged        yes                                                                                        
   C.I. method      Wald                                                                                       
 ───────────────────────────────────────────────────────────────────────────────────────────────────────────── 
   Note. All covariates are centered to the mean


 MODEL RESULTS

 Model Fit                                    
 ──────────────────────────────────────────── 
   R²       Adj. R²    df    X²      p        
 ──────────────────────────────────────────── 
   0.303      0.235    11    49.2    < .001   
 ──────────────────────────────────────────── 


 Additional indices                                            
 ───────────────────────────────────────────────────────────── 
   Info              Model Value    Comment                    
 ───────────────────────────────────────────────────────────── 
   LogLikelihood         -56.561                               
   AIC                   137.121    Less is better             
   BIC                   170.571    Less is better             
   Deviance              113.121    Less is better             
   Residual DF           108.000                               
   Chi-squared/DF          0.938    Overdispersion indicator   
 ───────────────────────────────────────────────────────────── 


 Omnibus tests                              
 ────────────────────────────────────────── 
                  X²         df    p        
 ────────────────────────────────────────── 
   x              25.4267     1    < .001   
   cat3            1.8625     2     0.394   
   cat2            8.7612     1     0.003   
   x:cat3          1.6693     2     0.434   
   x:cat2          3.5686     1     0.059   
   cat3:cat2       0.0553     2     0.973   
   x:cat3:cat2     0.5152     2     0.773   
 ────────────────────────────────────────── 


 Parameter Estimates (Coefficients)                                                                              
 ─────────────────────────────────────────────────────────────────────────────────────────────────────────────── 
   Name             Effect                 Estimate    SE       Exp(B)    Lower     Upper     z         p        
 ─────────────────────────────────────────────────────────────────────────────────────────────────────────────── 
   (Intercept)      (Intercept)               0.668    0.279     1.951    1.1298      3.37     2.398     0.016   
   x                x                         1.653    0.432     5.224    2.2395     12.19     3.826    < .001   
   cat31            0 - -1                    0.820    0.644     2.270    0.6424      8.02     1.273     0.203   
   cat32            1 - -1                    0.657    0.687     1.929    0.5023      7.41     0.957     0.339   
   cat21            1 - -1                    1.549    0.557     4.706    1.5786     14.03     2.779     0.005   
   x:cat31          x:(0 - -1)                0.247    0.865     1.280    0.2347      6.98     0.285     0.776   
   x:cat32          x:(1 - -1)                1.323    1.155     3.754    0.3903     36.10     1.145     0.252   
   x:cat21          x:(1 - -1)                1.482    0.864     4.401    0.8087     23.95     1.714     0.086   
   cat31:cat21      (0 - -1):(1 - -1)        -0.229    1.288     0.796    0.0637      9.93    -0.177     0.859   
   cat32:cat21      (1 - -1):(1 - -1)        -0.296    1.373     0.744    0.0504     10.97    -0.216     0.829   
   x:cat31:cat21    x:(0 - -1):(1 - -1)       0.677    1.731     1.967    0.0662     58.46     0.391     0.696   
   x:cat32:cat21    x:(1 - -1):(1 - -1)       1.570    2.310     4.809    0.0520    444.77     0.680     0.497   
 ───────────────────────────────────────────────────────────────────────────────────────────────────────────────

Extracting tables

The model is pretty printed, and the returned object is a R6 class object containing all tables and functions required to manipulate the results. For instance, the Omnibus tests table can be assessed with

mod1$main$anova


 Omnibus tests                              
 ────────────────────────────────────────── 
                  X²         df    p        
 ────────────────────────────────────────── 
   x              25.4267     1    < .001   
   cat3            1.8625     2     0.394   
   cat2            8.7612     1     0.003   
   x:cat3          1.6693     2     0.434   
   x:cat2          3.5686     1     0.059   
   cat3:cat2       0.0553     2     0.973   
   x:cat3:cat2     0.5152     2     0.773   
 ──────────────────────────────────────────

If a table is required to be in R dara.frame format (for using in other code, for instance), one can obtain it as follows:

df<-mod1$main$anova$asDF
class(df)

[1] "data.frame"

df

       source    test df        p
1           x 25.4267  1 4.60e-07
2        cat3  1.8625  2 3.94e-01
3        cat2  8.7612  1 3.08e-03
4      x:cat3  1.6693  2 4.34e-01
5      x:cat2  3.5686  1 5.89e-02
6   cat3:cat2  0.0553  2 9.73e-01
7 x:cat3:cat2  0.5152  2 7.73e-01

If all tables are required as R data.frame, one can use the command summary()

summary(mod1)

Model Info 

           info          value
1    Model Type Logistic Model
2         Model     stats::glm
3  Distribution       Binomial
4 Link function          Logit
5     Direction  P(y=1)/P(y=0)
6   Sample size            120
7     Converged            yes
8   C.I. method           Wald
                                                                   specs
1                                                     Model for binary y
2 ybin ~ 1 + x + cat3 + cat2 + x:cat3 + x:cat2 + cat3:cat2 + x:cat3:cat2
3                                    Dichotomous event distribution of y
4                                                    Log of the odd of y
5                                          P( ybin = 1 ) / P( ybin = 0 )
6                                                                       
7                                                                       
8                                                                       


Model Fit 

       r2   ar2 df1 test        p
"1" 0.303 0.235  11 49.2 8.79e-07


Additional indices 

            info   value                    specs
1  LogLikelihood -56.561                         
2            AIC 137.121           Less is better
3            BIC 170.571           Less is better
4       Deviance 113.121           Less is better
5    Residual DF 108.000                         
6 Chi-squared/DF   0.938 Overdispersion indicator


Omnibus tests 

       source    test df        p
1           x 25.4267  1 4.60e-07
2        cat3  1.8625  2 3.94e-01
3        cat2  8.7612  1 3.08e-03
4      x:cat3  1.6693  2 4.34e-01
5      x:cat2  3.5686  1 5.89e-02
6   cat3:cat2  0.0553  2 9.73e-01
7 x:cat3:cat2  0.5152  2 7.73e-01


Parameter Estimates (Coefficients) 

          source               label estimate    se  expb expb.ci.lower
1    (Intercept)         (Intercept)    0.668 0.279 1.951        1.1298
2              x                   x    1.653 0.432 5.224        2.2395
3          cat31              0 - -1    0.820 0.644 2.270        0.6424
4          cat32              1 - -1    0.657 0.687 1.929        0.5023
5          cat21              1 - -1    1.549 0.557 4.706        1.5786
6        x:cat31          x:(0 - -1)    0.247 0.865 1.280        0.2347
7        x:cat32          x:(1 - -1)    1.323 1.155 3.754        0.3903
8        x:cat21          x:(1 - -1)    1.482 0.864 4.401        0.8087
9    cat31:cat21   (0 - -1):(1 - -1)   -0.229 1.288 0.796        0.0637
10   cat32:cat21   (1 - -1):(1 - -1)   -0.296 1.373 0.744        0.0504
11 x:cat31:cat21 x:(0 - -1):(1 - -1)    0.677 1.731 1.967        0.0662
12 x:cat32:cat21 x:(1 - -1):(1 - -1)    1.570 2.310 4.809        0.0520
   expb.ci.upper   test       p
1           3.37  2.398 0.01649
2          12.19  3.826 0.00013
3           8.02  1.273 0.20313
4           7.41  0.957 0.33858
5          14.03  2.779 0.00545
6           6.98  0.285 0.77566
7          36.10  1.145 0.25205
8          23.95  1.714 0.08646
9           9.93 -0.177 0.85917
10         10.97 -0.216 0.82919
11         58.46  0.391 0.69585
12        444.77  0.680 0.49656

Summary returns a list of data.frame, so the data.frame of particular table can be easely accessed by its position in the list.

mysum<-summary(mod1)
mysum[[4]]

Omnibus tests 

       source    test df        p
1           x 25.4267  1 4.60e-07
2        cat3  1.8625  2 3.94e-01
3        cat2  8.7612  1 3.08e-03
4      x:cat3  1.6693  2 4.34e-01
5      x:cat2  3.5686  1 5.89e-02
6   cat3:cat2  0.0553  2 9.73e-01
7 x:cat3:cat2  0.5152  2 7.73e-01

Getting data

GAMLj operates a series of data transformations before estimating the models, such as setting the contrasts for factors, centering or standardizing the variables, and so on. The command get_data() allows the users to access the processed data.

df<-get_data(mod1)
head(df)

  ybin      x cat3 cat2
1    1 -1.123    1   -1
2    1  0.457    0   -1
3    0  1.297   -1   -1
4    0 -2.323   -1   -1
5    0  0.617    0   -1
6    1  0.697    1   -1

contrasts(manymodels$cat3)

contrasts(df$cat3)

        1      2
-1 -0.333 -0.333
0   0.667 -0.333
1  -0.333  0.667

Other types of information

GAMLj results objects respond to a series of R standard commands

Coefficients

coef(mod1)


 Parameter Estimates (Coefficients)                                                                              
 ─────────────────────────────────────────────────────────────────────────────────────────────────────────────── 
   Name             Effect                 Estimate    SE       Exp(B)    Lower     Upper     z         p        
 ─────────────────────────────────────────────────────────────────────────────────────────────────────────────── 
   (Intercept)      (Intercept)               0.668    0.279     1.951    1.1298      3.37     2.398     0.016   
   x                x                         1.653    0.432     5.224    2.2395     12.19     3.826    < .001   
   cat31            0 - -1                    0.820    0.644     2.270    0.6424      8.02     1.273     0.203   
   cat32            1 - -1                    0.657    0.687     1.929    0.5023      7.41     0.957     0.339   
   cat21            1 - -1                    1.549    0.557     4.706    1.5786     14.03     2.779     0.005   
   x:cat31          x:(0 - -1)                0.247    0.865     1.280    0.2347      6.98     0.285     0.776   
   x:cat32          x:(1 - -1)                1.323    1.155     3.754    0.3903     36.10     1.145     0.252   
   x:cat21          x:(1 - -1)                1.482    0.864     4.401    0.8087     23.95     1.714     0.086   
   cat31:cat21      (0 - -1):(1 - -1)        -0.229    1.288     0.796    0.0637      9.93    -0.177     0.859   
   cat32:cat21      (1 - -1):(1 - -1)        -0.296    1.373     0.744    0.0504     10.97    -0.216     0.829   
   x:cat31:cat21    x:(0 - -1):(1 - -1)       0.677    1.731     1.967    0.0662     58.46     0.391     0.696   
   x:cat32:cat21    x:(1 - -1):(1 - -1)       1.570    2.310     4.809    0.0520    444.77     0.680     0.497   
 ───────────────────────────────────────────────────────────────────────────────────────────────────────────────

Predicted values

preds<-predict(mod1)
head(preds)

     1      2      3      4      5      6 
0.2072 0.6354 0.5756 0.0787 0.6599 0.7362

Residuals values

res<-residuals(mod1)
head(res)

     1      2      3      4      5      6 
 1.774  0.952 -1.309 -0.405 -1.469  0.783

GAMLj additional R commands

Other commands are available. For all of them, if the command is passed without arguments, the command search the required results in the GAMLj object. If the results are not available, FALSE is returned. For instance, the model we estimated has results tables regarding the model fit, so the command fit() returns them.

fit(mod1)

[[1]]

 Model Fit                                    
 ──────────────────────────────────────────── 
   R²       Adj. R²    df    X²      p        
 ──────────────────────────────────────────── 
   0.303      0.235    11    49.2    < .001   
 ──────────────────────────────────────────── 


[[2]]

 Additional indices                                            
 ───────────────────────────────────────────────────────────── 
   Info              Model Value    Comment                    
 ───────────────────────────────────────────────────────────── 
   LogLikelihood         -56.561                               
   AIC                   137.121    Less is better             
   BIC                   170.571    Less is better             
   Deviance              113.121    Less is better             
   Residual DF           108.000                               
   Chi-squared/DF          0.938    Overdispersion indicator   
 ─────────────────────────────────────────────────────────────

On the other hand, the model does not have results regarding post-hoc tests (they were not required in the first run), so the command posthoc() returns FALSE.

ph<-posthoc(mod1)

ph

[1] FALSE

However, one can pass the options required to obtain the results to the command as one would do to the estimation command. For instance, to obtain the post-hoc tests in the example, one can issue:

posthoc(mod1,posthoc= ~cat3)


 POST HOC TESTS

                                                                    
 ────────────────────────────────────────────────────────────────── 
   cat3    vs    cat3    OR       SE       z         p-bonferroni   
 ────────────────────────────────────────────────────────────────── 
   -1      -     0       0.441    0.284    -1.273           0.609   
   -1      -     1       0.518    0.356    -0.957           1.000   
   0       -     1       1.177    0.841     0.227           1.000   
 ──────────────────────────────────────────────────────────────────

or equivalently

posthoc(mod1,posthoc= "cat3")


 POST HOC TESTS

                                                                    
 ────────────────────────────────────────────────────────────────── 
   cat3    vs    cat3    OR       SE       z         p-bonferroni   
 ────────────────────────────────────────────────────────────────── 
   -1      -     0       0.441    0.284    -1.273           0.609   
   -1      -     1       0.518    0.356    -0.957           1.000   
   0       -     1       1.177    0.841     0.227           1.000   
 ──────────────────────────────────────────────────────────────────

Notice that the GAMLj R commands accept their arguments (variables or terms involved) either as a formula or as a list. Furthermore, any other option accepted by the estimation command can be passed to the commands.

Here are the specific commands:

Post-Hoc

Accepts:

formula: a RHS formula specify the variables to be tested, such as ~cat3, or cat3:cat4 for combinations of levels.
posthoc: a vector of variables name (posthoc="cat3") or a list of vectors, where vectors of length > 1 referes to interactions, such as posthoc(mod1,posthoc=list(c("cat3","cat2"))).

Simple Effects

simple_effects(mod1,formula = ~x:cat3)


 SIMPLE EFFECTS

 ANOVA for Simple Effects  of x  
 ─────────────────────────────── 
   cat3    χ²      df    p       
 ─────────────────────────────── 
   -1      3.18     1    0.075   
   0       5.46     1    0.019   
   1       6.46     1    0.011   
 ─────────────────────────────── 


 Parameter Estimates for simple effects of x                                          
 ──────────────────────────────────────────────────────────────────────────────────── 
   cat3    Effect    Estimate    SE       Exp(B)    Lower    Upper    z       p       
 ──────────────────────────────────────────────────────────────────────────────────── 
   -1      x             1.13    0.634      3.10    0.894     10.7    1.78    0.075   
   0       x             1.38    0.589      3.96    1.249     12.6    2.34    0.019   
   1       x             2.45    0.965     11.62    1.752     77.1    2.54    0.011   
 ────────────────────────────────────────────────────────────────────────────────────