Generalized Linear Model — gamlj

Generalized Linear Model with options to estimate logistic, probit, ordinal, multinomial and custum link function models. Provides options to estimate posthoc, simple effects and slopes, plots of main effects and interaction. Model comparison is also an option.

gamlj_glm(
  formula = NULL,
  data,
  model_type = "linear",
  dep = NULL,
  factors = NULL,
  covs = NULL,
  offset = NULL,
  weights = NULL,
  model_terms = NULL,
  fixed_intercept = TRUE,
  nested_intercept = NULL,
  custom_family = "gaussian",
  custom_link = "identity",
  nested_terms = NULL,
  omnibus = "LRT",
  estimates_ci = FALSE,
  ci_method = "wald",
  boot_r = 1000,
  ci_width = 95,
  contrasts = NULL,
  contrast_custom_values = NULL,
  contrast_custom_focus = NULL,
  show_contrastnames = TRUE,
  show_contrastcodes = FALSE,
  vcov = FALSE,
  plot_x = NULL,
  plot_z = NULL,
  plot_by = NULL,
  plot_raw = FALSE,
  plot_yscale = FALSE,
  plot_xoriginal = FALSE,
  plot_black = FALSE,
  plot_around = "ci",
  emmeans = NULL,
  posthoc = NULL,
  simple_x = NULL,
  simple_mods = NULL,
  simple_interactions = FALSE,
  covs_conditioning = "mean_sd",
  ccm_value = 1,
  ccp_value = 25,
  covs_scale_labels = "labels",
  adjust = list("bonf"),
  covs_scale = NULL,
  expb_ci = TRUE,
  es = list("expb"),
  propodds_test = FALSE,
  plot_scale = "response"
)

Arguments

formula: (optional) the formula of the model compatible with glm. If not passed, model terms should be defined as a list in model_terms option. For logistic and probit models formula can be of the form `cbind(success,fail) ~ x ` where `success` is the counts of success (y=1) and fail is the counts of failure (y=0). It accepts also the the form `probabilities / totals ~ x ` where `probabilities` is the probability of success and `totals` is the number of cases.
data: the data as a data.frame
model_type: define the type of model (link function and distribution combination) required. It can be linear, poisson, poiover, nb, logistic, probit, custom, ordinal, multinomial, and beta.
dep: a string naming the dependent variable from data; the variable must be numeric. Not needed if formula is used.
factors: a vector of strings naming the fixed factors from data. Not needed if formula is used.
covs: a vector of strings naming the covariates from data. Not needed if formula is used.
offset: a vector of strings naming the offset variables.
weights: an optional variable name indicating the vector of ‘prior weights’ to be used in the fitting process. Should be NULL or a variable name in the data.
model_terms: a list of character vectors describing fixed effects terms. Not needed if formula is used.
fixed_intercept: TRUE (default) or FALSE, estimates fixed intercept. Not needed if formula is used.
nested_intercept: TRUE (default) or FALSE, estimates fixed intercept. Not needed if formula is used.
custom_family: Distribution family for the custom model, accepts gaussian, binomial, gamma and inverse_gaussian.
custom_link: Distribution family for the model_type="custom", accepts identity, log and inverse. Use onemu2 for 1/mu^2).
nested_terms: a list of character vectors describing effects terms for nested model. It can be passed as right-hand formula.
omnibus: Whether the omnibus test for the model should be wald or LRT.
estimates_ci: TRUE (default) or FALSE , coefficients CI in tables
ci_method: The method used to compute the confidence intervals. `wald` uses the Wald method to compute standard errors and confidence intervals. `profile` computes Profile Likelihood Based Confidence Interval, in which the bounds are chosen based on the percentiles of the chi-square distribution around the maximum likelihood estimate. `quantile` performs a non-parametric boostrap, with `boot_r` repetitions, and compute the CI based on the percentiles of the boostrap distribution. `bcai` implements the bias-corrected bootstrap method.
boot_r: a number bootstrap repetitions.
ci_width: a number between 50 and 99.9 (default: 95) specifying the confidence interval width for the plots.
contrasts: a named vector of the form c(var1="type", var2="type2") specifying the type of contrast to use, one of deviation, simple, dummy, difference, helmert, repeated or 'polynomial'. If NULL, simple is used. Can also be passed as a list of list of the form list(list(var="var1",type="type1")).
contrast_custom_values: a named list with the custom contrast weights, of the form list(factorname=numeric vector), for instance list(factorname=c(1,1,-2)). only one constrast per variable is allowed.
contrast_custom_focus: if any factor is coded with 'custom', when TRUE or NULL (default) the coefficients, simple effects and simple interactions are focused on the custom contrast. If FALSE , variables are coded accordingly to the passed contrast, but no special table or test is devoted to the contrast.
show_contrastnames: TRUE or FALSE (default), shows raw names of the contrasts variables in tables
show_contrastcodes: TRUE or FALSE (default), shows contrast coefficients tables
vcov: TRUE or FALSE (default), shows coefficients covariances
plot_x: a string naming the variable placed on the horizontal axis of the plot
plot_z: a string naming the variable represented as separate lines in the plot
plot_by: a list of variables defining the levels at which a separate plot should be produced.
plot_raw: TRUE or FALSE (default), plot raw data along the predicted values
plot_yscale: TRUE or FALSE (default), set the Y-axis range equal to the range of the observed values.
plot_xoriginal: TRUE or FALSE (default), use original scale for covariates.
plot_black: TRUE or FALSE (default), use different linetypes per levels.
plot_around: 'none' (default), 'ci', or 'se'. Use no error bars, use confidence intervals, or use standard errors on the plots, respectively.
emmeans: a rhs formula with the terms specifying the marginal means to estimate (of the form '~x+x:z')
posthoc: a rhs formula with the terms specifying the table to apply the comparisons (of the form '~x+x:z'). The formula is not expanded, so x*z becomes x+z and not x+z+x:z. It can be passed also as a list of the form 'list("x","z",c("x","z")'
simple_x: The variable for which the simple effects (slopes) are computed
simple_mods: the variable that provides the levels at which the simple effects are computed
simple_interactions: should simple Interactions be computed
covs_conditioning: 'mean_sd' (default), 'custom' , or 'percent'. Use to condition the covariates (if any)
ccm_value: how many st.deviations around the means used to condition simple effects and plots. Used if simpleScale='mean_sd'
ccp_value: offsett (number of percentiles) around the median used to condition simple effects and plots. Used if simpleScale='percent'
covs_scale_labels: how the levels of a continuous moderator should appear in tables and plots: labels, values and values_labels, ovalues, ovalues_labels. The latter two refer to the variable original levels, before scaling.
adjust: one or more of 'none', 'bonf','tukey' 'holm'; provide no, Bonferroni, Tukey and Holm Post Hoc corrections respectively.
covs_scale: a named vector of the form c(var1='type', var2='type2') specifying the transformation to apply to covariates, one of 'centered' to the mean, 'standardized','log' or 'none'. 'none' leaves the variable as it is.
expb_ci: TRUE (default) or FALSE , exp(B) CI in table
es: Effect size indices. expb (default) exponentiates the coefficients. For dichotomous dependent variables relative risk indices (RR) can be obtained. marginals computes the marginal effects.
propodds_test: Test parallel lines assumptions in cumulative link model (ordinal regression)
plot_scale: Chi-squared computation method. 'lrt' (default) gives LogLikelihood ration test, 'wald' gives the Wald Chi-squared.

Value

A results object containing:

`results$model`					The underlying estimated model
`results$info`					a table
`results$main$r2`					a table of R
`results$main$fit`					a table
`results$main$anova`					a table of ANOVA results
`results$main$coefficients`					a table
`results$main$vcov`					a table
`results$main$contrastCodeTables`					an array of contrast coefficients tables
`results$main$marginals`					a table
`results$main$relativerisk`					a table
`results$main$paralleltest`					a table
`results$posthoc`					an array of post-hoc tables
`results$simpleEffects$anova`					a table of ANOVA for simple effects
`results$simpleEffects$coefficients`					a table
`results$simpleInteractions`					an array of simple interactions tables
`results$emmeans`					an array of predicted means tables
`results$mainPlots`					an array of results plots
`results$plotnotes`					a html
`results$predicted`					an output
`results$residuals`					an output

Tables can be converted to data frames with asDF or as.data.frame. For example:

results$info$asDF

as.data.frame(results$info)

Examples

   data <- emmeans::neuralgia
   GAMLj3::gamlj_glm(formula = Pain ~ Duration,
                    data = data,
                    model_type = "logistic"
                    )
#> 
#>  GENERALIZED LINEAR MODEL
#> 
#>  Model Info                                                                 
#>  ────────────────────────────────────────────────────────────────────────── 
#>    Info                                                                     
#>  ────────────────────────────────────────────────────────────────────────── 
#>    Model Type       Logistic Model    Model for binary y                    
#>    Model            stats::glm        Pain ~ 1 + Duration                   
#>    Distribution     Binomial          Dichotomous event distribution of y   
#>    Link function    Logit             Log of the odd of y                   
#>    Direction        P(y=1)/P(y=0)     P( Pain = Yes ) / P( Pain = No )      
#>    Sample size                  60                                          
#>    Converged        yes                                                     
#>    C.I. method      Wald                                                    
#>  ────────────────────────────────────────────────────────────────────────── 
#>    Note. All covariates are centered to the mean
#> 
#> 
#>  MODEL RESULTS
#> 
#>  Model Fit                                                    
#>  ──────────────────────────────────────────────────────────── 
#>    R²            Adj. R²        df    X²          p           
#>  ──────────────────────────────────────────────────────────── 
#>    0.01983700    0.007567546     1    1.616779    0.2035415   
#>  ──────────────────────────────────────────────────────────── 
#> 
#> 
#>  Additional indices                                            
#>  ───────────────────────────────────────────────────────────── 
#>    Info              Model Value    Comment                    
#>  ───────────────────────────────────────────────────────────── 
#>    LogLikelihood      -39.943206                               
#>    AIC                 83.886413    Less is better             
#>    BIC                 88.075102    Less is better             
#>    Deviance            79.886413    Less is better             
#>    Residual DF         58.000000                               
#>    Chi-squared/DF       1.028444    Overdispersion indicator   
#>  ───────────────────────────────────────────────────────────── 
#> 
#> 
#>  Omnibus tests                               
#>  ─────────────────────────────────────────── 
#>                X²          df    p           
#>  ─────────────────────────────────────────── 
#>    Duration    1.616779     1    0.2035415   
#>  ─────────────────────────────────────────── 
#> 
#> 
#>  Parameter Estimates (Coefficients)                                                                                          
#>  ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── 
#>    Name           Effect         Estimate       SE            Exp(B)       Lower        Upper       z            p           
#>  ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── 
#>    (Intercept)    (Intercept)    -0.34728878    0.26600165    0.7066012    0.4195207    1.190133    -1.305589    0.1916924   
#>    Duration       Duration       -0.02971640    0.02390160    0.9707208    0.9262949    1.017277    -1.243281    0.2137643   
#>  ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── 
#>