Details: GLM effect size indices
keywords jamovi, GLM, effect size indices, omega-squared, eta-squared, epsilon-squared
2.6.1
Introduction
The standardized effect size indices produced by the GLM module are the following:
- β : standardized regression coefficients
- η2: (semi-partial) eta-squared
- η2p : partial eta-squared
- ω2 : omega-squared
- ω2p : partial omega-squared
- ϵ2 : epsilon-squared
- ϵ2p : partial epsilon-squared
All coefficients except the betas are computed with the appropriate function of the R package effectsize, with some adjustments.
β : beta
For continuous variables, it simply corresponds to the B coefficient obtained after standardizing all variables in the model. The standardization of the continuous variables is done before any transformation is applied, so if a complex model requires interaction or polynomial terms, the terms are computed after standardization, and the β are consistent.
For categorical variables, however, some comments are in order:
Categorical variables are not standardized in GAMLj, so the β should be interpreted in terms of standardized differences in the dependent variable between the levels contrasted by the corresponding contrast. Consider the following example: two groups (variable groups) of size 20 and 10, respectively, are compared on a variable Y. If one uses the GAMLj default contrast coding (simple), the B is the difference in group means. The β is the difference between the two groups in the average z-scores of the dependent variable.
Assume these are the results:
The β is 0.352, which means that if we compute the mean difference between groups in the standardized y, we obtain 0.352.
However, the β you obtain is not the correlation between zy and groups. The correlation is 0.169:
Why is there this discrepancy? Because the groups are not balanced: when the correlation is computed, the variable groups is standardized, so the contrast coding values depend on the relative size of the groups. The actual coding values of groups used by Pearson's correlation are -.695 and 1.390.
Thus, the correlation corresponds to running a regression with zy as the dependent variable and a continuous variable featuring either -.695 or 1.390 as values. The β yielded by GAMLj, instead, is the mean difference between X levels on the standardized Y. Please notice that other software may yield different β's for categorical variables.
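The two coding values and the correlation can be reproduced by hand. The sketch below is only an illustration: it assumes the group variable is coded 0/1 before standardization and that the standardization uses the sample standard deviation (n - 1 denominator). The variable names are made up for the example.

```python
from statistics import mean, stdev

# Dummy-coded group variable: 20 cases in the first group, 10 in the second
# (0/1 coding is an assumption; any two-valued coding standardizes the same way).
groups = [0] * 20 + [1] * 10

# Standardize with the sample SD, as a Pearson correlation implicitly does.
m, s = mean(groups), stdev(groups)
z_first = (0 - m) / s
z_second = (1 - m) / s
print(round(z_first, 3), round(z_second, 3))  # -0.695 1.39

# With y standardized (SD = 1), the correlation equals the mean difference
# on zy rescaled by the SD of the 0/1 group variable.
beta = 0.352  # mean difference on standardized y, as reported in the text
r = beta * s
print(round(r, 3))  # 0.169
```

With balanced groups the SD of the 0/1 variable approaches 0.5, so the gap between the β and the correlation depends directly on how unequal the group sizes are.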
If two groups are balanced and homoscedastic, the β associated with a deviation contrast corresponds to the fully standardized coefficient.
η2: (semi-partial) eta-squared
This is the proportion of total variance uniquely explained by the associated effect. With $SS_{eff}$ being the sum of squares of the effect, $SS_{res}$ the sum of squares of the residuals (the error SS), and $SS_{model}$ the sum of squares of the whole model, we have:
$$\eta^2=\frac{SS_{eff}}{SS_{model}+SS_{res}}$$
where $SS_{model}+SS_{res}=SS_{total}$, $SS_{total}=\sum (y_i-\bar{y})^2$, and $SS_{model}=\sum (\hat{y}_i-\bar{\hat{y}})^2$.
Please notice that although the computation of the effect size indices and their confidence intervals is carried out with the effectsize R package, GAMLj makes a correction to the computation of $\eta^2$ and of the other non-partial indices. The effectsize R package, in fact, defines the total sum of squares as $SS^*_{total}=\sum SS_f+SS_{res}$, where $f$ is any effect in the model. For balanced designs and many other models, $SS^*_{total}=SS_{total}$, so no issue arises. However, there are certain models in which $SS^*_{total}\neq SS_{total}$, and so the index loses some of its properties when computed based on $SS^*_{total}$. GAMLj applies a correction such that all the non-partial indices are always computed based on $SS_{total}=SS_{model}+SS_{res}$.
With the correction, one property that $\eta^2$ retains even when $SS^*_{total}\neq SS_{total}$ is that $\eta^2=r^2_{sp}$, where $r_{sp}$ is the semi-partial correlation (Cohen, West, and Aiken 2014). We can use the exercise dataset from Cohen, West, and Aiken (2014) to see this in practice (refer to Multiple regression, moderated regression, and simple slopes for a complete analysis). If we run a multiple regression yendu ~ xage + zexer and ask for the $\eta^2$'s, we obtain the following results:
In jamovi, under Regression->Partial correlation, we can compute the semi-partial correlation of each IV while covarying the other. This gives:
Squaring the first gives $r^2_{yx.z}=(-0.230)^2=0.0529$, which is equal to the corresponding $\eta^2=.053$.
Squaring the second $r_{sp}$ (0.388 after rounding) gives $r^2_{yz.x}=0.1504664$, which is equal to the corresponding $\eta^2=.150$.
If we used $SS^*_{total}=\sum SS_f+SS_{res}$, the results would be:
- $SS^*_{total}=1516+4298+23810=29624$
- $\eta^{2*}_{yx.z}=1516/29624=0.0511747$
- $\eta^{2*}_{yz.x}=4298/29624=0.1450851$
which clearly do not correspond to the $r^2_{sp}$, as they should. We should note, however, that the two methods of estimation usually give very similar results even when $SS^*_{total}\neq SS_{total}$, and exactly the same results when $SS^*_{total}=SS_{total}$.
The same reasoning holds for all the non-partial indices.
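The comparison between the two denominators can be checked with a few lines of arithmetic. This is a minimal sketch using only the sums of squares quoted in this document for the exercise example (effect SS of 1516 and 4298, residual SS of 23810, model SS of 4751); it does not call effectsize or GAMLj, and the variable names are illustrative.

```python
# Sums of squares quoted in the text for yendu ~ xage + zexer.
ss_eff = {"xage": 1516, "zexer": 4298}
ss_res = 23810
ss_model = 4751  # multiple regression model SS quoted in the text

# Denominator used by GAMLj: model SS plus residual SS.
ss_total = ss_model + ss_res
# Alternative denominator: sum of the effects' SS plus residual SS.
ss_total_star = sum(ss_eff.values()) + ss_res

eta2 = {name: ss / ss_total for name, ss in ss_eff.items()}
eta2_star = {name: ss / ss_total_star for name, ss in ss_eff.items()}

for name in ss_eff:
    print(name, round(eta2[name], 3), round(eta2_star[name], 4))
```

Only the first set of values matches the squared semi-partial correlations, because only $SS_{model}+SS_{res}$ equals the total sum of squares of the dependent variable in this example.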
For many people, the fact that $\sum SS_f\neq SS_{model}$ may come as a surprise, maybe because in the ANOVA tradition, with balanced designs, $\sum SS_f$ is always equal to $SS_{model}$, or maybe because it would be nicer if it were always as in ANOVA. Since we have to accept that in the GLM $\sum SS_f$ is not necessarily equal to $SS_{model}$, let us see when this happens.
There are two cases. The easier one to understand is when $\sum SS_f<SS_{model}$. This happens when the independent variables are positively (or all negatively) correlated, so each variable explains a unique part of the variance, but the model sum of squares also includes some shared variance, which ends up in $SS_{model}$ but not in $\sum SS_f$.
A little trickier is the case when ∑SSf>SSmodel, because it seems strange that the sum of effects is larger than the overall combined effect.
Consider a regression $y=a+b_{yx.z}x+b_{yz.x}z$. Recall that $SS_{yx.z}$ (x explains y keeping z constant) is computed as $SS_{model}-SS_{yz}$, where $SS_{yz}$ is the sum of squares explained by z without x in the model, and that $SS_{yz.x}=SS_{model}-SS_{yx}$. It follows that $\sum SS_f=2\cdot SS_{model}-SS_{yx}-SS_{yz}$ is larger than $SS_{model}$ when $SS_{model}>SS_{yx}+SS_{yz}$. This necessarily means that at least one variable explains more variance when the other is kept constant than it does alone.
Indeed, in the example above about exercising, the SS of age alone is 452, and exer alone is 3234, which sum to 3686, less than 4751, the multiple regression model SS.
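The bookkeeping behind this inequality can be sketched with the three sums of squares just quoted. The snippet is illustrative only; the unique SS it derives (1517 and 4299) differ by one unit from the 1516 and 4298 used earlier, which is presumably due to rounding in the reported values.

```python
ss_model = 4751  # SS of the model with both predictors
ss_yx = 452      # SS explained by age alone
ss_yz = 3234     # SS explained by exer alone

# Unique SS of each predictor: model SS minus what the other explains alone.
ss_yx_z = ss_model - ss_yz  # age, keeping exer constant
ss_yz_x = ss_model - ss_yx  # exer, keeping age constant
sum_ss_f = ss_yx_z + ss_yz_x

print(ss_yx_z, ss_yz_x, sum_ss_f)    # 1517 4299 5816
print(sum_ss_f > ss_model)           # True: the effects sum to more than the model SS
print(ss_model > ss_yx + ss_yz)      # True: the equivalent condition on the single-predictor SS
```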
The question now is: when does this happen? It happens when $B_{yz}$, the coefficient associated with z in the simple regression, is smaller (in absolute value) than the partial coefficient $B_{yz.x}$ of the multiple regression. Because $B_{yz.x}=B_{yz}-B_{yx.z}\cdot B_{xz}$, this happens when $B_{yz}$ and $B_{yx.z}\cdot B_{xz}$ have different signs: therefore, we have a suppression effect!
In the example above, focusing on exer (the z variable), we have $B_{yz.x}=.916$, $B_{yx.z}=-.257$ and $B_{xz}=.598$, thus $B_{yx.z}\cdot B_{xz}=-.257\cdot .598=-0.153686$, which has a different sign than $B_{yz}=.762$. Thus, the effect of exercising is stronger when keeping age constant than alone: age suppresses the effect of exercising on endurance.
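The relation between the simple and the partial coefficient can be verified numerically from the values in the text. This is a small sketch with illustrative variable names; the sign check at the end is just one way to flag the suppression pattern described above.

```python
b_yz = 0.762     # exer alone predicting endurance
b_yx_z = -0.257  # age, keeping exer constant
b_xz = 0.598     # coefficient linking age and exer, as given in the text

indirect = b_yx_z * b_xz
b_yz_x = b_yz - indirect  # partial coefficient of exer

print(round(indirect, 6))  # -0.153686
print(round(b_yz_x, 3))    # 0.916

# Opposite signs mean the partial coefficient is larger in absolute value
# than the simple one: a suppression effect.
suppression = b_yz * indirect < 0
print(suppression)         # True
```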
η2p : partial eta-squared
This is the proportion of partial variance uniquely explained by the associated effect, that is, the variance uniquely explained by the effect expressed as a proportion of the variance not explained by the other effects. Here the variance explained by the other effects in the model is completely partialed out. Its formula is:
$$\eta^2_p=\frac{SS_{eff}}{SS_{eff}+SS_{res}}$$
Clearly, if there is only one independent variable, $\eta^2=\eta^2_p$.
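To make the difference between the two denominators concrete, here is a minimal sketch of both formulas as plain functions. The function names are hypothetical and not part of GAMLj or effectsize. The first call uses the age effect from the exercise example in this document (SS of 1516, model SS of 4751, residual SS of 23810); the single-predictor numbers are invented for illustration.

```python
def eta2(ss_eff, ss_model, ss_res):
    """Semi-partial eta-squared: effect SS over total SS."""
    return ss_eff / (ss_model + ss_res)


def eta2_partial(ss_eff, ss_res):
    """Partial eta-squared: effect SS over effect SS plus residual SS."""
    return ss_eff / (ss_eff + ss_res)


# Age effect in the exercise example.
e = eta2(1516, 4751, 23810)
ep = eta2_partial(1516, 23810)
print(round(e, 3), round(ep, 3))

# With a single predictor the effect SS is the model SS, so the two coincide
# (made-up numbers).
one_pred_equal = eta2(1000, 1000, 3000) == eta2_partial(1000, 3000)
print(one_pred_equal)  # True
```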
ω2 : omega-squared
This is the expected value in the population of the proportion of variance uniquely explained by the associated effect. In other words, it is the unbiased version of $\eta^2$. There are different formulas to express its computation; here is one. If $df_{res}$ are the degrees of freedom of the residual variance and $df_{eff}$ are the degrees of freedom of the effect, we have:
$$\omega^2=\frac{SS_{eff}-SS_{res}\cdot(df_{eff}/df_{res})}{SS_{model}+SS_{res}\cdot(df_{res}+1)/df_{res}}$$
It is clear that $\omega^2$ is similar to $\eta^2$, but applies a correction based on the residual mean square $SS_{res}/df_{res}$ to both the numerator and the denominator.
ω2p : partial omega-squared
This is the expected value in the population of the proportion of partial variance uniquely explained by the associated effect. In other words, it is the unbiased version of $\eta^2_p$. With $N$ being the sample size, we have:
$$\omega^2_p=\frac{SS_{eff}-SS_{res}\cdot(df_{eff}/df_{res})}{SS_{eff}+SS_{res}\cdot[(N-df_{eff})/df_{res}]}$$
It is clear that $\omega^2_p$ is similar to $\eta^2_p$, but applies a correction for the degrees of freedom. In fact, as $N$ increases, the two indices converge.
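The two omega formulas can be written as plain functions to see how the correction behaves. This is a sketch, not the effectsize implementation: the function names are hypothetical, and the residual degrees of freedom (242) and sample size (245) are assumed values chosen for illustration, since the document does not report them. The SS values are those of the age effect in the exercise example.

```python
def omega2(ss_eff, ss_model, ss_res, df_eff, df_res):
    """Omega-squared, following the formula in the text."""
    num = ss_eff - ss_res * (df_eff / df_res)
    den = ss_model + ss_res * (df_res + 1) / df_res
    return num / den


def omega2_partial(ss_eff, ss_res, df_eff, df_res, n):
    """Partial omega-squared, following the formula in the text."""
    num = ss_eff - ss_res * (df_eff / df_res)
    den = ss_eff + ss_res * ((n - df_eff) / df_res)
    return num / den


# SS from the exercise example; df_res and n are ASSUMED for illustration.
ss_eff, ss_model, ss_res = 1516, 4751, 23810
df_eff, df_res, n = 1, 242, 245

w2 = omega2(ss_eff, ss_model, ss_res, df_eff, df_res)
w2p = omega2_partial(ss_eff, ss_res, df_eff, df_res, n)
eta2 = ss_eff / (ss_model + ss_res)
eta2p = ss_eff / (ss_eff + ss_res)
print(w2 < eta2, w2p < eta2p)  # True True

# Convergence: scale the data by k (SS and df_res grow together, so the
# residual mean square and eta2p stay fixed) and watch the gap shrink.
gaps = []
for k in (1, 10, 100):
    w = omega2_partial(k * ss_eff, k * ss_res, df_eff, k * df_res, k * df_res + 3)
    gaps.append(eta2p - w)
print(gaps[0] > gaps[1] > gaps[2] > 0)  # True
```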
ϵ2 : epsilon-squared
Epsilon-squared is another correction of $\eta^2$, but the correction involves only the estimation of the sum of squares of the effect, not the variance against which the effect is compared:
$$\epsilon^2=\frac{SS_{eff}-SS_{res}\cdot(df_{eff}/df_{res})}{SS_{model}+SS_{res}}$$
ϵ2p : partial epsilon-squared
As for the non-partial epsilon, the partial epsilon-squared is a correction of $\eta^2_p$, but the correction involves only the estimation of the sum of squares of the effect, not the partial variance against which the effect is compared:
$$\epsilon^2_p=\frac{SS_{eff}-SS_{res}\cdot(df_{eff}/df_{res})}{SS_{eff}+SS_{res}}$$
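Because the epsilon indices change only the numerator, they are easy to compare with the eta indices in code. The following is a minimal sketch with hypothetical function names. The SS values come from the age effect in the exercise example, while the residual degrees of freedom (242) are an assumed value, not reported in this document.

```python
def epsilon2(ss_eff, ss_model, ss_res, df_eff, df_res):
    """Epsilon-squared: corrected effect SS over total SS."""
    return (ss_eff - ss_res * (df_eff / df_res)) / (ss_model + ss_res)


def epsilon2_partial(ss_eff, ss_res, df_eff, df_res):
    """Partial epsilon-squared: corrected effect SS over effect SS plus residual SS."""
    return (ss_eff - ss_res * (df_eff / df_res)) / (ss_eff + ss_res)


# SS from the exercise example; df_res is ASSUMED for illustration.
ss_eff, ss_model, ss_res = 1516, 4751, 23810
df_eff, df_res = 1, 242

e2 = epsilon2(ss_eff, ss_model, ss_res, df_eff, df_res)
e2p = epsilon2_partial(ss_eff, ss_res, df_eff, df_res)

# Same denominators as the eta indices, smaller numerator.
eta2 = ss_eff / (ss_model + ss_res)
eta2p = ss_eff / (ss_eff + ss_res)
print(e2 < eta2, e2p < eta2p)  # True True
```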
Simple Effects
From version 2.6.1 on, all the effect size indices are also available for the simple effects. To compute them, GAMLj extracts the SS of the simple effect from the R emmeans F-tests. The SS residuals and the SS model are extracted from the model summary, given that neither SS changes when simple effects are computed. Then the indices are computed using the previously described formulas.
In particular, if the simple effect is $se$:
$$SS_{res}=\sigma^2\cdot df_{res}$$
where $\sigma$ is extracted as sigma(model),
$$SS_{model}=F_{model}\cdot df_{model}\cdot\frac{SS_{res}}{df_{res}}$$
and
$$SS_{se}=F_{se}\cdot df_{se}\cdot\frac{SS_{res}}{df_{res}}$$
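The recovery of the sums of squares from F-tests can be illustrated with a round trip: start from known SS, derive the quantities a fitted model would report (residual standard deviation and F), then apply the formulas above to get the SS back. This is a sketch of the arithmetic only and does not call R or emmeans. The degrees of freedom and the simple-effect F are assumed values for illustration.

```python
import math

# Known quantities (SS from the exercise example; the df are ASSUMED).
ss_res, df_res = 23810.0, 242
ss_model, df_model = 4751.0, 2

# What a fitted model would report.
sigma = math.sqrt(ss_res / df_res)                      # residual SD
f_model = (ss_model / df_model) / (ss_res / df_res)     # model F-test

# Recovery, following the formulas in the text.
ss_res_rec = sigma**2 * df_res
ss_model_rec = f_model * df_model * ss_res_rec / df_res
print(round(ss_res_rec, 6), round(ss_model_rec, 6))

# Same recipe for a simple effect, with a HYPOTHETICAL F-test result.
f_se, df_se = 4.0, 1
ss_se = f_se * df_se * ss_res_rec / df_res
eta2_se = ss_se / (ss_model_rec + ss_res_rec)
print(round(ss_se, 2), round(eta2_se, 4))
```

Once the simple-effect SS is recovered, any of the indices described earlier can be computed from it together with the model and residual SS.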
Confidence intervals
In the Options tab, it is possible to ask for additional tables for the effect size indices, containing the effect size indices and their confidence intervals (here is an example with the exercise dataset):
Details for the confidence intervals computation can be found in Ben-Shachar, Makowski & Lüdecke (2020). Compute and interpret indices of effect size. CRAN