
Details: GLM effect size indices

keywords jamovi, GLM, effect size indices, omega-squared, eta-squared, epsilon-squared

2.6.1

Introduction

The standardized effect size indices produced by the GLM module are the following:

  • β : standardized regression coefficients
  • η2: (semi-partial) eta-squared
  • η2p : partial eta-squared
  • ω2 : omega-squared
  • ω2p : partial omega-squared
  • ϵ2 : epsilon-squared
  • ϵ2p : partial epsilon-squared

All coefficients but the betas are computed with the appropriate function of the R package effectsize, with some adjustments.

β : beta

For continuous variables, β simply corresponds to the B coefficient obtained after standardizing all the variables in the model. The standardization of the continuous variables is done before any transformation is applied, so if a complex model requires interaction or polynomial terms, those terms are computed after standardization, and the β's are consistent.
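To illustrate why this ordering matters, here is a minimal Python sketch (purely illustrative, not GAMLj's R code) showing that standardizing the variables and then forming the interaction term is not the same as standardizing a ready-made product:

```python
import numpy as np

rng = np.random.default_rng(42)
x, z = rng.normal(size=100), rng.normal(size=100)

def zscore(v):
    """Standardize to mean 0 and sd 1."""
    return (v - v.mean()) / v.std()

# GAMLj's ordering: standardize the variables first, then build the
# interaction term from the standardized variables
xz_standardize_first = zscore(x) * zscore(z)

# The alternative ordering: build the raw product, then standardize it
xz_product_first = zscore(x * z)

# The two columns differ: a product of z-scores is generally neither
# centered at 0 nor scaled to sd 1, so the resulting betas differ too
print(np.allclose(xz_standardize_first, xz_product_first))  # False
```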

For categorical variables, however, some comments are in order. Categorical variables are not standardized in GAMLj, so the β should be interpreted in terms of standardized differences in the dependent variable between the levels contrasted by the corresponding contrast. Consider the following example: two groups (variable groups), of size 20 and 10 respectively, are compared on a variable Y. If one uses GAMLj's default contrast coding (simple), the B is the difference between the group means. The β is the difference between the average z-scores of the dependent variable in the two groups. Assume these are the results:

The β is 0.352, meaning that if we compute the mean difference between the groups on the standardized y, we obtain 0.352.

However, the β you obtain is not the correlation between $z_y$ and groups. The correlation is 0.169:

Why is there this discrepancy? Because the groups are not balanced: when the correlation is computed, the variable groups is standardized as well, so the contrast coding values depend on the relative size of the groups. The actual groups coding values used by Pearson's correlation are the following:

Thus, the correlation corresponds to running a regression with $z_y$ as the dependent variable and a continuous predictor taking either -.695 or 1.390 as values. The β yielded by GAMLj, instead, is the mean difference between the X levels on the standardized Y. Please notice that other software may yield different β's for categorical variables.
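The distinction can be checked numerically. The following Python sketch (illustrative; the group sizes mirror the example, but the toy data are made up) shows that with simple coding the regression slope on the standardized y equals the mean difference in z-scores, while the Pearson correlation is a different quantity:

```python
import numpy as np

# Two unbalanced groups, sizes 20 and 10, as in the example above
g = np.repeat([0.0, 1.0], [20, 10])
# Deterministic toy outcome so every quantity is exact
y = np.where(g == 1, 3.0, 1.0)

zy = (y - y.mean()) / y.std()               # standardized y

# Simple contrast coding (-1/2, +1/2): the OLS slope is the mean difference
contrast = np.where(g == 1, 0.5, -0.5)
beta = np.polyfit(contrast, zy, 1)[0]
mean_diff = zy[g == 1].mean() - zy[g == 0].mean()
print(np.isclose(beta, mean_diff))          # True: beta is the z-score gap

# The Pearson correlation standardizes the grouping variable too, so in
# unbalanced groups it does not equal beta
r = np.corrcoef(contrast, zy)[0, 1]
print(round(beta, 3), round(r, 3))          # 2.121 1.0
```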

If the two groups are balanced and homoscedastic, the β associated with a deviation contrast corresponds to the fully standardized coefficient.

η2: (semi-partial) eta-squared

This is the proportion of total variance uniquely explained by the associated effect. Letting $SS_{eff}$ be the sum of squares of the effect, $SS_{res}$ the sum of squares of the residuals (the SS error), and $SS_{model}$ the sum of squares of the whole model, we have:

$$\eta^2=\frac{SS_{eff}}{SS_{model}+SS_{res}}$$

where $SS_{model}+SS_{res}=SS_{total}$, with $SS_{total}=\sum_i (y_i-\bar{y})^2$ and $SS_{model}=\sum_i (\hat{y}_i-\bar{\hat{y}})^2$.

Please notice that although the computation of the effect size indices and their confidence intervals is carried out employing the effectsize R package, GAMLj applies a correction to the computation of $\eta^2$ and of the other non-partial indices. The effectsize R package, in fact, defines the total sum of squares as $SS^*_{total}=\sum_f SS_f+SS_{res}$, where $f$ is any effect in the model. For balanced designs and many other models, $SS^*_{total}=SS_{total}$, so no issue arises. However, there are certain models in which $SS^*_{total} \neq SS_{total}$, and the index loses some of its properties when computed based on $SS^*_{total}$. GAMLj applies a correction such that all the non-partial indices are always computed based on $SS_{total}=SS_{model}+SS_{res}$.

With the correction, one property $\eta^2$ retains even when $SS^*_{total} \neq SS_{total}$ is that $\eta^2=r^2_{sp}$, where $r_{sp}$ is the semi-partial correlation (Cohen, West, and Aiken 2014). We can use the exercise dataset from Cohen, West, and Aiken (2014) to see this in practice (refer to Multiple regression, moderated regression, and simple slopes for a complete analysis). If we run a multiple regression yendu ~ xage + zexer and ask for the $\eta^2$'s, we obtain the following results:

Going to jamovi, Regression -> Partial Correlation, we can compute the semi-partial correlation of each IV covarying out the other. This gives:

Squaring the first gives $r^2_{y(x \cdot z)}=0.230^2=0.0529$, which is equal to the corresponding $\eta^2=.053$.

Squaring the second $r_{sp}$ gives $r^2_{y(z \cdot x)}=0.388^2=0.1505$, which, considering rounding, is equal to the corresponding $\eta^2=.150$.

If we used $SS^*_{total}=\sum_f SS_f+SS_{res}$, the results would be:

  • $SS^*_{total}=1516+4298+23810=29624$
  • $\eta^2_{y(x \cdot z)}=1516/29624=0.0512$
  • $\eta^2_{y(z \cdot x)}=4298/29624=0.1451$

which clearly do not correspond to the $r^2_{sp}$, as they should. We should note, however, that the two methods of estimation usually give very similar results, even when $SS^*_{total} \neq SS_{total}$, and exactly the same results when $SS^*_{total}=SS_{total}$.

The same reasoning holds for all the non-partial indices.
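The identity $\eta^2=r^2_{sp}$ under $SS_{total}=SS_{model}+SS_{res}$ can also be verified with a small simulation. The Python sketch below (simulated data, not the exercise dataset) computes the η² of x as the R² increment of the full model over the reduced model, and compares it with the squared semi-partial correlation:

```python
import numpy as np

def r2(y, X):
    """R-squared of an OLS fit of y on the columns of X (intercept added)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    yhat = X1 @ np.linalg.lstsq(X1, y, rcond=None)[0]
    return 1 - ((y - yhat) ** 2).sum() / ((y - y.mean()) ** 2).sum()

rng = np.random.default_rng(7)
n = 200
x = rng.normal(size=n)
z = 0.5 * x + rng.normal(size=n)            # correlated predictors
y = 0.4 * x + 0.3 * z + rng.normal(size=n)

# eta^2 of x with SS_total = SS_model + SS_res reduces to an R^2 increment:
# SS_x / SS_total = (SS_model - SS_yz) / SS_total = R2_full - R2_z
eta2_x = r2(y, np.column_stack([x, z])) - r2(y, z)

# Semi-partial correlation: correlate y with the part of x independent of z
Z1 = np.column_stack([np.ones(n), z])
x_res = x - Z1 @ np.linalg.lstsq(Z1, x, rcond=None)[0]
r_sp = np.corrcoef(y, x_res)[0, 1]

print(np.isclose(eta2_x, r_sp ** 2))  # True
```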

For many people, the fact that $\sum_f SS_f \neq SS_{model}$ may come as a surprise, perhaps because in the ANOVA tradition, with balanced designs, $\sum_f SS_f$ is always equal to $SS_{model}$, or perhaps because it would be nicer if things always worked as in the ANOVA. Since we have to accept that in the GLM $\sum_f SS_f$ is not necessarily equal to $SS_{model}$, let us see when this happens.

There are two cases. The easiest to understand is when $\sum_f SS_f < SS_{model}$. This happens when the independent variables are all positively (or all negatively) correlated: each variable explains a unique part of the variance, but the model sum of squares also includes some shared variance, which ends up in $SS_{model}$ but not in $\sum_f SS_f$.

A little trickier is the case in which $\sum_f SS_f > SS_{model}$, because it seems strange that the sum of the effects is larger than the overall combined effect.

Consider a regression $y=a+b_{yx \cdot z}x+b_{yz \cdot x}z$. Recall that $SS_{yx \cdot z}$ (the SS of $x$ explaining $y$ keeping $z$ constant) is computed as $SS_{model}-SS_{yz}$, where $SS_{yz}$ is the sum of squares explained by $z$ without $x$ in the model, and likewise $SS_{yz \cdot x}=SS_{model}-SS_{yx}$. It follows that $\sum_f SS_f=2SS_{model}-SS_{yx}-SS_{yz}$, which is larger than $SS_{model}$ when $SS_{model}>SS_{yx}+SS_{yz}$. This necessarily means that at least one variable explains more variance keeping the other constant than it does alone.

Indeed, in the example above about exercising, the SS of age alone is 452 and of exer alone is 3234, which sum to 3686, less than the multiple regression model SS of 4751.

The question is now: when does this happen? It happens when $B_{yz}$, the coefficient associated with $z$ in the simple regression, is smaller (in absolute value) than the partial coefficient $B_{yz \cdot x}$ of the multiple regression. Because $B_{yz \cdot x}=B_{yz}-B_{yx \cdot z}B_{xz}$, this happens when $B_{yz}$ and $B_{yx \cdot z}B_{xz}$ have different signs: therefore, we have a suppression effect!

In the example above, focusing on exer (the $z$ variable), we have $B_{yz \cdot x}=.916$, $B_{yx \cdot z}=-.257$ and $B_{xz}=.598$, thus $B_{yx \cdot z}B_{xz}=-.257 \cdot .598=-.154$, which has a different sign than $B_{yz}=.762$. Thus, the effect of exercising is stronger keeping age constant than alone: age suppresses the effect of exercising on endurance.
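As a quick arithmetic check, the decomposition $B_{yz \cdot x}=B_{yz}-B_{yx \cdot z}B_{xz}$ can be verified with the coefficients from the example (a Python sketch; the sign of $B_{yx \cdot z}$ is taken as negative, as the suppression pattern implies):

```python
# Coefficients from the exercise example discussed in the text
B_yz, B_yx_z, B_xz = 0.762, -0.257, 0.598

indirect = B_yx_z * B_xz     # opposite sign to B_yz: suppression
B_yz_x = B_yz - indirect     # partial coefficient recovered from the rule

print(round(indirect, 4), round(B_yz_x, 3))  # -0.1537 0.916
```

The recovered partial coefficient matches the reported .916, confirming the suppression reading.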

η2p : partial eta-squared

This is the proportion of partial variance uniquely explained by the associated effect: that is, the variance uniquely explained by the effect, expressed as a proportion of the variance not explained by the other effects. Here the variance explained by the other effects in the model is completely partialed out. Its formula is:

$$\eta^2_p=\frac{SS_{eff}}{SS_{eff}+SS_{res}}$$

Clearly, if there is only one independent variable, $\eta^2=\eta^2_p$.

ω2 : omega-squared

This is the expected value in the population of the proportion of variance uniquely explained by the associated effect. In other words, it is the unbiased version of $\eta^2$. There are different formulas to visualize its computation; here is one. If $df_{res}$ are the degrees of freedom of the residual variance and $df_{eff}$ the degrees of freedom of the effect, we have:

$$\omega^2=\frac{SS_{eff}-SS_{res}(df_{eff}/df_{res})}{SS_{model}+SS_{res}(df_{res}+1)/df_{res}}$$

It is clear that $\omega^2$ is similar to $\eta^2$, but it applies a degrees-of-freedom correction to both the numerator and the denominator.

ω2p : partial omega-squared

This is the expected value in the population of the proportion of partial variance uniquely explained by the associated effect. In other words, it is the unbiased version of $\eta^2_p$. With $N$ being the sample size, we have:

$$\omega^2_p=\frac{SS_{eff}-SS_{res}(df_{eff}/df_{res})}{SS_{eff}+SS_{res}\left[(N-df_{eff})/df_{res}\right]}$$

It is clear that $\omega^2_p$ is similar to $\eta^2_p$, but it applies a correction for the degrees of freedom. In fact, as $N$ increases, the two indices converge.

ϵ2 : epsilon-squared

Epsilon-squared is another correction of $\eta^2$, but the correction involves only the estimate of the effect sum of squares, not the total variance against which the effect is compared:

$$\epsilon^2=\frac{SS_{eff}-SS_{res}(df_{eff}/df_{res})}{SS_{model}+SS_{res}}$$

ϵ2p : partial epsilon-squared

As for the non-partial epsilon, the partial epsilon-squared is a correction of $\eta^2_p$, but the correction involves only the estimate of the effect sum of squares, not the partial variance against which the effect is compared:

$$\epsilon^2_p=\frac{SS_{eff}-SS_{res}(df_{eff}/df_{res})}{SS_{eff}+SS_{res}}$$
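All six formulas can be collected in one small function. The Python sketch below (a hypothetical helper, not GAMLj code) takes the sums of squares and degrees of freedom and returns the indices; the SS values in the example call are those quoted earlier for the exercise dataset, while the df and N values are illustrative assumptions:

```python
def effect_size_indices(ss_eff, ss_res, ss_model, df_eff, df_res, n):
    """Return the six GLM effect size indices from sums of squares and
    degrees of freedom, following the formulas above (illustrative)."""
    corr = ss_res * (df_eff / df_res)      # correction term: df_eff * MS_res
    return {
        "eta2":       ss_eff / (ss_model + ss_res),
        "eta2_p":     ss_eff / (ss_eff + ss_res),
        "omega2":     (ss_eff - corr) / (ss_model + ss_res * (df_res + 1) / df_res),
        "omega2_p":   (ss_eff - corr) / (ss_eff + ss_res * (n - df_eff) / df_res),
        "epsilon2":   (ss_eff - corr) / (ss_model + ss_res),
        "epsilon2_p": (ss_eff - corr) / (ss_eff + ss_res),
    }

# SS values quoted above for xage in the exercise dataset; the df and N
# here are assumptions for illustration only
idx = effect_size_indices(1516, 23810, 4751, df_eff=1, df_res=242, n=245)
print(round(idx["eta2"], 3))  # 0.053
```

Note that, as expected, the corrected indices are smaller than the corresponding η²'s.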

Simple Effects

From version 2.6.1 on, all the effect size indices are also available for simple effects. To compute them, GAMLj extracts the SS of the simple effect from the R emmeans F-tests. The residual SS and the model SS are extracted from the model summary, given that neither changes when simple effects are computed. The indices are then computed using the formulas described above.

In particular, if the simple effect is $se$:

$$SS_{res}=\sigma^2 \, df_{res}$$

where σ is extracted as sigma(model).

$$SS_{model}=F_{model}\,df_{model}\,\frac{SS_{res}}{df_{res}}$$

and

$$SS_{se}=F_{se}\,df_{se}\,\frac{SS_{res}}{df_{res}}$$
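This recovery of a sum of squares from an F statistic can be sketched as follows (illustrative Python with made-up numbers, simply inverting the definition of F):

```python
def ss_from_f(f_value, df_num, ss_res, df_res):
    """Recover a sum of squares from an F statistic:
    F = (SS/df_num) / (SS_res/df_res)  =>  SS = F * df_num * SS_res/df_res."""
    return f_value * df_num * ss_res / df_res

# Round-trip check: build F from known SS, then recover the SS back
ss_se, df_se, ss_res, df_res = 120.0, 2, 900.0, 45
f_se = (ss_se / df_se) / (ss_res / df_res)
print(ss_from_f(f_se, df_se, ss_res, df_res))  # 120.0
```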

Confidence intervals

In the Options tab, it is possible to request additional tables for the effect size indices, containing the indices and their confidence intervals (here is an example with the exercise dataset).

Details on the computation of the confidence intervals can be found in Ben-Shachar, Makowski & Lüdecke (2020), Compute and Interpret Indices of Effect Size, CRAN.

Comments?

Got comments, issues or spotted a bug? Please open an issue on GAMLj at github or send me an email

Return to main help pages

Main page General Linear Model

References

Cohen, Patricia, Stephen G. West, and Leona S. Aiken. 2014. Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. Psychology Press.