Effect sizes

A bunch of functions to compte effect sizes, and convert them.

Pingouin.compute_bootci — Method

compute_bootci(x[, y, func, method, paired, confidence, n_boot, decimals, seed, return_dist])

Bootstrapped confidence intervals of univariate and bivariate functions.

Arguments

x::Array{<:Number}: First sample. Required for both bivariate and univariate functions.
y::Array{<:Number}: Second sample. Required only for bivariate functions.
func::Union{Function, String}: Function to compute the bootstrapped statistic. Accepted string values are:
- 'pearson': Pearson correlation (bivariate, requires x and y)
- 'spearman': Spearman correlation (bivariate)
- 'cohen': Cohen d effect size (bivariate)
- 'hedges': Hedges g effect size (bivariate)
- 'mean': Mean (univariate, requires only x)
- 'std': Standard deviation (univariate)
- 'var': Variance (univariate)
method::String: Method to compute the confidence intervals:
- 'norm': Normal approximation with bootstrapped bias and standard error
- 'per': Basic percentile method
- 'cper': Bias corrected percentile method (default)
paired::Bool: Indicates whether x and y are paired or not. Only useful when computing bivariate Cohen d or Hedges g bootstrapped confidence intervals.
confidence::Float64: Confidence level (0.95 = 95%)
n_boot::Int64: Number of bootstrap iterations. The higher, the better, the slower.
decimals::Int64: Number of rounded decimals.
seed::Int64: Random seed for generating bootstrap samples.
return_dist::Bool: If True, return the confidence intervals and the bootstrapped distribution (e.g. for plotting purposes).

Returns

ci::Array{<:Number}: Desired converted effect size

Notes

Results have been tested against the bootci Matlab function.

References

DiCiccio, T. J., & Efron, B. (1996). Bootstrap confidence intervals.

Statistical science, 189-212.

Davison, A. C., & Hinkley, D. V. (1997). Bootstrap methods and their

application (Vol. 1). Cambridge university press.

Examples

Bootstrapped 95% confidence interval of a Pearson correlation

julia> x = [3, 4, 6, 7, 5, 6, 7, 3, 5, 4, 2]
julia> y = [4, 6, 6, 7, 6, 5, 5, 2, 3, 4, 1]
julia> stat = cor(x, y)
0.7468280049029223
julia> ci = Pingouin.compute_bootci(x=x, y=y, func="pearson", seed=42)
2-element Array{Float64,1}:
 0.22
 0.93

Bootstrapped 95% confidence interval of a Cohen d

julia> stat = Pingouin.compute_effsize(x, y, eftype="cohen")
0.1537753990658328
julia> ci = Pingouin.compute_bootci(x, y=y, func="cohen", seed=42, decimals=3)
2-element Array{Float64,1}:
 -0.329
  0.589

Bootstrapped confidence interval of a standard deviation (univariate)

julia> stat = std(x)
1.6787441193290351
julia> ci = Pingouin.compute_bootci(x, func="std", seed=123)
2-element Array{Float64,1}:
 1.25
 2.2

Bootstrapped confidence interval using a custom univariate function

julia> skewness(x), Pingouin.compute_bootci(x, func=skewness, n_boot=10000, seed=123)
(-0.08244607271328411, [-1.01, 0.77])

Bootstrapped confidence interval using a custom bivariate function

julia> stat = sum(exp.(x) ./ exp.(y))
26.80405184881793
julia> ci = Pingouin.compute_bootci(x, y=y, func=f(x, y) = sum(exp.(x) ./ exp.(y)), n_boot=10000, seed=123)
julia> print(stat, ci)
2-element Array{Float64,1}:
 12.76
 45.52

Get the bootstrapped distribution around a Pearson correlation

julia> ci, bstat = Pingouin.compute_bootci(x, y=y, return_dist=true)
([0.27, 0.92], [0.6661370089058535, ...])

source

Pingouin.compute_bootci — Method

compute_bootci(x[, y, func, method, paired, confidence, n_boot, decimals, seed, return_dist])

Bootstrapped confidence intervals of univariate and bivariate functions.

Arguments

x::Array{<:Number}: First sample. Required for both bivariate and univariate functions.
y::Array{<:Number}: Second sample. Required only for bivariate functions.
func::Union{Function, String}: Function to compute the bootstrapped statistic. Accepted string values are:
- 'pearson': Pearson correlation (bivariate, requires x and y)
- 'spearman': Spearman correlation (bivariate)
- 'cohen': Cohen d effect size (bivariate)
- 'hedges': Hedges g effect size (bivariate)
- 'mean': Mean (univariate, requires only x)
- 'std': Standard deviation (univariate)
- 'var': Variance (univariate)
method::String: Method to compute the confidence intervals:
- 'norm': Normal approximation with bootstrapped bias and standard error
- 'per': Basic percentile method
- 'cper': Bias corrected percentile method (default)
paired::Bool: Indicates whether x and y are paired or not. Only useful when computing bivariate Cohen d or Hedges g bootstrapped confidence intervals.
confidence::Float64: Confidence level (0.95 = 95%)
n_boot::Int64: Number of bootstrap iterations. The higher, the better, the slower.
decimals::Int64: Number of rounded decimals.
seed::Int64: Random seed for generating bootstrap samples.
return_dist::Bool: If True, return the confidence intervals and the bootstrapped distribution (e.g. for plotting purposes).

Returns

ci::Array{<:Number}: Desired converted effect size

Notes

Results have been tested against the bootci Matlab function.

References

DiCiccio, T. J., & Efron, B. (1996). Bootstrap confidence intervals.

Statistical science, 189-212.

Davison, A. C., & Hinkley, D. V. (1997). Bootstrap methods and their

application (Vol. 1). Cambridge university press.

Examples

Bootstrapped 95% confidence interval of a Pearson correlation

julia> x = [3, 4, 6, 7, 5, 6, 7, 3, 5, 4, 2]
julia> y = [4, 6, 6, 7, 6, 5, 5, 2, 3, 4, 1]
julia> stat = cor(x, y)
0.7468280049029223
julia> ci = Pingouin.compute_bootci(x=x, y=y, func="pearson", seed=42)
2-element Array{Float64,1}:
 0.22
 0.93

Bootstrapped 95% confidence interval of a Cohen d

julia> stat = Pingouin.compute_effsize(x, y, eftype="cohen")
0.1537753990658328
julia> ci = Pingouin.compute_bootci(x, y=y, func="cohen", seed=42, decimals=3)
2-element Array{Float64,1}:
 -0.329
  0.589

Bootstrapped confidence interval of a standard deviation (univariate)

julia> stat = std(x)
1.6787441193290351
julia> ci = Pingouin.compute_bootci(x, func="std", seed=123)
2-element Array{Float64,1}:
 1.25
 2.2

Bootstrapped confidence interval using a custom univariate function

julia> skewness(x), Pingouin.compute_bootci(x, func=skewness, n_boot=10000, seed=123)
(-0.08244607271328411, [-1.01, 0.77])

Bootstrapped confidence interval using a custom bivariate function

julia> stat = sum(exp.(x) ./ exp.(y))
26.80405184881793
julia> ci = Pingouin.compute_bootci(x, y=y, func=f(x, y) = sum(exp.(x) ./ exp.(y)), n_boot=10000, seed=123)
julia> print(stat, ci)
2-element Array{Float64,1}:
 12.76
 45.52

Get the bootstrapped distribution around a Pearson correlation

julia> ci, bstat = Pingouin.compute_bootci(x, y=y, return_dist=true)
([0.27, 0.92], [0.6661370089058535, ...])

source

Pingouin.compute_effsize — Method

compute_effsize(x, y[, paired, eftype])

Calculate effect size between two set of observations.

Arguments

x::Array{<:Number}: First set of observations.
y::Array{<:Number}: Second set of observations.
paired::Bool: If True, uses Cohen d-avg formula to correct for repeated measurements (see Notes).
eftype::String: Desired output effect size. Available methods are:
- "none": no effect size
- "cohen": Unbiased Cohen d
- "hedges": Hedges g
- "glass": Glass delta
- "r": correlation coefficient
- "eta-square": Eta-square
- "odds-ratio": Odds ratio
- "auc": Area Under the Curve
- "cles": Common Language Effect Size

Returns

ef::Float64: Effect size

See Also

convert_effsize: Conversion between effect sizes.
compute_effsize_from_t: Convert a T-statistic to an effect size.

Notes

Missing values are automatically removed from the data. If $x$ and $y$ are paired, the entire row is removed.

If $x$ and $y$ are independent, the Cohen $d$ is:

\[d = \frac{\overline{X} - \overline{Y}} {\sqrt{\frac{(n_{1} - 1)\sigma_{1}^{2} + (n_{2} - 1) \sigma_{2}^{2}}{n1 + n2 - 2}}}\]

If $x$ and $y$ are paired, the Cohen $d_{avg}$ is computed:

\[d_{avg} = \frac{\overline{X} - \overline{Y}} {\sqrt{\frac{(\sigma_1^2 + \sigma_2^2)}{2}}}\]

The Cohen’s d is a biased estimate of the population effect size, especially for small samples (n < 20). It is often preferable to use the corrected Hedges $g$ instead:

\[g = d \times (1 - \frac{3}{4(n_1 + n_2) - 9})\]

The Glass $\delta$ is calculated using the group with the lowest variance as the control group:

\[\delta = \frac{\overline{X} - \overline{Y}}{\sigma^2_{\text{control}}}\]

The common language effect size is the proportion of pairs where $x$ is higher than $y$ (calculated with a brute-force approach where each observation of $x$ is paired to each observation of $y$, see wilcoxon for more details):

\[\text{CL} = P(X > Y) + .5 \times P(X = Y)\]

For other effect sizes, Pingouin will first calculate a Cohen $d$ and then use the convert_effsize to convert to the desired effect size.

References

Lakens, D., 2013. Calculating and reporting effect sizes to

facilitate cumulative science: a practical primer for t-tests and ANOVAs. Front. Psychol. 4, 863. https://doi.org/10.3389/fpsyg.2013.00863

Cumming, Geoff. Understanding the new statistics: Effect sizes,

confidence intervals, and meta-analysis. Routledge, 2013.

https://osf.io/vbdah/

Examples

Cohen d from two independent samples.

julia> x = [1, 2, 3, 4]
julia> y = [3, 4, 5, 6, 7]
julia> Pingouin.compute_effsize(x, y, paired=false, eftype="cohen")
-1.707825127659933

The sign of the Cohen d will be opposite if we reverse the order of $x$ and $y$:

julia> Pingouin.compute_effsize(y, x, paired=false, eftype="cohen")
1.707825127659933

Hedges g from two paired samples.

julia> x = [1, 2, 3, 4, 5, 6, 7]
julia> y = [1, 3, 5, 7, 9, 11, 13]
julia> Pingouin.compute_effsize(x, y, paired=true, eftype="hedges")
-0.8222477210374874

Glass delta from two independent samples. The group with the lowest

variance will automatically be selected as the control.

julia> Pingouin.compute_effsize(x, y, paired=false, eftype="glass")
-1.3887301496588271

Common Language Effect Size.

julia> Pingouin.compute_effsize(x, y, eftype="cles")
0.2857142857142857

In other words, there are ~29% of pairs where $x$ is higher than $y$, which means that there are ~71% of pairs where $x$ is lower than $y$. This can be easily verified by changing the order of $x$ and $y$:

julia> Pingouin.compute_effsize(y, x, eftype="cles")
0.7142857142857143

source

Pingouin.compute_effsize — Method

compute_effsize(x, y[, paired, eftype])

Calculate effect size between two set of observations.

Arguments

x::Array{<:Number}: First set of observations.
y::Array{<:Number}: Second set of observations.
paired::Bool: If True, uses Cohen d-avg formula to correct for repeated measurements (see Notes).
eftype::String: Desired output effect size. Available methods are:
- "none": no effect size
- "cohen": Unbiased Cohen d
- "hedges": Hedges g
- "glass": Glass delta
- "r": correlation coefficient
- "eta-square": Eta-square
- "odds-ratio": Odds ratio
- "auc": Area Under the Curve
- "cles": Common Language Effect Size

Returns

ef::Float64: Effect size

See Also

convert_effsize: Conversion between effect sizes.
compute_effsize_from_t: Convert a T-statistic to an effect size.

Notes

Missing values are automatically removed from the data. If $x$ and $y$ are paired, the entire row is removed.

If $x$ and $y$ are independent, the Cohen $d$ is:

\[d = \frac{\overline{X} - \overline{Y}} {\sqrt{\frac{(n_{1} - 1)\sigma_{1}^{2} + (n_{2} - 1) \sigma_{2}^{2}}{n1 + n2 - 2}}}\]

If $x$ and $y$ are paired, the Cohen $d_{avg}$ is computed:

\[d_{avg} = \frac{\overline{X} - \overline{Y}} {\sqrt{\frac{(\sigma_1^2 + \sigma_2^2)}{2}}}\]

The Cohen’s d is a biased estimate of the population effect size, especially for small samples (n < 20). It is often preferable to use the corrected Hedges $g$ instead:

\[g = d \times (1 - \frac{3}{4(n_1 + n_2) - 9})\]

The Glass $\delta$ is calculated using the group with the lowest variance as the control group:

\[\delta = \frac{\overline{X} - \overline{Y}}{\sigma^2_{\text{control}}}\]

\[\text{CL} = P(X > Y) + .5 \times P(X = Y)\]

For other effect sizes, Pingouin will first calculate a Cohen $d$ and then use the convert_effsize to convert to the desired effect size.

References

Lakens, D., 2013. Calculating and reporting effect sizes to

facilitate cumulative science: a practical primer for t-tests and ANOVAs. Front. Psychol. 4, 863. https://doi.org/10.3389/fpsyg.2013.00863

Cumming, Geoff. Understanding the new statistics: Effect sizes,

confidence intervals, and meta-analysis. Routledge, 2013.

https://osf.io/vbdah/

Examples

Cohen d from two independent samples.

julia> x = [1, 2, 3, 4]
julia> y = [3, 4, 5, 6, 7]
julia> Pingouin.compute_effsize(x, y, paired=false, eftype="cohen")
-1.707825127659933

The sign of the Cohen d will be opposite if we reverse the order of $x$ and $y$:

julia> Pingouin.compute_effsize(y, x, paired=false, eftype="cohen")
1.707825127659933

Hedges g from two paired samples.

julia> x = [1, 2, 3, 4, 5, 6, 7]
julia> y = [1, 3, 5, 7, 9, 11, 13]
julia> Pingouin.compute_effsize(x, y, paired=true, eftype="hedges")
-0.8222477210374874

Glass delta from two independent samples. The group with the lowest

variance will automatically be selected as the control.

julia> Pingouin.compute_effsize(x, y, paired=false, eftype="glass")
-1.3887301496588271

Common Language Effect Size.

julia> Pingouin.compute_effsize(x, y, eftype="cles")
0.2857142857142857

julia> Pingouin.compute_effsize(y, x, eftype="cles")
0.7142857142857143

source

Pingouin.compute_effsize_from_t — Method

compute_effsize_from_t(tval[, nx, ny, N, eftype])

Compute effect size from a T-value.

Parameters

tval::Float64: T-value.
nx, ny::Int64: Optional. Group sample sizes.
N::Int64: Optional. Total sample size (will not be used if nx and ny are specified).
eftype::String: Optional. Desired output effect size.

Returns

ef::Float64: Effect size

See Also

compute_effsize: Calculate effect size between two set of observations.
convert_effsize: Conversion between effect sizes.

Notes

If both nx and ny are specified, the formula to convert from t to d is:

\[d = t * \sqrt{\frac{1}{n_x} + \frac{1}{n_y}}\]

If only N (total sample size) is specified, the formula is:

\[d = \frac{2t}{\sqrt{N}}\]

Examples

Compute effect size from a T-value when both sample sizes are known.

julia> tval, nx, ny = 2.90, 35, 25
julia> d = Pingouin.compute_effsize_from_t(tval, nx=nx, ny=ny, eftype="cohen")
0.7593982580212534

Compute effect size when only total sample size is known (nx+ny)

julia> tval, N = 2.90, 60
julia> d = Pingouin.compute_effsize_from_t(tval, N=N, eftype="cohen")
0.7487767802667672

source

Pingouin.compute_esci — Method

compute_esci(stat, nx, ny[, paired, eftype, confidence, decimals])

Parametric confidence intervals around a Cohen d or a correlation coefficient.

Arguments

stat::Float64: Original effect size. Must be either a correlation coefficient or a Cohen-type effect size (Cohen d or Hedges g).
nx, ny::Int64: Length of vector x and y.
paired::Bool: Indicates if the effect size was estimated from a paired sample. This is only relevant for cohen or hedges effect size.
eftype::String: Effect size type. Must be "r" (correlation) or "cohen" (Cohen d or Hedges g).
confidence::Float64: Confidence level (0.95 = 95%)
decimals::Int64: Number of rounded decimals.

Returns

ci::Array Desired converted effect size

Notes

To compute the parametric confidence interval around a Pearson r correlation coefficient, one must first apply a Fisher's r-to-z transformation:

\[z = 0.5 \cdot \ln \frac{1 + r}{1 - r} = \text{arctanh}(r)\]

and compute the standard deviation:

\[\sigma = \frac{1}{\sqrt{n - 3}}\]

where $n$ is the sample size.

The lower and upper confidence intervals - in z-space - are then given by:

\[\text{ci}_z = z \pm \text{crit} \cdot \sigma\]

where $\text{crit}$ is the critical value of the normal distribution corresponding to the desired confidence level (e.g. 1.96 in case of a 95% confidence interval).

These confidence intervals can then be easily converted back to r-space:

\[\text{ci}_r = \frac{\exp(2 \cdot \text{ci}_z) - 1} {\exp(2 \cdot \text{ci}_z) + 1} = \text{tanh}(\text{ci}_z)\]

A formula for calculating the confidence interval for a Cohen d effect size is given by Hedges and Olkin (1985, p86). If the effect size estimate from the sample is $d$, then it follows a T distribution with standard deviation:

\[\sigma = \sqrt{\frac{n_x + n_y}{n_x \cdot n_y} + \frac{d^2}{2 (n_x + n_y)}}\]

where $n_x$ and $n_y$ are the sample sizes of the two groups.

In one-sample test or paired test, this becomes:

\[\sigma = \sqrt{\frac{1}{n_x} + \frac{d^2}{2 n_x}}\]

The lower and upper confidence intervals are then given by:

\[\text{ci}_d = d \pm \text{crit} \cdot \sigma\]

where $\text{crit}$ is the critical value of the T distribution corresponding to the desired confidence level.

References

https://en.wikipedia.org/wiki/Fisher_transformation
Hedges, L., and Ingram Olkin. "Statistical models for meta-analysis." (1985).
http://www.leeds.ac.uk/educol/documents/00002182.htm
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5133225/

Examples

Confidence interval of a Pearson correlation coefficient

julia> x = [3, 4, 6, 7, 5, 6, 7, 3, 5, 4, 2]
julia> y = [4, 6, 6, 7, 6, 5, 5, 2, 3, 4, 1]
julia> nx, ny = length(x), length(y)
julia> stat = Pingouin.compute_effsize(x, y, eftype="r")
0.7468280049029223
julia> ci = Pingouin.compute_esci(stat=stat, nx=nx, ny=ny, eftype="r")
2-element Array{Float64,1}:
 0.27
 0.93

Confidence interval of a Cohen d

julia> stat = Pingouin.compute_effsize(x, y, eftype="cohen")
0.1537753990658328
julia> ci = Pingouin.compute_esci(stat=stat, nx=nx, ny=ny, eftype="cohen", decimals=3)
2-element Array{Float64,1}:
 -0.737
  1.045

source

Pingouin.convert_effsize — Method

convert_effsize(ef, input_type, output_type[, nx, ny])

Conversion between effect sizes.

Parameters

ef::Float64: Original effect size.
input_type::String: Effect size type of ef. Must be "r" or "d".
output_type::String: Desired effect size type.
nx, ny::Int64: Optional. Length of vector x and y. Required to convert to Hedges g. Available methods are:
- "cohen": Unbiased Cohen d
- "hedges": Hedges g
- "eta-square": Eta-square
- "odds-ratio": Odds ratio
- "AUC": Area Under the Curve
- "none": pass-through (return ef)

Returns

ef::Float64: Desired converted effect size

See Also

compute_effsize: Calculate effect size between two set of observations.
compute_effsize_from_t: Convert a T-statistic to an effect size.

Notes

The formula to convert r to d is given in [1]:

\[d = \frac{2r}{\sqrt{1 - r^2}}\]

The formula to convert d to r is given in [2]:

\[r = \frac{d}{\sqrt{d^2 + \frac{(n_x + n_y)^2 - 2(n_x + n_y)} {n_xn_y}}}\]

The formula to convert d to $\eta^2$ is given in [3]:

\[\eta^2 = \frac{(0.5 d)^2}{1 + (0.5 d)^2}\]

The formula to convert d to an odds-ratio is given in [4]:

\[\text{OR} = \exp (\frac{d \pi}{\sqrt{3}})\]

The formula to convert d to area under the curve is given in [5]:

\[\text{AUC} = \mathcal{N}_{cdf}(\frac{d}{\sqrt{2}})\]

References

[1] Rosenthal, Robert. "Parametric measures of effect size." The handbook of research synthesis 621 (1994): 231-244.

[2] McGrath, Robert E., and Gregory J. Meyer. "When effect sizes disagree: the case of r and d." Psychological methods 11.4 (2006): 386.

[3] Cohen, Jacob. "Statistical power analysis for the behavioral sciences. 2nd." (1988).

[4] Borenstein, Michael, et al. "Effect sizes for continuous data." The handbook of research synthesis and meta-analysis 2 (2009): 221-235.

[5] Ruscio, John. "A probability-based measure of effect size: Robustness to base rates and other factors." Psychological methods 1 3.1 (2008): 19.

Examples

Convert from Cohen d to eta-square

julia> d = .45
julia> eta = Pingouin.convert_effsize(d, "cohen", "eta-square")
0.048185603807257595

Convert from Cohen d to Hegdes g (requires the sample sizes of each

group)

julia> Pingouin.convert_effsize(.45, "cohen", "hedges", nx=10, ny=10)
0.4309859154929578

Convert Pearson r to Cohen d

julia> r = 0.40
julia> d = Pingouin.convert_effsize(r, "r", "cohen")
0.8728715609439696

Reverse operation: convert Cohen d to Pearson r

julia> Pingouin.convert_effsize(d, "cohen", "r")
0.4000000000000001

source