unravel.cluster_stats.effect_sizes.effect_sizes module

Use effect_sizes from UNRAVEL to calculate the effect size for a comparison between two groups, either for each cluster or for each cluster in a list of valid clusters.

Prereqs:
  • cstats_validation to generate the input CSVs with densities.

Inputs:
  • CSV with densities (Columns: Samples, Sex, Conditions, Cluster_1, Cluster_2, …)

Outputs:
  • CSV with absolute effect sizes and the lower and upper limits of the confidence interval (CI) for each cluster

  • <input>_Hedges_g_<condition_1>_<condition_2>.csv

Note

  • -c1 and -c2 should each match a condition name in the Conditions column of the input CSV or be a prefix of one.

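  • The effect size is calculated as the unbiased Hedges’ g (corrected for sample size).

  • Hedges’ g = ((c2 - c1) / spooled) * corr_factor, where c1 and c2 are the group means for the two conditions, spooled is the pooled standard deviation, and corr_factor is the small-sample correction.

  • CI = Hedges’ g +/- t * SE, where t is the critical t value and SE is the standard error of g.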

  • Interpretation: 0.2-0.5 = small effect; 0.5-0.8 = medium effect; 0.8+ = large effect

  • The CI is based on a two-tailed t-test with alpha = 0.05.

  • For more info, see: https://pubmed.ncbi.nlm.nih.gov/37248402/
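
The formulas in the notes above can be sketched in plain Python as follows. This is a minimal illustration, not UNRAVEL's implementation: the helper name hedges_g_sketch is hypothetical, the correction factor 1 - 3 / (4 * (n1 + n2) - 9) and the standard-error approximation are common textbook choices assumed here, and the critical t value is supplied by the caller (for example from scipy.stats.t.ppf(0.975, n1 + n2 - 2)).

```python
import math
import statistics


def hedges_g_sketch(group_1, group_2, t_crit):
    """Return (g, ci_lower, ci_upper) for two lists of densities.

    Hypothetical helper for illustration; not part of UNRAVEL.
    t_crit is the two-tailed critical t value for n1 + n2 - 2 degrees of freedom.
    """
    n1, n2 = len(group_1), len(group_2)
    dof = n1 + n2 - 2

    # Pooled standard deviation from the two sample variances (n - 1 denominators)
    var_1 = statistics.variance(group_1)
    var_2 = statistics.variance(group_2)
    s_pooled = math.sqrt(((n1 - 1) * var_1 + (n2 - 1) * var_2) / dof)

    # Small-sample correction factor (assumed approximation)
    corr_factor = 1 - 3 / (4 * (n1 + n2) - 9)

    # Hedges' g = ((c2 - c1) / spooled) * corr_factor
    mean_diff = statistics.mean(group_2) - statistics.mean(group_1)
    g = mean_diff / s_pooled * corr_factor

    # Approximate standard error of g (assumed formula)
    se = math.sqrt((n1 + n2) / (n1 * n2) + g**2 / (2 * (n1 + n2)))

    # CI = g +/- t * SE
    return g, g - t_crit * se, g + t_crit * se


# Made-up densities for one cluster, five samples per condition
saline = [1.0, 2.0, 3.0, 4.0, 5.0]
psilocybin = [3.0, 4.0, 5.0, 6.0, 7.0]

# 2.306 is the two-tailed critical t value for alpha = 0.05 and 8 degrees of freedom
g, ci_lower, ci_upper = hedges_g_sketch(saline, psilocybin, t_crit=2.306)
print(f"g = {g:.3f}, 95% CI = [{ci_lower:.3f}, {ci_upper:.3f}]")
```

With these made-up values, g comes out at roughly 1.14, which falls in the "large" range, yet the interval still spans zero because each group has only five samples.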

Usage

effect_sizes -i densities.csv -c1 saline -c2 psilocybin [-c 1 2 3 4 5] [-v]

unravel.cluster_stats.effect_sizes.effect_sizes.parse_args()
unravel.cluster_stats.effect_sizes.effect_sizes.condition_selector(df, condition, unique_conditions, condition_column='Conditions')

Create a condition selector to handle pooling of data in a DataFrame based on specified conditions. This function checks if the ‘condition’ is exactly present in the ‘Conditions’ column or is a prefix of any condition in this column. If the exact condition is found, it selects those rows. If the condition is a prefix (e.g., ‘saline’ matches ‘saline-1’, ‘saline-2’), it selects all rows where the ‘Conditions’ column starts with this prefix. An error is raised if the condition is neither found as an exact match nor as a prefix.

Parameters:
  • df (pd.DataFrame) – DataFrame whose ‘Conditions’ column contains the conditions of interest.

  • condition (str) – The condition or prefix of interest.

  • unique_conditions (list) – List of unique conditions in the ‘Conditions’ column to validate against.

  • condition_column (str) – Name of the column that holds the conditions. Default: ‘Conditions’.

Returns:

A boolean Series to select rows based on the condition.

Return type:

pd.Series
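
The exact-match-then-prefix behavior described above can be sketched with pandas as follows. This is an illustrative re-creation based only on the description, not the UNRAVEL source: the function name select_condition_sketch is hypothetical, the example data are made up, and raising ValueError is an assumption about the error type.

```python
import pandas as pd


def select_condition_sketch(df, condition, unique_conditions,
                            condition_column="Conditions"):
    """Return a boolean Series selecting rows for `condition`.

    Hypothetical illustration; the real condition_selector may differ.
    """
    values = df[condition_column].astype(str)

    # Case 1: the condition is present exactly, so select only those rows
    if condition in unique_conditions:
        return values == condition

    # Case 2: the condition is a prefix (e.g. 'saline' pools 'saline-1', 'saline-2')
    if any(str(c).startswith(condition) for c in unique_conditions):
        return values.str.startswith(condition)

    # Case 3: neither an exact match nor a prefix (error type is an assumption)
    raise ValueError(f"Condition {condition!r} not found in {condition_column!r}")


# Made-up input in the documented layout
df = pd.DataFrame({
    "Samples": ["s1", "s2", "s3"],
    "Conditions": ["saline-1", "saline-2", "psilocybin"],
    "Cluster_1": [0.8, 1.1, 2.4],
})
unique_conditions = df["Conditions"].unique().tolist()

pooled_mask = select_condition_sketch(df, "saline", unique_conditions)
exact_mask = select_condition_sketch(df, "psilocybin", unique_conditions)
print(df[pooled_mask])
```

Checking for an exact match first means a condition such as ‘saline’ is never pooled with ‘saline-1’ when ‘saline’ itself exists in the column.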

unravel.cluster_stats.effect_sizes.effect_sizes.hedges_g(df, condition_1, condition_2)
unravel.cluster_stats.effect_sizes.effect_sizes.filter_dataframe(df, cluster_list)
unravel.cluster_stats.effect_sizes.effect_sizes.main()