unravel.cluster_stats.effect_sizes.effect_sizes_by_sex__absolute module
Use effect_sizes_sex_abs (esa) from UNRAVEL to calculate the effect size for a comparison between two groups for each cluster [or a valid cluster list].
- Prereqs:
cstats_validation to generate the input CSVs with densities.
- Inputs:
CSV with densities (Columns: Samples, Sex, Conditions, Cluster_1, Cluster_2, …)
Enter M or F in the Sex column.
- Outputs:
CSV w/ absolute effect sizes and upper and lower limits of the confidence interval (CI) for each cluster
<input>_Hedges_g_<condition_1>_<condition_2>_<M/F>.csv for males and females
Note
-c1 and -c2 should match the condition name in the Conditions column of the input CSV or be a prefix of the condition name.
The effect size is calculated as the unbiased Hedges’ g (corrected for sample size).
Hedges’ g = ((c2-c1)/spooled*corr_factor)
CI = Hedges’ g +/- t * SE
0.2-0.5 = small effect; 0.5-0.8 = medium; 0.8+ = large
The CI is based on a two-tailed t-test with alpha = 0.05.
For more info, see: https://pubmed.ncbi.nlm.nih.gov/37248402/
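The formulas in this note can be illustrated with a minimal sketch. It is not UNRAVEL's implementation: the function names hedges_g and confidence_interval are hypothetical, the correction factor 1 - 3/(4(n1 + n2) - 9) and the standard-error approximation are common textbook forms assumed here, and the critical t value is passed in by the caller (for example, from a t-table at alpha = 0.05, two-tailed, with n1 + n2 - 2 degrees of freedom).

```python
import math
from statistics import mean, stdev


def hedges_g(group1, group2):
    """Hedges' g = (mean2 - mean1) / s_pooled * corr_factor (assumed textbook form)."""
    n1, n2 = len(group1), len(group2)
    s1, s2 = stdev(group1), stdev(group2)  # sample standard deviations
    # Pooled standard deviation, weighted by degrees of freedom
    s_pooled = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    # Small-sample bias correction (assumed approximation)
    corr_factor = 1 - 3 / (4 * (n1 + n2) - 9)
    return (mean(group2) - mean(group1)) / s_pooled * corr_factor


def confidence_interval(g, n1, n2, t_crit):
    """CI = g +/- t * SE, using a common approximation for the SE of g (assumption)."""
    se = math.sqrt((n1 + n2) / (n1 * n2) + g**2 / (2 * (n1 + n2)))
    return g - t_crit * se, g + t_crit * se


# Made-up densities for two conditions (c1 and c2), for illustration only
c1_densities = [1.0, 2.0, 3.0, 4.0, 5.0]
c2_densities = [3.0, 4.0, 5.0, 6.0, 7.0]

g = hedges_g(c1_densities, c2_densities)
# 2.306 is the two-tailed critical t for alpha = 0.05 with 8 degrees of freedom
lower, upper = confidence_interval(g, len(c1_densities), len(c2_densities), t_crit=2.306)
print(f"g = {g:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

With these made-up values, g falls in the "large" range of the scale given above, but the interval is wide because each group has only five samples.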
- Usage:
effect_sizes_sex_abs -i densities.csv -c1 saline -c2 psilocybin [-c 1 2 3 4 5] [-v]
- unravel.cluster_stats.effect_sizes.effect_sizes_by_sex__absolute.condition_selector(df, condition, unique_conditions, condition_column='Conditions')
Create a condition selector to handle pooling of data in a DataFrame based on specified conditions. This function checks if the ‘condition’ is exactly present in the ‘Conditions’ column or is a prefix of any condition in this column. If the exact condition is found, it selects those rows. If the condition is a prefix (e.g., ‘saline’ matches ‘saline-1’, ‘saline-2’), it selects all rows where the ‘Conditions’ column starts with this prefix. An error is raised if the condition is neither found as an exact match nor as a prefix.
- Parameters:
- Returns:
A boolean Series to select rows based on the condition.
- Return type:
pd.Series
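The exact-match-then-prefix behavior described for condition_selector can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the helper name select_condition_rows is hypothetical, it omits the unique_conditions argument of the real signature, and the error type (ValueError) is assumed.

```python
import pandas as pd


def select_condition_rows(df, condition, condition_column="Conditions"):
    """Hypothetical helper: boolean mask for an exact match, else a prefix match."""
    values = df[condition_column].astype(str)

    # 1) Prefer rows whose condition matches exactly
    exact = values == condition
    if exact.any():
        return exact

    # 2) Otherwise pool rows whose condition starts with the given prefix
    prefix = values.str.startswith(condition)
    if prefix.any():
        return prefix

    # 3) Neither an exact match nor a prefix (error type is an assumption)
    raise ValueError(f"Condition '{condition}' not found in column '{condition_column}'")


# Made-up data: 'saline' pools 'saline-1' and 'saline-2'
df = pd.DataFrame({
    "Samples": ["s1", "s2", "s3"],
    "Conditions": ["saline-1", "saline-2", "psilocybin"],
})
mask = select_condition_rows(df, "saline")
print(df[mask])
```

Checking for an exact match first means a condition named 'saline' is not silently pooled with 'saline-1' when both are present.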
- unravel.cluster_stats.effect_sizes.effect_sizes_by_sex__absolute.filter_dataframe(df, cluster_list)