unravel.cluster_stats.summary module
Use cstats_summary (css) from UNRAVEL to aggregate and analyze cluster validation data from cstats_validation.
- Prereqs:
cstats_validation
The name of the rev_cluster_index file should correspond to the name of the cluster validation directory:
cluster_index_dir = Path(args.moving_img from cstats_validation).name without "_rev_cluster_index" and ".nii.gz"
The cluster info file should be named cluster_index_dir + "_cluster_info.txt"
vstats_path / 'stats' / cluster_correction_dir should contain:
<cluster_index_dir>_rev_cluster_index[_LH | _RH].nii.gz, <cluster_index_dir>_cluster_info.txt, and p_value_threshold.txt
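The naming rule above can be sketched in Python as follows. This is a minimal illustration of the rule as stated here, not UNRAVEL's own implementation: the function names and the example path are hypothetical, and the optional _LH / _RH hemisphere suffixes are not handled.

```python
from pathlib import Path


def cluster_index_dir_name(moving_img: str) -> str:
    """Return the file name with "_rev_cluster_index" and ".nii.gz" removed.

    Hypothetical helper that mirrors the rule described in the prerequisites.
    """
    name = Path(moving_img).name
    for token in ("_rev_cluster_index", ".nii.gz"):
        name = name.replace(token, "")
    return name


def cluster_info_name(moving_img: str) -> str:
    """Return the expected name of the matching cluster info file."""
    return cluster_index_dir_name(moving_img) + "_cluster_info.txt"


# Hypothetical example path (not taken from the UNRAVEL docs)
example = "stats/psilocybin_v_saline_tstat1_q0.05_rev_cluster_index.nii.gz"
index_dir = cluster_index_dir_name(example)
info_file = cluster_info_name(example)
```

Comparing these derived names with the actual contents of the cluster correction directory is a quick way to catch a naming mismatch before running cstats_summary.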
- Inputs:
Cell/label density CSVs from cstats_validation
The current directory should not contain other folders when running this script for the first time. Directories from cstats_summary or cstats_org_data are OK, though.
- The sample_key.csv file should have the following format:
dir_name,condition
sample01,control
sample02,treatment
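To make the sample key layout concrete, here is a small sketch that parses it with Python's standard csv module. It assumes a header row of dir_name,condition followed by one row per sample directory. The helper name read_sample_key is hypothetical and is not part of UNRAVEL.

```python
import csv
import io


def read_sample_key(lines):
    """Map each sample directory name to its condition.

    `lines` is any iterable of text lines, e.g. an open file handle.
    """
    reader = csv.DictReader(lines)
    return {row["dir_name"]: row["condition"] for row in reader}


# In-memory stand-in for sample_key.csv, using the rows shown above
example = io.StringIO(
    "dir_name,condition\n"
    "sample01,control\n"
    "sample02,treatment\n"
)
sample_to_condition = read_sample_key(example)
```

For a file on disk, the same helper could be called as read_sample_key(open("sample_key.csv", newline="")).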
- Outputs:
- For each cluster map, the following output directories are created:
3D_brains: Files for 3D brain models of valid clusters
valid_clusters_tables_and_legend: Excel files with tables summarizing top regions and defining region abbreviations (for SI tables)
_valid_clusters: valid cluster maps, CSVs for sunburst plots (plot with Flourish), etc.
_valid_clusters_stats: test results for adding asterisks to the xlsx files, etc.
_valid_clusters_prism: CSVs for making bar graphs in GraphPad Prism (refer to the xlsx files for annotations)
cstats_summary runs these commands: cstats_org_data, cstats_group_data, utils_prepend, cstats, cstats_index, cstats_brain_model, cstats_table, cstats_prism, cstats_legend
Note
Only process one comparison at a time. If you have multiple comparisons, run this script separately for each comparison in separate directories.
Then aggregate the results as needed (e.g., to make a legend with all relevant abbreviations, copy the .xlsx files to a central location and run cstats_legend).
See cstats for more information on -cp and -hg.
If you need to rerun this script, delete the following directories and files in the current working directory:
find . -name _valid_clusters -exec rm -rf {} \; -o -name cluster_validation_summary_t-test.csv -exec rm -f {} \; -o -name cluster_validation_summary_tukey.csv -exec rm -f {} \; -o -name 3D_brains -exec rm -rf {} \; -o -name valid_clusters_tables_and_legend -exec rm -rf {} \; -o -name _valid_clusters_stats -exec rm -rf {} \;
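Before deleting anything, it may help to preview what the cleanup would match. The sketch below only lists candidates and deletes nothing. The directory and file names are copied from the cleanup command in this note, while the function name find_outputs_to_delete is hypothetical. Unlike find -name, it also checks whether each match is a directory or a file.

```python
from pathlib import Path

# Names taken from the cleanup command in this note
DIR_NAMES = {
    "_valid_clusters",
    "3D_brains",
    "valid_clusters_tables_and_legend",
    "_valid_clusters_stats",
}
FILE_NAMES = {
    "cluster_validation_summary_t-test.csv",
    "cluster_validation_summary_tukey.csv",
}


def find_outputs_to_delete(root="."):
    """Return a sorted list of paths under `root` that a rerun cleanup would remove.

    This is a dry run: nothing is deleted.
    """
    matches = []
    for path in Path(root).rglob("*"):
        if path.is_dir() and path.name in DIR_NAMES:
            matches.append(path)
        elif path.is_file() and path.name in FILE_NAMES:
            matches.append(path)
    return sorted(matches)


if __name__ == "__main__":
    for path in find_outputs_to_delete("."):
        print(path)
```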
If you want to aggregate CSVs for sunburst plots of valid clusters, run this in a root directory:
find . -name "valid_clusters_sunburst.csv" -exec sh -c 'cp {} ./$(basename $(dirname $(dirname {})))_$(basename {})' \;
Likewise, you can aggregate raw data (raw_data_for_t-test_pooled.csv), stats (t-test_results.csv), and prism files (cell_density_summary_for_valid_clusters.csv).
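As a rough Python counterpart to the aggregation command above, the sketch below copies every file with a given name into the root directory, prefixing each copy with the name of the directory two levels above the file. The function name aggregate_csvs is hypothetical. Files nested fewer than two directories below the root are skipped, which is an assumption made here to avoid ambiguous prefixes.

```python
import shutil
from pathlib import Path


def aggregate_csvs(root=".", filename="valid_clusters_sunburst.csv"):
    """Copy matching CSVs to `root` as <grandparent_dir>_<filename>.

    Returns the list of newly created paths.
    """
    root = Path(root)
    # Collect sources first so the search is finished before any file is copied
    sources = [
        p for p in root.rglob(filename)
        if len(p.relative_to(root).parts) >= 3
    ]
    copied = []
    for src in sources:
        prefix = src.parent.parent.name  # directory two levels above the CSV
        dest = root / f"{prefix}_{src.name}"
        shutil.copy(src, dest)
        copied.append(dest)
    return copied
```

The same function could be reused for the other files mentioned above by passing a different filename, e.g. aggregate_csvs(".", "t-test_results.csv").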
Usage if running directly after cstats_validation:
cstats_summary -c <path/config.ini> -cvd 'psilocybin_v_saline_tstat1_q<asterisk>' -vd <path/vstats_dir> -sk <path/sample_key.csv> --groups <group1> <group2> -hg <higher_group> [-d <list of paths>] [-v]
Usage from a cluster correction dir after cstats_validation:
cstats_summary -c cluster_summary.ini -cvd 'psilocybin_v_saline_tstat1_q<asterisk>' -vd ../.. -sk <path/sample_key.csv> --groups <group1> <group2> -hg <higher_group> [-d <list of paths>] [-v]
Usage if running after cstats_validation and cstats_org_data:
cstats_summary -c <path/config.ini> -sk <path/sample_key.csv> --groups <group1> <group2> -hg <higher_group> [-d <list of paths>] [-v]