unravel.cluster_stats.summary module
Use cstats_summary (css) from UNRAVEL to aggregate and analyze cluster validation data from cstats_validation.
- Prereqs:
- cstats_validation
- The name of the rev_cluster_index file should relate to the name of the cluster validation directory. 
- cluster_index_dir = Path(args.moving_img).name (args.moving_img as passed to cstats_validation) without “_rev_cluster_index” and “.nii.gz” 
- _cluster_info.txt should be named like: cluster_index_dir + “_cluster_info.txt” 
- vstats_path / 'stats' / cluster_correction_dir should contain: 
- <cluster_index_dir>_rev_cluster_index[_LH | _RH].nii.gz, <cluster_index_dir>_cluster_info.txt, and p_value_threshold.txt 
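The naming rules above can be checked before running anything. The following is a minimal sketch, not UNRAVEL code: the function names `cluster_index_dir_from` and `missing_prereqs` are hypothetical, and the sketch applies only the two removals stated above, so how a `_LH` or `_RH` suffix is treated is left as-is rather than assumed.

```python
from pathlib import Path


def cluster_index_dir_from(moving_img):
    """Derive cluster_index_dir from a rev_cluster_index image path.

    Follows the rule stated in the prereqs: take the file name and drop
    ".nii.gz" and "_rev_cluster_index". Hemisphere suffixes are not touched.
    """
    name = Path(moving_img).name
    return name.replace(".nii.gz", "").replace("_rev_cluster_index", "")


def missing_prereqs(correction_dir, cluster_index_dir):
    """Return the expected files (or patterns) absent from correction_dir."""
    correction_dir = Path(correction_dir)
    missing = [
        name
        for name in (f"{cluster_index_dir}_cluster_info.txt", "p_value_threshold.txt")
        if not (correction_dir / name).is_file()
    ]
    # The index image may or may not carry a _LH / _RH suffix, so match both.
    pattern = f"{cluster_index_dir}_rev_cluster_index*.nii.gz"
    if not any(correction_dir.glob(pattern)):
        missing.append(pattern)
    return missing
```

An empty list from `missing_prereqs(vstats_path / 'stats' / cluster_correction_dir, cluster_index_dir)` suggests the directory is laid out as described.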
 
- Inputs:
- Cell/label density CSVs from cstats_validation
- The current directory should not contain other folders when running this script for the first time. 
- Directories from cstats_summary or cstats_org_data are OK though.
- The sample_key.csv file should have the following format:
- dir_name,condition
  sample01,control
  sample02,treatment
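A quick way to sanity-check a sample key before a run is to parse it and confirm the header. This is a sketch using only the Python standard library; `read_sample_key` is a hypothetical helper, and treating the header as exactly `dir_name,condition` is an assumption based on the format shown above.

```python
import csv


def read_sample_key(handle):
    """Parse a sample key from an open text file into {dir_name: condition}."""
    reader = csv.DictReader(handle)
    # Assumption: the header row is exactly the two columns shown in the docs.
    if reader.fieldnames != ["dir_name", "condition"]:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    return {row["dir_name"].strip(): row["condition"].strip() for row in reader}
```

For example, `read_sample_key(open("sample_key.csv", newline=""))` returns one entry per sample directory, which makes it easy to confirm that every condition matches a name passed to --groups.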
 
 
- Outputs:
- For each cluster map, the following output directories are created:
- 3D_brains: Files for 3D brain models of valid clusters 
- valid_clusters_tables_and_legend: Excel files with tables summarizing top regions and defining region abbreviations (for SI tables) 
- _valid_clusters: valid cluster maps, CSVs for sunburst plots (plot with Flourish), etc. 
- _valid_clusters_stats: test results for adding asterisks to the xlsx files, etc. 
- _valid_clusters_prism: CSVs for making bar graphs in GraphPad Prism (refer to the xlsx files for annotations) 
 
 
 
- cstats_summary runs these commands:
- cstats_org_data, cstats_group_data, utils_prepend, cstats, cstats_index, cstats_brain_model, cstats_table, cstats_prism, cstats_legend
 
Note
- Only process one comparison at a time. If you have multiple comparisons, run this script separately for each comparison in separate directories. 
- Then aggregate the results as needed (e.g., to make a legend with all relevant abbreviations, copy the .xlsx files to a central location and run cstats_legend).
- See cstats for more information on -cp and -hg.
If you need to rerun this script, delete the following directories and files in the current working directory:
find . -name _valid_clusters -exec rm -rf {} \; -o -name cluster_validation_summary_t-test.csv -exec rm -f {} \; -o -name cluster_validation_summary_tukey.csv -exec rm -f {} \; -o -name 3D_brains -exec rm -rf {} \; -o -name valid_clusters_tables_and_legend -exec rm -rf {} \; -o -name _valid_clusters_stats -exec rm -rf {} \;
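Because this cleanup is destructive, it can help to list what would be removed first. The sketch below is a Python equivalent under stated assumptions: it uses only the directory and file names from the cleanup command above, while the function names and the `dry_run` flag are hypothetical and not part of UNRAVEL.

```python
from pathlib import Path
import shutil

# Names taken from the cleanup command in the docs.
RERUN_DIRS = {
    "_valid_clusters",
    "3D_brains",
    "valid_clusters_tables_and_legend",
    "_valid_clusters_stats",
}
RERUN_FILES = {
    "cluster_validation_summary_t-test.csv",
    "cluster_validation_summary_tukey.csv",
}


def find_rerun_targets(root):
    """Collect matching directories and files anywhere under root."""
    targets = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_dir() and path.name in RERUN_DIRS:
            targets.append(path)
        elif path.is_file() and path.name in RERUN_FILES:
            targets.append(path)
    return targets


def remove_targets(targets, dry_run=True):
    """Print the targets, or delete them when dry_run is False."""
    for path in targets:
        if dry_run:
            print(f"would remove {path}")
        elif path.is_dir():
            shutil.rmtree(path)
        elif path.exists():  # may already be gone with a removed parent
            path.unlink()
```

Calling `remove_targets(find_rerun_targets("."))` only prints the matches; passing `dry_run=False` performs the deletion.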
If you want to aggregate CSVs for sunburst plots of valid clusters, run this in a root directory:
find . -name "valid_clusters_sunburst.csv" -exec sh -c 'cp {} ./$(basename $(dirname $(dirname {})))_$(basename {})' \;
Likewise, you can aggregate raw data (raw_data_for_t-test_pooled.csv), stats (t-test_results.csv), and prism files (cell_density_summary_for_valid_clusters.csv).
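The same aggregation pattern can be sketched in Python for any of these file names. This is an illustrative equivalent of the shell one-liner, not part of UNRAVEL: `aggregate_csvs` is a hypothetical helper that copies each match into the root directory, prefixed with the name of the match's grandparent directory.

```python
from pathlib import Path
import shutil


def aggregate_csvs(root, filename="valid_clusters_sunburst.csv"):
    """Copy every file named `filename` under root into root itself.

    Each copy is named <grandparent directory name>_<filename>, mirroring
    the shell command in the docs. Sources sharing a grandparent name
    overwrite each other, as they would with the shell version.
    """
    root = Path(root)
    copied = []
    # Materialize the matches first so new copies are not revisited.
    for src in sorted(root.rglob(filename)):
        dest = root / f"{src.parent.parent.name}_{src.name}"
        shutil.copy(src, dest)
        copied.append(dest)
    return copied
```

For instance, `aggregate_csvs(".", "t-test_results.csv")` would gather the stats files in the same way.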
- Next steps:
- cstats_summary_config: Copy the cluster_summary.ini file to a new location.
- cstats_summary: Aggregate and analyze cluster validation data from cstats_validation.
 
Usage if running directly after cstats_validation:
cstats_summary -c <path/config.ini> -cvd 'psilocybin_v_saline_tstat1_q<asterisk>' -vd <path/vstats_dir> -sk <path/sample_key.csv> --groups <group1> <group2> -hg <higher_group> [-d <list of paths>] [-v]
Usage from a cluster correction dir after cstats_validation:
cstats_summary -c cluster_summary.ini -cvd 'psilocybin_v_saline_tstat1_q<asterisk>' -vd ../.. -sk <path/sample_key.csv> --groups <group1> <group2> -hg <higher_group> [-d <list of paths>] [-v]
Usage if running after cstats_validation and cstats_org_data:
cstats_summary -c <path/config.ini> -sk <path/sample_key.csv> --groups <group1> <group2> -hg <higher_group> [-d <list of paths>] [-v]