unravel.cluster_stats.summary module
Use cstats_summary (css) from UNRAVEL to aggregate and analyze cluster validation data from cstats_validation.
- Prereqs:
cstats_validation
The name of the rev_cluster_index file should relate to the name of the cluster validation directory.
cluster_index_dir = Path(args.moving_img from cstats_validation).name w/o “_rev_cluster_index” and “.nii.gz”
_cluster_info.txt should be named like: cluster_index_dir + “_cluster_info.txt”
vstats_path / ‘stats’ / cluster_correction_dir should contain:
<cluster_index_dir>_rev_cluster_index[_LH | _RH].nii.gz, <cluster_index_dir>_cluster_info.txt, and p_value_threshold.txt
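The naming convention above can be sketched in Python. This is a minimal illustration, not UNRAVEL's own code: the helper names `cluster_index_dir_name` and `expected_files` are hypothetical, the example file name (including the `q0.05` value) is made up, and the optional `_LH`/`_RH` hemisphere suffix is deliberately not handled.

```python
from pathlib import Path


def cluster_index_dir_name(moving_img):
    """Return the file name without '.nii.gz' and '_rev_cluster_index'."""
    name = Path(moving_img).name
    # Assumption: the name ends in '.nii.gz' and contains '_rev_cluster_index'.
    return name.replace(".nii.gz", "").replace("_rev_cluster_index", "")


def expected_files(cluster_index_dir):
    """List the file names expected in the cluster correction directory."""
    return [
        f"{cluster_index_dir}_rev_cluster_index.nii.gz",
        f"{cluster_index_dir}_cluster_info.txt",
        "p_value_threshold.txt",
    ]


# Hypothetical example path
img = "stats/psilocybin_v_saline_tstat1_q0.05_rev_cluster_index.nii.gz"
dir_name = cluster_index_dir_name(img)
print(dir_name)
print(expected_files(dir_name))
```

Checking these names before running cstats_summary can catch a renamed or missing file early.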
- Inputs:
Cell/label density CSVs from cstats_validation
The current directory should not have other folders when running this script for the first time. Directories from cstats_summary or cstats_org_data are ok though.
- The sample_key.csv file should have the following format:
dir_name,condition
sample01,control
sample02,treatment
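As a quick check of the sample key format shown above, the following sketch parses the CSV text and maps each directory name to its condition. It uses only Python's standard `csv` module; the function name `read_sample_key` is hypothetical and is not part of UNRAVEL.

```python
import csv
import io


def read_sample_key(text):
    """Parse sample key CSV text into a {dir_name: condition} dict."""
    reader = csv.DictReader(io.StringIO(text))
    # The header must match the documented format exactly.
    if reader.fieldnames != ["dir_name", "condition"]:
        raise ValueError(f"Unexpected header: {reader.fieldnames}")
    return {row["dir_name"]: row["condition"] for row in reader}


example = "dir_name,condition\nsample01,control\nsample02,treatment\n"
sample_key = read_sample_key(example)
print(sample_key)  # {'sample01': 'control', 'sample02': 'treatment'}
```

To check a real file, pass `Path("sample_key.csv").read_text()` (after importing `Path` from `pathlib`) instead of the example string.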
- Outputs:
- For each cluster map, the following output directories are created:
3D_brains: Files for 3D brain models of valid clusters
valid_clusters_tables_and_legend: Excel files with tables summarizing top regions and defining region abbreviations (for SI tables)
_valid_clusters: valid cluster maps, CSVs for sunburst plots (plot with Flourish), etc.
_valid_clusters_stats: test results for adding asterisks to the xlsx files, etc.
_valid_clusters_prism: CSVs for making bar graphs in GraphPad Prism (refer to the xlsx files for annotations)
cstats_summary runs these commands: cstats_org_data, cstats_group_data, utils_prepend, cstats, cstats_index, cstats_brain_model, cstats_table, cstats_prism, cstats_legend
Note
Only process one comparison at a time. If you have multiple comparisons, run this script separately for each comparison in separate directories.
Then aggregate the results as needed (e.g., to make a legend with all relevant abbreviations, copy the .xlsx files to a central location and run cstats_legend).
See cstats for more information on -cp and -hg.
If you need to rerun this script, delete the following directories and files in the current working directory:
find . -name _valid_clusters -exec rm -rf {} \; -o -name cluster_validation_summary_t-test.csv -exec rm -f {} \; -o -name cluster_validation_summary_tukey.csv -exec rm -f {} \; -o -name 3D_brains -exec rm -rf {} \; -o -name valid_clusters_tables_and_legend -exec rm -rf {} \; -o -name _valid_clusters_stats -exec rm -rf {} \;
If you want to aggregate CSVs for sunburst plots of valid clusters, run this in a root directory:
find . -name "valid_clusters_sunburst.csv" -exec sh -c 'cp {} ./$(basename $(dirname $(dirname {})))_$(basename {})' \;
Likewise, you can aggregate raw data (raw_data_for_t-test_pooled.csv), stats (t-test_results.csv), and prism files (cell_density_summary_for_valid_clusters.csv).
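For readers who prefer Python to shell, the aggregation step can be sketched as follows. This is a hedged equivalent of the copy command above, not part of UNRAVEL: the function name `aggregate_csvs` is hypothetical, and the demo runs on a throwaway directory tree with made-up directory names. As in the shell version, each copy is prefixed with the name of the directory two levels above the source file.

```python
import shutil
import tempfile
from pathlib import Path


def aggregate_csvs(root, filename="valid_clusters_sunburst.csv"):
    """Copy every matching CSV under root into root, prefixed with the
    name of the directory two levels above the file."""
    root = Path(root).resolve()
    copied = []
    # sorted() collects all matches before any copies are written.
    for src in sorted(root.rglob(filename)):
        dest = root / f"{src.parent.parent.name}_{src.name}"
        shutil.copy(src, dest)
        copied.append(dest)
    return copied


# Demo on a temporary directory tree (directory names are made up).
with tempfile.TemporaryDirectory() as tmp:
    src_dir = Path(tmp) / "comparison_A" / "_valid_clusters"
    src_dir.mkdir(parents=True)
    (src_dir / "valid_clusters_sunburst.csv").write_text("region,volume\n")
    copied_names = [p.name for p in aggregate_csvs(tmp)]

print(copied_names)  # ['comparison_A_valid_clusters_sunburst.csv']
```

Passing a different `filename` (for example, `t-test_results.csv`) would aggregate the other file types in the same way, assuming they sit at the same depth in the directory tree.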
- Next steps:
cstats_summary_config: Copy the cluster_summary.ini file to a new location.
cstats_summary: Aggregate and analyze cluster validation data from cstats_validation.
Usage if running directly after cstats_validation:
cstats_summary -c <path/config.ini> -cvd 'psilocybin_v_saline_tstat1_q<asterisk>' -vd <path/vstats_dir> -sk <path/sample_key.csv> --groups <group1> <group2> -hg <higher_group> [-d <list of paths>] [-v]
Usage from a cluster correction dir after cstats_validation:
cstats_summary -c cluster_summary.ini -cvd 'psilocybin_v_saline_tstat1_q<asterisk>' -vd ../.. -sk <path/sample_key.csv> --groups <group1> <group2> -hg <higher_group> [-d <list of paths>] [-v]
Usage if running after cstats_validation and cstats_org_data:
cstats_summary -c <path/config.ini> -sk <path/sample_key.csv> --groups <group1> <group2> -hg <higher_group> [-d <list of paths>] [-v]