unravel.cluster_stats.table module#

Use cstats_table from UNRAVEL to summarize volumes of the top x regions and collapsing them into parent regions until a criterion is met.

Prereqs:
  • This command is usually run via cstats_summary.

Inputs:
  • CSVs with sunburst data for each cluster (e.g., cluster_`*`_sunburst.csv).

  • *`cluster_info.txt in the parent dir (made by ``cstats_fdr` and copied by cstats_org_data).

Outputs:
  • A color-coded xlsx table summarizing the top regions and their volumes for each cluster.

  • A hierarchically sorted CSV with regional volumes for each cluster.

Sorting by hierarchy and volume:#

Group by Depth: Starting from the earliest depth column, for each depth level:
  • Sum the volumes of all rows sharing the same region (or combination of regions up to that depth).

  • Sort these groups by their aggregate volume in descending order, ensuring larger groups are prioritized.

Sort Within Groups: Within each group created in step 1:
  • Sort the rows by their individual volume in descending order.

Maintain Grouping Order:
  • As we move to deeper depth levels, maintain the grouping and ordering established in previous steps (only adjusting the order within groups based on the new depth’s aggregate volumes).

Note

  • CCFv3-2020_info.csv is in UNRAVEL/unravel/core/csvs/

  • It has columns: structure_id_path,very_general_region,collapsed_region_name,abbreviation,collapsed_region,other_abbreviation,other_abbreviation_defined,layer,sunburst

  • Alternatively, use CCFv3-2017_info.csv or provide a custom CSV with the same columns.

Usage:#

cstats_table [-vcd <val_clusters_dir>] [-t <number of top regions>] [-pv <perecent volume criterion>] [-csv CCFv3-2020_info.csv] [-rgb sunburst_RGBs.csv] [-v]

unravel.cluster_stats.table.parse_args()[source]#
unravel.cluster_stats.table.fill_na_with_last_known(df)[source]#
unravel.cluster_stats.table.sort_sunburst_hierarchy(df)[source]#

Sort the DataFrame by hierarchy and volume.

unravel.cluster_stats.table.undo_fill_with_original(df_sorted, df_original)[source]#
unravel.cluster_stats.table.can_collapse(df, depth_col)[source]#

Determine if regions in the specified depth column can be collapsed. Returns a DataFrame with regions that can be collapsed based on volume and count criteria.

unravel.cluster_stats.table.collapse_hierarchy(df, verbose=False)[source]#
unravel.cluster_stats.table.calculate_top_regions(df, top_n, percent_vol_threshold, verbose=False)[source]#

Identify the top regions based on the dynamically collapsed hierarchy, ensuring they meet the specified percentage volume criterion.

Parameters:
  • df_collapsed – DataFrame with the hierarchy collapsed where meaningful.

  • top_n – The number of top regions to identify.

  • percent_vol_threshold – The minimum percentage of total volume these regions should represent.

Returns:

DataFrame with the top regions and their volumes if the criterion is met; otherwise, None.

unravel.cluster_stats.table.get_top_regions_and_percent_vols(sunburst_csv_path, top_regions, percent_vol, verbose=False)[source]#
unravel.cluster_stats.table.get_fill_color(value, max_value)[source]#
unravel.cluster_stats.table.main()[source]#