Visualization of iModulons
This section covers the visualization functions for exploring iModulon gene weights and activities across samples.
Overview
MultiModulon provides eight main visualization functions:
view_iModulon_weights - Visualize gene weights within a component for a single species
view_core_iModulon_weights - Visualize a core iModulon component across all species
view_iModulon_activities - Visualize component activities across samples
compare_core_iModulon_activity - Compare core iModulon activities across multiple species for specific conditions
plot_iM_conservation_bubble_matrix - Summarize iModulon conservation across species
show_iModulon_activity_change - Visualize activity changes between two conditions
core_iModulon_stability - Quantify core iModulon stability across species using pairwise correlations
show_gene_iModulon_correlation - Show correlation between gene expression and iModulon activity across species
All functions support customization of appearance, highlighting, and export options.
Visualizing Gene Weights
- MultiModulon.view_iModulon_weights(species, component, save_path=None, fig_size=(6, 4), font_path=None, show_COG=False, show_gene_names=None, show_all_gene_names=False)
Create a bar plot showing gene weights for a specific iModulon component.
- Parameters:
species (str) – Species/strain name
component (str) – Component name (e.g., ‘Core_1’, ‘Unique_1’)
save_path (str) – Path to save the plot (optional)
fig_size (tuple) – Figure size as (width, height) (default: (6, 4))
font_path (str) – Path to custom font file (optional)
show_COG (bool) – Color genes by COG category (default: False)
show_gene_names (bool) – Show gene names on plot. If None, auto-set based on component size (default: None). Maximum 60 gene labels will be shown (top genes by weight magnitude)
show_all_gene_names (bool) – Label all genes above threshold without the 60-label cap (default: False)
Basic Usage
# Simple gene weight plot
multiModulon.view_iModulon_weights(
species='Species1',
component='Core_1',
save_path='core1_weights.svg'
)
# With COG coloring
multiModulon.view_iModulon_weights(
species='Species1',
component='Core_1',
show_COG=True,
save_path='core1_weights_COG.svg'
)
# With gene names labeled
multiModulon.view_iModulon_weights(
species='Species1',
component='Core_1',
show_gene_names=True,
save_path='core1_weights_labeled.svg'
)
# Auto-labeling for small components (default behavior)
# If component has <10 genes above threshold, labels are shown automatically
multiModulon.view_iModulon_weights(
species='Species1',
component='Small_Component_1', # Has only 7 genes
save_path='small_component_auto_labeled.svg'
)
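The labeling rules described for show_gene_names and show_all_gene_names (automatic labels when fewer than 10 genes pass the threshold, and a cap of 60 labels ranked by weight magnitude) can be sketched in plain Python. This is an illustrative reconstruction, not the library's code: select_gene_labels and its arguments are hypothetical names, and treating "above threshold" as a strict comparison on the absolute weight is an assumption.

```python
def select_gene_labels(weights, threshold, show_gene_names=None,
                       show_all_gene_names=False, max_labels=60,
                       auto_label_below=10):
    """Return the gene names that would be labeled (hypothetical helper).

    weights: dict mapping gene name -> weight in the M matrix.
    threshold: cutoff on the absolute weight (assumed strict comparison).
    """
    # Genes whose absolute weight exceeds the threshold
    above = [g for g, w in weights.items() if abs(w) > threshold]
    # Rank by weight magnitude, largest first
    above.sort(key=lambda g: abs(weights[g]), reverse=True)

    if show_all_gene_names:
        return above  # no 60-label cap
    if show_gene_names is None:
        # Auto mode: label only small components
        show_gene_names = len(above) < auto_label_below
    if not show_gene_names:
        return []
    return above[:max_labels]


weights = {'geneA': 0.31, 'geneB': -0.22, 'geneC': 0.02, 'geneD': 0.15}
print(select_gene_labels(weights, threshold=0.1))
# ['geneA', 'geneB', 'geneD']
```

With the default show_gene_names=None, a component with 100 genes above threshold would return no labels, while show_gene_names=True would return the top 60.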
Understanding the Plot
X-axis: Gene positions along genome (Mb)
Y-axis: Gene weights (coefficients from M matrix)
Dotted lines: Threshold (if optimized)
Colors: COG categories (if show_COG=True) or light blue/grey based on threshold
Labels: Gene names displayed on the plot when show_gene_names=True (max 60 genes)
- Shown automatically for small components (<10 genes above threshold)
- Uses Preferred_name if available, otherwise the standard gene name
- Text has white background boxes for readability
- Labels start at an offset of 2% of the y-range from their dots, distributed by golden angle
- The adjustText library then optimizes positions with strong repulsion parameters
- Simple lines connect labels to their corresponding points
- If adjustText fails, a fallback places labels in an alternating pattern
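As a rough illustration of the golden-angle starting positions mentioned for labels, the sketch below spreads the initial offsets around each point by rotating successive labels by the golden angle. The helper name initial_label_offsets is hypothetical, and using the same radius (2% of the y-range) for both the x and y components is a simplifying assumption; the library's actual coordinate handling is not documented here.

```python
import math

# Golden angle in radians, about 2.39996 (137.5 degrees)
GOLDEN_ANGLE = math.pi * (3 - math.sqrt(5))


def initial_label_offsets(n_labels, y_range, fraction=0.02):
    """Return (dx, dy) starting offsets for n_labels (hypothetical helper)."""
    radius = fraction * y_range  # 2% of the y-range by default
    offsets = []
    for i in range(n_labels):
        angle = i * GOLDEN_ANGLE  # successive labels rotate by the golden angle
        offsets.append((radius * math.cos(angle), radius * math.sin(angle)))
    return offsets


offsets = initial_label_offsets(5, y_range=0.5)
for dx, dy in offsets:
    print(f"{dx:+.4f}, {dy:+.4f}")
```

Because the golden angle is an irrational fraction of a full turn, consecutive labels never share a direction, which gives the subsequent overlap adjustment a well-spread starting layout.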
COG Categories
When show_COG=True, genes are colored by functional category:
# COG categories and their colors:
# - Translation (J): black
# - Transcription (K): sandybrown
# - Replication (L): fuchsia
# - Cell division (D): olive
# - Defense (V): orchid
# - Signal transduction (T): teal
# - Cell membrane (M): purple
# - Energy production (C): red
# - Carbohydrate metabolism (G): gold
# - Amino acid metabolism (E): darkgreen
# - Nucleotide metabolism (F): pink
# - Coenzyme metabolism (H): brown
# - Lipid metabolism (I): lightsalmon
# - Inorganic ion metabolism (P): darkblue
# - Secondary metabolism (Q): sienna
# - Unknown function (S): lightgray
# - Not in COG: gray
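The same mapping can be written as a lookup table, which is convenient when building a matching legend for your own figures. This is a sketch based only on the list above: the names COG_COLORS and cog_color are hypothetical, and keying by the one-letter COG code (rather than by the full category name) is an assumption about how you might store annotations.

```python
# One-letter COG code -> color name, as listed above
COG_COLORS = {
    'J': 'black',        # Translation
    'K': 'sandybrown',   # Transcription
    'L': 'fuchsia',      # Replication
    'D': 'olive',        # Cell division
    'V': 'orchid',       # Defense
    'T': 'teal',         # Signal transduction
    'M': 'purple',       # Cell membrane
    'C': 'red',          # Energy production
    'G': 'gold',         # Carbohydrate metabolism
    'E': 'darkgreen',    # Amino acid metabolism
    'F': 'pink',         # Nucleotide metabolism
    'H': 'brown',        # Coenzyme metabolism
    'I': 'lightsalmon',  # Lipid metabolism
    'P': 'darkblue',     # Inorganic ion metabolism
    'Q': 'sienna',       # Secondary metabolism
    'S': 'lightgray',    # Unknown function
}
NOT_IN_COG_COLOR = 'gray'


def cog_color(category):
    """Return the color for a COG code, falling back to the 'Not in COG' color."""
    return COG_COLORS.get(category, NOT_IN_COG_COLOR)


print(cog_color('K'))   # sandybrown
print(cog_color(None))  # gray
```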
Customizing Appearance
# Larger figure with custom font
multiModulon.view_iModulon_weights(
species='Species1',
component='Core_1',
fig_size=(8, 6),
font_path='/usr/share/fonts/truetype/liberation/LiberationSans-Regular.ttf',
save_path='custom_weights.svg'
)
Visualizing Core iModulons Across Species
- MultiModulon.view_core_iModulon_weights(component, save_path=None, fig_size=(6, 4), font_path=None, show_COG=False, reference_order=None, show_gene_names=False)
Visualize a core iModulon component across all species. Creates individual plots for each species showing the same core component, or a combined plot with subplots when COG coloring is enabled.
- Parameters:
component (str) – Core component name (e.g., ‘Core_1’, ‘Core_2’)
save_path (str) – Directory path to save plots (optional)
fig_size (tuple) – Figure size for individual plots (default: (6, 4))
font_path (str) – Path to custom font file (optional)
show_COG (bool) – Color genes by COG category (default: False)
reference_order (list) – Custom species order for subplot arrangement (optional)
show_gene_names (bool) – Show gene names for all genes above threshold, using adjust_text to reduce label overlap (default: False).
Basic Usage
# Visualize core component across all species
multiModulon.view_core_iModulon_weights(
component='Core_1',
save_path='core_plots/'
)
# With COG coloring - creates combined plot
multiModulon.view_core_iModulon_weights(
component='Core_1',
show_COG=True,
save_path='core1_all_species_COG.svg'
)
# With gene labeling - shows all genes above threshold
multiModulon.view_core_iModulon_weights(
component='Core_1',
show_gene_names=True,
save_path='core1_labeled.svg'
)
# This will:
# - Label all genes above threshold in each species plot
# - Print list of shared genes to console (when available)
Custom Species Order
When using COG coloring, arrange species in a specific order:
# Define custom order (first 3 in top row, rest in bottom row)
multiModulon.view_core_iModulon_weights(
component='Core_1',
show_COG=True,
reference_order=['MG1655', 'BL21', 'C', 'Crooks', 'W', 'W3110'],
save_path='core1_ordered.svg'
)
Understanding the Output
- Without COG coloring: Creates individual plots for each species
Each plot saved as ‘{species}_{component}_iModulon.svg’
Shows gene weights on genomic coordinates
Includes threshold lines if available
Gene labels shown if show_gene_names=True
- With COG coloring: Creates a single combined plot
All species shown as subplots
Shared COG category legend at bottom
Genes colored by functional category
Grey dots indicate genes below threshold
Gene labels shown if show_gene_names=True (all genes above threshold)
Shared genes across all species are printed to console instead
Initial positioning uses golden angle spiral (same as view_iModulon_weights)
Stronger force parameters for crowded subplots (force_points: 1.0-1.2, force_text: 2.0-2.5)
More expansion around points and text (3.5-4.0 for points, 3.0-3.5 for text)
2500 iterations for better convergence in small subplot spaces
Fallback to initial positions if adjust_text fails
Batch Processing Core Components
# Plot all core components
M = multiModulon[multiModulon.species[0]].M
core_components = [c for c in M.columns if c.startswith('Core_')]
for comp in core_components:
    # Individual species plots
    multiModulon.view_core_iModulon_weights(
        component=comp,
        save_path=f'core_plots/{comp}/'
    )
    # Combined COG plot
    multiModulon.view_core_iModulon_weights(
        component=comp,
        show_COG=True,
        save_path=f'core_plots/{comp}_COG.svg'
    )
Visualizing iModulon Activities
- MultiModulon.view_iModulon_activities(species, component, save_path=None, fig_size=(12, 3), font_path=None, highlight_project=None, highlight_study=None, highlight_condition=None, show_highlight_only=False, show_highlight_only_color=None)
Create a bar plot showing component activities across samples.
- Parameters:
species (str) – Species/strain name
component (str) – Component name
save_path (str) – Path to save the plot
fig_size (tuple) – Figure size (default: (12, 3))
font_path (str) – Path to custom font
highlight_project – Project(s) to highlight (str or list)
highlight_study (str) – Study to highlight
highlight_condition – Condition(s) to highlight (str or list)
show_highlight_only (bool) – Only show highlighted conditions
show_highlight_only_color (list) – Colors for highlighted conditions
Basic Usage
# Simple activity plot
multiModulon.view_iModulon_activities(
species='Species1',
component='Core_1',
save_path='core1_activities.svg'
)
# Highlight specific project
multiModulon.view_iModulon_activities(
species='Species1',
component='Core_1',
highlight_project='ProjectA',
save_path='core1_highlighted.svg'
)
Condition-based Visualization
When a condition column exists in the sample sheet:
# Activities are averaged by condition
# Individual sample values shown as black dots
multiModulon.view_iModulon_activities(
species='Species1',
component='Core_1',
save_path='condition_averaged.svg'
)
# Highlight specific conditions
multiModulon.view_iModulon_activities(
species='Species1',
component='Core_1',
highlight_condition=['Treatment1', 'Treatment2'],
save_path='conditions_highlighted.svg'
)
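The condition-averaged view can be pictured as a simple group-by: each bar is the mean of the samples sharing a condition, and the individual values are kept so they can be drawn as dots. The sketch below is an illustration in plain Python, not the library's implementation; average_by_condition and its dict-based inputs are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_by_condition(activities, conditions):
    """Group per-sample activities by condition (hypothetical helper).

    activities: dict sample -> activity of one component.
    conditions: dict sample -> condition label from the sample sheet.
    Returns dict condition -> (mean activity, list of sample values).
    """
    grouped = defaultdict(list)
    for sample, value in activities.items():
        grouped[conditions[sample]].append(value)
    # Bar height is the mean; the raw values become the black dots
    return {cond: (mean(vals), vals) for cond, vals in grouped.items()}


activities = {'s1': 1.0, 's2': 3.0, 's3': -2.0}
conditions = {'s1': 'Treatment1', 's2': 'Treatment1', 's3': 'Control'}
print(average_by_condition(activities, conditions))
# {'Treatment1': (2.0, [1.0, 3.0]), 'Control': (-2.0, [-2.0])}
```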
Show Only Highlighted Conditions
Focus on specific conditions:
# Show only specific conditions with custom colors
multiModulon.view_iModulon_activities(
species='Species1',
component='Core_1',
highlight_condition=['Control', 'Stress', 'Recovery'],
show_highlight_only=True,
show_highlight_only_color=['blue', 'red', 'green'],
save_path='focused_conditions.svg'
)
Multiple Highlighting Options
# Highlight multiple projects
multiModulon.view_iModulon_activities(
species='Species1',
component='Core_1',
highlight_project=['ProjectA', 'ProjectB'],
save_path='multi_project.svg'
)
# Highlight by study
multiModulon.view_iModulon_activities(
species='Species1',
component='Core_1',
highlight_study='GSE12345',
save_path='study_highlighted.svg'
)
Advanced Visualization
Batch Visualization
Create plots for multiple components:
# Plot all core components
for species in multiModulon.species:
    M = multiModulon[species].M
    core_comps = [c for c in M.columns if c.startswith('Core_')]
    for comp in core_comps:
        # Gene weights
        multiModulon.view_iModulon_weights(
            species=species,
            component=comp,
            show_COG=True,
            save_path=f'weights/{species}_{comp}_weights.svg'
        )
        # Activities
        multiModulon.view_iModulon_activities(
            species=species,
            component=comp,
            save_path=f'activities/{species}_{comp}_activities.svg'
        )
Export Options
File Formats
Save plots in different formats:
# Vector format (scalable)
multiModulon.view_iModulon_weights(
species='Species1',
component='Core_1',
save_path='weights.svg' # SVG format
)
# High-resolution raster
multiModulon.view_iModulon_weights(
species='Species1',
component='Core_1',
save_path='weights.png' # PNG at 300 DPI
)
# PDF for publications
multiModulon.view_iModulon_weights(
species='Species1',
component='Core_1',
save_path='weights.pdf'
)
Directory Organization
Organize outputs systematically:
import os
# Create directory structure
base_dir = 'imodulon_plots'
for subdir in ['weights', 'activities', 'weights_COG']:
    os.makedirs(f'{base_dir}/{subdir}', exist_ok=True)
# Save with organized naming
for species in multiModulon.species:
    for comp in ['Core_1', 'Core_2', 'Unique_1']:
        # Weights without COG
        multiModulon.view_iModulon_weights(
            species=species,
            component=comp,
            save_path=f'{base_dir}/weights/{species}_{comp}.svg'
        )
        # Weights with COG
        multiModulon.view_iModulon_weights(
            species=species,
            component=comp,
            show_COG=True,
            save_path=f'{base_dir}/weights_COG/{species}_{comp}.svg'
        )
        # Activities
        multiModulon.view_iModulon_activities(
            species=species,
            component=comp,
            save_path=f'{base_dir}/activities/{species}_{comp}.svg'
        )
Comparing Core iModulon Activities Across Species
- MultiModulon.compare_core_iModulon_activity(component, species_in_comparison, condition_list, save_path=None, fig_size=(12, 3), font_path=None, legend_title=None, title=None)
Compare core iModulon activities across multiple species for specific conditions. Creates a grouped bar plot with conditions on x-axis and species shown as different colored bars.
- Parameters:
component (str) – Core component name (e.g., ‘Core_1’, ‘Core_2’)
species_in_comparison (list) – List of species names to compare
condition_list (list) – List of conditions in format “condition:project”
save_path (str) – Path to save the plot (optional)
fig_size (tuple) – Figure size (default: (12, 3))
font_path (str) – Path to custom font file (optional)
legend_title (str) – Custom title for the legend (default: ‘Species’)
title (str) – Custom title for the plot (default: ‘Core iModulon {component} Activity Comparison’)
Basic Usage
# Compare Core_1 activities across species for specific conditions
multiModulon.compare_core_iModulon_activity(
component='Core_1',
species_in_comparison=['E_coli', 'S_enterica', 'K_pneumoniae'],
condition_list=['glucose:project1', 'lactose:project1', 'arabinose:project2']
)
Condition Format
Conditions must be specified as “condition:project” pairs:
# Comparing growth conditions from different projects
multiModulon.compare_core_iModulon_activity(
component='Core_1',
species_in_comparison=['Species1', 'Species2', 'Species3'],
condition_list=[
'exponential:growth_study', # Exponential phase from growth_study
'stationary:growth_study', # Stationary phase from growth_study
'heat_shock:stress_project', # Heat shock from stress_project
'cold_shock:stress_project' # Cold shock from stress_project
],
save_path='core1_condition_comparison.svg'
)
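When condition lists are built programmatically, it can help to check the “condition:project” format before calling the plotting function. The helper below is a hypothetical sketch, not part of MultiModulon; splitting on the first colon is an assumption, since the documentation does not say how names containing colons are handled.

```python
def parse_condition(spec):
    """Split 'condition:project' into its two parts (hypothetical helper)."""
    # Assumption: the first colon separates condition from project
    condition, sep, project = spec.partition(':')
    if not sep or not condition or not project:
        raise ValueError(f"Expected 'condition:project', got {spec!r}")
    return condition, project


print(parse_condition('heat_shock:stress_project'))
# ('heat_shock', 'stress_project')
```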
Understanding the Plot
X-axis: Conditions (grouped by the order in condition_list)
Y-axis: iModulon activity values
Bars: Different colors for each species
Dots: Individual sample values (black dots on bars)
Legend: Species names with corresponding colors
Error Handling
The function validates that all conditions exist in all species:
# This will raise an error if any species lacks a condition
try:
    multiModulon.compare_core_iModulon_activity(
        component='Core_1',
        species_in_comparison=['Species1', 'Species2'],
        condition_list=['rare_condition:project1']
    )
except ValueError as e:
    print(f"Error: {e}")
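To find out in advance which species lack a condition, a pre-check along the following lines can be run on your own sample sheets. This is a sketch of the validation idea only: check_conditions is a hypothetical helper, and the available mapping (species to the set of “condition:project” strings present in that species) is something you would assemble yourself.

```python
def check_conditions(available, species_in_comparison, condition_list):
    """Raise ValueError if any species lacks a requested condition.

    available: dict species -> set of 'condition:project' strings
               (assumed to be built from each species' sample sheet).
    """
    missing = {}
    for species in species_in_comparison:
        present = available.get(species, set())
        absent = [c for c in condition_list if c not in present]
        if absent:
            missing[species] = absent
    if missing:
        raise ValueError(f"Conditions missing in some species: {missing}")


available = {
    'Species1': {'control:exp1', 'rare_condition:project1'},
    'Species2': {'control:exp1'},
}
try:
    check_conditions(available, ['Species1', 'Species2'],
                     ['rare_condition:project1'])
except ValueError as e:
    print(f"Error: {e}")
```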
Customizing Appearance
# Larger figure with custom font
multiModulon.compare_core_iModulon_activity(
component='Core_1',
species_in_comparison=['Species1', 'Species2', 'Species3'],
condition_list=['control:exp1', 'treatment:exp1'],
fig_size=(15, 5), # Wider figure
font_path='/path/to/font.ttf',
save_path='comparison_custom.svg'
)
# Custom title and legend
multiModulon.compare_core_iModulon_activity(
component='Core_1',
species_in_comparison=['E_coli_K12', 'E_coli_B', 'E_coli_C'],
condition_list=['glucose:carbon_study', 'lactose:carbon_study'],
title='Carbon Source Response in E. coli Strains',
legend_title='E. coli Strain',
save_path='ecoli_carbon_response.svg'
)
Use Cases
Stress Response Comparison: Compare how different species respond to the same stresses
Metabolic Adaptation: Analyze metabolic shifts across species under different carbon sources
Evolutionary Analysis: Study conservation of regulatory responses
# Example: Comparing stress responses
stress_conditions = [
'control:stress_study',
'heat_42C:stress_study',
'oxidative_H2O2:stress_study',
'acid_pH5:stress_study'
]
multiModulon.compare_core_iModulon_activity(
component='Core_1', # Assuming Core_1 is stress-related
species_in_comparison=['E_coli', 'S_enterica', 'K_pneumoniae'],
condition_list=stress_conditions,
save_path='stress_response_comparison.svg'
)
Conservation Bubble Matrix
- MultiModulon.plot_iM_conservation_bubble_matrix(n_components, reference_order=None, iM_colors=None, fig_size=(10, 4), bubble_scale=800.0, y_label='Species/Strains', save_path=None, font_path=None)
Plot a bubble matrix summarizing iModulon conservation across species.
- Parameters:
n_components (int) – Number of leading components (per species) on the x-axis
reference_order (list) – Optional species order for the y-axis
iM_colors (list) – Optional list of colors for iModulon columns
fig_size (tuple) – Figure size as (width, height) (default: (10, 4))
bubble_scale (float) – Scaling factor for bubble sizes (default: 800.0)
y_label (str) – Label for the y-axis (default: “Species/Strains”)
save_path (str) – Path to save the plot (optional)
font_path (str) – Path to custom font file (optional)
Basic Usage
# Summarize conservation for the top 8 components per species
multiModulon.plot_iM_conservation_bubble_matrix(
n_components=8,
reference_order=['Species1', 'Species2', 'Species3'],
save_path='conservation_bubble_matrix.svg'
)
Visualizing Activity Changes Between Conditions
- MultiModulon.show_iModulon_activity_change(species, condition_1, condition_2, save_path=None, fig_size=(5, 5), font_path=None, threshold=1.5)
Visualize iModulon activity changes between two conditions as a scatter plot.
Creates a scatter plot with condition_1 activities on x-axis and condition_2 on y-axis. Components with significant changes are highlighted in light blue and labeled. Activities are calculated by averaging all biological replicates for each condition.
- Parameters:
species (str) – Species/strain name
condition_1 (str) – First condition in format “condition_name:project_name” (x-axis)
condition_2 (str) – Second condition in format “condition_name:project_name” (y-axis)
save_path (str) – Path to save the plot (optional)
fig_size (tuple) – Figure size (default: (5, 5))
font_path (str) – Path to custom font file (optional)
threshold (float) – Threshold for significant change (default: 1.5). Scaled based on activity range
Basic Usage
# Compare activities between two conditions
multiModulon.show_iModulon_activity_change(
species='E_coli',
condition_1='glucose:carbon_source_study',
condition_2='lactose:carbon_source_study',
save_path='glucose_vs_lactose_changes.svg'
)
# Compare conditions from different projects
multiModulon.show_iModulon_activity_change(
species='E_coli',
condition_1='control:experiment_1',
condition_2='stress:experiment_2',
save_path='cross_project_comparison.svg'
)
Understanding the Plot
Grey dots: Components with minimal change between conditions
Light blue dots: Components with significant change (absolute difference > scaled threshold)
Labels: Component names shown for significant changes
- Smart initial positioning checks the distance to all points before placing each label
- Minimum safe distance of 10% of the axis range from any point
- White background boxes with light gray borders for readability
- Simple gray lines connect labels to their corresponding points
- No automatic repositioning, so labels cannot move onto dots
- 10% axis margins keep labels fully visible
- Saved with 0.05 inch padding to prevent label cutoff
Dotted lines: Three reference lines at y=x (diagonal), x=0 (vertical), and y=0 (horizontal)
Note: The threshold is automatically scaled based on the range of activities to handle negative ICA values appropriately.
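The classification behind the plot can be sketched as follows: replicates are averaged per condition, and a component is flagged when the absolute difference between the two means exceeds the scaled threshold. This is an illustration, not the library's code. The helper classify_activity_changes is hypothetical, and it takes an already scaled cutoff as input because the exact scaling rule is not described in this section.

```python
from statistics import mean


def classify_activity_changes(cond1_replicates, cond2_replicates,
                              scaled_threshold):
    """Return {component: (x, y, is_significant)} (hypothetical helper).

    cond*_replicates: dict component -> list of replicate activities.
    scaled_threshold: cutoff already scaled to the activity range
                      (the scaling rule itself is not reproduced here).
    """
    result = {}
    for comp in cond1_replicates:
        x = mean(cond1_replicates[comp])  # condition_1, x-axis
        y = mean(cond2_replicates[comp])  # condition_2, y-axis
        result[comp] = (x, y, abs(y - x) > scaled_threshold)
    return result


cond1 = {'Core_1': [0.0, 0.5], 'Core_2': [-1.0, -1.0]}
cond2 = {'Core_1': [5.0, 6.0], 'Core_2': [-0.5, -1.0]}
for comp, (x, y, sig) in classify_activity_changes(cond1, cond2, 2.0).items():
    print(comp, x, y, 'significant' if sig else 'minimal change')
# Core_1 0.25 5.5 significant
# Core_2 -1.0 -0.75 minimal change
```

Points far from the y=x diagonal are the ones flagged, which is why the diagonal reference line is the most informative of the three.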
Customizing the Threshold
# Use stricter threshold for significance
multiModulon.show_iModulon_activity_change(
species='E_coli',
condition_1='control:stress_study',
condition_2='heat_shock:stress_study',
threshold=2.0, # Stricter cutoff: a larger activity difference is required
save_path='stress_response_strict.svg'
)
# Use more lenient threshold
multiModulon.show_iModulon_activity_change(
species='E_coli',
condition_1='early_log:growth_curve',
condition_2='late_log:growth_curve',
threshold=1.3, # More lenient cutoff: smaller differences are flagged
save_path='growth_phase_changes.svg'
)
Use Cases
Metabolic Shifts: Identify iModulons responding to carbon source changes
Stress Response: Find iModulons activated under stress conditions
Growth Phase: Compare exponential vs stationary phase activities
Treatment Effects: Analyze drug or environmental perturbations
# Example: Analyzing antibiotic response
multiModulon.show_iModulon_activity_change(
species='E_coli',
condition_1='untreated:antibiotic_study',
condition_2='ampicillin:antibiotic_study',
threshold=1.5,
save_path='ampicillin_response.svg'
)
# Example: Growth phase comparison
multiModulon.show_iModulon_activity_change(
species='S_enterica',
condition_1='exponential:growth_phases',
condition_2='stationary:growth_phases',
font_path='/path/to/Arial.ttf',
save_path='growth_phase_comparison.pdf'
)
Core iModulon Stability Analysis
- MultiModulon.core_iModulon_stability(component, save_path=None, fig_size=(6, 4), font_path=None, show_stats=True)
Quantify core iModulon stability across species using pairwise correlations.
This function calculates how similar a core iModulon is across different species by computing the mean pairwise correlation of M matrix weights. It uses adaptive gap detection to identify distinct groups and outliers in small datasets (3-6 species).
- Parameters:
component (str) – Component name (e.g., ‘Core_1’, ‘Iron’)
save_path (str) – Path to save the plot (optional)
fig_size (tuple) – Figure size as (width, height) (default: (6, 4))
font_path (str) – Path to custom font file (optional)
show_stats (bool) – Whether to show statistics on the plot (default: True)
- Returns:
Tuple of (stable_species, stable_min, stable_max, stability_scores)
- stable_species (list): Species names classified as stable (non-outliers)
- stable_min (float): Lower boundary of the stable range (outlier detection threshold)
- stable_max (float): Upper boundary of the stable range (always 1.0)
- stability_scores (dict): Dictionary mapping species names to stability scores
Basic Usage
# Simple usage - robust outlier detection optimized for 3-6 species
stable, min_bound, max_bound, scores = multiModulon.core_iModulon_stability('Core_1')
print(f"Stable species: {stable}")
print(f"Outlier threshold: {min_bound:.3f}")
# With custom font for publications
stable, min_bound, max_bound, scores = multiModulon.core_iModulon_stability(
'Iron',
font_path='/usr/share/fonts/truetype/msttcorefonts/Arial.ttf',
save_path='iron_stability.svg'
)
# Analyze individual species stability
for species, score in scores.items():
    status = "stable" if species in stable else "outlier (problematic)"
    print(f"{species}: {score:.3f} ({status})")
# Check if any species are problematic
if len(stable) < len(scores):
    outliers = [s for s in scores.keys() if s not in stable]
    print(f"⚠️ Potential issues with: {', '.join(outliers)}")
else:
    print("✅ All species show consistent regulatory patterns")
Understanding the Stability Metric
The stability score for each species is calculated as the mean pairwise Pearson correlation of its M matrix weights with all other species for the specified core component:
Score = 1.0: Perfect correlation with all other species (highly stable)
Score > 0.7: Good stability, similar regulatory pattern across species
Score < 0.5: Low stability, divergent regulatory pattern
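The score definition can be made concrete with a small, dependency-free sketch. It assumes each species' weights have already been aligned on the same ordered set of shared genes; how MultiModulon performs that alignment is not covered here, and the function names pearson and stability_scores are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def stability_scores(weights_by_species):
    """Mean pairwise correlation of each species with all others.

    weights_by_species: dict species -> list of weights for one component,
    aligned on the same ordered shared genes (an assumption of this sketch).
    """
    scores = {}
    for sp, w in weights_by_species.items():
        others = [pearson(w, v)
                  for other, v in weights_by_species.items() if other != sp]
        scores[sp] = sum(others) / len(others)
    return scores


weights = {
    'A': [0.1, 0.2, 0.3, 0.4],
    'B': [0.2, 0.4, 0.6, 0.8],   # same pattern as A
    'C': [0.4, 0.3, 0.2, 0.1],   # reversed pattern
}
for sp, score in stability_scores(weights).items():
    print(f"{sp}: {score:.3f}")
```

In this toy input A and B correlate perfectly with each other but negatively with C, so C receives the lowest score: a species that disagrees with all others is pulled down the most.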
Adaptive Gap Detection
The function uses adaptive gap detection - specifically designed to handle both group separation and outlier detection in small datasets (3-6 species):
- Multi-Level Detection Strategy
Similar scores (range < 0.05): All species marked stable
Significant gaps: Detects natural breaks between groups of species
Outlier detection: Uses IQR method for scattered individual outliers
Edge cases: Special handling for 3-species datasets
- Gap Detection Logic
Large gap threshold: Gap must be ≥15% of total range (or 60% for small ranges)
Split groups: Places threshold at midpoint of largest significant gap
Fallback to IQR: Uses interquartile range outlier detection if no clear gaps
- Why This Approach Works Well
Detects group patterns: Identifies when species form distinct clusters
Handles uniform distributions: Correctly identifies when all species are similar
Sensitive to structure: Finds meaningful biological separations
Robust to sample size: Works for 3 to 6 species with appropriate thresholds
- Examples:
Core_5 scores: [0.63, 0.65, 0.67, 0.67, 0.68] → Range=0.05 → All stable ✅
Core_6 scores: [0.38, 0.38, 0.40, 0.52, 0.52, 0.53] → Gap=0.12 → Two groups detected ✅
This adaptive method correctly identifies both scenarios: truly stable components and those with distinct species groups.
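The decision sequence described in this section can be approximated with the sketch below. It is a simplified reading of the rules, not the library's implementation: the function detect_stable is hypothetical, the cutoff that defines a “small range” (0.1 here) is an assumption because the text does not give one, and the IQR fallback uses the conventional Q1 − 1.5 × IQR lower fence.

```python
from statistics import quantiles


def detect_stable(scores, similar_range=0.05, gap_fraction=0.15,
                  small_range=0.1, small_range_gap_fraction=0.60):
    """Return (stable_species, stable_min) for a dict species -> score."""
    ordered = sorted(scores.values())
    total_range = ordered[-1] - ordered[0]

    # 1. Nearly identical scores: every species is stable
    if total_range < similar_range:
        return sorted(scores), ordered[0]

    # 2. Largest gap between neighbouring scores
    gap, low, high = max((b - a, a, b) for a, b in zip(ordered, ordered[1:]))
    # Assumed: ranges below `small_range` need the stricter 60% criterion
    required = (small_range_gap_fraction if total_range < small_range
                else gap_fraction)
    if gap / total_range >= required:
        stable_min = (low + high) / 2  # midpoint of the significant gap
    else:
        # 3. IQR fallback: scores below Q1 - 1.5 * IQR are outliers
        q1, _, q3 = quantiles(ordered, n=4, method='inclusive')
        stable_min = q1 - 1.5 * (q3 - q1)

    stable = sorted(s for s, v in scores.items() if v >= stable_min)
    return stable, stable_min


core_5 = dict(zip(['sp1', 'sp2', 'sp3', 'sp4', 'sp5'],
                  [0.63, 0.65, 0.67, 0.67, 0.68]))
core_6 = dict(zip(['sp1', 'sp2', 'sp3', 'sp4', 'sp5', 'sp6'],
                  [0.38, 0.38, 0.40, 0.52, 0.52, 0.53]))
print(detect_stable(core_5)[0])  # all five species
print(detect_stable(core_6)[0])  # ['sp4', 'sp5', 'sp6']
```

One caveat of this literal reading: with at most seven scores, the largest gap is always at least one sixth of the total range, so the 15% rule splits the species whenever the range is not small. The actual function may apply additional criteria before splitting.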
Understanding the Plot
X-axis: Species names
Y-axis: Stability scores (mean pairwise correlation)
Bar colors:
- Blue (#C1C6E8): Stable species
- Peach (#F0DDD2): Unstable species
Gray dashed lines: Adaptive detection boundaries (gap detection or outlier threshold)
Light blue shading: Stable correlation range (adapts to data structure)
Clean legend: Simple “Stable” and “Unstable” labels
Use Cases
Quality Control: Identify species with poorly defined or inconsistent iModulons
Data Validation: Detect potential ICA decomposition issues or data quality problems
Species Selection: Choose the most reliable species for downstream analysis
Comparative Analysis: Understand which species deviate from conserved regulatory patterns
Method Validation: Assess whether core components are truly “core” across species
# Example: Analyzing all core components
M = multiModulon[multiModulon.species[0]].M
core_components = [c for c in M.columns if c.startswith('Core_')]
stability_results = {}
for comp in core_components:
    stable, min_bound, max_bound, scores = multiModulon.core_iModulon_stability(
        comp,
        save_path=f'stability/{comp}_stability.svg'
    )
    stability_results[comp] = {
        'stable_species': stable,
        'range': (min_bound, max_bound),
        'all_scores': scores,
        'range_width': max_bound - min_bound
    }
# Identify components with outlier species
# (use each component's own scores rather than the last loop value)
problematic_components = [comp for comp, res in stability_results.items()
                          if len(res['stable_species']) < len(res['all_scores'])]
print(f"Components with outlier species: {problematic_components}")
# Find the most problematic species across all components
all_outliers = []
for res in stability_results.values():
    outliers = [s for s in res['all_scores'] if s not in res['stable_species']]
    all_outliers.extend(outliers)
from collections import Counter
outlier_counts = Counter(all_outliers)
if outlier_counts:
    print("Species flagged as outliers (count across components):")
    for species, count in outlier_counts.most_common():
        print(f" {species}: {count} components")
Advanced Options
# Disable statistics display for cleaner plots
stable, min_bound, max_bound, scores = multiModulon.core_iModulon_stability(
'Core_1',
show_stats=False,
save_path='clean_stability.svg'
)
# Custom figure size and font
stable, min_bound, max_bound, scores = multiModulon.core_iModulon_stability(
'Core_1',
fig_size=(7, 5),
font_path='/usr/share/fonts/truetype/msttcorefonts/Arial.ttf',
save_path='custom_stability.pdf'
)
# Batch analysis of multiple components
import numpy as np  # needed for np.median below
M = multiModulon[multiModulon.species[0]].M
components = [c for c in M.columns if c.startswith('Core_')]
stability_summary = {}
for comp in components:
    stable, min_bound, max_bound, scores = multiModulon.core_iModulon_stability(
        comp,
        save_path=f'stability/{comp}_stability.svg'
    )
    stability_summary[comp] = {
        'n_stable': len(stable),
        'n_total': len(scores),
        'outlier_threshold': min_bound,
        'median_score': np.median(list(scores.values())),
        'outlier_species': [s for s in scores.keys() if s not in stable]
    }
print("Component stability summary:")
for comp, stats in stability_summary.items():
    outlier_info = f", outliers: {stats['outlier_species']}" if stats['outlier_species'] else ""
    print(f"{comp}: {stats['n_stable']}/{stats['n_total']} stable "
          f"(median={stats['median_score']:.3f}){outlier_info}")
# Overall dataset quality assessment
total_stable = sum(stats['n_stable'] for stats in stability_summary.values())
total_possible = sum(stats['n_total'] for stats in stability_summary.values())
print(f"\nOverall stability: {total_stable}/{total_possible} "
f"({100*total_stable/total_possible:.1f}%) species-component pairs are stable")
Gene-iModulon Correlation Analysis
- MultiModulon.show_gene_iModulon_correlation(gene, component, save_path=None, fig_size=(5, 4), font_path=None)
Show correlation between gene expression and iModulon activity across species.
Creates scatter plots showing the correlation between gene expression (from log_tpm) and component activity (from A matrix) for each species where the gene is present.
- Parameters:
gene (str) – Gene name (any value from combined_gene_db)
component (str) – Component name (e.g., ‘Core_1’, ‘Unique_1’)
save_path (str) – Path to save the figure (optional). Can be:
- Full file path with extension (e.g., ‘output/correlation.svg’)
- Directory path (saved as ‘{gene}_{component}_correlation.svg’)
fig_size (tuple) – Figure size for each subplot (default: (5, 4))
font_path (str) – Path to custom font file (optional)
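The two accepted forms of save_path can be illustrated with a small helper that predicts the output file. This is a hypothetical sketch: resolve_save_path is not part of MultiModulon, and deciding “file or directory” by the presence of a file extension is an assumption about how the distinction might be made.

```python
import os


def resolve_save_path(save_path, gene, component):
    """Predict the output file for a given save_path (hypothetical helper)."""
    # Assumption: a path with an extension is a file, anything else a directory
    _, ext = os.path.splitext(save_path)
    if ext:
        return save_path
    return os.path.join(save_path, f'{gene}_{component}_correlation.svg')


print(resolve_save_path('output/correlation.svg', 'argA', 'Core_1'))
print(resolve_save_path('output_dir/', 'trpE', 'Core_3'))
```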
Basic Usage
# Show correlation for a specific gene and core iModulon
multiModulon.show_gene_iModulon_correlation(
gene='argA',
component='Core_1',
save_path='argA_Core1_correlation.svg'
)
# With custom appearance
multiModulon.show_gene_iModulon_correlation(
gene='trpE',
component='Core_3',
fig_size=(6, 5),
font_path='/path/to/Arial.ttf',
save_path='output_dir/'
)
Features
Multi-species visualization: Shows correlation for all species containing the gene
Correlation coefficient: Displays Pearson’s r in the top left of each subplot
Fitted line: Shows linear relationship between expression and activity
Automatic layout: Maximum 3 columns per row for multiple species
Species-specific gene names: Uses appropriate gene identifiers for each species
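Two of these features are easy to reproduce by hand when checking a plot: the fitted line with its Pearson’s r, and the subplot grid with at most three columns. The sketch below uses plain Python and hypothetical helper names (fit_line, grid_shape); which variable the library places on which axis is not stated here, so x and y are generic.

```python
import math


def fit_line(x, y):
    """Least-squares slope and intercept, plus Pearson's r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept, sxy / math.sqrt(sxx * syy)


def grid_shape(n_species, max_cols=3):
    """Rows and columns for n_species subplots, at most max_cols per row."""
    cols = min(n_species, max_cols)
    rows = math.ceil(n_species / max_cols)
    return rows, cols


activity = [0.0, 1.0, 2.0, 3.0]
expression = [1.0, 3.0, 5.0, 7.0]
slope, intercept, r = fit_line(activity, expression)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, r={r:.2f}")
# slope=2.00, intercept=1.00, r=1.00
print(grid_shape(5))  # (2, 3)
```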
Use Cases
Validate iModulon members: Confirm genes are truly regulated by the iModulon
Cross-species comparison: See if gene-iModulon relationships are conserved
Identify outliers: Find conditions where typical correlations break down
Regulatory strength: Assess how tightly a gene follows iModulon activity
# Example: Analyzing amino acid biosynthesis regulation
multiModulon.show_gene_iModulon_correlation(
gene='hisG', # Histidine biosynthesis
component='Core_5', # Amino acid biosynthesis iModulon
save_path='histidine_regulation.pdf'
)
# Example: Stress response gene analysis
multiModulon.show_gene_iModulon_correlation(
gene='dnaK', # Heat shock protein
component='Core_8', # Stress response iModulon
fig_size=(5, 4),
save_path='stress_response_correlation.svg'
)
Best Practices
Use descriptive filenames - Include species and component names
Consistent figure sizes - Use same dimensions for comparable plots
Save vector formats - Use SVG for publication figures
Document parameters - Note thresholds and highlighting used
Next Steps
examples/visualization_gallery - More visualization examples
Biological interpretation - Analyze visualized patterns
Export for further analysis - Use data in other tools