Usage Guide
===========

This guide covers the main plot types available in bioviz-kit, with working examples drawn from the actual codebase.

Installation
------------

Install from PyPI:

.. code-block:: bash

    pip install bioviz-kit

    # Or using `uv`:
    uv pip install bioviz-kit

Or install in development mode:

.. code-block:: bash

    git clone https://github.com/yourusername/bioviz-kit.git
    cd bioviz-kit
    pip install -e .

    # Or using `uv` in editable/development mode:
    uv pip install -e .

Core Concepts
-------------

bioviz-kit follows a consistent pattern for all plot types:

1. **Config classes** - Pydantic models that define all plot parameters
2. **Plotter classes** - Take data + config, produce matplotlib figures

This design provides:

- Type safety and validation via Pydantic
- IDE autocompletion for all parameters
- Sensible defaults for publication-ready output
- Easy serialization/deserialization of configurations

Kaplan-Meier Survival Plots
---------------------------

Kaplan-Meier (KM) plots are commonly used in clinical trials to visualize time-to-event data.

.. code-block:: python

    import pandas as pd
    from bioviz.configs import KMPlotConfig
    from bioviz.plots import KMPlotter

    # Prepare your survival data
    df = pd.DataFrame({
        "time": [5, 10, 15, 8, 12, 20],
        "event": [1, 0, 1, 1, 0, 1],
        "arm": ["Treatment", "Treatment", "Treatment", "Control", "Control", "Control"],
    })

    # Configure the plot
    config = KMPlotConfig(
        time_col="time",
        event_col="event",
        group_col="arm",
        title="Overall Survival",
        show_risktable=True,
        show_pvalue=True,
    )

    # Generate
    plotter = KMPlotter(df, config)
    fig, ax, pval = plotter.plot()

Key configuration options:

- ``show_risktable`` - Display number at risk below the plot
- ``show_pvalue`` - Show log-rank test p-value
- ``show_ci`` - Display confidence intervals
- ``legend_loc`` - Legend position: "bottom", "right", or "inside"
- ``color_dict`` - Map group names to colors

Volcano Plots
-------------

Volcano plots display statistical significance against effect size and are commonly used for differential expression analysis.

.. code-block:: python

    import pandas as pd
    from bioviz.configs import VolcanoConfig
    from bioviz.plots import VolcanoPlotter

    df = pd.DataFrame({
        "label": ["A", "B", "C", "D", "E"],
        "log2_or": [3.2, -2.5, 0.5, -4.2, 1.1],
        "p_adj": [0.01, 0.04, 0.5, 0.001, 0.2],
    })

    config = VolcanoConfig(
        x_col="log2_or",
        y_col="p_adj",
        label_col="label",
        log_transform_ycol=True,      # -log10 transform p-values
        label_mode="sig_and_thresh",  # label points meeting both thresholds
        color_mode="sig_and_thresh",  # color points meeting both thresholds
        y_col_thresh=0.05,            # significance threshold
        abs_x_thresh=2.0,             # effect size threshold (|x| >= 2)
    )

    plotter = VolcanoPlotter(df, config)
    fig, ax = plotter.plot()
    plotter.save("volcano.png")

Key configuration options:

- ``label_mode`` - Controls which points get labeled: "auto", "sig", "sig_and_thresh", "thresh", "sig_or_thresh", "all"
- ``color_mode`` - Controls which points get colored (same options)
- ``force_label_side_by_point_sign`` - Push labels outward based on point sign
- ``use_adjust_text`` - Enable adjustText library for label placement

Oncoplots
---------

Oncoplots (mutation landscapes) show the mutation status of genes across samples. The plotter requires column mappings (``x_col``, ``y_col``, ``value_col``, ``row_group_col``), plus an optional ``row_groups`` DataFrame and top annotations.

.. code-block:: python

    import pandas as pd
    from bioviz.configs import (
        OncoplotConfig,
        HeatmapAnnotationConfig,
        TopAnnotationConfig,
    )
    from bioviz.plots import OncoPlotter

    # Main mutation data
    df = pd.DataFrame({
        "Patient_ID": ["p1", "p1", "p2", "p2", "p1", "p2", "p2"],
        "Gene": ["TP53", "KRAS", "KRAS", "TP53", "PIK3CA", "PIK3CA", "PIK3CA"],
        "Variant_type": ["SNV", "SNV", "CNV", "Fusion", "SNV", "SNV", "CNV"],
        "Cohort": ["A", "A", "B", "B", "A", "B", "B"],
        "Dose": ["100 mg", "100 mg", "200 mg", "200 mg", "100 mg", "200 mg", "200 mg"],
    })

    # Row groups (pathway annotations) - index matches y_col values
    row_groups = pd.DataFrame({
        "Pathway": {
            "TP53": "Tumor Suppressor",
            "KRAS": "RAS Signaling",
            "PIK3CA": "PI3K/AKT",
        }
    }).rename_axis("Gene")

    # Pathway bar colors
    row_groups_color_dict = {
        "Tumor Suppressor": "#000000",
        "RAS Signaling": "#000000",
        "PI3K/AKT": "#000000",
    }

    # Top annotation series (must be indexed by x_col values)
    cohort_series = df.drop_duplicates("Patient_ID").set_index("Patient_ID")["Cohort"]
    dose_series = df.drop_duplicates("Patient_ID").set_index("Patient_ID")["Dose"]

    # Mutation type colors
    colors = {"SNV": "#1f77b4", "CNV": "#ff7f0e", "Fusion": "#2ca02c"}

    # Configure top annotations
    top_ann = TopAnnotationConfig(
        values=cohort_series,
        colors={"A": "#003975", "B": "#9d0ca2"},
        legend_title="Cohort",
        legend_value_order=["A", "B"],
        merge_labels=False,
        show_category_labels=False,
    )
    dose_ann = TopAnnotationConfig(
        values=dose_series,
        colors={"100 mg": "#007352", "200 mg": "#860F0F"},
        legend_title="Dose",
        legend_value_order=["100 mg", "200 mg"],
        merge_labels=False,
        show_category_labels=False,
    )

    # Configure heatmap cell rendering
    heat_ann = HeatmapAnnotationConfig(
        values="Variant_type",                  # column name or pd.Series
        colors=colors,
        bottom_left_triangle_values=["SNV"],    # render as bottom-left triangles
        upper_right_triangle_values=["CNV"],    # render as top-right triangles
        legend_title="Mutation Type",
        legend_value_order=["SNV", "CNV", "Fusion"],
    )

    # Main config with column mappings
    onc_cfg = OncoplotConfig(
        x_col="Patient_ID",
        y_col="Gene",
        value_col="Variant_type",
        row_group_col="Pathway",
        heatmap_annotation=heat_ann,
        top_annotations={"Cohort": top_ann, "Dose": dose_ann},
        top_annotation_order=["Cohort", "Dose"],
        legend_category_order=["Dose", "Cohort", "Mutation Type"],
    )

    # Create plotter with data, config, and row groups
    plotter = OncoPlotter(
        df,
        onc_cfg,
        row_groups=row_groups,
        row_groups_color_dict=row_groups_color_dict,
    )
    fig = plotter.plot()
    fig.savefig("oncoplot.pdf", bbox_inches="tight", pad_inches=0.1)

Key configuration options:

- ``x_col``, ``y_col``, ``value_col``, ``row_group_col`` - Required column mappings
- ``heatmap_annotation`` - Controls cell rendering (colors, triangles)
- ``top_annotations`` - Dict of annotation name → TopAnnotationConfig
- ``col_split_by`` / ``col_split_order`` - Split columns by a categorical variable
- ``row_group_order`` - Custom ordering for pathway/row-group bars

Forest Plots
------------

Forest plots visualize hazard ratios with confidence intervals from survival analysis.

.. code-block:: python

    import pandas as pd
    from bioviz.configs import ForestPlotConfig
    from bioviz.plots import ForestPlotter

    df = pd.DataFrame({
        "comparator": ["Age ≥65", "Age <65", "Male", "Female"],
        "hr": [1.2, 0.85, 1.1, 0.9],
        "ci_lower": [0.9, 0.6, 0.8, 0.65],
        "ci_upper": [1.6, 1.2, 1.5, 1.25],
        "p_value": [0.21, 0.35, 0.42, 0.48],
        "reference": ["<65", "<65", "Female", "Male"],
    })

    config = ForestPlotConfig(
        hr_col="hr",
        ci_lower_col="ci_lower",
        ci_upper_col="ci_upper",
        label_col="comparator",
        pvalue_col="p_value",
        reference_col="reference",
        show_reference_line=True,
        show_stats_table=True,
        xlabel="Hazard Ratio (95% CI)",
        log_scale=True,  # Standard for HR visualization
    )

    plotter = ForestPlotter(df, config)
    fig, ax = plotter.plot()

Key configuration options:

- ``log_scale`` - Use log scale for x-axis (standard for HR plots)
- ``show_stats_table`` - Show HR/CI/p-value table on right side
- ``show_reference_line`` - Vertical line at HR=1
- ``color_significant`` / ``color_nonsignificant`` - Colors by p-value
- ``variable_col`` - Group rows by variable for multi-section plots

Grouped Bar Charts
------------------

Grouped bar charts compare a value across categories and groups. Confidence intervals can be supplied pre-computed or calculated automatically.

.. code-block:: python

    import pandas as pd
    from bioviz.configs import GroupedBarConfig
    from bioviz.plots import GroupedBarPlotter

    # Pre-computed data with CIs
    df = pd.DataFrame({
        "Category": ["Gene A", "Gene A", "Gene B", "Gene B"],
        "Group": ["Baseline", "Progression", "Baseline", "Progression"],
        "value": [0.15, 0.35, 0.20, 0.45],
        "ci_low": [0.08, 0.25, 0.12, 0.35],
        "ci_high": [0.25, 0.48, 0.32, 0.58],
    })

    config = GroupedBarConfig(
        category_col="Category",
        group_col="Group",
        value_col="value",
        ci_low_col="ci_low",
        ci_high_col="ci_high",
        orientation="horizontal",  # or "vertical"
    )

    plotter = GroupedBarPlotter(df, config)
    fig, ax = plotter.plot()

For proportion data with automatic CI computation:

.. code-block:: python

    # The data passed to the plotter must contain the "count" and "total"
    # columns named below.
    config = GroupedBarConfig(
        category_col="Category",
        group_col="Group",
        value_col="value",
        k_col="count",                # numerator column
        n_col="total",                # denominator column
        ci_method="clopper-pearson",  # or "bootstrap"
    )

Key configuration options:

- ``orientation`` - "horizontal" (barh) or "vertical" (bar)
- ``ci_method`` - "clopper-pearson", "bootstrap", or "none"
- ``k_col`` / ``n_col`` - Columns for proportion CI computation
- ``alpha`` - Significance level for CI (0.05 = 95% CI)

Styling and Themes
------------------

Font sizes in bioviz-kit default to ``None``, which means they inherit from matplotlib's ``rcParams``. This makes it easy to apply global themes:

.. code-block:: python

    import matplotlib.pyplot as plt

    # Set global style
    plt.rcParams.update({
        "font.size": 12,
        "axes.labelsize": 14,
        "axes.titlesize": 16,
        "legend.fontsize": 11,
    })

    # All bioviz plots will now use these sizes

Saving Figures
--------------

Figures can be saved directly via the plotter or manually:

.. code-block:: python

    # Some plotters have a save() method
    plotter.save("figure.pdf")

    # Or save manually with custom settings
    fig.savefig("figure.pdf", dpi=300, bbox_inches="tight")

Examples
--------

See the ``examples/`` directory for complete, runnable examples:

- ``km_survival_example.py`` - Kaplan-Meier survival analysis with multiple variants
- ``volcano_smoke.py`` - Volcano plot variations (forced labels, adjust_text)
- ``oncoplot_example.py`` - Detailed oncoplot with pathway bars and top annotations
- ``minimal_bioviz_smoke.py`` - Line plots, tables, and oncoplots in one file
- ``distribution_examples.py`` - Histogram + boxplot combinations
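
As background for the ``ci_method="clopper-pearson"`` and ``alpha`` options described under Grouped Bar Charts, the sketch below shows how an exact (Clopper-Pearson) interval can be derived from a numerator ``k`` and denominator ``n`` using only the standard library. It is an illustration of the textbook definition, not bioviz-kit's implementation: the helper names ``binom_cdf`` and ``clopper_pearson`` are hypothetical, and the library may well compute the interval differently (for example via beta quantiles).

```python
import math


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect_decreasing(f, lo=0.0, hi=1.0, iters=100):
    """Find the root of a decreasing function f on [lo, hi] by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid  # root lies to the right of mid
        else:
            hi = mid  # root lies at or to the left of mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a proportion k/n."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("require n > 0 and 0 <= k <= n")
    half = alpha / 2
    # Lower bound: the p at which P(X >= k) equals alpha/2 (0 when k == 0).
    if k == 0:
        lower = 0.0
    else:
        lower = _bisect_decreasing(lambda p: half - (1 - binom_cdf(k - 1, n, p)))
    # Upper bound: the p at which P(X <= k) equals alpha/2 (1 when k == n).
    if k == n:
        upper = 1.0
    else:
        upper = _bisect_decreasing(lambda p: binom_cdf(k, n, p) - half)
    return lower, upper


# Hypothetical counts: 3 events out of 10 samples, 95% interval (alpha = 0.05).
low, high = clopper_pearson(3, 10)
print(f"proportion = {3 / 10:.2f}, 95% CI = ({low:.3f}, {high:.3f})")
```

A smaller ``alpha`` gives a wider interval, and the bounds are clamped to 0 and 1 when ``k`` is 0 or equal to ``n``, which is why exact intervals remain usable for very rare or very common events.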