Usage Guide

This guide covers the main plot types available in bioviz-kit with working examples drawn from the actual codebase.

Installation

Install from PyPI:

pip install bioviz-kit

 # Or using `uv`:

 uv install bioviz-kit

Or install in development mode:

git clone https://github.com/yourusername/bioviz-kit.git
cd bioviz-kit
pip install -e .

 # Or using `uv` in editable/development mode:

 uv install -e .

Core Concepts

bioviz-kit follows a consistent pattern for all plot types:

  1. Config classes - Pydantic models that define all plot parameters

  2. Plotter classes - Take data + config, produce matplotlib figures

This design provides:

  • Type safety and validation via Pydantic

  • IDE autocompletion for all parameters

  • Sensible defaults for publication-ready output

  • Easy serialization/deserialization of configurations

Kaplan-Meier Survival Plots

KM plots are commonly used in clinical trials to visualize time-to-event data.

import pandas as pd
from bioviz.configs import KMPlotConfig
from bioviz.plots import KMPlotter

# Prepare your survival data
df = pd.DataFrame({
    "time": [5, 10, 15, 8, 12, 20],
    "event": [1, 0, 1, 1, 0, 1],
    "arm": ["Treatment", "Treatment", "Treatment",
            "Control", "Control", "Control"],
})

# Configure the plot
config = KMPlotConfig(
    time_col="time",
    event_col="event",
    group_col="arm",
    title="Overall Survival",
    show_risktable=True,
    show_pvalue=True,
)

# Generate
plotter = KMPlotter(df, config)
fig, ax, pval = plotter.plot()

Key configuration options:

  • show_risktable - Display number at risk below the plot

  • show_pvalue - Show log-rank test p-value

  • show_ci - Display confidence intervals

  • legend_loc - Legend position: “bottom”, “right”, or “inside”

  • color_dict - Map group names to colors

Volcano Plots

Volcano plots display statistical significance vs effect size, commonly used for differential expression analysis.

import pandas as pd
from bioviz.configs import VolcanoConfig
from bioviz.plots import VolcanoPlotter

df = pd.DataFrame({
    "label": ["A", "B", "C", "D", "E"],
    "log2_or": [3.2, -2.5, 0.5, -4.2, 1.1],
    "p_adj": [0.01, 0.04, 0.5, 0.001, 0.2],
})

config = VolcanoConfig(
    x_col="log2_or",
    y_col="p_adj",
    label_col="label",
    log_transform_ycol=True,        # -log10 transform p-values
    label_mode="sig_and_thresh",    # label points meeting both thresholds
    color_mode="sig_and_thresh",    # color points meeting both thresholds
    y_col_thresh=0.05,              # significance threshold
    abs_x_thresh=2.0,               # effect size threshold (|x| >= 2)
)

plotter = VolcanoPlotter(df, config)
fig, ax = plotter.plot()
plotter.save("volcano.png")

Key configuration options:

  • label_mode - Controls which points get labeled: “auto”, “sig”, “sig_and_thresh”, “thresh”, “sig_or_thresh”, “all”

  • color_mode - Controls which points get colored (same options)

  • force_label_side_by_point_sign - Push labels outward based on point sign

  • use_adjust_text - Enable adjustText library for label placement

Oncoplots

Oncoplots (mutation landscapes) show the mutation status of genes across samples. The plotter requires column mappings (x_col, y_col, value_col, row_group_col) plus optional row_groups DataFrame and top annotations.

import pandas as pd
from bioviz.configs import (
    OncoplotConfig,
    HeatmapAnnotationConfig,
    TopAnnotationConfig,
)
from bioviz.plots import OncoPlotter

# Main mutation data
df = pd.DataFrame({
    "Patient_ID": ["p1", "p1", "p2", "p2", "p1", "p2", "p2"],
    "Gene": ["TP53", "KRAS", "KRAS", "TP53", "PIK3CA", "PIK3CA", "PIK3CA"],
    "Variant_type": ["SNV", "SNV", "CNV", "Fusion", "SNV", "SNV", "CNV"],
    "Cohort": ["A", "A", "B", "B", "A", "B", "B"],
    "Dose": ["100 mg", "100 mg", "200 mg", "200 mg", "100 mg", "200 mg", "200 mg"],
})

# Row groups (pathway annotations) - index matches y_col values
row_groups = pd.DataFrame({
    "Pathway": {
        "TP53": "Tumor Suppressor",
        "KRAS": "RAS Signaling",
        "PIK3CA": "PI3K/AKT",
    }
}).rename_axis("Gene")

# Pathway bar colors
row_groups_color_dict = {
    "Tumor Suppressor": "#000000",
    "RAS Signaling": "#000000",
    "PI3K/AKT": "#000000",
}

# Top annotation series (must be indexed by x_col values)
cohort_series = df.drop_duplicates("Patient_ID").set_index("Patient_ID")["Cohort"]
dose_series = df.drop_duplicates("Patient_ID").set_index("Patient_ID")["Dose"]

# Mutation type colors
colors = {"SNV": "#1f77b4", "CNV": "#ff7f0e", "Fusion": "#2ca02c"}

# Configure top annotations
top_ann = TopAnnotationConfig(
    values=cohort_series,
    colors={"A": "#003975", "B": "#9d0ca2"},
    legend_title="Cohort",
    legend_value_order=["A", "B"],
    merge_labels=False,
    show_category_labels=False,
)

dose_ann = TopAnnotationConfig(
    values=dose_series,
    colors={"100 mg": "#007352", "200 mg": "#860F0F"},
    legend_title="Dose",
    legend_value_order=["100 mg", "200 mg"],
    merge_labels=False,
    show_category_labels=False,
)

# Configure heatmap cell rendering
heat_ann = HeatmapAnnotationConfig(
    values="Variant_type",  # column name or pd.Series
    colors=colors,
    bottom_left_triangle_values=["SNV"],   # render as bottom-left triangles
    upper_right_triangle_values=["CNV"],   # render as top-right triangles
    legend_title="Mutation Type",
    legend_value_order=["SNV", "CNV", "Fusion"],
)

# Main config with column mappings
onc_cfg = OncoplotConfig(
    x_col="Patient_ID",
    y_col="Gene",
    value_col="Variant_type",
    row_group_col="Pathway",
    heatmap_annotation=heat_ann,
    top_annotations={"Cohort": top_ann, "Dose": dose_ann},
    top_annotation_order=["Cohort", "Dose"],
    legend_category_order=["Dose", "Cohort", "Mutation Type"],
)

# Create plotter with data, config, and row groups
plotter = OncoPlotter(
    df,
    onc_cfg,
    row_groups=row_groups,
    row_groups_color_dict=row_groups_color_dict,
)
fig = plotter.plot()
fig.savefig("oncoplot.pdf", bbox_inches="tight", pad_inches=0.1)

Key configuration options:

  • x_col, y_col, value_col, row_group_col - Required column mappings

  • heatmap_annotation - Controls cell rendering (colors, triangles)

  • top_annotations - Dict of annotation name → TopAnnotationConfig

  • col_split_by / col_split_order - Split columns by a categorical variable

  • row_group_order - Custom ordering for pathway/row-group bars

Forest Plots

Forest plots visualize hazard ratios with confidence intervals from survival analysis.

import pandas as pd
from bioviz.configs import ForestPlotConfig
from bioviz.plots import ForestPlotter

df = pd.DataFrame({
    "comparator": ["Age ≥65", "Age <65", "Male", "Female"],
    "hr": [1.2, 0.85, 1.1, 0.9],
    "ci_lower": [0.9, 0.6, 0.8, 0.65],
    "ci_upper": [1.6, 1.2, 1.5, 1.25],
    "p_value": [0.21, 0.35, 0.42, 0.48],
    "reference": ["<65", "<65", "Female", "Male"],
})

config = ForestPlotConfig(
    hr_col="hr",
    ci_lower_col="ci_lower",
    ci_upper_col="ci_upper",
    label_col="comparator",
    pvalue_col="p_value",
    reference_col="reference",
    show_reference_line=True,
    show_stats_table=True,
    xlabel="Hazard Ratio (95% CI)",
    log_scale=True,  # Standard for HR visualization
)

plotter = ForestPlotter(df, config)
fig, ax = plotter.plot()

Key configuration options:

  • log_scale - Use log scale for x-axis (standard for HR plots)

  • show_stats_table - Show HR/CI/p-value table on right side

  • show_reference_line - Vertical line at HR=1

  • color_significant / color_nonsignificant - Colors by p-value

  • variable_col - Group rows by variable for multi-section plots

Grouped Bar Charts

Grouped bar charts with automatic confidence interval calculation.

import pandas as pd
from bioviz.configs import GroupedBarConfig
from bioviz.plots import GroupedBarPlotter

# Pre-computed data with CIs
df = pd.DataFrame({
    "Category": ["Gene A", "Gene A", "Gene B", "Gene B"],
    "Group": ["Baseline", "Progression", "Baseline", "Progression"],
    "value": [0.15, 0.35, 0.20, 0.45],
    "ci_low": [0.08, 0.25, 0.12, 0.35],
    "ci_high": [0.25, 0.48, 0.32, 0.58],
})

config = GroupedBarConfig(
    category_col="Category",
    group_col="Group",
    value_col="value",
    ci_low_col="ci_low",
    ci_high_col="ci_high",
    orientation="horizontal",  # or "vertical"
)

plotter = GroupedBarPlotter(df, config)
fig, ax = plotter.plot()

For proportion data with automatic CI computation:

config = GroupedBarConfig(
    category_col="Category",
    group_col="Group",
    value_col="value",
    k_col="count",      # numerator column
    n_col="total",      # denominator column
    ci_method="clopper-pearson",  # or "bootstrap"
)

Key configuration options:

  • orientation - “horizontal” (barh) or “vertical” (bar)

  • ci_method - “clopper-pearson”, “bootstrap”, or “none”

  • k_col / n_col - Columns for proportion CI computation

  • alpha - Significance level for CI (0.05 = 95% CI)

Styling and Themes

Font sizes in bioviz-kit default to None, which means they inherit from matplotlib’s rcParams. This makes it easy to apply global themes:

import matplotlib.pyplot as plt

# Set global style
plt.rcParams.update({
    "font.size": 12,
    "axes.labelsize": 14,
    "axes.titlesize": 16,
    "legend.fontsize": 11,
})

# All bioviz plots will now use these sizes

Saving Figures

Figures can be saved directly via the plotter or manually:

# Some plotters have a save() method
plotter.save("figure.pdf")

# Or save manually with custom settings
fig.savefig("figure.pdf", dpi=300, bbox_inches="tight")

Examples

See the examples/ directory for complete, runnable examples:

  • km_survival_example.py - Kaplan-Meier survival analysis with multiple variants

  • volcano_smoke.py - Volcano plot variations (forced labels, adjust_text)

  • oncoplot_example.py - Detailed oncoplot with pathway bars and top annotations

  • minimal_bioviz_smoke.py - Line plots, tables, and oncoplots in one file

  • distribution_examples.py - Histogram + boxplot combinations