Usage Guide

This guide covers the main plot types available in bioviz-kit with working examples drawn from the actual codebase.

Installation

Install from PyPI:

pip install bioviz-kit

# Or using `uv`:

uv pip install bioviz-kit

Or install in development mode:

git clone https://github.com/yourusername/bioviz-kit.git
cd bioviz-kit
pip install -e .

# Or using `uv` in editable/development mode:

uv pip install -e .

Core Concepts

bioviz-kit follows a consistent pattern for all plot types:

Config classes - Pydantic models that define all plot parameters
Plotter classes - Take data + config, produce matplotlib figures

This design provides:

Type safety and validation via Pydantic
IDE autocompletion for all parameters
Sensible defaults for publication-ready output
Easy serialization/deserialization of configurations
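
As a rough illustration of this pattern, here is a minimal sketch of a config/plotter pair. ExampleConfig and ExamplePlotter are hypothetical names invented for this example, not bioviz-kit classes, and the sketch assumes Pydantic v2 (model_dump_json / model_validate_json).

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class ExampleConfig(BaseModel):
    """Hypothetical config: every plot parameter is a typed, validated field."""

    x_col: str
    y_col: str
    title: Optional[str] = None  # optional, with a default
    show_legend: bool = True


class ExamplePlotter:
    """Hypothetical plotter: holds the data plus a config."""

    def __init__(self, data, config: ExampleConfig):
        self.data = data
        self.config = config

    def describe(self) -> str:
        # A real plotter would build a matplotlib figure here.
        return f"{self.config.y_col} vs {self.config.x_col}"


config = ExampleConfig(x_col="time", y_col="response", title="Demo")

# Round-trip the configuration through JSON (Pydantic v2 API).
payload = config.model_dump_json()
restored = ExampleConfig.model_validate_json(payload)

# Invalid input is rejected when the config is constructed.
try:
    ExampleConfig(x_col="time")  # y_col is missing
    rejected = False
except ValidationError:
    rejected = True

plotter = ExamplePlotter(data=[], config=restored)
```

Keeping parameters in a serializable config means a figure's settings can be stored next to the data and reloaded later to regenerate the same plot.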

Kaplan-Meier Survival Plots

KM plots are commonly used in clinical trials to visualize time-to-event data.

import pandas as pd
from bioviz.configs import KMPlotConfig
from bioviz.plots import KMPlotter

# Prepare your survival data
df = pd.DataFrame({
    "time": [5, 10, 15, 8, 12, 20],
    "event": [1, 0, 1, 1, 0, 1],
    "arm": ["Treatment", "Treatment", "Treatment",
            "Control", "Control", "Control"],
})

# Configure the plot
config = KMPlotConfig(
    time_col="time",
    event_col="event",
    group_col="arm",
    title="Overall Survival",
    show_risktable=True,
    show_pvalue=True,
)

# Generate
plotter = KMPlotter(df, config)
fig, ax, pval = plotter.plot()

Key configuration options:

show_risktable - Display number at risk below the plot
show_pvalue - Show log-rank test p-value
show_ci - Display confidence intervals
legend_loc - Legend position: "bottom", "right", or "inside"
color_dict - Map group names to colors
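
To make the curve and the risk table concrete, the sketch below computes the standard Kaplan-Meier (product-limit) estimate by hand for the two arms in the example data. It is a plain-Python illustration of the statistic, not bioviz-kit's internal implementation, and km_curve is a hypothetical helper name.

```python
from collections import Counter


def km_curve(times, events):
    """Return [(time, n_at_risk, survival)] at each distinct event time."""
    # Count events (event == 1) per time point; censored rows (0) are skipped.
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    survival = 1.0
    curve = []
    for t in sorted(deaths):
        # Subjects still under observation just before time t.
        n_at_risk = sum(1 for u in times if u >= t)
        survival *= 1 - deaths[t] / n_at_risk
        curve.append((t, n_at_risk, survival))
    return curve


# Same values as the "Treatment" and "Control" rows in the DataFrame above.
treatment = km_curve([5, 10, 15], [1, 0, 1])
control = km_curve([8, 12, 20], [1, 0, 1])

for t, n, s in treatment:
    print(f"t={t:>2}  at risk={n}  S(t)={s:.3f}")
```

Censored observations (event = 0) never lower the curve on their own; they only shrink the number at risk for later event times.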

Volcano Plots

Volcano plots display statistical significance vs effect size, commonly used for differential expression analysis.

import pandas as pd
from bioviz.configs import VolcanoConfig
from bioviz.plots import VolcanoPlotter

df = pd.DataFrame({
    "label": ["A", "B", "C", "D", "E"],
    "log2_or": [3.2, -2.5, 0.5, -4.2, 1.1],
    "p_adj": [0.01, 0.04, 0.5, 0.001, 0.2],
})

config = VolcanoConfig(
    x_col="log2_or",
    y_col="p_adj",
    label_col="label",
    log_transform_ycol=True,        # -log10 transform p-values
    label_mode="sig_and_thresh",    # label points meeting both thresholds
    color_mode="sig_and_thresh",    # color points meeting both thresholds
    y_col_thresh=0.05,              # significance threshold
    abs_x_thresh=2.0,               # effect size threshold (|x| >= 2)
)

plotter = VolcanoPlotter(df, config)
fig, ax = plotter.plot()
plotter.save("volcano.png")

Key configuration options:

label_mode - Controls which points get labeled: "auto", "sig", "sig_and_thresh", "thresh", "sig_or_thresh", "all"
color_mode - Controls which points get colored (same options)
force_label_side_by_point_sign - Push labels outward based on point sign
use_adjust_text - Enable adjustText library for label placement
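
The threshold logic behind "sig_and_thresh" can be sketched in plain Python using the example data. This assumes "sig" means p < y_col_thresh (strict) and "thresh" means |x| >= abs_x_thresh; the exact comparison operators used by the library are an assumption here.

```python
import math

# (label, log2_or, p_adj) rows from the example DataFrame.
points = [
    ("A", 3.2, 0.01),
    ("B", -2.5, 0.04),
    ("C", 0.5, 0.5),
    ("D", -4.2, 0.001),
    ("E", 1.1, 0.2),
]
Y_COL_THRESH = 0.05  # significance threshold on the raw p-value
ABS_X_THRESH = 2.0   # effect size threshold on |x|


def classify(x, p):
    """Return (is_significant, passes_effect_threshold) for one point."""
    sig = p < Y_COL_THRESH        # assumption: strict inequality
    big = abs(x) >= ABS_X_THRESH
    return sig, big


# The y-axis position after the -log10 transform.
neg_log10_p = {label: -math.log10(p) for label, _, p in points}

# Points that would be labeled/colored under each mode.
sig_and_thresh = [lab for lab, x, p in points if all(classify(x, p))]
sig_or_thresh = [lab for lab, x, p in points if any(classify(x, p))]

print(sig_and_thresh)
```

After the -log10 transform, smaller p-values sit higher on the y-axis, so the highlighted points end up in the upper-left and upper-right corners of the plot.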

Oncoplots

Oncoplots (mutation landscapes) show the mutation status of genes across samples. The plotter requires column mappings (x_col, y_col, value_col, row_group_col) and accepts an optional row_groups DataFrame and optional top annotations.

import pandas as pd
from bioviz.configs import (
    OncoplotConfig,
    HeatmapAnnotationConfig,
    TopAnnotationConfig,
)
from bioviz.plots import OncoPlotter

# Main mutation data
df = pd.DataFrame({
    "Patient_ID": ["p1", "p1", "p2", "p2", "p1", "p2", "p2"],
    "Gene": ["TP53", "KRAS", "KRAS", "TP53", "PIK3CA", "PIK3CA", "PIK3CA"],
    "Variant_type": ["SNV", "SNV", "CNV", "Fusion", "SNV", "SNV", "CNV"],
    "Cohort": ["A", "A", "B", "B", "A", "B", "B"],
    "Dose": ["100 mg", "100 mg", "200 mg", "200 mg", "100 mg", "200 mg", "200 mg"],
})

# Row groups (pathway annotations) - index matches y_col values
row_groups = pd.DataFrame({
    "Pathway": {
        "TP53": "Tumor Suppressor",
        "KRAS": "RAS Signaling",
        "PIK3CA": "PI3K/AKT",
    }
}).rename_axis("Gene")

# Pathway bar colors
row_groups_color_dict = {
    "Tumor Suppressor": "#000000",
    "RAS Signaling": "#000000",
    "PI3K/AKT": "#000000",
}

# Top annotation series (must be indexed by x_col values)
cohort_series = df.drop_duplicates("Patient_ID").set_index("Patient_ID")["Cohort"]
dose_series = df.drop_duplicates("Patient_ID").set_index("Patient_ID")["Dose"]

# Mutation type colors
colors = {"SNV": "#1f77b4", "CNV": "#ff7f0e", "Fusion": "#2ca02c"}

# Configure top annotations
top_ann = TopAnnotationConfig(
    values=cohort_series,
    colors={"A": "#003975", "B": "#9d0ca2"},
    legend_title="Cohort",
    legend_value_order=["A", "B"],
    merge_labels=False,
    show_category_labels=False,
)

dose_ann = TopAnnotationConfig(
    values=dose_series,
    colors={"100 mg": "#007352", "200 mg": "#860F0F"},
    legend_title="Dose",
    legend_value_order=["100 mg", "200 mg"],
    merge_labels=False,
    show_category_labels=False,
)

# Configure heatmap cell rendering
heat_ann = HeatmapAnnotationConfig(
    values="Variant_type",  # column name or pd.Series
    colors=colors,
    bottom_left_triangle_values=["SNV"],   # render as bottom-left triangles
    upper_right_triangle_values=["CNV"],   # render as top-right triangles
    legend_title="Mutation Type",
    legend_value_order=["SNV", "CNV", "Fusion"],
)

# Main config with column mappings
onc_cfg = OncoplotConfig(
    x_col="Patient_ID",
    y_col="Gene",
    value_col="Variant_type",
    row_group_col="Pathway",
    heatmap_annotation=heat_ann,
    top_annotations={"Cohort": top_ann, "Dose": dose_ann},
    top_annotation_order=["Cohort", "Dose"],
    legend_category_order=["Dose", "Cohort", "Mutation Type"],
)

# Create plotter with data, config, and row groups
plotter = OncoPlotter(
    df,
    onc_cfg,
    row_groups=row_groups,
    row_groups_color_dict=row_groups_color_dict,
)
fig = plotter.plot()
fig.savefig("oncoplot.pdf", bbox_inches="tight", pad_inches=0.1)

Key configuration options:

x_col, y_col, value_col, row_group_col - Required column mappings
heatmap_annotation - Controls cell rendering (colors, triangles)
top_annotations - Dict of annotation name → TopAnnotationConfig
col_split_by / col_split_order - Split columns by a categorical variable
row_group_order - Custom ordering for pathway/row-group bars
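
One way to see why triangle rendering is useful is to group the long-format records into grid cells. The sketch below does this in plain Python for the example data; reading triangles as a way for two variant types to share a cell is an assumption about the intended use, not something confirmed by the library.

```python
from collections import defaultdict

# (Patient_ID, Gene, Variant_type) rows from the example DataFrame.
records = [
    ("p1", "TP53", "SNV"),
    ("p1", "KRAS", "SNV"),
    ("p2", "KRAS", "CNV"),
    ("p2", "TP53", "Fusion"),
    ("p1", "PIK3CA", "SNV"),
    ("p2", "PIK3CA", "SNV"),
    ("p2", "PIK3CA", "CNV"),
]

# One grid cell per (x_col, y_col) pair, holding every variant seen there.
cells = defaultdict(list)
for patient, gene, variant in records:
    cells[(patient, gene)].append(variant)

# Cells with more than one variant need a way to show both at once.
multi = {key: variants for key, variants in cells.items() if len(variants) > 1}
print(multi)
```

In this data only patient p2's PIK3CA cell carries two variants (SNV and CNV), which are the two values assigned to the bottom-left and upper-right triangles in the config above.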

Forest Plots

Forest plots visualize hazard ratios with confidence intervals from survival analysis.

import pandas as pd
from bioviz.configs import ForestPlotConfig
from bioviz.plots import ForestPlotter

df = pd.DataFrame({
    "comparator": ["Age ≥65", "Age <65", "Male", "Female"],
    "hr": [1.2, 0.85, 1.1, 0.9],
    "ci_lower": [0.9, 0.6, 0.8, 0.65],
    "ci_upper": [1.6, 1.2, 1.5, 1.25],
    "p_value": [0.21, 0.35, 0.42, 0.48],
    # Each row's reference is the opposite level of the same variable
    "reference": ["<65", "≥65", "Female", "Male"],
})

config = ForestPlotConfig(
    hr_col="hr",
    ci_lower_col="ci_lower",
    ci_upper_col="ci_upper",
    label_col="comparator",
    pvalue_col="p_value",
    reference_col="reference",
    show_reference_line=True,
    show_stats_table=True,
    xlabel="Hazard Ratio (95% CI)",
    log_scale=True,  # Standard for HR visualization
)

plotter = ForestPlotter(df, config)
fig, ax = plotter.plot()

Key configuration options:

log_scale - Use log scale for x-axis (standard for HR plots)
show_stats_table - Show HR/CI/p-value table on right side
show_reference_line - Vertical line at HR=1
color_significant / color_nonsignificant - Colors by p-value
variable_col - Group rows by variable for multi-section plots
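
The sketch below illustrates, in plain Python, why a log scale is the usual choice for hazard ratios and how a stats-table entry might be formatted. format_hr and crosses_null are hypothetical helpers written for this example; the library's own table formatting may differ.

```python
import math

# (label, hr, ci_lower, ci_upper) rows from the example DataFrame.
rows = [
    ("Age ≥65", 1.2, 0.9, 1.6),
    ("Age <65", 0.85, 0.6, 1.2),
    ("Male", 1.1, 0.8, 1.5),
    ("Female", 0.9, 0.65, 1.25),
]


def format_hr(hr, lo, hi):
    """Format one row as 'HR (lower-upper)' with two decimals."""
    return f"{hr:.2f} ({lo:.2f}-{hi:.2f})"


def crosses_null(lo, hi):
    """True when the interval contains HR = 1 (no effect)."""
    return lo <= 1.0 <= hi


# On a log axis, halving and doubling the hazard sit equally far from HR = 1.
log_half = math.log(0.5)
log_double = math.log(2.0)

table = [(label, format_hr(hr, lo, hi), crosses_null(lo, hi))
         for label, hr, lo, hi in rows]
for label, text, null in table:
    print(f"{label:<8} {text}  crosses HR=1: {null}")
```

Every interval in the example data contains 1, which is consistent with all of the listed p-values being above 0.05.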

Grouped Bar Charts

Grouped bar charts show one bar per category and group, with confidence intervals either supplied as columns or computed automatically.

import pandas as pd
from bioviz.configs import GroupedBarConfig
from bioviz.plots import GroupedBarPlotter

# Pre-computed data with CIs
df = pd.DataFrame({
    "Category": ["Gene A", "Gene A", "Gene B", "Gene B"],
    "Group": ["Baseline", "Progression", "Baseline", "Progression"],
    "value": [0.15, 0.35, 0.20, 0.45],
    "ci_low": [0.08, 0.25, 0.12, 0.35],
    "ci_high": [0.25, 0.48, 0.32, 0.58],
})

config = GroupedBarConfig(
    category_col="Category",
    group_col="Group",
    value_col="value",
    ci_low_col="ci_low",
    ci_high_col="ci_high",
    orientation="horizontal",  # or "vertical"
)

plotter = GroupedBarPlotter(df, config)
fig, ax = plotter.plot()

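For proportion data, CIs can be computed automatically. The DataFrame must then also contain the numerator and denominator columns named by k_col and n_col (here "count" and "total"):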

config = GroupedBarConfig(
    category_col="Category",
    group_col="Group",
    value_col="value",
    k_col="count",      # numerator column
    n_col="total",      # denominator column
    ci_method="clopper-pearson",  # or "bootstrap"
)

Key configuration options:

orientation - "horizontal" (barh) or "vertical" (bar)
ci_method - "clopper-pearson", "bootstrap", or "none"
k_col / n_col - Columns for proportion CI computation
alpha - Significance level for CI (0.05 = 95% CI)
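
For reference, the Clopper-Pearson (exact binomial) interval for k successes out of n can be computed with the standard library alone. The sketch below inverts the binomial tail probabilities by bisection; it illustrates the method named by ci_method and is not taken from bioviz-kit, which may well use a different routine (for example a beta-distribution quantile).

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_increasing(f, target, iters=100):
    """Bisection on [0, 1] for f(p) = target, where f is increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for a binomial proportion."""
    # Lower bound: P(X >= k | p) = alpha / 2; defined as 0 when k == 0.
    lower = 0.0 if k == 0 else solve_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper bound: P(X <= k | p) = alpha / 2; defined as 1 when k == n.
    upper = 1.0 if k == n else solve_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper


low, high = clopper_pearson(3, 20)  # 3 events out of 20, 95% CI
print(f"{3 / 20:.2f} [{low:.3f}, {high:.3f}]")
```

The interval is asymmetric around the observed proportion, which is why the ci_low and ci_high columns are kept separate rather than stored as a single margin of error.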

Styling and Themes

Font sizes in bioviz-kit default to None, which means they inherit from matplotlib’s rcParams. This makes it easy to apply global themes:

import matplotlib.pyplot as plt

# Set global style
plt.rcParams.update({
    "font.size": 12,
    "axes.labelsize": 14,
    "axes.titlesize": 16,
    "legend.fontsize": 11,
})

# All bioviz plots will now use these sizes
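
If the theme should apply to a single figure rather than the whole session, matplotlib's rc_context can scope the same settings to a with-block. In this sketch a plain matplotlib figure stands in for a bioviz-kit plot; a plotter would need to be called inside the block to inherit the sizes.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, safe for scripts
import matplotlib.pyplot as plt

theme = {
    "font.size": 12,
    "axes.labelsize": 14,
    "axes.titlesize": 16,
    "legend.fontsize": 11,
}

before = plt.rcParams["font.size"]

# Settings apply only to artists created inside the with-block.
with plt.rc_context(theme):
    fig, ax = plt.subplots()
    ax.set_title("Scoped theme")
    ax.set_xlabel("x")
    title_size = ax.title.get_fontsize()
    label_size = ax.xaxis.label.get_fontsize()

# The global rcParams are restored when the block exits.
after = plt.rcParams["font.size"]
plt.close(fig)
```

This keeps one figure's styling from leaking into other plots produced later in the same script or notebook.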

Saving Figures

Figures can be saved directly via the plotter or manually:

# Some plotters have a save() method
plotter.save("figure.pdf")

# Or save manually with custom settings
fig.savefig("figure.pdf", dpi=300, bbox_inches="tight")

Examples

See the examples/ directory for complete, runnable examples:

km_survival_example.py - Kaplan-Meier survival analysis with multiple variants
volcano_smoke.py - Volcano plot variations (forced labels, adjust_text)
oncoplot_example.py - Detailed oncoplot with pathway bars and top annotations
minimal_bioviz_smoke.py - Line plots, tables, and oncoplots in one file
distribution_examples.py - Histogram + boxplot combinations