Annotation Boost (Optional)

Overview

Annotation Boost is an advanced validation tool that enhances annotation confidence through multiple iterations of analysis. It's particularly useful for:

  • Validating low-confidence annotations
  • Getting detailed insights into specific cell clusters
  • Resolving ambiguous cell type assignments
  • Generating comprehensive validation reports

Quick Start

import CASSIA

# marker_data: the unfiltered marker data (data frame or file path)
# used in the initial batch analysis
CASSIA.runCASSIA_annotationboost(
    full_result_path = "batch_results_summary.csv",
    marker = marker_data,
    cluster_name = "CD4+ T cell",
    major_cluster_info = "Human PBMC",
    output_name = "Cluster1_report",
    model = "anthropic/claude-sonnet-4.5",
    provider = "openrouter",
)

Input

  • Full results CSV from CASSIA batch analysis (_summary.csv)
  • Original marker gene file (ideally the raw, unfiltered marker file)
  • Cluster context information
  • Specific cluster identifier
  • (Optional) Conversations JSON file from batch annotation (_conversations.json)
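The two batch outputs listed above differ only in their suffix (_summary.csv and _conversations.json). The following is a minimal sketch of deriving one path from the other, assuming that both files sit in the same directory and share a base name. The helper guess_conversations_path is hypothetical and not part of CASSIA; the package's own auto-detection may behave differently.

```python
from pathlib import Path
from typing import Optional

SUMMARY_SUFFIX = "_summary.csv"
CONVERSATIONS_SUFFIX = "_conversations.json"


def guess_conversations_path(full_result_path: str) -> Optional[str]:
    """Return the sibling conversations JSON path for a summary CSV path.

    Hypothetical helper: assumes the only difference between the two file
    names is the suffix. Returns None when the input does not end in
    "_summary.csv", since no sibling name can be derived in that case.
    """
    path = Path(full_result_path)
    if not path.name.endswith(SUMMARY_SUFFIX):
        return None
    base = path.name[: -len(SUMMARY_SUFFIX)]
    return str(path.with_name(base + CONVERSATIONS_SUFFIX))


print(guess_conversations_path("batch_results_summary.csv"))
```

Checking that the derived path exists before a run makes it easier to tell whether prior conversation history will actually be available.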

Parameters

Required Parameters

| Parameter | Description |
|---|---|
| full_result_path | Path to the CASSIA results CSV file (_summary.csv) |
| marker | Marker gene data (data frame or path). Use the same marker data as the initial analysis (do not filter) |
| cluster_name | Exact name of the target cluster to validate |
| major_cluster_info | Context about the dataset (e.g., "Human PBMC", "Mouse Brain") |
| output_name | Base name for the output validation report |

Optional Parameters

| Parameter | Default | Description |
|---|---|---|
| num_iterations | 5 | Number of validation rounds |
| model | - | LLM model to use. Recommended: anthropic/claude-sonnet-4.5 or better |
| provider | - | API provider for the model |
| conversations_json_path | "auto" | Path to the conversations JSON file, or "auto" to auto-detect from full_result_path (e.g., batch_summary.csv → batch_conversations.json) |
| conversation_history_mode | "full" | How to use prior conversation history: "full", "final", or "none" |
| search_strategy | "breadth" | Strategy for exploring hypotheses: "breadth" or "depth" |
| report_style | "per_iteration" | Format of the final report: "per_iteration" or "total_summary" |
| reasoning | - | Reasoning effort level: "low", "medium", "high". Only supported by OpenAI GPT-5 series models |
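Several optional parameters accept only a fixed set of string values, so a typo can be caught before any API call is made. The sketch below checks a set of options against the values documented above. The helper check_boost_options is hypothetical and not part of CASSIA, and the requirement that num_iterations be a positive integer is an assumption rather than documented behavior.

```python
# Allowed values as listed in the optional parameters table.
ALLOWED_OPTIONS = {
    "conversation_history_mode": {"full", "final", "none"},
    "search_strategy": {"breadth", "depth"},
    "report_style": {"per_iteration", "total_summary"},
    "reasoning": {"low", "medium", "high"},
}


def check_boost_options(**options):
    """Raise ValueError for undocumented option values; return the options.

    Hypothetical pre-flight check. Parameters without a fixed value set
    (model, provider, conversations_json_path) pass through unchecked.
    """
    for name, value in options.items():
        allowed = ALLOWED_OPTIONS.get(name)
        if allowed is not None and value not in allowed:
            raise ValueError(f"{name}={value!r} is not one of {sorted(allowed)}")
    rounds = options.get("num_iterations")
    # Assumption: the number of validation rounds must be a positive integer.
    if rounds is not None and (not isinstance(rounds, int) or rounds < 1):
        raise ValueError("num_iterations must be a positive integer")
    return options


optional_args = check_boost_options(
    num_iterations=5,
    conversation_history_mode="final",
    search_strategy="depth",
    report_style="total_summary",
)
# The checked options could then be unpacked into the call alongside the
# required parameters, e.g. runCASSIA_annotationboost(..., **optional_args).
```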

Output

The analysis generates the following output files:

  • {output_name}_summary.html: HTML report with detailed analysis results and visualizations.
  • {output_name}_raw_conversation.txt: Raw conversation text containing the full analysis dialogue.
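After a run, the two file names above can be derived from output_name and checked on disk. This is a minimal sketch. The helpers expected_outputs and missing_outputs are hypothetical, and it is an assumption that the files are written relative to the directory passed in (the current working directory by default).

```python
from pathlib import Path
from typing import List


def expected_outputs(output_name: str) -> List[str]:
    """Build the two documented output file names from the base name."""
    return [
        f"{output_name}_summary.html",
        f"{output_name}_raw_conversation.txt",
    ]


def missing_outputs(output_name: str, directory: str = ".") -> List[str]:
    """List the expected output files that do not exist in `directory`."""
    return [
        name
        for name in expected_outputs(output_name)
        if not (Path(directory) / name).exists()
    ]


print(expected_outputs("Cluster1_report"))
```

An empty list from missing_outputs suggests that both the HTML report and the raw conversation text were written.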