Introduction

This document describes the output produced by the pipeline. The directories listed below will be created in the results directory after the pipeline has finished. All paths are relative to the top-level results directory.

Pipeline overview

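The pipeline is built using Nextflow and processes data using the following steps:
  • Illumination Correction: per-channel, per-plate illumination correction functions
  • Assay Development: segmentation of a single site per well for visual QC
  • Analysis: full morphological feature extraction on every site
  • CytoTable: conversion of CellProfiler CSV outputs to Parquet
  • Pipeline information: Nextflow reports and software versions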

Illumination Correction

Output files
  • cellprofiler/illumination_correction/
    • <batch>_<plate>_<channel>/: One subdirectory per channel and plate combination
      • <plate>_Illum<channel>.npy: NumPy array containing the illumination correction function

CellProfiler computes per-channel, per-plate illumination correction functions using the CorrectIlluminationCalculate module. These .npy files are used as inputs to both the assay development and analysis steps to normalize uneven illumination across the field of view.
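As a minimal sketch of how one of these functions might be inspected or applied outside CellProfiler, the snippet below loads a .npy file with NumPy and divides an image by it. Division is an assumption: it is one of the methods offered by CellProfiler's CorrectIlluminationApply module, so check the pipeline's own configuration for the method actually used. The function names are hypothetical and not part of the pipeline.

```python
import numpy as np


def load_illum_function(path):
    """Load an illumination correction function saved as a .npy array."""
    return np.load(path)


def apply_illum_by_division(image, illum):
    """Divide an image by its illumination function (assumed method)."""
    if image.shape != illum.shape:
        raise ValueError(f"shape mismatch: {image.shape} vs {illum.shape}")
    # Replace zeros with 1.0 so that division never hits a zero denominator.
    safe = np.where(illum == 0, 1.0, illum)
    return image.astype(np.float64) / safe
```

The path passed to load_illum_function would follow the layout above, that is cellprofiler/illumination_correction/<batch>_<plate>_<channel>/<plate>_Illum<channel>.npy.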

Assay Development

Output files
  • cellprofiler/assay_development/
    • <batch>_<plate>_<well>/: One subdirectory per well
      • *.png: Segmentation overlay images for visual QC
      • Image.csv: Image-level measurements

CellProfiler segments a single site per well (controlled by --cellprofiler_assaydevelopment_site, default: 1) and produces overlay images for visual inspection. This step serves as a QC gate: review the segmentation overlays before committing to full analysis. Assay development runs in both assay_development and analysis modes.
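Before reviewing overlays by eye, it can help to confirm that every well actually produced them. The following is a rough sketch, assuming the directory layout listed above; the helper name wells_missing_outputs is hypothetical and not part of the pipeline.

```python
from pathlib import Path


def wells_missing_outputs(assay_dev_dir):
    """Return names of well subdirectories lacking a PNG overlay or Image.csv."""
    incomplete = []
    for well_dir in sorted(Path(assay_dev_dir).iterdir()):
        if not well_dir.is_dir():
            continue
        has_png = any(well_dir.glob("*.png"))
        has_csv = (well_dir / "Image.csv").is_file()
        if not (has_png and has_csv):
            incomplete.append(well_dir.name)
    return incomplete
```

Calling it on the cellprofiler/assay_development/ directory returns an empty list when every well subdirectory contains both kinds of output.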

Analysis

Output files
  • cellprofiler/analysis/
    • <batch>_<plate>_<well>_<site>/: One subdirectory per site
      • Image.csv: Image-level measurements and metadata
      • Nuclei.csv: Nuclei object morphological measurements
      • Cells.csv: Cell object morphological measurements
      • Cytoplasm.csv: Cytoplasm object morphological measurements
      • *.png: Segmentation overlay images

CellProfiler performs full morphological feature extraction on every site. Images are grouped by batch, plate, well, and site, and the functions produced by the illumination correction step are applied to them. This step only runs in analysis mode.
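A quick sanity check on the analysis output is to count segmented nuclei per site. The sketch below uses only the standard library and rests on two assumptions: each Nuclei.csv has a single header row followed by one row per object, and plate, well, and site names contain no underscores (the batch name may). Both helper names are hypothetical.

```python
import csv
from pathlib import Path


def parse_site_dir(name):
    """Split '<batch>_<plate>_<well>_<site>' from the right.

    Assumes plate, well, and site contain no underscores; batch may.
    """
    batch, plate, well, site = name.rsplit("_", 3)
    return {"batch": batch, "plate": plate, "well": well, "site": site}


def count_nuclei_per_site(analysis_dir):
    """Map each site directory name to the number of data rows in Nuclei.csv."""
    counts = {}
    for site_dir in sorted(Path(analysis_dir).iterdir()):
        nuclei_csv = site_dir / "Nuclei.csv"
        if not nuclei_csv.is_file():
            continue
        with nuclei_csv.open(newline="") as handle:
            # DictReader consumes the header, so only object rows are counted.
            counts[site_dir.name] = sum(1 for _ in csv.DictReader(handle))
    return counts
```

Sites with unusually low or zero counts are candidates for a closer look at their overlay images.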

CytoTable

Output files
  • cytotable/
    • <batch>_<plate>_<well>_<site>.parquet: Collated CellProfiler measurements in Parquet format

CytoTable converts CellProfiler CSV outputs into a single Parquet file per site using the cellprofiler_csv preset. Parquet files are columnar, compressed, and ready for downstream analysis with tools like Pycytominer. This step only runs in analysis mode.
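Because there is one Parquet file per site, downstream work often starts by gathering the files that belong to a plate. The sketch below groups files by parsing their names, under the assumption that plate, well, and site contain no underscores. The load_plate helper additionally assumes pandas and a Parquet engine such as pyarrow are installed. Neither function is part of the pipeline.

```python
from collections import defaultdict
from pathlib import Path


def group_parquet_by_plate(cytotable_dir):
    """Group '<batch>_<plate>_<well>_<site>.parquet' paths by (batch, plate)."""
    groups = defaultdict(list)
    for path in sorted(Path(cytotable_dir).glob("*.parquet")):
        batch, plate, _well, _site = path.stem.rsplit("_", 3)
        groups[(batch, plate)].append(path)
    return dict(groups)


def load_plate(paths):
    """Concatenate per-site Parquet files into one DataFrame.

    pandas is imported lazily so the grouping helper works without it.
    """
    import pandas as pd

    return pd.concat((pd.read_parquet(p) for p in paths), ignore_index=True)
```

Each value returned by group_parquet_by_plate can be passed directly to load_plate to obtain a per-plate table.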

Pipeline information

Output files
  • pipeline_info/
    • execution_timeline_*.html: Nextflow execution timeline
    • execution_report_*.html: Nextflow execution report with resource usage
    • execution_trace_*.txt: Nextflow execution trace with per-task metrics
    • pipeline_dag_*.html: Pipeline DAG visualization
    • nf_core_cellpainting_software_mqc_versions.yml: Software versions used in the run

Nextflow generates several reports about the execution of the pipeline. Use them to troubleshoot errors and to look up information such as launch commands, run times, and resource usage.
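When troubleshooting, the execution trace can be scanned for tasks that did not finish. The following is a minimal sketch, assuming the trace is tab-separated and includes the name, status, and exit columns (these are in Nextflow's default trace fields, but a custom trace configuration may omit them). The function name is hypothetical.

```python
import csv


def failed_tasks(trace_path):
    """Return (name, status, exit) for tasks not marked COMPLETED or CACHED."""
    failures = []
    with open(trace_path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            if row.get("status") not in ("COMPLETED", "CACHED"):
                failures.append(
                    (row.get("name"), row.get("status"), row.get("exit"))
                )
    return failures
```

The task names returned here can then be matched against the execution report to see their resource usage.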