What is this?
Pipestat standardizes reporting of pipeline results. It provides 1) a standard specification for how pipeline outputs should be stored; and 2) an implementation for writing results in that format from within Python or from the command line.
How does it work?
A pipeline author defines all the outputs a pipeline produces by writing a JSON Schema. As the pipeline runs, it uses pipestat to report those outputs, either through the Python API or the command-line interface. The user configures whether results are stored in a YAML-formatted file or in a PostgreSQL database. Results are recorded according to the pipestat specification in a standard, pipeline-agnostic way, so downstream software can rely on the specification to build universal tools for analyzing, monitoring, and visualizing results that work with any pipeline or workflow.
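To make this flow concrete, here is a minimal sketch of schema-driven reporting in plain Python. It is not pipestat's implementation or API: the names `OUTPUT_SCHEMA`, `report`, and `retrieve` below are hypothetical, the type check covers only a few types, and a JSON file stands in for the YAML results file so the example needs no third-party packages.

```python
import json
import os

# Hypothetical schema: every result the pipeline may report, with its expected type.
OUTPUT_SCHEMA = {
    "result_name": {"type": "number", "description": "An example numeric result"},
    "log_path": {"type": "string", "description": "Path to a log file"},
}

# Illustrative mapping from schema type names to Python types (not exhaustive).
_PYTHON_TYPES = {"number": (int, float), "string": (str,), "boolean": (bool,)}


def report(results_file, namespace, record_id, values):
    """Check values against OUTPUT_SCHEMA, then store them under namespace/record_id."""
    for key, value in values.items():
        if key not in OUTPUT_SCHEMA:
            raise KeyError(f"'{key}' is not defined in the output schema")
        type_name = OUTPUT_SCHEMA[key]["type"]
        expected = _PYTHON_TYPES[type_name]
        # bool is a subclass of int in Python, so reject it for non-boolean fields.
        if isinstance(value, bool) and type_name != "boolean":
            raise TypeError(f"'{key}' must be of type {type_name}")
        if not isinstance(value, expected):
            raise TypeError(f"'{key}' must be of type {type_name}")

    # Load existing results, if any, so earlier reports are preserved.
    data = {}
    if os.path.exists(results_file):
        with open(results_file) as handle:
            data = json.load(handle)

    data.setdefault(namespace, {}).setdefault(record_id, {}).update(values)
    with open(results_file, "w") as handle:
        json.dump(data, handle, indent=2)
    return data


def retrieve(results_file, namespace, record_id, result_identifier):
    """Read one previously reported result back from the results file."""
    with open(results_file) as handle:
        return json.load(handle)[namespace][record_id][result_identifier]
```

Because the stored layout depends only on the schema, namespace, and record identifier, a downstream tool can read the file without knowing anything about the pipeline that wrote it.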
Quick start
Install pipestat
pip install pipestat
Set environment variables (optional)
export PIPESTAT_RESULTS_SCHEMA=output_schema.yaml
export PIPESTAT_RECORD_ID=my_record
export PIPESTAT_RESULTS_FILE=results_file.yaml
export PIPESTAT_NAMESPACE=my_namespace
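Because these variables are optional, they presumably act as defaults for settings that are not passed explicitly. The sketch below illustrates that fallback pattern in plain Python; `resolve_setting` is a hypothetical helper written for this example, not part of pipestat, and the precedence shown (explicit value first, then environment) is an assumption.

```python
import os


def resolve_setting(explicit, env_var):
    """Return the explicit value if given, otherwise the environment variable.

    Hypothetical helper: returns None when neither source provides a value.
    """
    if explicit is not None:
        return explicit
    return os.environ.get(env_var)


# Simulate `export PIPESTAT_RECORD_ID=my_record` for the current process.
os.environ["PIPESTAT_RECORD_ID"] = "my_record"

# No explicit value: fall back to the environment variable.
record_from_env = resolve_setting(None, "PIPESTAT_RECORD_ID")

# An explicit value takes precedence over the environment variable.
record_explicit = resolve_setting("other_record", "PIPESTAT_RECORD_ID")
```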
Pipeline results reporting and retrieval
Report a result
From command line:
pipestat report -i result_name -v 1.1
From Python:
import pipestat
# No arguments are passed here, so this relies on the PIPESTAT_* environment variables set above
psm = pipestat.PipestatManager()
psm.report(values={"result_name": 1.1})
Retrieve a result
From command line:
pipestat retrieve -i result_name
From Python:
import pipestat
psm = pipestat.PipestatManager()
psm.retrieve(result_identifier="result_name")
Pipeline status management
Set status
From command line:
pipestat status set running
From Python:
import pipestat
psm = pipestat.PipestatManager()
psm.set_status(status_identifier="running")
Get status
From command line:
pipestat status get
From Python:
import pipestat
psm = pipestat.PipestatManager()
psm.get_status()