Flowker automatically collects telemetry data across all workflow executions. This guide explains what you can monitor, how to interpret what you see, and when to involve your engineering team.

What Flowker monitors automatically


No manual instrumentation is needed. As soon as Flowker is running, it tracks:
  • Workflow executions — every run, from trigger to completion
  • Step-by-step progress — which nodes were processed and in what order
  • Execution outcomes — completed or failed
  • Service health — whether Flowker and its database are available and accepting traffic
  • Request volume and response times — how many API calls are being made and how fast they complete
This data flows automatically to your observability stack (Grafana), where it can be queried, visualized, and alerted on.

How to check if Flowker is healthy


Flowker tells you its own status through dedicated health endpoints. Instead of guessing whether the service is up, you can query it directly and get a clear answer: is it running, is it ready to accept traffic, and are all of its dependencies healthy? There are three levels of health check, from the simplest to the most detailed.

Is the process running?

The liveness check confirms that the Flowker process is up and responding. Kubernetes uses this continuously — if the service stops answering, the pod is automatically restarted.
GET /health/live
Returns a JSON object with status healthy when the process is running.
See the Check liveness endpoint for full details.

Is Flowker accepting traffic?

The readiness check goes one step further — it verifies that Flowker and its database are both healthy. If the database is unreachable, traffic stops being routed to that instance until the connection recovers.
GET /health/ready
Returns a JSON object with status healthy when Flowker and its database are both operational, or unhealthy when the database connection is down.
See the Check readiness endpoint for full details.

What’s the full status?

The full health check gives you the complete picture — service version, uptime, and the status of every dependency. This is the endpoint to use when diagnosing issues or verifying a deployment.
GET /health
Returns version, uptime, and the health of each dependency.
See the Check service health endpoint for full details.
Example — everything healthy:
{
  "status": "healthy",
  "version": "1.0.0",
  "uptime": "4h32m15s",
  "checks": {
    "database": { "status": "healthy" }
  }
}
Example — database issue (HTTP 503):
{
  "status": "unhealthy",
  "checks": {
    "database": {
      "status": "unhealthy",
      "message": "database ping failed: connection refused"
    }
  }
}
A 503 response from /health or /health/ready means Flowker is running but cannot safely process requests. This usually indicates a database connectivity issue — contact your engineering team immediately.
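The example payloads above can also be interpreted programmatically, for instance in a monitoring script. A minimal sketch — the `summarize_health` helper is illustrative, not part of Flowker's API:

```python
def summarize_health(status_code: int, body: dict) -> str:
    """Turn a /health (or /health/ready) response into a one-line summary.

    HTTP 200 with status "healthy" means all dependencies are up; HTTP 503
    means the service is running but cannot safely process requests.
    """
    if status_code == 200 and body.get("status") == "healthy":
        return "healthy"
    # Collect the names of any dependencies reporting a problem.
    failing = [name for name, check in body.get("checks", {}).items()
               if check.get("status") != "healthy"]
    if status_code == 503:
        return "unhealthy: " + (", ".join(failing) or "unknown dependency")
    return f"unexpected response (HTTP {status_code})"

# The two example payloads from this guide:
ok = {"status": "healthy", "version": "1.0.0", "uptime": "4h32m15s",
      "checks": {"database": {"status": "healthy"}}}
bad = {"status": "unhealthy",
       "checks": {"database": {"status": "unhealthy",
                               "message": "database ping failed: connection refused"}}}
print(summarize_health(200, ok))   # healthy
print(summarize_health(503, bad))  # unhealthy: database
```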

What you’ll see in Grafana


Lerian’s pre-configured dashboards give you a business-level view of Flowker’s behavior in real time.

Request throughput

How many API calls Flowker is receiving per second, broken down by route (e.g., workflow execution, workflow list, health). Useful for spotting traffic spikes or unexpected drops in activity.

Response time (P95 latency)

The time it takes Flowker to respond to 95% of requests. A rising P95 can indicate that executions are taking longer than expected — useful as an early warning before a full degradation.
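To build intuition for what the dashboard shows, P95 can be reproduced from raw latency samples. A minimal sketch using the nearest-rank method (Grafana's estimate, typically computed from histogram buckets, may differ slightly):

```python
import math

def p95(latencies_ms: list[float]) -> float:
    """95th-percentile latency: the time within which 95% of requests complete.

    Nearest-rank method: take the ceil(0.95 * n)-th smallest sample.
    """
    ordered = sorted(latencies_ms)
    rank = math.ceil(0.95 * len(ordered))
    return ordered[rank - 1]

# 100 hypothetical requests with latencies 1..100 ms: P95 is 95 ms even
# though the mean is only 50.5 ms - the metric tracks the slow tail.
print(p95(list(range(1, 101))))  # 95
```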

Error rate

The proportion of requests that returned a server error (HTTP 5xx). A non-zero error rate means something is failing inside Flowker. Spikes here warrant immediate investigation.
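As a concrete illustration (the request counts below are made up), the error rate is simply the share of 5xx responses in a window — client errors such as 404 do not count toward it:

```python
def error_rate(status_counts: dict[int, int]) -> float:
    """Fraction of requests that returned a server error (HTTP 5xx)."""
    total = sum(status_counts.values())
    errors = sum(n for code, n in status_counts.items() if 500 <= code < 600)
    return errors / total if total else 0.0

# One minute of hypothetical traffic: the 404s are excluded,
# only the two 503s count as server errors.
print(error_rate({200: 980, 404: 18, 503: 2}))  # 0.002
```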

Active executions

How many workflows are currently being executed. Useful for understanding load patterns and whether executions are completing as expected.

How to interpret execution status


Each workflow execution in Flowker has a status that tells you where it stands.
Status      Meaning                                     What to do
pending     Execution is queued and waiting to start    Normal — will transition to running shortly
running     Execution is in progress                    Normal — monitor for completion
completed   All steps finished successfully             No action needed
failed      At least one step failed                    Check the execution details for the error message
If you see a significant number of failed executions in a short period, check the error rate dashboard and flag it to engineering. A single failure is often expected; a pattern is a signal.
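The table and the "a pattern is a signal" rule can be sketched as code. The status names come from this guide; the 20% threshold below is an illustrative assumption, not a Flowker default:

```python
# Recommended action per execution status, mirroring the table above.
ACTIONS = {
    "pending": "Normal - will transition to running shortly",
    "running": "Normal - monitor for completion",
    "completed": "No action needed",
    "failed": "Check the execution details for the error message",
}

def failure_spike(statuses: list[str], threshold: float = 0.2) -> bool:
    """Flag when more than `threshold` of recent executions failed."""
    if not statuses:
        return False
    return statuses.count("failed") / len(statuses) > threshold

recent = ["completed"] * 7 + ["failed"] * 3
print(failure_spike(recent))  # True: 30% failed, a pattern worth flagging
```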

When to involve engineering


You can self-serve most status checks through the health endpoints and Grafana. Escalate to engineering when:
  • /health or /health/ready returns 503 (service unavailable)
  • Error rate dashboard shows a sustained spike (not a one-off)
  • P95 latency is consistently above the baseline for your workflows
  • A large number of executions have failed with no clear trigger
  • Flowker is not processing new executions despite being marked healthy
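The checklist above can be folded into a single helper, handy for a runbook script. All parameter names are illustrative; whether a spike is "sustained" or latency is "above baseline" is judged from the dashboards:

```python
def escalation_reasons(health_http_status: int,
                       sustained_error_spike: bool,
                       p95_above_baseline: bool,
                       unexplained_failures: bool,
                       processing_stalled: bool) -> list[str]:
    """Return the escalation criteria from this guide that currently apply.

    An empty list means the situation can be handled self-serve.
    """
    reasons = []
    if health_http_status == 503:
        reasons.append("/health or /health/ready returned 503")
    if sustained_error_spike:
        reasons.append("sustained error-rate spike")
    if p95_above_baseline:
        reasons.append("P95 latency consistently above baseline")
    if unexplained_failures:
        reasons.append("many failed executions with no clear trigger")
    if processing_stalled:
        reasons.append("no new executions despite healthy status")
    return reasons

print(escalation_reasons(503, False, False, False, False))
# ['/health or /health/ready returned 503']
```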
In these cases, share the Grafana dashboard link or a screenshot with the engineering team along with the timeframe — it speeds up diagnosis significantly.