Fly.io Integration
Monitor machine metrics, volume health, edge deployment latency, and global network performance across your Fly.io applications. Full multi-region visibility from a single dashboard.
How It Works
Configure fly.toml Metrics
Add the [metrics] stanza to your fly.toml to expose an application metrics endpoint. TigerOps scrapes it from within the Fly private network and forwards the samples using Prometheus remote write, so the endpoint needs no public exposure.
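The [metrics] stanza only tells Fly where to look; the application itself still has to serve Prometheus text on that port and path. A minimal sketch using only the Python standard library follows. The counter names, `render_metrics`, and `serve` are hypothetical and not part of any TigerOps SDK, and a real service would more likely use a Prometheus client library.

```python
import socket
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process counters; a real app would update these as it serves traffic.
COUNTERS = {"http_requests_total": 0, "http_errors_total": 0}


def render_metrics(counters):
    """Render a dict of counters in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(counters.items()):
        lines.append(f"# TYPE {name} counter")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the path configured in fly.toml is served.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(COUNTERS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


class IPv6HTTPServer(HTTPServer):
    # Fly private networking (6PN) is IPv6, so listen on an IPv6 socket.
    address_family = socket.AF_INET6


def serve(port=9091):
    """Block forever serving /metrics; the port matches the fly.toml sample."""
    IPv6HTTPServer(("::", port), MetricsHandler).serve_forever()
```

Calling `serve()` from a background thread at application start keeps the metrics port separate from the main service on `internal_port`.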
Connect the Fly API Token
Create a read-only Fly.io API token with the metrics:read scope. TigerOps uses the Fly Machines API and Prometheus-compatible metrics endpoint to collect machine, volume, and network data.
Deploy the TigerOps Machine
Optionally deploy a TigerOps collector Machine to your Fly organization. It scrapes every app in the organization from within the Fly private network, so no public endpoints are required.
Set Region-Aware Alerts
TigerOps labels all metrics with the Fly.io region code. Set per-region latency thresholds and machine health alerts to catch regional issues before they impact global users.
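To make the idea of per-region thresholds concrete, here is a small sketch of the evaluation logic. It is an illustration only: the function name, the 500 ms default, and the input shape are assumptions, and TigerOps alert rules are configured in the product rather than in code like this.

```python
DEFAULT_P99_MS = 500.0  # hypothetical fallback threshold for regions without an override


def regions_over_threshold(p99_by_region, thresholds, default_ms=DEFAULT_P99_MS):
    """Return (region, observed_p99_ms, limit_ms) for every region above its limit.

    p99_by_region: dict mapping Fly region code -> observed P99 latency in ms.
    thresholds:    dict mapping Fly region code -> per-region limit in ms.
    """
    breaches = []
    for region, p99 in sorted(p99_by_region.items()):
        limit = thresholds.get(region, default_ms)
        if p99 > limit:
            breaches.append((region, p99, limit))
    return breaches


observed = {"iad": 120.0, "syd": 640.0, "fra": 310.0}
overrides = {"fra": 300.0}  # stricter limit for one region
for region, p99, limit in regions_over_threshold(observed, overrides):
    print(f"{region}: p99 {p99} ms exceeds {limit} ms")
```

A per-region override lets a latency-sensitive region alert earlier while distant regions fall back to the default.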
What You Get Out of the Box
Machine CPU & Memory
Per-machine CPU utilization, memory usage, and OOM kill events across all your Fly Machines. Group by app, region, or process group for targeted capacity analysis.
Edge Deployment Latency
Request latency, error rates, and throughput per Fly region. TigerOps builds a global latency heatmap showing which regions are experiencing degradation in real time.
Volume Health
Fly volume read/write IOPS, throughput, and fullness percentage. TigerOps alerts before volumes reach capacity and tracks I/O latency degradation for stateful workloads.
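Alerting before a volume fills up implies some kind of projection from recent usage. The sketch below uses a simple linear extrapolation between the oldest and newest sample; this is an assumed technique for illustration, not a description of how TigerOps computes its alerts, and `hours_until_full` is a hypothetical name.

```python
def hours_until_full(samples, capacity_bytes):
    """Estimate hours until a volume is full by linear extrapolation.

    samples: list of (unix_seconds, used_bytes) tuples, oldest first.
    Returns None when usage is flat or shrinking, or when there is too
    little data to estimate a growth rate.
    """
    if len(samples) < 2:
        return None
    (t0, used0), (t1, used1) = samples[0], samples[-1]
    elapsed = t1 - t0
    growth = used1 - used0
    if elapsed <= 0 or growth <= 0:
        return None
    remaining = capacity_bytes - used1
    if remaining <= 0:
        return 0.0
    # remaining / (growth / elapsed) gives seconds; convert to hours.
    return remaining * elapsed / growth / 3600.0


# A volume growing from 40% to 50% full over one hour has about five hours left.
print(hours_until_full([(0, 40), (3600, 50)], capacity_bytes=100))
```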
Machine Restart & Lifecycle
Machine start, stop, restart, and crash events with the reason code. TigerOps correlates machine restarts with memory OOM events, health check failures, and application errors.
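One plausible way to correlate restarts with OOM events is a time-window join per machine. The sketch below assumes both event streams are available as (machine_id, unix_seconds) tuples and that an OOM kill within 60 seconds before a restart explains it; the window, the data shape, and the function name are all illustrative assumptions.

```python
def restarts_explained_by_oom(restarts, oom_events, window_seconds=60):
    """Pair each restart with an OOM event on the same machine shortly before it.

    restarts, oom_events: lists of (machine_id, unix_seconds) tuples.
    Returns a list of (machine_id, restart_ts, oom_ts) for matched restarts.
    """
    oom_by_machine = {}
    for machine_id, ts in oom_events:
        oom_by_machine.setdefault(machine_id, []).append(ts)

    matched = []
    for machine_id, restart_ts in restarts:
        for oom_ts in oom_by_machine.get(machine_id, []):
            # The OOM must precede the restart and fall inside the window.
            if 0 <= restart_ts - oom_ts <= window_seconds:
                matched.append((machine_id, restart_ts, oom_ts))
                break
    return matched


restarts = [("m1", 100), ("m2", 200), ("m1", 500)]
ooms = [("m1", 90), ("m2", 50)]
print(restarts_explained_by_oom(restarts, ooms))
```

Restarts left unmatched are the ones worth checking against health check failures or application errors instead.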
Network & Anycast Metrics
Bytes sent and received per machine, anycast IP traffic distribution, and WireGuard tunnel health for private networking. Detect network saturation before it impacts performance.
Autoscaling & Scale Events
Track Fly autoscaling decisions, machine provisioning latency, and scale-to-zero transitions. TigerOps correlates scaling events with traffic patterns to validate your autoscale policy.
fly.toml Metrics Config
Add the metrics stanza to your fly.toml to expose your application metrics endpoint for TigerOps to scrape.
# fly.toml — expose application metrics for TigerOps
app = "my-app"
primary_region = "iad"
[build]
image = "my-org/my-app:latest"
[env]
TIGEROPS_API_KEY = "your_api_key" # Placeholder only; in production, set this with fly secrets set instead of [env]
TIGEROPS_ENDPOINT = "https://ingest.atatus.net/api/v1/write"
TIGEROPS_SERVICE = "my-app"
# Expose a Prometheus-compatible /metrics endpoint on port 9091
# TigerOps scrapes this from within the Fly private network
[metrics]
port = 9091
path = "/metrics"
[[services]]
protocol = "tcp"
internal_port = 8080
[[services.ports]]
port = 443
handlers = ["tls", "http"]
[services.concurrency]
type = "requests"
hard_limit = 200
soft_limit = 150
# TigerOps collector — deploy to the same org to scrape via 6PN
# fly launch --image atatus/fly-collector:latest --name tigerops-collector
# fly secrets set TIGEROPS_API_KEY=your_key --app tigerops-collector
# fly secrets set SCRAPE_APPS="my-app,my-api,my-worker" --app tigerops-collector
Common Questions
Does TigerOps support Fly.io private networking (6PN)?
Yes. The TigerOps collector Machine runs inside your Fly organization and uses the Fly 6PN private network to scrape application metrics endpoints without any public exposure. All metric collection happens over the encrypted private network.
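As a rough illustration of scraping over 6PN, the sketch below turns a comma-separated app list, in the same form as the SCRAPE_APPS secret in the sample config, into private scrape URLs. It relies on Fly's internal DNS, where `<app>.internal` resolves to an app's Machines inside the organization's private network. How the actual collector interprets SCRAPE_APPS is not documented here, so `scrape_targets` and its defaults are assumptions.

```python
def scrape_targets(scrape_apps, port=9091, path="/metrics"):
    """Build private-network scrape URLs from a comma-separated app list.

    scrape_apps: string such as "my-app,my-api,my-worker".
    port, path:  assumed to match the [metrics] stanza of each app.
    """
    apps = [name.strip() for name in scrape_apps.split(",") if name.strip()]
    return [f"http://{app}.internal:{port}{path}" for app in apps]


for url in scrape_targets("my-app,my-api,my-worker"):
    print(url)
```

Because these hostnames only resolve inside the private network, the targets are unreachable from the public internet.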
Can TigerOps monitor Fly Machines that scale to zero?
Yes. TigerOps tracks machine lifecycle events including scale-to-zero transitions and cold-start latency. When a machine scales to zero, its last known metrics are preserved and the scale-to-zero event is recorded as a deployment marker.
Does TigerOps support Fly.io Postgres (managed)?
Yes. TigerOps monitors Fly Postgres clusters including replication lag, connection pool utilization, and query performance. Fly Postgres metrics are collected through the Fly Machines API and the PostgreSQL metrics endpoint exposed on the private network.
Can I monitor multiple Fly.io organizations in one TigerOps account?
Yes. You can connect multiple Fly API tokens from different organizations to a single TigerOps workspace. Resources from each organization are labeled with the organization slug for easy filtering and separation.
How does TigerOps handle Fly.io multi-region deployments?
TigerOps labels every metric with the fly_region tag and builds per-region dashboards automatically. You can compare P99 latency across regions, set per-region alert thresholds, and receive alerts that specify which region is impacted.
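For readers who want to see what a per-region P99 comparison involves, here is a sketch that groups latency samples by their fly_region label and takes the nearest-rank 99th percentile of each group. The input shape and function name are hypothetical, and production systems usually derive percentiles from histograms rather than raw samples.

```python
def p99_by_region(samples):
    """Compute the nearest-rank P99 latency per Fly region.

    samples: iterable of (fly_region, latency_ms) tuples.
    Returns a dict mapping region code -> P99 latency.
    """
    by_region = {}
    for region, latency in samples:
        by_region.setdefault(region, []).append(latency)

    result = {}
    for region, values in by_region.items():
        values.sort()
        # Nearest-rank: rank = ceil(0.99 * n), computed with integers
        # to avoid floating-point rounding surprises.
        rank = (99 * len(values) + 99) // 100
        result[region] = values[rank - 1]
    return result


samples = [("iad", v) for v in range(1, 101)] + [("syd", 250.0), ("syd", 40.0)]
print(p99_by_region(samples))
```

With only a handful of samples, as in the `syd` group here, the P99 is simply the largest value, which is worth remembering when comparing a quiet region against a busy one.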
Full Visibility Into Your Fly.io Applications
No credit card required. Connect in minutes and see machine metrics, edge latency, and volume health immediately.