System
The System page provides a real-time operational dashboard for the Chompy platform. It displays health status for all core services, server resource utilization, database backup management, and Docker container status. This is the primary page for verifying that the platform is running correctly and diagnosing infrastructure issues.
Overall Status
The page header shows the last refresh timestamp and the overall system status as a color-coded badge:
| Status | Badge Color | Meaning |
|---|---|---|
| HEALTHY | Green | All services are responding and operating normally. |
| DEGRADED | Yellow | One or more services are unhealthy, but the platform is partially operational. |
| UNKNOWN | Gray | Unable to determine overall status. |
The overall status is computed from the individual service statuses: if every service is healthy, the overall status is HEALTHY; if any service is unhealthy, the overall status is DEGRADED. Click the Refresh button to re-poll all services.
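The roll-up rule can be sketched in a few lines of Python. This is an illustration only, not the platform's actual code: the `Status` enum and `overall_status` function are hypothetical names, and treating an empty result set as UNKNOWN is an assumption, since the page only says UNKNOWN means the status could not be determined.

```python
from enum import Enum


class Status(Enum):
    HEALTHY = "HEALTHY"
    DEGRADED = "DEGRADED"
    UNKNOWN = "UNKNOWN"


def overall_status(service_healthy: dict[str, bool]) -> Status:
    """Roll per-service health flags up into a single overall status."""
    # Assumption: with no service results at all, the status cannot be determined.
    if not service_healthy:
        return Status.UNKNOWN
    # All services healthy -> HEALTHY; any unhealthy service -> DEGRADED.
    if all(service_healthy.values()):
        return Status.HEALTHY
    return Status.DEGRADED
```

For example, `overall_status({"clickhouse": True, "kafka": False})` returns `Status.DEGRADED`.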
Service Cards
The System page displays health cards for each core platform component. Every card shows the service name, an icon, and a status badge (HEALTHY in green or UNHEALTHY in red).
ClickHouse
The time-series database storing all flow records, SNMP metrics, and syslog data.
| Metric | Description |
|---|---|
| Version | The running ClickHouse server version (e.g., 26.1.2.11). |
| Uptime | How long the ClickHouse server has been running since last restart. |
| Database Size | Total disk usage of all ClickHouse databases. |
| Total Rows | Sum of rows across all tables (flows_all, snmp_metrics, syslog, etc.). |
| Latency | Round-trip time for a simple health check query (SELECT 1). |
Health is determined by the ability to connect and execute a query. If ClickHouse is unreachable or the query times out, the status shows UNHEALTHY.
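A connect-and-query health check with latency measurement might look like the following sketch. The `probe` function and the `run_query` callable are hypothetical: `run_query` stands in for whatever database client call executes `SELECT 1`, and it is assumed to raise an exception on connection failure or timeout.

```python
import time
from typing import Callable


def probe(run_query: Callable[[], object]) -> dict:
    """Time one probe query and map success or failure to a status."""
    start = time.perf_counter()
    try:
        run_query()  # e.g. a client call that executes SELECT 1
        status = "HEALTHY"
    except Exception:
        # Connection errors and timeouts both surface as exceptions here.
        status = "UNHEALTHY"
    latency_ms = (time.perf_counter() - start) * 1000.0
    return {"status": status, "latency_ms": latency_ms}
```

The same shape would apply to the PostgreSQL latency check, with a different `run_query` supplied.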
PostgreSQL
The relational database storing device inventory, user accounts, alert rules, site configuration, flow tags, and all platform settings.
| Metric | Description |
|---|---|
| Version | The running PostgreSQL server version (e.g., PostgreSQL 18.1). |
| Uptime | Server uptime since last restart. |
| Connections | Current active connections out of the maximum allowed (e.g., 23 / 100). |
| Database Size | Total disk usage of the network_monitoring database. |
| Latency | Round-trip time for a health check query. |
Kafka
The message broker handling the flow data pipeline between goflow2 (flow collector), the Go flow enricher, and ClickHouse.
| Metric | Description |
|---|---|
| Status | Whether the Kafka broker is reachable and responding. |
| Brokers | Number of Kafka brokers in the cluster (typically 1 for single-node deployments). |
| Topics | Number of Kafka topics. A count of 0 when flows should be active may indicate a configuration issue. |
Kafka health is checked by executing a topic list command against the broker container. If the container is not running or the broker is not reachable, the status shows UNHEALTHY, as seen in the screenshot, where Kafka shows a "not found" status.
Synthetics Service
The Go-based synthetic monitoring service that executes ping, HTTP, and traceroute tests against configured targets.
| Metric | Description |
|---|---|
| Active Tests | Number of enabled synthetic test definitions. |
Health is determined by a response from the service's /health endpoint on port 8080.
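As a rough sketch, an equivalent check can be run by hand with Python's standard library. The function name `service_is_healthy` is hypothetical, and treating only a 2xx response as healthy is an assumption; the page states only that a response from `/health` is required.

```python
import urllib.error
import urllib.request


def service_is_healthy(host: str, port: int = 8080, timeout_s: float = 3.0) -> bool:
    """Return True if GET /health answers with a 2xx status."""
    url = f"http://{host}:{port}/health"
    # Bypass any proxy configured in the environment for this direct check.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout_s) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False
```

The host to pass depends on how the synthetics container is exposed in your deployment.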
Backend API
The Node.js backend serving the REST API for the Chompy frontend and all integrations.
| Metric | Description |
|---|---|
| Node.js | The Node.js runtime version (e.g., v18.20.8). |
| Uptime | How long the backend process has been running. |
| Memory | Heap memory used vs. total heap allocated (e.g., 52.7 MB / 64.0 MB). |
| Memory % | Percentage of allocated heap memory in use. |
The Backend API card always shows HEALTHY when this page loads, because the data on the page is itself served by the backend API. If the backend were down, the page could not load its data at all.
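The Memory % value follows directly from the two heap figures in the Memory row. A minimal worked example, using the sample values from the table (the function name is hypothetical, and rounding to one decimal place is an assumption about the display):

```python
def heap_used_percent(used_mb: float, total_mb: float) -> float:
    """Percentage of allocated heap in use, rounded to one decimal place."""
    if total_mb <= 0:
        raise ValueError("total_mb must be positive")
    return round(used_mb / total_mb * 100.0, 1)


print(heap_used_percent(52.7, 64.0))  # 82.3
```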
Disk Usage
Host server disk utilization for the volume where Chompy data is stored.
| Metric | Description |
|---|---|
| Total | Total disk capacity. |
| Used | Current disk usage. |
| Available | Remaining free space. |
| Used % | Percentage used, with a visual progress bar. The bar turns yellow above 70% and red above 90%. |
Server Resources
Host server CPU and memory utilization.
| Metric | Description |
|---|---|
| CPU Usage | Current CPU utilization percentage. |
| CPU Cores | Number of available CPU cores. |
| Load Avg | 1-minute, 5-minute, and 15-minute load averages. |
| Memory | Physical RAM used vs. total (e.g., 3.74 GB / 62.79 GB). |
| Memory % | RAM utilization percentage. |
| Uptime | Host server uptime. |
| CPU / MEM bars | Visual progress bars with color thresholds — green below 70%, yellow 70–90%, red above 90%. |
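The color thresholds used by the disk, CPU, and memory bars can be expressed as a small function. This is a sketch with a hypothetical name; the behavior at exactly 70% and exactly 90% is an assumption (both treated as yellow here), since the tables do not pin down the boundaries.

```python
def bar_color(percent: float) -> str:
    """Map a utilization percentage to the progress bar color."""
    if percent > 90:
        return "red"
    # Assumption: the 70-90 range is inclusive at both ends.
    if percent >= 70:
        return "yellow"
    return "green"
```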
Database Backups
The Database Backups card provides manual and automated backup controls for both ClickHouse and PostgreSQL.
Manual Backups
Click the green Backup button next to either database to trigger an immediate backup:
- ClickHouse: Uses `clickhouse-backup` to create a full backup including schema, data, configs, and RBAC. Backups are stored in the mounted backups volume.
- PostgreSQL: Uses `pg_dump` to export the `network_monitoring` database. Supports full backups and schema-only backups. Output is compressed with gzip.
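For reference, a PostgreSQL backup of this kind can be approximated by hand as in the sketch below. It is not the platform's implementation: the function names and output path are hypothetical, connection settings are assumed to come from the standard `PG*` environment variables, and only the `--dbname` and `--schema-only` options of `pg_dump` are used.

```python
import gzip
import subprocess
from pathlib import Path


def pg_dump_command(database: str, schema_only: bool = False) -> list[str]:
    """Build the pg_dump argument list for a full or schema-only export."""
    cmd = ["pg_dump", f"--dbname={database}"]
    if schema_only:
        cmd.append("--schema-only")
    return cmd


def backup_postgres(out_path: Path, database: str = "network_monitoring",
                    schema_only: bool = False) -> Path:
    """Run pg_dump and write its output as a gzip-compressed file."""
    # Note: this buffers the whole dump in memory, which is fine for a sketch
    # but not for very large databases.
    result = subprocess.run(pg_dump_command(database, schema_only),
                            check=True, capture_output=True)
    with gzip.open(out_path, "wb") as fh:
        fh.write(result.stdout)
    return out_path
```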
Auto Backup Schedule
The auto backup dropdown configures the cron schedule for automated backups. Options include schedules like Weekly Sun 2AM and other preset intervals. Automated backups run both ClickHouse and PostgreSQL backups on the configured schedule.
Recent Backup History
The card displays recent backup operations with status indicators:
- 🔄 (orange) — Backup in progress.
- ✅ (green) — Backup completed successfully.
- ❌ (red) — Backup failed.
Each entry shows the database type (e.g., postgres) and the timestamp. Click on a backup entry to view details or download the backup file.
Backup Storage
Backup files are stored in the host-mounted backups directory mapped into the Docker containers:
| Path | Contents |
|---|---|
| /etc/chompy/backups/postgres-* | PostgreSQL .sql.gz dump files. |
| /etc/chompy/backups/clickhouse-* | ClickHouse backup archives. |
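To inspect the backup directory outside the UI, something like the following could list entries for one database type, newest first. The `list_backups` helper is hypothetical, and it assumes entries follow the `postgres-*` and `clickhouse-*` naming shown in the table.

```python
from pathlib import Path


def list_backups(backup_dir: Path, database: str) -> list[Path]:
    """Return backup entries for one database type, newest first."""
    entries = backup_dir.glob(f"{database}-*")
    # Sort by modification time so the most recent backup comes first.
    return sorted(entries, key=lambda p: p.stat().st_mtime, reverse=True)
```

For example, `list_backups(Path("/etc/chompy/backups"), "postgres")` would return the PostgreSQL dumps in that directory.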
Docker Containers
The Docker Containers section lists all running containers in the Chompy stack with their current status and image information.
| Column | Description |
|---|---|
| Container | The container name as defined in docker-compose.yml (e.g., snmp-discovery, chompy-synthetics, snmp-poller, postgres-db). |
| Status | Current container state with uptime — e.g., Up 26 hours (green check). Containers with health checks show (healthy) when passing. |
| Image | The Docker image and tag the container is running (e.g., ghcr.io/chompy-user/chompy-snmp-poller:latest, postgres:latest). |
Container status is retrieved by querying the Docker daemon. Containers that are stopped, restarting, or in an error state will display accordingly. This section provides a quick visual check that all platform components are running.
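The same three columns can be reproduced from the command line. The sketch below uses the Docker CLI as a stand-in for however the page actually queries the daemon; the function and class names are hypothetical. The `.Names`, `.Status`, and `.Image` fields are standard `docker ps --format` placeholders.

```python
import subprocess
from typing import NamedTuple


class ContainerRow(NamedTuple):
    name: str
    status: str
    image: str


# Tab-separated Go template passed to `docker ps --format`.
PS_FORMAT = "{{.Names}}\t{{.Status}}\t{{.Image}}"


def parse_ps_output(text: str) -> list[ContainerRow]:
    """Parse tab-separated `docker ps` output into rows."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        name, status, image = line.split("\t", 2)
        rows.append(ContainerRow(name, status, image))
    return rows


def list_containers() -> list[ContainerRow]:
    """List all containers, including stopped ones, via the Docker CLI."""
    out = subprocess.run(
        ["docker", "ps", "--all", "--format", PS_FORMAT],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_ps_output(out)
```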
Troubleshooting
Service showing UNHEALTHY
If a service card shows UNHEALTHY, check the following:
- ClickHouse: Verify the container is running with `docker ps | grep clickhouse`. Check ClickHouse logs with `docker logs chompy-clickhouse --tail 100`. Common issues include a full disk, OOM kills, or corrupted tables.
- PostgreSQL: Check container status and logs. Verify that the connection count hasn't hit the maximum. Common issues include connection exhaustion and low disk space.
- Kafka: Verify that the Kafka container is running. A "not found" status indicates that the container isn't accessible. Review Kafka logs for broker errors.
- Synthetics: The Go synthetics service may not be running if no synthetic tests are configured. Check the container logs.
DEGRADED overall status
The overall status shows DEGRADED when any individual service is unhealthy. Identify which service card shows UNHEALTHY and address that specific service. The platform can operate in a degraded state — for example, if Kafka is down, existing data in ClickHouse is still queryable but new flow ingestion will be paused.
High memory or CPU usage
If the Server Resources card shows elevated CPU or memory, check which containers are consuming the most resources with docker stats. The most resource-intensive components are typically ClickHouse (during queries or merges), the Go flow enricher (at high flow rates), and the SNMP poller (with many devices).
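To rank containers by CPU from a one-off `docker stats` snapshot, a sketch along these lines may help. It assumes the snapshot was produced with `docker stats --no-stream --format "{{.Name}}\t{{.CPUPerc}}\t{{.MemPerc}}"`; the `top_consumers` function is a hypothetical helper, not part of Chompy.

```python
def top_consumers(stats_text: str, limit: int = 3) -> list[tuple[str, float, float]]:
    """Return (name, cpu %, mem %) tuples sorted by CPU usage, highest first."""
    rows = []
    for line in stats_text.splitlines():
        if not line.strip():
            continue
        name, cpu, mem = line.split("\t")
        # Values arrive as strings such as "41.50%"; strip the percent sign.
        rows.append((name, float(cpu.rstrip("%")), float(mem.rstrip("%"))))
    rows.sort(key=lambda r: r[1], reverse=True)
    return rows[:limit]
```

Sorting on `r[2]` instead would rank by memory share.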