System

The System page provides a real-time operational dashboard for the Chompy platform. It displays health status for all core services, server resource utilization, database backup management, and Docker container status. This is the primary page for verifying that the platform is running correctly and diagnosing infrastructure issues.

Overall Status

The page header shows the last refresh timestamp and the overall system status as a color-coded badge:

| Status | Badge Color | Meaning |
|---|---|---|
| HEALTHY | Green | All services are responding and operating normally. |
| DEGRADED | Yellow | One or more services are unhealthy, but the platform is partially operational. |
| UNKNOWN | Gray | Unable to determine overall status. |

The overall status is computed from the individual service statuses: if every service is healthy, the overall status is HEALTHY; if any service is unhealthy, the overall status is DEGRADED. Click the Refresh button to re-poll all services.
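
The roll-up rule above can be sketched in a few lines. This is a minimal illustration, not the platform's actual code: the function name `overall_status` and the choice to return UNKNOWN when no service data is available are assumptions.

```python
from typing import Mapping


def overall_status(services: Mapping[str, str]) -> str:
    """Roll per-service statuses up into a single overall status."""
    if not services:
        # Assumption: with no service data, the status cannot be determined.
        return "UNKNOWN"
    statuses = set(services.values())
    if statuses == {"HEALTHY"}:
        return "HEALTHY"
    if "UNHEALTHY" in statuses:
        return "DEGRADED"
    # Assumption: any unrecognized per-service value maps to UNKNOWN.
    return "UNKNOWN"
```

Under this sketch, a single UNHEALTHY card is enough to turn the header badge yellow, which matches the behavior described above.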

Service Cards

The System page displays health cards for each core platform component. Every card shows the service name, an icon, and a status badge (HEALTHY in green or UNHEALTHY in red).

ClickHouse

The time-series database storing all flow records, SNMP metrics, and syslog data.

| Metric | Description |
|---|---|
| Version | The running ClickHouse server version (e.g., 26.1.2.11). |
| Uptime | How long the ClickHouse server has been running since last restart. |
| Database Size | Total disk usage of all ClickHouse databases. |
| Total Rows | Sum of rows across all tables (flows_all, snmp_metrics, syslog, etc.). |
| Latency | Round-trip time for a simple health check query (SELECT 1). |

Health is determined by the ability to connect and execute a query. If ClickHouse is unreachable or the query times out, the status shows UNHEALTHY.
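
A health check of this kind, where status and latency come from the same probe, might look like the sketch below. The helper `timed_health_check` and its `run_query` argument are hypothetical; in practice `run_query` would wrap whatever database client executes the SELECT 1 with a timeout.

```python
import time
from typing import Callable, Tuple


def timed_health_check(run_query: Callable[[], object]) -> Tuple[str, float]:
    """Run a probe and return (status, latency in milliseconds).

    `run_query` is a hypothetical callable that executes the health
    query and raises an exception on connection failure or timeout.
    """
    start = time.perf_counter()
    try:
        run_query()
        status = "HEALTHY"
    except Exception:
        # Any failure to connect or execute counts as unhealthy.
        status = "UNHEALTHY"
    latency_ms = (time.perf_counter() - start) * 1000.0
    return status, latency_ms
```

The same wrapper would apply to any service whose health is defined as "a query or request succeeds".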

PostgreSQL

The relational database storing device inventory, user accounts, alert rules, site configuration, flow tags, and all platform settings.

| Metric | Description |
|---|---|
| Version | The running PostgreSQL server version (e.g., PostgreSQL 18.1). |
| Uptime | Server uptime since last restart. |
| Connections | Current active connections out of the maximum allowed (e.g., 23 / 100). |
| Database Size | Total disk usage of the network_monitoring database. |
| Latency | Round-trip time for a health check query. |

Kafka

The message broker handling the flow data pipeline between goflow2 (flow collector), the Go flow enricher, and ClickHouse.

| Metric | Description |
|---|---|
| Status | Whether the Kafka broker is reachable and responding. |
| Brokers | Number of Kafka brokers in the cluster (typically 1 for single-node deployments). |
| Topics | Number of Kafka topics. A count of 0 when flows should be active may indicate a configuration issue. |

Kafka health is checked by executing a topic list command against the broker container. If the container is not running or the broker is not reachable, the status shows UNHEALTHY — as seen in the screenshot where Kafka shows "not found" status.
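
As a rough illustration of how the result of a topic list command could be turned into the card's Status and Topics values, consider the sketch below. The command itself is omitted because it depends on the Kafka image in use; `kafka_status` and its arguments are hypothetical names, and the one-topic-per-line output format is an assumption.

```python
from typing import Dict, List, Union


def parse_topic_list(output: str) -> List[str]:
    """Split command output into topic names, assuming one topic per line."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def kafka_status(returncode: int, output: str) -> Dict[str, Union[str, int]]:
    """Map a topic-list command result to a status and a topic count.

    A non-zero exit code (container missing, broker unreachable) is
    treated as UNHEALTHY with zero topics.
    """
    if returncode != 0:
        return {"status": "UNHEALTHY", "topics": 0}
    return {"status": "HEALTHY", "topics": len(parse_topic_list(output))}
```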

Synthetics Service

The Go-based synthetic monitoring service that executes ping, HTTP, and traceroute tests against configured targets.

| Metric | Description |
|---|---|
| Active Tests | Number of enabled synthetic test definitions. |

Health is determined by a response from the service's /health endpoint on port 8080.
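
An equivalent probe can be run by hand when the card shows UNHEALTHY. The sketch below uses only the Python standard library; the function name `check_health`, the base URL in the comment, and the rule that any 2xx response counts as healthy are assumptions.

```python
import urllib.error
import urllib.request


def check_health(base_url: str, timeout: float = 3.0) -> str:
    """Return HEALTHY if GET <base_url>/health answers with a 2xx status."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return "HEALTHY" if 200 <= resp.status < 300 else "UNHEALTHY"
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return "UNHEALTHY"


# Example call (hypothetical host name):
# check_health("http://chompy-synthetics:8080")
```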

Backend API

The Node.js backend serving the REST API for the Chompy frontend and all integrations.

| Metric | Description |
|---|---|
| Node.js | The Node.js runtime version (e.g., v18.20.8). |
| Uptime | How long the backend process has been running. |
| Memory | Heap memory used vs. total heap allocated (e.g., 52.7 MB / 64.0 MB). |
| Memory % | Percentage of allocated heap memory in use. |

The backend is always healthy if you can see this page, since the page itself is served by the backend API.
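
For reference, the Memory % value is presumably just the ratio of the two Memory figures. A small sketch, with the function name and the one-decimal rounding as assumptions:

```python
def heap_used_percent(used_mb: float, total_mb: float) -> float:
    """Percentage of the allocated heap that is in use, to one decimal."""
    if total_mb <= 0:
        raise ValueError("total_mb must be positive")
    return round(used_mb / total_mb * 100, 1)
```

With the example values from the table, `heap_used_percent(52.7, 64.0)` gives 82.3.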

Disk Usage

Host server disk utilization for the volume where Chompy data is stored.

| Metric | Description |
|---|---|
| Total | Total disk capacity. |
| Used | Current disk usage. |
| Available | Remaining free space. |
| Used % | Percentage used, with a visual progress bar. The bar turns yellow above 70% and red above 90%. |
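
To cross-check these figures from a shell on the host, the standard library's `shutil.disk_usage` reports the same kind of numbers. This is a sketch: the helper name `disk_summary` is hypothetical, and the path to inspect is whichever volume holds the Chompy data on your host.

```python
import shutil


def disk_summary(path: str = "/") -> dict:
    """Return capacity figures for the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)  # named tuple of total, used, free (bytes)
    gib = 1024 ** 3
    return {
        "total_gib": round(usage.total / gib, 2),
        "used_gib": round(usage.used / gib, 2),
        "available_gib": round(usage.free / gib, 2),
        "used_percent": round(usage.used / usage.total * 100, 1),
    }
```

Small differences from the dashboard are possible, since tools differ in how they account for reserved filesystem blocks.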

Server Resources

Host server CPU and memory utilization.

| Metric | Description |
|---|---|
| CPU Usage | Current CPU utilization percentage. |
| CPU Cores | Number of available CPU cores. |
| Load Avg | 1-minute, 5-minute, and 15-minute load averages. |
| Memory | Physical RAM used vs. total (e.g., 3.74 GB / 62.79 GB). |
| Memory % | RAM utilization percentage. |
| Uptime | Host server uptime. |
| CPU / MEM bars | Visual progress bars with color thresholds: green below 70%, yellow 70–90%, red above 90%. |
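
The color thresholds can be expressed as a small function. This is an illustrative sketch; the name `bar_color` is hypothetical, and treating exactly 70% and exactly 90% as yellow is one reading of the ranges above.

```python
def bar_color(percent: float) -> str:
    """Map a utilization percentage to a progress-bar color."""
    if percent > 90:
        return "red"
    if percent >= 70:
        # Assumption: the boundaries 70 and 90 both fall in the yellow band.
        return "yellow"
    return "green"
```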

Database Backups

The Database Backups card provides manual and automated backup controls for both ClickHouse and PostgreSQL.

Manual Backups

Click the green Backup button next to either database to trigger an immediate backup:

  • ClickHouse — Uses clickhouse-backup to create a full backup including schema, data, configs, and RBAC. Backups are stored in the mounted backups volume.
  • PostgreSQL — Uses pg_dump to export the network_monitoring database. Supports full backups and schema-only backups. Output is compressed with gzip.
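
For readers who want to reproduce the PostgreSQL backup outside the UI, the sketch below shows one way to combine pg_dump with gzip compression. It is not the platform's implementation: the function names and the output path are hypothetical, and connection settings (host, user, password) are assumed to come from the standard libpq environment variables such as PGHOST and PGUSER.

```python
import gzip
import shutil
import subprocess
from typing import List


def pg_dump_command(database: str = "network_monitoring",
                    schema_only: bool = False) -> List[str]:
    """Build the pg_dump argument list for a full or schema-only dump."""
    cmd = ["pg_dump", "--dbname", database]
    if schema_only:
        cmd.append("--schema-only")
    return cmd


def run_backup(out_path: str, schema_only: bool = False) -> None:
    """Stream pg_dump output into a gzip-compressed file at `out_path`."""
    cmd = pg_dump_command(schema_only=schema_only)
    with gzip.open(out_path, "wb") as archive, \
            subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
        # Copy in chunks so large dumps are never held fully in memory.
        shutil.copyfileobj(proc.stdout, archive)
    if proc.returncode != 0:
        raise RuntimeError(f"pg_dump exited with code {proc.returncode}")
```

Streaming through `shutil.copyfileobj` is a deliberate choice here: it keeps memory use flat regardless of database size.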

Auto Backup Schedule

The auto backup dropdown configures the cron schedule for automated backups. Options include schedules like Weekly Sun 2AM and other preset intervals. Automated backups run both ClickHouse and PostgreSQL backups on the configured schedule.
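
Conceptually, each dropdown preset corresponds to a five-field cron expression (minute, hour, day of month, month, day of week). The mapping below is a sketch: only the "Weekly Sun 2AM" label comes from this page, while the "Daily 2AM" entry and the names `AUTO_BACKUP_PRESETS` and `cron_for` are hypothetical.

```python
# Cron fields: minute hour day-of-month month day-of-week (0 = Sunday).
AUTO_BACKUP_PRESETS = {
    "Weekly Sun 2AM": "0 2 * * 0",
    "Daily 2AM": "0 2 * * *",  # hypothetical preset, for illustration only
}


def cron_for(label: str) -> str:
    """Look up the cron expression for a preset label."""
    try:
        return AUTO_BACKUP_PRESETS[label]
    except KeyError:
        raise ValueError(f"unknown backup preset: {label!r}") from None
```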

Recent Backup History

The card displays recent backup operations with status indicators:

  • 🔄 (orange) — Backup in progress.
  • ✅ (green) — Backup completed successfully.
  • ❌ (red) — Backup failed.

Each entry shows the database type (e.g., postgres) and the timestamp. Click on a backup entry to view details or download the backup file.

Backup Storage

Backup files are stored in the host-mounted backups directory mapped into the Docker containers:

| Path | Contents |
|---|---|
| /etc/chompy/backups/postgres-* | PostgreSQL .sql.gz dump files. |
| /etc/chompy/backups/clickhouse-* | ClickHouse backup archives. |
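
To confirm on the host that backups are landing where expected, the two path patterns can be listed directly. A minimal sketch, assuming the default directory shown above; the function name `list_backups` is hypothetical.

```python
from pathlib import Path
from typing import Dict, List


def list_backups(backup_dir: str = "/etc/chompy/backups") -> Dict[str, List[str]]:
    """Group backup entries in `backup_dir` by database type."""
    root = Path(backup_dir)
    return {
        "postgres": sorted(p.name for p in root.glob("postgres-*")),
        "clickhouse": sorted(p.name for p in root.glob("clickhouse-*")),
    }
```

A missing directory simply yields two empty lists, which is itself a useful signal that the volume is not mounted.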

Docker Containers

The Docker Containers section lists the containers in the Chompy stack with their current status and image information.

| Column | Description |
|---|---|
| Container | The container name as defined in docker-compose.yml (e.g., snmp-discovery, chompy-synthetics, snmp-poller, postgres-db). |
| Status | Current container state with uptime, e.g., Up 26 hours (green check). Containers with health checks show (healthy) when passing. |
| Image | The Docker image and tag the container is running (e.g., ghcr.io/chompy-user/chompy-snmp-poller:latest, postgres:latest). |

Container status is retrieved by querying the Docker daemon. Containers that are stopped, restarting, or in an error state will display accordingly. This section provides a quick visual check that all platform components are running.
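
The same three columns can be pulled from the Docker CLI with a Go-template format string. The sketch below is one possible approach, not necessarily how the page queries the daemon; the function names are hypothetical, and the rule that a status beginning with "Up" means running is an assumption based on the example above.

```python
import subprocess
from typing import Dict, List, Union

# Tab-separated name, status, and image for each container.
PS_FORMAT = "{{.Names}}\t{{.Status}}\t{{.Image}}"


def parse_ps(output: str) -> List[Dict[str, Union[str, bool]]]:
    """Parse `docker ps --format` output into one dict per container."""
    rows = []
    for line in output.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        name, status, image = parts
        rows.append({
            "container": name,
            "status": status,
            "image": image,
            "running": status.startswith("Up"),
        })
    return rows


def list_containers() -> List[Dict[str, Union[str, bool]]]:
    """Query the local Docker daemon, including stopped containers."""
    result = subprocess.run(
        ["docker", "ps", "--all", "--format", PS_FORMAT],
        check=True, capture_output=True, text=True,
    )
    return parse_ps(result.stdout)
```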

Troubleshooting

Service showing UNHEALTHY

If a service card shows UNHEALTHY, check the following:

  • ClickHouse — Verify the container is running with docker ps | grep clickhouse. Check ClickHouse logs with docker logs chompy-clickhouse --tail 100. Common issues include disk full, OOM kills, or corrupted tables.
  • PostgreSQL — Check container status and logs. Verify connection count hasn't hit the maximum. Common issues include connection exhaustion and disk space.
  • Kafka: Verify the Kafka container is running. A "not found" status indicates the container isn't accessible. Review Kafka logs for broker errors.
  • Synthetics — The Go synthetics service may not be running if no synthetic tests are configured. Check the container logs.

DEGRADED overall status

The overall status shows DEGRADED when any individual service is unhealthy. Identify which service card shows UNHEALTHY and address that specific service. The platform can operate in a degraded state — for example, if Kafka is down, existing data in ClickHouse is still queryable but new flow ingestion will be paused.

High memory or CPU usage

If the Server Resources card shows elevated CPU or memory, check which containers are consuming the most resources with docker stats. The most resource-intensive components are typically ClickHouse (during queries or merges), the Go flow enricher (at high flow rates), and the SNMP poller (with many devices).
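
To rank containers by resource use, the output of a one-shot `docker stats` call can be parsed and sorted. This is a sketch under stated assumptions: the function names are hypothetical, and rows whose values are not percentages (for example, stopped containers) are skipped.

```python
import subprocess
from typing import List, Tuple

# Tab-separated container name, CPU percentage, and memory percentage.
STATS_FORMAT = "{{.Name}}\t{{.CPUPerc}}\t{{.MemPerc}}"


def parse_stats(output: str) -> List[Tuple[str, float, float]]:
    """Parse `docker stats --format` output, sorted by CPU use, highest first."""
    rows = []
    for line in output.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue
        name, cpu, mem = parts
        try:
            rows.append((name, float(cpu.rstrip("%")), float(mem.rstrip("%"))))
        except ValueError:
            continue  # non-numeric placeholder values
    return sorted(rows, key=lambda row: row[1], reverse=True)


def top_consumers() -> List[Tuple[str, float, float]]:
    """Take a single snapshot of container usage from the local daemon."""
    result = subprocess.run(
        ["docker", "stats", "--no-stream", "--format", STATS_FORMAT],
        check=True, capture_output=True, text=True,
    )
    return parse_stats(result.stdout)
```

Sorting by the third tuple element instead would rank by memory rather than CPU.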