DNS Resolution System
Overview
The DNS Resolution System automatically resolves IP addresses in network flow data to hostnames using reverse DNS lookups. A background resolver service populates a Redis cache, and the Go flow-enricher reads that cache to add hostnames to flows before they are stored in ClickHouse.
Key Features
- Asynchronous Resolution — DNS lookups happen in the background, never blocking flow processing
- Redis Caching — Sub-millisecond lookups with configurable TTLs
- Multi-Server Failover — Configure multiple DNS servers with automatic health tracking
- Public/Private IP Filtering — Independently enable/disable resolution for public vs private IPs
- UI Configuration — All settings manageable via the web interface
- High Performance — 90%+ cache hit rates, minimal impact on flow processing
Architecture
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   ClickHouse    │     │   PostgreSQL    │     │      Redis      │
│   (Flow Data)   │     │  (DNS Config)   │     │   (DNS Cache)   │
└────────┬────────┘     └────────┬────────┘     └────────┬────────┘
         │                       │                       │
         │ Query unique IPs      │ Load config           │ Cache lookups
         │                       │                       │
         ▼                       ▼                       │
┌─────────────────────────────────────────────┐          │
│            DNS Resolver Service             │          │
│             (Node.js - Backend)             │◄─────────┘
│                                             │
│  • Queries ClickHouse for unique IPs        │
│  • Filters by public/private settings       │
│  • Performs DNS reverse lookups             │
│  • Caches results in Redis                  │
│  • Tracks DNS server health                 │
└─────────────────────────────────────────────┘

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│      Kafka      │────▶│  Flow Enricher  │────▶│   ClickHouse    │
│   (Raw Flows)   │     │  (Go Service)   │     │ (Enriched Flows)│
└─────────────────┘     └────────┬────────┘     └─────────────────┘
                                 │
                                 │ Bulk lookup
                                 ▼
                        ┌─────────────────┐
                        │      Redis      │
                        │   (DNS Cache)   │
                        └─────────────────┘
Data Flow
DNS Resolver Service (runs every 30 seconds by default; see lookup_interval_ms)
- Queries ClickHouse for unique IPs from recent flows
- Filters IPs based on config (public/private settings)
- Checks Redis for already-cached entries
- Performs DNS reverse lookups for uncached IPs
- Stores results in Redis with configurable TTL
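The public/private filtering step can be sketched as follows. The resolver itself is a Node.js service, so this Go version only illustrates the logic; the function names are hypothetical, and skipping loopback, link-local, multicast, and unspecified addresses is an assumption rather than documented behavior.

```go
package main

import (
	"fmt"
	"net/netip"
)

// shouldResolve reports whether ip passes the public/private filter.
// Unparseable input is rejected. Skipping loopback, link-local, multicast
// and unspecified addresses is an assumption of this sketch.
func shouldResolve(ip string, resolvePublic, resolvePrivate bool) bool {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return false
	}
	if addr.IsLoopback() || addr.IsLinkLocalUnicast() || addr.IsMulticast() || addr.IsUnspecified() {
		return false
	}
	if addr.IsPrivate() { // RFC 1918 for IPv4, fc00::/7 for IPv6
		return resolvePrivate
	}
	return resolvePublic
}

// filterIPs keeps the IPs that pass the filter, preserving input order.
func filterIPs(ips []string, resolvePublic, resolvePrivate bool) []string {
	var out []string
	for _, ip := range ips {
		if shouldResolve(ip, resolvePublic, resolvePrivate) {
			out = append(out, ip)
		}
	}
	return out
}

func main() {
	ips := []string{"192.168.100.132", "8.8.8.8", "10.0.0.53", "not-an-ip"}
	// resolve_public_ips=false, resolve_private_ips=true
	fmt.Println(filterIPs(ips, false, true)) // [192.168.100.132 10.0.0.53]
}
```

The min_flow_count threshold is not modeled here; it would most naturally be applied in the ClickHouse query that selects candidate IPs.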
Go Flow Enricher (processes each flow)
- Reads flows from Kafka
- Extracts src_addr and dst_addr
- Performs bulk Redis lookup (pipeline)
- Adds src_hostname and dst_hostname to flow
- Writes enriched flow to ClickHouse
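A minimal sketch of the per-flow enrichment step, under stated assumptions. The bulkGetter interface and mapCache type are hypothetical stand-ins for a pipelined Redis read so the example runs without a Redis client, the Flow struct carries only the four fields named above, and treating a cached NXDOMAIN marker as "no hostname" is an assumption.

```go
package main

import "fmt"

// Flow holds only the fields this sketch needs.
type Flow struct {
	SrcAddr, DstAddr         string
	SrcHostname, DstHostname string
}

// bulkGetter is a hypothetical stand-in for a pipelined Redis read:
// one call, values returned in key order, "" for a missing key.
type bulkGetter interface {
	GetMany(keys []string) []string
}

// mapCache is an in-memory bulkGetter used for the demonstration.
type mapCache map[string]string

func (m mapCache) GetMany(keys []string) []string {
	vals := make([]string, len(keys))
	for i, k := range keys {
		vals[i] = m[k] // a missing key yields ""
	}
	return vals
}

// hostnameFrom maps a cached value to the hostname written to the flow.
// Treating the NXDOMAIN marker as "no hostname" is an assumption.
func hostnameFrom(v string) string {
	if v == "NXDOMAIN" {
		return ""
	}
	return v
}

// enrich fills both hostname fields from a single bulk lookup.
func enrich(f *Flow, c bulkGetter) {
	vals := c.GetMany([]string{"rdns:" + f.SrcAddr, "rdns:" + f.DstAddr})
	f.SrcHostname = hostnameFrom(vals[0])
	f.DstHostname = hostnameFrom(vals[1])
}

func main() {
	cache := mapCache{"rdns:192.168.100.132": "vmware.home.local"}
	f := Flow{SrcAddr: "192.168.100.132", DstAddr: "8.8.8.8"}
	enrich(&f, cache)
	fmt.Printf("%q %q\n", f.SrcHostname, f.DstHostname) // "vmware.home.local" ""
}
```

Fetching both addresses in one round trip is what keeps the enricher from paying two network latencies per flow; a miss simply leaves the hostname empty.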
Components
1. Redis Cache
Purpose: Fast in-memory storage for DNS lookups
Docker Service (docker-compose.yml):
redis:
  image: redis:7-alpine
  container_name: chompy-redis
  restart: unless-stopped
  networks:
    - netflow_network
  ports:
    - "6379:6379"
  volumes:
    - redis_data:/data
  command: redis-server --appendonly yes --maxmemory 512mb --maxmemory-policy allkeys-lru
  healthcheck:
    test: ["CMD", "redis-cli", "ping"]
    interval: 10s
    timeout: 5s
    retries: 3
Key Format: rdns:<ip_address>
Value Format:
- Resolved hostname (e.g., vmware.home.local)
- NXDOMAIN for IPs that don't resolve
TTLs:
- Success: 86400 seconds (24 hours) - configurable
- NXDOMAIN: 3600 seconds (1 hour) - configurable
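The key, value, and TTL rules above can be expressed as one small function. This is a sketch using the default TTLs; the function and constant names are hypothetical, and representing "did not resolve" as an empty hostname is an assumption of the example.

```go
package main

import (
	"fmt"
	"time"
)

const (
	keyPrefix      = "rdns:"
	nxdomainMarker = "NXDOMAIN"
	successTTL     = 86400 * time.Second // default success_ttl
	nxdomainTTL    = 3600 * time.Second  // default nxdomain_ttl
)

// cacheEntry returns the key, value and TTL to store for one lookup result.
// An empty hostname means the IP did not resolve.
func cacheEntry(ip, hostname string) (key, value string, ttl time.Duration) {
	key = keyPrefix + ip
	if hostname == "" {
		return key, nxdomainMarker, nxdomainTTL
	}
	return key, hostname, successTTL
}

func main() {
	k, v, ttl := cacheEntry("192.168.100.132", "vmware.home.local")
	fmt.Println(k, v, ttl) // rdns:192.168.100.132 vmware.home.local 24h0m0s

	k, v, ttl = cacheEntry("203.0.113.7", "")
	fmt.Println(k, v, ttl) // rdns:203.0.113.7 NXDOMAIN 1h0m0s
}
```

Caching negative results with a shorter TTL avoids re-querying unresolvable IPs every cycle while still picking up newly created PTR records within the hour.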
2. DNS Resolver Service (Node.js)
Purpose: Background service that populates Redis cache
Configuration (loaded from PostgreSQL):
| Setting | Default | Description |
|---|---|---|
| enabled | true | Master enable/disable |
| resolve_public_ips | true | Resolve public IP addresses |
| resolve_private_ips | true | Resolve RFC1918 addresses |
| min_flow_count | 3 | Minimum flows before resolving an IP |
| batch_size | 500 | Max IPs to process per cycle |
| lookup_interval_ms | 30000 | Time between processing cycles (milliseconds) |
| success_ttl | 86400 | Cache TTL for successful lookups (seconds) |
| nxdomain_ttl | 3600 | Cache TTL for NXDOMAIN responses (seconds) |
| max_concurrent_lookups | 50 | Parallel DNS queries |
| dns_timeout_ms | 5000 | DNS query timeout (milliseconds) |
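To show how batch_size, max_concurrent_lookups, and dns_timeout_ms interact, here is a sketch of one processing cycle with bounded concurrency. The real resolver is Node.js; this Go version only illustrates the pattern. The lookup function is injected (fakeLookup is a hypothetical placeholder) so the example runs without touching the network, and dropping failed lookups from the result is an assumption.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// Defaults taken from the settings table.
const (
	defaultBatchSize     = 500
	defaultMaxConcurrent = 50
	defaultTimeout       = 5000 * time.Millisecond
)

// lookupFunc performs one reverse lookup.
type lookupFunc func(ctx context.Context, ip string) (string, error)

// fakeLookup is a placeholder that never touches the network.
func fakeLookup(ctx context.Context, ip string) (string, error) {
	if err := ctx.Err(); err != nil {
		return "", err
	}
	if ip == "" {
		return "", fmt.Errorf("empty address")
	}
	return "host-" + ip, nil
}

// resolveBatch resolves at most batchSize IPs, running no more than
// maxConcurrent lookups at once, each bounded by timeout.
// Failed lookups are left out of the result map.
func resolveBatch(ips []string, batchSize, maxConcurrent int, timeout time.Duration, lookup lookupFunc) map[string]string {
	if batchSize > 0 && len(ips) > batchSize {
		ips = ips[:batchSize]
	}
	if maxConcurrent < 1 {
		maxConcurrent = 1
	}
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		results = make(map[string]string, len(ips))
		sem     = make(chan struct{}, maxConcurrent) // counting semaphore
	)
	for _, ip := range ips {
		wg.Add(1)
		sem <- struct{}{} // blocks while maxConcurrent lookups are in flight
		go func(ip string) {
			defer wg.Done()
			defer func() { <-sem }()
			ctx, cancel := context.WithTimeout(context.Background(), timeout)
			defer cancel()
			host, err := lookup(ctx, ip)
			if err != nil {
				return
			}
			mu.Lock()
			results[ip] = host
			mu.Unlock()
		}(ip)
	}
	wg.Wait()
	return results
}

func main() {
	ips := []string{"10.0.0.1", "10.0.0.2", "10.0.0.3"}
	res := resolveBatch(ips, defaultBatchSize, defaultMaxConcurrent, defaultTimeout, fakeLookup)
	fmt.Println(len(res)) // 3
}
```

With the defaults, a cycle issues at most 500 queries, never more than 50 at a time, which bounds the load placed on upstream DNS servers.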
DNS Server Failover:
- Servers ordered by priority (lower = first)
- Health tracking: 3 consecutive failures mark a server unhealthy
- Unhealthy servers skipped for 5 minutes, then retried
- NXDOMAIN is a valid response, not a failure
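The health-tracking rules can be sketched as a small state machine. All type and function names here are hypothetical, the clock is injected so the cooldown can be demonstrated without waiting, and restarting the 5-minute window when a retry fails is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

const (
	maxFailures = 3               // consecutive failures before a server is skipped
	cooldown    = 5 * time.Minute // how long an unhealthy server is skipped
)

// serverHealth is the per-server state.
type serverHealth struct {
	failures    int       // current streak of consecutive failures
	lastFailure time.Time // time of the most recent failure
}

// healthTracker records lookup outcomes per server address.
type healthTracker struct {
	now     func() time.Time
	servers map[string]*serverHealth
}

func newHealthTracker(now func() time.Time) *healthTracker {
	return &healthTracker{now: now, servers: make(map[string]*serverHealth)}
}

func (h *healthTracker) state(addr string) *serverHealth {
	s, ok := h.servers[addr]
	if !ok {
		s = &serverHealth{}
		h.servers[addr] = s
	}
	return s
}

// recordSuccess resets the streak. An NXDOMAIN answer counts as a success.
func (h *healthTracker) recordSuccess(addr string) {
	h.state(addr).failures = 0
}

// recordFailure counts a timeout or transport error.
func (h *healthTracker) recordFailure(addr string) {
	s := h.state(addr)
	s.failures++
	s.lastFailure = h.now()
}

// usable reports whether the server should be tried right now.
func (h *healthTracker) usable(addr string) bool {
	s := h.state(addr)
	if s.failures < maxFailures {
		return true
	}
	return h.now().Sub(s.lastFailure) >= cooldown // retry once the cooldown has passed
}

// manualClock only moves when Advance is called.
type manualClock struct{ t time.Time }

func (c *manualClock) Now() time.Time          { return c.t }
func (c *manualClock) Advance(d time.Duration) { c.t = c.t.Add(d) }

func main() {
	clk := &manualClock{}
	h := newHealthTracker(clk.Now)
	for i := 0; i < maxFailures; i++ {
		h.recordFailure("8.8.8.8")
	}
	fmt.Println(h.usable("8.8.8.8")) // false: three failures in a row
	clk.Advance(cooldown)
	fmt.Println(h.usable("8.8.8.8")) // true: cooldown elapsed, retry allowed
}
```

Counting NXDOMAIN as a success matters: a server that correctly answers "no such record" is healthy, and penalizing it would trigger failover for networks with sparse PTR coverage.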
3. Go Flow Enricher
Environment Variables:
| Variable | Example | Description |
|---|---|---|
| REDIS_URL | redis://redis:6379 | Redis connection URL |
| ENABLE_DNS | true | Enable DNS enrichment |
Stats Logging (every 30 seconds):
[Stats] Flows: 17285 processed | DNS: 655 hits, 73 misses (90.0% hit rate)
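The hit rate in that line is hits divided by total lookups. A minimal sketch of the calculation and formatting follows; the function names are hypothetical and the enricher's actual logging code may differ.

```go
package main

import "fmt"

// hitRate returns the cache hit percentage, or 0 when nothing was looked up.
func hitRate(hits, misses uint64) float64 {
	total := hits + misses
	if total == 0 {
		return 0
	}
	return float64(hits) / float64(total) * 100
}

// statsLine formats counters in the same shape as the log line above.
func statsLine(flows, hits, misses uint64) string {
	return fmt.Sprintf("[Stats] Flows: %d processed | DNS: %d hits, %d misses (%.1f%% hit rate)",
		flows, hits, misses, hitRate(hits, misses))
}

func main() {
	// 655 / (655 + 73) = 0.8997..., which rounds to 90.0%
	fmt.Println(statsLine(17285, 655, 73))
}
```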
UI Configuration
Access DNS settings via Settings → DNS Resolution:
- DNS Cache Statistics — Cache size, successful/failed lookups
- DNS Resolver Settings — Enable/disable, public/private toggles
- DNS Server Management — Add/edit/delete/test/reorder servers
- Refresh button — Update stats
- Clear cache button — Purge all cached entries
Default DNS Servers
| Priority | Name | Address | Use For |
|---|---|---|---|
| 0 | System Default | system | All |
| 100 | Google DNS | 8.8.8.8 | All |
| 101 | Google DNS Secondary | 8.8.4.4 | All |
| 200 | Cloudflare DNS | 1.1.1.1 | All |
| 201 | Cloudflare DNS Secondary | 1.0.0.1 | All |
Environment Variables
Backend service in docker-compose.yml:
environment:
  - REDIS_URL=redis://redis:6379
Flow-enricher service:
environment:
  - REDIS_URL=redis://redis:6379
  - ENABLE_DNS=true
Verification Commands
Check Redis Cache
# Count cached entries
docker exec chompy-redis redis-cli DBSIZE
# List some keys (SCAN-based, so it does not block Redis the way KEYS can on a large cache)
docker exec chompy-redis redis-cli --scan --pattern 'rdns:*' | head -20
# Lookup specific IP
docker exec chompy-redis redis-cli GET 'rdns:192.168.100.132'
Check Flow Enricher Stats
docker logs flow-enricher 2>&1 | grep Stats | tail -5
Expected output:
[Stats] Flows: 17285 processed | DNS: 655 hits, 73 misses (90.0% hit rate)
Test API Endpoints
# Get stats
curl -s http://localhost:3001/api/dns/stats | jq
# Test single lookup
curl -s "http://localhost:3001/api/dns/lookup?ip=8.8.8.8" | jq
# List servers
curl -s http://localhost:3001/api/dns/servers | jq
Troubleshooting
No Hostnames in Flows
Checklist:
- Redis running? docker ps | grep redis
- DNS resolver started? docker logs chompy-backend 2>&1 | grep DNS
- Flow enricher connected? docker logs flow-enricher 2>&1 | grep Redis
- Cache populated? docker exec chompy-redis redis-cli DBSIZE
Low Cache Hit Rate
Possible causes:
- Cache recently cleared
- High volume of new unique IPs
- TTL too short
Check: Flow enricher logs show hit/miss rate
Performance Considerations
Memory Usage
- Each cached entry: ~100 bytes
- 512MB Redis limit: ~5 million entries
- LRU eviction automatically removes old entries
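The capacity estimate is a simple division. The sketch below reproduces it; the ~100 bytes per entry figure comes from the list above and ignores Redis's own per-key overhead, so the result is a rough upper bound.

```go
package main

import "fmt"

// maxEntries estimates how many cache entries fit under a memory limit.
func maxEntries(limitBytes, bytesPerEntry int64) int64 {
	if bytesPerEntry <= 0 {
		return 0
	}
	return limitBytes / bytesPerEntry
}

func main() {
	const limit = 512 * 1024 * 1024     // --maxmemory 512mb
	fmt.Println(maxEntries(limit, 100)) // 5368709, i.e. roughly 5 million
}
```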
Lookup Performance
- Redis pipeline lookup: ~0.1-0.5ms per batch
- No blocking on cache miss (returns empty string)
- Flow enricher throughput unaffected
DNS Query Rate
- Batch processing limits query rate
- Concurrent lookups capped at 50 by default
- Unhealthy servers automatically skipped
Typical Configurations
Internal Network Only
DNS Resolver: Enabled
Public IPs: Disabled (reduces cache size)
Private IPs: Enabled
DNS Servers:
  Priority 0: Internal DNS (10.0.0.53) - Private only
Full Resolution
DNS Resolver: Enabled
Public IPs: Enabled
Private IPs: Enabled
DNS Servers:
  Priority 0: System Default - All
  Priority 10: Internal DNS (10.0.0.53) - Private only
  Priority 100: Google DNS (8.8.8.8) - Public only
  Priority 101: Cloudflare (1.1.1.1) - Public only (backup)
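To illustrate how priority and the "Use For" scope might combine, the sketch below picks the eligible servers for a given IP from the Full Resolution example. The types and function names are hypothetical, and using the RFC 1918 private check to decide between "Private only" and "Public only" is an assumption about how the resolver classifies addresses.

```go
package main

import (
	"fmt"
	"net/netip"
	"sort"
)

// scope mirrors the "Use For" setting of a DNS server.
type scope int

const (
	useAll scope = iota
	usePrivateOnly
	usePublicOnly
)

type dnsServer struct {
	Priority int
	Name     string
	Address  string
	UseFor   scope
}

// fullResolution is the "Full Resolution" example, deliberately unsorted.
var fullResolution = []dnsServer{
	{100, "Google DNS", "8.8.8.8", usePublicOnly},
	{0, "System Default", "system", useAll},
	{101, "Cloudflare", "1.1.1.1", usePublicOnly},
	{10, "Internal DNS", "10.0.0.53", usePrivateOnly},
}

// serversFor returns the servers eligible for ip, lowest priority number first.
// It returns nil if ip does not parse.
func serversFor(ip string, servers []dnsServer) []dnsServer {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return nil
	}
	private := addr.IsPrivate()
	var out []dnsServer
	for _, s := range servers {
		if (s.UseFor == usePrivateOnly && !private) || (s.UseFor == usePublicOnly && private) {
			continue
		}
		out = append(out, s)
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Priority < out[j].Priority })
	return out
}

func main() {
	for _, ip := range []string{"192.168.100.132", "8.8.8.8"} {
		fmt.Print(ip, ":")
		for _, s := range serversFor(ip, fullResolution) {
			fmt.Print(" ", s.Name)
		}
		fmt.Println()
	}
}
```

For a private address this yields System Default followed by Internal DNS; for a public address it yields System Default, Google DNS, then Cloudflare, which matches the failover order the priorities imply.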