DNS Resolution System

Overview

The DNS Resolution System provides automatic hostname resolution for IP addresses in network flow data. It consists of a background resolver service that populates a Redis cache, which is then read by the Go flow-enricher to add hostnames to flows before they're stored in ClickHouse.

Key Features

  • Asynchronous Resolution — DNS lookups happen in the background, never blocking flow processing
  • Redis Caching — Sub-millisecond lookups with configurable TTLs
  • Multi-Server Failover — Configure multiple DNS servers with automatic health tracking
  • Public/Private IP Filtering — Independently enable/disable resolution for public vs private IPs
  • UI Configuration — All settings manageable via the web interface
  • High Performance — 90%+ cache hit rates, minimal impact on flow processing

Architecture

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   ClickHouse    │     │   PostgreSQL    │     │      Redis      │
│   (Flow Data)   │     │  (DNS Config)   │     │   (DNS Cache)   │
└────────┬────────┘     └────────┬────────┘     └────────┬────────┘
         │                       │                       │
         │ Query unique IPs      │ Load config           │ Cache lookups
         │                       │                       │
         ▼                       ▼                       │
┌─────────────────────────────────────────────┐          │
│            DNS Resolver Service             │          │
│             (Node.js - Backend)             │◄─────────┘
│                                             │
│  • Queries ClickHouse for unique IPs        │
│  • Filters by public/private settings       │
│  • Performs DNS reverse lookups             │
│  • Caches results in Redis                  │
│  • Tracks DNS server health                 │
└─────────────────────────────────────────────┘

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│      Kafka      │────▶│  Flow Enricher  │────▶│   ClickHouse    │
│   (Raw Flows)   │     │  (Go Service)   │     │ (Enriched Flows)│
└─────────────────┘     └────────┬────────┘     └─────────────────┘
                                 │
                                 │ Bulk lookup
                                 ▼
                        ┌─────────────────┐
                        │      Redis      │
                        │   (DNS Cache)   │
                        └─────────────────┘

Data Flow

DNS Resolver Service (runs every 30 seconds by default, set by lookup_interval_ms)

  1. Queries ClickHouse for unique IPs from recent flows
  2. Filters IPs based on config (public/private settings)
  3. Checks Redis for already-cached entries
  4. Performs DNS reverse lookups for uncached IPs
  5. Stores results in Redis with configurable TTL
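The five steps above can be sketched as one function. This is a minimal illustration, not the Node.js service's code: the name run_cycle, the dict standing in for Redis, and the injected reverse_lookup callable are all assumptions made for the example. Python's ipaddress.is_private is also broader than RFC1918 (it includes loopback and link-local ranges), so it only approximates the private-IP toggle.

```python
import ipaddress
from typing import Callable, Dict, Iterable, Optional, Tuple

SUCCESS_TTL = 86400   # seconds, documented default
NXDOMAIN_TTL = 3600   # seconds, documented default


def run_cycle(
    candidate_ips: Iterable[str],
    cache: Dict[str, Tuple[str, int]],
    reverse_lookup: Callable[[str], Optional[str]],
    resolve_public: bool = True,
    resolve_private: bool = True,
) -> int:
    """Run one resolver cycle and return the number of lookups performed.

    `cache` maps "rdns:<ip>" to (value, ttl_seconds); it is a stand-in
    for Redis, not a Redis client.
    """
    performed = 0
    for ip in candidate_ips:  # step 1 output: unique IPs from recent flows
        is_private = ipaddress.ip_address(ip).is_private
        # Step 2: honour the public/private toggles.
        if is_private and not resolve_private:
            continue
        if not is_private and not resolve_public:
            continue
        key = f"rdns:{ip}"
        # Step 3: skip IPs that are already cached.
        if key in cache:
            continue
        # Step 4: reverse lookup; None stands for "no PTR record".
        hostname = reverse_lookup(ip)
        performed += 1
        # Step 5: store the result with the TTL that matches the outcome.
        if hostname is None:
            cache[key] = ("NXDOMAIN", NXDOMAIN_TTL)
        else:
            cache[key] = (hostname, SUCCESS_TTL)
    return performed
```

Passing the lookup function in as a parameter keeps the sketch testable without network access; a real resolver would issue PTR queries against the configured DNS servers instead.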

Go Flow Enricher (processes each flow)

  1. Reads flows from Kafka
  2. Extracts src_addr and dst_addr
  3. Performs bulk Redis lookup (pipeline)
  4. Adds src_hostname and dst_hostname to flow
  5. Writes enriched flow to ClickHouse
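The enrichment steps can be illustrated with a short sketch. The real enricher is a Go service using a Redis pipeline; here a plain dict stands in for Redis, and the function names bulk_lookup and enrich are hypothetical. Mapping the NXDOMAIN sentinel to an empty hostname is an assumption, since the document only states that a cache miss yields an empty string.

```python
from typing import Dict, List


def bulk_lookup(cache: Dict[str, str], ips: List[str]) -> Dict[str, str]:
    """Stand-in for one pipelined Redis round trip covering all IPs."""
    result = {}
    for ip in ips:
        value = cache.get(f"rdns:{ip}", "")  # a miss yields an empty string
        # Assumption: the NXDOMAIN sentinel is not written into the flow.
        result[ip] = "" if value == "NXDOMAIN" else value
    return result


def enrich(flows: List[dict], cache: Dict[str, str]) -> List[dict]:
    """Add src_hostname and dst_hostname to each flow without blocking."""
    # Collect each distinct address once so the batch needs a single lookup.
    ips = sorted({f[field] for f in flows for field in ("src_addr", "dst_addr")})
    hostnames = bulk_lookup(cache, ips)
    return [
        {
            **f,
            "src_hostname": hostnames[f["src_addr"]],
            "dst_hostname": hostnames[f["dst_addr"]],
        }
        for f in flows
    ]
```

Because a miss simply produces an empty hostname, the enricher never waits on DNS; the background resolver fills the cache on a later cycle.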

Components

1. Redis Cache

Purpose: Fast in-memory storage for DNS lookups

Docker Service (docker-compose.yml):

redis:
  image: redis:7-alpine
  container_name: chompy-redis
  restart: unless-stopped
  networks:
    - netflow_network
  ports:
    - "6379:6379"
  volumes:
    - redis_data:/data
  command: redis-server --appendonly yes --maxmemory 512mb --maxmemory-policy allkeys-lru
  healthcheck:
    test: ["CMD", "redis-cli", "ping"]
    interval: 10s
    timeout: 5s
    retries: 3

Key Format: rdns:<ip_address>

Value Format:

  • Resolved hostname (e.g., vmware.home.local)
  • NXDOMAIN for IPs that don't resolve

TTLs:

  • Success: 86400 seconds (24 hours) - configurable
  • NXDOMAIN: 3600 seconds (1 hour) - configurable
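To make the key, value, and TTL conventions concrete, here is a small sketch of how the two TTLs play out over time. TTLCache is a hypothetical in-memory stand-in for Redis expiry semantics, and store_result is an illustrative helper, not a function from the resolver.

```python
from typing import Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory stand-in for Redis keys with an expiry."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[str, float]] = {}

    def set(self, key: str, value: str, ttl: int, now: float) -> None:
        self._data[key] = (value, now + ttl)

    def get(self, key: str, now: float) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None or now >= entry[1]:
            self._data.pop(key, None)  # expired entries behave like misses
            return None
        return entry[0]


def store_result(cache: TTLCache, ip: str, hostname: Optional[str], now: float,
                 success_ttl: int = 86400, nxdomain_ttl: int = 3600) -> None:
    """Write rdns:<ip> with the TTL that matches the lookup outcome."""
    if hostname is None:
        cache.set(f"rdns:{ip}", "NXDOMAIN", nxdomain_ttl, now)
    else:
        cache.set(f"rdns:{ip}", hostname, success_ttl, now)
```

With the defaults, a negative result is retried after an hour while a resolved hostname is kept for a day, so newly added PTR records are picked up reasonably quickly without re-querying stable names.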

2. DNS Resolver Service (Node.js)

Purpose: Background service that populates Redis cache

Configuration (loaded from PostgreSQL):

Setting                  Default  Description
enabled                  true     Master enable/disable
resolve_public_ips       true     Resolve public IP addresses
resolve_private_ips      true     Resolve RFC1918 addresses
min_flow_count           3        Minimum flows before resolving an IP
batch_size               500      Max IPs to process per cycle
lookup_interval_ms       30000    Time between processing cycles (milliseconds)
success_ttl              86400    Cache TTL for successful lookups (seconds)
nxdomain_ttl             3600     Cache TTL for NXDOMAIN responses (seconds)
max_concurrent_lookups   50       Parallel DNS queries
dns_timeout_ms           5000     DNS query timeout (milliseconds)
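As an illustration of how min_flow_count and batch_size might interact, the sketch below picks the IPs for one cycle from a map of per-IP flow counts. The function name select_candidates is hypothetical, and two details are assumptions: that "minimum flows" means a count greater than or equal to the setting, and that busier IPs are preferred when the batch is full.

```python
from typing import Dict, List


def select_candidates(
    flow_counts: Dict[str, int],
    min_flow_count: int = 3,
    batch_size: int = 500,
) -> List[str]:
    """Pick the IPs to resolve in one cycle."""
    # Drop IPs seen in fewer flows than the threshold.
    eligible = [ip for ip, count in flow_counts.items() if count >= min_flow_count]
    # Assumption: busiest IPs first; ties are broken by address for determinism.
    eligible.sort(key=lambda ip: (-flow_counts[ip], ip))
    # Never exceed the per-cycle cap.
    return eligible[:batch_size]
```

Raising min_flow_count trims one-off addresses (scanners, transient clients) from the DNS workload, while batch_size bounds how many queries a single cycle can generate.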

DNS Server Failover:

  • Servers ordered by priority (lower = first)
  • Health tracking: 3 consecutive failures mark a server unhealthy
  • Unhealthy servers skipped for 5 minutes, then retried
  • NXDOMAIN is a valid response, not a failure
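The failover rules above can be sketched as a small state machine. This is an illustrative model only: the Server class and pick_server function are hypothetical names, and time is passed in explicitly so the behavior is easy to follow and test.

```python
from dataclasses import dataclass
from typing import List, Optional

FAILURE_THRESHOLD = 3    # consecutive failures before a server is skipped
COOLDOWN_SECONDS = 300   # unhealthy servers are retried after 5 minutes


@dataclass
class Server:
    name: str
    priority: int
    consecutive_failures: int = 0
    unhealthy_since: Optional[float] = None

    def available(self, now: float) -> bool:
        """Healthy servers, and unhealthy ones whose cooldown has elapsed."""
        if self.unhealthy_since is None:
            return True
        return now - self.unhealthy_since >= COOLDOWN_SECONDS

    def record(self, ok: bool, now: float) -> None:
        """ok=True covers answers and NXDOMAIN; only errors and timeouts fail."""
        if ok:
            self.consecutive_failures = 0
            self.unhealthy_since = None
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= FAILURE_THRESHOLD:
            self.unhealthy_since = now


def pick_server(servers: List[Server], now: float) -> Optional[Server]:
    """Return the available server with the lowest priority number."""
    usable = [s for s in servers if s.available(now)]
    return min(usable, key=lambda s: s.priority) if usable else None
```

Counting NXDOMAIN as a success matters here: a server that correctly reports "no such name" for many unresolvable IPs would otherwise be marked unhealthy.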

3. Go Flow Enricher

Environment Variables:

Variable    Example             Description
REDIS_URL   redis://redis:6379  Redis connection URL
ENABLE_DNS  true                Enable DNS enrichment

Stats Logging (every 30 seconds):

[Stats] Flows: 17285 processed | DNS: 655 hits, 73 misses (90.0% hit rate)
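The hit rate in that line is hits divided by total lookups (hits plus misses). A minimal sketch that reproduces the format, assuming a hypothetical helper named format_stats:

```python
def format_stats(flows: int, hits: int, misses: int) -> str:
    """Build a stats line in the format shown above."""
    lookups = hits + misses
    # Guard against division by zero before any lookups have happened.
    rate = 100.0 * hits / lookups if lookups else 0.0
    return (
        f"[Stats] Flows: {flows} processed | "
        f"DNS: {hits} hits, {misses} misses ({rate:.1f}% hit rate)"
    )


print(format_stats(17285, 655, 73))
```

For the sample values, 655 / (655 + 73) is about 0.8997, which rounds to the 90.0% shown in the log line.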

UI Configuration

Access DNS settings via Settings → DNS Resolution:

  • DNS Cache Statistics — Cache size, successful/failed lookups
  • DNS Resolver Settings — Enable/disable, public/private toggles
  • DNS Server Management — Add/edit/delete/test/reorder servers
  • Refresh button — Update stats
  • Clear cache button — Purge all cached entries

Default DNS Servers

Priority  Name                      Address  Use For
0         System Default            system   All
100       Google DNS                8.8.8.8  All
101       Google DNS Secondary      8.8.4.4  All
200       Cloudflare DNS            1.1.1.1  All
201       Cloudflare DNS Secondary  1.0.0.1  All

Environment Variables

Backend service in docker-compose.yml:

environment:
  - REDIS_URL=redis://redis:6379

Flow-enricher service:

environment:
  - REDIS_URL=redis://redis:6379
  - ENABLE_DNS=true

Verification Commands

Check Redis Cache

# Count cached entries
docker exec chompy-redis redis-cli DBSIZE

# List some keys (SCAN-based, so it does not block Redis the way KEYS can on a large cache)
docker exec chompy-redis redis-cli --scan --pattern 'rdns:*' | head -20

# Lookup specific IP
docker exec chompy-redis redis-cli GET 'rdns:192.168.100.132'

Check Flow Enricher Stats

docker logs flow-enricher 2>&1 | grep Stats | tail -5

Expected output:

[Stats] Flows: 17285 processed | DNS: 655 hits, 73 misses (90.0% hit rate)

Test API Endpoints

# Get stats
curl -s http://localhost:3001/api/dns/stats | jq

# Test single lookup
curl -s "http://localhost:3001/api/dns/lookup?ip=8.8.8.8" | jq

# List servers
curl -s http://localhost:3001/api/dns/servers | jq

Troubleshooting

No Hostnames in Flows

Checklist:

  1. Redis running? docker ps | grep redis
  2. DNS resolver started? docker logs chompy-backend 2>&1 | grep DNS
  3. Flow enricher connected? docker logs flow-enricher 2>&1 | grep Redis
  4. Cache populated? docker exec chompy-redis redis-cli DBSIZE

Low Cache Hit Rate

Possible causes:

  • Cache recently cleared
  • High volume of new unique IPs
  • TTL too short

Check: the flow enricher's [Stats] log line reports hits, misses, and the hit rate


Performance Considerations

Memory Usage

  • Each cached entry: ~100 bytes
  • 512MB Redis limit: ~5 million entries
  • LRU eviction automatically removes old entries
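The capacity figure follows from simple division. The sketch below reproduces it, taking the ~100 bytes per entry as the rough estimate it is; actual Redis per-key overhead varies with key and value length.

```python
MAXMEMORY_BYTES = 512 * 1024 * 1024  # --maxmemory 512mb (Redis treats mb as 1024*1024)
BYTES_PER_ENTRY = 100                # rough per-entry estimate from the list above


def max_entries(maxmemory_bytes: int = MAXMEMORY_BYTES,
                bytes_per_entry: int = BYTES_PER_ENTRY) -> int:
    """Approximate number of cache entries that fit before LRU eviction starts."""
    return maxmemory_bytes // bytes_per_entry


print(max_entries())  # 5368709, i.e. roughly 5 million entries
```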

Lookup Performance

  • Redis pipeline lookup: ~0.1-0.5ms per batch
  • No blocking on cache miss (returns empty string)
  • Flow enricher throughput unaffected

DNS Query Rate

  • Batch processing limits query rate
  • Concurrent lookups capped at 50 by default
  • Unhealthy servers automatically skipped
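A rough worst-case estimate shows how these limits interact with the cycle interval. The sketch assumes lookups proceed in waves of max_concurrent_lookups and that every query runs to its full timeout, and it ignores failover retries; the function name is hypothetical.

```python
import math


def worst_case_cycle_seconds(
    batch_size: int = 500,
    max_concurrent_lookups: int = 50,
    dns_timeout_ms: int = 5000,
) -> float:
    """Upper bound on DNS time per cycle if every lookup times out."""
    # Number of sequential waves needed to work through the batch.
    waves = math.ceil(batch_size / max_concurrent_lookups)
    return waves * dns_timeout_ms / 1000


print(worst_case_cycle_seconds())  # 50.0
```

With the defaults that is 10 waves of 5 seconds, or 50 seconds, which is longer than the 30-second lookup interval. If the configured DNS servers are slow or unreachable, a smaller batch_size or a shorter dns_timeout_ms may keep cycles from running long.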

Typical Configurations

Internal Network Only

DNS Resolver: Enabled
Public IPs: Disabled (reduces cache size)
Private IPs: Enabled

DNS Servers:
  Priority 0: Internal DNS (10.0.0.53) - Private only

Full Resolution

DNS Resolver: Enabled
Public IPs: Enabled
Private IPs: Enabled

DNS Servers:
  Priority 0: System Default - All
  Priority 10: Internal DNS (10.0.0.53) - Private only
  Priority 100: Google DNS (8.8.8.8) - Public only
  Priority 101: Cloudflare (1.1.1.1) - Public only (backup)
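To show how the "Private only" and "Public only" scopes could combine with priority ordering, here is a sketch that lists the servers eligible for a given IP under the Full Resolution setup. The tuple layout and the function name servers_for are assumptions for illustration, and ipaddress.is_private is a broader test than RFC1918.

```python
import ipaddress
from typing import List, Tuple

# (priority, name, scope) where scope is "all", "private", or "public"
FULL_RESOLUTION: List[Tuple[int, str, str]] = [
    (0, "System Default", "all"),
    (10, "Internal DNS (10.0.0.53)", "private"),
    (100, "Google DNS (8.8.8.8)", "public"),
    (101, "Cloudflare (1.1.1.1)", "public"),
]


def servers_for(ip: str, servers: List[Tuple[int, str, str]]) -> List[str]:
    """Return the names of servers eligible for this IP, in priority order."""
    wanted = "private" if ipaddress.ip_address(ip).is_private else "public"
    matching = [s for s in servers if s[2] in ("all", wanted)]
    # Tuples sort by their first element, so this orders by priority.
    return [name for _, name, _ in sorted(matching)]


print(servers_for("192.168.100.132", FULL_RESOLUTION))
print(servers_for("8.8.8.8", FULL_RESOLUTION))
```

Under this reading, private addresses are never sent to the public resolvers, which keeps internal IPs from leaking into external PTR queries.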