AI Data Masking

WhiteOwl Networks includes a data masking service that protects sensitive network information before it is sent to external AI providers. When enabled, IP addresses, hostnames, and device names are replaced with anonymized placeholders before leaving your environment. AI responses are then unmasked before being displayed, so the user experience remains seamless.

Overview

The AI assistant, alert processor, and report generator all send network context to an external AI provider (Anthropic Claude) for analysis. This context includes IP addresses, device names, hostnames, and other potentially sensitive infrastructure details. Data masking is designed to keep this information from leaving your network in its original form.

How It Works

                  ┌──────────────┐
User Question ──▶ │ Gather       │
                  │ Network      │
                  │ Context      │
                  └──────┬───────┘
                         │
                  ┌──────▼───────┐
                  │ DataMasker   │
                  │ .maskContext │──▶ 192.168.1.1 → 192.0.2.1
                  │              │    core-rtr-01 → device-001
                  └──────┬───────┘
                         │
                  ┌──────▼───────┐
                  │ AI Provider  │  Only sees anonymized data
                  │ (Claude)     │
                  └──────┬───────┘
                         │
                  ┌──────▼───────┐
                  │ DataMasker   │
                  │ .unmask      │──▶ 192.0.2.1 → 192.168.1.1
                  │              │    device-001 → core-rtr-01
                  └──────┬───────┘
                         │
                         ▼
                     Response ──▶ User sees real IPs and names

The masking is bidirectional and deterministic within a single request: the same real IP always maps to the same masked IP, ensuring the AI's analysis remains internally consistent even though it never sees real addresses.

Enabling Data Masking

Data masking is controlled via the AI Settings page under Settings → AI Configuration.

Toggle the Mask Sensitive Data option to enable masking across all AI features. The toggle is stored in the mask_data column of the ai_settings table, which is added by the following migration:

ALTER TABLE ai_settings ADD COLUMN mask_data BOOLEAN DEFAULT false;

When disabled, all data passes through to the AI provider unchanged. When enabled, masking is applied to the AI chatbot, report generator, and alert processor.

What Gets Masked

IP Addresses

All IPv4 addresses are replaced with addresses from RFC 5737 documentation ranges, which are reserved for documentation and examples and will never appear in real network traffic:

| Real Address | Masked As | RFC 5737 Range |
|---|---|---|
| 192.168.1.1 | 192.0.2.1 | 192.0.2.0/24 (first 254 addresses) |
| 10.0.0.50 | 198.51.100.1 | 198.51.100.0/24 (next 254 addresses) |
| 172.16.200.5 | 203.0.113.1 | 203.0.113.0/24 (next 254 addresses) |

The masker maintains a sequential counter across the three documentation ranges, supporting up to 762 unique IP addresses per request. Each unique real IP receives a unique masked IP, and the mapping is consistent within the request — if 192.168.1.1 maps to 192.0.2.1, every occurrence in the context and user message is replaced.
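The sequential counter can be pictured with a minimal sketch like the one below. It assumes a strictly sequential fill of the three ranges (254 hosts each, .1 through .254); `IpAllocator` and its fields are hypothetical names, and the real DataMasker may assign slots in a different order.

```javascript
// Hypothetical sketch: allocate masked IPs sequentially from the three
// RFC 5737 documentation ranges (254 usable hosts each, 762 in total).
const DOC_PREFIXES = ['192.0.2', '198.51.100', '203.0.113'];
const HOSTS_PER_RANGE = 254;

class IpAllocator {
  constructor() {
    this.forward = new Map(); // real IP -> masked IP
    this.reverse = new Map(); // masked IP -> real IP
  }

  mask(realIp) {
    // Reuse the existing mapping so repeated IPs stay consistent.
    const existing = this.forward.get(realIp);
    if (existing !== undefined) return existing;

    const index = this.forward.size; // 0-based count of unique IPs seen so far
    if (index >= DOC_PREFIXES.length * HOSTS_PER_RANGE) {
      throw new RangeError('More than 762 unique IPs in one request');
    }
    const prefix = DOC_PREFIXES[Math.floor(index / HOSTS_PER_RANGE)];
    const host = (index % HOSTS_PER_RANGE) + 1; // .1 through .254
    const masked = `${prefix}.${host}`;
    this.forward.set(realIp, masked);
    this.reverse.set(masked, realIp);
    return masked;
  }
}

const ips = new IpAllocator();
console.log(ips.mask('192.168.1.1')); // 192.0.2.1
console.log(ips.mask('10.0.0.50'));   // 192.0.2.2 (next free slot in this sketch)
console.log(ips.mask('192.168.1.1')); // 192.0.2.1 again
```

Keeping a reverse map alongside the forward map is what makes unmasking a simple lookup later on.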

Hostnames and Device Names

Device names and hostnames are replaced with generic sequential identifiers:

| Real Name | Masked As |
|---|---|
| core-router-01 | device-001 |
| edge-switch-nyc-02 | device-002 |
| fw-dmz-primary | device-003 |

Hostname masking is applied to fields like device_name, mgmt_ip, host, src_hostname, dst_hostname, and any other string fields that contain known device names or hostnames.
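As a rough illustration of how known names could be replaced inside free-text fields, here is a minimal sketch. `HostnameAllocator`, `maskKnownNames`, and `escapeRegExp` are hypothetical names, and numbering by order of first appearance is an assumption of this sketch rather than documented behavior.

```javascript
// Hypothetical sketch of hostname masking; names here are illustrative,
// not the actual DataMasker internals.
class HostnameAllocator {
  constructor() {
    this.forward = new Map(); // real name -> masked name
    this.reverse = new Map(); // masked name -> real name
  }

  mask(realName) {
    const existing = this.forward.get(realName);
    if (existing !== undefined) return existing;
    // device-001, device-002, ... in order of first appearance
    const masked = `device-${String(this.forward.size + 1).padStart(3, '0')}`;
    this.forward.set(realName, masked);
    this.reverse.set(masked, realName);
    return masked;
  }
}

// Escape characters that have special meaning in a regular expression.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Replace every occurrence of a known device name inside free text.
function maskKnownNames(text, knownNames, allocator) {
  if (knownNames.length === 0) return text;
  // Longest names first, so a longer name wins over a name that is its prefix.
  const alternatives = [...knownNames]
    .sort((a, b) => b.length - a.length)
    .map(escapeRegExp)
    .join('|');
  return text.replace(new RegExp(alternatives, 'g'), (name) => allocator.mask(name));
}

const names = new HostnameAllocator();
const known = ['core-router-01', 'edge-switch-nyc-02', 'fw-dmz-primary'];
console.log(maskKnownNames('core-router-01 lost its uplink to edge-switch-nyc-02', known, names));
// device-001 lost its uplink to device-002
```

Replacing all names in a single regex pass avoids re-scanning text that has already been substituted.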

Integration Points

Data masking is integrated into three AI-powered features:

AI Chat Assistant

In services/aiContext.js, masking wraps the context gathering and prompt construction:

const masker = new DataMasker({ enabled: aiSettings.mask_data ?? false });
const maskedContext = masker.maskContext(networkContext);
const maskedMessage = masker.maskContext(message);

// Build prompt with maskedContext, send maskedMessage to AI
const response = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024, // required by the Messages API; this value is illustrative
  system: systemPrompt, // contains maskedContext
  messages: [
    ...conversationHistory,
    { role: 'user', content: maskedMessage }
  ]
});

// Unmask before returning to user
const assistantMessage = masker.unmaskResponse(response.content[0].text);

The user's question is also masked, so if they ask "What's happening on core-router-01?", the AI sees "What's happening on device-001?" and responds accordingly. The response is then unmasked so the user sees their real device names.

Report Generator

In services/reportGenerator.js, the same pattern applies to scheduled and on-demand reports:

let maskEnabled = false;
try {
  const settings = await this.pgClient.query('SELECT mask_data FROM ai_settings WHERE id = 1');
  maskEnabled = settings.rows[0]?.mask_data ?? false;
} catch (e) {
  /* masking disabled on error */
}

const masker = new DataMasker({ enabled: maskEnabled });
const maskedData = masker.maskContext(reportData);

// Build report context using maskedData instead of reportData
// ...

// Unmask the AI-generated report
const result = JSON.parse(jsonMatch[0]);
return masker.unmaskObject(result);

Alert Processor

AI-powered alert analysis also masks context before sending alert details to the AI provider for root cause analysis.
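The alert processor's code is not shown here, but the mask, call, unmask flow it shares with the other features can be sketched as follows. `analyzeWithMasking`, `stubMasker`, and `fakeAi` are hypothetical stand-ins; only the `maskContext` and `unmaskResponse` method names come from this document.

```javascript
// Hypothetical helper: the mask -> call -> unmask flow shared by the AI features.
// `masker` is anything exposing maskContext() and unmaskResponse();
// `callAi` stands in for the real provider call and is an assumption.
async function analyzeWithMasking(masker, alertContext, callAi) {
  const maskedContext = masker.maskContext(alertContext);
  const aiText = await callAi(maskedContext); // provider only sees masked data
  return masker.unmaskResponse(aiText);
}

// Usage with stand-in objects (not the real DataMasker or provider client):
const stubMasker = {
  maskContext: (ctx) =>
    JSON.parse(JSON.stringify(ctx).replaceAll('core-rtr-01', 'device-001')),
  unmaskResponse: (text) => text.replaceAll('device-001', 'core-rtr-01'),
};
const fakeAi = async (ctx) => `Root cause likely on ${ctx.device}.`;

analyzeWithMasking(stubMasker, { device: 'core-rtr-01', alert: 'link down' }, fakeAi)
  .then((summary) => console.log(summary));
// Logs: Root cause likely on core-rtr-01.
```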

DataMasker API

The DataMasker class (services/dataMaskerService.js) provides four methods:

maskContext(data)

Recursively walks any data structure (objects, arrays, strings) and replaces all detected IP addresses and hostnames with their masked equivalents. Returns a deep copy — the original data is never modified.

const masker = new DataMasker({ enabled: true });
const masked = masker.maskContext({
  devices: [
    { name: 'core-rtr-01', ip: '192.168.1.1' },
    { name: 'edge-sw-02', ip: '10.0.0.50' }
  ]
});
// Result:
// {
//   devices: [
//     { name: 'device-001', ip: '192.0.2.1' },
//     { name: 'device-002', ip: '198.51.100.1' }
//   ]
// }
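The recursive, non-mutating walk can be illustrated with a small sketch. `maskDeep` and its `maskString` callback are hypothetical; the sketch handles plain JSON-like data only (strings, arrays, plain objects, primitives) and leaves object keys untouched.

```javascript
// Hypothetical sketch of a recursive walk that returns a deep copy.
// `maskString` stands in for whatever replaces IPs and hostnames in one string.
function maskDeep(value, maskString) {
  if (typeof value === 'string') return maskString(value);
  if (Array.isArray(value)) return value.map((item) => maskDeep(item, maskString));
  if (value !== null && typeof value === 'object') {
    const copy = {};
    for (const [key, child] of Object.entries(value)) {
      copy[key] = maskDeep(child, maskString);
    }
    return copy;
  }
  return value; // numbers, booleans, null and undefined pass through unchanged
}

const original = { devices: [{ name: 'core-rtr-01', port: 22 }] };
const maskedCopy = maskDeep(original, (s) => s.replace('core-rtr-01', 'device-001'));
console.log(maskedCopy.devices[0].name); // device-001
console.log(original.devices[0].name);   // core-rtr-01 (input untouched)
```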

unmaskResponse(text)

Replaces all masked values in a string with their original values. Used on AI responses before returning them to the user.

const response = "Device device-001 at 192.0.2.1 has high CPU usage.";
const unmasked = masker.unmaskResponse(response);
// "Device core-rtr-01 at 192.168.1.1 has high CPU usage."
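One subtlety when reversing the substitution in free text is that a placeholder can be a prefix of a longer token (192.0.2.1 inside 192.0.2.10). The sketch below shows one way to guard against that with regex lookarounds; `unmaskText` and `reverseMap` are hypothetical names, and this is not necessarily how unmaskResponse is implemented.

```javascript
// Escape characters that have special meaning in a regular expression.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Hypothetical sketch: restore original values in AI-generated text.
// `reverseMap` is a Map of masked value -> real value.
function unmaskText(text, reverseMap) {
  if (reverseMap.size === 0) return text;
  const alternatives = [...reverseMap.keys()]
    .sort((a, b) => b.length - a.length) // prefer the longest placeholder
    .map(escapeRegExp)
    .join('|');
  // The lookarounds reject matches inside a longer token, so "192.0.2.1"
  // is not replaced inside "192.0.2.10", nor "device-001" inside "device-0012".
  const pattern = new RegExp(`(?<![\\w.])(?:${alternatives})(?!\\w)(?!\\.\\d)`, 'g');
  return text.replace(pattern, (placeholder) => reverseMap.get(placeholder));
}

const reverseMap = new Map([
  ['device-001', 'core-rtr-01'],
  ['192.0.2.1', '192.168.1.1'],
]);
console.log(unmaskText('Device device-001 at 192.0.2.1 has high CPU usage.', reverseMap));
// Device core-rtr-01 at 192.168.1.1 has high CPU usage.
```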

unmaskObject(obj)

Like unmaskResponse, but recursively walks an object or array. Used when the AI returns structured JSON (such as report data).

getMappings()

Returns the current IP and hostname mapping tables. Useful for debugging:

console.log(masker.getMappings());
// {
//   ips: { '192.168.1.1': '192.0.2.1', '10.0.0.50': '198.51.100.1' },
//   hostnames: { 'core-rtr-01': 'device-001', 'edge-sw-02': 'device-002' }
// }

Security Considerations

Masking is per-request: Each API call creates a fresh DataMasker instance with new mappings. Mappings are not persisted or shared between requests.

Masking is not encryption: The anonymization uses simple substitution. It prevents accidental exposure of infrastructure details to external services but is not designed to withstand adversarial analysis.

Conversation history: When conversation history is sent for multi-turn chat, previous messages are not re-masked. If masking was disabled for earlier messages in a conversation, those messages retain their original content.

Graceful degradation: If the masker encounters an error, it defaults to disabled (pass-through) rather than blocking the AI feature entirely. In that case the request reaches the AI provider unmasked.
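The pass-through fallback can be pictured with a wrapper like the one below. This is a sketch under assumptions: `maskOrPassThrough` and its `log` parameter are hypothetical, and returning a `masked` flag is one possible way to let callers detect that anonymization did not happen.

```javascript
// Hypothetical wrapper illustrating fail-open behaviour. `masker` is any
// object with a maskContext() method; `log` is an assumed logging hook.
function maskOrPassThrough(masker, data, log = console.warn) {
  try {
    return { data: masker.maskContext(data), masked: true };
  } catch (err) {
    // Fail open: the AI feature keeps working, but `data` is NOT anonymized.
    log(`Data masking failed, sending unmasked context: ${err.message}`);
    return { data, masked: false };
  }
}

const brokenMasker = {
  maskContext() {
    throw new Error('mapping table full');
  },
};
const result = maskOrPassThrough(brokenMasker, { ip: '192.168.1.1' });
console.log(result.masked); // false
```

Deployments with stricter requirements might prefer to check the flag and refuse to send the request instead.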

Some IPs are not masked

The masker detects private IPv4 addresses using regex pattern matching. Addresses written in unusual formats, such as hexadecimal or integer notation, may not be caught. Standard dotted-decimal notation is fully supported.
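To see why non-standard notations slip through, consider a detection sketch along these lines. The pattern and the `findIpv4` and `isPrivateIpv4` helpers are hypothetical; the actual regex used by DataMasker is not shown in this document.

```javascript
// Hypothetical detection sketch for dotted-decimal IPv4 addresses.
const OCTET = '(?:25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)';
// Four octets, not glued to further digits or dots on either side.
const IPV4_PATTERN = new RegExp(
  `(?<![\\d.])${OCTET}(?:\\.${OCTET}){3}(?!\\d)(?!\\.\\d)`,
  'g'
);

function findIpv4(text) {
  return text.match(IPV4_PATTERN) ?? [];
}

// RFC 1918 private ranges: 10/8, 172.16/12, 192.168/16.
function isPrivateIpv4(ip) {
  const [a, b] = ip.split('.').map(Number);
  return a === 10 || (a === 172 && b >= 16 && b <= 31) || (a === 192 && b === 168);
}

console.log(findIpv4('uplink 192.168.1.1 -> 10.0.0.50')); // [ '192.168.1.1', '10.0.0.50' ]
console.log(findIpv4('peer 0xC0A80101 / 3232235777'));    // []
console.log(isPrivateIpv4('172.16.200.5'));               // true
```

Both 0xC0A80101 and 3232235777 encode 192.168.1.1, yet neither contains the dot-separated octets the pattern looks for.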