Network Diagnostic Platform

Deterministic.
Offline-first.
Evidence-driven.

NetDoctor ingests switch, router and firewall configurations, detects rogue devices via MAC intelligence, and runs deterministic rules and step-by-step playbooks, producing findings with cited evidence. Fully offline, read-only, zero trust in AI.

120+ Checks
42 Parsers
1,781 Tests
100% Offline

No arbitrary CLI

Devices are queried only through a fixed catalog of safe, read-only intents. No user-typed command strings ever reach the wire.
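
A fixed intent catalog can be pictured as a lookup table that callers address by name only. The sketch below is illustrative: the `Intent` fields, the `Risk` levels and the three sample entries are assumptions, not the product's actual catalog.

```python
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"


@dataclass(frozen=True)
class Intent:
    """One catalog entry: a fixed command string plus its safety metadata."""
    name: str
    command: str
    timeout_s: int
    risk: Risk


# Hypothetical entries for illustration; the real catalog is not shown here.
CATALOG = {
    intent.name: intent
    for intent in (
        Intent("show_version", "show version", 30, Risk.LOW),
        Intent("show_vlan_brief", "show vlan brief", 30, Risk.LOW),
        Intent("show_running_config", "show running-config", 120, Risk.MEDIUM),
    )
}


def resolve(intent_name: str) -> Intent:
    """Return the catalog entry or refuse; raw command text is never accepted."""
    try:
        return CATALOG[intent_name]
    except KeyError:
        raise PermissionError(f"unknown intent: {intent_name!r}") from None
```

Because the only input is a dictionary key, a string such as `configure terminal` has no path to the device: it is simply not a key.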

Evidence-first

Every finding carries provenance: which artifact, which line, which parsed field, which baseline value. No finding without evidence.

Offline core

The full diagnostic engine runs from uploaded files - no internet, no AI required. AI is an explanation layer, never truth.

Deterministic

Same inputs, same outputs. Rules operate on normalized snapshots and derived facts, not on raw text grep.

What it does - today

A complete diagnostic loop, end to end.

Every feature listed below is implemented, tested and shipping.

Ingest

Upload bundles

Upload configs, show outputs and zipped bundles via drag-and-drop or paste. Filename detection, content sniffing and gzip storage keep artifacts traceable.

Ingest

Read-only SSH

Collect live device output through Scrapli async SSH with parallel collection: baseline, topology, full and troubleshoot profiles. Per-device locks, per-site concurrency caps, per-command retries.

Engine

Cisco parsers

42 structured parsers normalize Cisco IOS / IOS-XE state: config, inventory, VLANs, trunks, interfaces, CDP, LLDP, STP, routes, MAC, ARP, PoE, environment and SNMP.

Engine

Snapshot model

Parsed data is merged into a canonical JSON snapshot, separating configured and observed state with consistency flags when they disagree.
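
One way to picture a configured-versus-observed field with a consistency flag is shown below. This is a minimal sketch; `MergedField` and `merge_field` are hypothetical names, and treating a missing side as "unknown" rather than "inconsistent" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MergedField:
    configured: Optional[str]   # value taken from the configuration
    observed: Optional[str]     # value taken from show output
    consistent: Optional[bool]  # None when one side is missing


def merge_field(configured: Optional[str], observed: Optional[str]) -> MergedField:
    """Keep both values side by side and flag whether they agree."""
    if configured is None or observed is None:
        return MergedField(configured, observed, None)
    return MergedField(configured, observed, configured == observed)


# A port configured as a trunk but operating as an access port.
mode = merge_field("trunk", "access")
```

Here `mode.consistent` is `False`, which is the kind of disagreement a rule can then cite as evidence.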

Engine

Derived facts

Role, uplink, stack, gateway and redundancy facts are computed once so every rule reads the same normalized model.

Rules

Baseline checks

Built-in and organization rules cover STP/VTP hygiene, VLANs, trunks, AAA, SNMP, NTP, DHCP snooping and DAI without dynamic eval.

Rules

Policy layers

Baselines merge built-in, global, environment, site, role and device layers, with per-rule overrides and a clear winning source.
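
The layered merge can be sketched as a fold over the layers in a fixed order, recording which layer supplied each value. The sketch assumes that later (more specific) layers override earlier ones; the key names are made up for illustration.

```python
from typing import Any

# Layer names in merge order, least specific first.
LAYER_ORDER = ["built_in", "global", "environment", "site", "role", "device"]


def merge_baselines(layers: dict[str, dict[str, Any]]) -> dict[str, tuple[Any, str]]:
    """Return {key: (value, winning_layer)}; later layers override earlier ones."""
    merged: dict[str, tuple[Any, str]] = {}
    for layer in LAYER_ORDER:
        for key, value in layers.get(layer, {}).items():
            merged[key] = (value, layer)
    return merged


# Hypothetical baseline keys and values.
effective = merge_baselines(
    {
        "built_in": {"ntp.min_servers": 2, "stp.mode": "rapid-pvst"},
        "site": {"ntp.min_servers": 3},
    }
)
```

Keeping the winning layer next to each value is what lets a finding cite its baseline source.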

Evidence

Provenance

Every finding carries artifact ID, parsed field, baseline source and predicate inputs, so reports can show exactly why it fired.

UI

Device detail

Per-device dashboard shows summary, findings, interfaces, VLANs, neighbors and raw artifacts with searchable operational context.

UI

Topology graph

Site topology renders role-based hierarchy, port-channels, clusters and focus paths, with neighbor links kept tied to evidence.

UI

Risk views

Cross-device findings are deduplicated by rule, device, interface and VLAN, then grouped into severity and critical-path views.
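
Deduplication on a composite key followed by grouping might look like the sketch below. The `Finding` fields are assumptions, as is the choice to keep the first occurrence of each key.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Finding:
    rule_id: str
    device: str
    interface: Optional[str]
    vlan: Optional[int]
    severity: str


def dedupe(findings: list[Finding]) -> list[Finding]:
    """Keep the first finding per (rule, device, interface, VLAN) key."""
    seen: dict[tuple, Finding] = {}
    for f in findings:
        seen.setdefault((f.rule_id, f.device, f.interface, f.vlan), f)
    return list(seen.values())


def group_by_severity(findings: list[Finding]) -> dict[str, list[Finding]]:
    """Bucket findings by their severity label."""
    groups: dict[str, list[Finding]] = defaultdict(list)
    for f in findings:
        groups[f.severity].append(f)
    return dict(groups)
```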

UI

Baseline UI

Browse and edit baseline policy in the UI, previewing which layer wins before a check is applied to a device.

AI

Optional AI

A local LLM offers plain-language commentary on findings - no data ever leaves the machine. Disabled by default, never a source of truth.

Ops

Job audit

Collection and analysis jobs track who ran what, profile, target, status and errors in a deterministic state machine.

Ops

Exports

CSV and JSON exports include findings and snapshots with Cisco-aware secret redaction before external sharing.

Engine

MAC Intelligence

Offline IEEE OUI database, vendor classification, MAC observation tracking, flap detection and rogue device analysis. Baseline-driven - vendor name alone never triggers high-severity findings.

Engine

Playbook engine

Step-by-step diagnostic playbooks mapped to findings. 10 playbooks with 118 individual checks covering port issues, VLAN, STP, EtherChannel, PoE, AP, TACACS and compliance.

Ops

Credential vault

SSH credentials encrypted at rest with AES-256-GCM, PBKDF2-SHA256 key derivation (100k iterations). Passwords never returned in API responses. Per-profile isolation.
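
The key-derivation half of that scheme can be sketched with the standard library alone. The salt length and the function name are assumptions; the AES-256-GCM step itself would need a third-party library (for example the `cryptography` package) and is left out here.

```python
import hashlib
import os

ITERATIONS = 100_000  # iteration count stated in the text
KEY_LEN = 32          # 32 bytes = a 256-bit key, the size AES-256 expects


def derive_key(passphrase: str, salt: bytes) -> bytes:
    """Derive a 256-bit key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, ITERATIONS, dklen=KEY_LEN
    )


salt = os.urandom(16)  # assumed salt length; stored alongside the ciphertext
key = derive_key("example master passphrase", salt)
```

The same passphrase and salt always yield the same key, so only the salt needs to be stored with the encrypted record.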

Ops

Scheduled collection

Cron-based automated SSH collection with hierarchical targeting: global, country, site, sub-site or specific devices. 7 presets plus custom cron, with per-device locking and concurrency caps.

UI

Smart site mapping

Sites auto-positioned on a world map from hostname conventions and an offline city coordinate database. Golden-angle spiral offsets prevent overlap. Admins can drag markers to override.
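
A golden-angle spiral places the k-th point at angle k times roughly 137.5 degrees and at a radius that grows with the square root of k, so successive points never line up. The sketch below is one way to generate such offsets; the step size in degrees and the function name are assumptions.

```python
import math

# The golden angle in radians, about 137.5 degrees.
GOLDEN_ANGLE = math.pi * (3.0 - math.sqrt(5.0))


def spiral_offsets(n: int, step_deg: float = 0.02) -> list[tuple[float, float]]:
    """Return n (dlat, dlon) offsets in degrees; index 0 stays on the city centre."""
    offsets = []
    for k in range(n):
        radius = step_deg * math.sqrt(k)
        theta = k * GOLDEN_ANGLE
        offsets.append((radius * math.sin(theta), radius * math.cos(theta)))
    return offsets


# Five sites that all resolve to the same city coordinate.
offsets = spiral_offsets(5)
```

Each offset would be added to the city's latitude and longitude, so markers sharing a city fan out instead of stacking.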

Ops

Admin panel

7-tab admin dashboard: system health, RBAC with 4 roles and per-user permission overrides, credential vault, backup scheduler, security audit with forensic fingerprinting, and brute-force lockout.

UI

Live terminal

WebSocket-streamed SSH output during collection. Watch every command execute in real time, per device, with status indicators and per-command progress tracking.

Ops

Automated backups

Scheduled PostgreSQL backups via pg_dump with gzip compression. Configurable retention policy (default 30 days), backup history with size and age, and one-click manual backup.

Quality

Test suite

Golden fixtures and regression tests pin parser output, derived facts and rule behavior across 1,781 automated tests.

How it works

Five stages, deterministic, evidenced.

01

Ingest

Upload artifacts or run a read-only SSH collection profile against live devices. Files are deduplicated, gzipped, and detected by filename + content.

02

Parse

Each artifact runs through its dedicated parser. Outputs are dataclasses with explicit fields - never raw strings. Parser status: ok / partial / failed / empty / raw_only.
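
As an illustration of typed parser output with an explicit status, here is a minimal sketch for `show vlan brief`. The class names and the rule for choosing between `partial` and `failed` are assumptions; only the five status values come from the text.

```python
from dataclasses import dataclass, field
from enum import Enum


class ParserStatus(str, Enum):
    OK = "ok"
    PARTIAL = "partial"
    FAILED = "failed"
    EMPTY = "empty"
    RAW_ONLY = "raw_only"


@dataclass(frozen=True)
class VlanEntry:
    vlan_id: int
    name: str
    status: str


@dataclass
class ParseResult:
    status: ParserStatus
    vlans: list[VlanEntry] = field(default_factory=list)


def parse_vlan_brief(text: str) -> ParseResult:
    """Parse 'show vlan brief' rows into VlanEntry objects."""
    if not text.strip():
        return ParseResult(ParserStatus.EMPTY)
    vlans, skipped = [], 0
    for line in text.splitlines():
        parts = line.split()
        if not parts or not parts[0].isdigit():
            continue  # header, separator or wrapped port list
        if len(parts) < 3:
            skipped += 1  # a VLAN row we could not fully read
            continue
        vlans.append(VlanEntry(int(parts[0]), parts[1], parts[2]))
    if not vlans:
        return ParseResult(ParserStatus.FAILED)
    return ParseResult(ParserStatus.PARTIAL if skipped else ParserStatus.OK, vlans)


SAMPLE = """\
VLAN Name                             Status    Ports
---- -------------------------------- --------- ----------
1    default                          active    Gi1/0/1, Gi1/0/2
10   USERS                            active    Gi1/0/3
"""
result = parse_vlan_brief(SAMPLE)
```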

03

Normalise

Parsers feed the snapshot builder. Configured vs observed values merge into a canonical JSON model with consistency flags. Derived facts are computed once.

04

Evaluate

Deterministic rule predicates run against the snapshot, derived facts and 6-layer baseline. Each finding is built with its evidence payload.

05

Present

Dashboard, snapshot detail, topology graph, exports. AI explanations on demand for context - never replacing the deterministic verdict.

Topology

Your network, drawn from the wires up.

NetDoctor builds a live topology graph from CDP, LLDP and port-channel data without a single SNMP poll. Click any device, get evidence-anchored findings inline.

[Topology diagram: WAN / edge (RO-EDGE-RTR ISR, FW-01 FortiGate), core stack (RO-CORE-01 C9500, 48/52 up), access layer (SW-ACC-01 C9300, SW-ACC-02 C9200) and endpoints (PCs, IoT, phones, ESXI-02, AP-17). Legend: router/fw, core, access, endpoints, Po aggregation, CDP / LLDP / clusters.]
01

Hierarchy you can trust

Devices placed by inferred role (core / distribution / access / endpoint) using the same engine that powers the rules.

02

Port-channels, one logical link

Aggregated members deduplicated and rendered as parallel lines with the operational port-channel badge.

03

Endpoints classified offline

Phones, APs, cameras, HMIs, printers, servers - classified from CDP capabilities and the offline IEEE OUI database.

04

Findings overlaid on the graph

Severity counts inline. Click a device for full evidence, recommendation and impact for every finding.

05

Export as PNG or SVG

One-click high-resolution PNG export (3x scale) or clean SVG for documentation, change requests and management reports. Every export captures the current layout, collapsed state and finding badges exactly as displayed.

06

Interactive canvas

Pan, zoom (0.05x to 8x), drag individual nodes with optional grid snap. Multi-select with rubber-band selection. Collapse subtrees with the ± button. Save and load named layout profiles. Positions persist across sessions.

07

World map with site markers

Sites auto-positioned on a Leaflet/OSM map from hostname conventions and an offline city coordinate database. Pulsing markers, cluster grouping at low zoom, click to drill into the site graph. Admins can drag markers to override positions.

08

Ghost neighbors and clusters

CDP/LLDP neighbors without a local snapshot appear as ghost nodes (dashed outline). Endpoints with the same role are grouped into expandable clusters with a +N badge. Click a cluster to fan out individual members.

MAC Intelligence

Every MAC address tells a story. We read all of them.

39,201 IEEE OUI entries. 590 curated vendor overrides. 130+ classification patterns. Completely offline — no internet required, ever.

🔍

Vendor identification

Every MAC address is resolved against the full IEEE OUI registry — 39,201 entries compressed to 318 KB, loaded in ~80 ms. Two-tier architecture: 590 hand-curated vendor type overrides checked first, then auto-classification with confidence scoring from 130+ regex patterns. Cisco, Juniper, Arista, Fortinet, Hikvision, Yealink, Xerox and hundreds more — each with a device type and confidence score.
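
The two-tier lookup can be sketched as "curated table first, pattern match second". All data below is sample data: the `AABBCC` prefix is a locally administered placeholder rather than a real IEEE assignment, and the confidence numbers are invented for illustration.

```python
import re
from typing import NamedTuple, Optional


class VendorInfo(NamedTuple):
    vendor: str
    device_type: Optional[str]
    confidence: float


# Tier 0: raw registry rows (OUI -> vendor name). 00:00:0C is a Cisco OUI.
OUI_REGISTRY = {"00000C": "Cisco Systems, Inc"}
# Tier 1: hand-curated overrides, checked first (hypothetical entry).
CURATED = {"AABBCC": VendorInfo("Example Phones Ltd", "phone", 0.95)}
# Tier 2: regex patterns over the vendor name (hypothetical pattern and score).
PATTERNS = [(re.compile(r"cisco", re.IGNORECASE), "network", 0.6)]


def normalize_oui(mac: str) -> str:
    """Strip separators and return the first 24 bits as upper-case hex."""
    hex_only = re.sub(r"[^0-9A-Fa-f]", "", mac)
    if len(hex_only) != 12:
        raise ValueError(f"not a 48-bit MAC: {mac!r}")
    return hex_only[:6].upper()


def lookup(mac: str) -> Optional[VendorInfo]:
    oui = normalize_oui(mac)
    if oui in CURATED:
        return CURATED[oui]
    vendor = OUI_REGISTRY.get(oui)
    if vendor is None:
        return None  # unknown OUI
    for pattern, device_type, confidence in PATTERNS:
        if pattern.search(vendor):
            return VendorInfo(vendor, device_type, confidence)
    return VendorInfo(vendor, None, 0.0)
```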

🏷️

Device type classification

12 canonical device types: network (switches, routers, firewalls, APs), phone (VoIP), printer, camera (IP/NVR), endpoint (laptops, desktops, mobiles), server, virtualization (VMware, Hyper-V, KVM), firewall, wireless_ap, iot (industrial, sensors, UPS) and more. Ambiguous vendors (Cisco = switch or phone? HP = printer or server?) return possible_types for downstream rules to refine using interface role, baseline and history.

🛡️

Rogue device detection

15 deterministic rules (ROGUE-001 through ROGUE-015) analyse every access port. Unauthorized mini-switches on user ports, network vendor MACs where only endpoints belong, unknown OUI on secured ports, MAC flapping from syslog, port-security violations and 802.1X/MAB failures. Vendor name alone never triggers a high-severity finding — every rule requires corroborating evidence from interface role, baseline and observed state.

🧠

Smart inference & downgrade

Not every multi-MAC access port is a rogue switch. The engine recognizes: phone + PC pairs (downgrades to info, recommends voice VLAN), wireless AP in bridge mode (AP + wireless clients on one port, ≤20 MACs), camera clusters (PoE camera switch / NVR uplink, ≥50% camera OUI), and multi-NIC servers (sequential same-OUI MACs within 8 addresses). Each suppression is explained and logged.
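
The multi-NIC server heuristic can be sketched as a check on numeric MAC distance. Reading "within 8 addresses" as "all MACs fit inside a block of 8 consecutive addresses" is an assumption, and the function names are hypothetical.

```python
import string


def mac_to_int(mac: str) -> int:
    """Convert a MAC in any common notation to a 48-bit integer."""
    hex_only = "".join(c for c in mac if c in string.hexdigits)
    if len(hex_only) != 12:
        raise ValueError(f"not a 48-bit MAC: {mac!r}")
    return int(hex_only, 16)


def looks_like_multi_nic_server(macs: list[str], window: int = 8) -> bool:
    """True when all MACs share one OUI and sit within `window` addresses."""
    if len(macs) < 2:
        return False
    values = sorted(mac_to_int(m) for m in macs)
    same_oui = len({v >> 24 for v in values}) == 1  # top 24 bits are the OUI
    return same_oui and (values[-1] - values[0]) < window
```

Vendors often assign consecutive addresses to the ports of one adapter, which is why closeness within a single OUI hints at one multi-port host rather than several devices behind a switch.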

📍

Port movement & history

Every MAC is tracked per device: first seen, last seen, observation count, current interface, previous interfaces. When a MAC moves between ports on the same switch, the engine generates an alert with from/to interfaces and timestamp. Full history is persisted in PostgreSQL with configurable retention (90 days default).

Flap detection

Dual detection: MAC table analysis finds the same MAC learned on multiple interfaces simultaneously (confidence 0.90), and syslog parsing catches Cisco SW_MATM / MACFLAP messages with interface pairs (confidence 0.85). Virtual MACs are excluded: HSRP, VRRP, GLBP, STP, LLDP, LACP, multicast and broadcast.
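
The MAC-table half of that detection can be sketched as "group by MAC, ignore virtual and multicast addresses, report anything seen on more than one interface". The prefix list covers the well-known HSRP, VRRP and GLBP virtual MAC ranges; STP, LLDP, LACP and broadcast addresses are caught by the multicast bit. Function names are hypothetical.

```python
from collections import defaultdict

# Well-known virtual MAC prefixes: HSRPv1, HSRPv2, VRRP (IPv4), GLBP.
VIRTUAL_PREFIXES = ("00000c07ac", "00000c9ff", "00005e0001", "0007b400")


def is_excluded(mac_hex: str) -> bool:
    """True for multicast/broadcast addresses and first-hop redundancy MACs."""
    multicast = (int(mac_hex[:2], 16) & 1) == 1  # I/G bit of the first octet
    return multicast or mac_hex.startswith(VIRTUAL_PREFIXES)


def find_flaps(mac_table: list[tuple[str, str]]) -> dict[str, set[str]]:
    """Map each MAC learned on more than one interface to those interfaces."""
    seen: dict[str, set[str]] = defaultdict(set)
    for mac, interface in mac_table:
        mac_hex = mac.replace(":", "").replace(".", "").replace("-", "").lower()
        if not is_excluded(mac_hex):
            seen[mac_hex].add(interface)
    return {mac: ports for mac, ports in seen.items() if len(ports) > 1}
```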

🌐

Cross-site dual presence

Detects the same MAC address active at two or more sites simultaneously. Severity scales with the time gap between sightings: under 1 minute = critical (impossible travel / MAC spoofing), under 1 hour = high, under 6 hours = medium. Virtual MAC protocols and multicast are excluded. Site extraction uses hostname parsing, with no hardcoded site names.
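
The severity scale can be sketched as a simple threshold ladder over the gap between two sightings. Treating each figure as an exclusive upper bound, and returning no finding beyond 6 hours, are assumptions.

```python
from datetime import datetime, timedelta
from typing import Optional


def dual_presence_severity(seen_a: datetime, seen_b: datetime) -> Optional[str]:
    """Map the gap between sightings at two sites to a severity label."""
    gap = abs(seen_a - seen_b)
    if gap < timedelta(minutes=1):
        return "critical"
    if gap < timedelta(hours=1):
        return "high"
    if gap < timedelta(hours=6):
        return "medium"
    return None  # assumed: gaps of 6 hours or more raise nothing


t0 = datetime(2024, 1, 1, 12, 0, 0)
severity = dual_presence_severity(t0, t0 + timedelta(seconds=20))
```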

📊

Baseline-driven analysis

Every interface has role-based expectations: access ports allow 1 MAC, voice ports allow 2, trunks 256, uplinks unlimited. Organization baselines can override expected vendor types, maximum MAC counts and known MAC allowlists per interface. Violations are measured against the baseline, not against arbitrary thresholds. Uplinks, etherchannel members, AP trunks and server trunks are automatically excluded from rogue checks.
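
The role-based expectations can be sketched as a small table with an optional per-interface override. The limits are the ones stated above; the function names and the use of `None` for "unlimited" are assumptions.

```python
from typing import Optional

# Default MAC limits per interface role; None means unlimited.
ROLE_MAC_LIMITS: dict[str, Optional[int]] = {
    "access": 1,
    "voice": 2,
    "trunk": 256,
    "uplink": None,
}


def mac_limit(role: str, override: Optional[int] = None) -> Optional[int]:
    """An organization baseline override wins over the role default."""
    return override if override is not None else ROLE_MAC_LIMITS[role]


def exceeds_baseline(role: str, observed_macs: int, override: Optional[int] = None) -> bool:
    limit = mac_limit(role, override)
    return limit is not None and observed_macs > limit
```

For example, three MACs on an access port exceed the default of 1, but not if the baseline for that interface raises the limit to 4.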

39,201 IEEE OUI entries
15 Rogue rules
130+ Vendor patterns
0 Internet required
Network Telemetry · In Development

Beyond SSH — continuous visibility without logging in.

SSH collects a point-in-time snapshot. Telemetry adds the dimension of time: real-time counters, async events, and change detection that triggers re-analysis automatically.

In Development

SNMPv3 Polling Collector

Authenticated, encrypted polling (SHA + AES-128) with configurable intervals. Interface counters (64-bit HC), CPU, memory, temperature, fan status, MAC table, ARP table, STP topology, VLANs, CDP/LLDP neighbors, port-security violations — all from standard and Cisco enterprise MIBs.

What it collects

  • IF-MIB counters
  • IP-MIB / ARP
  • BRIDGE-MIB / MAC table
  • STP topology
  • CISCO-CDP-MIB
  • LLDP-MIB
  • CISCO-PROCESS-MIB (CPU)
  • CISCO-MEMORY-POOL-MIB
  • CISCO-ENVMON-MIB
  • ENTITY-MIB (inventory)
  • CISCO-PORT-SECURITY-MIB
  • CISCO-VTP-MIB

In Development

SNMP Trap & Inform Receiver

Async event receiver on UDP 162 / 1162. The device pushes events the moment they happen — no polling delay. Link up/down, cold start, config changes, STP topology changes, port-security violations, err-disable events. SNMPv3 informs with acknowledgement guarantee.

Events captured

  • linkDown / linkUp
  • coldStart / warmStart
  • authenticationFailure
  • Config change
  • STP topology change
  • Port-security violation
  • Err-disable
  • VLAN membership change

In Development

Syslog Collector

UDP 514, TCP 514 and TLS 6514 receiver. Every switch and firewall already speaks syslog — no agent, no license. 8 severity levels from emergency to debug. Config changes, link events, STP reconvergence, port-security violations, DHCP snooping, ACL hits — all captured, parsed and correlated with device snapshots.

Correlation triggers

  • %SYS-5-CONFIG_I → re-collect config
  • %LINK-3-UPDOWN → re-collect interfaces
  • %PORT_SECURITY-2 → immediate finding
  • %SPANTREE-5 → re-evaluate STP rules
  • %SW_MATM → MAC flap detection

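
The correlation triggers listed above can be sketched as a prefix table matched against the Cisco message mnemonic. The action names are hypothetical labels, and since this collector is still in development the sketch shows one possible shape, not the shipped behavior.

```python
import re
from typing import Optional

# (mnemonic prefix, action) pairs, mirroring the trigger list.
TRIGGERS = [
    ("%SYS-5-CONFIG_I", "recollect_config"),
    ("%LINK-3-UPDOWN", "recollect_interfaces"),
    ("%PORT_SECURITY-2", "immediate_finding"),
    ("%SPANTREE-5", "reevaluate_stp"),
    ("%SW_MATM", "mac_flap_detection"),
]

# Cisco mnemonic shape: %FACILITY-SEVERITY-MNEMONIC
MNEMONIC = re.compile(r"%[A-Z0-9_]+-\d-[A-Z0-9_]+")


def action_for(message: str) -> Optional[str]:
    """Return the action for a syslog line, or None when nothing matches."""
    match = MNEMONIC.search(message)
    if match is None:
        return None
    for prefix, action in TRIGGERS:
        if match.group(0).startswith(prefix):
            return action
    return None
```

Matching on the extracted mnemonic rather than the whole line keeps free-text fields (hostnames, usernames) from triggering an action by accident.
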
Planned

Event-Driven Re-Collection

The missing piece between polling and continuous assurance. When a trap or syslog event signals a meaningful change (config saved, link flap, STP reconvergence), NetDoctor automatically triggers a targeted SSH re-collection of the affected artifacts — and re-runs the rule engine. The finding appears in the dashboard within seconds of the event, not at the next scheduled poll.

Future

gRPC Model-Driven Telemetry

For IOS-XE 16.10+, NX-OS and IOS-XR: sub-second push-model streaming over HTTP/2 with Protocol Buffers. Interface counters every 1 second, CPU every 5 seconds, routing table changes in real time. No polling overhead, no SNMP limitations. The highest-fidelity data source for modern Cisco platforms.

Future

NETCONF / RESTCONF

Structured XML/JSON data over SSH (port 830) and HTTPS (port 443) using YANG models. Atomic reads with candidate configs, operational data stores, and event notifications. The programmatic alternative to CLI scraping — available on IOS-XE, FortiGate, Junos, Arista EOS and most modern platforms.

Design principle: Every telemetry source feeds the same normalized snapshot that powers the rule engine. SSH, SNMP, syslog and future gRPC data all converge into a single deterministic pipeline. No separate dashboards, no data silos — one engine, one truth.

Security model

Designed against the things that actually break networks.

A single typo in configuration mode can take an enterprise offline. That's why the tool has no configuration mode.

Things this tool will never do

  • Execute arbitrary CLI typed by a user
  • Enter configure terminal
  • Run write, reload, clear, erase, delete, format
  • Execute uncontrolled debug commands
  • Treat AI output as source of truth
  • Send unsanitised secrets to any external API
  • Generate findings without evidence
  • Treat missing data as healthy

What it actually does

  • Read-only intents from a fixed, audited catalog
  • Per-intent timeouts and risk classification
  • Cisco-aware redactor: enable secret, SNMP communities, TACACS keys, PSKs
  • Full audit trail (who/when/what/where) in PostgreSQL
  • JWT auth with scoped roles (superadmin, admin, operator, viewer)
  • Per-user granular permission overrides (7 permissions)
  • Credential vault with AES-256-GCM encryption at rest
  • Brute-force lockout (5 attempts / 5 min → 15-min lock)
  • Security audit with forensic fingerprinting
  • Findings carry provenance back to the originating line
  • Explicit "missing data" state - never silently healthy
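
The brute-force lockout in the list above (5 attempts in 5 minutes, then a 15-minute lock) can be sketched as a sliding window of failure timestamps. The class name and the in-memory storage are assumptions; timestamps are passed in explicitly to keep the logic deterministic.

```python
from collections import defaultdict, deque

MAX_FAILURES = 5
WINDOW_S = 5 * 60   # failures are counted over the last 5 minutes
LOCK_S = 15 * 60    # a lock lasts 15 minutes


class LockoutTracker:
    def __init__(self) -> None:
        self._failures: dict[str, deque] = defaultdict(deque)
        self._locked_until: dict[str, float] = {}

    def is_locked(self, user: str, now: float) -> bool:
        return now < self._locked_until.get(user, 0.0)

    def record_failure(self, user: str, now: float) -> None:
        """Record a failed login; lock the account at the fifth in-window failure."""
        window = self._failures[user]
        window.append(now)
        while window and now - window[0] > WINDOW_S:
            window.popleft()  # drop failures older than the window
        if len(window) >= MAX_FAILURES:
            self._locked_until[user] = now + LOCK_S
            window.clear()
```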

Redaction surface

Before any artifact, finding or snapshot can leave the local perimeter (export, AI prompt, share link), the redactor strips:

  • enable secret / password
  • username … secret
  • snmp-server community / user
  • tacacs-server key
  • radius-server key
  • key-string
  • pre-shared-key
  • crypto isakmp key
  • vty / console password

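
A line-oriented redactor for a few of these patterns can be sketched with regular expressions. This covers only a subset of the surface listed above, and the `<redacted>` placeholder and pattern details are assumptions rather than the product's actual rules.

```python
import re

# Each pattern captures the keyword part and replaces only the secret token.
PATTERNS = [
    re.compile(r"^(\s*enable (?:secret|password)(?: \d+)? )\S+", re.MULTILINE),
    re.compile(r"^(\s*username \S+ .*?(?:secret|password)(?: \d+)? )\S+", re.MULTILINE),
    re.compile(r"^(\s*snmp-server community )\S+", re.MULTILINE),
    re.compile(r"^(\s*(?:tacacs|radius)-server key(?: \d+)? )\S+", re.MULTILINE),
]


def redact(config: str) -> str:
    """Replace secret values with a placeholder, keeping the rest of each line."""
    for pattern in PATTERNS:
        config = pattern.sub(r"\g<1><redacted>", config)
    return config


SAMPLE = (
    "hostname SW1\n"
    "enable secret 9 $9$abc\n"
    "snmp-server community public RO\n"
    "username admin privilege 15 secret 5 $1$xyz\n"
    "tacacs-server key 7 0822455D0A16\n"
)
clean = redact(SAMPLE)
```

Keeping the keyword and any trailing options (such as `RO`) intact means a redacted export still shows that a setting exists, just not its value.
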
Roadmap

From offline switch audit to full multi-vendor topology intelligence.

Built in order: engine → MAC intelligence → routing → cross-device path → FortiGate → AI explanations. Each phase ships with tests before the next begins.

01
Shipped

Cisco Switch Core

  • Offline upload & parse pipeline
  • Normalised Device Snapshot v2
  • 23 Cisco show parsers
  • 55 built-in switch rules
  • Derived facts engine
  • Evidence engine
  • 6-layer baseline merge
  • Snapshot Detail dashboard
02
Shipped

Active Collection

  • Netmiko SSH collection
  • 4 collection profiles
  • Per-intent timeouts & risk
  • Live WebSocket terminal
  • Credential vault
  • Job state machine + audit
03
Shipped

Org & UX

  • Organisation baselines
  • Custom baseline-driven rules
  • Topology graph (SVG)
  • Cross-device CDP / LLDP rules
  • Findings dedup & search
  • Stack detection (4-tier)
  • Re-analysis & incremental upload
04
Shipped

Playbooks (118 checks)

  • Diagnostic playbook engine
  • Finding-to-playbook mapping
  • Step-by-step remediation guides
  • AP port verification playbook
  • 10 playbooks across 8 categories
  • Playbook audit trail
05
Shipped

MAC Intelligence

  • OUI database (offline IEEE registry)
  • Vendor lookup & classifier
  • MAC observation history
  • MAC flap detection
  • Rogue device analyzer (5-phase pipeline)
  • Baseline-driven, not vendor-name-only
06
Shipped

Cisco Routing

  • Route table parser (RIB)
  • Route candidate model + preference
  • BGP analyzer + RIB-failure rule
  • ARP analyzer
  • Next-hop / CEF validation
  • Causal chain builder
07
Shipped

Cross-Device Path

  • Path graph builder
  • End-to-end path tracer
  • Multi-hop forwarding validation
  • Cross-device evidence chains
08
Shipped

FortiGate

  • FortiGate knowledge model
  • Configuration parser
  • Policy & route table parsers
  • Built-in FortiGate rules
  • Cross-vendor topology
09
In Development

Network Telemetry

  • SNMPv3 polling collector (GET / WALK)
  • SNMP trap & inform receiver
  • Syslog collector (UDP / TCP / TLS)
  • Event-driven re-collection triggers
  • Real-time interface counters & health
  • Config change detection via traps
10
Next

AI Explanations

  • 100 % local LLM - no data leaves the machine
  • Per-finding plain-language summaries
  • Cited evidence in every reply
  • AI as commentary, not verdict
  • No internet required
11
Future

Palo Alto Networks

  • PAN-OS XML config parser
  • Security & NAT policy analyzer
  • Zone & virtual-router model
  • Panorama device-group awareness
  • Built-in PAN-OS rules
12
Future

Juniper Junos

  • Set / hierarchical config parser
  • Operational-mode show parsers
  • OSPF & BGP analyzers
  • SRX policy & security-zone rules
  • Virtual-chassis stack detection
13
Future

Aruba & HPE

  • ArubaOS-CX & AOS-S parsers
  • HPE Comware (5900/5940) support
  • VSF / IRF stack detection
  • Aruba Central / mobility rules
  • Cross-vendor LLDP normalisation
14
Future

MikroTik & Ubiquiti

  • RouterOS export parser
  • UniFi controller integration
  • EdgeOS / EdgeRouter support
  • Wireless / mesh-aware rules
  • SMB & ISP-friendly profiles
15
Future

Vendor SDK & Extensibility

  • Public parser plugin API (Python)
  • Declarative rule DSL (YAML)
  • Custom snapshot adapters
  • Community vendor packs
  • Extreme, Dell, Brocade scaffolds
  • Per-org vendor enable/disable


Tech stack

Boring on purpose. Easy to operate.

Backend

Python 3.11+ · FastAPI · SQLAlchemy async · Alembic · Scrapli (async SSH) · pyATS-friendly parsers

Frontend

React 19 · Vite · TanStack Query · Tailwind CSS · TypeScript

Storage

PostgreSQL 16 · Redis 7 · Filesystem (gzip artifacts)

Deployment

Docker Compose · Single-binary friendly · Air-gapped friendly

Quality

1,781 unit / integration tests · Golden fixture tests · pytest

AI (optional)

Local LLM only · zero data egress · explanations only · never source of truth

FAQ

The questions network engineers actually ask.

Will it ever push a config change to a device?

No. There is no configuration mode and there are no write commands in the catalog. The platform is read-only by architecture, not by policy.

Does it require internet access?

No. The entire platform - including the optional AI explanation layer - runs locally. No data ever leaves the machine, no external API calls, no telemetry.

Can it run in an air-gapped environment?

Yes. Docker Compose deployment + offline OUI database + offline rule packs. No phone-home telemetry.

How do you handle false positives?

Rules read normalised facts, not raw text. Derived facts (interface role, management SVI, stack topology) remove most false-positive sources. Baselines can override severities and thresholds at any of the 6 layers.

What happens when an artifact is missing?

It is explicit: rules that need it are listed under "blocked by missing data", with the exact command to collect it. Missing data is never treated as healthy.

Why not just use Gemini / GPT / an LLM for the whole thing?

Security risk. Sending device configs to a cloud LLM leaks topology, credentials and policy to a third party. NetDoctor uses a local LLM only - nothing leaves the machine. And AI is restricted to plain-language commentary; verdicts always come from deterministic rules with cited evidence.

What vendors are supported?

Today: Cisco IOS / IOS-XE switches (L2 and L3) with full MAC intelligence and rogue device detection, Cisco routing (RIB / BGP / CEF) and FortiGate firewalls, all marked as shipped in the roadmap. Planned: Palo Alto Networks, Juniper Junos, Aruba & HPE, MikroTik & Ubiquiti.

Where do organisation values come from?

Baseline files: built_in → global → environment → site → role → device. Never hardcoded in source. Every value emitted in evidence cites its baseline layer.