SENSEI System Design Notes

DISCLAIMER

This post is a work in progress; I am extending it as I document several years of development.

Introduction

Welcome back!

In this post I want to document the design of SENSEI, a system for network state estimation, sharing, supervision, and operator control in tactical and distributed environments.

SENSEI started from a practical need: we needed a way to observe network conditions across multiple nodes, share that state between instances, and turn those observations into actionable information for both applications and operators. Over time, the project evolved from a monitoring-oriented system into a broader platform that supports configuration management, runtime supervision, active link control, and visualization.

This post describes what SENSEI is, how it is structured, how it was built, and some of the design decisions behind the current architecture.

What is SENSEI

At a high level, SENSEI is a framework for collecting, exchanging, and consuming network state.

It is designed for environments where connectivity is dynamic, distributed, and sometimes constrained. Instead of treating networking as a black box, SENSEI exposes a structured view of the network through a combination of:

local monitoring
state sharing between instances
supervisory reasoning
operator-facing visualization and control

In practice, SENSEI combines several ideas:

Monitoring network behavior locally
Sharing summarized network state across nodes
Supervising and interpreting that state
Exposing configuration and runtime state through a REST API and web UI

The result is a system that can support both humans in the loop and software components that need to understand or react to network conditions.

SENSEI Architecture

The figure below shows the current SENSEI architecture.

[Figure: SENSEI Architecture]

SENSEI is best understood as a set of interacting layers:

Operator / Management Plane: Configures and monitors the state of SENSEI instances.
SENSEI Core: Harvests, merges, analyzes, and shares network state information.
REST-managed configuration and runtime model: Provides access to SENSEI information “locally” through a REST interface.
Distributed SENSEI node instances: SENSEI instances are designed to communicate over disrupted links with heterogeneous characteristics.

System Requirements

The architecture was driven by a few practical needs.

The main objectives were:

Observe network conditions locally: Operators need to understand the state of the network, and applications need network information to adapt their behavior.
Share network state between SENSEI instances: Use tomography to share only what is necessary to allow remote adaptation and monitoring. Provide a drill-down interface for when more info is required.
Provide a programmatic API for configuration and runtime state
Model monitored domains, edges, and link profiles: Must support flexible logical monitoring
Allow runtime control of active links: Operators must be able to specify preferences on how links are used.
Support extensibility for additional monitoring sources and transports

As the system evolved, another major requirement emerged: SENSEI needed a cleaner management plane. It was no longer enough to just collect and exchange telemetry; we also wanted operators to be able to inspect and modify the system from a UI.

That led to the addition of a FastAPI backend, a React frontend, and a YAML-backed configuration workflow.

High-level Design Discussion

Conceptually, SENSEI has three major concerns:

Monitoring and local state estimation
Sharing state between distributed instances
Using that state for supervision, adaptation, and operator control

To reflect that, the core architecture is organized into three logical layers:

1. Monitoring Layer

This is where raw or near-raw information is collected.

The main components are:

NetSensor
Responsible for passively harvesting traffic information, detecting topology-related information, and collecting local measurements.
NodeMonitor
Aggregates and fuses local state into a more meaningful node-level picture. This can include network statistics, node health, and topology summaries.
Other Producers
Additional sources such as Mockets, NetProxy, OS sensors, and location-related inputs can contribute data into the monitoring layer.

The monitoring layer is intentionally modular. I wanted SENSEI to be able to ingest observations from more than one source, instead of hard-coding a single monitoring path.
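To make the modularity concrete, here is a minimal sketch of a pluggable producer interface. The names Observation, Producer, StaticProducer, and fuse are hypothetical and are not SENSEI's actual classes, and averaging is only one possible way to fuse readings.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Observation:
    source: str   # name of the producer that emitted the value
    metric: str   # e.g. "latency_ms" (illustrative metric name)
    value: float


class Producer(Protocol):
    """Anything that can contribute observations to the monitoring layer."""

    def collect(self) -> list[Observation]: ...


class StaticProducer:
    """Stand-in producer that returns canned readings (for illustration only)."""

    def __init__(self, name: str, readings: dict[str, float]) -> None:
        self.name = name
        self.readings = readings

    def collect(self) -> list[Observation]:
        return [Observation(self.name, m, v) for m, v in self.readings.items()]


def fuse(producers: list[Producer]) -> dict[str, float]:
    """Average each metric across all producers that reported it."""
    samples: dict[str, list[float]] = {}
    for producer in producers:
        for obs in producer.collect():
            samples.setdefault(obs.metric, []).append(obs.value)
    return {metric: sum(vals) / len(vals) for metric, vals in samples.items()}


picture = fuse([
    StaticProducer("netsensor", {"latency_ms": 40.0, "loss_pct": 1.0}),
    StaticProducer("os_sensor", {"latency_ms": 60.0}),
])
```

Adding a new monitoring source then only means writing another class with a collect() method; the fusing code does not change.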

2. Sharing Layer

This is the layer that handles SENSEI-to-SENSEI state exchange.

Its responsibilities include:

group management / peer discovery
network state exchange
transport abstraction
filtering and aggregating what to share, compression, access control, etc.

A key design point here is that the Sharing Layer is the architectural boundary between SENSEI instances. Internally, the current implementation uses:

NATS for intra-instance transport (between SENSEI microservices)
Mockets for inter-instance transport (between SENSEI instances)
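The filtering and transport-abstraction responsibilities can be sketched roughly as follows. This is only an illustration: the Transport interface, the SHARED_FIELDS summary, and the JSON-plus-zlib encoding are my assumptions for the example, not SENSEI's wire format, and the loopback class merely stands in for a real transport.

```python
import json
import zlib
from typing import Callable, Protocol


class Transport(Protocol):
    """Minimal transport abstraction: anything that can send bytes."""

    def send(self, payload: bytes) -> None: ...


class LoopbackTransport:
    """In-memory stand-in for a real transport; hands bytes to a callback."""

    def __init__(self, on_receive: Callable[[bytes], None]) -> None:
        self._on_receive = on_receive

    def send(self, payload: bytes) -> None:
        self._on_receive(payload)


# Hypothetical summary fields; everything else stays local.
SHARED_FIELDS = ("latency_ms", "throughput_kbps", "available")


def summarize(state: dict) -> dict:
    """Keep only the fields that are worth sharing with remote instances."""
    return {k: state[k] for k in SHARED_FIELDS if k in state}


def encode(summary: dict) -> bytes:
    return zlib.compress(json.dumps(summary, sort_keys=True).encode("utf-8"))


def decode(payload: bytes) -> dict:
    return json.loads(zlib.decompress(payload).decode("utf-8"))


def share(state: dict, transport: Transport) -> None:
    """Filter, compress, and send local state through the given transport."""
    transport.send(encode(summarize(state)))


received: list[dict] = []
transport = LoopbackTransport(lambda p: received.append(decode(p)))
share(
    {"latency_ms": 42.0, "throughput_kbps": 512, "available": True,
     "per_flow_stats": [1, 2, 3]},  # detailed data that is not shared
    transport,
)
```

Because share() only depends on the Transport interface, swapping the underlying transport does not touch the filtering or encoding code.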

3. Consumer / Adaptation Layer

Once state has been collected and shared, something has to consume it.

The main consumers in the current design are:

NetSupervisor
Performs higher-level network analysis such as link classification and inference over throughput, latency, loss, and other conditions.
ACMS
Adaptive communication management logic, intended to support policy-based tuning and adaptation.
NetViewer / UI consumers
Visualization and operator awareness components.
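As a rough illustration of what link classification can look like, here is a threshold-based sketch. The LinkSample type, the class labels, and every threshold are made up for the example; the actual NetSupervisor logic is not shown here and may work quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkSample:
    latency_ms: float
    loss_pct: float
    throughput_kbps: float


def classify(sample: LinkSample) -> str:
    """Map one measurement to a coarse link class (illustrative thresholds)."""
    # Very high loss or no throughput: treat the link as unusable.
    if sample.loss_pct >= 20.0 or sample.throughput_kbps <= 0.0:
        return "unusable"
    # High latency or noticeable loss: usable, but degraded.
    if sample.latency_ms >= 500.0 or sample.loss_pct >= 5.0:
        return "degraded"
    return "good"


labels = [
    classify(LinkSample(latency_ms=30.0, loss_pct=0.1, throughput_kbps=2000.0)),
    classify(LinkSample(latency_ms=800.0, loss_pct=1.0, throughput_kbps=256.0)),
    classify(LinkSample(latency_ms=50.0, loss_pct=35.0, throughput_kbps=100.0)),
]
```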

This separation lets SENSEI support both human users and autonomous or semi-autonomous adaptation logic. See this for an interesting DEMO.

The REST-managed SENSEI Model

One of the biggest architectural changes in the newer version of SENSEI was introducing a clear configuration and runtime data model behind a REST API to support multiple communication paths between endpoints.

The backend models the following main entities:

Monitoring Domains
Represent groups of addresses and subnets that belong together logically.
Monitoring Edges
Represent relationships between domains.
Link Profiles
Describe expected link characteristics, both globally and on a per-edge basis.
Connectors / Acceptors
Represent outgoing and incoming connectivity settings for SENSEI instances.
Runtime Telemetry
Stores live link and node information such as timestamps, throughput, bandwidth, latency, saturation, and availability.
Active Links
Represent runtime link state and operator controls such as:
- preferred: prefer this link
- disableProbing: do not probe this link
- disableMonitoring: do not move monitoring data through this link
- deactivationRequested: stop using this link

This model is managed by a Configuration Manager that loads, applies, validates, and persists configuration.
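To show how the active-link flags might be consumed, here is a small sketch. The ActiveLink dataclass reuses the flag names listed above, but link_id and both helper functions are hypothetical, and treating a deactivated link as excluded from probing is my assumption.

```python
from dataclasses import dataclass


@dataclass
class ActiveLink:
    link_id: str                      # hypothetical identifier field
    preferred: bool = False
    disableProbing: bool = False
    disableMonitoring: bool = False
    deactivationRequested: bool = False


def links_for_monitoring(links: list[ActiveLink]) -> list[ActiveLink]:
    """Links that may carry monitoring data, preferred links first."""
    usable = [
        link for link in links
        if not link.deactivationRequested and not link.disableMonitoring
    ]
    # sorted() is stable, so links keep their original order within each group.
    return sorted(usable, key=lambda link: not link.preferred)


def links_to_probe(links: list[ActiveLink]) -> list[ActiveLink]:
    """Links that may still be actively probed."""
    return [
        link for link in links
        if not link.deactivationRequested and not link.disableProbing
    ]


links = [
    ActiveLink("satcom"),
    ActiveLink("radio", preferred=True),
    ActiveLink("lte", disableMonitoring=True),
    ActiveLink("wifi", deactivationRequested=True),
]
monitoring_order = [link.link_id for link in links_for_monitoring(links)]
probe_targets = [link.link_id for link in links_to_probe(links)]
```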

How the New System Was Built

The current implementation is split into a few major software pieces.

React Frontend

The UI is written in React and provides a web-based interface for:

browsing configured domains, edges, connectors, and acceptors
viewing global and edge-specific profiles
inspecting runtime telemetry
controlling active links
displaying monitoring graphs over time

This was a major step forward compared to older workflows, because it gave us a clean way to make SENSEI inspectable and editable by operators.

The figure below shows the SENSEI UI.

[Figure: SENSEI UI]

[Figure: SENSEI UI graphs]

FastAPI Backend

The backend is implemented in Python using FastAPI.

Its responsibilities include:

exposing /api/v2/... endpoints
validating configuration through Pydantic
serving configuration objects such as domains, edges, profiles, connectors, and acceptors
exposing runtime objects such as links, nodes, and active links
persisting valid configuration back to YAML

The REST layer also acts as the bridge between the operator UI and the in-memory SENSEI model.
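The kind of checks the validation step performs can be sketched without any framework. The real backend uses Pydantic; the version below is a dependency-free stand-in built on the standard library, and the Domain and Edge shapes, the ConfigError type, and the specific rules are assumptions made for the example.

```python
import ipaddress
from dataclasses import dataclass


class ConfigError(ValueError):
    """Raised when a configuration payload fails validation."""


@dataclass(frozen=True)
class Domain:
    name: str
    subnets: tuple[str, ...]   # addresses or CIDR subnets, as strings


@dataclass(frozen=True)
class Edge:
    source: str   # name of the source domain
    target: str   # name of the target domain


def validate(domains: list[Domain], edges: list[Edge]) -> None:
    """Reject duplicate domains, malformed subnets, and dangling edges."""
    names: set[str] = set()
    for domain in domains:
        if domain.name in names:
            raise ConfigError(f"duplicate domain: {domain.name}")
        names.add(domain.name)
        for subnet in domain.subnets:
            try:
                # strict=False also accepts host addresses such as 10.0.0.5/24.
                ipaddress.ip_network(subnet, strict=False)
            except ValueError as exc:
                raise ConfigError(f"{domain.name}: bad subnet {subnet!r}") from exc
    for edge in edges:
        for endpoint in (edge.source, edge.target):
            if endpoint not in names:
                raise ConfigError(f"edge references unknown domain: {endpoint}")


domains = [Domain("hq", ("10.0.0.0/24",)), Domain("field", ("192.168.5.7",))]
validate(domains, [Edge("hq", "field")])   # passes silently
```

Only configuration that passes checks like these would be written back to disk.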

YAML-backed Configuration

Configuration is persisted in a YAML file. That made it easy to:

bootstrap the system from a human-editable source
keep configuration outside the code
inspect and version configuration changes

Over time, the backend evolved from simply loading configuration to also validating and writing it back after operator changes.

Runtime Telemetry and History

For monitoring and visualization, the frontend polls runtime link information and keeps local history in order to render graphs over time.

The current graphs focus on:

latency
throughput
bandwidth
saturation
availability

This helped turn SENSEI from a static configuration tool into a live operational dashboard.
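The "poll and keep local history" idea boils down to a bounded buffer per link. The real frontend is written in React; the sketch below uses Python only to show the data structure, and LinkHistory and its method names are hypothetical.

```python
from collections import deque


class LinkHistory:
    """Keeps the most recent samples per link, dropping the oldest ones."""

    def __init__(self, max_samples: int = 300) -> None:
        self._max = max_samples
        self._samples: dict[str, deque] = {}

    def record(self, link_id: str, timestamp: float, latency_ms: float) -> None:
        # deque(maxlen=...) discards the oldest entry once the buffer is full.
        buf = self._samples.setdefault(link_id, deque(maxlen=self._max))
        buf.append((timestamp, latency_ms))

    def series(self, link_id: str) -> list[tuple[float, float]]:
        """Return (timestamp, latency_ms) pairs, oldest first, for plotting."""
        return list(self._samples.get(link_id, ()))


history = LinkHistory(max_samples=3)
for t, latency in enumerate([10.0, 12.0, 11.0, 15.0]):
    history.record("radio", float(t), latency)   # one entry per poll
```

Capping the buffer keeps memory use constant no matter how long the dashboard stays open.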

Why the Architecture Looks This Way

A few design decisions shaped the system.

Clear Separation Between Configuration and Runtime State

One of the first lessons was that configured state and live state are not the same thing.

Configured state includes things like:

domains
edges
profiles
connectors
acceptors

Runtime state includes:

current links
current node telemetry
active-link flags
time-series measurements

Treating these separately made the system easier to reason about and easier to present in the UI.

Layering the System

The monitoring, sharing, and consumer/adaptation layers were intentionally kept separate.

This makes it possible to:

plug in new monitoring sources
change or improve state-sharing mechanisms
add new consumers of network state
expose additional operator tools without rewriting the lower layers

Internal Transports

Another important principle was to use two separate transports: NATS for communication between SENSEI microservices and Mockets (a custom transport we built for tactical environments) between SENSEI instances.

Implementation Notes

A few concrete implementation details worth mentioning:

The backend uses Pydantic models to validate config and runtime payloads.
The web frontend uses React with Axios to communicate with the backend.
The UI includes monitoring graphs and active-link controls.
CORS had to be configured explicitly so the frontend and backend could work together during development.
Configuration persistence required some care because YAML rewriting can drop comments or unmodeled fields if you are not careful.
The active-link workflow introduced an operational distinction between:
- configured connectors/acceptors
- discovered or active runtime links
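On the YAML persistence point above, one way to avoid losing unmodeled fields is to merge validated values back into the originally loaded mapping instead of serializing the model alone. This is a sketch under assumptions: MODELED_KEYS and merge_for_write are hypothetical, and it works on plain dictionaries, so it does not address comments, which need a YAML library that supports round-tripping.

```python
import copy

# Fields the (hypothetical) validation model knows about.
MODELED_KEYS = {"name", "subnets"}


def merge_for_write(raw: dict, validated: dict) -> dict:
    """Overlay validated fields on the raw mapping, keeping unknown keys."""
    merged = copy.deepcopy(raw)          # never mutate the loaded document
    for key in MODELED_KEYS:
        if key in validated:
            merged[key] = validated[key]
        else:
            merged.pop(key, None)        # a modeled field was removed on purpose
    return merged


raw = {"name": "hq", "subnets": ["10.0.0.0/24"], "x-notes": "owned by team A"}
validated = {"name": "hq", "subnets": ["10.0.0.0/24", "10.0.1.0/24"]}
merged = merge_for_write(raw, validated)
```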

What I Like About the Current Design

There are a few things I particularly like about where SENSEI ended up.

1. The core concepts are explicit

Domains, edges, profiles, active links, and telemetry are all first-class entities. That makes the system easier to explain and easier to evolve.

2. The architecture bridges research and engineering

The earlier conceptual SENSEI architecture naturally supported monitoring and state exchange, while the newer REST/UI work made the system much more usable in practice.

3. It supports both operators and automation

SENSEI can act as:

a monitoring dashboard
a control interface
a data source for other systems
a substrate for adaptive communication logic

That combination is one of the system’s biggest strengths.

Future Work

There are still several areas I would like to keep improving.

Some of the most important ones are:

richer runtime analytics and classification
improved persistence mechanisms that better preserve YAML structure/comments
tighter integration between supervisory logic and active-link policy
more advanced topology and map-based visualization
stronger support for historical analysis and replay
cleaner CRUD APIs for all configuration resources
more flexible transport backends for the sharing layer

Conclusion

SENSEI grew from the need to understand and exchange network state in distributed, tactical-style environments, but it has evolved into something broader: a system for monitoring, sharing, supervision, and operator control.

At the architectural level, the most important ideas are:

local monitoring
distributed state sharing
supervisory consumption of that state
a modern management plane built around REST, YAML, and a web UI

For me, one of the most interesting parts of the project was balancing the original system concepts with the practical realities of building a maintainable tool: configuration management, validation, UI workflows, and runtime observability.

SENSEI is still evolving, but the current design already provides a strong foundation for both experimentation and operational use.
