DISCLAIMER

This post is a work in progress: I am still writing up several years of development.

Introduction

Welcome back!

In this post I want to document the design of SENSEI, a system for network state estimation, sharing, supervision, and operator control in tactical and distributed environments.

SENSEI started from a practical need: we needed a way to observe network conditions across multiple nodes, share that state between instances, and turn those observations into actionable information for both applications and operators. Over time, the project evolved from a monitoring-oriented system into a broader platform that supports configuration management, runtime supervision, active link control, and visualization.

This post describes what SENSEI is, how it is structured, how it was built, and some of the design decisions behind the current architecture.


What is SENSEI

At a high level, SENSEI is a framework for collecting, exchanging, and consuming network state.

It is designed for environments where connectivity is dynamic, distributed, and sometimes constrained. Instead of treating networking as a black box, SENSEI exposes a structured view of the network through a combination of:

  • local monitoring
  • state sharing between instances
  • supervisory reasoning
  • operator-facing visualization and control

In practice, SENSEI combines several ideas:

  1. Monitoring network behavior locally
  2. Sharing summarized network state across nodes
  3. Supervising and interpreting that state
  4. Exposing configuration and runtime state through a REST API and web UI

The result is a system that can support both humans in the loop and software components that need to understand or react to network conditions.


SENSEI Architecture

The figure below shows the current SENSEI architecture.

[Figure: SENSEI Architecture]

SENSEI is best understood as a set of interacting layers:

  1. Operator / Management Plane: Configures SENSEI instances and monitors their state.
  2. SENSEI Core: Harvests, merges, analyzes, and shares network state information.
  3. REST-managed configuration and runtime model: Provides access to SENSEI information “locally” through a REST interface.
  4. Distributed SENSEI node instances: Communicate with each other over disrupted links with heterogeneous characteristics.

System Requirements

The architecture was driven by a few practical needs.

The main objectives were:

  • Observe network conditions locally: operators need to understand the state of the network, and applications need network information to adapt their behavior.
  • Share network state between SENSEI instances: use tomography to share only what is necessary for remote adaptation and monitoring, and provide a drill-down interface for when more information is required.
  • Provide a programmatic API for configuration and runtime state
  • Model monitored domains, edges, and link profiles: the model must support flexible logical monitoring.
  • Allow runtime control of active links: operators must be able to specify preferences on how links are used.
  • Support extensibility for additional monitoring sources and transports

As the system evolved, another major requirement emerged: SENSEI needed a cleaner management plane. It was no longer enough to just collect and exchange telemetry; we also wanted operators to be able to inspect and modify the system from a UI.

That led to the addition of a FastAPI backend, a React frontend, and a YAML-backed configuration workflow.


High-level Design Discussion

Conceptually, SENSEI has three major concerns:

  1. Monitoring and local state estimation
  2. Sharing state between distributed instances
  3. Using that state for supervision, adaptation, and operator control

To reflect that, the core architecture is organized into three logical layers:

1. Monitoring Layer

This is where raw or near-raw information is collected.

The main components are:

  • NetSensor
    Responsible for passively harvesting traffic information, detecting topology-related information, and collecting local measurements.

  • NodeMonitor
    Aggregates and fuses local state into a more meaningful node-level picture. This can include network statistics, node health, and topology summaries.

  • Other Producers
    Additional sources such as Mockets, NetProxy, OS sensors, and location-related inputs can contribute data into the monitoring layer.

The monitoring layer is intentionally modular. I wanted SENSEI to be able to ingest observations from more than one source, instead of hard-coding a single monitoring path.
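To make the modularity idea concrete, here is a minimal Python sketch of a pluggable producer interface. It is not the actual SENSEI code: the names Observation, Producer, StaticProducer, and fuse are hypothetical, and averaging latency per link is only a placeholder for whatever fusion NodeMonitor really performs.

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import mean
from typing import Iterable, Protocol


@dataclass(frozen=True)
class Observation:
    """One measurement reported by a producer (hypothetical shape)."""
    source: str        # which producer reported it, e.g. "NetSensor"
    link_id: str       # identifier of the observed link
    latency_ms: float  # measured latency in milliseconds


class Producer(Protocol):
    """Anything that can contribute observations to the monitoring layer."""

    def poll(self) -> Iterable[Observation]: ...


class StaticProducer:
    """Stand-in producer that replays a fixed list of observations."""

    def __init__(self, observations: list[Observation]) -> None:
        self._observations = observations

    def poll(self) -> Iterable[Observation]:
        return list(self._observations)


def fuse(producers: Iterable[Producer]) -> dict[str, float]:
    """Average latency per link across all producers (a deliberately naive rule)."""
    samples: dict[str, list[float]] = {}
    for producer in producers:
        for obs in producer.poll():
            samples.setdefault(obs.link_id, []).append(obs.latency_ms)
    return {link: mean(values) for link, values in samples.items()}
```

Because fuse only depends on the poll method, adding a new source means writing one more class with that method; nothing downstream has to change.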

2. Sharing Layer

This is the layer that handles SENSEI-to-SENSEI state exchange.

Its responsibilities include:

  • group management / peer discovery
  • network state exchange
  • transport abstraction
  • filtering and aggregating what to share, compression, access control, etc.

A key design point here is that the Sharing Layer is the architectural boundary between SENSEI instances. Internally, the current implementation uses:

  • NATS for intra-instance transport (between the SENSEI microservices)
  • Mockets for inter-instance transport (between SENSEI instances)
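The transport abstraction mentioned above can be sketched as a small interface that hides the concrete transport from the code that shares state. This is an illustration only: Transport, LoopbackTransport, StateSharer, and the "sensei.state" subject are hypothetical names, and the in-memory loopback stands in for a real NATS or Mockets binding.

```python
from __future__ import annotations

import json
from abc import ABC, abstractmethod
from typing import Callable

Handler = Callable[[bytes], None]


class Transport(ABC):
    """Minimal publish/subscribe contract the sharing code relies on."""

    @abstractmethod
    def publish(self, subject: str, payload: bytes) -> None: ...

    @abstractmethod
    def subscribe(self, subject: str, handler: Handler) -> None: ...


class LoopbackTransport(Transport):
    """In-memory stand-in; a real binding would wrap NATS or Mockets."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = {}

    def publish(self, subject: str, payload: bytes) -> None:
        # Deliver synchronously to every handler registered on the subject.
        for handler in self._handlers.get(subject, []):
            handler(payload)

    def subscribe(self, subject: str, handler: Handler) -> None:
        self._handlers.setdefault(subject, []).append(handler)


class StateSharer:
    """Serializes a state summary and hands it to whatever transport it was given."""

    def __init__(self, transport: Transport, subject: str = "sensei.state") -> None:
        self._transport = transport
        self._subject = subject

    def share(self, summary: dict) -> None:
        payload = json.dumps(summary, sort_keys=True).encode("utf-8")
        self._transport.publish(self._subject, payload)
```

StateSharer never names a concrete transport, so swapping the backend is a constructor argument rather than a code change.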

3. Consumer / Adaptation Layer

Once state has been collected and shared, something has to consume it.

The main consumers in the current design are:

  • NetSupervisor
    Performs higher-level network analysis such as link classification and inference over throughput, latency, loss, and other conditions.

  • ACMS
    Adaptive communication management logic, intended to support policy-based tuning and adaptation.

  • NetViewer / UI consumers
    Visualization and operator awareness components.

This separation lets SENSEI support both human users and autonomous or semi-autonomous adaptation logic. See this for an interesting DEMO.
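As a rough illustration of what link classification over latency, loss, and throughput can look like, here is a threshold-based sketch. The class names, labels, and every threshold below are assumptions made up for this example; they are not the rules NetSupervisor actually applies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkSample:
    """A single snapshot of link conditions (hypothetical shape)."""
    latency_ms: float
    loss_ratio: float        # fraction of packets lost, 0.0 to 1.0
    throughput_kbps: float


def classify_link(sample: LinkSample) -> str:
    """Map a sample to a coarse label using illustrative thresholds."""
    # Heavy loss or no throughput at all: treat the link as disrupted.
    if sample.loss_ratio >= 0.5 or sample.throughput_kbps <= 0:
        return "disrupted"
    # High latency or noticeable loss: usable, but degraded.
    if sample.latency_ms > 500 or sample.loss_ratio > 0.05:
        return "degraded"
    return "good"
```

A real supervisor would likely reason over a window of samples rather than a single snapshot, but the input and output shape would be similar.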


The REST-managed SENSEI Model

One of the biggest architectural changes in the newer version of SENSEI was introducing a clear configuration and runtime data model behind a REST API to support multiple communication paths between endpoints.

The backend models the following main entities:

  • Monitoring Domains
    Represent groups of addresses and subnets that belong together logically.

  • Monitoring Edges
    Represent relationships between domains.

  • Link Profiles
    Describe expected link characteristics, both globally and on a per-edge basis.

  • Connectors / Acceptors
    Represent outgoing and incoming connectivity settings for SENSEI instances.

  • Runtime Telemetry
    Stores live link and node information such as timestamps, throughput, bandwidth, latency, saturation, and availability.

  • Active Links
    Represent runtime link state and operator controls such as:

    • preferred: prefer this link
    • disableProbing: do not probe this link
    • disableMonitoring: do not move monitoring data through this link
    • deactivationRequested: stop using this link

This model is managed by a Configuration Manager that loads, applies, validates, and persists configuration.
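The entities above can be sketched as plain data classes. This is a simplified, dependency-free illustration and not the real model: the actual backend validates with Pydantic, and apart from the four active-link flag names, every class and field name here is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from ipaddress import ip_network


@dataclass
class MonitoringDomain:
    """A named group of addresses and subnets."""
    name: str
    subnets: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.name:
            raise ValueError("domain name must not be empty")
        for subnet in self.subnets:
            # Raises ValueError for anything that is not an address or CIDR block.
            ip_network(subnet, strict=False)


@dataclass
class MonitoringEdge:
    """A relationship between two domains, referenced by name."""
    source: str
    target: str


@dataclass
class ActiveLink:
    """Runtime link plus the operator flags described above."""
    link_id: str
    preferred: bool = False
    disableProbing: bool = False
    disableMonitoring: bool = False
    deactivationRequested: bool = False


def validate_edges(domains: list[MonitoringDomain], edges: list[MonitoringEdge]) -> None:
    """Reject edges that point at a domain that was never defined."""
    known = {domain.name for domain in domains}
    for edge in edges:
        for endpoint in (edge.source, edge.target):
            if endpoint not in known:
                raise ValueError(f"edge references unknown domain: {endpoint}")
```

Cross-entity checks such as validate_edges are the kind of rule a configuration manager has to run after the per-object validation has passed.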


How the New System Was Built

The current implementation is split into a few major software pieces.

React Frontend

The UI is written in React and provides a web-based interface for:

  • browsing configured domains, edges, connectors, and acceptors
  • viewing global and edge-specific profiles
  • inspecting runtime telemetry
  • controlling active links
  • displaying monitoring graphs over time

This was a major step forward compared to older workflows, because it gave us a clean way to make SENSEI inspectable and editable by operators.

The figure below shows the SENSEI UI.

[Figure: SENSEI UI]

[Figure: SENSEI UI monitoring graphs]

FastAPI Backend

The backend is implemented in Python using FastAPI.

Its responsibilities include:

  • exposing /api/v2/... endpoints
  • validating configuration through Pydantic
  • serving configuration objects such as domains, edges, profiles, connectors, and acceptors
  • exposing runtime objects such as links, nodes, and active links
  • persisting valid configuration back to YAML

The REST layer also acts as the bridge between the operator UI and the in-memory SENSEI model.

YAML-backed Configuration

Configuration is persisted in a YAML file. That made it easy to:

  • bootstrap the system from a human-editable source
  • keep configuration outside the code
  • inspect and version configuration changes

Over time, the backend evolved from simply loading configuration to also validating and writing it back after operator changes.
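A validate-then-write-back flow can be sketched as follows. To stay dependency-free, this example serializes with the standard-library json module where SENSEI uses YAML, and the section names, function names, and the temp-file strategy are all assumptions for illustration.

```python
import json
import os
import tempfile
from pathlib import Path

# Assumed top-level sections, mirroring the configured entities in this post.
REQUIRED_SECTIONS = ("domains", "edges", "profiles", "connectors", "acceptors")


def validate_config(config: dict) -> None:
    """Fail fast if a top-level section is missing."""
    missing = [name for name in REQUIRED_SECTIONS if name not in config]
    if missing:
        raise ValueError("missing sections: " + ", ".join(missing))


def load_config(path: Path) -> dict:
    """Read and validate a configuration file."""
    config = json.loads(path.read_text(encoding="utf-8"))
    validate_config(config)
    return config


def save_config(config: dict, path: Path) -> None:
    """Persist only valid configuration, replacing the old file atomically."""
    validate_config(config)  # an invalid config never touches the disk
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(config, handle, indent=2)
        os.replace(tmp_name, path)  # rename over the target in one step
    except BaseException:
        os.unlink(tmp_name)  # clean up the partial temp file
        raise
```

Writing to a temporary file and renaming it means a crash mid-write leaves the previous configuration intact, which matters once operators can change settings from the UI.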

Runtime Telemetry and History

For monitoring and visualization, the frontend polls runtime link information and keeps local history in order to render graphs over time.

The current graphs focus on:

  • latency
  • throughput
  • bandwidth
  • saturation
  • availability

This helped turn SENSEI from a static configuration tool into a live operational dashboard.
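The "keep local history" part boils down to a bounded buffer per link. The real frontend is written in React; the sketch below expresses the same idea in Python only to stay consistent with the other examples, and LinkPoint, LinkHistory, and the default of 120 points are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkPoint:
    """One polled sample for a link (hypothetical shape)."""
    timestamp: float
    latency_ms: float


class LinkHistory:
    """Keeps the most recent samples per link so a graph can be redrawn."""

    def __init__(self, max_points: int = 120) -> None:
        self._max_points = max_points
        self._points = {}

    def record(self, link_id: str, point: LinkPoint) -> None:
        if link_id not in self._points:
            # A deque with maxlen silently drops the oldest point when full.
            self._points[link_id] = deque(maxlen=self._max_points)
        self._points[link_id].append(point)

    def series(self, link_id: str) -> list:
        """Return the stored points for a link, oldest first."""
        return list(self._points.get(link_id, ()))
```

Bounding the buffer keeps memory flat no matter how long the dashboard stays open, at the cost of losing older samples.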


Why the Architecture Looks This Way

A few design decisions shaped the system.

Clear Separation Between Configuration and Runtime State

One of the first lessons was that configured state and live state are not the same thing.

Configured state includes things like:

  • domains
  • edges
  • profiles
  • connectors
  • acceptors

Runtime state includes:

  • current links
  • current node telemetry
  • active-link flags
  • time-series measurements

Treating these separately made the system easier to reason about and easier to present in the UI.

Layering the System

The monitoring, sharing, and consumer/adaptation layers were intentionally kept separate.

This makes it possible to:

  • plug in new monitoring sources
  • change or improve state-sharing mechanisms
  • add new consumers of network state
  • expose additional operator tools without rewriting the lower layers

Internal Transports

Another important design decision was to use two separate transports: NATS for communication between SENSEI microservices, and Mockets (a custom transport we built for tactical environments) for communication between SENSEI instances.


Implementation Notes

A few concrete implementation details worth mentioning:

  • The backend uses Pydantic models to validate config and runtime payloads.
  • The web frontend uses React with Axios to communicate with the backend.
  • The UI includes monitoring graphs and active-link controls.
  • CORS had to be configured explicitly so the frontend and backend could work together during development.
  • Configuration persistence required some care, because rewriting YAML can drop comments or unmodeled fields.
  • The active-link workflow introduced an operational distinction between:
    • configured connectors/acceptors
    • discovered or active runtime links
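One way to avoid dropping unmodeled fields on write-back is to merge the validated values into the originally loaded document instead of serializing the model alone. The helper below is a hypothetical sketch of that idea, with made-up keys such as "x-note". It does not address comments; preserving those would need a round-trip YAML library such as ruamel.yaml.

```python
from copy import deepcopy


def merge_preserving_unknown(raw: dict, updates: dict) -> dict:
    """Overlay modeled values on the raw document, keeping keys the model ignores."""
    merged = deepcopy(raw)  # never mutate the document that was loaded
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Recurse so nested unknown keys survive as well.
            merged[key] = merge_preserving_unknown(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged
```

For example, updating only a domain's subnets leaves an extra annotation on that domain untouched:

```python
raw = {"domains": {"alpha": {"subnets": ["10.0.0.0/24"], "x-note": "lab"}}}
updates = {"domains": {"alpha": {"subnets": ["10.0.1.0/24"]}}}
merged = merge_preserving_unknown(raw, updates)
# merged["domains"]["alpha"] still contains the "x-note" key
```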

What I Like About the Current Design

There are a few things I particularly like about where SENSEI ended up.

1. The core concepts are explicit

Domains, edges, profiles, active links, and telemetry are all first-class entities. That makes the system easier to explain and easier to evolve.

2. The architecture bridges research and engineering

The earlier conceptual SENSEI architecture naturally supported monitoring and state exchange, while the newer REST/UI work made the system much more usable in practice.

3. It supports both operators and automation

SENSEI can act as:

  • a monitoring dashboard
  • a control interface
  • a data source for other systems
  • a substrate for adaptive communication logic

That combination is one of the system’s biggest strengths.


Future Work

There are still several areas I would like to keep improving.

Some of the most important ones are:

  • richer runtime analytics and classification
  • improved persistence mechanisms that better preserve YAML structure/comments
  • tighter integration between supervisory logic and active-link policy
  • more advanced topology and map-based visualization
  • stronger support for historical analysis and replay
  • cleaner CRUD APIs for all configuration resources
  • more flexible transport backends for the sharing layer

Conclusion

SENSEI grew from the need to understand and exchange network state in distributed, tactical-style environments, but it has evolved into something broader: a system for monitoring, sharing, supervision, and operator control.

At the architectural level, the most important ideas are:

  • local monitoring
  • distributed state sharing
  • supervisory consumption of that state
  • a modern management plane built around REST, YAML, and a web UI

For me, one of the most interesting parts of the project was balancing the original system concepts with the practical realities of building a maintainable tool: configuration management, validation, UI workflows, and runtime observability.

SENSEI is still evolving, but the current design already provides a strong foundation for both experimentation and operational use.