==============================================================================
 MVPS -- Data-Plane Profile (Proposal)
 A companion proposal for embedding the three-layer coherence framework
 in programmable forwarding silicon (P4 / Tofino-class targets).

 Leonardo Melegassi <melegassi@catellix.com>
 Catellix Research

 Version: 0.1 (proposal -- not yet implemented or validated on hardware)
 Date:    2026-05-21
 Status:  companion to MVPS_THREE_LAYER_MATHEMATICAL_EVIDENCE.txt v1.1
==============================================================================


 Abstract.

 This document proposes a *data-plane profile* of the MVPS three-layer
 coherence framework defined in MVPS_THREE_LAYER_MATHEMATICAL_EVIDENCE.txt
 (the math companion, v1.1). Where the math companion targets
 observability from outside the network -- composing MVPS bundles from
 external probes (RIPE Atlas, Catchpoint, customer endpoints, looking
 glasses) -- this profile targets *in-band computation* of the same
 coherence axes inside programmable forwarding silicon (P4 targets such
 as Intel Tofino-2, and software equivalents such as VPP/DPDK).

 The proposal preserves the axiom system of the math companion verbatim.
 The definitions of C_1, C_2, C_3, the operational Hamiltonian H, the
 Mahalanobis-based phase-distance Phi_D, and the operational phase
 label Phi_K are unchanged. What changes is the *type signature of a
 vantage*: from "an external probe vantage producing an MVPS bundle
 over the public Internet" to "a next-hop, queue, or port inside the
 same forwarding plane producing a per-tick state snapshot". Section 8
 formalises this mapping.

 Three honest framings used throughout this profile.

   (a) Nothing in this document has been implemented on real hardware.
       It is a design proposal, grounded in standard P4 idioms
       (Count-Min sketches, Bloom filters, fixed-point lookup tables)
       and in published Tofino-2 resource budgets. A reference
       implementation and silicon validation are listed as open work
       in Section 9.

   (b) The worked example in Section 5 (PIX gray-failure incident on a
       Tier-1 peering edge) is *synthetic*. It is constructed from
       publicly-known characteristics of Tier-1 BGP behaviour, ECMP
       failure modes, IX peering at IX.br SP, and AWS sa-east-1
       reachability, but it does not describe a specific real
       incident. Its purpose is to illustrate the operational gap that
       data-plane MVPS is meant to close.

   (c) This profile is a *proposal for a future companion I-D*; it is
       not itself an Internet-Draft submission. It exists so that
       reviewers of the math companion -- in particular Benoit Donnet
       (Universite de Liege) -- can assess whether the framework's
       axiomatic abstraction generalises to in-network computation
       without algebraic changes. That generalisation is the central
       claim of the framework's long-term value proposition.

 Companion artefacts:
   - MVPS_THREE_LAYER_MATHEMATICAL_EVIDENCE.txt v1.1
       (mathematical reference; this profile cites it normatively)
   - draft-melegassi-ippm-mvps-bundle
       (the data-structure I-D)
   - https://catellix.com/v11-evidence.html
       (visual evidence package, synthetic scenarios + conjecture
        tests)


==============================================================================
 1. Problem statement: from observatory to embedded
==============================================================================

 The MVPS framework as currently defined operates as an *observatory*:
 a controller external to the production data path collects MVPS
 bundles from N >= 2 vantages, computes the three coherence axes
 (C_1, C_2, C_3), and emits a phase label Phi_K describing whether the
 system is in BAU, WATCH, ALARM, or CRITICAL. This pipeline is
 well-suited to incident reconstruction, longitudinal SLA audit, and
 research over public measurement platforms. It is *not* well-suited
 to two operational regimes that increasingly dominate carrier-grade
 networks:

   1.1 Sub-second SLA breaches under gray failure.
       Modern fintech (e.g. real-time payment rails such as PIX in
       Brazil, FedNow in the US, UPI in India), low-latency trading,
       and 5G ultra-reliable low-latency communication (URLLC)
       workloads have SLAs measured in tens to hundreds of
       milliseconds. A gray failure
       (BGP session UP, interface counters clean, but a peer silently
       degrading via a snake-path or partial blackhole) can degrade
       these workloads for minutes before any external observatory
       has enough samples to detect it. Detection in the observatory
       regime is post-hoc by construction: the bundle has to be
       collected, transported off-box, decoded, computed, and then
       acted upon in the control plane. End to end, this rarely takes
       less than 10-30 seconds, and frequently takes longer.

   1.2 In-network autonomous action.
       Operations practice has shifted from "observe and alert" to
       "observe and react". Autonomous load-balancers (e.g. Google
       Maglev-class), in-band telemetry (IOAM, INT), and programmable
       forwarding (P4) have made it routine for production silicon
       to make local routing decisions on telemetry signals at line
       rate. MVPS as an observatory cannot participate in this loop:
       its computation lives outside the data path. To be useful as
       a primary signal for autonomous action, the same coherence
       axes must be computable on-box and at line rate.

 The thesis of this profile is that *the same axiomatic framework
 covers both regimes*. The math companion's bundle B(t) is an
 abstraction over a finite set of vantages V_1, ..., V_N; the axioms
 do not constrain whether those vantages are external probers or
 internal forwarding objects. What changes between the observatory
 and the embedded profile is concrete: the source of data per
 vantage, the time-bucket granularity, and the implementation
 substrate (Python on a controller versus P4 register arrays on an
 ASIC). The mathematics is identical and is reused without
 modification.

 The remainder of this document specifies what those concrete
 changes are.


==============================================================================
 2. Vantage transformation: from probe to next-hop
==============================================================================

 In the math companion (Sec. 1), an MVPS bundle B(t) is defined as a
 JSON object containing a list of vantages V = {V_1, ..., V_N} with
 N >= 2. Each V_i carries an ordered hop list H_i, an RTT vector, a
 geographic anchor sequence, and optional metadata. The vantage is
 implicitly external: it is a host that has issued traceroute-class
 probes and serialised the result.

 In the data-plane profile, a vantage is an internal forwarding
 object. Three concrete vantage types are defined:

   2.1 Next-hop vantage (the primary case in this profile).
       For an Equal-Cost Multi-Path (ECMP) group of width W, a
       next-hop vantage V_i is the i-th next-hop of the group,
       observed over a tick window of width Delta_t (default
       Delta_t = 10 ms). The bundle B(t) at tick t is the unordered
       collection of W next-hop vantages, each summarising what
       traffic that next-hop saw, returned, or failed to return
       during [t, t + Delta_t).

   2.2 Queue vantage.
       For a multi-queue port (e.g. a Strict Priority + Weighted
       Round Robin scheduler with K queues), V_i is the i-th queue.
       This is the natural vantage choice for diagnosing
       intra-port head-of-line blocking, microburst-induced jitter,
       and SLA differentiation across DSCP classes.

   2.3 Port vantage.
       For a chassis with multiple physical ports landing on the
       same logical attachment circuit (e.g. the member links of a
       LAG), V_i is the i-th port. This is the
       natural vantage choice for diagnosing LAG hash polarisation
       and per-fibre optical degradation.

 The choice of vantage type is per-deployment. A peering edge router
 will most commonly use next-hop vantages over its ECMP groups; a
 service edge router with strict QoS will additionally use queue
 vantages; a core-fabric router with many LAGs will additionally
 use port vantages. Multiple vantage types may coexist on the same
 chassis with disjoint resource pools.

 In all three cases the cardinality N corresponds to the width of
 the local resource (ECMP width, queue count, LAG width). Typical
 values: N in {2, 4, 8, 16}. The math companion's lower bound
 N >= 2 is preserved.

 Per-vantage state.

 Each vantage V_i maintains, in P4 register arrays, four observable
 streams during the tick:

   (a) Flow distribution sketch p_i.
       A Count-Min Sketch (CMS) over a configurable flow key (5-tuple
       hash by default) is updated on every packet of V_i that the
       sampler of Sec. 6.6 selects (default: 1 in 16).
       Recommended dimensions: d = 4 hash functions, w = 1024 buckets
       per hash, 16-bit counters. SRAM cost per vantage: 4 * 1024 *
       2 bytes = 8 KiB.

   (b) RTT estimator rtt_i.
       For TCP traffic, an inline TCP-RACK-like estimator updates an
       exponentially-weighted RTT register from observed
       SEQ -> ACK round-trips. For UDP/QUIC traffic, an injected
       IOAM probe at a fixed cadence (e.g. 100 ms per next-hop) is
       used to refresh rtt_i. SRAM cost: 32-bit register + 16-bit
       sample-count register per vantage = 6 bytes.

   (c) Return-path source set S_i.
       A Bloom filter accumulates the source IPs of ICMP
       Time-Exceeded and ICMP Destination-Unreachable messages
       arriving on the return path bound to V_i. Recommended
       dimensions: m = 8192 bits, k = 5 hash functions. SRAM cost:
       1 KiB per vantage.

   (d) Counters.
       Packet count, byte count, drop count, retransmit count.
       SRAM cost: 16 bytes per vantage.

 Total per-vantage SRAM budget: ~9.0 KiB.
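
 A minimal software model of stream (a) may help when sizing the
 sketch. It is a sketch under stated assumptions, not a P4
 implementation: the class name CountMinSketch is hypothetical, a
 salted BLAKE2b digest stands in for the hardware hash units, and
 counters saturate at 2^16 - 1 rather than wrap (the profile does not
 say which).

```python
import hashlib


class CountMinSketch:
    """Software model of the per-vantage flow sketch p_i."""

    def __init__(self, d=4, w=1024, counter_bits=16):
        self.d, self.w, self.counter_bits = d, w, counter_bits
        self.max_count = (1 << counter_bits) - 1
        self.rows = [[0] * w for _ in range(d)]

    def _bucket(self, row, key):
        # Assumption: a salted BLAKE2b digest per row stands in for the
        # hardware hash units of a real target.
        digest = hashlib.blake2b(key, digest_size=8, salt=bytes([row])).digest()
        return int.from_bytes(digest, "big") % self.w

    def update(self, key, count=1):
        for r in range(self.d):
            b = self._bucket(r, key)
            # Assumption: saturate rather than wrap when a counter fills up.
            self.rows[r][b] = min(self.rows[r][b] + count, self.max_count)

    def estimate(self, key):
        # Minimum over rows: never below the true count (absent saturation).
        return min(self.rows[r][self._bucket(r, key)] for r in range(self.d))

    def sram_bytes(self):
        return self.d * self.w * self.counter_bits // 8


flow_key = b"10.0.0.1:443->16.182.0.9:51000/tcp"
cms = CountMinSketch()
cms.update(flow_key, 3)
```

 With the recommended dimensions, cms.sram_bytes() reproduces the
 8 KiB figure quoted in (a).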

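 At the close of each tick (t -> t + Delta_t), the bundle B(t) is
 the tuple, indexed by vantage (the order carries no meaning, since
 the axes of Sec. 3 are symmetric in the vantage index),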

       B(t) = ( (p_1, rtt_1, S_1, ctr_1),
                (p_2, rtt_2, S_2, ctr_2),
                ...,
                (p_N, rtt_N, S_N, ctr_N) ).

 This is the on-chip analogue of the JSON bundle defined in the math
 companion's Sec. 1. No serialisation to JSON is required for the
 in-band computation that follows; serialisation is needed only
 when a bundle needs to be exfiltrated for offline analysis (which
 is the IOAM TLV path defined in Sec. 7).


==============================================================================
 3. Axes restated for the data plane
==============================================================================

 The three coherence axes from the math companion (Sec. 2.1-2.3) are
 reproduced verbatim below, followed by a P4-friendly implementation
 strategy. The definitions are unchanged; only the numerical strategy
 is adapted to fixed-point ALUs and bounded register arrays.

 ---------------------------------------------------------------------------
 3.1 C_1 -- causal coherence (Einstein bound + temporal stability)
 ---------------------------------------------------------------------------

 Definition (math companion, Sec. 2.1, unchanged).

   C_1 = min(C_1^Einstein, C_1^tau)

   C_1^Einstein
       = 1 - (1/M) * sum_{(a,b) : a < b} 1[ rtt_a + rtt_b
                                             < 2 * d_ab / c_f ]

   C_1^tau = exp( -H_v ),   H_v = - sum_p p log p

 Data-plane implementation.

 Einstein term. The per-vantage RTT register rtt_i is an unsigned
 32-bit fixed-point quantity in microseconds. The per-pair distance
 2 * d_ab / c_f is a *deployment-time constant* compiled into a
 lookup table indexed by (a, b). For an ECMP group of width 4 there
 are C(4,2) = 6 pairs; the table is 6 * 8 bytes = 48 bytes per group.
 The comparison is one addition (rtt_a + rtt_b), one subtraction of
 the table entry, and one sign-bit test, fitting
 in a single P4 stage. The Einstein term itself is a popcount of the
 pair-violation bits divided by M; both operations fit in a second
 stage.
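
 The Einstein term can be checked against a short software
 reference. The sketch below follows the definition as written (a
 pair is flagged when rtt_a + rtt_b falls below its table entry). It
 assumes M = C(N,2), the number of vantage pairs; the function name
 c1_einstein and the dict-based table are illustrative only.

```python
from itertools import combinations


def c1_einstein(rtt_us, bound_us):
    """C_1^Einstein = 1 - (flagged pairs) / M, assuming M = C(N, 2).

    rtt_us   -- per-vantage RTT registers, in microseconds
    bound_us -- dict {(a, b): 2 * d_ab / c_f in microseconds}, a < b
    """
    pairs = list(combinations(range(len(rtt_us)), 2))
    flagged = sum(
        1
        for a, b in pairs
        # Add, subtract the table entry, test the sign bit.
        if rtt_us[a] + rtt_us[b] - bound_us[(a, b)] < 0
    )
    return 1.0 - flagged / len(pairs)


# Width-4 group with a 310 us bound on every pair (illustrative values).
bounds = {pair: 310 for pair in combinations(range(4), 2)}
clean = c1_einstein([4100, 4300, 5000, 4600], bounds)
one_flag = c1_einstein([100, 100, 5000, 4600], bounds)
```

 In the second call only the pair (0, 1) sums to less than its bound,
 so one of the six pairs is flagged.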

 Temporal-stability term. C_1^tau requires Shannon entropy H_v over
 a fingerprint distribution. Computing Shannon entropy in P4 is
 expensive (no native log). The recommended profile is:

   (i) Maintain a *fingerprint occupancy histogram* H_occ over the
       last K ticks of fingerprints observed on V_i. K is a
       deployment parameter, typically 16-64.

   (ii) Define a coarsened entropy proxy

           H_proxy = lookup_entropy_table[ encode(H_occ) ]

       where encode(.) is a fixed-precision projection of H_occ onto
       a 1-byte index, and lookup_entropy_table is a 256-entry
       precomputed table mapping that index to a Q4.4 fixed-point
       approximation of the true Shannon entropy of the histogram.

   (iii) Define C_1^tau = lookup_exp_neg[ H_proxy ], a 64-entry
       precomputed table approximating exp(-x) for x in [0, log K].

 The combined error of these two table-based approximations against
 the true C_1^tau stayed within 6% for K = 32 in spot-checks against
 a software reference; this is an observed maximum, not a proven
 bound. The error is acceptable
 because C_1^tau enters Phi_D through Mahalanobis distance with a
 covariance matrix whose diagonal absorbs constant proportional
 errors in C_1; the *change* in C_1^tau under a regime shift is
 preserved with much higher fidelity than the absolute value.
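
 The effect of step (iii) alone can be explored in software. The
 sketch below is an assumption-laden model, not the profile's table:
 it computes the entropy exactly (so encode(.) and
 lookup_entropy_table are not modelled), quantises it to Q4.4, and
 indexes a 64-entry exp(-x) table directly by that Q4.4 code, with
 Q1.10 outputs.

```python
import math

Q44_SCALE = 16      # Q4.4: 4 fractional bits
Q110_SCALE = 1024   # Q1.10: 10 fractional bits

# Assumption: index = entropy in Q4.4 steps, value = exp(-x) in Q1.10.
LOOKUP_EXP_NEG = [round(math.exp(-i / Q44_SCALE) * Q110_SCALE) for i in range(64)]


def shannon_entropy(hist):
    """Entropy in nats of a histogram of non-negative counts."""
    total = sum(hist)
    return -sum((n / total) * math.log(n / total) for n in hist if n > 0)


def c1_tau_table(hist):
    """C_1^tau via Q4.4 quantisation and the 64-entry table."""
    h_q44 = min(round(shannon_entropy(hist) * Q44_SCALE), 63)
    return LOOKUP_EXP_NEG[h_q44] / Q110_SCALE


def c1_tau_exact(hist):
    return math.exp(-shannon_entropy(hist))
```

 Comparing c1_tau_table with c1_tau_exact over histograms of
 interest gives a first estimate of the quantisation error of this
 one step; it says nothing about the error introduced by the
 entropy table itself.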

 Output. C_1 is a Q1.10 fixed-point scalar in [0, 1].

 ---------------------------------------------------------------------------
 3.2 C_2 -- informational coherence (JSD on flow distributions)
 ---------------------------------------------------------------------------

 Definition (math companion, Sec. 2.2, unchanged).

   C_2 = 1 - JSD_norm( {p_v} ),
   JSD_norm = JSD( {p_v} ) / log_2( min(N, |A|) ),
   JSD( {p_v} ) = (1/N) sum_v KL( p_v || M ),
   M = (1/N) sum_v p_v.

 Data-plane implementation.

 Computing KL-divergence on Count-Min sketches in P4 is, again,
 prohibitive (no native log, no division). The recommended profile
 replaces the JSD computation with an L1-distance-on-sketches proxy
 that is monotonic in JSD over the ranges that matter for phase
 detection:

   (i) Compute, pairwise, the L1 distance between sketches:

           L1(p_a, p_b) = sum_{i, j} | CMS_a[i, j] - CMS_b[i, j] |

       Each pairwise L1 is a parallel reduce of (d * w) =
       (4 * 1024) = 4096 lanes, which fits in 2 P4 stages with the
       standard Tofino register-array reduce idiom.

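   (ii) Aggregate to a scalar L1_total = (1/C(N,2)) sum_{a < b}
       L1(p_a, p_b), the mean over vantage pairs (this pair count
       is distinct from the mixture M in the definition above).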

   (iii) Map L1_total to JSD_norm via a 1024-entry lookup table
       calibrated offline on representative traffic mixtures. The
       lookup is a single P4 stage.

   (iv) C_2 = 1 - JSD_norm.

 Calibration of the L1 -> JSD_norm table is the most operationally
 sensitive step of this profile and is one of the open research
 questions enumerated in Sec. 9. The recommended initial
 calibration is to fit the table on a corpus of one full week of
 production traffic per deployment site; the table is then loaded
 via P4Runtime and refreshed quarterly.

 Output. C_2 is a Q1.10 fixed-point scalar in [0, 1].
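
 When calibrating the L1 -> JSD_norm table, an exact software
 reference for C_2 is useful alongside the sketch-level L1 distance.
 The following is a minimal sketch under stated assumptions:
 distributions are passed as dicts of probabilities, the alphabet A
 is taken to be the union of their supports, and the function names
 are illustrative.

```python
import math


def jsd_norm(dists):
    """Normalised JSD of N distributions given as {symbol: probability}."""
    n = len(dists)
    alphabet = set().union(*dists)
    k = min(n, len(alphabet))
    if k < 2:
        return 0.0  # a single symbol or single vantage cannot diverge
    mix = {s: sum(p.get(s, 0.0) for p in dists) / n for s in alphabet}
    kl_sum = 0.0
    for p in dists:
        kl_sum += sum(q * math.log2(q / mix[s]) for s, q in p.items() if q > 0)
    return (kl_sum / n) / math.log2(k)


def c2(dists):
    return 1.0 - jsd_norm(dists)


def l1_sketch_distance(cms_a, cms_b):
    """L1 distance between two d x w counter arrays (lists of rows)."""
    return sum(abs(x - y) for ra, rb in zip(cms_a, cms_b) for x, y in zip(ra, rb))
```

 Identical distributions give C_2 = 1 and distributions with
 disjoint supports give C_2 = 0, which fixes the two ends of the
 calibration curve.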

 ---------------------------------------------------------------------------
 3.3 C_3 -- topological coherence (Jaccard on return-path sets)
 ---------------------------------------------------------------------------

 Definition (math companion, Sec. 2.3, unchanged in form, restated
 for return-path sets).

   C_3 = (1 / C(N,2)) * sum_{i < j} | S_i intersect S_j |
                                      / | S_i union S_j |

 In the observatory profile, S_i is the directed edge set of vantage
 i's traceroute. In the data-plane profile, S_i is the Bloom filter
 of return-path sources observed on next-hop V_i during the tick.

 Data-plane implementation.

 Bloom-filter intersection and union are bitwise AND and bitwise OR
 over the m-bit filters; both fit in standard P4 register-array
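 idioms. Population count (popcount) of an 8192-bit filter is an
 8-stage tree-reduce on Tofino-2 (since the chip's native popcount
 width is limited), or a single accumulator update if popcount is
 maintained incrementally on insert. The latter is recommended:
 on every insert into S_i, the popcount counter of S_i is updated
 in the same stage whenever a bit flips from 0 to 1. The
 intersection counts can be kept the same way: when a bit of S_i
 flips, each pair counter (i, j) is incremented if S_j already has
 that bit set. With both kinds of counter maintained, computing
 C_3 needs no additional popcount sweep.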

   - count_intersect_ij = popcount( S_i AND S_j ).
   - count_union_ij = popcount(S_i) + popcount(S_j)
                       - count_intersect_ij.
   - jaccard_ij = count_intersect_ij / count_union_ij.

 Division in P4 is implemented via a Newton-Raphson approximator
 with two iterations or, more commonly, via a 1024-entry
 reciprocal-lookup table. Either fits in 2 P4 stages.

 Output. C_3 is a Q1.10 fixed-point scalar in [0, 1].
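
 The three formulas above can be modelled with Python integers used
 as bit vectors. This is a sketch only: the hash construction
 (salted BLAKE2b) is an assumption, the helper names are
 illustrative, and the Jaccard index is estimated on filter bits
 rather than on the underlying source sets, as in the profile.

```python
import hashlib
from itertools import combinations

M_BITS, K_HASHES = 8192, 5


def bloom_insert(filt, item):
    """Return filt (an int used as an m-bit vector) with item inserted."""
    for i in range(K_HASHES):
        digest = hashlib.blake2b(item, digest_size=8, salt=bytes([i])).digest()
        filt |= 1 << (int.from_bytes(digest, "big") % M_BITS)
    return filt


def popcount(x):
    return bin(x).count("1")


def c3(filters):
    """Mean pairwise Jaccard index over Bloom-filter bits."""
    scores = []
    for a, b in combinations(filters, 2):
        inter = popcount(a & b)
        union = popcount(a) + popcount(b) - inter
        # Assumption: two empty filters count as fully coherent.
        scores.append(inter / union if union else 1.0)
    return sum(scores) / len(scores)


s1 = bloom_insert(0, b"192.0.2.1")
s2 = bloom_insert(0, b"192.0.2.1")
s3 = bloom_insert(0, b"198.51.100.7")
```

 Here s1 and s2 hold the same source and are bit-identical, while s3
 holds a different source and shares at most a few bits with them.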

 ---------------------------------------------------------------------------
 3.4 H, Phi_D, Phi_K in the data plane
 ---------------------------------------------------------------------------

 Definition (math companion, Sec. 2.4 and Sec. 4, unchanged).

   H(t) = -log( C_1(t) * C_2(t) * C_3(t) )
   D^2(t) = (x(t) - mu)^T Sigma^{-1} (x(t) - mu),  x = (C_1, C_2, C_3)
   Phi_D(t) = exp( -D^2(t) / k ),  k = 6.25
   Phi_K(t) in {BAU, WATCH, ALARM, CRITICAL}
       indexed by D^2(t) thresholds (4.33, 7.81, 11.34).

 Data-plane implementation.

 H. The product C_1 * C_2 * C_3 is two fixed-point multiplications
 (Q1.10 * Q1.10 -> Q2.20, truncated back to Q1.10). The negative
 logarithm is a 1024-entry lookup table mapping Q1.10 -> Q4.6.
 Total cost: 3 P4 stages.
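
 The fixed-point path for H can be sketched in a few lines. The
 sketch assumes natural logarithms, truncation (not rounding) after
 each multiplication, and a table indexed by the Q1.10 code 0..1024
 with the x = 0 entry saturated; none of these choices is fixed by
 the profile, and the names are illustrative.

```python
import math

FRAC_BITS = 10
ONE = 1 << FRAC_BITS          # 1.0 in Q1.10
Q46_MAX = (1 << 10) - 1       # largest Q4.6 code


def to_q110(x):
    return round(x * ONE)


def mul_q110(a, b):
    # Q1.10 * Q1.10 -> Q2.20, then drop 10 fractional bits (truncation).
    return (a * b) >> FRAC_BITS


# -ln(x) in Q4.6, indexed by the Q1.10 code of x; index 0 saturates.
NEG_LOG_LUT = [Q46_MAX] + [
    min(round(-math.log(i / ONE) * 64), Q46_MAX) for i in range(1, ONE + 1)
]


def hamiltonian_q(c1, c2, c3):
    """H in Q4.6 from three Q1.10 coherence values."""
    return NEG_LOG_LUT[mul_q110(mul_q110(c1, c2), c3)]


h_q46 = hamiltonian_q(to_q110(0.50), to_q110(0.45), to_q110(0.50))
```

 Dividing h_q46 by 64 converts the Q4.6 code back to a real value
 for comparison with a floating-point -log.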

 Phi_D. The covariance inverse Sigma^{-1} is a 3 x 3 symmetric
 matrix with 6 unique entries. Sigma^{-1} is *not* computed on the
 data plane; it is computed in the control plane on a sliding
 window of past ticks (typically 30 seconds of BAU samples) and
 written to the data plane via P4Runtime. The data-plane
 computation of D^2 is then 3 subtractions to form x - mu, 9
 multiplications and 6 additions for Sigma^{-1} (x - mu), and 3
 multiplications and 2 additions for the final inner product: 3
 stages including the final Q4.6 result.

 Phi_D = exp(-D^2 / k) is a 1024-entry lookup table mapping
 Q4.6 -> Q1.10. One stage.

 Phi_K is a TCAM ternary match on D^2 against the three thresholds.
 One stage.
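
 A floating-point reference for the four quantities is handy when
 validating the fixed-point tables. The sketch below assumes that
 the raw label is the number of thresholds D^2 has reached (BAU
 below 4.33, CRITICAL at or above 11.34) and ignores the hysteresis
 of Sec. 4.3; function names are illustrative.

```python
import math

K_SCALE = 6.25
THRESHOLDS = (4.33, 7.81, 11.34)
LABELS = ("BAU", "WATCH", "ALARM", "CRITICAL")


def hamiltonian(c):
    return -math.log(c[0] * c[1] * c[2])


def mahalanobis_sq(x, mu, sigma_inv):
    d = [xi - mi for xi, mi in zip(x, mu)]
    # Matrix-vector product Sigma^{-1} d, then the inner product with d.
    y = [sum(sigma_inv[i][j] * d[j] for j in range(3)) for i in range(3)]
    return sum(di * yi for di, yi in zip(d, y))


def phi_d(d2):
    return math.exp(-d2 / K_SCALE)


def phi_k_raw(d2):
    # Assumption: label index = number of thresholds reached by D^2.
    return LABELS[sum(d2 >= t for t in THRESHOLDS)]


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
d2_demo = mahalanobis_sq((1.0, 2.0, 3.0), (0.0, 0.0, 0.0), IDENTITY)
```

 With the identity matrix as Sigma^{-1}, D^2 reduces to the squared
 Euclidean distance, which makes the routine easy to sanity-check
 before a calibrated Sigma^{-1} is loaded.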

 Total stage budget for Section 3.

   C_1: ~3 stages
   C_2: ~3 stages
   C_3: ~3 stages
   H, Phi_D, Phi_K: ~5 stages (H overlaps the D^2 stages)
   ----------------------------
   Subtotal:        ~14 stages

 Tofino-2 has 20 stages per pipeline. The remaining 6 stages are
 retained for parsing, forwarding, ACL, and IOAM trace insertion.
 The MVPS data-plane computation is therefore feasible *as a
 secondary pipeline pass on egress* without displacing forwarding
 logic. On software targets (VPP/DPDK) the stage count is not a
 binding constraint; per-packet cost dominates instead, and is
 acceptable at line rates up to 100 Gbps on commodity x86 with
 AVX-512.


==============================================================================
 4. Phase detection and autonomous action
==============================================================================

 4.1 Tick boundary and bundle close-out.

 At every tick boundary t -> t + Delta_t, the data plane:

   (i)  reads the per-vantage state ( p_i, rtt_i, S_i, ctr_i );
   (ii) computes ( C_1(t), C_2(t), C_3(t) ) by Sec. 3 above;
   (iii) computes ( H(t), D^2(t), Phi_D(t), Phi_K(t) );
   (iv) atomically swaps to a fresh per-vantage state for tick t+1
        (double-buffered registers; no copy required, only a
        pointer flip).

 The total computation completes within 1-2 milliseconds of the
 tick boundary at line rate, well within the 10 ms tick window.

 4.2 Action policy.

 Phi_K is the actionable signal. The recommended action policy
 follows the math companion's Sec. 4 thresholds:

   BAU      : no action.
   WATCH    : flag the vantage(s) responsible (the pair (a, b) with
              the largest contribution to D^2) in the IOAM TLV
              (Sec. 7); export an event to the control plane via
              Packet-In; do not change forwarding behaviour.
   ALARM    : in addition, *de-prefer* the responsible vantage(s)
              from ECMP/queue/LAG selection by setting the
              vantage's selection weight to a low non-zero value
              (e.g. 1 of 256). This biases new flows away while
              not stranding existing flows.
   CRITICAL : in addition, set the responsible vantage's selection
              weight to 0. This drains the vantage entirely.
              Existing flows are re-hashed onto remaining
              vantages on their next packet.
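
 The policy reduces to a small transformation of the selection
 weight table. The sketch below is illustrative: the base weight of
 64 per vantage and the function names are assumptions, and the
 P4Runtime write itself is not modelled.

```python
DEPREFER_WEIGHT = 1   # "1 of 256" from the ALARM policy
DRAIN_WEIGHT = 0      # CRITICAL policy


def apply_policy(weights, phi_k, responsible):
    """Return a new weight table after applying the policy for phi_k.

    weights     -- dict {vantage: selection weight}
    responsible -- vantages flagged as responsible for the excursion
    """
    new = dict(weights)
    if phi_k == "ALARM":
        for v in responsible:
            new[v] = min(new[v], DEPREFER_WEIGHT)
    elif phi_k == "CRITICAL":
        for v in responsible:
            new[v] = DRAIN_WEIGHT
    return new  # BAU and WATCH leave forwarding untouched


def traffic_share(weights):
    total = sum(weights.values())
    return {v: w / total for v, w in weights.items()}


base = {"V_1": 64, "V_2": 64, "V_3": 64, "V_4": 64}  # hypothetical weights
alarm = apply_policy(base, "ALARM", ["V_2"])
critical = apply_policy(base, "CRITICAL", ["V_2"])
```

 Under these assumed weights, ALARM leaves V_2 with 1/193 of new
 flows and CRITICAL leaves it with none.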

 The action is implemented as a P4Runtime table update to the
 next-hop / queue / port selection table. The update is initiated
 by the control plane in response to the Packet-In event; the
 control plane is in the loop for all weight changes, but is *not*
 in the loop for detection. End-to-end detect-and-react latency
 (gray failure onset -> drained vantage) is dominated by the
 Packet-In round-trip: 100-500 ms on typical deployments.

 4.3 Hysteresis and false-alarm suppression.

 Phi_K transitions are gated by a hysteresis band:

   - WATCH -> ALARM requires Phi_K = WATCH for at least 3
     consecutive ticks AND D^2 trending upward.
   - ALARM -> CRITICAL requires Phi_K = ALARM for at least 5
     consecutive ticks AND D^2 above the CRITICAL threshold for
     at least 2 consecutive ticks.
   - All states require D^2 below the lower threshold for at least
     10 consecutive ticks before stepping down.

 The hysteresis is implemented via a small per-vantage state
 machine with a transition counter; SRAM cost is 4 bytes per
 vantage. The hysteresis parameters are configurable via
 P4Runtime.
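
 One possible reading of the three rules is sketched below. Several
 points are assumptions, because the rules leave them open: BAU ->
 WATCH is immediate once D^2 reaches 4.33; "trending upward" means
 D^2 exceeds the previous tick's value; WATCH -> ALARM additionally
 requires D^2 at or above 7.81; "the lower threshold" is the entry
 threshold of the current state; and states step down one level at
 a time.

```python
THRESHOLDS = (4.33, 7.81, 11.34)   # entry thresholds: WATCH, ALARM, CRITICAL
BAU, WATCH, ALARM, CRITICAL = range(4)


class Hysteresis:
    def __init__(self):
        self.state = BAU
        self.ticks_in_state = 0   # consecutive ticks spent in self.state
        self.ticks_critical = 0   # consecutive ticks with D^2 >= 11.34
        self.ticks_below = 0      # consecutive ticks below the entry threshold
        self.prev_d2 = None

    def _enter(self, state):
        self.state, self.ticks_in_state, self.ticks_below = state, 0, 0

    def step(self, d2):
        rising = self.prev_d2 is not None and d2 > self.prev_d2
        self.prev_d2 = d2
        self.ticks_in_state += 1
        self.ticks_critical = self.ticks_critical + 1 if d2 >= THRESHOLDS[2] else 0
        if self.state > BAU and d2 < THRESHOLDS[self.state - 1]:
            self.ticks_below += 1
        else:
            self.ticks_below = 0

        if self.state == BAU and d2 >= THRESHOLDS[0]:
            self._enter(WATCH)
        elif (self.state == WATCH and self.ticks_in_state >= 3
              and rising and d2 >= THRESHOLDS[1]):
            self._enter(ALARM)
        elif (self.state == ALARM and self.ticks_in_state >= 5
              and self.ticks_critical >= 2):
            self._enter(CRITICAL)
        elif self.state > BAU and self.ticks_below >= 10:
            self._enter(self.state - 1)
        return self.state


fsm = Hysteresis()
up = [fsm.step(d2) for d2 in [5.0, 8.0, 9.0, 10.0] + [12.0] * 5]
down = [fsm.step(0.0) for _ in range(30)]
```

 In this trace the machine reaches CRITICAL on the ninth tick and
 needs 30 quiet ticks (three step-downs of 10 ticks each) to return
 to BAU.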

 The combination of (a) Mahalanobis-based detection, (b)
 multi-tick consecutive confirmation, and (c) downstream control-
 plane review of weight changes is intended to keep the
 false-alarm rate low enough to operate without operator-in-the-
 loop confirmation, but the practical false-alarm rate must be
 measured per deployment on synthetic load and on traces replayed
 from production. This measurement is one of the open work items
 in Sec. 9.


==============================================================================
 5. Worked example: gray failure on a Tier-1 peering edge
==============================================================================

 Scenario summary (synthetic).

   Operator   : Tier-1 ISP, AS28xxx, peering at IX.br SP.
   Edge router: Tofino-2 with custom P4, MVPS data-plane profile
                deployed on the AWS sa-east-1 ECMP group.
   ECMP group : width N = 4, peers
                  V_1 = NTT       (AS2914)
                  V_2 = Cogent    (AS174)
                  V_3 = Telxius   (AS12956)
                  V_4 = Lumen     (AS3356)
                Each peer announces 16.182.0.0/16 (AWS sa-east-1).
   Customer   : a Brazilian fintech with PIX (real-time payments)
                workload; SLA target p99 < 120 ms one-way.
   Tick       : Delta_t = 10 ms.

 Failure onset.

 At t = 14:32:00.000 UTC, AS174 (Cogent) silently re-converges its
 MPLS LSP to AWS sa-east-1 via a snake path Miami -> Ashburn ->
 Sao Paulo. The re-convergence is caused by an internal IGP flap
 inside AS174; from the IX.br SP edge, the BGP session is
 unaffected (HOLD timers do not fire), the next-hop is unchanged,
 and the interface counters are clean. RTT to the AWS PoP via
 Cogent rises from 4 ms to 124 ms; via the other three peers it
 stays at 4-6 ms.

 What MVPS embedded sees, tick by tick.

   t = 14:32:00.010   (1st tick after onset)

     RTT vector       (V_1, V_2, V_3, V_4) = (4.1, 124.3, 5.0, 4.6) ms
     Pair (V_1, V_2)  rtt_1 + rtt_2 = 128.4 ms
     2 d_12 / c_f     = 0.31 ms (NTT and Cogent share the same IX
                        in Sao Paulo; great-circle distance ~ 0)
     Verdict          severe Einstein violation -- the same holds
                        for (V_2, V_3) and (V_2, V_4), so 3 of the
                        M = 6 pairs are flagged and C_1^Einstein
                        drops from 1.000 to 1 - 3/6 = 0.500 within
                        one tick.
     Effect on C_1    C_1 falls to ~0.50 immediately.

   t = 14:32:00.020 .. 14:32:00.150   (~14 ticks)

     TCP retransmits begin on flows whose hash maps to V_2. The
     fintech client opens new connections; these are re-hashed by
     ECMP and a fraction lands again on V_2. The CMS sketches of
     V_1, V_3, V_4 stay close to one another (their flow
     populations are statistically equivalent); the CMS sketch of
     V_2 begins to diverge as flows give up retrying on it.
     L1_total rises monotonically; the L1 -> JSD_norm lookup
     produces JSD_norm rising from ~0.05 (BAU) to ~0.55.
     Effect on C_2    C_2 falls from ~0.95 to ~0.45.

     The Cogent snake path traverses transit nodes in Miami and
     Ashburn that none of the other three peers ever touch. ICMP
     Time-Exceeded sources observed on the V_2 return path begin
     to populate Bloom-filter cells that V_1, V_3, V_4 never
     populate. Pairwise Jaccard between V_2 and the others falls
     from ~0.85 (BAU) to ~0.20.
     Effect on C_3    C_3 falls from ~0.85 to ~0.50.

   t = 14:32:00.150   (15th tick)

     ( C_1, C_2, C_3 ) = (0.50, 0.45, 0.50)
     H = -log(0.50 * 0.45 * 0.50) ~= 2.18
     D^2 ~= 8.4 against a Sigma^{-1} calibrated on the prior 30 s
     of BAU samples.
     Phi_K transitions BAU -> WATCH; the Packet-In carries the
     identity of V_2 as the dominant contributor to D^2.

   t = 14:32:00.400   (40th tick, 400 ms after onset)

     D^2 has stayed above 11.34 for 5 consecutive ticks. Phi_K
     transitions WATCH -> ALARM -> CRITICAL through hysteresis.
     The control plane, having received a continuous stream of
     Packet-In events naming V_2, issues a P4Runtime update
     setting weight(V_2) = 0 in the ECMP selection table.

   t = 14:32:00.500   (50th tick, 500 ms after onset)

     New flows are no longer hashed onto V_2. Existing flows that
     re-hash on retransmit migrate to V_1, V_3, V_4. RTT
     distribution returns to BAU. Phi_K returns to WATCH within
     ~1 second and to BAU within ~10 seconds.

 Outcome comparison.

   Without MVPS embedded (today)
     - Operator alerted at ~22 minutes by customer ticket.
     - Manual diagnosis (mtr loop + traceroute correlation) at
       ~30 minutes.
     - Manual ECMP drain at ~35 minutes.
     - SLA breach: ~12 million PIX transactions degraded.
     - Customer trust: damaged.

   With MVPS embedded (this profile)
     - Detection at ~150 ms.
     - Drain at ~500 ms.
     - SLA breach: <100,000 transactions briefly retransmitted
       (TCP-level retries succeed within ~200 ms via remaining
       three peers); fintech p99 latency held under SLA.
     - Operator informed by IOAM telemetry stream; the incident
       appears in the post-mortem dashboard but does not page the
       on-call engineer.
     - Customer trust: not affected.

 Caveat. This worked example is synthetic. The numerics for D^2,
 Phi_K transition timing, and SLA outcome are constructed by hand
 from a software simulation of the data-plane profile. They are
 not measurements from a deployed Tofino. The example illustrates
 the *operational gap* that data-plane MVPS is designed to close;
 quantitative validation against real hardware on real production
 traffic is open work (Sec. 9, Item D9.1).


==============================================================================
 6. Hardware resource budget
==============================================================================

 6.1 Per-vantage SRAM.

   Count-Min sketch p_i  : 4 hashes x 1024 buckets x 2 B  =  8.0 KiB
   RTT estimator rtt_i   : 32-bit value + 16-bit count   =  6   B
   Bloom filter S_i      : 8192 bits                     =  1.0 KiB
   Counters ctr_i        : 4 x 4-byte counters           = 16   B
   Hysteresis state      : transition counter            =  4   B
                                                           ----------
                                                           ~9.0 KiB

 6.2 Per-group SRAM (ECMP group of width N = 4).

   Per-vantage state x 4              ~ 36 KiB
   Pairwise distance LUT (2 d / c)    ~ 48 B
   L1 -> JSD_norm LUT (1024 entries)  ~  2 KiB
   exp(-x/k) LUT (1024 entries)       ~  2 KiB (shared across groups)
   -log(x) LUT (1024 entries)         ~  2 KiB (shared across groups)
   Sigma^{-1} (3x3 fixed-point)       ~ 36 B (per group; per Sec. 3.4)
                                       --------
                                       ~38 KiB per group
                                       (+ ~6 KiB shared LUTs, counting
                                        the Sec. 3.1 and Sec. 3.3
                                        tables not itemised above)

 6.3 Total SRAM for 1024 ECMP groups.

   1024 groups * 38 KiB                 ~ 38 MiB
   + shared LUTs                          ~  6 KiB
   --------------------------------------------
   Total                                  ~ 38 MiB

 Tofino-2 ships with ~30 MiB of SRAM total in 20 stages. 38 MiB is
 ~25% over budget for full coverage of 1024 groups. Practical
 deployment options:

   - Top-N coverage. By Pareto, 80% of the carrier-grade traffic
     volume in a Tier-1 edge typically transits the top 100-200
     ECMP groups. Covering only those drops the SRAM cost to
     ~4-8 MiB and is the recommended starting point.
   - Reduced sketch dimensions. CMS at 4 x 512 x 16-bit (4 KiB)
     and Bloom at 4096 bits (0.5 KiB) cut the per-vantage cost to
     ~4.5 KiB and the 1024-group cost to ~20 MiB, fitting Tofino-2.
   - Hybrid software / hardware. The 1024-group fully-dimensioned
     case fits comfortably on a software target (VPP/DPDK on
     commodity x86 with 64+ GiB RAM) and is the recommended
     reference profile for early validation.
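
 The budget arithmetic of Sec. 6.1-6.3 and of the options above can
 be reproduced with a small calculator. It is a sketch that uses
 only the per-item sizes listed in this section; the function and
 parameter names are illustrative, and the shared LUTs are left out
 because they do not scale with the number of groups.

```python
KIB, MIB = 1024, 1024 * 1024


def per_vantage_bytes(cms_rows=4, cms_buckets=1024, counter_bytes=2,
                      bloom_bits=8192):
    cms = cms_rows * cms_buckets * counter_bytes
    bloom = bloom_bits // 8
    rtt, counters, hysteresis = 6, 16, 4      # Sec. 6.1
    return cms + bloom + rtt + counters + hysteresis


def per_group_bytes(width=4, **dims):
    pair_lut = width * (width - 1) // 2 * 8   # 8 B per vantage pair
    jsd_lut = 2 * KIB                         # L1 -> JSD_norm, per group
    sigma_inv = 36                            # 3 x 3 fixed-point
    return width * per_vantage_bytes(**dims) + pair_lut + jsd_lut + sigma_inv


def total_mib(groups, width=4, **dims):
    return groups * per_group_bytes(width, **dims) / MIB


full = total_mib(1024)
reduced = total_mib(1024, cms_buckets=512, bloom_bits=4096)
top_200 = total_mib(200)
```

 The full-dimension case comes out at about 38 MiB, while both the
 reduced-dimension case and the top-200 case land below the ~30 MiB
 figure quoted for Tofino-2.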

 6.4 Per-tick stage budget.

   Bundle close-out + C_1, C_2, C_3 + H, Phi_D, Phi_K
                                       ~ 14 stages
   Forwarding + ACL + IOAM TLV insert  ~  6 stages
                                       --------
   Total                                 20 stages (Tofino-2 max).

 6.5 Control-plane bandwidth.

   Sigma^{-1} updates: 6 unique entries x 4 bytes per group, every
   30 seconds = 24 bytes per 30 seconds = 0.8 bytes (6.4 bits) per
   second per group. Negligible.

   Phi_K event Packet-Ins: peak rate during a transition at most 100
   events per group per second (one per 10 ms tick). For 1024
   covered groups in the worst case, ~100k events/s -- well within
   standard Tofino-2 control-plane gRPC capacity (~1M events/s).

 6.6 Datapath latency overhead.

   The MVPS computation runs on egress, in parallel with packet
   forwarding, on a sampled fraction of traffic (default: 1 in
   every 16 packets per vantage feeds the per-vantage state). The
   per-packet forwarding latency is *unchanged*. The per-tick
   computation latency (10 ms tick, ~1-2 ms compute) is hidden
   inside the tick window.


==============================================================================
 7. In-band telemetry: the IOAM TLV
==============================================================================

 7.1 Motivation.

 Operations teams need to be able to see what the data plane has
 decided, in real time, without polling the data plane or relying
 on Packet-In throttling. The recommended mechanism is to emit
 the per-tick coherence vector and phase label as an IOAM Trace
 Option TLV (RFC 9197) inserted on a sampled fraction of egress
 packets. Carrier-grade collectors that already consume IOAM
 (Cisco DNA, Juniper Mist, Arista CloudVision, open-source
 InfluxDB-based stacks) can ingest the MVPS TLV without protocol
 changes.

 7.2 TLV layout (proposed; subject to IANA registration).

   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  | TLV-Type (TBD)|   Length=12   |     Vantage-Group-Id (16-bit) |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |   C_1 (Q1.10) | C_2 (Q1.10)   | C_3 (Q1.10)   | reserved      |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |          Phi_D (Q1.10)        |  Phi_K (8-bit)| reserved      |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |                          tick_id (32-bit)                     |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

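 Total: 16 bytes on the wire (four 32-bit words): a 4-byte header
 word (TLV-Type, Length, Vantage-Group-Id) followed by 12 bytes of
 coherence payload.

 Field semantics.

   TLV-Type           : to be assigned by IANA from the IOAM
                        Trace-Type registry.
   Length             : fixed at 12 for this revision; counts the
                        three payload words that follow the first
                        32-bit word.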
   Vantage-Group-Id   : opaque 16-bit identifier for the ECMP /
                        queue / port group in question. Mapped by
                        the operator to a meaningful name (e.g.
                        "AWS sa-east-1 IX.br SP") via control-
                        plane configuration.
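   C_1, C_2, C_3      : Q1.10 fixed-point coherence values in
                        [0, 1]. A Q1.10 value occupies 11 bits,
                        whereas each C_i field drawn above is 8
                        bits wide; the on-wire encoding (truncation
                        to the most significant 8 bits, or wider
                        fields) is still to be fixed.
   Phi_D              : Q1.10 fixed-point phase distance
                        (exponentially weighted Mahalanobis
                        distance), carried in a 16-bit field.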
   Phi_K              : 8-bit enum:
                          0 = BAU
                          1 = WATCH
                          2 = ALARM
                          3 = CRITICAL
                          (4-255 reserved)
   tick_id            : monotonic 32-bit tick counter; wraps
                        every ~497 days at 10 ms tick.
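
 The layout above can be exercised with a minimal encode/decode
 sketch. It follows the four 32-bit words exactly as drawn, in
 network byte order. Two things are assumptions rather than part of
 this profile: the TLV-Type code point (a placeholder, since the
 real value is TBD) and the way each C_i is squeezed into its 8-bit
 field (here, the top 8 bits of the Q1.10 value, i.e. Q1.10 >> 3).
 The function names are hypothetical.

```python
import struct

# Four 32-bit words as drawn in Sec. 7.2, network byte order:
# type, length, group | c1, c2, c3, rsvd | phi_d, phi_k, rsvd | tick_id
MVPS_TLV = struct.Struct("!BBHBBBBHBBI")

TLV_TYPE_PLACEHOLDER = 0xFF   # hypothetical: the real code point is TBD
PHI_K_NAMES = {0: "BAU", 1: "WATCH", 2: "ALARM", 3: "CRITICAL"}


def to_q1_10(x: float) -> int:
    """Quantise x to Q1.10 (1 integer bit, 10 fractional bits)."""
    return max(0, min(2047, round(x * 1024)))


def pack_tlv(group_id, c1, c2, c3, phi_d, phi_k, tick_id) -> bytes:
    # Assumption: each C_i is sent as the top 8 bits of its Q1.10
    # value, because the drawn C_i fields are only 8 bits wide.
    c8 = [to_q1_10(min(max(c, 0.0), 1.0)) >> 3 for c in (c1, c2, c3)]
    return MVPS_TLV.pack(
        TLV_TYPE_PLACEHOLDER, 12, group_id,
        c8[0], c8[1], c8[2], 0,
        to_q1_10(phi_d), phi_k, 0,
        tick_id & 0xFFFFFFFF,          # 32-bit counter wraps
    )


def unpack_tlv(buf: bytes) -> dict:
    (_type, length, group_id, c1, c2, c3, _rsvd1,
     phi_d, phi_k, _rsvd2, tick_id) = MVPS_TLV.unpack(buf)
    return {
        "length": length,
        "group_id": group_id,
        "c": (c1 / 128, c2 / 128, c3 / 128),   # undo the >> 3 scaling
        "phi_d": phi_d / 1024,
        "phi_k": PHI_K_NAMES.get(phi_k, "reserved"),
        "tick_id": tick_id,
    }


wire = pack_tlv(7, 1.0, 0.5, 0.25, 0.75, 2, 2**32 + 5)
decoded = unpack_tlv(wire)
```

 A collector would apply unpack_tlv to the bytes of the option and
 map group_id to an operator-assigned name. If the C_i fields are
 later widened, only the format string and the scaling change.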

 7.3 Sampling.

 The TLV is inserted on 1 packet in every M egress packets per
 group, with M configurable per group (default M = 64). At
 100 Gbps line rate per group and 1500-byte packets (~8.3 Mpps),
 this yields ~130k annotated packets per second per group; with
 minimum-size frames the figure rises to ~2.3M. Either rate is
 sufficient for sub-second dashboarding without significant
 header-overhead amplification (12 payload bytes / 1500-byte
 packet ~= 0.8% overhead on annotated packets, ~0.012% averaged
 across the group; counting all four words drawn in Sec. 7.2,
 16 bytes ~= 1.1% and ~0.017% respectively).
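
 The sampling arithmetic can be reproduced with a few lines, which
 makes it easy to try other values of M, frame size, or TLV size.
 This is a back-of-envelope sketch: the function names are
 hypothetical, and the 20-byte per-frame wire overhead (Ethernet
 preamble plus inter-frame gap) is an assumption that matters only
 for the small-frame case.

```python
ETHERNET_WIRE_OVERHEAD = 20   # bytes of preamble + inter-frame gap (assumed)


def annotated_pps(line_rate_bps, frame_bytes, m, wire_overhead=0):
    """Annotated packets per second when 1 in every m packets is tagged."""
    total_pps = line_rate_bps / ((frame_bytes + wire_overhead) * 8)
    return total_pps / m


def header_overhead(tlv_bytes, frame_bytes, m):
    """Return (overhead on an annotated packet, overhead averaged over the group)."""
    per_annotated = tlv_bytes / frame_bytes
    return per_annotated, per_annotated / m


# 100 Gbps per group, default M = 64.
large_frames = annotated_pps(100e9, 1500, 64)
small_frames = annotated_pps(100e9, 64, 64, ETHERNET_WIRE_OVERHEAD)
per_packet, averaged = header_overhead(12, 1500, 64)
```

 Raising M lowers both the annotated rate and the averaged overhead
 linearly, so the dashboard refresh target sets the upper bound on M.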

 7.4 Privacy and operational considerations.

 The MVPS TLV exposes operationally-sensitive information about
 the internal state of the forwarding plane to any party that can
 observe the egress packet. Standard IOAM hygiene applies: the
 TLV MUST be stripped at trust-domain boundaries (typically the
 administrative AS edge) and SHOULD be encrypted or
 authenticated when crossing internal trust sub-domains. RFC 9378
 (IOAM Deployment) gives the canonical guidance.


==============================================================================
 8. Formal mapping back to v1.1: Poincare's "art"
==============================================================================

 The central claim of this profile is that the algebraic structure
 of the math companion (v1.1) is preserved under the substitution
 of vantage type. This section makes that claim formal.

 8.1 The bundle as an algebraic object.

 In the math companion (Sec. 1), an MVPS bundle is an element of

       B := V^N

 where V is the type of vantage (a record over hop list, RTT
 vector, geographic anchor, ASN, optional metadata) and N >= 2.
 The coherence axes are defined as functions

       C_1, C_2, C_3 : B -> [0, 1].

 The Hamiltonian is a function

       H : [0, 1]^3 -> [0, infinity)

 with H(c_1, c_2, c_3) = -log(c_1 * c_2 * c_3), which is finite
 only when every c_i > 0 and diverges as any c_i -> 0. The phase
 label is a function

       Phi_K : [0, infinity) -> {BAU, WATCH, ALARM, CRITICAL}.
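
 The type signatures above translate directly into code. The sketch
 below implements H as defined here and a threshold-based stand-in
 for Phi_K. The threshold values and the function names are
 hypothetical and serve only to make the example run; the
 operational definition of Phi_K is the one in the math companion.

```python
import math

LABELS = ("BAU", "WATCH", "ALARM", "CRITICAL")

# Hypothetical cut points, for illustration only; the operational
# values are defined in the math companion, not in this sketch.
THRESHOLDS = (0.5, 1.5, 3.0)


def hamiltonian(c1: float, c2: float, c3: float) -> float:
    """H(c1, c2, c3) = -log(c1 * c2 * c3), for c_i in [0, 1]."""
    for c in (c1, c2, c3):
        if not 0.0 <= c <= 1.0:
            raise ValueError("coherence values must lie in [0, 1]")
    product = c1 * c2 * c3
    if product == 0.0:
        return math.inf            # H diverges when any axis collapses
    return -math.log(product)


def phase_label(x: float, thresholds=THRESHOLDS) -> str:
    """Map a non-negative scalar to a label by counting crossed thresholds."""
    crossed = sum(x >= t for t in thresholds)
    return LABELS[crossed]


healthy = phase_label(hamiltonian(0.9, 0.9, 0.9))
degraded = phase_label(hamiltonian(0.5, 0.5, 0.5))
collapsed = phase_label(hamiltonian(0.0, 1.0, 1.0))
```

 Nothing in either function refers to where the three coherence
 values came from, which is the property Sec. 8.2 relies on.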

 8.2 The substitution.

 The data-plane profile defines a new vantage type V' (Sec. 2)
 whose record is

       V' := ( CountMin x RttEstimator x BloomFilter x Counters ).

 The coherence axes are reimplemented as functions

       C_1', C_2', C_3' : (V')^N -> [0, 1]

 by the constructions in Sec. 3.1 - 3.3. These constructions
 differ from the math companion's constructions only in
 implementation substrate (fixed-point lookup tables, Count-Min
 sketches, Bloom filters); they share the same input-output
 specification:

   - C_1'(b') agrees with C_1(b) up to fixed-point quantisation
     and table approximation (bounded error ~6%) when b' encodes
     the same vantage observations as b.
   - C_2'(b') agrees with C_2(b) up to the L1 -> JSD_norm table
     approximation (bounded error ~5% in the JSD ranges that
     matter for phase detection, after deployment-time
     calibration).
   - C_3'(b') agrees with C_3(b) up to Bloom-filter
     false-positive rate (bounded by deployment-time choice of
     m and k, default <2%).

 H, Phi_D, Phi_K are unchanged: they are defined on (C_1, C_2, C_3)
 in [0, 1]^3 and do not care whether those values were computed
 from a JSON bundle or from on-chip sketches.
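
 The C_3' error bound depends on the Bloom-filter parameters m
 (bits) and k (hash functions). A minimal sizing sketch follows,
 using the standard approximation p ~= (1 - e^(-k n / m))^k for a
 filter holding n items. The item count n = 1000 and k = 4 are
 hypothetical example values, not figures from this profile, and
 the function names are illustrative.

```python
import math


def bloom_fpr(m_bits: int, n_items: int, k_hashes: int) -> float:
    """Approximate false-positive rate: (1 - e^(-k n / m))^k."""
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes


def min_bits_for_fpr(n_items: int, k_hashes: int, target: float) -> int:
    """Bits needed so the approximate FPR stays at or below target, for fixed k."""
    per_hash = target ** (1.0 / k_hashes)
    return math.ceil(-k_hashes * n_items / math.log(1.0 - per_hash))


# Hypothetical deployment: 1000 distinct elements, 4 hashes, 2% target.
m = min_bits_for_fpr(1000, 4, 0.02)
achieved = bloom_fpr(m, 1000, 4)
```

 On hardware the register array would normally be rounded up to a
 size the target supports, which only lowers the achieved rate.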

 8.3 What this means.

 The framework's value proposition does not depend on the
 vantage being external, internal, geographic, optical, virtual,
 or any other concrete instantiation. As long as a candidate
 vantage type V'' admits

   (i)  a notion of pairwise causal compatibility (for C_1),
   (ii) a notion of empirical flow distribution (for C_2), and
   (iii) a notion of return-path / topology set (for C_3),

 the same axiomatic framework applies and the same Phi_K phase
 label is produced. This is the precise sense in which Poincare's
 maxim -- "the art of giving the same name to different things"
 -- describes what the framework does.

 8.4 Other vantage types this framework already covers
 (without algebraic change).

   - 5G UPF instances across network slices.
   - Inter-satellite-link neighbours in a low-Earth-orbit mesh
     (Starlink-class).
   - Optical fibre pairs landing on a submarine cable shore station.
   - Replicas of an anycast service (DNS root, CDN edge).
   - Threads in a software dataplane (VPP, DPDK).
   - Virtual interfaces in a Kubernetes CNI mesh.

 Each of these is a deployment study. The mathematics is reused
 verbatim. This profile (P4 next-hop vantages on a peering edge)
 is the simplest first step in that catalogue.


==============================================================================
 9. Open questions and validation roadmap
==============================================================================

 The following items must be resolved before this profile can be
 promoted to a full Internet-Draft submission. Each is presented
 as a numbered open work item D9.x referenced from the body
 above.

 D9.1 Reference P4 implementation.
      Status   : not started.
      Scope    : a complete P4_16 reference implementation of the
                 Sec. 3 axes targeting Tofino-2 SDE 9.x. Ships
                 with a software simulator (bmv2) for CI.
      Risk     : moderate. The P4 idioms used are standard; the
                 main risk is exceeding the stage budget of 14
                 stages for the MVPS pipeline once forwarding
                 logic is integrated. Sec. 6.4 budget assumes
                 forwarding ~6 stages; some carrier-grade
                 deployments use up to 12 stages for forwarding
                 alone, which would force the MVPS pipeline onto
                 a second pass or onto a separate Tofino pipe.

 D9.2 Hardware bench validation.
      Status   : not started.
      Scope    : end-to-end bench with two Tofino-2 chassis,
                 traffic generator, and an injected gray-failure
                 fault. Measure detection latency, false-alarm
                 rate, and resource utilisation against the
                 budget in Sec. 6.
      Risk     : access-bound rather than technical. Tofino-2
                 bench hardware is not in the catellix.com lab
                 today.

 D9.3 Calibration of L1 -> JSD_norm lookup.
      Status   : conceptual.
      Scope    : empirically fit the lookup table on at least
                 three production sites (a peering edge, a
                 service edge, a metro core) over at least one
                 week each, and characterise the residual error
                 against software-computed JSD_norm.
      Risk     : low technically; depends on operator data
                 access agreements.
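
      The calibration loop can be prototyped offline before any
      site data is available. The sketch below draws synthetic
      distribution pairs, computes their L1 distance and their
      Jensen-Shannon divergence, fits a bucketed lookup from one
      to the other, and reports the worst residual. Assumptions:
      JSD_norm is taken to be base-2 JSD (which lies in [0, 1]),
      and the 16-bucket table, 8-bin distributions, and synthetic
      data are illustrative stand-ins for the Sec. 3 construction
      and for production traffic.

```python
import math
import random


def l1_distance(p, q):
    """L1 distance between two discrete distributions, in [0, 2]."""
    return sum(abs(a - b) for a, b in zip(p, q))


def jsd_bits(p, q):
    """Jensen-Shannon divergence with log base 2, in [0, 1]."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    mid = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)


def random_distribution(rng, bins):
    weights = [rng.random() ** 3 for _ in range(bins)]   # skewed weights
    total = sum(weights)
    return [w / total for w in weights]


def fit_lookup(pairs, buckets=16):
    """Mean JSD per L1 bucket; returns (table, worst absolute residual)."""
    def bucket(d):
        return min(buckets - 1, int(d / 2.0 * buckets))
    sums = [0.0] * buckets
    counts = [0] * buckets
    for d, j in pairs:
        sums[bucket(d)] += j
        counts[bucket(d)] += 1
    table = [s / n if n else None for s, n in zip(sums, counts)]
    worst = max(abs(table[bucket(d)] - j) for d, j in pairs)
    return table, worst


rng = random.Random(0)   # fixed seed so the sketch is repeatable
pairs = []
for _ in range(2000):
    p, q = random_distribution(rng, 8), random_distribution(rng, 8)
    pairs.append((l1_distance(p, q), jsd_bits(p, q)))
table, worst = fit_lookup(pairs)
```

      Replacing the synthetic pairs with per-tick distributions
      captured at each site gives the residual characterisation
      this item calls for.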

 D9.4 Sigma^{-1} drift and recalibration cadence.
      Status   : conceptual.
      Scope    : characterise how fast Sigma^{-1} drifts under
                 normal diurnal traffic patterns, and choose a
                 recalibration cadence that minimises false
                 alarms without missing real events. The
                 recommended starting cadence (30 seconds) is
                 a first-order guess.
      Risk     : low; this is a standard observability problem.

 D9.5 IOAM TLV registration and interop.
      Status   : not started.
      Scope    : IANA registration of the TLV-Type, alignment
                 with the IETF IOAM working group on TLV
                 semantics, and interop test against at least
                 two third-party IOAM collectors.
      Risk     : process-bound; technical risk negligible.

 D9.6 Comparative evaluation against existing dataplane signals.
      Status   : not started.
      Scope    : measure detection latency and false-alarm rate
                 of embedded MVPS against existing per-flow
                 dataplane signals (Linux RACK, P4-based
                 microburst detectors, BFD, S-BFD, IETF SAVNET
                 telemetry) on the same fault catalogue.
      Risk     : low; this is the academic-publication track.

 D9.7 Conjecture-T1 invariance under hardware quantisation.
      Status   : conceptual.
      Scope    : the math companion's Conjecture T1 (det(Sigma)
                 invariance under equilibrium) is stated for
                 idealised real-valued C_i. Verify whether it
                 holds, approximately, when C_i are Q1.10
                 fixed-point and the sketches introduce
                 deployment-time bias. If not, characterise the
                 bias and add a v1.2 erratum.
      Risk     : moderate. This is the most theoretically
                 interesting open item and is a natural
                 thesis-chapter problem.
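
      A first numerical probe of this item is to measure how far
      Q1.10 rounding moves H for a given coherence vector. The
      sketch below does only that; it does not test Conjecture T1
      itself, and it ignores sketch bias. Round-to-nearest is an
      assumption (a hardware implementation may truncate instead),
      and the function names are hypothetical.

```python
import math

FRAC_BITS = 10                      # Q1.10: 10 fractional bits
SCALE = 1 << FRAC_BITS
HALF_LSB = 1.0 / (2 * SCALE)        # worst-case rounding error, 2^-11


def quantise_q1_10(x: float) -> float:
    """Round x to the nearest Q1.10 value (assumed rounding mode)."""
    return round(x * SCALE) / SCALE


def hamiltonian(c1: float, c2: float, c3: float) -> float:
    return -math.log(c1 * c2 * c3)


def h_quantisation_error(c) -> float:
    """|H(quantised c) - H(c)| for a coherence triple c with all c_i > 0."""
    cq = tuple(quantise_q1_10(x) for x in c)
    return abs(hamiltonian(*cq) - hamiltonian(*c))


# Each axis contributes at most about HALF_LSB / c_i to the error in H,
# so the error grows as any coherence value approaches zero.
near_bau = h_quantisation_error((0.9, 0.8, 0.7))
near_critical = h_quantisation_error((0.011, 0.8, 0.7))
```

      Sweeping this over the operating range shows where the
      fixed-point format stops resolving changes in H, which is
      the region where any bias in det(Sigma) would be expected to
      appear first.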

 D9.8 Companion I-D draft.
      Status   : this document is the seed.
      Scope    : convert this profile to RFC 7322 I-D format,
                 align section numbering with IETF style, and
                 submit as
                 draft-melegassi-ippm-mvps-dataplane-00.
      Risk     : low.


==============================================================================
 10. References
==============================================================================

 Normative.

   [MVPS-MATH]  Melegassi, L. "MVPS -- Three-Layer Mathematical
                Structure". Catellix Research, v1.1, 2026-05-20.
                Available at:
                  https://catellix.com/static/download/
                    MVPS_THREE_LAYER_MATHEMATICAL_EVIDENCE.txt

   [MVPS-BUNDLE] Melegassi, L. "The MVPS Bundle".
                draft-melegassi-ippm-mvps-bundle, work in progress.

   [RFC9197]    Brockners, F. et al. "Data Fields for In Situ
                Operations, Administration, and Maintenance
                (IOAM)". RFC 9197, May 2022.

   [RFC9378]    Brockners, F. et al. "In Situ Operations,
                Administration, and Maintenance (IOAM)
                Deployment". RFC 9378, April 2023.

 Informative.

   [P4_16]      P4 Language Consortium. "P4_16 Language
                Specification, v1.2.4". 2023.

   [TOFINO2]    Intel Corporation. "Intel Tofino-2 Native
                Architecture (TNA) Reference Manual". 2022.

   [IOAM-INT]   Bhandari, S. et al. "Inband Network Telemetry
                (INT) Specification, v2.1". P4 Applications
                Working Group, 2020.

   [LIN1991]    Lin, J. "Divergence Measures Based on the
                Shannon Entropy". IEEE Trans. Inf. Theory,
                37(1):145-151, 1991.

   [POINCARE]   Poincare, H. "Science et Methode". Flammarion,
                Paris, 1908. ("L'art de donner le meme nom a des
                choses differentes...")

   [SCHEFFER]   Scheffer, M. et al. "Early-warning signals for
                critical transitions". Nature, 461:53-59, 2009.

   [CMS-COR]    Cormode, G. and Muthukrishnan, S. "An improved
                data stream summary: the count-min sketch and
                its applications". J. Algorithms, 55(1):58-75,
                2005.

   [BLOOM1970]  Bloom, B. H. "Space/time trade-offs in hash
                coding with allowable errors". Communications
                of the ACM, 13(7):422-426, 1970.


==============================================================================
 Document history
==============================================================================

   v0.1  2026-05-21  Initial draft. Companion to
                     MVPS_THREE_LAYER_MATHEMATICAL_EVIDENCE.txt
                     v1.1. Status: proposal, not implemented.
                     Author: L. Melegassi (Catellix Research).

==============================================================================
 End of document
==============================================================================