==============================================================================
  MVPS -- Doctoral continuations: three trajectories (v2)

  Leonardo Melegassi, Catellix
  Companion to draft-melegassi-ippm-mvps-bundle and to
  thesis-kit/MASTER-THESIS-ENTRY-POINTS.md
  2026-05-19, Andradina, SP, Brazil
==============================================================================

  "La mathematique est l'art de donner le meme nom a des choses
  differentes."

  "Mathematics is the art of giving the same name to different things."

      -- Henri Poincare, Science et Methode (1908), Livre I, Ch. II.

==============================================================================
0. Status and intent
==============================================================================

This document is NOT an offer, a pitch, or a research programme seeking a
supervisor. It is a forward-looking note kept inside the thesis-kit so
that, if and only if the Master Thesis collaboration matures into a longer
scientific relationship, the doctoral trajectory has already been sketched
-- not invented under deadline pressure.

It exists because, while writing the Master's entry points
(thesis-kit/MASTER-THESIS-ENTRY-POINTS.md), each of M1 / M2 / M3 kept
producing a follow-up question larger than itself. Those follow-ups did
not fit a 750-hour scope. They are listed here as three independent
doctoral trajectories. None of them is required for the Master's to
succeed.

Each of the three trajectories below is anchored on at least one finding
the toolkit has already empirically demonstrated (F001..F019). The
doctorate is the generalisation; the empirical counterexample already
exists.

==============================================================================
1. The thread, made explicit
==============================================================================

The unifying observation, made explicit by the Poincare epigraph:

  In protocol engineering, the deepest open questions are about deciding
  when two superficially different artefacts deserve the same name, and
  when two superficially identical artefacts deserve different names.

That is what canonical forms decide (D1). That is what signatures and
provenance attest (D2). That is what an empirical discipline of
specification ambiguity measures (D3).

The single most consequential sentence the toolkit produced -- already in
findings/F009-F016-mathematical-layer.md -- is the syntactic / semantic
asymmetry of path_fingerprint:

  path_fingerprint(B1) != path_fingerprint(B2)
      ==> B1 and B2 are syntactically distinct.

  path_fingerprint(B1) == path_fingerprint(B2)
      ==> B1 and B2 may or may not represent the same observed path.

This is the theoretical core of D1, the operational core of D2, and the
case-study core of D3. All three trajectories are different ways of making
that asymmetry sharper.

==============================================================================
2. The triangulation
==============================================================================

Three actors must be in the room for this kind of work to land.

  Universite de Liege
      Twenty years of empirical Internet-measurement methodology -- the
      Paris traceroute / MDA tradition (Augustin, Friedman, Teixeira;
      Donnet et al., ca. 2005-2015), reverse traceroute, anycast and IPv6
      topology, RIPE Atlas analysis, anonymous-node inference.
      The empirical discipline.

  Catellix
      The instrumentation that made F001..F019 reproducible -- mvps-audit
      (single-file binary), the Profile v1 reference, the
      differential-testing harness, a 2,006-vector conformance corpus, the
      public mvps_bundle Python package.
      The engineering substrate.

  IPPM / IETF
      The venue where canonical forms become normative, where reference
      implementations become interoperability evidence, and where
      draft-melegassi-ippm-mvps-bundle is already on the Datatracker.
      The deployment surface.

Each trajectory is intentionally designed to require all three. Removing
any one corner makes the trajectory either purely theoretical, purely
engineering, or purely political. None of those is sufficient.

==============================================================================
3. D1 -- Canonicalisation as a normative substrate for reproducible,
   interoperable Internet path telemetry
==============================================================================

Essence
-------
Raise canonical-byte equivalence from a per-field convenience (RFC 5952
for IPv6, JCS for JSON, CDE for CBOR) to a first-class normative substrate
for an entire class of telemetry bundles, and prove -- empirically and
structurally -- that the substrate suffices.

Empirical anchor
----------------
  F001, F002, F005, F007, F008   (serialisation layer)
  F011, F016                     (algebraic-layer corollaries)
  F017                           (cross-field consistency)
  F012                           (refinement, IPv6 zone identifiers)

Each already empirically demonstrated on the 2,006-vector corpus shipped
with the toolkit.

Master's on-ramp
----------------
M1 (canonicalisation audit + Profile v1.1). The Master's produces Profile
v1.1 as a catalogue extension. The PhD generalises Profile v1.1 into a
theory of canonicalisation profiles and proves -- on independent
implementations -- that the theory closes the catalogue.

Structural claim the thesis must prove
--------------------------------------
Profile v1 in section 2 of profile-v1/SPEC.md states the invariant the PhD
must turn into a theorem:

  semantic equivalence
      ==> canonical-byte equivalence
      ==> fingerprint equivalence

In algebraic terms: bundles modulo semantic equivalence form a quotient set.
The canonicaliser is a canonical section of that quotient -- a function
that picks one representative per equivalence class, in such a way that
hashing the representative produces a well-defined identity on classes
(not just on bytes). The right vocabulary is the universal algebra of
quotient structures (Burris & Sankappanavar, A Course in Universal
Algebra, ch. II), not the rewrite-system vocabulary (Church-Rosser,
Newman): the canonicaliser is deterministic by construction, so confluence
is automatic, and what needs proving is totality, idempotence, and
well-definedness on classes.

F001 is the proof that today the substrate does NOT satisfy this property:
926 / 926 IPv6-bearing bundles in the corpus produce distinct fingerprints
between two good-faith implementations. F017 and F019 show the gap is
structural (it crosses field boundaries and the validator / canonicaliser
interface), not just per-field.

Theoretical anchoring
---------------------
  - Universal algebra of quotient structures and canonical
    representatives (Burris & Sankappanavar, A Course in Universal
    Algebra, Springer GTM 1981, available open access).
  - The deterministic-encoding tradition: RFC 5952 (IPv6 textual),
    RFC 8785 (JCS), CBOR Deterministic Encoding (RFC 8949 sec. 4.2),
    COSE canonical structures (RFC 9052), HTTP message signatures
    (RFC 9421).
  - Differential testing as a scientific instrument: Yang, Chen, Eide,
    Regehr (Csmith), PLDI 2011; Le, Afshari, Su (EMI), PLDI 2014.
  - Empirical Internet-measurement substrate: the Paris traceroute / MDA
    line and its follow-ups; RIPE Atlas methodology; the CAIDA
    reverse-traceroute and topology research tradition.

Four-year roadmap
-----------------
  Year 1   Formalisation. Define ~_semantic rigorously. Prove the
           canonicaliser is total, idempotent, well-defined on classes.
           Characterise exactly the cases where the diagram does NOT
           commute (this IS the catalogue of normative-gap findings).
           Extend the corpus to ~10,000 vectors via property-based +
           schema-driven fuzzing.
           Primary deliverable: one peer-reviewed paper (PETS / CoNEXT /
           FORTE-class venue); updated I-D with formalisation appendix.

  Year 2   Multi-implementation interoperability. Re-derive Profile v1.x
           in four languages (Python reference + Go + Rust + one of
           C / TypeScript / OCaml). Each passes the same conformance
           corpus on x86-64 and AArch64.
           Primary deliverable: one systems paper (CoNEXT / IMC / NSDI);
           a reference suite shipped with the I-D.

  Year 3   Cryptographic-layer corollary. Demonstrate (not formally
           verify -- that belongs in D2) that Profile v1 + HMAC-SHA-256
           closes F010 in a way two independent implementations agree on.
           Document the boundary where D1 stops and D2 begins.
           Primary deliverable: one short paper or extended technical
           report; companion I-D.

  Year 4   Internet-scale validation and standardisation. Deploy to a
           RIPE Atlas anchor subset (~50 anchors, 90 days, ~1.5 M
           bundles). Measure canonicalisation divergence in the wild.
           Pursue WG adoption of Profile v1.x. Defend.
           Primary deliverable: one IMC / PAM paper; WG-adoption
           discussion (best-case: adopted; realistic: well-positioned).

Why Liege specifically (three reasons, none ceremonial)
-------------------------------------------------------
  1. The MDA tradition needs canonical fingerprints. Multipath detection
     assumes that two probes traversing the same path can be recognised
     as such. In production that recognition is heuristic. Canonical
     fingerprints turn the heuristic into an identity. The thesis pays
     back a methodology that has been improvising the answer for over a
     decade.

  2. The empirical-measurement-first culture is the right soil. A formal
     canonicalisation thesis written in a pure-theory department risks
     becoming an exercise in algebraic aesthetics. Written where every
     claim is expected to be measured against RIPE Atlas or a comparable
     dataset, it stays honest.

  3. IPPM proximity.
     Standardisation in Year 4 is not an afterthought; it is where the
     thesis lands. The Liege group has the institutional connections to
     make that landing real.

Realistic outputs over four years
---------------------------------
  - 3-5 peer-reviewed papers (PETS, CoNEXT, IMC, PAM, possibly NSDI).
  - 1-2 I-Ds; one pursued through WG-adoption.
  - 1 reference implementation suite in 4 languages, CI-gated.
  - 1 public conformance corpus, >= 10,000 vectors.
  - 1 RIPE Atlas longitudinal dataset (~90 days).
  - 1 dissertation.

Principal risks and mitigations
-------------------------------
  Pure-theory drift.        Mitigated by the hard requirement that every
                            formalisation claim be paired with a
                            counterexample in the corpus or an
                            absence-of-counterexample over the full
                            corpus.

  Competing standards       Mitigated by engaging the relevant WGs (CBOR,
  (CDE, JCS, ...).          JSON, COSE, IPPM) early; the thesis is
                            positioned as APPLYING deterministic encoding
                            to a new domain, not competing with it.

  Scope creep into          Mitigated by holding MVPS as the case study
  "all of network           for the first three years, and only
  telemetry".               generalising the framework -- not the case
                            studies -- in Year 4.

==============================================================================
4. D2 -- Verifiable provenance for distributed Internet measurement
==============================================================================

Essence
-------
Answer the operational question "who can I trust about this measurement,
and at what cost?" by turning per-snapshot authenticity (HMAC) into a
federated, verifiable, cross-organisation measurement provenance
infrastructure -- and by defining the algebra of bundle combination that
federation requires.

Empirical anchor
----------------
  F010   path_fingerprint length-extendable: the necessary cryptographic
         PRECONDITION this thesis closes.
  F015   bundle combination algebra undefined: the operational GAP this
         thesis fills.
  F019   path_fingerprint more permissive than validate_bundle: the
         cross-layer ATTACK SURFACE this thesis must defend against.

All three already empirically demonstrated and reproducible in < 1 second
with mvps-audit.

Master's on-ramp
----------------
M3 (per-snapshot HMAC + key rotation). The Master's produces a
per-snapshot scheme inside a single trust domain. The PhD produces the
inter-organisation federation and the formal verification of its security
claims.

What changes from M3 to D2
--------------------------
M3 stops at: vantage point produces a bundle, signs it with a
vantage-scoped key, downstream verifies. That suffices inside one
operator. It does not suffice when:

  - A bundle traverses three organisations (producer, aggregator,
    archive) and each must attest something different.
  - Keys rotate at different cadences in different organisations.
  - A vantage point is compromised retrospectively and historical bundles
    must be invalidated without breaking the rest of the dataset.
  - An archive must prove, years later, that the bundle it stored is the
    bundle the vantage point produced -- without retaining the signing
    key.
  - Two operators contribute snapshots to a shared bundle and the
    combination operator on bundles must be defined normatively (F015:
    today it is silent).

These are the questions D2 answers.

Theoretical anchoring
---------------------
  - F015 (bundle combination algebra) as the structural problem. The
    thesis must define MVPS-bundle combination as an explicit algebraic
    structure -- the existing F015 recommendation is a commutative
    semigroup with namespaced vantage IDs and idempotence on byte-equal
    snapshots; the PhD justifies, refines, or replaces this choice.
  - F010 (path_fingerprint length-extendable) as the cryptographic
    precondition (already proved; D2 builds on it).
  - Per-snapshot and bundle-level authenticity primitives: HMAC-SHA-256
    (RFC 2104) for the symmetric / single-operator case; Ed25519
    (RFC 8032) and COSE_Sign1 / COSE_Sign / Countersign (RFC 9052,
    RFC 9338) for the cross-operator case.
  - Transparency-log inspiration: Certificate Transparency (RFC 6962,
    RFC 9162), Sigstore / Rekor.
  - Formal protocol verification: Tamarin (Meier, Schmidt, Cremers,
    Basin) or ProVerif (Blanchet) for machine-checked security arguments
    against an adversarial-vantage-point threat model.
  - Optional / exploratory (only if a specific threat model warrants):
    threshold signatures (Shoup; BLS aggregation) for multi-vantage
    co-attestation; Verifiable Random Functions (RFC 9381) for
    anti-collusion in vantage selection; Merkle aggregation for compact
    proofs over very large bundle sets.

Four-year roadmap
-----------------
  Year 1   Threat models and the federation algebra. Three threat models:
           single-operator (M3 baseline); multi-operator with an honest
           majority; multi-operator with mutually distrusting parties.
           For each, the minimum cryptographic primitive that suffices.
           In parallel: normative algebra of bundle combination (F015),
           with worked examples on a 3-operator simulation.
           Primary deliverable: one survey / problem-statement paper.

  Year 2   Scheme design and formal verification. Implement in Tamarin or
           ProVerif; produce a machine-checked proof for each of the
           three threat models.
           Primary deliverable: one security paper (NDSS / IEEE S&P /
           USENIX Security).

  Year 3   Deployment and interoperability. Three-organisation testbed:
           one ULiege node, one RIPE-anchor-class node, one Brazil-side /
           Catellix node. Run the protocol for ~12 weeks. Measure
           signature failure rate, key-rotation outage windows, archive
           verification cost, retrospective revocation latency.
           Primary deliverable: one systems paper (NSDI / CoNEXT / IMC);
           companion I-D in COSE or a measurement-provenance WG.

  Year 4   Generalisation and defence.
           Extend beyond MVPS to one adjacent measurement format (RIPE
           Atlas result; BMP; BGP-LS). Pursue I-D adoption. Defend.
           Primary deliverable: I-D submitted; one synthesis paper;
           dissertation.

Why Liege specifically
----------------------
The Liege group's work on anonymous network nodes and on anycast detection
hinges on attribution -- WHO is responsible for a hop, WHERE a service is
replicated. Verifiable provenance is the same question re-asked one layer
up: who is responsible for the measurement, and what would it take to
prove it cryptographically? The group has spent two decades answering the
lower-level version of that question; D2 answers the next-layer version
using the empirical discipline already in the house.

Realistic outputs over four years
---------------------------------
  - 3-5 peer-reviewed papers (NDSS, IEEE S&P, NSDI, IMC, CoNEXT).
  - 1-2 I-Ds (one in COSE; one in IPPM or a measurement-provenance WG).
  - 1 reference implementation of the federation protocol with Tamarin /
    ProVerif models.
  - 1 three-organisation testbed deployment, with 12 weeks of public
    measurements.
  - 1 dissertation.

Principal risks and mitigations
-------------------------------
  Operator pushback         Mitigated by supporting both HMAC
  against asymmetric        (operator-symmetric) and Ed25519
  signatures.               (cross-operator) modes with a clear
                            deployment guide for each.

  Formal-verification       Mitigated by requiring every Tamarin /
  rabbit hole.              ProVerif model to correspond to an explicit
                            attack scenario from the F-catalogue or from
                            a published incident -- no abstract proofs
                            without a concrete adversary.

  Federation never          Mitigated by designing the protocol so that
  materialising at          single-operator deployment already provides
  deployment time.          clear benefit (closes M3's loop); federation
                            is an opt-in extension, not a prerequisite.

==============================================================================
5. D3 -- Differential Spec Engineering: an empirical discipline for IETF
   documents
==============================================================================

Essence
-------
Treat textual normative specifications as empirical objects whose
ambiguity (sigma_spec) can be defined, measured, and systematically
reduced -- generalising the method that produced F001..F019 from a single
case (MVPS) into a transferable empirical discipline applicable across
IETF and adjacent standards bodies.

Empirical anchor
----------------
The METHOD itself, not any one finding. The existence of F001..F019, all
produced within 24 hours of evening work by a single author on a single
spec, is the proof of concept. The thesis is the discipline.

Master's on-ramp
----------------
None directly -- D3 is the meta-level question raised by the method used
to find F001..F019, independent of any specific finding. M2 (independent
Go producer) is, however, the single best dry-run of the methodology: it
forces an explicit articulation of what counts as a "divergence" and what
counts as a "bug".

Core hypothesis
---------------
Every prose normative specification has a non-zero specification
uncertainty, sigma_spec. Today, that uncertainty is discovered post-hoc --
by deployment incidents, errata, interop-event surprise. D3 hypothesises
that sigma_spec can be:

  1. Defined formally (candidate: Shannon entropy of the distribution of
     behaviours observed across independent implementations conditional
     on a fixed input bundle).
  2. Measured empirically (candidate: differential testing across a
     curated implementation set + property-based input generation from
     the spec's schema).
  3. Reduced systematically (candidate: targeted normative refinement
     guided by the divergence ranking the measurement produces).
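
Candidate definition 1 can be illustrated with a minimal sketch. It is an
illustration only, not the toolkit's implementation: the function name
sigma_spec_bits, the implementation names and the sample fingerprints are
all hypothetical, and the choice of base-2 logarithms (bits) is an
assumption. For one fixed input bundle, each independent implementation
contributes one observed behaviour (for example, the fingerprint it
emits), and sigma_spec is the Shannon entropy of that empirical
distribution.

```python
import math
from collections import Counter
from typing import Hashable, Iterable


def sigma_spec_bits(behaviours: Iterable[Hashable]) -> float:
    """Shannon entropy, in bits, of the behaviours observed for ONE input.

    `behaviours` holds one hashable observation per implementation
    (e.g. the fingerprint each one produced). The name and the unit are
    assumptions made for this sketch.
    """
    counts = Counter(behaviours)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one implementation output is required")
    # H = sum_k p_k * log2(1 / p_k) over the distinct behaviours k.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Hypothetical observations for a single fixed input bundle.
observed = {
    "impl_python": "fp-aaaa",
    "impl_go": "fp-aaaa",
    "impl_rust": "fp-bbbb",
    "impl_ocaml": "fp-bbbb",
}
print(sigma_spec_bits(observed.values()))
```

Under this definition, full agreement across implementations gives 0 bits,
an even two-way split gives 1 bit, and four implementations that all
disagree give 2 bits. Averaging the per-input value over a corpus is one
possible way to obtain a per-spec figure; the worst-case-divergence
formulation mentioned in the Year 1 roadmap would replace the entropy with
a maximum.
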

If true, this turns specification authorship from a craft into a
measurable engineering discipline, with the same kind of feedback loop
that compiler testing brought to compiler engineering between 2010 and
2020 (Csmith, EMI, YARPGen, Frama-C).

Theoretical anchoring
---------------------
  - Compiler differential testing (Csmith -- Yang, Chen, Eide, Regehr,
    PLDI 2011; EMI -- Le, Afshari, Su, PLDI 2014; YARPGen).
  - Formal semantics of natural-language specifications (Cuoq with
    Frama-C / TrustInSoft; Krebbers, The C Standard Formalized in Coq,
    PhD Radboud 2015; CakeML -- Norrish, Tan, Owens et al.).
  - Empirical software engineering methodology (Wohlin, Runeson, Host,
    Ohlsson, Regnell, Wesslen, Experimentation in Software Engineering).
  - Information-theoretic notions of ambiguity (Shannon entropy;
    conditional entropy) as vocabulary for sigma_spec.
  - IETF / IRTF process literature (RFC 2026, RFC 8126, BCP 9) for the
    deployment surface of the methodology.

Four-year roadmap
-----------------
  Year 1   Formalisation and tooling baseline. Define sigma_spec precisely
           (Shannon-entropy vs worst-case-divergence formulations --
           decide, or admit both). Build the first reference instrument:
           a generic differential-testing harness accepting any spec with
           a YANG / JSON-schema / CDDL definition and a set of executable
           implementations. Validate against MVPS (ground truth:
           F001..F019 already exists).
           Primary deliverable: one methodology paper; the harness as a
           public artefact.

  Year 2   Case studies, part 1. Three IETF case studies of increasing
           complexity:
             (i)   MVPS (baseline);
             (ii)  a stable, mature spec, e.g. a JOSE algorithm
                   (RFC 7518), to test that sigma_spec is correctly LOW;
             (iii) a known-troublesome spec, e.g. BGP-LS attributes
                   (RFC 7752) or BMP peer-up notifications (RFC 7854),
                   to test that sigma_spec is correctly HIGH.
           Primary deliverable: one empirical paper (IMC / CoNEXT / FSE /
           ICSE).

  Year 3   Case studies, part 2; tooling productisation.
           Two more case studies (one in IRTF; one in W3C if reachable).
           Open the tooling for community use; engage at least one WG to
           apply DSE measurement once during a draft revision cycle.
           Primary deliverable: one systems paper; community uptake
           evidence (at least one external user).

  Year 4   Synthesis and standardisation. Methodology written up as an
           informational I-D ("Differential Spec Engineering: an
           empirical methodology for normative documents"). Defend.
           Primary deliverable: I-D submitted (informational; adoption is
           bonus, not commitment). Dissertation.

Stretch goal (not required for the dissertation): get DSE measurement
formally integrated into one IETF WG's revision workflow. IETF process
changes slowly; the thesis must be defendable whether or not the process
change lands.

Why Liege specifically
----------------------
The methodological match is unusually clean. The Liege group has spent
twenty years applying empirical instruments to objects that the networking
community had been treating qualitatively (reverse-engineered topologies,
anonymous nodes, multipath inference). D3 applies the same epistemological
move to a different object: specifications themselves. The intellectual
reflex -- "what would it look like to actually measure this?" -- is the
generator of the thesis. ULiege is, on this read, one of perhaps a
half-dozen places in the world where this thesis would be at home.

Realistic outputs over four years
---------------------------------
  - 3-5 peer-reviewed papers (IMC, CoNEXT, FSE, ICSE; one possibly at a
    methodology venue such as ESEM).
  - 1 I-D (informational) on the methodology.
  - 1 open-source DSE harness, reusable across specs.
  - 5 case-study reports, each citable as evidence.
  - 1 dissertation.

Principal risks and mitigations
-------------------------------
  Highest ambition,         Mitigated by a hard cap of five case studies
  highest scope risk.       and an explicit policy that the thesis is
                            about the METHOD, not about exhaustively
                            cataloguing any one spec's bugs.

  Tool-engineering          Mitigated by the Year 1 formalisation
  drift.                    requirement: no case study runs until
                            sigma_spec is defined and the harness has
                            been validated against MVPS.

  Community                 Mitigated by case-study selection: the
  indifference.             high-sigma_spec cases are picked from specs
                            whose ambiguity has caused documented
                            operational incidents (BGP-LS attribute
                            encoding; BMP peer-up; etc.), so the audience
                            for the result is pre-existing.

==============================================================================
6. How to choose between D1, D2, and D3
==============================================================================

These are not three flavours of the same dissertation. They have different
centres of mass and recruit different temperaments.

  D1   rewards a candidate who is comfortable moving between universal
       algebra (quotients, sections) and standards prose (RFCs, drafts),
       and who wants the lasting artefact to be a STANDARD.

  D2   rewards a candidate who is comfortable with cryptographic protocol
       design and formal protocol verification, and who wants the lasting
       artefact to be a DEPLOYED FEDERATED SYSTEM.

  D3   rewards a candidate who is comfortable with empirical methodology
       and tool-building, and who wants the lasting artefact to be a
       DISCIPLINE.

All three are doable in four years at ULiege with the toolkit and
catalogue this thesis-kit already provides as a starting condition. None
of the three is doable in four years without that starting condition; the
toolkit reasonably shaves 9-12 months off each, because the conformance
corpus, the differential harness, the Profile v1 reference, the catalogue
F001..F019, and the IPPM relationship around
draft-melegassi-ippm-mvps-bundle already exist.

==============================================================================
7. Relationship to the Master's
==============================================================================

  Master's entry point      Empirical anchor (Master's)    Continuation
  --------------------      ---------------------------    ------------
  M1 Canonicalisation       F001..F008, F011, F017         D1
     audit + Profile v1.1
  M2 Independent Go         F019 (across-layer             D1 (interop)
     producer + Atlas       disagreement)                  or D3 (case)
  M3 Per-snapshot HMAC      F009, F010, F015               D2
     + key rotation

A successful Master's outcome is a necessary but not sufficient condition
for any of D1 / D2 / D3. The Master's establishes that the candidate can
produce concrete spec-engineering artefacts at IETF-publishable quality;
the PhD establishes that they can build a four-year research programme
around the structural question those artefacts raised.

==============================================================================
8. Funding and structure (briefly)
==============================================================================

These trajectories are written as ULiege PhDs, which in the Belgian
framework typically run on FNRS (FRIA / Aspirant), ARC, or industrial
co-funding. A COTUTELLE with a Brazilian institution (UFSCar, Unicamp,
UFRJ, or USP) is also viable and would let Catellix co-finance a portion.
No commitment is implied either way; this is noted only because the
trajectories are sized to standard ULiege PhD funding envelopes (3-4
years, 1 FTE supervisor + 1 promoter).

==============================================================================
9. On the epigraph, returned to
==============================================================================

Poincare's sentence, in context, is about why the abstraction of GROUP --
which gives "the same name" to permutations, rotations, integers under
addition, and the automorphisms of a geometric figure -- is a deep act of
mathematics rather than a notational convenience.
The power, he says, lies in the act of NAMING: once two phenomena share a
name, every theorem about that name applies to both.

The MVPS bundle, the RFC 5952 IPv6 textual form, the RFC 8785 JSON
canonicalisation, deterministic CBOR, the COSE signature payload, the HTTP
message-signature input, and the future telemetry formats nobody has
written yet are, in the relevant sense, THE SAME THING: each is a
structured artefact whose semantic equivalence classes deserve canonical
names, whose canonical names deserve fingerprints, and whose fingerprints
deserve cryptographic binding.

D1, D2, and D3 are three ways of giving them that name -- and of proving
that the name is well-defined.

If the Master's collaboration goes well, any one of them is a four-year
conversation worth having.

==============================================================================
10. Expansion (v3, 2026-05-20) -- the three mathematical layers
==============================================================================

Between 2026-05-19 and 2026-05-20 the toolkit gained three rigorously
defined mathematical layers, implemented in app/mvps_layer{1,2,3}.py, 70
unit tests green, four canonical scenarios per layer, CLI and REST entry
points. Each layer is the answer to a single physical question:

  Layer 1 (spatial)         Where is the network now?
                            C(t) in [0,1]^3 + Hamiltonian H + six hidden
                            observables (rho, Delta, R, Sigma, det Sigma,
                            sigma_t * sigma_f).

  Layer 2 (temporal/        How does the network evolve?
  structural)               9-vector (D1,D2,D3 / K1,K2,K3 / T1,T2,T3) --
                            regularity / conservation / topology.

  Layer 3 (regime)          Which qualitative phase is occupied, and how
                            close is the next transition?
                            Phi(t) discrete + (tau_CSD, tau_dphi, tau_OU)
                            + (omega_H, omega_M, omega_C).

Every previous Master's (M1, M2, M3) and doctoral (D1, D2, D3) entry point
stands as written. None is obsolete; canonicalisation, provenance and
Differential Spec Engineering remain independent contributions.
What follows below is the *expansion surface* opened by the three layers,
organised at three scopes: Master (M), Doctoral (D), Post-doctoral (P),
plus four cross-cutting categories of academic pendency (theoretical /
empirical / engineering / governance). The numbering continues from
M3 / D3; the legacy entry points are preserved verbatim above.

==============================================================================
11. Master's-level expansion (M4 .. M33)
==============================================================================

Each Master is sized at the same 750-hour scope as M1/M2/M3. Each has a
concrete first deliverable on day 1 (the toolkit reproduces the relevant
scenarios) and a clear "what done looks like".

------------------------------------------------------------------------------
Layer 1 -- spatial coherence (10 Master entry points)
------------------------------------------------------------------------------

M4 -- Empirical calibration of C1 Einstein bound across fibre stretch
  Anchor: app/mvps_layer1.py::c1_causal + FIBRE_STRETCH_BY_REGION.
  Goal:   measure c_fibre across terrestrial, submarine and intra-DC
          links using RIPE Atlas ground-truth probes; produce a
          calibrated table of refractive-index ratios by region.
  Done:   a publishable per-region c_fibre table; an updated Profile v1
          normative reference for the Einstein causal axis.

M5 -- JSD vs match_rate for C2: a decidability criterion
  Anchor: app/mvps_layer1.py::c2_informational.
  Goal:   characterise empirically when continuous-distribution JSD
          outperforms point-wise match_rate, and vice-versa. Build a
          decision tree based on (#vantages, vantage diversity,
          hop-distribution width).
  Done:   a 30-page methodology note; runtime selector in the code;
          co-authored short paper with the supervisor.

M6 -- Window selection for the entropy term in C1^tau
  Anchor: H_v in the fingerprint distribution. Today the moving window
          is a hard-coded constant.
  Goal:   derive the optimal window from the autocorrelation time of the
          fingerprint stream (Ornstein-Uhlenbeck-style). Validate that
          the C1 produced is invariant under window choice within a
          justified range.
  Done:   a calibration scheme; a robustness study.

M7 -- Gauge fidelity rho = C2 / C3: empirical phenomenology
  Anchor: app/mvps_layer1.py::gauge_fidelity. F001 is already a *known*
          gauge_gap regime (rho = 0).
  Goal:   scan rho across 50+ RIPE Atlas anchors over 90 days; identify
          the operational regimes that drive rho away from bijective.
  Done:   a first map of "where in the Internet rho collapses", with at
          least three new operational regimes catalogued.

M8 -- Reciprocity violation R on OWAMP / TWAMP datasets
  Anchor: app/mvps_layer1.py::reciprocity_violation.
  Goal:   first large-scale measurement of BGP-induced asymmetry via R
          on a public OWAMP dataset (e.g. perfSONAR archives). Compare
          with classical TomographyR results.
  Done:   empirical paper; identification of AS pairs with chronic
          R > 0.5.

M9 -- Triangle defect Delta as informational curvature
  Anchor: app/mvps_layer1.py::triangle_defect.
  Goal:   connect Delta with hyperbolic embedding of the Internet graph
          (Boguna-Krioukov line). Test the hypothesis that mean(Delta)
          reflects global graph hyperbolicity.
  Done:   a Delta-vs-Ollivier-Ricci cross-validation paper.

M10 -- Online robust estimation of Sigma (residual covariance)
  Anchor: app/mvps_layer1.py::LayerPredictor (Welford).
  Goal:   replace Welford with Minimum Covariance Determinant (MCD) and
          Tyler M-estimators; characterise breakdown points in
          adversarial inputs.
  Done:   a robustness study; library replacement; updated
          eigendecomposition confidence intervals.

M11 -- Heisenberg-Gabor tightness in MVPS observations
  Anchor: app/mvps_layer1.py::uncertainty_bound.
  Goal:   characterise under which sampling protocols the product
          sigma_t * sigma_f saturates the 1/(4*pi) floor.
  Done:   an operational guide on how to choose dt and window to reach
          the physical limit of joint observability.
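
The 1/(4*pi) floor named in M11 can be checked numerically on a toy
signal. The sketch below is a generic illustration of the Gabor limit and
makes no claim about how app/mvps_layer1.py::uncertainty_bound is
implemented or which signal it takes sigma_t and sigma_f over; the names
gabor_product and GABOR_FLOOR are hypothetical. It assumes a real-valued,
uniformly sampled signal that decays to zero at both ends of the window,
and it obtains the frequency spread from the time derivative through
Parseval's identity rather than from an explicit Fourier transform.

```python
import math
from typing import Sequence

GABOR_FLOOR = 1.0 / (4.0 * math.pi)  # lower bound on sigma_t * sigma_f


def gabor_product(x: Sequence[float], dt: float) -> float:
    """Return sigma_t * sigma_f for a real signal sampled every dt seconds.

    sigma_t is the standard deviation of t under the density |x(t)|^2.
    sigma_f uses Parseval: for a real signal the mean frequency is 0 and
    integral f^2 |X(f)|^2 df = integral |x'(t)|^2 dt / (4 * pi^2).
    """
    n = len(x)
    energy = sum(v * v for v in x)
    if n < 3 or energy == 0.0:
        raise ValueError("need at least three samples and non-zero energy")
    times = [i * dt for i in range(n)]
    t_mean = sum(t * v * v for t, v in zip(times, x)) / energy
    var_t = sum((t - t_mean) ** 2 * v * v for t, v in zip(times, x)) / energy
    # Central differences approximate x'(t) at the interior samples.
    deriv = [(x[i + 1] - x[i - 1]) / (2.0 * dt) for i in range(1, n - 1)]
    var_f = sum(d * d for d in deriv) / energy / (4.0 * math.pi ** 2)
    return math.sqrt(var_t * var_f)


dt = 0.01
grid = [-8.0 + i * dt for i in range(1601)]
gaussian = [math.exp(-t * t / 2.0) for t in grid]   # saturates the bound
two_sided_exp = [math.exp(-abs(t)) for t in grid]   # does not saturate it

print("floor        ", GABOR_FLOOR)
print("gaussian     ", gabor_product(gaussian, dt))
print("two-sided exp", gabor_product(two_sided_exp, dt))
```

The Gaussian window sits on the floor up to discretisation error, while
the two-sided exponential lands clearly above it. In this toy setting, dt
and the window length only matter insofar as they resolve the signal and
let it decay inside the window.
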

M12 -- Information geometry of (C1, C2, C3): Fisher metric
  Anchor: the [0,1]^3 cube interpreted as a statistical manifold.
  Goal:   derive the Fisher information metric on Layer-1; draw geodesics
          between BASELINE and the six other reference centroids. Compare
          metric distances with the legacy Mahalanobis.
  Done:   a short methodology paper; an alternative metric to feed Phi_D in
          Layer 3.

M13 -- Adversarial robustness of Layer-1 axes
  Anchor: F001..F019 already document adversarial regimes accidentally
          produced by bad canonicalisation.
  Goal:   design controlled adversaries (one per axis) that depress C_i
          without depressing the others; quantify cost of attack vs
          detection.
  Done:   an attack catalogue + a defence ranking.

------------------------------------------------------------------------------
Layer 2 -- temporal / structural coherence (9 Master entry points)
------------------------------------------------------------------------------

M14 -- D1 Lipschitz saturation: choosing operational sat constant
  Anchor: app/mvps_layer2.py::d1_lipschitz_regularity.
  Goal:   derive saturation from autocorrelation time, not from a
          hard-coded constant. Validate on real time-series.
  Done:   a calibrated D1 with a justified scale; a small appendix to
          Profile v2.

M15 -- D3 ergodicity tests at scale (Birkhoff, Kac)
  Anchor: app/mvps_layer2.py::d3_ergodicity.
  Goal:   first large-scale Birkhoff-style test on Internet path data: does
          time-average equal ensemble-average for Layer-1 axes? Empirically
          identify non-ergodic regimes (e.g. anycast, CDN re-routing).
  Done:   empirical paper documenting non-ergodicity as a regime, not a
          measurement defect.

M16 -- K1 Kirchhoff conservation on sFlow / NetFlow data
  Anchor: app/mvps_layer2.py::k1_flow_conservation.
  Goal:   first deployment of K1 on a real telco's sFlow feed; catalogue
          the operational reasons for K1 < 1 (forwarding loss, ECMP rehash,
          sampling bias).
  Done:   a real-network operator co-authored paper.

M17 -- K2 Clausius entropy and BGP route-flap dynamics
  Anchor: app/mvps_layer2.py::k2_information_conservation.
  Goal:   relate K2 transients with BGP UPDATE counts from RIPE RIS feed;
          characterise the entropy signature of a full route flap event.
  Done:   a quantitative bridge between control-plane events and Layer-2
          informational conservation.

M18 -- K3 Hamilton stability as a SLA metric
  Anchor: app/mvps_layer2.py::k3_action_stability.
  Goal:   propose K3 as an SLA contract metric (alternative to five-nines
          availability). Test against six months of production data from a
          regional operator.
  Done:   a draft I-D proposing K3 as a measurable SLA target.

M19 -- T2 Fiedler algebraic connectivity longitudinal
  Anchor: app/mvps_layer2.py::t2_algebraic_connectivity.
  Goal:   track lambda_2(L) of the Internet's RIPE Atlas topology
          month-by-month over 2-3 years; identify connectivity "weather".
  Done:   a longitudinal dataset + first paper documenting
          algebraic-connectivity dynamics of the Internet.

M20 -- T3 Menger path diversity across CDN providers
  Anchor: app/mvps_layer2.py::t3_path_diversity.
  Goal:   compare T3 between major CDN providers (Cloudflare, Fastly,
          Akamai, CloudFront) from the perspective of a common set of
          vantages.
  Done:   a public T3 benchmark for the CDN industry.

M21 -- Migration of legacy C6/C7/C9 out of MVCI v1.x
  Anchor: the four "categorically wrong" axes identified in
          CAMADA2_DINAMICA_CONSERVACAO_ESTRUTURA.md sec. 0.
  Goal:   implement the migration: C6 -> Layer 0 (syntax), C7 -> coverage
          metadata, C9 -> Layer 3 model. Reconcile the public glossary with
          the code.
  Done:   MVCI v1.5 release notes + reconciled glossary + regression suite.

M22 -- Comparison of Layer-2 signatures with classical baselines
  Anchor: docs/research/BASELINES_MATRIX.md exists already for Layer 1.
  Goal:   extend the matrix to Layer 2 -- demonstrate that (D, K, T)
          captures signal that a single-vantage RTT histogram cannot.
  Done:   a head-to-head empirical paper.
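
As an illustration of the quantity M19 proposes to track: lambda_2(L) is the
second-smallest eigenvalue of the graph Laplacian L = D - A. The sketch below
is a minimal pure-Python estimate, assuming an undirected, unweighted edge
list. The function name algebraic_connectivity is hypothetical and is not the
toolkit's t2_algebraic_connectivity; the method shown (shifted power iteration
with the all-ones vector projected out) is a swapped-in textbook technique,
and a production version would more likely call a numerical eigensolver.

```python
import math


def algebraic_connectivity(n, edges, iters=2000):
    """Estimate lambda_2 of L = D - A for an undirected graph on n nodes.

    Hypothetical helper, for illustration only. Power iteration on
    M = shift*I - L, restricted to vectors orthogonal to the all-ones
    vector, converges towards the eigenvalue shift - lambda_2, provided the
    start vector is not orthogonal to the corresponding eigenspace.
    """
    if n < 2:
        return 0.0
    adj = [set() for _ in range(n)]
    for u, v in edges:
        if u != v:                       # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    deg = [len(a) for a in adj]
    shift = 2.0 * max(deg) + 1.0         # strictly above lambda_max(L)

    def laplacian_times(x):
        return [deg[i] * x[i] - sum(x[j] for j in adj[i]) for i in range(n)]

    def project_and_normalise(x):
        mean = sum(x) / n                # drop the all-ones component
        y = [xi - mean for xi in x]
        norm = math.sqrt(sum(yi * yi for yi in y))
        return [yi / norm for yi in y]

    x = project_and_normalise([float(i + 1) for i in range(n)])
    for _ in range(iters):
        lx = laplacian_times(x)
        x = project_and_normalise([shift * xi - li for xi, li in zip(x, lx)])
    lx = laplacian_times(x)
    return sum(xi * li for xi, li in zip(x, lx))   # Rayleigh quotient x^T L x
```

A value near zero indicates a graph that is disconnected or close to it, which
is the kind of signal a month-by-month series would be read for.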

------------------------------------------------------------------------------
Layer 3 -- regime and transition (7 Master entry points)
------------------------------------------------------------------------------

M23 -- Recalibration of _FAILURE_CENTROIDS on Layer-1 v1
  Anchor: thesis-kit/CAMADA3_FASE_E_TRANSICAO.md sec. 6.
  Goal:   replace the engineering-prior centroids (calibrated on legacy
          C1/C2/C3) with empirical centroids on the new
          Einstein/JSD/Jaccard definitions, fit on a labelled dataset of
          regime instances.
  Done:   a recalibrated _FAILURE_CENTROIDS table with confidence
          intervals; a paper documenting the procedure.

M24 -- Choice of K (number of phases) by BIC / AIC / silhouette
  Anchor: K = 7 is a legacy engineering choice, not a result.
  Goal:   justify K empirically on >= 6 months of RIPE Atlas data. Test K
          in {3,...,15} with cross-validation.
  Done:   a defensible K with information-criterion evidence.

M25 -- Critical Slowing Down on RIPE Atlas data
  Anchor: app/mvps_layer3.py::tau_csd.
  Goal:   first systematic Scheffer 2009 application to Internet-scale time
          series. How early does tau_CSD warn, on average, across
          documented incidents?
  Done:   an early-warning lead-time distribution; a paper.

M26 -- Ornstein-Uhlenbeck T_OU as recovery-time SLO
  Anchor: app/mvps_layer3.py::tau_ou.
  Goal:   propose T_OU as a publishable Service Level Objective ("time for
          the network to forget a perturbation"). Validate over months of
          production data.
  Done:   a measurable, comparable, normatively-defined SLO.

M27 -- Voronoi-boundary distance tau_dphi as NOC alert metric
  Anchor: app/mvps_layer3.py::tau_boundary.
  Goal:   integrate tau_dphi into a production NOC; measure false-positive
          / false-negative rates against ground-truth incidents over 90
          days.
  Done:   an operational deployment evaluation.

M28 -- omega_M margin as antecedent indicator
  Anchor: app/mvps_layer3.py::omega_margin.
  Goal:   when omega_M < 0.05 fires, how many minutes before the label
          Phi_K actually flips?
          A quantitative characterisation across regimes.
  Done:   an operator's playbook for omega_M thresholds.

M29 -- Calibration of K_* labels under Layer-1 v1
  Anchor: regime_k._K_FROM_CLASS.
  Goal:   revisit the mapping FCE_class -> K_label after Layer 1 axes were
          redefined. Survey 20+ NOC operators on the semantics they expect
          from each K_*.
  Done:   a sociology-of-NOC-vocabulary paper + a settled canonical
          mapping.

------------------------------------------------------------------------------
Cross-layer and infrastructure (4 Master entry points)
------------------------------------------------------------------------------

M30 -- Educational GUI with the three layers
  Anchor: tools/mvps_audit_web/.
  Goal:   extend the Discovery Lab into a three-layer interactive teaching
          app; A/B test pedagogy with two cohorts (one with classical
          traceroute slides, one with Layer-1/2/3 dashboards).
  Done:   a teaching-effectiveness paper; the curriculum kit becomes the
          public default.

M31 -- BMP and BGP-LS adapters to MVPS
  Anchor: the existing RIPE-Atlas adapter (M2).
  Goal:   produce MVPS bundles from BMP (BGP monitoring) and BGP-LS feeds,
          exercising Layer 2's K1/K2 axes with control-plane evidence.
  Done:   two new adapters; the corpus grows beyond traceroute-derived
          data.

M32 -- YANG/CDDL/CBOR alternative serialisations
  Anchor: tools/mvps_audit_web/draft-melegassi.
  Goal:   validate that Profile v1 + Layer 1/2/3 are
          serialisation-agnostic; produce a YANG, a CDDL, and a CBOR
          canonical encoding alongside the existing JCS.
  Done:   three alternative wire formats, all passing the 2,006-vector
          conformance.

M33 -- Cross-language i18n A/B in education
  Anchor: thesis-kit i18n already in PT/EN/FR/ES.
  Goal:   scientifically significant A/B test on students -- does
          mother-tongue Layer-1/2/3 instruction lower the time-to-fluency?
          (~ 60 student cohort, 4 languages.)
  Done:   an empirical CS-education paper.

==============================================================================
12. Doctoral-level expansion (D4 .. D20)
==============================================================================

Each is a four-year programme, anchored on at least one finding already
empirically demonstrated in the toolkit. The expansion is deliberate: in
D1/D2/D3 the lever was canonical names, signatures, and method. In D4..D20
the lever is the mathematical structure of the three layers themselves.

------------------------------------------------------------------------------
D4 -- Theory of MVPS Layers: an axiomatic three-layer framework
------------------------------------------------------------------------------

Essence: formalise the three-layer architecture as a complete algebraic /
topological framework. Prove that Layers 1, 2 and 3 exhaust the space of
observables of a multi-vantage probing system, up to a formal "completeness"
theorem analogous to thermodynamic completeness (state variables +
conservation laws + phase variables).

Empirical anchor: the existing 9 + 9 + 9 + 6 = 33 scalars (3 + 6 Layer 1 +
9 Layer 2 + 9 Layer 3 + 6 hidden).

Roadmap (4 years):
  Y1: formal definitions of equivalence on observable algebras; prove
      minimal sufficiency of Layer 1 for instantaneous state.
  Y2: prove minimal sufficiency of Layer 2 for temporal state.
  Y3: prove minimal sufficiency of Layer 3 for regime.
  Y4: completeness conjecture + empirical falsifiability tests.

------------------------------------------------------------------------------
D5 -- Empirical phenomenology of Layer-1 coherences at Internet scale
------------------------------------------------------------------------------

Essence: a longitudinal observational programme: collect (C1, C2, C3, H, rho,
Delta, R, Sigma) at all 100+ RIPE anchors for 24+ months. Build the first
"atlas of Internet coherence", analogous to the CAIDA AS topology atlas but
in the coherence-space.

Roadmap:
  Y1: deployment + initial dataset.
  Y2: pattern catalogue.
  Y3: longitudinal stability + seasonal effects.
  Y4: synthesis + public dataset release + WG impact.
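
Two of the Layer-1 axes D5 proposes to collect are named in this document
after standard measures: Jensen-Shannon divergence (C2) and Jaccard similarity
(C3). The sketch below shows only those textbook definitions, on made-up
inputs. How the toolkit turns a JSD value into C2, or which sets it feeds to
the Jaccard index, is not stated here, so both function names and the input
shapes (a dict of hop probabilities, a set of hop identifiers) are
assumptions.

```python
import math


def jensen_shannon_divergence(p, q):
    """JSD in bits between two discrete distributions.

    p and q map outcome -> probability (hypothetical input shape).
    With log base 2 the result lies in [0, 1].
    """
    keys = set(p) | set(q)
    # Mixture distribution m = (p + q) / 2 over the union of outcomes.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl_to_mixture(a):
        # Kullback-Leibler divergence KL(a || m); zero-probability terms drop out.
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0.0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B|; two empty sets count as identical."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0
```

Two vantages that see identical hop distributions give a divergence of 0, and
two that share no hop at all give 1; the Jaccard index runs the other way,
from 1 for identical hop sets down to 0 for disjoint ones.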

------------------------------------------------------------------------------
D6 -- Statistical mechanics of the Internet
------------------------------------------------------------------------------

Essence: develop the explicit partition function Z(beta) over Layer-1 states;
demonstrate that the seven phases identified are *physical phases* in the
Landau sense (each is a minimum of a free-energy functional with non-trivial
dependence on a control parameter). Predict critical exponents and test
against measurement.

Empirical anchor: H = -ln(C1*C2*C3) (the Hamiltonian); det(Sigma) (the phase
volume); transitions documented in 18 months of mega-day.

Roadmap:
  Y1: define beta operationally; build Z(beta).
  Y2: Landau expansion; predict critical exponents.
  Y3: empirical measurement of exponents in production data.
  Y4: synthesis + bridge with classical Internet-topology stat-mech (e.g.
      Cohen-Erez-ben-Avraham-Havlin scale-free).

------------------------------------------------------------------------------
D7 -- Tomographic inversion of the Internet routing function
------------------------------------------------------------------------------

Essence: invert MVPS Layers 1/2/3 to reconstruct the *routing function*
itself, by analogy with medical or seismic tomography. Layer 1 is the
projection; the routing function is the volume. Solve the inverse problem.

Empirical anchor: MVPS bundles are projections of a higher-dimensional
routing object; the inversion is the missing piece.

Roadmap:
  Y1: inverse-problem formulation in the language of integral geometry.
  Y2: implementation of the inverse operator on synthetic data.
  Y3: validation against ground-truth routing tables on a collaborating
      operator network.
  Y4: limitations + scale-up + publication.
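
To make D6's starting point concrete: with H = -ln(C1*C2*C3), each Boltzmann
weight exp(-beta*H) is simply (C1*C2*C3)^beta, so Z(beta) over a finite sample
of Layer-1 states is a sum of powers of the coherence product. The sketch
below works under stated assumptions: the state space is taken to be a finite
list of observed (C1, C2, C3) triples, beta is a free positive parameter
(defining it operationally is D6's own Y1 task), and all three function names
are hypothetical.

```python
import math


def hamiltonian(c1, c2, c3):
    """H = -ln(C1*C2*C3); +inf when any coherence is zero."""
    product = c1 * c2 * c3
    return math.inf if product <= 0.0 else -math.log(product)


def partition_function(states, beta):
    """Z(beta) = sum over sampled states of exp(-beta * H).

    `states` is an iterable of (C1, C2, C3) triples; treating a finite
    sample as the state space is an assumption of this sketch.
    """
    if beta <= 0.0:
        raise ValueError("beta must be positive")
    return sum(math.exp(-beta * hamiltonian(*s)) for s in states)


def free_energy(states, beta):
    """F(beta) = -ln Z(beta) / beta; +inf if every state has zero weight."""
    z = partition_function(states, beta)
    return math.inf if z == 0.0 else -math.log(z) / beta
```

A state with any coherence at zero has infinite H and contributes nothing to
Z, which is one way to read a collapsed axis as an energetically forbidden
configuration.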

------------------------------------------------------------------------------
D8 -- Internet cosmology: invariants over years
------------------------------------------------------------------------------

Essence: identify quantities that are conserved over the "geological"
time-scales of the Internet (months, years). Candidates: det(Sigma), the
lambda_2(L) growth law, the K3 base value, the entropy rate of the path
distribution.

Roadmap:
  Y1: candidate invariants + theoretical justification.
  Y2: passive observation over 12 months.
  Y3: prediction of invariants under intervention (an operator acquisition,
      a major submarine cable failure, a regulatory shift).
  Y4: synthesis as "constants of nature" of the Internet, with a
      falsifiable framework.

------------------------------------------------------------------------------
D9 -- Phase diagram of the Internet
------------------------------------------------------------------------------

Essence: map empirically the seven (or K, after M24) phases of Layer 3;
characterise the boundaries (Voronoi cells), the transition probabilities
(Markov-on-Phi), the residence times (power-law? exponential?).

Roadmap:
  Y1: dataset construction (24 months of Phi labels).
  Y2: boundary characterisation; transition matrix estimation.
  Y3: power-law residence-time analysis; ergodicity tests.
  Y4: predictive model + intervention experiments.

------------------------------------------------------------------------------
D10 -- Hyperbolic geometry of (rho, Delta) coherences
------------------------------------------------------------------------------

Essence: combine gauge fidelity (rho) and triangle defect (Delta) into a
hyperbolic embedding of the Internet that improves on the Boguna-Krioukov
H^2 embedding by using observable coherences instead of degree-correlation
proxies.

Roadmap:
  Y1: definition + theoretical motivation.
  Y2: implementation; comparison with classical hyperbolic embedding on
      CAIDA topologies.
  Y3: predictive power for routing decisions (greedy hyperbolic
      forwarding).
  Y4: deployment trial; thesis.

------------------------------------------------------------------------------
D11 -- Early-warning theory of network transitions
------------------------------------------------------------------------------

Essence: rigorous lead-time theory for tau_CSD: under what distributional
assumptions does Scheffer-style CSD provably warn T_lead time-steps before a
transition?

Roadmap:
  Y1: theoretical lead-time bounds.
  Y2: empirical validation in synthetic AR(p) / SDE simulations.
  Y3: empirical validation in Internet incidents.
  Y4: deployment + operational tuning.

------------------------------------------------------------------------------
D12 -- Information geometry of the three-layer trajectory
------------------------------------------------------------------------------

Essence: equip the 33-dimensional Layer space with a Fisher information
metric, study geodesics, develop a "thermodynamic" formulation a la Crooks
fluctuation theorem.

Roadmap:
  Y1: Fisher metric on Layer 1; extend to Layer 2.
  Y2: geodesic equation; numerical solver.
  Y3: empirical geodesic tracking during incidents.
  Y4: extension to Layer 3 (discrete-continuous hybrid).

------------------------------------------------------------------------------
D13 -- Quantum-like uncertainty bounds in MVPS observables
------------------------------------------------------------------------------

Essence: extend Heisenberg-Gabor to other conjugate pairs of observables in
MVPS. Are there non-commuting "measurement operators" in Internet
observation? (Speculative but well-defined in the right formalism.)

Roadmap:
  Y1: identify candidate conjugate pairs.
  Y2: derive bounds; test empirically.
  Y3: implications for sampling protocol design.
  Y4: synthesis.
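
The Heisenberg-Gabor floor that D13 starts from, sigma_t * sigma_f >=
1/(4*pi), can be checked numerically on a sampled signal. The sketch below is
a minimal pure-Python version under stated assumptions: the spreads are the
standard deviations of |x(t)|^2 and |X(f)|^2 treated as weights, f is in
cycles per unit time, and the spectrum comes from a naive O(n^2) DFT. The
function names are hypothetical and unrelated to the toolkit's
uncertainty_bound. A Gaussian pulse is used because it is the textbook case
that reaches the floor.

```python
import cmath
import math


def weighted_std(values, weights):
    """Standard deviation of `values` under non-negative `weights`."""
    total = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return math.sqrt(var)


def time_frequency_spreads(samples, dt):
    """Return (sigma_t, sigma_f) for a uniformly sampled signal."""
    n = len(samples)
    times = [k * dt for k in range(n)]
    sigma_t = weighted_std(times, [abs(x) ** 2 for x in samples])
    # Naive discrete Fourier transform; fine for a few hundred samples.
    spectrum = [
        sum(x * cmath.exp(-2j * math.pi * m * k / n) for k, x in enumerate(samples))
        for m in range(n)
    ]
    # Map DFT bin m to a signed frequency in cycles per unit time.
    freqs = [(m if m < n // 2 else m - n) / (n * dt) for m in range(n)]
    sigma_f = weighted_std(freqs, [abs(c) ** 2 for c in spectrum])
    return sigma_t, sigma_f


# Gaussian pulse whose energy density |x|^2 has standard deviation s.
n, dt, s = 256, 0.1, 1.0
gaussian = [math.exp(-(((k - n / 2) * dt) ** 2) / (4 * s * s)) for k in range(n)]
sigma_t, sigma_f = time_frequency_spreads(gaussian, dt)
gabor_floor = 1.0 / (4.0 * math.pi)
```

Running the same function on a sampled MVPS time series with its own dt and
window gives a product to compare against gabor_floor, which is the
comparison M11 and T2 ask about.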

------------------------------------------------------------------------------
D14 -- Renormalisation group flow on Internet coherences
------------------------------------------------------------------------------

Essence: define a coarse-graining procedure on the Internet graph; study how
Layer-1 coherences flow under coarse-graining; identify fixed points
(universality classes).

Roadmap:
  Y1: define block-spin-like coarse-graining on routers / ASes / regions.
  Y2: numerical RG on synthetic topologies.
  Y3: empirical RG on real CAIDA topology + Layer 1.
  Y4: classify fixed points; publish an "Internet universality class"
      catalogue.

------------------------------------------------------------------------------
D15 -- Causal inference on multi-vantage observations
------------------------------------------------------------------------------

Essence: bring Pearl's do-calculus and counterfactual reasoning to MVPS.
Given Layer 1/2/3 trajectories, infer causal graphs over hops, ASes, regions.
This is much stronger than correlational detection (which is what Layer 1
already does).

Roadmap:
  Y1: causal-discovery algorithms over MVPS series.
  Y2: validation on simulated networks with known ground truth.
  Y3: validation on a controlled testbed with known interventions.
  Y4: deployment + thesis.

------------------------------------------------------------------------------
D16 -- Adversarial robustness theory
------------------------------------------------------------------------------

Essence: formalise the adversary models for each of the 33 scalars. Compute
lower bounds on attack cost to fool any of them. Build a provably-robust
subset (similar in spirit to certified defences in adversarial ML).

Roadmap:
  Y1: adversary models + cost lower bounds.
  Y2: defences per layer.
  Y3: empirical red-teaming.
  Y4: certified-robust Profile v2.

------------------------------------------------------------------------------
D17 -- Game theory of measurement attack vs defence
------------------------------------------------------------------------------

Essence: in which phase Phi_K does an attacker prefer to live to remain
invisible to detection? Equilibrium analysis of attacker-detector games on
the seven-phase diagram.

Roadmap:
  Y1: game-theoretic formulation.
  Y2: equilibrium analysis.
  Y3: empirical validation on RIPE Atlas with synthetic adversaries.
  Y4: deployment guidance + thesis.

------------------------------------------------------------------------------
D18 -- Information-theoretic minimum bundle (MVPS compression)
------------------------------------------------------------------------------

Essence: what is the smallest bundle (in bits) that still reconstructs
Layer 1, 2, 3 to within epsilon? Establish a rate-distortion theory for MVPS
observability.

Roadmap:
  Y1: rate-distortion formulation.
  Y2: optimal codebook construction.
  Y3: empirical evaluation across diverse production datasets.
  Y4: standards-track compressed-MVPS I-D.

------------------------------------------------------------------------------
D19 -- Federated MVPS production without mutual trust
------------------------------------------------------------------------------

Essence: how do N operators produce a joint MVPS bundle whose Layer 1/2/3
invariants are preserved, without any operator trusting the others? Connects
D2 (provenance) with the three layers (the invariants).

Roadmap:
  Y1: trust model + invariant preservation requirements.
  Y2: protocol design + formal verification.
  Y3: testbed deployment.
  Y4: standardisation.

------------------------------------------------------------------------------
D20 -- Differential-privacy MVPS bundles
------------------------------------------------------------------------------

Essence: a privacy-preserving MVPS bundle where Layer 1/2/3 invariants remain
measurable while no single hop can be identified. Calibrate the trade-off
epsilon (privacy) vs Layer-1 fidelity.

Roadmap:
  Y1: differential privacy formulation for path data.
  Y2: noise design; epsilon-fidelity Pareto front.
  Y3: deployment trial.
  Y4: standardisation + thesis.

==============================================================================
13. Post-doctoral-level scope (P1 .. P7)
==============================================================================

Each P is a 2-to-4 year focused programme, post-PhD. Each is sized for a
single FTE with at most one PhD student. P-level themes are the
consolidations -- the moves from "interesting result" to "transferable
scientific instrument".

P1 -- Internet Phase Diagram, longitudinal (24 months)
  Build the public, peer-reviewed, ground-truth phase diagram Phi(t) for the
  global Internet from 100+ RIPE anchors.
  Deliverable: the Internet Phase Atlas (data + tooling), comparable in role
  to CAIDA's AS-topology atlas.

P2 -- Public coherence dataset and DOI'd release pipeline
  Operate a continuously-updated, DOI'd, peer-reviewed Layer 1/2/3 dataset,
  with reproducible-research badges.
  Deliverable: a Zenodo / CAIDA-class public dataset, used by other groups
  as benchmark.

P3 -- Predict-and-prevent NOC framework in production
  Integrate tau_CSD / tau_dphi / tau_OU into a real NOC at a
  research-and-education network (e.g. Belnet, RNP, GEANT, Internet2).
  Measure operational outcome.
  Deliverable: a peer-reviewed operations paper; an open reference
  deployment.

P4 -- IETF standardisation of MVPS Profile v2 (with Layers)
  Carry the I-D from -01 to RFC, with WG adoption, interim hackathons, and
  at least three independent reference implementations passing the
  conformance corpus.
  Deliverable: an RFC.

P5 -- Cross-framework comparison: MVPS vs commercial alternatives
  Head-to-head empirical comparison with Catchpoint, ThousandEyes, Kentik,
  NetMon, in terms of incident detection accuracy and lead time, under
  blinded controlled protocol.
  Deliverable: a public benchmark; a paper at a leading network-measurement
  venue.

P6 -- Multi-scale Phi (second, minute, hour, day, week, month)
  Generalise Layer 3 to operate at six time-scales simultaneously. Build the
  inter-scale coupling theory.
  Deliverable: the first "renormalised regime detector", with formal proof
  of consistency.

P7 -- Layer 4: the meta-regime
  Investigate whether a fourth layer is required to capture the
  *socio-technical* dynamics above Phi (operator acquisitions, regulation,
  geopolitics of routing). Speculative; falsifiable.
  Deliverable: either a Layer 4 proposal or a formal impossibility argument.

==============================================================================
14. Theoretical pendencies (open mathematical questions)
==============================================================================

These are not theses; they are individual questions any of the above theses
may address.

T1   det(Sigma) is conjectured to be a near-invariant under network
     equilibrium. Provide a proof or a refuting empirical counterexample at
     scale.
T2   The Heisenberg-Gabor bound 1/(4*pi) on (sigma_t, sigma_f) -- under
     which sampling protocols is it tight in MVPS?
T3   Voronoi-cell boundaries in (C1, C2, C3) space: are they algebraic
     varieties, smooth manifolds, or fractals?
T4   Triangle defect Delta vs Ollivier-Ricci curvature: are they reducible
     to one another in the Internet graph?
T5   Layer-1 completeness: which observable properties cannot be expressed
     as functions of (C1, C2, C3) + hidden observables?
T6   How much history (h ticks) is necessary and sufficient for each
     Layer-2 axis to be estimable to relative error epsilon?
T7   Universality of Scheffer CSD beyond ecosystems: rigorous proof of
     universal precursor in BGP-routed graphs.
T8   The strict mathematical relation between K3 (Hamilton stability) and
     tau_CSD (Scheffer): are they equivalent observables under a change of
     basis?
T9   Fluctuation-dissipation theorem for C(t): if C drifts under external
     "force" F, is dC/dF = ?
T10  Is there a variational principle (Lagrangian density) whose
     Euler-Lagrange equations produce H = -ln(C1 * C2 * C3) as the
     canonical Hamiltonian?
T11  Categorical formulation: are MVPS bundles a sheaf on the site of
     routing intentions? Does Layer 1 emerge from a limit construction?
T12  Algebraic K-theory of canonical-form profiles: is the lattice of
     Profile v1 -> v1.x -> v2 a K_0?
T13  Define Layer-1 entropy production and prove a Second Law
     (one-direction inequality) on its evolution.
T14  Existence theorem: prove that for any K, there is a vantage-point
     cardinality N(K) sufficient to make all Phi_K distinguishable with
     confidence >= 95%.
T15  Decoupling theorem: under what conditions are Layer 1 and Layer 2
     statistically independent?

==============================================================================
15. Empirical pendencies (open measurement studies at scale)
==============================================================================

E1   Validate F001..F019 against 100+ independent real implementations of
     MVPS-bundle producers.
E2   Calibrate _FAILURE_CENTROIDS on 12+ months of RIPE Atlas data with the
     new Layer-1 v1 definitions.
E3   Build a public, peer-reviewed dataset of G (equilibrium sampling
     cadence) across diverse networks.
E4   Mixed IPv4/IPv6 traceroute Layer 1 robustness study.
E5   Direct precision/recall comparison: MVPS Layer 1/2/3 vs classical
     traceroute, on a curated incident catalogue.
E6   Cross-AS measurement of reciprocity violation R: where in the Internet
     is R chronically > 0.5?
E7   Geographic mapping of gauge-gap regime (rho = 0).
E8   Operational study: how much warning does tau_CSD provide for real,
     ticketed BGP / capacity incidents?
E9   First Birkhoff ergodicity test on Internet path data, at scale.
E10  Cross-validation: MVPS Layer 1 vs reverse-traceroute output on the
     same vantage-destination pairs.
E11  Diurnal / weekly / annual cycles of the 33 Layer scalars (where
     present?).
E12  Catastrophic-event signature catalogue: cable cuts, BGP leaks, DDoS,
     route hijacks -- each labelled in (Phi, tau, omega).
E13  Cross-CDN diversity: T3 measured from a common vantage set to a common
     destination set.
E14  IPv6 zone-identifier behaviour on real Atlas measurements (F012
     deepening).
E15  Industrial network case study: enterprise WAN deployed with Layer
     1/2/3 telemetry for 90 days.

==============================================================================
16. Engineering pendencies (open software / infrastructure work)
==============================================================================

S1   Formal verification of Profile v1 in Coq / Lean / Isabelle.
S2   Reference implementations of Profile v1 in Go, Rust, TypeScript,
     OCaml, C, Haskell (six independent languages).
S3   Free public conformance gate (GitHub Actions, similar to
     web-platform-tests for browsers).
S4   eBPF in-kernel probe that emits MVPS bundles natively.
S5   Hardware probe (Raspberry Pi or similar) that produces bundles
     autonomously, USB-connected.
S6   RIPE Atlas adapter (production-grade) -- M2 extended.
S7   CAIDA Ark adapter.
S8   BMP adapter (M31).
S9   BGP-LS adapter.
S10  YANG, CDDL, CBOR alternative wire formats (M32).
S11  Wireshark dissector for MVPS bundle (assist incident response).
S12  Grafana dashboard plug-in: live Layer 1/2/3 dashboards consuming MVPS
     REST API.
S13  Public REST gateway: api.catellix.com/mvps/layers/1|2|3, rate-limited,
     OpenAPI-described.
S14  K8s operator: "MVPS-as-a-Service" deployment in any CNCF-aware
     cluster.
S15  Differential privacy noise injector (companion to D20).

==============================================================================
17. Governance / process pendencies
==============================================================================

G1   IETF adoption of draft-melegassi-ippm-mvps-bundle by IPPM.
G2   Submission and pursuit of a Profile v2 normative document that
     includes Layers 1/2/3 as normative annexes.
G3   Bring up a working group dedicated to MVPS (or join an existing
     close-relative WG); current "ippm" is the right home for the core;
     "tmrg", "nmrg", "core" are adjacent options.
G4   Erratum coordination with adjacent RFCs (5952, 8785, etc.) where
     F-findings interact.
G5   SIGCOMM / SIGOPS / IMC tutorial track: workshop the three layers;
     build community.
G6   Educational: undergraduate curriculum module ("Reproducible Internet
     measurement"), with the Discovery Lab as the lab component.
G7   Compliance: align Profile v1 with relevant data-protection regulations
     (GDPR, LGPD) for cross-border telemetry.
G8   Industry consortium: invite Belnet, RNP, GEANT, Internet2 to co-author
     a deployment guide for research-and-education networks.
G9   Open-data licence: clarify Creative-Commons-Zero release for the
     conformance corpus and the test vectors.
G10  Patent disclosure / hygiene: file a defensive publication on the
     three-layer architecture to keep it permanently in the public domain.

==============================================================================
18. Cross-reference matrix: anchor -> entry points
==============================================================================

A Master's or PhD student arriving at the toolkit should be able to find an
entry point starting from any concrete object.

Object                              Master      Doctoral
----------------------------------  ----------  ----------------
F001 (IPv6 canonical)               M1          D1
F009 (HMAC scheme)                  M3          D2
F010 (path_fingerprint MAC bug)     M3          D2
F015 (bundle algebra)               M3          D2
F019 (canonicaliser vs validator)   M2, M3      D1, D2
C1 Einstein causal axis             M4, M11     D5, D6, D8
C2 JSD informational axis           M5          D5, D8, D10
C3 Jaccard topological axis         M10, M19    D5, D10, D14
rho = C2/C3 (gauge fidelity)        M7          D10
Delta (triangle defect)             M9          D10, D14
R (reciprocity violation)           M8          D7, D15
Sigma (residual covariance)         M10         D6, D8, D12
Hamiltonian H                       M18         D6, D8, D12
D1 Lipschitz                        M14         D11
D3 ergodicity                       M15         D7, D14
K1 Kirchhoff (flow conservation)    M16         D5, D6, D7
K2 Clausius (entropy)               M17         D6, D8, D14
K3 Hamilton (action stability)      M18         D6, D8, D11, D12
T1 anti-SPOF                        M20         D5, D17, D19
T2 Fiedler                          M19         D5, D9, D14, D17
T3 Menger                           M20         D5, D7, D19
Phi (phase label)                   M23, M29    D9, D13, D17
tau_CSD                             M25         D6, D11
tau_dphi                            M27         D9, D11
tau_OU                              M26         D11, D12
omega_H / omega_M / omega_C         M28         D9, D11, D17
Profile v1 / v1.x                   M1, M21     D1
HMAC / signature                    M3          D2, D19
Differential testing harness        M2          D3, D16

==============================================================================
19. Pedagogical sequence: from undergraduate to doctorate
==============================================================================

For a department considering building a curriculum around MVPS:

Undergraduate (1 semester, 60h):
  "Reproducible Internet measurement" -- Discovery Lab as teaching kit.
  Outcome: students reproduce F001..F019, write a 5-page report on one
  finding.

Master's foundation (1 semester, 120h):
  Layer 1 (C1, C2, C3) + Hamiltonian.
  Outcome: students implement a minimal Layer 1 in a language of their
  choice and validate against the canonical scenarios.

Master's specialisation (one of M1..M33, 750h, ~1 academic year):
  As detailed above.

Doctoral programme (one of D1..D20, ~4 years):
  As detailed above.

Post-doctoral programme (one of P1..P7, 2-4 years):
  As detailed above.

The same conformance corpus, the same code, the same documentation, and the
same draft serve as substrate at every level. The educational ramp is the
same as the research ramp -- which is unusual, and intentional.

==============================================================================
20. How to choose, expanded
==============================================================================

The original D1/D2/D3 split was "by lever" (canonical names vs signatures vs
method). The expansion adds a second dimension: "by mathematical layer". A
student now picks both: which layer (1, 2, 3, or cross) and which lever. A
representative grid (not exhaustive):

              Canonicalisation   Provenance       Method
              (D1 lineage)       (D2 lineage)     (D3 lineage)
Layer 1       D5, D10, D14       D16, D19, D20    D11, D12, D13
Layer 2       D5, D6, D7         D17, D19         D14, D16
Layer 3       D9                 D17, D19         D11
Cross-layer   D4, D5             D19, D20         D4, D18

Any one of these cells is a four-year programme. Cells without an entry are
intentional negative space -- not every (layer, lever) combination is a
viable doctoral programme.

==============================================================================
End of expansion. Document status: v3, 2026-05-20, Andradina.

Companion to MASTER-THESIS-ENTRY-POINTS.md and to the mathematical-layer
documents in this thesis-kit:

  CAMADA1_UNIFICADA_PASSO_A_PASSO.md
  CAMADA1_DESCOBERTAS_OCULTAS.md
  CAMADA2_DINAMICA_CONSERVACAO_ESTRUTURA.md
  CAMADA3_FASE_E_TRANSICAO.md

No commitment implied. Open only if doctoral scope is what you are looking
for.
==============================================================================