Dominant-Harm Severity Scoring Engine for AI Incidents

Defensive Publication — Companion record to ODA3 IP Filing Doc #9

Document type	Defensive Publication (Prior Art Disclosure)
Author / Owner	ODA3 Pvt Ltd (ODA³ Institute), Bihar, India
Related framework component	UAIF v1.0 — Severity Scoring Engine, §4.2–4.9, Appendix H (ODA3-2026-06-TCR-STD-002)
Canonical URL	https://oda3.org/disclosures/severity-scoring-engine · https://www.tdcommons.org/dpubs_series/10504/
Contact	contact@oda3.org · https://oda3.org

Independent record notice: This page is published on oda3.org as an independent, separately dated record of the disclosure below. The same disclosure text is also filed on Technical Disclosure Commons (tdcommons.org) as a companion defensive publication. Maintaining both records provides two independent, publicly verifiable timestamps for the same prior art.

Abstract

A severity scoring system and method for classifying artificial intelligence (AI) security incidents. The system computes a severity score using a dominant-harm anchor: the highest-weighted harm dimension present in an incident is taken as the base score, and all other present harm dimensions contribute only half their summed weight, capped at a fixed maximum. This prevents minor cumulative harms from arithmetically equalling a single severe harm. Situational context factors (attack-vector-specific) are added on top of the harm-anchored base score using a diminishing-contribution model in which each successive context, ordered by descending base value, contributes half the weight of the previous context, subject to both a per-factor cap and a hard aggregate cap. The resulting base score is first adjusted by a temporal/confidence factor that can only reduce, never inflate, the score, based on attribution confidence, investigation status, and evidence quality. The adjusted score is then scaled by an impact factor derived from the maximum (not product) of a population-affected multiplier and a blast-radius multiplier. The final presentation score maps to a five-level severity matrix and feeds a regulatory-trigger function that replaces binary pass/fail thresholds with a confidence-banded determination (HIGH/MEDIUM/LOW), tied to calibration status, that mandates human review when confidence is low. This disclosure is published to establish prior art and prevent patent claims on these methods by any party.

Technical Field

This disclosure relates to systems and methods for classifying cybersecurity and safety incidents in artificial intelligence systems, and more specifically to severity scoring engines that anchor on a single dominant harm dimension, apply diminishing-contribution context modelling, and generate confidence-banded (rather than binary) regulatory-notification triggers.

Background

Prior art systems for incident severity scoring (e.g., generic vulnerability scoring frameworks) typically sum or otherwise combine all applicable harm/impact dimensions without limit, which allows a large number of low-severity factors to arithmetically accumulate to the same score as a single catastrophic factor. Such systems also typically apply binary score thresholds for regulatory-notification triggers, which creates false precision when underlying weights are provisional or uncalibrated — for example, a score of 6.5 triggering a regulatory obligation while 6.4 does not, despite both being subject to the same calibration uncertainty. There exists a need for a severity scoring engine that (a) anchors on dominant harm rather than unconstrained summation, (b) bounds the influence of secondary harms and situational context factors through diminishing contribution, and (c) replaces binary regulatory thresholds with a confidence-banded determination that accounts for the calibration status of the underlying weights.

Detailed Description

1. Harm Weight Architecture

The system defines eight harm dimensions, each with a default weight derived from a structured, documented, multi-stage elicitation process (literature review of a documented incident corpus, followed by a structured Delphi expert-elicitation workshop with multiple anonymous rating rounds targeting a minimum inter-rater reliability threshold). Default weights are explicitly marked as provisional pending sector-specific empirical validation, and every severity computation using default weights carries a machine-readable provisional-weight notice. Each harm dimension’s weight reflects its relative severity ordering as established by the elicitation process (highest-consensus harms receive materially higher weights than lower-consensus or typically-secondary harms).

2. Dominant-Harm Anchor with Secondary Cap

Given a set of harms present in an incident, the system identifies the harm dimension with the single highest default weight among those present (“dominant harm”) and uses its weight as the anchor value. The weights of all other present harm dimensions are summed (“others_sum”) and added to the anchor at half value, subject to a fixed ceiling of 2.0 on the secondary contribution (i.e., adjusted_base = dominant_weight + min(2.0, 0.5 × others_sum), with the result rounded to one decimal place). This prevents an unbounded number of minor secondary harms from accumulating to equal or exceed a single severe harm, while still preserving differentiation for multi-harm incidents.
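The anchor formula above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the harm-dimension names, and the weights in the usage note are hypothetical, since this disclosure does not publish the default weight values. Only the formula itself (dominant weight plus half the summed secondary weights, capped at 2.0, rounded to one decimal place) is taken from the text.

```python
def adjusted_base(present_weights: dict[str, float], secondary_cap: float = 2.0) -> float:
    """Dominant-harm anchor plus a capped half-value secondary contribution.

    present_weights maps each harm dimension present in the incident to its
    weight. Dimension names and weights are supplied by the caller; none are
    defined by this sketch.
    """
    if not present_weights:
        return 0.0  # assumption: an incident with no harms scores zero
    dominant = max(present_weights, key=present_weights.get)
    others_sum = sum(w for name, w in present_weights.items() if name != dominant)
    secondary = min(secondary_cap, 0.5 * others_sum)
    return round(present_weights[dominant] + secondary, 1)
```

With hypothetical weights of 6.0, 2.0, and 1.0, the secondary contribution is 0.5 × 3.0 = 1.5 and the adjusted base is 7.5. Adding further secondary harms can raise the secondary contribution only as far as 2.0, so the same dominant harm can never score above 8.0 at this stage.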

3. Diminishing-Contribution Context Model

Situational context factors (e.g., attack-vector-specific conditions present in an AI security incident) each carry a base value. To compute their aggregate contribution to severity: the present context factors are sorted by descending base value; each context’s value is first capped at a fixed per-factor ceiling; the first (highest-value) context contributes its full capped value; each subsequent context contributes its capped value multiplied by a contribution factor that itself halves with each step (i.e., 1.0, 0.5, 0.25, …); contributions accumulate until a hard aggregate ceiling is reached, at which point accumulation stops. All context factors considered — including those whose contribution was diminished — are recorded in a full audit trail showing base value, contribution factor applied, and effective contribution, regardless of their effect on the final score. This preserves complete analytical traceability for multi-vector incidents while bounding the score impact of an arbitrary number of co-occurring contextual factors.
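A minimal Python sketch of this aggregation follows. The cap values (1.0 per factor, 1.5 aggregate), the names, and the record layout are assumptions made for illustration. The text also does not say how a contribution that would overshoot the aggregate ceiling is handled, so this sketch assumes it is clamped to the remaining headroom and that later factors are still logged with an effective contribution of zero.

```python
from dataclasses import dataclass


@dataclass
class ContextAuditEntry:
    name: str
    base_value: float
    contribution_factor: float
    effective_contribution: float


def context_contribution(
    factors: dict[str, float],
    per_factor_cap: float = 1.0,   # hypothetical ceiling
    aggregate_cap: float = 1.5,    # hypothetical ceiling
) -> tuple[float, list[ContextAuditEntry]]:
    """Aggregate context factors with halving contribution factors (1.0, 0.5, 0.25, ...)."""
    total = 0.0
    factor = 1.0
    trail: list[ContextAuditEntry] = []
    # Highest base value first, so the strongest context gets full weight.
    for name, base in sorted(factors.items(), key=lambda kv: kv[1], reverse=True):
        capped = min(base, per_factor_cap)
        headroom = max(aggregate_cap - total, 0.0)
        effective = min(capped * factor, headroom)  # assumption: clamp at the ceiling
        total += effective
        # Every factor is recorded, even when it adds nothing to the score.
        trail.append(ContextAuditEntry(name, base, factor, effective))
        factor *= 0.5
    return total, trail
```

For hypothetical base values of 1.5, 1.0, and 0.5, the first factor is capped to 1.0, the second adds 1.0 × 0.5 = 0.5 and reaches the aggregate ceiling of 1.5, and the third is logged with contribution factor 0.25 and an effective contribution of zero.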

4. Impact Scaling via Maximum (Not Product) of Independent Scale Dimensions

The system derives an impact-scale multiplier from two independent scale dimensions — population affected and blast radius (organizational/infrastructural scope) — each mapped to a multiplier via a defined lookup table. The impact-scale multiplier is computed as the maximum of the two dimension-specific multipliers (not their product), subject to a fixed overall ceiling, on the basis that population size and organizational blast radius are correlated proxies for the same underlying scale concept and should not be allowed to compound multiplicatively.
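The maximum-not-product rule can be illustrated as below. The lookup tables, their category names, and the 1.4 ceiling are hypothetical placeholders; the disclosure states only that each dimension maps to a multiplier through a defined table and that the result is subject to a fixed ceiling.

```python
# Hypothetical lookup tables; the actual categories and multipliers are not
# given in this disclosure.
POPULATION_MULTIPLIER = {"individual": 1.0, "group": 1.1, "large": 1.25, "mass": 1.5}
BLAST_RADIUS_MULTIPLIER = {"single_system": 1.0, "business_unit": 1.1, "organization": 1.3}
IMPACT_CEILING = 1.4  # hypothetical overall ceiling


def impact_multiplier(population: str, blast_radius: str) -> float:
    """Take the larger of the two scale multipliers, then apply the ceiling."""
    larger = max(POPULATION_MULTIPLIER[population], BLAST_RADIUS_MULTIPLIER[blast_radius])
    return min(IMPACT_CEILING, larger)
```

Under these placeholder tables, a "large" population (1.25) combined with an "organization" blast radius (1.3) yields 1.3, where a product rule would have yielded 1.625.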

5. Confidence-Based Temporal Adjustment (Inverted, Reduction-Only)

The system computes a confidence factor as the average of three independently-scored inputs (attribution confidence, investigation status, evidence quality), each scored on a defined low/medium/high scale. This confidence factor is then used to compute a temporal reduction (proportional to 1 minus the confidence factor, subject to a maximum reduction ceiling) that is applied to reduce the harm-and-context base score before impact scaling. Critically, the adjustment is one-directional: low confidence can only reduce the presented severity score, never inflate it, preventing unverified or low-evidence incidents from being artificially escalated while still permitting full scoring once confidence is established.
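The reduction-only behaviour can be sketched as follows. Several details are assumptions, because the text fixes only the structure: the numeric values for low/medium/high, the proportionality rate of 0.5, the 0.3 reduction ceiling, and the choice to apply the reduction as a fraction of the base score rather than as an absolute subtraction.

```python
# Hypothetical numeric scale for the three-level inputs.
LEVEL_SCORE = {"low": 0.0, "medium": 0.5, "high": 1.0}


def confidence_factor(attribution: str, investigation: str, evidence: str) -> float:
    """Average of the three independently scored inputs."""
    return (LEVEL_SCORE[attribution] + LEVEL_SCORE[investigation] + LEVEL_SCORE[evidence]) / 3


def temporally_adjusted(
    base_score: float,
    confidence: float,
    rate: float = 0.5,           # hypothetical proportionality constant
    max_reduction: float = 0.3,  # hypothetical reduction ceiling
) -> float:
    """Reduce the harm-and-context base score; the result never exceeds the input."""
    reduction = min(max_reduction, rate * (1.0 - confidence))
    # Assumption: the reduction is applied as a fraction of the base score.
    return round(base_score * (1.0 - reduction), 1)
```

Because the reduction is bounded between zero and the ceiling, full confidence leaves the score unchanged and every lower confidence level can only lower it. With these placeholder constants, a base score of 8.0 stays at 8.0 when all three inputs are high and drops to 5.6 when all three are low.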

6. Five-Level Severity Matrix

The final presentation score (harm-and-context base, temporally adjusted, then impact-scaled) maps deterministically to a five-level ordinal severity matrix, with each level associated with an indicative harm threshold and an indicative regulatory-reporting proxy. A separate, non-presentation analytical score is retained for differentiation among incidents that cluster at the maximum of the presentation scale.
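A sketch of the mapping is given below. The level names, the level boundaries, and the presentation-scale maximum of 10.0 are all hypothetical, and the analytical score is assumed here to be simply the uncapped product; the disclosure specifies only that there are five ordinal levels and that a separate analytical score is kept.

```python
from bisect import bisect_right

# Hypothetical level boundaries and names on an assumed 0-10 presentation scale.
LEVEL_BOUNDS = [2.0, 4.0, 6.0, 8.0]
LEVEL_NAMES = ["Informational", "Low", "Medium", "High", "Critical"]


def final_scores(adjusted_base: float, impact: float, scale_max: float = 10.0) -> tuple[float, float]:
    """Return (presentation_score, analytical_score).

    The analytical score is left uncapped so that incidents that saturate
    the presentation scale can still be ranked against each other.
    """
    analytical = adjusted_base * impact
    presentation = round(min(scale_max, analytical), 1)
    return presentation, analytical


def severity_level(presentation_score: float) -> str:
    """Deterministic lookup: a score on a boundary falls into the higher level."""
    return LEVEL_NAMES[bisect_right(LEVEL_BOUNDS, presentation_score)]
```

Two incidents with analytical scores of 12.0 and 10.5 would both present as 10.0 at the top level, yet remain distinguishable through the retained analytical score.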

7. Calibration-Aware Regulatory Trigger with Confidence Bands

Rather than a binary score-threshold determination, the system computes a regulatory-trigger confidence band as a function of: the distance between the presentation score and a configurable jurisdiction-specific threshold; the calibration status of the weights used (e.g., provisional vs. sector-validated vs. custom); and a set of jurisdiction/sector enrichment flags (e.g., sensitive data, critical sector, vulnerable population involved). The function returns a tri-level confidence classification (HIGH/MEDIUM/LOW) and a boolean trigger that is only set under defined combinations of base-threshold-crossing and confidence level (with enrichment flags able to elevate a MEDIUM-confidence case to triggering). Whenever confidence is LOW, the system mandates human review rather than relying on the automated determination. This confidence-banding approach is designed specifically to avoid the false precision of a single binary score cutoff when the underlying harm weights are provisional.
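One possible shape for such a trigger function is sketched below. The band widths per calibration status, the rule that a score closer to the threshold than half a band is LOW, and the exact triggering combinations are assumptions chosen for illustration; the disclosure describes the inputs and outputs but not these constants.

```python
from typing import NamedTuple

# Hypothetical uncertainty band per calibration status: the less validated the
# weights, the further a score must sit from the threshold to be trusted.
BAND_WIDTH = {"provisional": 1.0, "custom": 0.5, "sector_validated": 0.25}


class TriggerResult(NamedTuple):
    confidence: str              # "HIGH", "MEDIUM", or "LOW"
    triggered: bool
    human_review_required: bool


def regulatory_trigger(
    score: float,
    threshold: float,
    calibration_status: str,
    enrichment_flags: frozenset[str] = frozenset(),
) -> TriggerResult:
    width = BAND_WIDTH[calibration_status]
    margin = abs(score - threshold)
    if margin >= width:
        confidence = "HIGH"
    elif margin >= width / 2:
        confidence = "MEDIUM"
    else:
        confidence = "LOW"
    crossed = score >= threshold
    # Assumption: HIGH confidence triggers on crossing alone; MEDIUM triggers
    # only when at least one enrichment flag is present; LOW never auto-triggers.
    triggered = crossed and (
        confidence == "HIGH" or (confidence == "MEDIUM" and bool(enrichment_flags))
    )
    return TriggerResult(confidence, triggered, confidence == "LOW")
```

Under these assumptions, with a threshold of 6.5 and provisional weights, scores of 6.4 and 6.5 both fall in the LOW band and are routed to human review, instead of one triggering a notification and the other not. A score of 7.0 is MEDIUM and triggers only if a flag such as sensitive data is set.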

Prior Art Coverage

This disclosure covers: (1) dominant-harm-anchored severity scoring in which a single highest-weight present harm dimension forms the base score and secondary harms contribute only a capped fraction of their summed weight; (2) diminishing-contribution aggregation of situational/contextual severity factors in which each successive factor (ordered by descending base value) contributes a geometrically-diminishing fraction of its value, subject to per-factor and aggregate caps, with full audit-trail retention of all considered factors; (3) impact scaling derived from the maximum (rather than product) of independent scale-dimension multipliers; (4) one-directional (reduction-only) confidence/temporal adjustment of a severity score based on attribution confidence, investigation status, and evidence quality; (5) calibration-status-aware regulatory-trigger determination using a confidence-band classification (rather than a binary threshold) with mandatory human-review escalation on low confidence; (6) any AI incident classification system implementing any combination of the above.

Publication date: June 17, 2026

Publisher: ODA3 Pvt Ltd (ODA³ Institute)

Jurisdiction of publication: India

Companion filing: Technical Disclosure Commons (tdcommons.org), Defensive Publications Series

Document reference: ODA3 IP Filing Doc #9

Dominant-Harm-Anchored Severity Scoring Engine for AI Incident Classification with Diminishing-Contribution Context Modeling and Calibration-Aware Regulatory Trigger Confidence Bands