Evaluation Overview — Axiom Cortex™ (No Public API)

What it is: Axiom Cortex™ is a proprietary, internal evaluation service inside the Nearshore IT Co-Pilot™ platform. There is no public API. All evaluations run inside our platform to align and predict talent performance using 44 psychometric + NLP signals on top of LLMs with semantic chunking in RAG and staged/multi-step prompting. We apply language-fairness calibration (judge ideas and reasoning, not accent/phrasing) and perform expert review of flags before roll-up. Results map to BARS (Behaviorally Anchored Rating Scales = ratings tied to observable behaviors).


Inputs (plain English)


Method core


The 44 Signals (executive view)

A. Alignment (5)
1) Problem–schema match • 2) Stack coverage • 3) Complexity bandwidth • 4) Domain transfer • 5) Requirements fidelity

B. Analysis (7)
6) Chain integrity • 7) Error-mode awareness • 8) Counterfactuals • 9) Causal linkage • 10) Evidence binding • 11) Numeracy checks • 12) Ambiguity handling

C. Synthesis (6)
13) Composability • 14) Architectural tradeoffs • 15) Edge-case coverage • 16) Observability planning • 17) Dependency hygiene • 18) Migration/rollback planning

D. Code Quality (6)
19) Semantic diff quality • 20) Static risks • 21) Runtime realism • 22) Test intent/coverage • 23) Debug vectoring • 24) Complexity control

E. Communication & Collaboration (6)
25) Instruction uptake • 26) Audience targeting • 27) Decision log clarity • 28) Review acumen • 29) Ticket hygiene • 30) Documentation atomics

F. Systems & Ops (5)
31) Reliability thinking (SLOs) • 32) Cost–performance • 33) Release safety • 34) Infra-as-code • 35) Telemetry interpretation

G. Risk & Security (5)
36) Sensitive-data handling • 37) License hygiene • 38) Supply-chain care • 39) Threat-modeling reflex • 40) Secrets/access discipline

H. Robustness & Agreement (4)
41) Adversarial resistance • 42) Cross-model agreement • 43) Self-consistency • 44) Hallucination index


Processing & formulas (transparent math, plain English)

We keep formulas readable here and reserve proofs/derivations for Scientific Foundations.

1) Normalization (z-scores)
Each raw signal (s_i) is normalized within role/level cohorts:
z_i = (s_i − μ_role,level) / σ_role,level (robust μ/σ with outlier guards).
This keeps scores comparable across roles and seniorities.

2) L2-aware weighting
We decompose communication into semantic vs. form carriers with weights (\alpha) and (\beta):
Score_comm = α·SemContent + β·Form, with β → 0 as L2 uncertainty rises (estimated from stable markers).
Meaning: grammar/fluency noise is down-weighted; ambiguous content is still penalized.

3) Cross-lingual semantic fidelity (FSD)
We compare answer embeddings to an Ideal Answer Blueprint distribution via a Fréchet-style distance.
Lower FSD ⇒ closer to the target concept even with Spanish-influenced English.

4) Optimal transport (W2) with code-switch mask
Token alignment uses W2 distance with neutral cost for common bilingual markers (e.g., “pues”, “o sea”).
This prevents code-switch tokens from inflating “distance” when the substantive idea matches.

5) DIF checks & correction
For each rubric item, we test Differential Item Functioning across language cohorts at matched ability.
Items with significant DIF are adjusted or removed; if unresolved, the model fails closed for that item.

6) Aggregation (monotone link)
Signals roll up through constrained aggregators:

7) Uncertainty & calibration
Every composite score includes uncertainty (bootstrap CIs / calibrated posteriors).
We monitor Expected Calibration Error (ECE) and produce reliability diagrams by cohort.

8) BARS mapping (human-readable)
Final scores map to BARS (ratings tied to observable behaviors) so feedback is specific, fair, and actionable.

9) Decision layer (constrained utility)
Recommendations maximize expected utility subject to fairness/reliability gates:
maximize E[U | evidence] s.t. DIF ≤ δ, G ≥ G_min, P[Collab < τ_c] ≤ ε_c
Lagrangian relaxations provide gate-aware choices with justifications.


Outputs you see (inside the platform)


Publications & research

More: Publications page


Platform context (no external endpoints)

Axiom Cortex™ powers the Nearshore IT Co-Pilot™—one platform to hire, equip, secure, and pay LATAM engineers under one SLA. There is no external API.

Authority anchors: