Hyperbridge Digital · Model Foundry

KYNETRA FOUNDRY™

Fifty owned primitives, frameworks, and architectures for training, creating, and fine-tuning frontier models — Hyperbridge's in-house counterpart to the open LoRA / QLoRA / RLHF ecosystem. Every name here is coined and owned; every mechanism underneath is real engineering.

Owned & operated by Hyperbridge Digital · The forge for sovereign models — train, create, fine-tune.

100% Hyperbridge-owned

The Forge Pipeline: ten pillars, one model lifecycle

01 Curate · Dedup, synthesize, tokenize the corpus · Corpus & Data Engineering · 5
02 Pretrain · Lay the foundation architecture · Pretraining & Foundation Architecture · 5
03 Adapt · Specialize a frozen giant cheaply · Adaptation & PEFT · 5
04 Align · Teach taste with preference & critique · Alignment & Preference · 5
05 Distill · Compress and fuse capability · Distillation & Merging · 5
06 Compress · Drive weights & cache to 4-bit · Quantization & Compression · 5
07 Augment · Bolt on memory & long context · Retrieval, Memory & Long Context · 5
08 Serve · Maximize tokens per dollar · Inference & Serving Architecture · 5
09 Evaluate · Measure, judge, watch for drift · Evaluation, Observability & Governance · 5
10 Govern · Guard, shard, keep it sovereign · Safety, Sovereignty & Orchestration · 5

Adaptation & PEFT

Bend a frozen giant to your domain without retraining it — low-rank grafts, quantized adapters, and surgical weight edits that ship in megabytes, not gigabytes.

5 components

Framework 01

RankWeave

Inject a sovereign low-rank delta into frozen weights — train 0.2% of params, keep all the capability.

Grounded in Low

LoRA is like editing a blueprint by annotating in the margins — the original is untouched, the delta is thin, and you can swap annotations per job.

Rank: 4-64 · Trainable: ~0.1-1% · Latency: +0ms (merged)
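
As a minimal sketch of the low-rank idea behind this card (plain Python, no framework assumed), the snippet below adds a scaled delta `(alpha / r) * B @ A` to a frozen weight `W`, then checks that folding the delta into `W` gives the same output as running the adapter beside it. That equivalence is why a merged adapter adds no latency. The matrix values, `alpha`, and the helper names are illustrative assumptions, not Kynetra APIs.

```python
# Low-rank adapter sketch: frozen W plus a thin trainable delta B @ A.
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(a, s):
    return [[x * s for x in row] for row in a]

# Frozen base weight (3 x 4); values are made up for illustration.
W = [[1.0, 0.0, 2.0, -1.0],
     [0.5, 1.5, 0.0, 0.0],
     [0.0, -2.0, 1.0, 3.0]]

r, alpha = 2, 4                                   # rank and scaling (assumed)
B = [[0.1, 0.0], [0.0, 0.2], [0.3, -0.1]]         # 3 x r, trainable
A = [[1.0, 0.0, -1.0, 0.0], [0.0, 2.0, 0.0, 1.0]]  # r x 4, trainable

# Fold the adapter into the base weight once, ahead of serving.
W_merged = add(W, scale(matmul(B, A), alpha / r))

x = [[1.0], [2.0], [3.0], [4.0]]                  # column-vector input
y_adapter = add(matmul(W, x), scale(matmul(B, matmul(A, x)), alpha / r))
y_merged = matmul(W_merged, x)
```

Swapping adapters per job then amounts to keeping `W` fixed and changing only `A` and `B`.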

Framework 01

NibbleGraft

Graft trainable adapters onto a 4-bit frozen base — fine-tune giant models on a single GPU.

Grounded in QLoRA

QLoRA is like renting a tiny apartment by compressing your furniture to 4-bit flatpacks — you assemble only the piece you need right now.

Bits: 4 (NF-style) · VRAM: -65% vs bf16 · Base: frozen + quantized

Architecture 01

MagnitudeForge

Decompose each weight into magnitude and direction — tune both, fine-tune like full training.

Grounded in DoRA (Weight

DoRA adjusts a compass needle by changing both its length and its direction independently, rather than nudging the whole needle as one object.

Decomp: magnitude + direction · Rank: 4-32 · Accuracy: ~full-FT parity

Architecture 01

PrefixLattice

Steer a frozen model with learned virtual tokens prepended to every attention layer — no weights touched.

Grounded in Prefix

Prefix-tuning staples a learned memo to the top of every blueprint page — the plans are untouched but the worker reads context first at every step.

Weights: 0 edited · Params: ~0.1-3% · Scope: all layers

Terminology 01

GraftFold

The math of folding trained adapters back into base weights — and weighted-merging many into one.

Grounded in Adapter merging

Merging adapters is like mixing paint — weighted blending is cheap, but once colors conflict, averaging smears both rather than keeping either sharp.

Overhead: +0ms post-fold · Merge: N adapters → 1 · Conflict: sign-aware

Quantization & Compression

Drive weights and KV-cache to 4 bits and below without losing the plot. Post-training quant, pruning, and quantization-aware training.

5 components

Framework 02

Nanocrush NF4

4-bit information-theoretic weight casting that holds accuracy where naive INT4 collapses.

Grounded in 4

NF4 is a ruler whose tick marks cluster near zero and spread at the extremes, matching where most weight values actually land.

Bits: 4 (NF-style) · Block: 64 · Levels: 16
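
The ruler analogy can be sketched in a few lines. The code below builds 16 levels at evenly spaced quantiles of a standard normal, rescales them to [-1, 1], and quantizes one 64-weight block against its absolute maximum. Note the assumption: this quantile table is an NF-style approximation built for illustration, not the exact NF4 code book shipped by any library, and every function name here is hypothetical.

```python
from statistics import NormalDist
import random

def nf_style_levels(bits=4):
    """2**bits levels at evenly spaced normal quantiles, rescaled to [-1, 1]."""
    n = 2 ** bits
    quantiles = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    top = max(abs(q) for q in quantiles)
    return [q / top for q in quantiles]

def quantize_block(block, levels):
    # One absmax scale per block; each weight becomes the index of its nearest level.
    scale = max(abs(w) for w in block) or 1.0
    codes = [min(range(len(levels)), key=lambda i: abs(levels[i] - w / scale))
             for w in block]
    return codes, scale

def dequantize_block(codes, scale, levels):
    return [levels[c] * scale for c in codes]

LEVELS = nf_style_levels()
rng = random.Random(0)
block = [rng.gauss(0.0, 0.02) for _ in range(64)]   # synthetic weights (assumed)
codes, scale = quantize_block(block, LEVELS)
restored = dequantize_block(codes, scale, LEVELS)
max_error = max(abs(a - b) for a, b in zip(block, restored))
```

The gap between neighboring levels is smallest around zero and widest at the ends, which is the "tick marks cluster near zero" property described above.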

Terminology 02

Densecore Recompress

Quantize the quantization metadata itself to claw back the last fraction of a bit.

Grounded in Double quantization (DQ) from QLoRA

Double quantization is quantizing your receipts after you've already quantized your groceries — real savings, but only if the receipts were a big share of the wallet.

Overhead: 0.5→0.127 b/param · Meta-block: 256 · Scale bits: 8
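
The 0.5 → 0.127 figure can be reproduced with a short worked calculation. The sketch assumes a first-level block of 64 weights (the block size quoted on the Nanocrush NF4 card) and 32-bit scales before recompression; the meta-block of 256 and the 8-bit scales come from the chips above.

```python
BLOCK = 64         # weights sharing one first-level scale (assumed from the NF4 card)
META_BLOCK = 256   # first-level scales sharing one second-level scale
FP32_BITS = 32     # assumed width of an uncompressed scale
SCALE_BITS = 8     # width of a recompressed scale

# Before: one 32-bit scale amortized over 64 weights.
overhead_before = FP32_BITS / BLOCK

# After: one 8-bit scale per 64 weights, plus one 32-bit scale
# amortized over 64 * 256 weights.
overhead_after = SCALE_BITS / BLOCK + FP32_BITS / (BLOCK * META_BLOCK)

saved_bits_per_param = overhead_before - overhead_after
```

Here `overhead_before` is 0.5 and `overhead_after` is 0.126953125, which rounds to the 0.127 bits per parameter quoted above.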

Framework 02

Errorforge Calibrate

Post-training low-bit quant that compensates each weight against its neighbors' error.

Grounded in GPTQ

GPTQ is a row of dominoes — knock each weight into its nearest quantized slot, then nudge downstream weights to absorb the topple before moving on.

Bits: 3-4 · Method: PTQ · Calib: ~128-512 samples

Architecture 02

Latticeprune Sparsefold

Hardware-aligned 2:4 structured sparsity that physically halves the weight matrix.

Grounded in Semi

2:4 sparsity enforces that no more than 2 of every 4 weights survive, so the hardware can skip the absent ones without even looking.

Pattern: 2:4 · Speedup: ~2x · Index: 2 b/group
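
A minimal sketch of the 2:4 constraint, assuming simple magnitude-based selection (one common choice; the card does not say which criterion Latticeprune uses). Within every group of four consecutive weights, the two largest by absolute value are kept and the other two are zeroed. The function name is hypothetical.

```python
def prune_2_4(row):
    """Zero the two smallest-magnitude weights in each group of four."""
    if len(row) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    out = []
    for start in range(0, len(row), 4):
        group = row[start:start + 4]
        # Indices of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda j: abs(group[j]), reverse=True)[:2]
        out.extend(w if j in keep else 0.0 for j, w in enumerate(group))
    return out

pruned = prune_2_4([0.1, -0.9, 0.5, 0.05, 1.0, 2.0, -3.0, 0.0])
# pruned == [0.0, -0.9, 0.5, 0.0, 0.0, 2.0, -3.0, 0.0]
```

Hardware support then stores only the surviving values plus a small index per group, which is where the halved matrix and the speedup come from.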

Architecture 02

Cachecrush Streamline

Per-channel low-bit KV-cache quant that lets context length scale past the memory wall.

Grounded in KV

KV-cache quantization stores attention history in shorthand — you lose nuance but the binder fits in the bag and you reconstruct on demand.

KV bits: 4-8 · Key: per-channel · Value: per-token

Corpus & Data Engineering

The model is the dataset. Curation, dedup, synthetic generation, tokenization, and curriculum — the upstream that decides everything downstream.

5 components

Framework 03

Glyphsieve

Near-duplicate culling at corpus scale via fingerprint-band collision.

Grounded in Document deduplication using MinHash + Locality

A probabilistic fingerprint that turns document similarity into a coin-flip collision: shared content lands in the same bucket.

Jaccard: ~0.8 · Sig: 128-256 perms · Redundancy: -40%
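
One way to picture the fingerprint-band collision is the sketch below. It shingles text into character 5-grams, builds a 128-slot MinHash signature using salted BLAKE2b hashes from the standard library, and splits the signature into bands so that near-duplicates are likely to share at least one band key. Shingle size, band count, and all function names are assumptions chosen for illustration.

```python
import hashlib

def shingles(text, k=5):
    # Normalize whitespace and case, then take overlapping character k-grams.
    text = " ".join(text.lower().split())
    return {text[i:i + k] for i in range(max(1, len(text) - k + 1))}

def minhash(shingle_set, num_perm=128):
    # One salted hash function per slot; keep the minimum hash over all shingles.
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(4, "little")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(s.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "little")
            for s in shingle_set))
    return signature

def estimated_jaccard(sig_a, sig_b):
    # Fraction of matching slots estimates the Jaccard similarity of the shingle sets.
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

def band_keys(signature, bands=32):
    # Documents that share any (band index, band contents) key become dedup candidates.
    rows = len(signature) // bands
    return {(b, tuple(signature[b * rows:(b + 1) * rows])) for b in range(bands)}

sig_a = minhash(shingles("The quick brown fox jumps over the lazy dog."))
sig_b = minhash(shingles("the  quick brown fox jumps over the lazy dog."))
sig_c = minhash(shingles("Quantized adapters fine-tune giant models on one GPU."))
```

In this example `sig_a` and `sig_b` differ only in case and spacing, so they normalize to the same shingle set and collide in every band, while `sig_c` should land in different buckets.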

Framework 03

Corpusmith

Self-instruct synthetic data forged from a seed of human exemplars.

Grounded in Synthetic instruction

A snowball rolling downhill: seed a teacher with a few human examples and let it write its own training data, expanding with each pass.

Seed: ~175 tasks · Yield: 50k-500k pairs · Dedup: ROUGE-L < 0.7

Architecture 03

Latticescript

A byte-level merge lattice that learns the model's vocabulary from the corpus.

Grounded in Subword tokenizer construction via Byte

A greedy zipper that repeatedly snaps the two most common adjacent symbols together until the vocabulary is full — frequency becomes merge order.

Vocab: 32k-128k · Base: 256 bytes · OOV: 0%
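
The greedy zipper can be sketched as a toy byte-level merge trainer. It starts from single bytes, repeatedly finds the most frequent adjacent pair, and fuses it into a new symbol. This is a simplified illustration (no pre-tokenization, naive pair counting), and `train_bpe` is a hypothetical name rather than a Latticescript API.

```python
from collections import Counter

def train_bpe(corpus: bytes, num_merges: int):
    """Learn up to num_merges merges; return the merge list and the final segmentation."""
    seq = [bytes([b]) for b in corpus]      # base vocabulary: the 256 possible bytes
    merges = []
    for _ in range(num_merges):
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        (left, right), count = pairs.most_common(1)[0]
        if count < 2:                        # nothing repeats, so stop early
            break
        merges.append((left, right))
        merged, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and seq[i] == left and seq[i + 1] == right:
                merged.append(left + right)  # snap the pair into one symbol
                i += 2
            else:
                merged.append(seq[i])
                i += 1
        seq = merged
    return merges, seq

merges, segmented = train_bpe(b"aaabdaaabac", num_merges=3)
```

Because every symbol is built from raw bytes, any input can still be encoded, which is the sense in which the out-of-vocabulary rate is 0%.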

Terminology 03

Strataweave

The per-source sampling weights that govern what the model actually sees.

Grounded in Data mixture / domain reweighting laws for pretraining (e.g. DoReMi

Domain weights are the dial setting how many effective epochs each source sees, independent of its raw token count.

Domains: 5-20 · Temp: 0.3-1.0 · Upsample: <=4x
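
To make the dial concrete, here is a sketch under one stated assumption: that "Temp" refers to temperature-scaled sampling, where each source's weight is proportional to its token count raised to a power `T`. With `T = 1.0` sampling is proportional to size; lower values flatten the mix and upsample small sources. The source names, counts, and function names are invented for illustration.

```python
def mixture_weights(token_counts, temperature=0.5):
    """Sampling weight per source, proportional to count ** temperature."""
    scaled = {name: count ** temperature for name, count in token_counts.items()}
    total = sum(scaled.values())
    return {name: value / total for name, value in scaled.items()}

def effective_epochs(token_counts, weights, token_budget):
    """How many times each source is seen when token_budget tokens are drawn."""
    return {name: weights[name] * token_budget / token_counts[name]
            for name in token_counts}

counts = {"web": 900, "code": 100}            # hypothetical token counts
proportional = mixture_weights(counts, temperature=1.0)
flattened = mixture_weights(counts, temperature=0.5)
epochs = effective_epochs(counts, flattened, token_budget=1000)
```

With these toy numbers the flattened mix gives `code` a weight of 0.25 instead of 0.10, so it is seen 2.5 times while `web` is seen less than once, illustrating how weights decouple effective epochs from raw size.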

Framework 03

Curriculord

Difficulty-ordered data scheduling from easy foundations to hard frontier.

Grounded in Curriculum learning

Teach the model like a student: start with solved examples, introduce hard problems only once the fundamentals are solid.

Buckets: 4-10 · Pacing: linear/root · Replay: >=20%

Alignment & Preference

Teach taste, not just tokens. Reward models, direct preference optimization, and constitutional self-critique.

5 components

Framework 04

CreedForge

Self-critique loops temper a model against your written charter, no human raters needed.

Grounded in RLAIF / Constitutional AI

The model becomes its own editor: it reads a principle, critiques its own draft, then rewrites to comply before a judge scores the pair.

Human labels: ~0 · Phases: 2 (SL + RL) · Judge: model-as-rater

Architecture 04

Concordance Lattice

A scalar critic, trained on human-ranked pairs and then frozen, that scores policy outputs into a reward field.

Grounded in Reward modeling for RLHF

A learned scorecard that reads two responses and assigns numbers so the human-preferred one always comes out higher — trained purely on comparisons.

Head: 1 scalar · Loss: Bradley-Terry · Init: SFT backbone
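
The Bradley-Terry loss named in the chips can be written in a few lines. Given scalar scores for the human-preferred and the rejected response, the loss is the negative log of the sigmoid of their difference, so it falls as the preferred response pulls ahead. The function names are illustrative, and the scores would come from the reward head in a real system.

```python
import math

def softplus(x):
    # log(1 + exp(x)), arranged to avoid overflow for large |x|
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def bradley_terry_loss(score_chosen, score_rejected):
    # -log sigmoid(score_chosen - score_rejected)
    return softplus(-(score_chosen - score_rejected))

def preference_probability(score_chosen, score_rejected):
    # Modeled probability that the chosen response beats the rejected one.
    return math.exp(-bradley_terry_loss(score_chosen, score_rejected))

tie_loss = bradley_terry_loss(0.0, 0.0)        # equals log(2): the model is undecided
good_loss = bradley_terry_loss(3.0, 1.0)       # preferred response scored higher
bad_loss = bradley_terry_loss(1.0, 3.0)        # ranking is inverted
```

Only the score difference matters, which is why the critic can be trained purely on comparisons without any absolute rating scale.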

Framework 04

PreferLoom

Skip the reward model — optimize the policy straight from preference pairs in closed form.

Grounded in Direct Preference Optimization (DPO

Rewrite the RLHF objective algebraically until the reward model cancels out, leaving a classification loss that nudges the policy straight from preference pairs.

Reward model: none · Loss: implicit-reward · Beta: 0.1-0.5
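
A sketch of the resulting classification loss for a single preference pair, assuming you already have summed token log-probabilities for each response under the policy and under a frozen reference model. The implicit reward of a response is `beta` times the log-ratio of policy to reference, and the loss is the negative log-sigmoid of the reward gap. Argument names and the example numbers are hypothetical.

```python
import math

def softplus(x):
    # log(1 + exp(x)), arranged to avoid overflow for large |x|
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Return (loss, margin) for one preference pair."""
    reward_chosen = beta * (logp_chosen - ref_logp_chosen)        # implicit reward
    reward_rejected = beta * (logp_rejected - ref_logp_rejected)
    margin = reward_chosen - reward_rejected                      # signed preference gap
    return softplus(-margin), margin                              # -log sigmoid(margin)

# At initialization the policy equals the reference, so the margin is zero.
loss_start, margin_start = dpo_loss(-12.0, -15.0, -12.0, -15.0)
# Later the policy raises the chosen response and lowers the rejected one.
loss_later, margin_later = dpo_loss(-10.0, -18.0, -12.0, -15.0)
```

The `margin` returned here is the same signed log-odds gap that the Verdance Margin card tracks: positive means the pair is ranked correctly.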

Terminology 04

Verdance Margin

The signed preference gap a model holds between a winning and losing response.

Grounded in The implicit

The margin is the scoreboard gap: not just whether pairs rank correctly, but how confidently, and how wide a separation training is actually achieving.

Sign: >0 = correct · Target acc: >0.7 · Unit: log-odds

Framework 04

Assayloom Cull

Sample N, keep the best by reward, fine-tune on the survivors — alignment without RL.

Grounded in Rejection sampling fine

Generate a shortlist per question, let a judge pick the best, and train only on the winners; iterate until the policy produces top answers on its first attempt.

N: 4-64 · Keep: top-1/top-k · Loss: cross-entropy

Distillation & Merging

Pour a large model into a small one; fuse many checkpoints into one. Logit distillation and weight-space merging.

5 components

Framework 05

Distilflux

Pour a teacher's full belief into a smaller student through soft-label flux.

Grounded in Response

The teacher whispers not just the answer but its full confidence distribution, giving the student far richer signal than a one-hot label.

Temp: 2-8 · Params: -40 to -90% · Loss: KL + CE
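
The "KL + CE" loss in the chips can be sketched for one example as follows. Teacher and student logits are softened with a temperature `T`, the KL divergence between the softened distributions carries the teacher's full belief, and an ordinary cross-entropy term keeps the student tied to the hard label. The `T * T` factor and the mixing weight `alpha` follow a common convention and are assumptions here, as are the function names.

```python
import math

def softmax(logits, temperature=1.0):
    peak = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp((z - peak) / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label, temperature=4.0, alpha=0.5):
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the temperature-softened distributions.
    kl = sum(pt * (math.log(pt) - math.log(ps))
             for pt, ps in zip(p_teacher, p_student))
    # Standard cross-entropy against the one-hot label at temperature 1.
    ce = -math.log(softmax(student_logits)[label])
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce

teacher = [4.0, 1.5, -2.0]
matched = distillation_loss(teacher, teacher, label=0, alpha=1.0)
mismatched = distillation_loss([0.0, 0.0, 0.0], teacher, label=0, alpha=1.0)
```

When the student reproduces the teacher's logits exactly, the KL term vanishes; any disagreement in the softened distribution, including on the "wrong" classes, produces a positive signal.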

Terminology 05

Tracegraft

Match the teacher's hidden geometry, not just its words.

Grounded in Feature

Copy not just the teacher's essay answers but its scratch-pad, forcing the student through the same intermediate steps.

Loss: MSE/cosine · Targets: hidden+attn · Proj: learned linear

Architecture 05

Mergespire

Fuse many fine-tunes into one checkpoint by reconciling their task vectors.

Grounded in Training

Arithmetic on skill deltas: add the skills you want, subtract what you don't, then reconcile sign conflicts before applying the result.

Retrain: none · Density: ~10-30% · Inputs: 2-N tunes

Framework 05

Fluxbroth

Average many checkpoints into one stronger model — no inference-time cost.

Grounded in Model souping

Independently fine-tuned runs of the same init land in one loss basin, so their centroid is flatter and more general than any single run.

Cost: 1x inference · Inputs: 3-N runs · Init: shared

Architecture 05

Spireforge

Upcycle a dense checkpoint into a sparse Mixture-of-Experts.

Grounded in MoE upcycling

Clone the dense FFN into N expert slots and train a router to divide labor: parameters multiply, but each token pays for only its top-k experts.

Active: top-k of N · FLOPs/token: ~flat · Init: cloned FFN

Retrieval, Memory & Long Context

Give the model the world it wasn't trained on. Vector retrieval, external memory, and attention that scales to book-length context.

5 components

Architecture 06

Memvault Lattice

A two-stage retrieval lattice that grounds generation in your own corpus, not the model's guesses.

Grounded in Retrieval

A library lookup bolted onto a frozen LLM: the retriever finds the relevant pages and the model reads them, instead of recalling from weights.

Top-k: 3-20 · Recall@10: >0.9 · Refresh: index-only, no retrain

Terminology 06

Echograph Embeddings

Dense semantic vectors that let meaning, not keywords, drive retrieval.

Grounded in Dense vector embeddings from a contrastively

A bi-encoder maps every text to a point where semantic closeness is geometric closeness — retrieval becomes nearest-neighbor search on that map.

Dim: 384-1536 · Metric: cosine · Loss: InfoNCE

Framework 06

Riftspan Rotary

Stretch a trained context window far past its native length without retraining from scratch.

Grounded in Long

A frequency-band zoom-out: compress the slow rotations that overflow the original window while leaving the fast local-position signal intact.

Context: up to 8-32x · Method: RoPE rescale · Retune: <1% steps

Architecture 06

Vaultecho Cache

Persist and stream attention state so long sessions never recompute history.

Grounded in KV

Keep a few initial anchor tokens plus a sliding window of recent ones — anchors stabilize the softmax while older history is dropped, not approximated.

Memory: O(window) · Sinks: 4 tokens · Prefill: cached, reused
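
The anchor-plus-window policy described above reduces to a very small eviction rule. The sketch below treats the cache as a list of per-token entries, keeps the first `sinks` entries and the most recent `window` entries, and drops everything in between. The window size of 8 and the function name are assumptions for illustration; only the 4 sink tokens come from the card.

```python
def evict(cache, sinks=4, window=8):
    """Keep the first `sinks` entries and the last `window` entries of the cache."""
    if len(cache) <= sinks + window:
        return list(cache)                   # still fits, nothing to drop
    return list(cache[:sinks]) + list(cache[-window:])

# Stand-in for cached KV entries: just the token positions 0..19.
positions = list(range(20))
kept = evict(positions)
# kept == [0, 1, 2, 3, 12, 13, 14, 15, 16, 17, 18, 19]
```

Memory stays bounded by `sinks + window` no matter how long the session runs, which is the O(window) figure quoted in the chips.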

Framework 06

Graftsieve Rerank

A cross-encoder second pass that re-scores candidates so only the truly relevant reach the model.

Grounded in Two

The bi-encoder casts a wide net for recall; the cross-encoder hand-inspects each pair together, catching subtle mismatches independent encodings miss.

Recall pool: 50-100 · Keep: top 3-5 · Stage: 2 (cross-enc)

Inference & Serving Architecture

Tokens per dollar. Speculative decoding, paged attention, continuous batching, and routing cascades.

5 components

Framework 07

Speccast Relay

A small drafter sprints ahead; the sovereign model verifies in one pass.

Grounded in Speculative decoding (draft

A fast sketch artist drafts a sentence while the expert only nods yes/no at each word — parallelizing what was serial.

Speedup: 2-3x · Block K: 4-8 · Output: lossless

Architecture 07

Pagewright KV

Virtual-memory paging for the KV cache: no fragmentation, near-zero waste.

Grounded in PagedAttention

OS virtual memory for the attention cache: sequences own logical page tables, not contiguous RAM, so fragmentation drops to near zero.

Block: 16 tok · Waste: <4% · Prefix: COW-shared

Framework 07

Flowbatch Loom

Sequences join and leave the batch every step; the GPU never idles.

Grounded in Continuous (in

Rather than waiting for the whole table to finish before seating new guests, seat one diner the instant any seat opens.

Throughput: up to 20x vs static · Sched: per-iteration · Latency: tail-optimized

Architecture 07

Shardloom Mesh

Split each layer across GPUs, stage the layers in a pipeline — weave both.

Grounded in Combined tensor parallelism (intra

Tensor parallelism splits one wide highway across lanes within a node; pipeline parallelism is a relay-race baton pass between nodes.

Mesh: TP x PP · TP link: NVLink · Bubble: micro-batched

Terminology 07

Castfuse Kernel

IO-aware attention that never writes the full score matrix to HBM.

Grounded in FlashAttention

Compute attention scores tile by tile in fast scratchpad memory, accumulating on the fly, so the full score matrix never touches slow global memory.

Mem: O(N) · Exact: yes · Bottleneck: HBM IO
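
The tile-by-tile accumulation rests on an online softmax: a running maximum, a running normalizer, and a running weighted sum that are rescaled whenever a new tile raises the maximum. The sketch below shows this for a single query in plain Python and compares it with the straightforward version that materializes every score. It omits the usual 1/sqrt(d) scaling and any real memory hierarchy, so it illustrates the arithmetic only; all names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention_reference(q, keys, values):
    # Materializes all scores at once (what the tiled version avoids).
    scores = [dot(q, k) for k in keys]
    peak = max(scores)
    weights = [math.exp(s - peak) for s in scores]
    norm = sum(weights)
    return [sum(w * v[d] for w, v in zip(weights, values)) / norm
            for d in range(len(values[0]))]

def attention_tiled(q, keys, values, tile=2):
    running_max, norm = float("-inf"), 0.0
    acc = [0.0] * len(values[0])
    for start in range(0, len(keys), tile):
        ks, vs = keys[start:start + tile], values[start:start + tile]
        scores = [dot(q, k) for k in ks]
        new_max = max(running_max, max(scores))
        fix = math.exp(running_max - new_max)   # rescales old partial sums; 0.0 on first tile
        weights = [math.exp(s - new_max) for s in scores]
        norm = norm * fix + sum(weights)
        acc = [a * fix + sum(w * v[d] for w, v in zip(weights, vs))
               for d, a in enumerate(acc)]
        running_max = new_max
    return [a / norm for a in acc]

q = [1.0, 0.5]
K = [[0.2, 0.1], [1.0, -1.0], [0.5, 0.5], [-0.3, 2.0], [1.5, 0.0]]
V = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [-1.0, 0.5], [0.5, -0.5]]
full, tiled = attention_reference(q, K, V), attention_tiled(q, K, V)
```

Both functions return the same values up to floating-point rounding, which is the sense in which the tiled approach is exact rather than approximate.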

Pretraining & Foundation Architecture

The bones. Sparse experts, attention variants, positional schemes, and the scaling laws that govern them.

5 components

Architecture 08

Basalt Core

The sovereign decoder backbone every Kynetra model is forged on.

Grounded in Modern pre

A plain left-to-right stack of read-then-think blocks — attention gathers context, the MLP processes it, residuals keep gradients alive across hundreds of layers.

Norm: RMSNorm (pre) · FFN: SwiGLU · Layers: 12-120+

Architecture 08

Spire Lattice

Sparse expert routing — vast capacity, lean per-token compute.

Grounded in Sparse Mixture

Instead of every token passing through one big FFN, a traffic cop routes each to one or two of many specialists: total knowledge grows while per-token work stays fixed.

Experts: 8-128 · Active: top-1/top-2 · Capacity: +10-50x params

Framework 08

Attentryx Grip

Share the keys, free the cache — attention that scales at inference.

Grounded in Grouped

Many query heads share one set of keys and values per group, like many readers sharing a single annotated reference copy.

KV heads: 1-8 · Cache: -50% to -87% · Quality: ~MHA

Terminology 08

Helix Anchor

Rotary phase encoding that lets context stretch far past training length.

Grounded in Rotary Position Embedding (RoPE) plus long

Each attention score comes from the angle between two rotating clock hands — relative position emerges from phase, not from added position vectors.

Params: 0 added · Base: 10k-1M+ · Context: 4x-32x extend
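
The clock-hands picture can be checked numerically. The sketch below rotates consecutive pairs of dimensions by an angle proportional to the token position, using the familiar `base ** (-i / d)` frequency schedule, and shows that the query-key dot product depends only on the distance between the two positions. The vectors and the function names are made up for illustration.

```python
import math

def rope(vec, position, base=10000.0):
    """Rotate each (even, odd) pair of dimensions by a position-dependent angle."""
    dim = len(vec)
    out = []
    for i in range(0, dim, 2):
        theta = position * base ** (-i / dim)   # slower rotation for later pairs
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 2.0, 0.0]

# Same relative offset (3 tokens) at two very different absolute positions.
score_near = dot(rope(q, 5), rope(k, 2))
score_far = dot(rope(q, 103), rope(k, 100))
```

No parameters are learned here, matching the "Params: 0 added" chip; long-context variants change `base` or rescale `position` rather than adding weights.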

Framework 08

Forgecurve Doctrine

Compute-optimal sizing and staged curriculum — train the right model, the right way.

Grounded in Neural scaling laws (Chinchilla compute

Chinchilla's insight: at fixed compute, halving params while doubling tokens beats scaling params alone — size and data must grow together.

Optimal: D≈20N tokens · Compute: C≈6ND · Stages: 2-4 curriculum
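
The two rules of thumb in the chips combine into a quick sizing calculation. Substituting D ≈ 20N into C ≈ 6ND gives C ≈ 120N², so N ≈ sqrt(C / 120). The sketch below applies that to a hypothetical compute budget; the budget value and the function name are assumptions, and the constants are approximations rather than exact laws.

```python
import math

def compute_optimal(flops_budget, tokens_per_param=20.0):
    """Solve C = 6 * N * D with D = tokens_per_param * N for N and D."""
    params = math.sqrt(flops_budget / (6.0 * tokens_per_param))
    tokens = tokens_per_param * params
    return params, tokens

budget = 5.88e23                       # hypothetical training budget in FLOPs
params, tokens = compute_optimal(budget)
# params is about 7.0e10 (70B) and tokens about 1.4e12 (1.4T)
```

Because both N and D scale with the square root of C, quadrupling the compute budget roughly doubles the recommended model size and the recommended token count together.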

Evaluation, Observability & Governance

If you can't measure it, you can't ship it. Eval harnesses, judges, drift detection, and full lineage.

5 components

Architecture 09

Proofgrid

A sovereign, versioned eval lattice that scores every checkpoint on the same sealed bench.

Grounded in Reproducible evaluation harnesses and benchmark orchestration (e.g. lm

A sealed, version-controlled test suite that scores every checkpoint under identical frozen conditions, so results are actually comparable.

Tasks: 50-400 · Seed: fixed/deterministic · CI: bootstrap 95%

Framework 09

Arbiter Lattice

Ensemble LLM-as-judge with calibrated rubrics, position-swapping, and human anchor points.

Grounded in LLM

Using a strong model as a stand-in human rater — valid only once you've measured how closely its preferences track real humans on your task.

Judges: 1-5 ensemble · Bias-control: order-swap · Agreement: kappa-tracked

Terminology 09

Veracity Quotient

A single faithfulness score: how much of an answer is grounded in cited evidence.

Grounded in Hallucination / factuality and groundedness metrics for RAG and open

Decompose an answer into individual factual claims, then check each against sources — fact-checking made computable.

Range: 0.0-1.0 · Checker: NLI entailment · Granularity: atomic-claim

Architecture 09

Lineagraph

Tamper-evident provenance DAG linking every weight to the data, code, and config that forged it.

Grounded in Data and model lineage / provenance

A tamper-evident paper trail for your model — every artifact is hashed and chained so you can prove what went in and trace any flaw to its source.

Addressing: SHA-256 · Output: ML-BOM · Tamper: signature-chained
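
Content addressing is the core of the paper trail, and it can be sketched with the standard library alone. Each node's ID is the SHA-256 of a canonical record containing the hash of its payload and the IDs of its parents, so changing any upstream artifact changes every downstream ID. Signing the IDs is out of scope here, and the record layout and function name are assumptions rather than the Lineagraph format.

```python
import hashlib
import json

def node_id(kind, payload, parents=()):
    """Content address for one artifact, chained to the IDs of its parents."""
    record = {
        "kind": kind,
        "payload_sha256": hashlib.sha256(payload).hexdigest(),
        "parents": sorted(parents),          # order-independent parent set
    }
    canonical = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()

# A tiny DAG: dataset + config -> checkpoint (payloads are placeholders).
data = node_id("dataset", b"corpus shard v1")
config = node_id("config", b"lr=3e-4")
checkpoint = node_id("checkpoint", b"weights", parents=[data, config])

# Editing the dataset changes its ID and therefore the checkpoint's ID.
tampered_data = node_id("dataset", b"corpus shard v1 (edited)")
tampered_checkpoint = node_id("checkpoint", b"weights", parents=[tampered_data, config])
```

Verifying a model then means recomputing the IDs from the stored artifacts and comparing them with the recorded ones.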

Framework 09

Driftwatch Sentinel

Continuous telemetry that catches input drift and quality regressions before users do.

Grounded in Production drift and regression detection / observability

A statistical smoke detector for production — it alarms when today's inputs or outputs look meaningfully different from the baseline you trust.

Drift: PSI/KL/MMD · Mode: streaming · Gate: shadow-canary
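
Of the drift statistics listed, the Population Stability Index is the simplest to show. Given histogram counts over the same bins for a trusted baseline and for today's traffic, PSI sums `(actual - expected) * ln(actual / expected)` over bin proportions. The bin counts and the epsilon floor for empty bins are illustrative assumptions, and `psi` is a hypothetical helper name.

```python
import math

def psi(expected_counts, actual_counts, eps=1e-6):
    """Population Stability Index between two histograms over identical bins."""
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    total = 0.0
    for e, a in zip(expected_counts, actual_counts):
        pe = max(e / expected_total, eps)    # floor avoids log(0) on empty bins
        pa = max(a / actual_total, eps)
        total += (pa - pe) * math.log(pa / pe)
    return total

baseline = [50, 30, 20]      # e.g. prompt-length buckets last month (made up)
same_shape = [500, 300, 200]
shifted = [20, 30, 50]

no_drift = psi(baseline, same_shape)
drift = psi(baseline, shifted)
```

Every term is non-negative, so the score is zero only when the two distributions match bin for bin and grows as mass moves between bins; an alert threshold would be chosen per deployment.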

Safety, Sovereignty & Orchestration

Train on your terms, behind your walls. Guardrails, sovereign / air-gapped pipelines, and sharded orchestration at scale.

5 components

Framework 10

Sentinel Weave

A layered guardrail mesh that filters inputs and outputs before they ever reach the user.

Grounded in Guardrails and content filtering via classifier

A fast, separately-updatable guard in front of and behind the model — it enforces policy without touching base weights.

Latency: <40ms/turn · Categories: 12+ · Action: block/redact/rewrite

Framework 10

Wardgate

Adversarial-prompt defense that hardens models against jailbreaks and injection.

Grounded in Jailbreak and prompt

Adversarial fine-tuning is a vaccination — it builds refusal immunity by repeatedly exposing the model to attack patterns before deployment.

ASR: -90% vs base · Hierarchy: 4-tier · Detect: perplexity+embed

Architecture 10

Sovryn Vault

Air-gapped, on-prem training topology where data and weights never leave the perimeter.

Grounded in Sovereign / air

Air-gap the whole training environment the way classified facilities air-gap networks — nothing leaves the perimeter, so the perimeter is the boundary.

Egress: zero · Keys: customer HSM · Residency: pinned

Architecture 10

Shardbastion

Fully-sharded parallelism that splits weights, gradients and optimizer state across the cluster.

Grounded in FSDP / ZeRO sharding

Instead of every GPU holding a full model copy, each holds one slice — shards assemble just-in-time per layer, then discard.

Mem/GPU: ~1/N replicas · Stage: 3 (params+grads+opt) · Offload: CPU/NVMe

Terminology 10

Anchorpoint

A durable, resumable training checkpoint that restores the full run state byte-for-byte.

Grounded in Checkpoint / resume orchestration combined with activation (gradient) checkpointing

A checkpoint is a save-game: it captures everything needed to resume exactly — weights, optimizer, RNG state, and data position.

Resume: exactly-once · Recompute: ~+30% FLOPs · Write: async/sharded

The Atelier · Creative Leadership

Conceived & forged by

KYNETRA FOUNDRY was imagined, named, and architected under the creative leadership of KR and CS — the founding minds who shaped every pillar, coined every primitive, and set the doctrine this catalogue is built on.

Creative Direction & Architecture

Vision, naming systems, and the forge doctrine.

Founding Architect & Systems Design

Engineering grounding, pillar structure, and rigor.

Creative leadership · KR & CS — for Hyperbridge Digital