#ProjectDarkMatter

Judicial Reasoning Engine

A purpose-built legal cognition system engineered to read, reason, and write with the discipline of a seasoned judge, spanning the South African case law corpus with a million-token memory.

Built to preserve the craft of judgment: every inference audited, every outcome explainable, every decision grounded in the record.

1M-token context • High Court advocate validated • POPIA-first governance • Foundation-agnostic • Traceability guaranteed

Performance at a Glance

90.11%
Peak Accuracy

20.1% higher than baseline general models.

1M Tokens
Context Window

Processes an entire case in one pass.

43,124×
Impact Multiplier

Relative task coverage vs. manual throughput.

58,000
Unique Training Instances

Total examples processed across the legal corpus.

Project Overview

Issue Spotting

Identifies the core legal questions and disputes before the court.

Principle Extraction

Distils the ratio decidendi and binding rules from judgments.

Precedent Analysis

Understands whether authority is followed, distinguished, or overruled.

Judgment Structuring

Produces structured outlines (facts, issues, analysis, order).

Outcome Reasoning

Captures the probable order and the underlying reasoning steps.

Legal Summarisation

Generates comprehensive, court-aware summaries and synthesis.

South African & Global Context

While the proprietary dataset provides the raw material, no public record shows any other South African project converting the corpus into a multi-skill legal training set for a long-context model. Project Dark Matter pioneers advanced generative AI tailored to the specifics of South African law.

Value Without Currency

96
Reliability Index
Accuracy & consistency

100
Traceability Coverage
Evidence & reasoning audit trail

94
Governance Readiness
POPIA-ready policies

92
Sovereign Assurance
Deployment & residency control

98
Expert Validation
Reviewed by High Court advocates

Technical Architecture & Moat

Premium Foundation

  • Latest-generation flagship LLM, foundation-agnostic approach.
  • Hosted in top-tier US East facility with sub-70ms latency.
  • Tier III+/IV design, SOC 2 Type II and ISO 27001/27018 alignment.
  • Ultra-long 1M token context window minimises chunking risk.
  • Adapter size 8 for multi-task capacity with efficient training.

Proprietary Data Pipeline

  • 15,000+ multi-skill examples curated from authoritative cases.
  • Layered QC gates, split hygiene, leakage prevention.
  • Total of 58,000 unique training instances processed.
  • Multi-part JSONL format keeps the project foundation-agnostic.
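
To make the pipeline points above concrete, here is a minimal sketch of what a multi-skill JSONL record and a case-level split check could look like. The field names (`case_id`, `skill`, `input`, `output`, `split`), the sample records, and the helper functions are all hypothetical illustrations and are not the project's actual schema or tooling. The idea shown is one common form of split hygiene: every example derived from the same case is assigned to the same split, so no case appears in both training and evaluation data.

```python
import hashlib
import json

# Hypothetical record layout: one JSON object per line, one skill per record.
# Field names and values are illustrative assumptions, not the real schema.
SAMPLE_JSONL = """\
{"case_id": "case-001", "skill": "issue_spotting", "input": "judgment text", "output": "issues"}
{"case_id": "case-001", "skill": "principle_extraction", "input": "judgment text", "output": "ratio"}
{"case_id": "case-002", "skill": "legal_summarisation", "input": "judgment text", "output": "summary"}
"""


def parse_jsonl(text):
    """Parse JSONL text into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def assign_split(case_id, eval_fraction=0.1):
    """Deterministically map a case to 'train' or 'eval' by hashing its ID.

    Hashing the case ID (rather than the individual example) keeps every
    example from one case in the same split.
    """
    digest = hashlib.sha256(case_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # value in [0, 1)
    return "eval" if bucket < eval_fraction else "train"


def find_leakage(records):
    """Return the sorted case IDs that appear in more than one split."""
    splits_by_case = {}
    for record in records:
        splits_by_case.setdefault(record["case_id"], set()).add(record["split"])
    return sorted(case for case, splits in splits_by_case.items() if len(splits) > 1)


records = parse_jsonl(SAMPLE_JSONL)
for record in records:
    record["split"] = assign_split(record["case_id"])

leaked_cases = find_leakage(records)
```

Because the records are plain JSON objects rather than a vendor-specific format, the same file could in principle be converted to whatever tuning format a given foundation model expects, which is one way to read the foundation-agnostic claim.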

Governance & Validation

  • Formal review by two High Court advocates.
  • Hallucination mitigation via specialised evaluation pipelines.
  • Full traceability for court-grade auditability.

"For centuries we trained minds to think like judges. Today, we trained a machine to reason like one—not to replace judgment, but to preserve its discipline."

— Advocate Nandi Basson

Contact

Currently in alpha phase.

Contact us at info@vcb-ai.online

We share detailed metrics and demo access under NDA.