Unsterwerx

Unsterwerx is a document-domain implementation of the Trusted Client-Centric Application Architecture (US Patent US9069626B2). It ingests common document formats into a local Shared Sandbox, normalizes them into a Universal Data Set, finds duplicates and near-duplicates, computes structural diffs, and supports temporal reconstruction under Business Intelligence and User Intelligence policy control.

Features

Quick Start

    curl -fsSL https://unsterwerx.run/install.sh | sh
    unsterwerx ingest /path/to/documents
    unsterwerx similarity
    unsterwerx search "data architecture"
    unsterwerx status --detailed
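
The steps above can also be wrapped in a small script that stops at the first failing step. This is a minimal sketch, not part of Unsterwerx itself: `UNSTERWERX_BIN`, `DRY_RUN`, `run_step`, and `quick_start` are hypothetical names introduced here for illustration. It uses only the subcommands and the `--detailed` flag shown in the quick start.

```shell
#!/bin/sh
# Sketch only: UNSTERWERX_BIN and DRY_RUN are hypothetical helper variables
# for this script, not options of the unsterwerx tool.
UNSTERWERX_BIN="${UNSTERWERX_BIN:-unsterwerx}"
DRY_RUN="${DRY_RUN:-0}"

# Run one unsterwerx subcommand, or only print it when DRY_RUN=1.
run_step() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $UNSTERWERX_BIN $*"
        return 0
    fi
    "$UNSTERWERX_BIN" "$@"
}

# Ingest a directory, run similarity analysis, search, and show status.
# The && chain stops at the first step that exits non-zero.
quick_start() {
    docs_dir="$1"
    query="$2"
    if [ ! -d "$docs_dir" ]; then
        echo "error: not a directory: $docs_dir" >&2
        return 1
    fi
    run_step ingest "$docs_dir" &&
    run_step similarity &&
    run_step search "$query" &&
    run_step status --detailed
}
```

With the functions loaded in the current shell, setting `DRY_RUN=1` and calling `quick_start /path/to/documents "data architecture"` prints the four commands without running them, which is a cheap way to check the sequence before ingesting a large directory.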

Commands

| Command | Description |
| --- | --- |
| ingest | Ingest files from a source directory |
| status | Show system and document status |
| reindex | Rebuild full-text search index (FTS5) |
| similarity | Run similarity analysis on ingested documents |
| diff | Compute diffs between similar document pairs |
| search | Search canonical document content |
| reconstruct | Reconstruct a document from canonical store |
| classify | Classify documents using rules |
| archive | Archive documents per retention policies |
| audit | View and verify audit log |
| rules | Manage classification rules |
| knowledge | Bayesian scoring, vector graphs, BI dedup |
| import | Import data from external sources |
| jobs | Manage background ingest and import jobs |
| config | Manage configuration |
| benchmark | Benchmark the TCA pipeline |
| upgrade | Check for and install the latest release |