Unsterwerx

jobs

Manages background ingest and import jobs within the Shared Sandbox. When the ingest command or the import run command is launched with --background, it creates a job that runs asynchronously in a detached worker process. The jobs command gives operators visibility into job lifecycle, progress, and per-file diagnostics.

Job records are stored durably in the local database. Workers emit periodic heartbeats so that stale jobs (whose worker process died) can be detected and recovered.

Usage

bash
unsterwerx jobs <SUBCOMMAND>

Subcommands

| Subcommand | Description |
| --- | --- |
| list | List recent jobs |
| status | Show job details and progress |
| stop | Request a running or paused job to stop cleanly |
| pause | Request a running job to pause at the next safe checkpoint |
| resume | Resume a paused job |
| errors | Show error diagnostics for a job |
| logs | Show all diagnostics (warnings and errors) for a job |

jobs list

Lists recent jobs with their type, status, progress, and queue time. Stale detection runs automatically before listing, so jobs whose worker has died are shown as stale.

bash
unsterwerx jobs list [OPTIONS]

Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --status | string | | Filter by status: queued, running, stopping, paused, completed, failed, stopped, stale |
| --limit | integer | 20 | Maximum number of jobs to show |
| --json | flag | | Output as JSON via emit_json envelope |

Example

bash
unsterwerx jobs list
  ID        Type           Status      Path                            Progress  Queued At
  ------------------------------------------------------------------------------------------
  a1b2c3d4  ingest         completed   .../path/to/documents           2873/2873  2026-03-31 14:25:52
  e5f6a7b8  import         running     .../Downloads/chatgpt-export     142/580   2026-03-31 15:01:03

bash
unsterwerx jobs list --status failed

bash
unsterwerx jobs list --json

jobs status

Shows detailed information about a single job including timestamps, progress counters, associated import batches, and worker log path.

bash
unsterwerx jobs status [OPTIONS] <ID>

Arguments

| Argument | Required | Description |
| --- | --- | --- |
| ID | Yes | Job ID or unique prefix (see ID prefix lookup) |

Options

| Option | Type | Description |
| --- | --- | --- |
| --json | flag | Output as JSON via emit_json envelope |

Example

bash
unsterwerx jobs status a1b2
Job Details
══════════════════════════════════════════════════════════════
  ID:              a1b2c3d4-5e6f-7890-abcd-ef1234567890
  Type:            ingest
  Mode:            background
  Status:          completed
  Input path:      /path/to/documents
  Source type:      local
  PID:             48231
  Log:             /home/operator/.unsterwerx/logs/a1b2c3d4-5e6f-7890-abcd-ef1234567890.log
  Spec version:    2
  Resume count:    0

  Queued at:       2026-03-31 14:25:52
  Started at:      2026-03-31 14:25:53
  Heartbeat at:    2026-03-31 14:32:41
  Completed at:    2026-03-31 14:32:42

  Items total:         2873
  Items processed:     2873
  Items skipped:          0
  Items error:           32

  Import batches:
    2e96772f
══════════════════════════════════════════════════════════════
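
For scripting, individual fields can be pulled out of the human-readable output above. The following is a minimal sketch, assuming the "Label: value" layout matches the example; `job_field` is a hypothetical helper, not part of unsterwerx. The --json output is probably a sturdier source, but its schema is not described here.

```bash
# job_field LABEL: read `unsterwerx jobs status` text on stdin and print
# the value that follows "LABEL:". Hypothetical helper; it assumes the
# indented "Label:   value" layout shown in the example above.
job_field() {
  local label="$1"
  awk -v label="$label" '
    {
      line = $0
      sub(/^[ \t]+/, "", line)            # drop leading indentation
      prefix = label ":"
      if (index(line, prefix) == 1) {     # line starts with "Label:"
        value = substr(line, length(prefix) + 1)
        sub(/^[ \t]+/, "", value)         # drop padding before the value
        print value
        exit
      }
    }'
}
```

A call such as `unsterwerx jobs status a1b2 | job_field "Status"` would then print only the status word.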

jobs stop

Requests a running or paused job to stop cleanly. The worker will finish processing the current item and then exit. This is a cooperative signal, not a kill. The job transitions to stopping and then to stopped once the worker acknowledges.

bash
unsterwerx jobs stop [OPTIONS] <ID>

Arguments

| Argument | Required | Description |
| --- | --- | --- |
| ID | Yes | Job ID or unique prefix |

Options

| Option | Type | Description |
| --- | --- | --- |
| --json | flag | Output as JSON via emit_json envelope |

Example

bash
unsterwerx jobs stop e5f6
Stop requested for job e5f6a7b8 (status: stopping)

A stopped job can later be resumed with ingest --resume <ID> or import run --resume <ID>, which creates a new job that continues from where the stopped job left off.
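
Because stopping is cooperative, a script that needs the job to be fully stopped has to wait for the stopping to stopped transition. A rough polling sketch follows; `wait_for_status` and its arguments are hypothetical, and the caller supplies the command that prints the job's current status.

```bash
# wait_for_status TARGET MAX_POLLS INTERVAL_SECS CMD [ARGS...]
# Run CMD repeatedly until it prints TARGET or MAX_POLLS is exhausted.
# CMD is any command that prints the job's current status on stdout
# (an assumption of this sketch; unsterwerx does not ship such a helper).
wait_for_status() {
  local target="$1" max_polls="$2" interval="$3" current="" i=0
  shift 3
  while [ "$i" -lt "$max_polls" ]; do
    current=$("$@")
    if [ "$current" = "$target" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "gave up waiting for status '$target' (last seen: '$current')" >&2
  return 1
}
```

The status command could be a small wrapper that extracts the Status line from `unsterwerx jobs status <ID>`; a bounded poll count keeps the script from hanging if the worker never acknowledges.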


jobs pause

Requests a running job to pause at the next safe checkpoint. The worker will finish the current item, write a checkpoint, and then wait. The job transitions to paused once the worker acknowledges.

Pause is not supported for retry_errors jobs.

bash
unsterwerx jobs pause [OPTIONS] <ID>

Arguments

| Argument | Required | Description |
| --- | --- | --- |
| ID | Yes | Job ID or unique prefix |

Options

| Option | Type | Description |
| --- | --- | --- |
| --json | flag | Output as JSON via emit_json envelope |

Example

bash
unsterwerx jobs pause e5f6
Pause requested for job e5f6a7b8 (worker will pause at next checkpoint)

jobs resume

Unpauses a paused job. The worker resumes processing from where it left off. This only works on jobs in paused status.

This is distinct from ingest --resume <ID> or import run --resume <ID>, which create a new job from a stale, stopped, or failed predecessor. jobs resume unpauses an existing paused worker in place.

bash
unsterwerx jobs resume [OPTIONS] <ID>

Arguments

| Argument | Required | Description |
| --- | --- | --- |
| ID | Yes | Job ID or unique prefix |

Options

| Option | Type | Description |
| --- | --- | --- |
| --json | flag | Output as JSON via emit_json envelope |

Example

bash
unsterwerx jobs resume e5f6
Resumed job e5f6a7b8

jobs errors

Shows error diagnostics recorded during a job. This is a shortcut for jobs logs --level error.

Each diagnostic records the file that caused the error, the NAC processing phase where it occurred, and a human-readable message.

bash
unsterwerx jobs errors [OPTIONS] <ID>

Arguments

| Argument | Required | Description |
| --- | --- | --- |
| ID | Yes | Job ID or unique prefix |

Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --level | string | error | Filter by severity: error, warn, or warning (overridden to error for this subcommand) |
| --limit | integer | 100 | Maximum number of diagnostics to show |
| --json | flag | | Output as JSON via emit_json envelope |

Example

bash
unsterwerx jobs errors a1b2
Diagnostics for job a1b2c3d4 (32 errors, 5 warnings)
  Timestamp            Level    Phase       Item                            Message
  ------------------------------------------------------------------------------------------
  2026-03-31 14:28:01  error    parse       .../legacy-report.doc           unsupported format: doc
  2026-03-31 14:29:12  error    canonical   .../corrupted.pdf               pdf extraction failed: no pages

jobs logs

Shows all diagnostics (both warnings and errors) recorded during a job. Diagnostics are recorded unconditionally for every warn/error event during processing, regardless of terminal output suppression.

bash
unsterwerx jobs logs [OPTIONS] <ID>

Arguments

| Argument | Required | Description |
| --- | --- | --- |
| ID | Yes | Job ID or unique prefix |

Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --level | string | | Filter by severity: error, warn, or warning |
| --limit | integer | 100 | Maximum number of diagnostics to show |
| --json | flag | | Output as JSON via emit_json envelope |

Example

bash
unsterwerx jobs logs a1b2 --limit 5
Diagnostics for job a1b2c3d4 (32 errors, 5 warnings)
  Timestamp            Level    Phase       Item                            Message
  ------------------------------------------------------------------------------------------
  2026-03-31 14:27:55  warning  parse       .../spreadsheet.xlsx            truncated to 10000 rows
  2026-03-31 14:28:01  error    parse       .../legacy-report.doc           unsupported format: doc
  2026-03-31 14:28:33  warning  canonical   .../big-presentation.pptx       slide count exceeds 500
  2026-03-31 14:29:12  error    canonical   .../corrupted.pdf               pdf extraction failed: no pages
  2026-03-31 14:30:01  warning  hash        .../empty-file.txt              zero-byte file skipped

bash
unsterwerx jobs logs a1b2 --level warn
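
When a job produces many diagnostics, a per-phase tally can show where the failures cluster. The sketch below assumes the column layout from the example output (date, time, level, phase, item, message); `count_by_phase` is a hypothetical helper.

```bash
# count_by_phase: read `unsterwerx jobs logs` text on stdin and print
# sorted "phase level count" lines. Assumes whitespace-separated columns
# in the order: date, time, level, phase, item, message.
count_by_phase() {
  awk '
    # Only data rows start with a YYYY-MM-DD date; this skips the
    # summary line, the header row, and the dashed rule.
    $1 ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ {
      counts[$4 " " $3]++
    }
    END {
      for (key in counts) print key, counts[key]
    }' | LC_ALL=C sort
}
```

For instance, `unsterwerx jobs logs a1b2 --limit 1000 | count_by_phase` would list one line per phase and level combination.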

Key Concepts

Job Lifecycle

Every job follows a state machine with these transitions:

queued → running → completed
                 → failed
                 → stopping → stopped
                 → paused → (resume) → running
                          → stopping → stopped
running → stale  (worker died without completing)

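Terminal states: completed, failed, stopped, stale. A job in a terminal state makes no further transitions; resuming a stale, stopped, or failed job with ingest --resume <ID> or import run --resume <ID> creates a new job instead.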
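
The diagram can be read as a table of allowed moves. As an illustration only, derived from the transitions listed above and not from the tool's source, a hypothetical `can_transition` check might look like this:

```bash
# can_transition FROM TO: exit 0 if the lifecycle diagram allows moving
# from state FROM to state TO, exit 1 otherwise. Hypothetical helper
# built only from the documented transitions.
can_transition() {
  case "$1:$2" in
    queued:running) return 0 ;;
    running:completed|running:failed|running:stale) return 0 ;;
    running:stopping|running:paused) return 0 ;;
    paused:running|paused:stopping) return 0 ;;
    stopping:stopped) return 0 ;;
    *) return 1 ;;   # includes every move out of a terminal state
  esac
}
```

Encoding the moves as a single `case` keeps the allowed pairs visible in one place, and anything not listed is rejected by default.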

Stale Detection

A background worker writes periodic heartbeats to the database. If a worker process dies unexpectedly (crash, kill signal, system reboot), its job stops receiving heartbeats. Stale detection is triggered automatically whenever any jobs subcommand runs: jobs whose last heartbeat is older than the configured threshold (ingest.stale_threshold_secs, default 120) are marked stale.

Stale jobs can be recovered with ingest --resume <ID> or import run --resume <ID>.
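
The staleness rule amounts to comparing the heartbeat age against the threshold. A minimal sketch, assuming timestamps are available as epoch seconds; `is_stale` is a hypothetical function and not the tool's implementation:

```bash
# is_stale HEARTBEAT_EPOCH NOW_EPOCH [THRESHOLD_SECS]
# Exit 0 if the heartbeat is older than the threshold, 1 otherwise.
# The default of 120 mirrors the documented ingest.stale_threshold_secs;
# everything else here is an illustrative assumption.
is_stale() {
  local heartbeat="$1" now="$2" threshold="${3:-120}"
  [ $(( now - heartbeat )) -gt "$threshold" ]
}
```

For example, `is_stale "$last_heartbeat" "$(date +%s)"` succeeds once more than 120 seconds have passed since the last heartbeat.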

ID Prefix Lookup

All subcommands that accept a job ID also accept a unique prefix of that ID. For example, if job a1b2c3d4-5e6f-7890-abcd-ef1234567890 is the only job starting with a1b2, then a1b2 is sufficient:

bash
unsterwerx jobs status a1b2
unsterwerx jobs stop a1b2
unsterwerx jobs errors a1b2

If the prefix is ambiguous (matches multiple jobs), the command returns an error asking for a longer prefix.
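
The lookup rule can be sketched as follows. This is a hypothetical `resolve_prefix` function that works on a list of IDs passed as arguments; how unsterwerx performs the lookup internally is not documented here.

```bash
# resolve_prefix PREFIX ID...: print the single ID that starts with PREFIX.
# Returns 1 if nothing matches and 2 if the prefix is ambiguous.
# Hypothetical helper illustrating the documented behavior.
resolve_prefix() {
  local prefix="$1" id match="" count=0
  shift
  for id in "$@"; do
    case "$id" in
      "$prefix"*) match="$id"; count=$((count + 1)) ;;
    esac
  done
  if [ "$count" -eq 1 ]; then
    printf '%s\n' "$match"
  elif [ "$count" -eq 0 ]; then
    echo "no job matches prefix '$prefix'" >&2
    return 1
  else
    echo "prefix '$prefix' is ambiguous ($count matches); use a longer prefix" >&2
    return 2
  fi
}
```

Quoting `"$prefix"` inside the `case` pattern makes the prefix match literally, so characters such as `*` or `?` in the input are not treated as wildcards.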

Resume vs. Unpause

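There are two distinct "resume" operations:

jobs resume <ID> unpauses an existing paused job in place. The same worker continues from where it left off, and the command only works on jobs in paused status.

ingest --resume <ID> or import run --resume <ID> creates a new job that continues from a stale, stopped, or failed predecessor.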
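
As a sketch of how the choice depends on job status and type, the hypothetical helper below prints the command to run without executing it. It assumes the resume flags are invoked through the unsterwerx binary and that the job type is either ingest or import, as in the list output.

```bash
# resume_command STATUS TYPE ID: print the command that continues a job.
# Hypothetical helper; STATUS and TYPE are the values reported by
# `unsterwerx jobs status`.
resume_command() {
  local status="$1" type="$2" id="$3"
  case "$status" in
    paused)
      # Unpause the existing worker in place.
      echo "unsterwerx jobs resume $id" ;;
    stale|stopped|failed)
      # Create a new job that continues from the predecessor.
      case "$type" in
        ingest) echo "unsterwerx ingest --resume $id" ;;
        import) echo "unsterwerx import run --resume $id" ;;
        *) echo "unknown job type: $type" >&2; return 1 ;;
      esac ;;
    *)
      echo "job in status '$status' cannot be resumed" >&2
      return 1 ;;
  esac
}
```

Printing the command instead of running it lets an operator review the choice first.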

Diagnostic Phases

Diagnostics record the NAC processing phase where each warning or error occurred:

| Phase | Description |
| --- | --- |
| scan | File discovery and size filtering |
| hash | SHA-256 content hashing |
| register | Database registration and deduplication |
| parse | Format-specific parsing (PDF, DOCX, XLSX, etc.) |
| canonical | Canonical markdown extraction |
| index | Full-text search indexing |

Notes