# Documentation
Technical reference for the AriaOS: Forge training pipeline.
## Pipeline Overview
AriaOS: Forge implements a four-stage pipeline: Collect, Prepare, Train, Deploy. Each stage is fully auditable, runs entirely on local hardware, and produces versioned artifacts. The pipeline is orchestrated by a single configuration file and monitored by the ResilientMind Agent.
## Data Collection Config
Configure data sources via forge.collect.yaml. Supported sources include local directories, Git repositories (local clones), database exports, and structured document feeds. All collection happens on-device with no network egress.
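A minimal `forge.collect.yaml` might look like the sketch below. The `type` values for Git and database sources, and any key beyond `type`, `path`, and `extensions`, are illustrative assumptions, not a confirmed schema:

```yaml
# forge.collect.yaml -- illustrative sketch; key names beyond type/path/extensions
# are assumptions, not a documented schema
sources:
  - type: directory
    path: /data/training-corpus
    extensions: [.py, .md, .yaml]
  - type: git
    path: /data/mirrors/internal-tools   # local clone; no network egress
  - type: database_export
    path: /data/exports/tickets.jsonl
```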
## Dataset Preparation
Raw data is transformed into instruction-response pairs using configurable templates. The preparation stage handles deduplication, quality filtering, tokenization validation, and train/eval splits. Output format is JSONL compatible with all major training frameworks.
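A prepared record might look like the following JSONL line. The field names (`instruction`, `response`, `source`) are an assumption for illustration; the source does not specify the exact schema:

```json
{"instruction": "Explain what the collect stage does.", "response": "The collect stage gathers raw data from configured local sources with no network egress.", "source": "/data/training-corpus/docs/pipeline.md"}
```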
## LoRA Training Parameters
Fine-tuning uses Low-Rank Adaptation (LoRA) to minimize GPU memory requirements. Key parameters include rank (default 16), alpha (default 32), dropout (default 0.05), and target modules. Training supports fp16 and gradient checkpointing for memory-constrained hardware.
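Putting the stated defaults together, a `train` section might look like this sketch. The `target_modules` list shown is a common choice for transformer attention projections and is an assumption, not a documented default:

```yaml
train:
  method: lora
  rank: 16        # default
  alpha: 32       # default
  dropout: 0.05   # default
  target_modules: [q_proj, k_proj, v_proj, o_proj]  # assumed example
  fp16: true
  gradient_checkpointing: true
```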
## Model Export & Ollama Deployment
Trained adapters are merged with the base model and exported to GGUF format for deployment via Ollama. The export stage handles quantization (Q4_K_M, Q5_K_M, Q8_0), Modelfile generation, and automatic registration with the local Ollama instance.
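The generated Modelfile might resemble the following sketch. The file name and parameter values are illustrative; `FROM` and `PARAMETER` are standard Ollama Modelfile directives:

```
# Modelfile (illustrative sketch of what the export stage might generate)
FROM ./my-domain-model.Q4_K_M.gguf
PARAMETER temperature 0.7
PARAMETER num_ctx 4096
```

Registration with the local Ollama instance presumably amounts to something like `ollama create my-domain-model -f Modelfile`.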
## ResilientMind Agent Config
The monitoring agent is configured via forge.agent.yaml. It watches pipeline stages for failures, resource exhaustion, and quality regressions, and supports configurable alert thresholds, automatic retry policies, and fallback paths (e.g., an fp16 fallback for CUDA kernel issues).
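A `forge.agent.yaml` sketch follows; every key name here is an assumption for illustration, since the source only names the capabilities, not the schema:

```yaml
# forge.agent.yaml -- illustrative; all key names are assumptions
watch:
  stages: [collect, prepare, train, deploy]
alerts:
  gpu_memory_pct: 90         # alert when GPU memory use exceeds 90%
  eval_loss_regression: 0.1  # alert when eval loss worsens by more than 0.1
retry:
  max_attempts: 3
fallback:
  fp16_on_cuda_error: true   # fall back to fp16 on CUDA kernel issues
```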
## Federal Deployment Guide
Step-by-step guide for deploying AriaOS: Forge in air-gapped federal environments. Covers offline dependency bundling, STIG-compatible system configuration, audit trail setup, and integration with AriaOS for continuous compliance monitoring.
## Quick Start
AriaOS: Forge runs on NVIDIA Jetson AGX Orin or any CUDA-capable Linux system with 32+ GB unified or GPU memory.
### Prerequisites
- NVIDIA Jetson AGX Orin (64 GB recommended) or CUDA-capable GPU
- JetPack 6.x or CUDA 12.x
- Python 3.10+
- Ollama (for model deployment)
### Installation
```shell
# Clone the repository (from a local mirror in air-gapped environments)
git clone https://github.com/ResilientMind-AI/ariaos-forge.git
cd ariaos-forge

# Install dependencies
pip install -r requirements.txt

# Initialize configuration
python -m forge init
```
### Running the Pipeline
```shell
# Run the full pipeline
python -m forge run --config forge.yaml

# Run individual stages
python -m forge collect --config forge.yaml
python -m forge prepare --config forge.yaml
python -m forge train --config forge.yaml
python -m forge deploy --config forge.yaml
```
### Configuration Example
```yaml
# forge.yaml
project:
  name: my-domain-model
  base_model: qwen2.5-coder:3b

collect:
  sources:
    - type: directory
      path: /data/training-corpus
      extensions: [.py, .md, .yaml]

prepare:
  template: instruction-response
  min_quality: 0.7
  train_split: 0.9

train:
  method: lora
  rank: 16
  alpha: 32
  epochs: 3
  batch_size: 4
  learning_rate: 2e-4
  fp16: true
  gradient_checkpointing: true

deploy:
  format: gguf
  quantization: Q4_K_M
  ollama_model_name: my-domain-model
```
For detailed configuration options, contact Joseph@ResilientMindAI.com.