Training Run Summary

End-to-end pipeline execution details.

Training Samples:             623
Full Pipeline Duration:       ~10h
Fine-tuned Model Parameters:  3B
Training Method:              LoRA
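
For context, a minimal sketch of how a LoRA fine-tune of a 3B model is typically set up with Hugging Face PEFT. The base checkpoint name, rank, alpha, and target modules below are illustrative assumptions, not values recorded from this run; only the fp16-plus-gradient-checkpointing combination comes from the memory figures later in this summary.

```python
# Illustrative LoRA setup with Hugging Face PEFT. Hyperparameters and the
# base checkpoint are assumptions, not the values used in this run.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = "Qwen/Qwen2.5-Coder-3B-Instruct"  # assumed base checkpoint
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.float16)
model.gradient_checkpointing_enable()  # trades compute for activation memory

lora = LoraConfig(
    r=16,             # adapter rank (assumption)
    lora_alpha=32,    # scaling factor (assumption)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # common choice
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the adapter weights train
```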

A/B Model Comparison

Head-to-head evaluation of the fine-tuned model against the base model.

Model                Quality  Tok/s  TTFT       Result
ariaos-forge:latest  80/100   19.7   4,467 ms   Winner
qwen2.5-coder:7b     60/100   10.3   29,853 ms  -

Quality scores come from the PraetorianMind Model A/B Compare tool, a heuristic evaluation. The speed gap largely reflects the 3B vs. 7B parameter difference.
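
PraetorianMind's scoring internals are not documented in this summary; purely as an illustration of what a heuristic A/B compare can look like, the sketch below scores two answers against a crude rubric and picks a winner. Every check in it is an assumption, not the tool's actual method.

```python
# Purely illustrative heuristic A/B scorer -- NOT PraetorianMind's actual
# rubric. Scores two model outputs out of 100 using crude proxy checks.
def heuristic_score(answer: str) -> int:
    score = 0
    if "```" in answer:                            # contains a code block
        score += 40
    score += min(len(answer.split()), 300) // 10   # rewards completeness, capped
    if "i don't know" not in answer.lower():       # no explicit refusal
        score += 30
    return min(score, 100)

def ab_compare(answer_a: str, answer_b: str) -> str:
    """Return which answer wins under the heuristic rubric."""
    a, b = heuristic_score(answer_a), heuristic_score(answer_b)
    return "A" if a >= b else "B"
```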

Inference Speed

Token generation performance on the NVIDIA Jetson AGX Orin.

Metric               ariaos-forge:latest  qwen2.5-coder:7b
Tokens per second    19.7 tok/s           10.3 tok/s
Time to first token  4,467 ms             29,853 ms

Speedup (TTFT):       6.7x faster time to first token (29,853 ms / 4,467 ms)
Speedup (throughput): 1.9x faster token generation (19.7 / 10.3 tok/s)
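
These figures can be reproduced against a local Ollama server with a short streaming benchmark: time the first streamed token for TTFT, then read the eval_count and eval_duration fields Ollama reports on the final chunk for throughput. The endpoint and fields are the stock Ollama API; the prompt is an arbitrary placeholder.

```python
# Minimal benchmark against a local Ollama server (default port 11434).
# TTFT = time until the first streamed token; tok/s comes from the
# eval_count / eval_duration fields Ollama reports on the final chunk.
import json
import time
import urllib.request

def benchmark(model: str, prompt: str = "Write a Python quicksort.") -> None:
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps({"model": model, "prompt": prompt, "stream": True}).encode(),
        headers={"Content-Type": "application/json"},
    )
    start = time.monotonic()
    ttft = None
    with urllib.request.urlopen(req) as resp:
        for line in resp:                          # NDJSON stream, one chunk per line
            chunk = json.loads(line)
            if ttft is None and chunk.get("response"):
                ttft = time.monotonic() - start    # first token arrived
            if chunk.get("done"):
                toks = chunk["eval_count"]
                secs = chunk["eval_duration"] / 1e9  # nanoseconds -> seconds
                print(f"{model}: TTFT {ttft * 1000:,.0f} ms, {toks / secs:.1f} tok/s")

for m in ("ariaos-forge:latest", "qwen2.5-coder:7b"):
    benchmark(m)
```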

Memory Usage

Resource consumption during training and inference.

Phase                GPU Memory  System RAM  Notes
LoRA Training        ~28 GB      ~16 GB      fp16 with gradient checkpointing
Inference (3B)       ~6 GB       ~4 GB       Ollama runtime
Inference (7B base)  ~14 GB      ~8 GB       Ollama runtime
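
On Jetson the GPU and CPU share one unified memory pool, so numbers like these are usually read from tegrastats rather than nvidia-smi. A sketch of tracking peak usage during a run, assuming tegrastats' usual "RAM used/totalMB" field format:

```python
# Sample memory usage during training/inference by parsing tegrastats output.
# Assumes the usual "RAM <used>/<total>MB" field in each line; adjust the
# regex if your JetPack release formats it differently.
import re
import subprocess

proc = subprocess.Popen(
    ["tegrastats", "--interval", "1000"],  # one sample per second
    stdout=subprocess.PIPE, text=True,
)
peak = 0
try:
    for line in proc.stdout:
        m = re.search(r"RAM (\d+)/(\d+)MB", line)
        if m:
            used = int(m.group(1))
            peak = max(peak, used)
            print(f"RAM {used} MB used (peak {peak} MB of {m.group(2)} MB)")
except KeyboardInterrupt:
    proc.terminate()
```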

Hardware Configuration

Platform:    NVIDIA Jetson AGX Orin
GPU Memory:  64 GB unified
CUDA Cores:  2048
Network:     Air-gapped (zero cloud)
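
Air-gapped operation implies every checkpoint and tool is already on disk. With Hugging Face tooling that is typically enforced through the offline environment variables shown below; the cache path and model name are placeholders.

```python
# Enforce fully offline operation for an air-gapped box: Hugging Face
# libraries will error out instead of attempting any network fetch.
import os

os.environ["HF_HUB_OFFLINE"] = "1"        # huggingface_hub: never hit the network
os.environ["TRANSFORMERS_OFFLINE"] = "1"  # transformers: local files only
os.environ["HF_HOME"] = "/data/hf-cache"  # placeholder local cache path

from transformers import AutoTokenizer

# Loads only if the checkpoint is already cached locally (name is illustrative).
tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-3B-Instruct")
```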