EXP-004: LARS 3D Task Reasoning v1

Date: 2025-12-29
Status: Completed - SUCCESS

Configuration

Base Model: qwen2.5-7b-abliterated
Dataset: lars_3d_combined.json (20 examples: 12 identity + 8 tasks)
Format: 3D with and tags
Epochs: 10
Learning Rate: 3e-4
Max Length: 1024 tokens
Training Time: ~5 minutes
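
The settings above could be captured in a small config object, which also makes the dataset split easy to sanity-check. This is only a sketch: the class name RunConfig and its field names are hypothetical and are not taken from the actual training script; only the values come from this log.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical container for the EXP-004 settings listed in this log."""
    base_model: str = "qwen2.5-7b-abliterated"
    dataset: str = "lars_3d_combined.json"
    identity_examples: int = 12
    task_examples: int = 8
    epochs: int = 10
    learning_rate: float = 3e-4
    max_length: int = 1024  # tokens

    @property
    def total_examples(self) -> int:
        # 12 identity + 8 task examples
        return self.identity_examples + self.task_examples

    @property
    def examples_seen(self) -> int:
        # Each epoch passes over every example once.
        return self.total_examples * self.epochs


config = RunConfig()
print(config.total_examples, config.examples_seen)
```

With 20 examples and 10 epochs, the model sees only 200 example presentations in total, which is consistent with the short training time noted above.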

Results

Loss: 2.96 → 0.026 (about a 99% reduction; converged well)
Task Reasoning Test: 4/4 novel tasks handled with 3D format
Multi-step Planning: WORKING
Storage Decision: WORKING
Ambiguity Handling: WORKING
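
To put the loss figures in perspective, the drop can be expressed as a ratio and as a relative reduction. The snippet below is plain arithmetic on the two values reported in this log; the variable names are illustrative.

```python
initial_loss = 2.96
final_loss = 0.026

# How many times smaller the final loss is than the initial loss.
reduction_factor = initial_loss / final_loss

# Fraction of the initial loss that was eliminated.
relative_drop = 1.0 - final_loss / initial_loss

print(f"reduction factor: {reduction_factor:.1f}x")  # 113.8x
print(f"relative drop: {relative_drop:.1%}")         # 99.1%
```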

Novel Task Test Results

'Find that thing saved last week' → Reasoned about Context/KB search
'Check website redesign project' → Planned to search Track, create if missing
'Document this conversation' → Listed storage options, asked for clarification
'Which experiment worked best' → Planned to compare KPIs, identify metrics

Key Achievement

LARS now reasons through multi-step tasks, not just identity questions. The thinking process includes:
- Breaking down vague requests
- Identifying storage locations
- Planning search operations
- Handling ambiguity

Issues Found

Some hallucination (made-up numbers such as '18 experiments')
Base model knowledge mixes with trained knowledge
However, the reasoning pattern itself is correct
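
One simple way to catch made-up numbers of this kind is to flag any number in a response that does not appear in the context the model was given. The following is a minimal sketch under that assumption; the function name unsupported_numbers and the sample strings are hypothetical and are not actual output from this run.

```python
import re


def unsupported_numbers(response: str, context: str) -> list[str]:
    """Return numbers that appear in the response but nowhere in the context."""
    number = re.compile(r"\d+(?:\.\d+)?")
    known = set(number.findall(context))
    flagged = []
    for token in number.findall(response):
        if token not in known and token not in flagged:
            flagged.append(token)
    return flagged


# Illustrative strings only; not actual model output from this run.
context = "EXP-003 used 12 examples. EXP-004 used 20 examples."
response = "Across 18 experiments, the run with 20 examples worked best."
print(unsupported_numbers(response, context))  # ['18']
```

A check like this only compares literal number strings, so it would miss paraphrased figures ("eighteen") and would flag legitimately computed values; it is a rough filter, not a full hallucination detector.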

Comparison to EXP-003

EXP-003: 12 examples, identity only → Identity works
EXP-004: 20 examples, identity + tasks → Tasks work too
Combined training preserves identity while adding capabilities
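
The combined dataset could be assembled along the following lines. This is a sketch under assumptions: the log does not say how the two sets were merged or whether they were shuffled, the record schema of lars_3d_combined.json is not shown, and the function name combine_examples and the placeholder records are hypothetical.

```python
import json
import random


def combine_examples(identity, tasks, seed=0):
    """Merge two example lists and shuffle them deterministically."""
    combined = list(identity) + list(tasks)
    # Shuffling is an assumption; it keeps the two kinds interleaved.
    random.Random(seed).shuffle(combined)
    return combined


# Placeholder records standing in for the real training examples.
identity = [{"kind": "identity", "id": i} for i in range(12)]
tasks = [{"kind": "task", "id": i} for i in range(8)]

combined = combine_examples(identity, tasks)
serialized = json.dumps(combined, indent=2)  # text that could be written to a JSON file
print(len(combined))  # 20
```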

Output

Path: ~/corlera-training/outputs/lars-3d-v2-tasks