# ARC-AGI Version 1 Task IDs

## Summary

This directory contains the official list of **800 task IDs** from the original ARC-AGI Version 1 dataset.

- **Source**: [fchollet/ARC-AGI](https://github.com/fchollet/ARC-AGI) v1.0.2
- **Training Tasks**: 400
- **Evaluation Tasks**: 400
- **Total**: 800 tasks

## Files Generated

1. **arc_v1_official_task_ids.json** - Complete structured JSON with all V1 task IDs
2. **arc_v1_all_ids.txt** - Simple text file with all task IDs
3. **arc_v1_training_ids.txt** - Training task IDs only (400 tasks)
4. **arc_v1_evaluation_ids.txt** - Evaluation task IDs only (400 tasks)
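
The three text files can be cross-checked against the counts in the summary. The sketch below assumes each `.txt` file holds one task ID per line, which this README does not state explicitly; `load_ids` and `check_id_lists` are hypothetical helper names, not part of any existing tooling.

```python
from pathlib import Path


def load_ids(path):
    """Read task IDs from a text file, assuming one ID per line (blank lines skipped)."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def check_id_lists(training, evaluation, combined, expected_per_split=400):
    """Return a list of human-readable problems; an empty list means the files agree."""
    problems = []
    if len(training) != expected_per_split:
        problems.append(f"expected {expected_per_split} training IDs, found {len(training)}")
    if len(evaluation) != expected_per_split:
        problems.append(f"expected {expected_per_split} evaluation IDs, found {len(evaluation)}")
    overlap = set(training) & set(evaluation)
    if overlap:
        problems.append(f"{len(overlap)} IDs appear in both splits")
    if set(combined) != set(training) | set(evaluation):
        problems.append("combined list does not equal training + evaluation")
    return problems
```

Calling `check_id_lists(load_ids('arc_v1_training_ids.txt'), load_ids('arc_v1_evaluation_ids.txt'), load_ids('arc_v1_all_ids.txt'))` should return an empty list if the files match the summary above.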

## Key Findings About Your Dataset

Your local `arc_data` directory contains:
- **Training**: 1,000 tasks (600 more than V1)
- **Evaluation**: 120 tasks (280 fewer than V1)
- **Total**: 1,120 tasks

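These counts differ from V1 in both splits, so your dataset is **not** a plain copy of the original V1. They do match the split sizes of the public ARC-AGI-2 release (1,000 training and 120 evaluation tasks), so it likely contains:
- Extended or augmented training data beyond the 400 V1 training tasks
- An evaluation set that is smaller than, and possibly different from, the V1 evaluation set
- Possibly a mix of V1 and newer tasks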
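
To measure how much of the local data actually overlaps with V1, the local task IDs can be compared against the official list. This is a minimal sketch that assumes each task is stored as `<task_id>.json` inside a split directory (for example `arc_data/training`); that layout and the helper names are assumptions, so adjust them to your setup.

```python
from pathlib import Path


def local_task_ids(split_dir):
    """Collect task IDs from a directory, assuming files are named <task_id>.json."""
    return {path.stem for path in Path(split_dir).glob("*.json")}


def compare_to_v1(local_ids, v1_ids):
    """Split local IDs into V1 and non-V1, and list V1 IDs that are absent locally."""
    local_ids, v1_ids = set(local_ids), set(v1_ids)
    return {
        "in_v1": sorted(local_ids & v1_ids),
        "not_in_v1": sorted(local_ids - v1_ids),
        "missing_from_local": sorted(v1_ids - local_ids),
    }
```

For example, `compare_to_v1(local_task_ids('arc_data/training'), v1_task_ids)` reports which local training tasks are V1 tasks, given `v1_task_ids` loaded from `arc_v1_official_task_ids.json`.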

## How to Identify V1 Tasks in Your Dataset

Use the task IDs in `arc_v1_official_task_ids.json` as a reference:

```python
import json

# Load official V1 IDs
with open('arc_v1_official_task_ids.json', 'r') as f:
    v1_data = json.load(f)
    v1_task_ids = set(v1_data['all_task_ids'])

# Check if a task is V1
task_id = "007bbfb7"
is_v1 = task_id in v1_task_ids
print(f"Task {task_id} is V1: {is_v1}")
```

## Sample V1 Training Task IDs

```
007bbfb7
00d62c1b
017c7c7b
025d127b
045e512c
0520fde7
05269061
05f2a901
06df4c85
08ed6ac7
```

## Sample V1 Evaluation Task IDs

```
00576224
009d5c81
00dbd492
03560426
05a7bcf2
0607ce86
0692e18c
070dd51e
08573cc6
0934a4d8
```

## Usage

To tag your database with version information:

```python
import json
import pymysql

# Load V1 task IDs
with open('arc_v1_official_task_ids.json', 'r') as f:
    v1_data = json.load(f)
    v1_task_ids = set(v1_data['all_task_ids'])

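# Connect to database (fill in your own connection parameters)
connection = pymysql.connect(...)

try:
    with connection.cursor() as cursor:
        # Add the version column. MySQL has no ADD COLUMN IF NOT EXISTS,
        # so ignore error 1060 (duplicate column name) if it already exists.
        # VARCHAR(32) leaves room for the 14-character 'v2_or_extended' tag,
        # which would not fit in VARCHAR(10).
        try:
            cursor.execute("ALTER TABLE arc_jsons ADD COLUMN version VARCHAR(32)")
        except pymysql.MySQLError as exc:
            if exc.args[0] != 1060:
                raise

        # Tag V1 tasks
        cursor.executemany(
            "UPDATE arc_jsons SET version = 'v1' WHERE id = %s",
            [(task_id,) for task_id in v1_task_ids],
        )

        # Tag everything still untagged as v2 or unknown
        cursor.execute(
            "UPDATE arc_jsons SET version = 'v2_or_extended' WHERE version IS NULL"
        )

    connection.commit()
finally:
    connection.close()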
```
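
Before running the updates against the real database, the tagging logic can be tried on a throwaway table. The sketch below swaps in Python's built-in `sqlite3` with an in-memory database instead of MySQL, so it uses `?` placeholders rather than `%s`. The two-column `arc_jsons` schema, the helper names, and the ID `zzzzzzzz` are made up for illustration.

```python
import sqlite3


def make_demo_db(task_ids):
    """Build an in-memory table shaped like the assumed arc_jsons schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE arc_jsons (id TEXT PRIMARY KEY, version TEXT)")
    conn.executemany(
        "INSERT INTO arc_jsons (id) VALUES (?)", [(t,) for t in task_ids]
    )
    return conn


def tag_versions(conn, v1_task_ids):
    """Tag rows as 'v1' or 'v2_or_extended' and return the row count per tag."""
    cur = conn.cursor()
    cur.executemany(
        "UPDATE arc_jsons SET version = 'v1' WHERE id = ?",
        [(task_id,) for task_id in sorted(v1_task_ids)],
    )
    cur.execute(
        "UPDATE arc_jsons SET version = 'v2_or_extended' WHERE version IS NULL"
    )
    conn.commit()
    cur.execute("SELECT version, COUNT(*) FROM arc_jsons GROUP BY version")
    return dict(cur.fetchall())


# Two real V1 IDs from the samples above plus one made-up non-V1 ID.
demo = make_demo_db(["007bbfb7", "00d62c1b", "zzzzzzzz"])
print(tag_versions(demo, {"007bbfb7", "00d62c1b"}))
```

The per-tag counts returned by `tag_versions` give a quick way to confirm that every row ends up with exactly one tag.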

## References

- [ARC-AGI Repository](https://github.com/fchollet/ARC-AGI)
- [ARC Prize](https://arcprize.org/)
- [Original Paper: On the Measure of Intelligence](https://arxiv.org/abs/1911.01547)