bmachado
2025-11-05 00:24:05 +00:00
commit b8856c0660
1157 changed files with 26817 additions and 0 deletions

# ARC-AGI Version 1 Task IDs
## Summary
This directory contains the official list of **800 task IDs** from the original ARC-AGI Version 1 dataset.
- **Source**: [fchollet/ARC-AGI](https://github.com/fchollet/ARC-AGI) v1.0.2
- **Training Tasks**: 400
- **Evaluation Tasks**: 400
- **Total**: 800 tasks
## Files Generated
1. **arc_v1_official_task_ids.json** - Complete structured JSON with all V1 task IDs
2. **arc_v1_all_ids.txt** - Simple text file with all task IDs
3. **arc_v1_training_ids.txt** - Training task IDs only (400 tasks)
4. **arc_v1_evaluation_ids.txt** - Evaluation task IDs only (400 tasks)
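As a quick sanity check on the files listed above, the following sketch reads the two split files and confirms the expected counts. It assumes each `.txt` file holds one task ID per line, which this README does not state; `read_ids` and `check_v1_lists` are hypothetical helper names.

```python
from pathlib import Path


def read_ids(path):
    """Read task IDs from a text file, assuming one ID per line."""
    lines = Path(path).read_text().splitlines()
    # Skip blank lines and strip stray whitespace.
    return [line.strip() for line in lines if line.strip()]


def check_v1_lists(directory="."):
    """Return (training, evaluation) ID sets after checking the 400/400 counts."""
    base = Path(directory)
    training = set(read_ids(base / "arc_v1_training_ids.txt"))
    evaluation = set(read_ids(base / "arc_v1_evaluation_ids.txt"))
    if len(training) != 400 or len(evaluation) != 400:
        raise ValueError(
            f"expected 400/400 IDs, got {len(training)}/{len(evaluation)}"
        )
    return training, evaluation
```

Calling `check_v1_lists()` from this directory raises a `ValueError` if either file does not contain exactly 400 distinct IDs.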
## Key Findings About Your Dataset
Your local `arc_data` directory contains:
- **Training**: 1,000 tasks (600 more than V1)
- **Evaluation**: 120 tasks (280 fewer than V1)
- **Total**: 1,120 tasks
Neither count matches V1 (400 training, 400 evaluation), so your dataset is **not the original V1**. It likely contains:
- Extended/augmented training data
- Potentially a subset of evaluation data
- Possibly a mix of V1 and newer tasks
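The comparison above can be reproduced with a short script. This is a minimal sketch that assumes a layout of `arc_data/training/*.json` and `arc_data/evaluation/*.json` with one file per task, named after its task ID; that layout and the helper names are assumptions, not something this README confirms.

```python
from pathlib import Path

# Official V1 counts, taken from the summary above.
V1_COUNTS = {"training": 400, "evaluation": 400}


def local_task_ids(root, split):
    """Collect task IDs from <root>/<split>/*.json (assumed layout)."""
    return {p.stem for p in Path(root, split).glob("*.json")}


def compare_with_v1(root="arc_data"):
    """Return {split: (local_count, difference_from_v1)}."""
    report = {}
    for split, expected in V1_COUNTS.items():
        count = len(local_task_ids(root, split))
        report[split] = (count, count - expected)
    return report
```

For the counts reported above, the differences would be +600 for training and -280 for evaluation.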
## How to Identify V1 Tasks in Your Dataset
Use the task IDs in `arc_v1_official_task_ids.json` as a reference:
```python
import json
# Load official V1 IDs
with open('arc_v1_official_task_ids.json', 'r') as f:
    v1_data = json.load(f)
v1_task_ids = set(v1_data['all_task_ids'])
# Check if a task is V1
task_id = "007bbfb7"
is_v1 = task_id in v1_task_ids
print(f"Task {task_id} is V1: {is_v1}")
```
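To classify many tasks at once rather than one ID at a time, the same membership test can be expressed with set operations. The sketch below uses a hypothetical helper, `split_by_version`, and a toy input: two IDs from the V1 samples in this README plus one made-up ID (`ffffffff`).

```python
def split_by_version(local_ids, v1_task_ids):
    """Partition local task IDs into (V1, non-V1) sorted lists."""
    local_ids = set(local_ids)
    v1_task_ids = set(v1_task_ids)
    in_v1 = sorted(local_ids & v1_task_ids)
    not_in_v1 = sorted(local_ids - v1_task_ids)
    return in_v1, not_in_v1


# Illustrative input only: a tiny stand-in for the full V1 ID set.
v1_ids = {"007bbfb7", "00d62c1b"}
in_v1, not_in_v1 = split_by_version(["007bbfb7", "ffffffff"], v1_ids)
print(in_v1, not_in_v1)
```

In practice `v1_ids` would be the set loaded from `arc_v1_official_task_ids.json`, and `local_ids` the IDs found in your own dataset.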
## Sample V1 Training Task IDs
```
007bbfb7
00d62c1b
017c7c7b
025d127b
045e512c
0520fde7
05269061
05f2a901
06df4c85
08ed6ac7
```
## Sample V1 Evaluation Task IDs
```
00576224
009d5c81
00dbd492
03560426
05a7bcf2
0607ce86
0692e18c
070dd51e
08573cc6
0934a4d8
```
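Every sample ID listed in this README is eight lowercase hexadecimal characters. Assuming that pattern holds for all task IDs (the samples suggest it, but nothing here states it), a format check can catch stray entries such as filenames with extensions before they are compared against the V1 list. `is_well_formed` and `malformed_ids` are hypothetical names.

```python
import re

# Pattern inferred from the sample IDs: exactly eight lowercase hex characters.
TASK_ID_PATTERN = re.compile(r"[0-9a-f]{8}")


def is_well_formed(task_id):
    """Return True if task_id matches the assumed eight-hex-character format."""
    return TASK_ID_PATTERN.fullmatch(task_id) is not None


def malformed_ids(task_ids):
    """Return the IDs that do not match the assumed format, in input order."""
    return [t for t in task_ids if not is_well_formed(t)]
```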
## Usage
To tag your database with version information:
```python
import json
import pymysql
# Load V1 task IDs
with open('arc_v1_official_task_ids.json', 'r') as f:
    v1_data = json.load(f)
v1_task_ids = set(v1_data['all_task_ids'])
# Connect to database
connection = pymysql.connect(...)
cursor = connection.cursor()
# Add version column (this statement fails if the column already exists)
# VARCHAR(20) leaves room for the 14-character 'v2_or_extended' tag
cursor.execute("ALTER TABLE arc_jsons ADD COLUMN version VARCHAR(20)")
# Tag V1 tasks
for task_id in v1_task_ids:
    cursor.execute(
        "UPDATE arc_jsons SET version = 'v1' WHERE id = %s",
        (task_id,)
    )
# Tag non-V1 tasks as v2 or unknown
cursor.execute(
"UPDATE arc_jsons SET version = 'v2_or_extended' WHERE version IS NULL"
)
connection.commit()
```
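Before running the updates against a real database, the tagging logic can be rehearsed on a throwaway table. The sketch below swaps in Python's built-in `sqlite3` with an in-memory database in place of `pymysql`, so it is a dry run of the same two-step tagging and not the MySQL procedure itself. Note that `sqlite3` uses `?` placeholders where `pymysql` uses `%s`. The table and column names follow the snippet above; `tag_versions` is a hypothetical helper, and `ffffffff` is a made-up ID.

```python
import sqlite3


def tag_versions(connection, v1_task_ids):
    """Tag arc_jsons rows as 'v1' or 'v2_or_extended'; return row counts per tag."""
    cursor = connection.cursor()
    # Step 1: mark every row whose ID is in the V1 list.
    cursor.executemany(
        "UPDATE arc_jsons SET version = 'v1' WHERE id = ?",
        [(task_id,) for task_id in v1_task_ids],
    )
    # Step 2: everything still untagged is not part of V1.
    cursor.execute(
        "UPDATE arc_jsons SET version = 'v2_or_extended' WHERE version IS NULL"
    )
    connection.commit()
    cursor.execute("SELECT version, COUNT(*) FROM arc_jsons GROUP BY version")
    return dict(cursor.fetchall())


# Throwaway in-memory table with two V1 sample IDs and one made-up ID.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE arc_jsons (id TEXT PRIMARY KEY, version TEXT)")
demo.executemany(
    "INSERT INTO arc_jsons (id) VALUES (?)",
    [("007bbfb7",), ("00576224",), ("ffffffff",)],
)
counts = tag_versions(demo, {"007bbfb7", "00576224"})
print(counts)
```

Checking the per-tag counts after a rehearsal like this is a cheap way to confirm that the number of rows tagged `v1` matches what you expect before touching production data.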
## References
- [ARC-AGI Repository](https://github.com/fchollet/ARC-AGI)
- [ARC Prize](https://arcprize.org/)
- [Original Paper: On the Measure of Intelligence](https://arxiv.org/abs/1911.01547)