Abstraction and Reasoning Corpus (ARC) Benchmark
An Abstraction and Reasoning Corpus (ARC) Benchmark is an AI benchmark designed to measure an AI system's ability to perform abstract reasoning and to generalize from a few examples, in a way that closely mirrors human cognitive processes.
- Context:
- It can (typically) test the ability of an AI to acquire new skills and solve novel problems using core knowledge naturally possessed by humans.
- It can (often) include tasks that require the AI to transform input grids into output grids based on a few demonstrated examples (see the task-format sketch after this list).
- It can range from being a simple grid-transformation task to being a complex pattern recognition challenge.
- It can highlight the gap between human and machine learning capabilities, especially in abstract reasoning.
- It can serve as a benchmark for Artificial General Intelligence (AGI) research, pushing the boundaries of current AI systems.
- ...
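The task format mentioned above can be illustrated with a minimal sketch. It assumes the JSON layout used by the public ARC dataset (the fchollet/ARC GitHub repository), where each task file holds `train` and `test` lists of input/output pairs and each grid is a list of rows of integers 0–9 denoting colors; the file path and solver name in the usage comment are hypothetical placeholders.

```python
import json


def load_arc_task(path):
    """Load one ARC task: a dict with 'train' and 'test' lists of
    {'input': grid, 'output': grid} pairs, where a grid is a list of
    rows of integers 0-9 (each integer encodes a color)."""
    with open(path) as f:
        return json.load(f)


def evaluate_candidate(task, solver):
    """Score a candidate solver: it sees only the demonstration pairs
    and must reproduce the exact output grid for each test input."""
    correct = 0
    for pair in task["test"]:
        predicted = solver(task["train"], pair["input"])
        correct += int(predicted == pair["output"])
    return correct, len(task["test"])


# Hypothetical usage (path and solver are placeholders, not part of ARC itself):
# task = load_arc_task("data/training/0a1b2c3d.json")
# print(evaluate_candidate(task, my_solver))
```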
- Example(s):
- an ARC task that involves deducing the pattern linking pairs of colored grids (see the toy sketch after this list).
- an ARC evaluation where AI systems must solve new tasks without prior specific training.
- ...
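To make the first example concrete, the sketch below hard-codes a toy ARC-style task (a hypothetical horizontal-mirror transformation, not an actual task from the corpus) and checks a candidate rule against the demonstration pairs before applying it to the test input.

```python
# A toy ARC-style task: each output grid is the input grid mirrored
# left-to-right. The grids and the task itself are hypothetical.
toy_task = {
    "train": [
        {"input": [[1, 0], [2, 3]], "output": [[0, 1], [3, 2]]},
        {"input": [[5, 5, 0], [0, 4, 4]], "output": [[0, 5, 5], [4, 4, 0]]},
    ],
    "test": [{"input": [[7, 0, 0], [0, 7, 0]]}],
}


def mirror_rule(train_pairs, test_input):
    """Candidate rule: mirror each row left-to-right (ignores train_pairs,
    which are only used below to verify the rule)."""
    return [list(reversed(row)) for row in test_input]


# Accept the rule only if it reproduces every demonstration pair.
assert all(
    mirror_rule(toy_task["train"], pair["input"]) == pair["output"]
    for pair in toy_task["train"]
)
print(mirror_rule(toy_task["train"], toy_task["test"][0]["input"]))
# [[0, 0, 7], [0, 7, 0]]
```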
- Counter-Example(s):
References
2024
- ([New Scientist, 2024](https://www.newscientist.com/article/2437029-1m-prize-for-ai-that-can-solve-puzzles-that-are-simple-for-humans/))
- NOTES:
- ARC, which stands for Abstraction and Reasoning Corpus, is a set of puzzles designed to challenge sophisticated artificial intelligence models.
- The puzzles are relatively easy for humans to solve but difficult for AI.
- There is a $1 million prize fund for AI that can successfully solve the ARC puzzles.
- The goal is to encourage AI developers to create new techniques and approaches.
- ARC tests skills that current AI models lack, despite claims of "human-level performance" on some real-world tests.
- The puzzles involve deducing correct patterns linking pairs of colored grids.
- ARC aims to push AI development towards more human-like reasoning and intelligence.