Abstraction and Reasoning Corpus (ARC) Benchmark


An Abstraction and Reasoning Corpus (ARC) Benchmark is an AI benchmark designed to measure an AI system's ability to perform abstract reasoning and to generalize from a few examples, closely mimicking human cognitive processes.

  • Context:
    • It can (typically) test the ability of an AI to acquire new skills and solve novel problems using core knowledge naturally possessed by humans.
    • It can (often) include tasks that require the AI to transform input grids into output grids based on a few demonstrated examples (see the sketch after this outline).
    • It can range from being a simple grid-transformation task to being a complex pattern recognition challenge.
    • It can highlight the gap between human and machine learning capabilities, especially in abstract reasoning.
    • It can serve as a benchmark for Artificial General Intelligence (AGI) research, pushing the boundaries of current AI systems.
    • ...
  • Example(s):
    • an ARC task that involves deducing patterns linking pairs of colored grids.
    • an ARC evaluation where AI systems must solve new tasks without prior specific training.
    • ...
  • Counter-Example(s):
    • MNIST dataset, which focuses on digit recognition and does not require abstract reasoning.
    • ImageNet benchmark, which evaluates object recognition in images but does not test the ability to generalize from few examples.
    • ...
  • See: Artificial General Intelligence (AGI), Abstract Reasoning, MNIST Dataset, ImageNet Benchmark.
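
The grid-transformation tasks described above can be made concrete with a minimal Python sketch. The task dictionary, the swap_zero_nonzero rule, and the rule_fits helper below are hypothetical illustrations; only the overall structure (train/test pairs of 2-D integer grids, with integers 0-9 denoting colors) follows the publicly released ARC task format.

    import json

    # A toy ARC-style task (hypothetical): "train" holds demonstrated
    # input/output pairs, "test" holds inputs whose outputs must be predicted.
    # Grids are 2-D lists of integers 0-9, each integer denoting a color.
    task = {
        "train": [
            {"input": [[1, 0], [0, 1]], "output": [[0, 1], [1, 0]]},
            {"input": [[2, 2], [0, 2]], "output": [[0, 0], [2, 0]]},
        ],
        "test": [
            {"input": [[3, 0], [3, 3]]},
        ],
    }

    def swap_zero_nonzero(grid):
        """Candidate rule: swap the background color (0) with the grid's single non-zero color."""
        colors = {c for row in grid for c in row if c != 0}
        color = colors.pop() if colors else 0
        return [[color if c == 0 else 0 for c in row] for row in grid]

    def rule_fits(rule, pairs):
        """Check whether a candidate rule reproduces every demonstrated output."""
        return all(rule(p["input"]) == p["output"] for p in pairs)

    # Accept the rule only if it explains all demonstrations, then apply it to the test input.
    if rule_fits(swap_zero_nonzero, task["train"]):
        prediction = swap_zero_nonzero(task["test"][0]["input"])
        print(json.dumps(prediction))  # [[0, 3], [0, 0]]

In the benchmark itself, a prediction is credited only if the produced output grid matches the hidden test output exactly; the solver above simply illustrates the "infer a rule from a few demonstrations, then apply it" structure of an ARC task.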

