HotpotQA Benchmarking Task

From GM-RKB

A HotpotQA Benchmarking Task is an LLM inference evaluation task that can be used to evaluate multi-hop reasoning in question answering by requiring evidence aggregation across multiple documents.
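
As an illustration of how such a task might be scored, the following is a minimal sketch, not the official evaluation script. It assumes SQuAD-style answer normalization, an exact-match and token-level F1 score for the answer, and a set-based F1 over supporting facts represented as (document title, sentence index) pairs. All function names and the sample item are hypothetical and are not taken from this page.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, strip punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; the page does not specify one.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> bool:
    """True when the normalized prediction equals the normalized gold answer."""
    return normalize_answer(prediction) == normalize_answer(gold)


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def supporting_fact_f1(predicted: set, gold: set) -> float:
    """F1 over sets of (document title, sentence index) evidence pairs."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical two-hop item: the gold evidence spans two different documents.
gold_item = {
    "answer": "Paris",
    "supporting_facts": {("Doc A", 0), ("Doc B", 2)},
}
model_output = {
    "answer": "Paris, France",
    "supporting_facts": {("Doc A", 0), ("Doc C", 1)},
}

answer_em = exact_match(model_output["answer"], gold_item["answer"])
answer_f1 = token_f1(model_output["answer"], gold_item["answer"])
evidence_f1 = supporting_fact_f1(
    model_output["supporting_facts"], gold_item["supporting_facts"]
)
print(answer_em, round(answer_f1, 3), round(evidence_f1, 3))
```

Scoring the cited evidence separately from the answer is one way to check whether a model actually aggregated information across documents rather than guessing the answer from a single passage.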


