GPT-2 Benchmark Task

From GM-RKB

A GPT-2 Benchmark Task is a Natural Language Processing Benchmark Task that evaluates the performance of GPT-2 on NLP tasks.

       LAMBADA (PPL)  LAMBADA (ACC)  CBT-CN (ACC)  CBT-NE (ACC)  WikiText2 (PPL)  PTB (PPL)  enwik8 (BPB)  text8 (BPC)  WikiText103 (PPL)  1BW (PPL)
SOTA   99.8           59.23          85.7          82.3          39.14            46.54      0.99          1.08         18.3               21.8
117M   35.13          45.99          87.65         83.4          29.41            65.85      1.16          1.17         37.50              75.20
345M   15.60          55.48          92.35         87.1          22.76            47.33      1.01          1.06         26.37              55.72
762M   10.87          60.12          93.45         88.0          19.93            40.31      0.97          1.02         22.05              44.575
1542M  8.63           63.24          93.30         89.05         18.34            35.76      0.93          0.98         17.48              42.16
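
As a rough illustration of how to read the table above, the following Python sketch counts the benchmarks on which the 1542M model beats the SOTA row. It assumes that perplexity (PPL), bits per byte (BPB), and bits per character (BPC) are lower-is-better and that accuracy (ACC) is higher-is-better. The names `BENCHMARKS`, `LOWER_IS_BETTER`, `beats_sota`, and `summarize` are illustrative only and do not come from any library.

```python
# Scores copied from the SOTA and 1542M rows of the table above.
BENCHMARKS = [
    # (dataset, metric, SOTA score, GPT-2 1542M score)
    ("LAMBADA", "PPL", 99.8, 8.63),
    ("LAMBADA", "ACC", 59.23, 63.24),
    ("CBT-CN", "ACC", 85.7, 93.30),
    ("CBT-NE", "ACC", 82.3, 89.05),
    ("WikiText2", "PPL", 39.14, 18.34),
    ("PTB", "PPL", 46.54, 35.76),
    ("enwik8", "BPB", 0.99, 0.93),
    ("text8", "BPC", 1.08, 0.98),
    ("WikiText103", "PPL", 18.3, 17.48),
    ("1BW", "PPL", 21.8, 42.16),
]

# Assumption: these metrics improve as they decrease; ACC improves as it increases.
LOWER_IS_BETTER = {"PPL", "BPB", "BPC"}


def beats_sota(metric, sota, score):
    """Return True if `score` is strictly better than `sota` for this metric."""
    if metric in LOWER_IS_BETTER:
        return score < sota
    return score > sota


def summarize(rows):
    """Split the benchmark labels into those where the model wins and loses."""
    wins, losses = [], []
    for dataset, metric, sota, score in rows:
        label = f"{dataset} ({metric})"
        (wins if beats_sota(metric, sota, score) else losses).append(label)
    return wins, losses


wins, losses = summarize(BENCHMARKS)
print(f"1542M beats SOTA on {len(wins)} of {len(BENCHMARKS)} columns")
print("Not improved:", ", ".join(losses))
```

Under these assumptions, the 1542M model improves on the SOTA row in every column except 1BW (PPL).
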

                R-1    R-2    R-L    R-AVG
Bottom-Up Sum   41.22  18.68  38.34  32.75
Lede-3          40.38  17.66  36.62  31.55
Seq2Seq + Attn  31.33  11.81  28.83  23.99
GPT-2 TL;DR:    29.34  8.27   26.58  21.40
Random-3        28.78  8.63   25.52  20.98
GPT-2 no hint   21.58  4.03   19.47  15.03
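
The R-AVG column appears to be the arithmetic mean of the R-1, R-2, and R-L columns. The sketch below checks that assumption against the reported values; `ROUGE_ROWS` and `rouge_avg` are hypothetical names introduced here for illustration.

```python
# Rows copied from the summarization table above:
# system -> (R-1, R-2, R-L, reported R-AVG)
ROUGE_ROWS = {
    "Bottom-Up Sum": (41.22, 18.68, 38.34, 32.75),
    "Lede-3": (40.38, 17.66, 36.62, 31.55),
    "Seq2Seq + Attn": (31.33, 11.81, 28.83, 23.99),
    "GPT-2 TL;DR:": (29.34, 8.27, 26.58, 21.40),
    "Random-3": (28.78, 8.63, 25.52, 20.98),
    "GPT-2 no hint": (21.58, 4.03, 19.47, 15.03),
}


def rouge_avg(r1, r2, rl):
    """Assumed definition: mean of the three scores, rounded to two decimals."""
    return round((r1 + r2 + rl) / 3, 2)


# Compare the recomputed average with the reported one for each system.
mismatches = [
    name
    for name, (r1, r2, rl, reported) in ROUGE_ROWS.items()
    if abs(rouge_avg(r1, r2, rl) - reported) > 1e-9
]
print("Rows where the recomputed mean differs:", mismatches)
```

An empty mismatch list means every reported R-AVG is consistent with a simple mean of the other three columns.
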

