GPT-2 Benchmark Task

From GM-RKB
Revision as of 07:29, 22 August 2024 by Gmelli (talk | contribs) (Text replacement - "ions]] " to "ion]]s ")

A GPT-2 Benchmark Task is a Natural Language Processing Benchmark Task that evaluates the performance of GPT-2 in solving NLP tasks.

Model  LAMBADA (PPL)  LAMBADA (ACC)  CBT-CN (ACC)  CBT-NE (ACC)  WikiText2 (PPL)  PTB (PPL)  enwik8 (BPB)  text8 (BPC)  WikiText103 (PPL)  1BW (PPL)
SOTA   99.8           59.23          85.7          82.3          39.14            46.54      0.99          1.08         18.3               21.8
117M   35.13          45.99          87.65         83.4          29.41            65.85      1.16          1.17         37.50              75.20
345M   15.60          55.48          92.35         87.1          22.76            47.33      1.01          1.06         26.37              55.72
762M   10.87          60.12          93.45         88.0          19.93            40.31      0.97          1.02         22.05              44.575
1542M  8.63           63.24          93.30         89.05         18.34            35.76      0.93          0.98         17.48              42.16

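
The comparison against the SOTA row can be sketched in a few lines of Python. This is a minimal illustration, not part of the original benchmark: the scores are copied from the SOTA and 1542M rows above, and the "lower is better" / "higher is better" labels are an assumption based on the usual reading of the metrics (PPL, BPB and BPC are minimized, ACC is maximized). The names `RESULTS`, `beats_sota` and `columns_beating_sota` are hypothetical.

```python
# Scores copied from the table above: (assumed metric direction, SOTA, GPT-2 1542M).
# "lower" means lower is better (PPL, BPB, BPC); "higher" means higher is better (ACC).
RESULTS = {
    "LAMBADA (PPL)": ("lower", 99.8, 8.63),
    "LAMBADA (ACC)": ("higher", 59.23, 63.24),
    "CBT-CN (ACC)": ("higher", 85.7, 93.30),
    "CBT-NE (ACC)": ("higher", 82.3, 89.05),
    "WikiText2 (PPL)": ("lower", 39.14, 18.34),
    "PTB (PPL)": ("lower", 46.54, 35.76),
    "enwik8 (BPB)": ("lower", 0.99, 0.93),
    "text8 (BPC)": ("lower", 1.08, 0.98),
    "WikiText103 (PPL)": ("lower", 18.3, 17.48),
    "1BW (PPL)": ("lower", 21.8, 42.16),
}


def beats_sota(direction, sota, score):
    """Return True if score is strictly better than sota for this metric direction."""
    return score < sota if direction == "lower" else score > sota


def columns_beating_sota(results):
    """List the column names where the model improves on the SOTA row."""
    return [
        name
        for name, (direction, sota, score) in results.items()
        if beats_sota(direction, sota, score)
    ]


if __name__ == "__main__":
    wins = columns_beating_sota(RESULTS)
    print(f"{len(wins)} of {len(RESULTS)} columns improve on SOTA")
    for name in RESULTS:
        print(f"  {name}: {'better' if name in wins else 'worse'}")
```

Under these assumptions, the 1542M model improves on the SOTA row in every column except 1BW (PPL).
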
Model           R-1    R-2    R-L    R-AVG
Bottom-Up Sum   41.22  18.68  38.34  32.75
Lede-3          40.38  17.66  36.62  31.55
Seq2Seq + Attn  31.33  11.81  28.83  23.99
GPT-2 TL;DR:    29.34  8.27   26.58  21.40
Random-3        28.78  8.63   25.52  20.98
GPT-2 no hint   21.58  4.03   19.47  15.03
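
The R-AVG column appears to be the unweighted mean of R-1, R-2 and R-L, rounded to two decimals. The source does not state this, so treat it as an assumption; the sketch below only checks that the assumption reproduces every reported value in the table above. The names `ROUGE`, `REPORTED_AVG` and `rouge_avg` are hypothetical.

```python
# (R-1, R-2, R-L) per system, copied from the table above.
ROUGE = {
    "Bottom-Up Sum": (41.22, 18.68, 38.34),
    "Lede-3": (40.38, 17.66, 36.62),
    "Seq2Seq + Attn": (31.33, 11.81, 28.83),
    "GPT-2 TL;DR:": (29.34, 8.27, 26.58),
    "Random-3": (28.78, 8.63, 25.52),
    "GPT-2 no hint": (21.58, 4.03, 19.47),
}

# R-AVG values as reported in the table.
REPORTED_AVG = {
    "Bottom-Up Sum": 32.75,
    "Lede-3": 31.55,
    "Seq2Seq + Attn": 23.99,
    "GPT-2 TL;DR:": 21.40,
    "Random-3": 20.98,
    "GPT-2 no hint": 15.03,
}


def rouge_avg(r1, r2, rl):
    """Assumed definition: unweighted mean of the three ROUGE scores, rounded to 2 decimals."""
    return round((r1 + r2 + rl) / 3, 2)


if __name__ == "__main__":
    for name, scores in ROUGE.items():
        computed = rouge_avg(*scores)
        # Allow a small tolerance for rounding in the reported figures.
        matches = abs(computed - REPORTED_AVG[name]) < 0.006
        print(f"{name}: computed {computed:.2f}, reported {REPORTED_AVG[name]:.2f}, match={matches}")
```
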

