2024 Gemini15UnlockingMultimodalUnde

From GM-RKB
(Redirected from Reid, Savinov et al., 2024)

Subject Headings: Gemini 1.5 LLM.

Notes

Cited By

2024

Quotes

Abstract

In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra’s state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5’s long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k). Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks, achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content.

1. Introduction

We present our latest multimodal models from the Gemini line: Gemini 1.5 Pro and Gemini 1.5 Flash. They are members of Gemini 1.5, a new family of highly-capable multimodal models which incorporates a novel mixture-of-experts architecture as well as major advances in training, distillation and serving infrastructure that allow it to push the boundary of efficiency, reasoning, planning, multi-linguality, function calling and long-context performance. Gemini 1.5 models are built to handle extremely long contexts; they have the ability to recall and reason over fine-grained information from up to at least 10M tokens. This scale is unprecedented among contemporary large language models (LLMs), and enables the processing of long-form mixed-modality inputs including entire collections of documents, multiple hours of video, and almost five days long of audio.

...

Productivity Impact of LLMs Across Jobs

There is huge potential for LLMs to aid and augment people on routine, time-consuming, or repetitive tasks in the course of their jobs, leading thus to improved productivity. Here we specifically study and measure the productivity improvement that Gemini models bring to tasks from various professions.

In previous work, productivity or economic impact was measured in studies that classified jobs based on what current LLMs are able to do with human annotators or classifiers categorizing the tasks in each job as impacted by AI advances (Eloundou et al., 2023; Felten et al., 2018; World Economic Forum, 2023). Here, we conduct a practical exercise to evaluate how our models can help people from various industries in their jobs. Specifically, we ask participants to consider typical and complex tasks they do in the course of their jobs. This task description is then given as input to the models together with any other attached material required to complete these tasks (e.g., documents, web pages, spreadsheets, or images).

The 325 prompts we collected are rich depictions of user needs in practical settings. For example, a pre-school teacher might elicit activity ideas and worksheets for every day of a week (see Table 17, Appendix 12.9). Prompts are on average 277 words long, and 78% of them have at least one attachment. Additionally, we ask participants to indicate the difficulty of the task in terms of time and effort, and also the job expertise-level needed to complete it. Both these indicators were skewed towards higher complexity.

Interestingly, participants estimated that without any AI support, the average time to complete the task was 2.5 hours, indicating that these tasks typically involve significant effort. Raters from the same profession were then presented with model responses and asked to estimate how much time they would save using them as support for their tasks compared to having no AI support. Overall, raters estimated a 56.4% time saving for our prompt set with the 1.5 Pro model, and 27.7% for the 1.0 Pro model.
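To give a rough sense of scale, the percentages above can be converted into hours. The sketch below is illustrative only: it assumes the overall savings fraction can be applied directly to the 2.5-hour average baseline, which the report does not state (raters estimated savings per task, so the true average of hours saved may differ). The names `BASELINE_HOURS`, `SAVINGS`, and `hours_saved` are hypothetical and not taken from the report.

```python
# Figures quoted in the paragraph above.
BASELINE_HOURS = 2.5  # participants' average estimated task time without AI support
SAVINGS = {
    "Gemini 1.5 Pro": 0.564,  # raters' estimated overall time saving
    "Gemini 1.0 Pro": 0.277,
}


def hours_saved(baseline_hours: float, fraction_saved: float) -> float:
    """Hours saved, ASSUMING the fraction applies uniformly to the baseline."""
    return baseline_hours * fraction_saved


for model, fraction in SAVINGS.items():
    saved = hours_saved(BASELINE_HOURS, fraction)
    remaining = BASELINE_HOURS - saved
    print(f"{model}: ~{saved:.2f} h saved, ~{remaining:.2f} h remaining")
```

Under this simplifying assumption, the 1.5 Pro figure corresponds to roughly 1.4 of the 2.5 hours, about twice the saving estimated for 1.0 Pro.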

We also present these time savings by job categories in Figure 19. Our model responses were rated as saving time across all these jobs, with the 1.5 Pro model emerging stronger than the 1.0 Pro model. The 1.5 Pro model saves 26% time in the architecture domain, and has bigger gains in photography (73%) and programming (75%). As a qualitative measure, raters were also asked to judge the usefulness of the response on a scale from 1 to 5. The average usefulness of 1.5 Pro model responses was 4.0, and 2.7 for the 1.0 Pro model.

To the best of our knowledge, this is the first study to elicit real-world occupation-oriented prompts and examine the usefulness of LLMs to collaborate on these tasks. Overall, the 1.5 Gemini models significantly improve job productivity in multiple domains. We envision that these collaborative settings could be further improved using more suitable tools, additional model capabilities, and explanatory behavior.

Footnote 26: The average difficulty was 3.4 on a 5-point scale, and expertise was 1.8 average on a 3-point scale.

References


Page: 2024 Gemini15UnlockingMultimodalUnde
Author:
Sergey Brin
Chris Dyer
Jing Li
Jun Xu
Koray Kavukcuoglu
Jeffrey Dean
Sanjay Ghemawat
Cheng Li
Sebastian Riedel
Amir Globerson
Quan Yuan
Salvatore Scellato
Slav Petrov
James Manyika
Ivo Danihelka
Luke Vilnis
David Silver
Ioannis Antonoglou
Demis Hassabis (1976-)
Steven Hand
Oriol Vinyals
Ying Xu
Ming Zhang
Arnar Mar Hrafnkelsson
Zhe Chen
Dian Yu
Arthur Guez
George van den Driessche
Julian Schrittwieser
Timothy Lillicrap
Rohan Anil
Malcolm Reynolds
Adrià Puigdomènech Badia
Paul Barham
Michael Isard
Martin Wicke
Cicero Nogueira dos Santos
Kelvin Xu
Mohammad Saleh
Heiga Zen
Neil Houlsby
Paul Natsev
Fan Yang
Jacob Devlin
Tomas Kocisky
Yuanzhong Xu
Yonghui Wu
Maxim Krikun
Melvin Johnson
Meire Fortunato
Paul Michel
Pranav Shyam
Shaobo Hou
Diana Mincu
Zachary Nado
Angeliki Lazaridou
Radu Soricut
Yujia Li
Sebastian Borgeaud
Eliza Rutherford
Katie Millican
Jean-Baptiste Lespiau
Bogdan Damoc
Diego de Las Casas
Roman Ring
Tom Hennigan
Loren Maggiore
Albin Cassirer
Michela Paganini
Jack W. Rae
Aakanksha Chowdhery
Noah Fiedel
Fangyu Liu
Mostafa Dehghani
Le Hou
Hongkun Yu
Marco Selvi
Shariq Iqbal
Amol Mandhane
Elena Buchatskaya
Lisa Anne Hendricks
Eric Noland
Tom Le Paine
Srivatsan Srinivasan
Aditya Siddhant
Chenjie Gu
Orhan Firat
Machel Reid
Clement Farabet
Milad Nasr
Christopher A. Choquette-Choo
Katherine Lee
Bo Li
Aishwarya Kamath
Yong Cheng
Basil Mustafa
Jean-baptiste Alayrac
Siamak Shakeri
Luheng He
Ben Caine
Albert Webson
Bernd Bohnet
Ankesh Anand
Zaheer Abbas
Azade Nova
Vijay Bolina
Will Hawkins
Nikolay Savinov
Denis Teplyashin
Dmitry Lepikhin
Andrew Dai
Ethan Dyer
Mia Glaese
Thibault Sottiaux
Benjamin Lee
Fabio Viola
James Molloy
Jilin Chen
Ross McIlroy
Johan Schalkwyk
Eli Collins
Erica Moreira
Kareem Ayoub
Megha Goel
Clemens Meyer
Gregory Thornton
Zhen Yang
Henryk Michalewski
Nathan Schucher
Richard Ives
James Keeling
Karel Lenc
Salem Haykal
Stephen Spencer
Eren Sezener
Oscar Chang
Nobuyuki Morioka
George Tucker
Ce Zheng
Oliver Woodman
Nithya Attaluri
Evgenii Eltyshev
Xi Chen
Timothy Chung
Vittorio Selo
Siddhartha Brahma
Petko Georgiev
Ambrose Slone
Zhenkai Zhu
James Lottes
Siyuan Qiao
Alex Tomala
Martin Chadwick
Juliette Love
Peter Choy
Sid Mittal
Yunhao Tang
Matthew Lamm
Libin Bai
Qiao Zhang
Peter Humphreys
Yingjie Miao
Lukas Zilka
Taylor Tobin
Lev Proleev
Daniel Sohn
Alberto Magni
Isabel Gao
Santiago Ontanon
Oskar Bunyan
Nathan Byrd
Abhanshu Sharma
Biao Zhang
Mario Pinto
Rishika Sinha
Harsh Mehta
Dawei Jia
Sergi Caelles
Alex Morris
Becca Roelofs
Yifan Ding
Robin Strudel
Xuehan Xiong
Marvin Ritter
Rahma Chaabouni
Abhijit Karmarkar
Guangda Lai
Fabian Mentzer
Bibo Xu
YaGuang Li
Yujing Zhang
Alex Goldin
Behnam Neyshabur
Kate Baumli
Anselm Levskaya
Michael Laskin
Wenhao Jia
Kefan Xiao
Antoine He
Skye Giordano
Lakshman Yagati
Sanjay Ganapathy
Danilo Martins
Nanxin Chen
Yunhan Xu
Megan Barnes
Rhys May
Arpi Vezer
Junhyuk Oh
Ken Franko
Sophie Bridgers
Ruizhe Zhao
Boxi Wu
Sean Sechrist
Emilio Parisotto
Thanumalayan Sankaranarayana Pillai
Chris Larkin
Christina Sorokin
Alexey Guseynov
Jessica Landon
Romina Datta
Alexander Pritzel
Phoebe Thacker
Kevin Hui
Anja Hauth
Chih-Kuan Yeh
David Barker
Justin Mao-Jones
Sophia Austin
Hannah Sheahan
Parker Schuh
James Svensson
Rohan Jain
Vinay Ramasesh
Anton Briukhov
Da-Woon Chung
Tamara von Glehn
Christina Butterfield
Priya Jhakra
Matthew Wiethoff
Justin Frye
Jordan Grimstad
Beer Changpinyo
Charline Le Lan
Anna Bortsova
Paul Voigtlaender
Tara Sainath
Shane Gu
Charlotte Smith
Kris Cao
James Besley
Mark Omernick
Colin Gaffney
Gabriela Surita
Ryan Burnell
Junwhan Ahn
Andrew Brock
Mantas Pajarskas
Anastasia Petrushkina
Seb Noury
Lorenzo Blanco
Kevin Swersky
Arun Ahuja
Thi Avrahami
Vedant Misra
Raoul de Liedekerke
Mariko Iinuma
Alex Polozov
Sarah York
Justin Chiu
Rory Blevins
Zach Gleicher
Adrià Recasens
Alban Rrustemi
Elena Gribovskaya
Aurko Roy
Wiktor Gworek
Sébastien M. R. Arnold
Lisa Lee
James Lee-Thorp
Marcello Maggioni
Enrique Piqueras
Kartikeya Badola
Sharad Vikram
Lucas Gonzalez
Anirudh Baddepudi
Evan Senter
James Qin
Michael Azzam
Maja Trebacz
Martin Polacek
Kashyap Krishnakumar
Shuo-yiin Chang
Matthew Tung
Ivo Penchev
Rishabh Joshi
Kate Olszewska
Carrie Muir
Mateo Wirth
Ale Jakse Hartman
Josh Newlan
Sheleem Kashem
Elahe Dabir
Joost van Amersfoort
Zafarali Ahmed
James Cobon-Kerr
Ian Mackinnon
Alexandre Frechette
Xiance Si
Emanuel Taropa
Dong Li
Phil Crone
Anmol Gulati
Sébastien Cevey
Jonas Adler
Ada Ma
Simon Tokumine
Richard Powell
Stephan Lee
Kiran Vodrahalli
Samer Hassan
Antoine Yang
Nir Levine
Jenny Brennan
Mingqiu Wang
Sarah Hodkinson
Jeffrey Zhao
Josh Lipschultz
Aedan Pope
Michael B. Chang
Laurent El Shafey
Sholto Douglas
Fabio Pardo
Seth Odoom
Mihaela Rosca
Kedar Soparkar
Tom Hudson
Steven Hansen
Chulayuth Asawaroengchai
Ravi Addanki
Tianhe Yu
Wojciech Stokowiec
Mina Khan
Justin Gilmer
Jaehoon Lee
Carrie Grimes Bostock
Keran Rong
Jonathan Caton
Pedram Pejman
Filip Pavetic
Geoff Brown
Vivek Sharma
Mario Lučić
Rajkumar Samuel
Josip Djolonga
Lars Lowe Sjösund
Elspeth White
Natalie Clay
Jiepu Jiang
Hyeontaek Lim
Ross Hemsley
Zeyncep Cankara
Jane Labanowski
Nicola De Cao
David Steiner
Sayed Hadi Hashemi
Jacob Austin
Anita Gergely
Tim Blyth
Joe Stanton
Kaushik Shivakumar
Anders Andreassen
Carlos Araya
Nikhil Sethi
Rakesh Shivanna
Ankur Bapna
Ali Khodaei
Antoine Miech
Garrett Tanzer
Andy Swing
Shantanu Thakoor
Lora Aroyo
Zhufeng Pan
Jakub Sygnowski
Stephanie Winkler
Yamini Bansal
Xavier Garcia
Mehran Kazemi
Piyush Patil
Ishita Dasgupta
Iain Barr
Minh Giang
Thais Kagohara
Amit Marathe
Vladimir Feinberg
Mohamed Elhawaty
Nimesh Ghelani
Dan Horgan
Helen Miller
Lexi Walker
Richard Tanburn
Mukarram Tariq
Disha Shrivastava
Fei Xia
Qingze Wang
Chung-Cheng Chiu
Zoe Ashwood
Khuslen Baatarsukh
Sina Samangooei
Raphaël Lopez Kaufman
Fred Alcober
Axel Stjerngren
Paul Komarek
Katerina Tsihlas
Anudhyan Boral
Ramona Comanescu
Jeremy Chen
Ruibo Liu
Chris Welty
Dawn Bloxwich
Charlie Chen
Yanhua Sun
Fangxiaoyu Feng
Matthew Mauger
Xerxes Dotiwalla
Vincent Hellendoorn
Michael Sharman
Ivy Zheng
Krishna Haridasan
Gabe Barth-Maron
Craig Swanson
Dominika Rogozińska
Alek Andreev
Paul Kishan Rubenstein
Ruoxin Sang
Dan Hurt
Gamaleldin Elsayed
Renshen Wang
Dave Lacey
Anastasija Ilić
Yao Zhao
Adam Iwanicki
Alejandro Lince
Alexander Chen
Christina Lyu
Carl Lebsack
Jordan Griffith
Meenu Gaba
Paramjit Sandhu
Phil Chen
Anna Koop
Ravi Rajwar
Soheil Hassas Yeganeh
Solomon Chang
Rui Zhu
Soroush Radpour
Elnaz Davoodi
Ving Ian Lei
Yang Xu
Daniel Toyama
Constant Segal
Hanzhao Lin
Anna Bulanova
Nemanja Rakićević
Pablo Sprechmann
Angelos Filos
Víctor Campos
Nora Kassner
Devendra Sachan
Chimezie Iwuanyanwu
Vitaly Nikolaev
Balaji Lakshminarayanan
Sadegh Jazayeri
Mani Varadarajan
Chetan Tekur
Doug Fritz
Misha Khalman
David Reitter
Kingshuk Dasgupta
Shourya Sarcar
Tina Ornduff
Javier Snaider
Fantine Huot
Johnson Jia
Rupert Kemp
Nejc Trdin
Anitha Vijayakumar
Lucy Kim
Christof Angermueller
Li Lao
Tianqi Liu
Haibin Zhang
David Engel
Somer Greene
Anaïs White
Jessica Austin
Lilly Taylor
Shereen Ashraf
Dangyi Liu
Maria Georgaki
Irene Cai
Yana Kulizhskaya
Sonam Goenka
Brennan Saeta
Christian Frank
Dario de Cesare
Brona Robenek
Harry Richardson
Mahmoud Alnahlawi
Christopher Yew
Priya Ponnapalli
Marco Tagliasacchi
Alex Korchemniy
Yelin Kim
Dinghua Li
Bill Rosgen
Kyle Levin
Jeremy Wiesner
Praseem Banzal
Praveen Srinivasan
Çağlar Ünlü
David Reid
Zora Tung
Daniel Finchelstein
Ravin Kumar
Andre Elisseeff
Jin Huang
Ricardo Aguilar
Mai Giménez
Jiawei Xia
Olivier Dousse
Willi Gierke
Damion Yates
Komal Jalan
Lu Li
Eri Latorre-Chimoto
Duc Dung Nguyen
Ken Durden
Praveen Kallakuri
Yaxin Liu
Matthew Johnson
Tomy Tsai
Alice Talbert
Jasmine Liu
Alexander Neitz
Chen Elkind
Mimi Jasarevic
Livio Baldini Soares
Albert Cui
Pidong Wang
Alek Wenjiao Wang
Xinyu Ye
Krystal Kallarackal
Lucia Loher
Hoi Lam
Josef Broder
Dan Holtmann-Rice
Nina Martin
Bramandia Ramadhana
Mrinal Shukla
Sujoy Basu
Abhi Mohan
Nick Fernando
Kim Paterson
Hui Li
Ankush Garg
Jane Park
DongHyun Choi
Diane Wu
Sankalp Singh
Zhishuai Zhang
Lily Yu
John Carpenter
Félix de Chaumont Quitry
Carey Radebaugh
Chu-Cheng Lin
Alex Tudor
Prakash Shroff
Drew Garmon
Dayou Du
Neera Vats
Han Lu
Alex Yakubovich
Nilesh Tripuraneni
Haroon Qureshi
Nan Hua
Christel Ngani
Maria Abi Raad
Hannah Forbes
Jeff Stanway
Mukund Sundararajan
Victor Ungureanu
Colton Bishop
Yunjie Li
Balaji Venkatraman
Chloe Thornton
Nishesh Gupta
Yicheng Wang
Ian Tenney
Xihui Wu
Ashish Shenoy
Gabriel Carvajal
Diana Gage Wright
Ben Bariach
Zhuyun Xiao
Peter Hawkins
Sid Dalmia
Pedro Valenzuela
Ananth Agarwal
Mia Chen
Wooyeol Kim
Brice Hulse
Nandita Dukkipati
Adam Paszke
Andrew Bolt
Kiam Choo
Jennifer Beattie
Jennifer Prendki
Harsha Vashisht
Rebeca Santamaria-Fernandez
Luis C. Cobo
Jarek Wilkiewicz
David Madras
Ali Elqursh
Grant Uy
Kevin Ramirez
Matt Harvey
Tyler Liechty
Jeff Seibert
Clara Huiyi Hu
Andrey Khorlin
Maigo Le
Asaf Aharoni
Megan Li
Lily Wang
Sandeep Kumar
Norman Casagrande
Jay Hoover
Dalia El Badawy
David Soergel
Denis Vnukov
Matt Miecnikowski
Jiri Simsa
Praveen Kumar
Thibault Sellam
Daniel Vlasic
Samira Daruki
Nir Shabat
John Zhang
Guolong Su
Jiageng Zhang
Jeremiah Liu
Yi Sun
Evan Palmer
Alireza Ghaffarkhah
Xi Xiong
Victor Cotruta
Michael Fink
Lucas Dixon
Ashwin Sreevatsa
Adrian Goedeckemeyer
Alek Dimitriev
Mohsen Jafari
Remi Crocker
Nicholas FitzGerald
Aviral Kumar
Ivan Philips
Frederick Liu
Yannie Liang
Rachel Sterneck
Alena Repina
Marcus Wu
Laura Knight
Marin Georgiev
Hyo Lee
Harry Askham
Abhishek Chakladar
Annie Louis
Carl Crous
Hardie Cate
Dessie Petrova
Michael Quinn
Denese Owusu-Afriyie
Achintya Singhal
Nan Wei
Solomon Kim
Damien Vincent
Reiko Tojo
Shawn Lu
Yuchung Cheng
Tolga Bolukbasi
Saaber Fatehi
Rajagopal Ananthanarayanan
Miteyan Patel
Charbel Kaed
Shreyas Rammohan Belle
Jaclyn Konzelmann
Siim Põder
Roopal Garg
Vinod Koverkathu
Adam Brown
Rosanne Liu
Alanna Walton
Alicia Parrish
Mark Epstein
Sara McCarthy
Title: Gemini 1.5: Unlocking Multimodal Understanding Across Millions of Tokens of Context
Year: 2024