Ethan Perez

Authors

Tamera Lanham, Anna Chen, Ansh Radhakrishnan, Benoit Steiner, Carson Denison, Danny Hernandez, +22 more, Samuel R Bowman, Ethan Perez

Question Decomposition Improves the Faithfulness of Model-Generated Reasoning

We improve the faithfulness of model-generated reasoning; continued improvements may lead to reasoning that enables us to verify the correctness and safety of LLM behavior.

arXiv 2023

Authors

Ansh Radhakrishnan, Karina Nguyen, Anna Chen, Carol Chen, Carson Denison, Danny Hernandez, +16 more, Samuel R Bowman, Ethan Perez

Inverse Scaling: When Bigger Isn’t Better

We present evidence for the claim that LMs may show inverse scaling, or worse task performance with increased scale.

arXiv 2023

Authors

Ian R. McKenzie, Alexander Lyzhov, Michael Pieler, Alicia Parrish, Aaron Mueller, Ameya Prabhu, +19 more, Samuel R. Bowman, Ethan Perez

Training Language Models with Language Feedback at Scale

Pretrained language models often generate harmful or incorrect outputs. Imitation Learning from Language Feedback addresses this issue, leading to roughly human-level summarization performance.

arXiv 2023

Authors

Jérémy Scheurer, Jon Ander Campos, Tomasz Korbak, Jun Shern Chan, Angelica Chen, Kyunghyun Cho, Ethan Perez

Improving Code Generation by Training with Natural Language Feedback

We develop an algorithm that improves language models’ performance on code generation tasks using minimal human-written feedback during training, making it user-friendly and sample-efficient.

arXiv 2023

Authors

Angelica Chen, Jérémy Scheurer, Tomasz Korbak, Jon Ander Campos, Jun Shern Chan, Samuel R Bowman, Kyunghyun Cho, Ethan Perez

Pretraining Language Models with Human Preferences

We propose methods for pretraining language models with human preferences, resulting in much better preference satisfaction than the standard pretrain-then-finetune paradigm.

arXiv 2023

Authors

Tomasz Korbak, Kejian Shi, Angelica Chen, Rasika Bhalerao, Christopher L. Buckley, Jason Phang, Samuel R. Bowman, Ethan Perez

The Capacity for Moral Self-Correction in Large Language Models

We find that language models can self-correct their own biases against different demographic groups.

arXiv 2023

Authors

Deep Ganguli*, Amanda Askell*, Nicholas Schiefer, Thomas I. Liao, Kamile Lukošiute, Anna Chen, +41 more, Samuel R. Bowman, Jared Kaplan

Discovering Language Model Behaviors with Model-Written Evaluations

We’ve developed an automated way to generate evaluations with LMs. We test LMs using >150 LM-written evaluations, uncovering novel LM behaviors and risks.

arXiv 2022

Cite Data Data Visualization AI Safety Relevance

Authors

Ethan Perez, Sam Ringer*, Kamile Lukošiute*, Karina Nguyen*, Edwin Chen, Scott Heiner, +55 more, Nicholas Schiefer, Jared Kaplan

Measuring Progress on Scalable Oversight for Large Language Models

Human participants who chat with an unreliable language model assistant substantially outperform both the model alone and their own unaided performance.

arXiv 2022

Authors

Samuel R. Bowman, Jeeyoon Hyun, Ethan Perez, Edwin Chen, Craig Pettit, Scott Heiner, +38 more, Ben Mann, Jared Kaplan

Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned

We describe our early efforts to red team language models in order to simultaneously discover, measure, and attempt to reduce their potentially harmful outputs.

arXiv 2022

Authors

Deep Ganguli, Liane Lovitt, Jackson Kernion, Amanda Askell, Yuntao Bai, Saurav Kadavath, +28 more, Jared Kaplan, Jack Clark

Inverse Scaling Prize

We’re announcing the ISP: a $100k grand prize + $150k in additional prizes for finding an important task where larger language models do *worse*.

Winners AI Safety Relevance Related Work

Authors

Ian McKenzie, Alexander Lyzhov, Alicia Parrish, Ameya Prabhu, Aaron Mueller, Najoung Kim, Sam Bowman, Ethan Perez

Few-shot Adaptation Works with UnpredicTable Data

Training on odd data (e.g. tables from support.google.com) improves few-shot learning with language models in the same way as diverse NLP data.

arXiv 2022

Code Cite Data

Authors

Jun Shern Chan, Michael Pieler, Jonathan Jao, Jérémy Scheurer, Ethan Perez

Single-Turn Debate Does Not Help Humans Answer Hard Reading-Comprehension Questions

We present a dataset of QA explanations with the goal of helping humans more reliably determine the correct answer when the ground truth can’t be directly determined.

ACL 2022 Workshop on Learning with Natural Language Supervision

Authors

Alicia Parrish*, Harsh Trivedi*, Ethan Perez*, Angelica Chen, Nikita Nangia, Jason Phang, Samuel R. Bowman

RL with KL penalties is better viewed as Bayesian inference

KL penalties in RL with language models aren’t a hack; KL penalties have a principled, Bayesian justification.

EMNLP 2022

Authors

Tomasz Korbak, Ethan Perez, Christopher L Buckley

Training Language Models with Language Feedback

We found a way to learn from language feedback (not ratings), enabling us to finetune GPT-3 to human-level summarization with just 100 feedback samples.

ACL 2022 Workshop on Learning with Natural Language Supervision

Talk

Authors

Jérémy Scheurer, Jon Ander Campos, Jun Shern Chan, Angelica Chen, Kyunghyun Cho, Ethan Perez

Finding and Fixing Undesirable Behaviors in Pretrained Language Models

Language models often generate undesirable text. We introduce methods for finding undesirable behaviors and training them away.

PhD Thesis

Talk

Authors

Ethan Perez

Red Teaming Language Models with Language Models

Language models (LMs) generate harmful text. We generate test cases (“red teaming”) using another LM, to catch harmful behaviors before they impact users.

EMNLP 2022

Cite

Authors

Ethan Perez, Saffron Huang, Francis Song, Trevor Cai, Roman Ring, John Aslanides, Amelia Glaese, Nat McAleese, Geoffrey Irving

True Few-Shot Learning with Language Models

Language models do much worse at few-shot learning when choosing prompts in a few-shot way instead of using large held-out sets (prior work).

NeurIPS 2021

Authors

Ethan Perez, Douwe Kiela, Kyunghyun Cho

Case-based Reasoning for Natural Language Queries over Knowledge Bases

Retrieval-augmented generation achieves SOTA on knowledge base question-answering.

EMNLP 2021

Authors

Rajarshi Das, Manzil Zaheer, Dung Thai, Ameya Godbole, Ethan Perez, Jay-Yoon Lee, Lizhen Tan, Lazaros Polymenakos, Andrew McCallum

Rissanen Data Analysis: Examining Dataset Characteristics with Description Length

We propose a theoretically-justified way to “probe datasets” for what capabilities they require of a model.

ICML 2021

Authors

Ethan Perez, Douwe Kiela, Kyunghyun Cho

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

We present a single, retrieval-based architecture that can learn a variety of knowledge-intensive tasks: extractive and generative alike.

NeurIPS 2020

Blog Post Cite Code Demo Talk

Authors

Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, +3 more, Sebastian Riedel, Douwe Kiela

Unsupervised Question Decomposition for Question Answering

We decompose a hard question into several, easier questions with unsupervised learning, improving multi-hop question answering without extra supervision.

EMNLP 2020

Blog Post Cite Code Poster Talk

Authors

Ethan Perez, Patrick Lewis, Wen-tau Yih, Kyunghyun Cho, Douwe Kiela

Retrospective for “FiLM: Visual Reasoning with a General Conditioning Layer”

An honest reflection on FiLM conditioning layers based on the work that followed, including when (not) to use FiLM layers and how to train…

NeurIPS 2019 Retrospectives Workshop

Cite Talk (starts at 18:28)

Authors

Ethan Perez

Finding Generalizable Evidence by Learning to Convince Q&A Models

We find text evidence for an answer to a question by finding text that convinces Q&A models to pick that answer.

EMNLP 2019

Blog Post Cite Code Press

Authors

Ethan Perez, Siddharth Karamcheti, Rob Fergus, Jason Weston, Douwe Kiela, Kyunghyun Cho

Supervised Multimodal Bitransformers for Classifying Images and Text

We introduce a simple yet effective baseline for multimodal BERT-like architectures that jointly finetunes unimodally pretrained text and image encoders.

arXiv 2019

Authors

Douwe Kiela, Suvrat Bhooshan, Hamed Firooz, Ethan Perez, Davide Testuggine

ELI5: Long Form Question Answering

We introduce a dataset for abstractive question-answering where answers are 100+ words long (many “how” and “why” questions).

ACL 2019

Blog Post Cite Code Website

Authors

Angela Fan, Yacine Jernite*, Ethan Perez*, David Grangier, Jason Weston, Michael Auli

Visual Reasoning with Multi-hop Feature Modulation

Decoding FiLM conditioning parameters in multiple hops helps for more advanced vision-and-language tasks such as visual dialogue.

ECCV 2018

Authors

Florian Strub, Mathieu Seurin, Ethan Perez, Harm de Vries, Jeremie Mary, Aaron Courville, Olivier Pietquin

Feature-wise transformations

A review of a simple and surprisingly effective class of neural conditioning mechanisms.

Distill 2018

Authors

Vincent Dumoulin, Ethan Perez, Nathan Schucher, Florian Strub, Harm de Vries, Aaron Courville, Yoshua Bengio

HoME: a Household Multimodal Environment

We introduce a simulated environment for agents to learn from vision, audio, semantics, physics, and object-interaction within a realistic, household context.

ICLR 2018 Workshop

Authors

Simon Brodeur, Ethan Perez*, Ankesh Anand*, Florian Golemo*, Luca Celotti, Florian Strub, Hugo Larochelle, Aaron Courville

FiLM: Visual Reasoning with a General Conditioning Layer

We introduce a general-purpose neural network layer to integrate multimodal input to answer reasoning questions about images.

AAAI 2018

Authors

Ethan Perez, Florian Strub, Harm de Vries, Vincent Dumoulin, Aaron Courville