Information for Vishal

Basic information

Item | Value
Agendas | Recursive reward modeling

List of positions (1 position)

Organization | Title | Start date | End date | Employment type | Source | Notes
Suvita | Associate Field Officer | 2023-06-21 | | | [1], [2] | No last name given

Products (0 products)

Name | Creation date | Description

Organization documents (0 documents)

Title | Publication date | Author | Publisher | Affected organizations | Affected people | Document scope | Cause area | Notes

Documents (1 document)

Title | Publication date | Author | Publisher | Affected organizations | Affected people | Affected agendas | Notes
Scalable agent alignment via reward modeling: a research direction | 2018-11-19 | Jan Leike, David Krueger, Tom Everitt, Miljan Martic, Vishal Maini, Shane Legg | arXiv | Google DeepMind | | Recursive reward modeling, Imitation learning, inverse reinforcement learning, Cooperative inverse reinforcement learning, myopic reinforcement learning, iterated amplification, debate | This paper introduces the (recursive) reward modeling agenda, discussing its basic outline, challenges, and ways to overcome those challenges. The paper also discusses alternative agendas and their relation to reward modeling.

Similar people

Showing at most 20 people who are most similar in terms of which organizations they have worked at.

Person | Number of organizations in common | List of organizations in common
Joey Savoie | 1 | Suvita
Alex Catalán Flores | 1 | Suvita
Allen Francis | 1 | Suvita
Amritesh Nishad | 1 | Suvita
Anupam Kumari | 1 | Suvita
Asif Akhtar Khan | 1 | Suvita
Deepak Bansal | 1 | Suvita
Deepika Kumari | 1 | Suvita
Fiona Conlon | 1 | Suvita
Juhi Kumari | 1 | Suvita
Jyoti Rajput | 1 | Suvita
Kahkasha Khan | 1 | Suvita
Katriel Friedman | 1 | Suvita
Krutika Ravishankar | 1 | Suvita
Kundan Prasad | 1 | Suvita
Manoj Kumar Chaudhary | 1 | Suvita
Mario Pinto | 1 | Suvita
Mohammad Munaf | 1 | Suvita
Mukesh Raj | 1 | Suvita
Patrick Stadler | 1 | Suvita