Information for Miljan Martic

Basic information

Item	Value
Donations List Website (data still preliminary)
Agendas	Recursive reward modeling

Organization	Title	Start date	End date	Employment type	Source	Notes
Google DeepMind	Research Engineer	2017-03-01	2020-10-31		[1], [2], [3]
Google DeepMind	Senior Research Engineer	2020-10-01	2021-09-01		[1], [2], [4]

Name	Creation date	Description

Title	Publication date	Author	Publisher	Affected organizations	Affected people	Document scope	Cause area	Notes

Title	Publication date	Author	Publisher	Affected organizations	Affected people	Affected agendas	Notes
Scalable agent alignment via reward modeling: a research direction	2018-11-19	Jan Leike, David Krueger, Tom Everitt, Miljan Martic, Vishal Maini, Shane Legg	arXiv	Google DeepMind		Recursive reward modeling, Imitation learning, inverse reinforcement learning, Cooperative inverse reinforcement learning, myopic reinforcement learning, iterated amplification, debate	This paper introduces the (recursive) reward modeling agenda, discussing its basic outline, challenges, and ways to overcome those challenges. The paper also discusses alternative agendas and their relation to reward modeling.

Showing at most 20 people who are most similar in terms of which organizations they have worked at.

Person	Number of organizations in common	List of organizations in common
Chris Maddison	1	Google DeepMind
Laurent Orseau	1	Google DeepMind
Tom Everitt	1	Google DeepMind
Pedro A. Ortega	1	Google DeepMind
Vishal Maini	1	Google DeepMind
Thore Graepel	1	Google DeepMind
Nick Bostrom	1	Google DeepMind
Stanislav Fort	1	Google DeepMind
George McGowan	1	Google DeepMind
Sebastian Farquhar	1	Google DeepMind
Victoria Krakovna	1	Google DeepMind
Aditya Srikanth Veerubhotla	1	Google DeepMind
Abhishek Rao	1	Google DeepMind
Andrew Lefrancq	1	Google DeepMind
Azade Nova	1	Google DeepMind
Been Kim	1	Google DeepMind
Blanca Huergo	1	Google DeepMind
David Stutz	1	Google DeepMind
Demis Hassabis	1	Google DeepMind
Dmitry Nikulin	1	Google DeepMind