Reward Function Integrity in Artificially Intelligent Systems

Video and abstract of Roman's presentation at Oxford University. The talk analyzes historical examples of wireheading in man and machine and evaluates a number of approaches proposed for dealing with reward-function corruption. While simplistic optimizers driven to maximize a proxy measure for a particular goal will always be subject to corruption, sufficiently rational self-improving machines are bel...