Free Energy and the Generalized Optimality Equations for Sequential Decision Making

Ortega, PA; Braun, DA

Released

Conference Paper

MPG Authors

Ortega, PA
Research Group Sensorimotor Learning and Decision-Making, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;

Braun, DA
Research Group Sensorimotor Learning and Decision-Making, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;

External Resources

https://ewrl.wordpress.com/past-ewrl/ewrl10-2012/
(Publisher version)

https://ar5iv.labs.arxiv.org/html/1205.3997
(Any full text)

Citation

Ortega, P., & Braun, D. (2012). Free Energy and the Generalized Optimality Equations for Sequential Decision Making. In 10th European Workshop on Reinforcement Learning (EWRL 2012).

Citation link: https://hdl.handle.net/11858/00-001M-0000-0013-B6C2-2

Abstract

The free energy functional has recently been proposed as a variational principle for bounded rational decision-making, since it instantiates a natural trade-off between utility gains and information processing costs that can be axiomatically derived. Here we apply the free energy principle to general decision trees that include both adversarial and stochastic environments. We derive generalized sequential optimality equations that not only include the Bellman optimality equations as a limit case, but also lead to well-known decision rules such as Expectimax, Minimax and Expectiminimax. We show how these decision rules can be derived from a single free energy principle that assigns a resource parameter to each node in the decision tree. These resource parameters express a concrete computational cost that can be measured as the number of samples that are needed from the distribution that belongs to each node. The free energy principle therefore provides the normative basis for generalized optimality equations that account for both adversarial and stochastic environments.
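
The idea of a per-node resource parameter can be illustrated with a small sketch. The abstract does not state the equations, so the following assumes the commonly used free energy certainty equivalent (1/beta) * log sum_i p_i * exp(beta * v_i), applied recursively to a tree. Under that assumption, beta -> +infinity behaves like a max node, beta = 0 like an expectation node, and beta -> -infinity like a min node. The names `free_energy` and `tree_value`, the tuple encoding of the tree, and the uniform prior at decision nodes are hypothetical choices made for illustration, not taken from the paper.

```python
import math

def free_energy(beta, probs, values):
    """Assumed certainty equivalent: (1/beta) * log sum_i p_i * exp(beta * v_i)."""
    if beta == 0.0:
        # Limit beta -> 0: plain expectation (chance node).
        return sum(p * v for p, v in zip(probs, values))
    if math.isinf(beta):
        # Limits beta -> +inf / -inf: max / min over the support.
        support = [v for p, v in zip(probs, values) if p > 0]
        return max(support) if beta > 0 else min(support)
    # Finite beta: log-sum-exp, shifted by the largest term for stability.
    terms = [math.log(p) + beta * v for p, v in zip(probs, values) if p > 0]
    m = max(terms)
    return (m + math.log(sum(math.exp(t - m) for t in terms))) / beta

def tree_value(node):
    """Evaluate a tree whose leaves are numbers and whose internal nodes are
    (beta, [(prior_probability, child), ...]) tuples (hypothetical encoding)."""
    if isinstance(node, (int, float)):
        return float(node)
    beta, branches = node
    probs = [p for p, _ in branches]
    values = [tree_value(child) for _, child in branches]
    return free_energy(beta, probs, values)

def two_level_tree(root_beta, child_beta):
    """Two actions at the root, two equally likely outcomes under each action."""
    action_a = (child_beta, [(0.5, 0.0), (0.5, 10.0)])
    action_b = (child_beta, [(0.5, 4.0), (0.5, 4.0)])
    return (root_beta, [(0.5, action_a), (0.5, action_b)])

# Expectimax-like setting: maximizing root, expectation at the children.
expectimax_value = tree_value(two_level_tree(math.inf, 0.0))
# Minimax-like setting: maximizing root, minimizing children.
minimax_value = tree_value(two_level_tree(math.inf, -math.inf))
print(expectimax_value, minimax_value)
```

With finite values of beta the same recursion interpolates between these rules, which is one way to read the claim that a single principle covers adversarial and stochastic nodes.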