On Markov decision processes with the stochastic differential Bellman Equation

Date

2025

Abstract

Stochastic differential equations play an important role in capturing the dynamics of complex systems in which uncertainty appears as noise. Such noise is abundant, and although its exact behaviour is unknown, it can be simulated with stochastic processes. Stochastic calculus, with tools such as the Itô formula, provides the means for navigating these systems. This work explores the adaptation of the Bellman equation, a cornerstone of dynamic programming, to stochastic differential equations, which facilitates the modeling of decision problems subject to noise. Value iteration and Q-learning, two well-known solution methods in machine learning, are extended to stochastic algorithms that approximate the solution of Markov decision processes whose uncertainties are modeled by the stochastic differential Bellman equation. These stochastic algorithms enable a realistic and efficient approach to modeling and solving decision problems in stochastic environments. Stochastic value iteration is applied when the environment is fully known, while stochastic Q-learning remains applicable when the transition probabilities are unknown. Theoretical analyses and case studies demonstrate the efficacy and applicability of these algorithms. In addition, stochastic Q-learning achieves higher rewards than the deterministic algorithm, which indicates that it optimizes decision processes in stochastic environments more effectively by exploring more states. Finally, the stochastic differential Bellman equation is formulated as a system of ordinary equations, providing an alternative solution approach. For this purpose, the concept of a random dynamical system, of which a stochastic differential equation is an example, is explored.
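For orientation, the classical value iteration that the abstract names as a starting point can be sketched as follows. This is the standard deterministic algorithm for a finite Markov decision process with known transition probabilities, not the stochastic extension developed in the work. The function name `value_iteration`, the array layout of `P` and `R`, and the two-state toy problem are assumptions made purely for illustration.

```python
import numpy as np


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Classical value iteration for a finite MDP (illustrative sketch).

    P[a, s, t]: probability of moving from state s to state t under action a.
    R[s, a]:    expected immediate reward for taking action a in state s.
    Returns the approximate optimal values and a greedy policy.
    """
    V = np.zeros(R.shape[0])
    for _ in range(max_iter):
        # Bellman backup: Q[s, a] = R[s, a] + gamma * sum_t P[a, s, t] * V[t]
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    return V, Q.argmax(axis=1)


# Hypothetical toy problem: two states, two actions.
# Action 0 stays in the current state, action 1 switches to the other state.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
])
# Only staying in state 1 pays a reward.
R = np.array([
    [0.0, 0.0],  # state 0
    [1.0, 0.0],  # state 1
])

V, policy = value_iteration(P, R)
print("values:", V)
print("greedy policy:", policy)
```

In this toy problem the greedy policy switches out of state 0 and then stays in state 1. Q-learning targets the same fixed point but estimates the backup from sampled transitions, which is why it remains usable when `P` is unknown.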

Keywords

Markov decision process, stochastic process, machine learning

Institute/Clinic

Institut für Informationssysteme
