# Modular Deep Reinforcement Learning with Temporal Logic Specifications

@article{Yuan2019ModularDR, title={Modular Deep Reinforcement Learning with Temporal Logic Specifications}, author={Li Yuan and Mohammadhosein Hasanbeig and Alessandro Abate and Daniel Kroening}, journal={ArXiv}, year={2019}, volume={abs/1909.11591} }

We propose an actor-critic, model-free, and online Reinforcement Learning (RL) framework for continuous-state continuous-action Markov Decision Processes (MDPs) when the reward is highly sparse but encompasses a high-level temporal structure. We represent this temporal structure by a finite-state machine and construct an on-the-fly synchronised product with the MDP and the finite machine. The temporal structure acts as a guide for the RL agent within the product, where a modular Deep…
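The synchronised product mentioned in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy 2-D `dynamics`, the `labeller` that maps states to the labels `"a"` and `"b"`, the three-state machine for "reach `a`, then `b`", and the reward of 1 on first acceptance. The point it shows is only that the product state `(s, q)` is computed during interaction, one step at a time, without enumerating the product up front.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Tuple

State = Tuple[float, float]   # hypothetical continuous MDP state (x, y)
Action = Tuple[float, float]  # hypothetical continuous action (dx, dy)


@dataclass
class FiniteStateMachine:
    initial: int
    accepting: FrozenSet[int]
    transitions: Dict[Tuple[int, str], int]

    def step(self, q: int, label: str) -> int:
        # Stay in the current machine state if no transition matches the label.
        return self.transitions.get((q, label), q)


class ProductEnv:
    """Pairs an MDP state with a machine state, updated on the fly."""

    def __init__(self,
                 dynamics: Callable[[State, Action], State],
                 labeller: Callable[[State], str],
                 fsm: FiniteStateMachine) -> None:
        self.dynamics = dynamics
        self.labeller = labeller
        self.fsm = fsm
        self.s: State = (0.0, 0.0)
        self.q: int = fsm.initial

    def reset(self, s0: State) -> Tuple[State, int]:
        self.s, self.q = s0, self.fsm.initial
        return self.s, self.q

    def step(self, action: Action) -> Tuple[Tuple[State, int], float, bool]:
        # Advance the MDP, then advance the machine on the new state's label.
        self.s = self.dynamics(self.s, action)
        q_next = self.fsm.step(self.q, self.labeller(self.s))
        # Assumed reward scheme: 1 when an accepting state is first entered.
        newly_accepted = (q_next in self.fsm.accepting
                          and self.q not in self.fsm.accepting)
        self.q = q_next
        return (self.s, self.q), float(newly_accepted), self.q in self.fsm.accepting


def dynamics(s: State, a: Action) -> State:
    # Toy deterministic dynamics: the action is added to the state.
    return (s[0] + a[0], s[1] + a[1])


def labeller(s: State) -> str:
    # Hypothetical regions: "b" where y >= 1, "a" where x >= 1 and y < 1.
    if s[1] >= 1.0:
        return "b"
    if s[0] >= 1.0:
        return "a"
    return ""


# Machine for "reach a, then reach b": 0 --a--> 1 --b--> 2 (accepting).
fsm = FiniteStateMachine(initial=0,
                         accepting=frozenset({2}),
                         transitions={(0, "a"): 1, (1, "b"): 2})
```

In this sketch an agent observes `(s, q)` instead of `s` alone, so the machine state tells it which part of the task remains even though the reward stays zero until the accepting state is reached.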

#### 19 Citations

Safe-Critical Modular Deep Reinforcement Learning with Temporal Logic through Gaussian Processes and Control Barrier Functions

- Computer Science
- ArXiv
- 2021

A learning-based control framework is proposed, consisting of an innovative reward scheme for RL agents, with the formal guarantee that globally optimal policies maximize the probability of satisfying the LTL specifications, and an ECBF-based modular deep RL algorithm that achieves near-perfect success rates and safety guarding with high probability confidence during training.

Deep Reinforcement Learning with Temporal Logics

- Computer Science
- FORMATS
- 2020

This work proposes a deep Reinforcement Learning method for policy synthesis in continuous-state/action unknown environments, under requirements expressed in Linear Temporal Logic (LTL), and shows that this combination lifts the applicability of deep RL to complex temporal and memory-dependent policy synthesis goals.

Lifelong Reinforcement Learning with Temporal Logic Formulas and Reward Machines

- Computer Science
- ArXiv
- 2021

Experimental results show that LSRM outperforms the methods that learn the target tasks from scratch by taking advantage of the task decomposition using SLTL and knowledge transfer over RM during the lifelong learning process.

Formal Controller Synthesis for Continuous-Space MDPs via Model-Free Reinforcement Learning

- Engineering, Computer Science
- 2020 ACM/IEEE 11th International Conference on Cyber-Physical Systems (ICCPS)
- 2020

A key contribution of the paper is to leverage the classical convergence results for reinforcement learning on finite MDPs and provide control strategies maximizing the probability of satisfaction over unknown, continuous-space MDPs while providing probabilistic closeness guarantees.

A Framework for Transforming Specifications in Reinforcement Learning

- Computer Science
- ArXiv
- 2021

A formal framework for defining transformations among RL tasks with different forms of objectives is developed and the notion of sampling-based reduction is defined to relate two MDPs whose transition probabilities can be learnt by sampling, followed by formalization of preservation of optimal policies, convergence, and robustness.

Inverse Reinforcement Learning of Autonomous Behaviors Encoded as Weighted Finite Automata

- Computer Science
- ArXiv
- 2021

This paper uses a spectral learning approach to extract a weighted finite automaton, approximating the unknown logic structure of the task, and demonstrates that the method is capable of generalizing the execution of the inferred task specification to new environment configurations.

The Logical Options Framework

- Computer Science
- ICML
- 2021

This work introduces a hierarchical reinforcement learning framework called the Logical Options Framework (LOF) that learns policies that are satisfying, optimal, and composable, and that can be composed to satisfy unseen tasks with only 10-50 retraining steps on benchmarks.

Towards Verifiable and Safe Model-Free Reinforcement Learning

- Computer Science
- OVERLAY@AI*IA
- 2019

This line of work addresses issues by proposing a general framework that leverages the success of RL in learning high-performance controllers, while guaranteeing the satisfaction of given requirements and guiding the learning process within safe configurations.

Cautious Reinforcement Learning with Logical Constraints

- Computer Science, Engineering
- AAMAS
- 2020

This paper presents the concept of an adaptive safe padding that forces Reinforcement Learning (RL) to synthesise optimal control policies while ensuring safety during the learning process. Policies…

Safe Reinforcement Learning through Meta-learned Instincts

- Computer Science
- The 2020 Conference on Artificial Life
- 2020

The results suggest that meta-learning augmented with an instinctual network is a promising new approach for safe AI, which may enable progress in this area on a variety of different domains.

#### References

Continuous control with deep reinforcement learning

- Computer Science, Mathematics
- ICLR
- 2016

This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.

Teaching Multiple Tasks to an RL Agent using LTL

- Computer Science
- AAMAS
- 2018

This paper uses Linear Temporal Logic as a language for specifying multiple tasks in a manner that supports the composition of learned skills and proposes a novel algorithm that exploits LTL progression and off-policy RL to speed up learning without compromising convergence guarantees.

Hierarchical Relative Entropy Policy Search

- Computer Science
- AISTATS
- 2012

This work defines the problem of learning sub-policies in continuous state-action spaces as finding a hierarchical policy that is composed of a high-level gating policy to select the low-level sub-policies for execution by the agent, and treats them as latent variables, which allows for distribution of the update information between the sub-policies.

Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation

- Mathematics, Computer Science
- NIPS
- 2016

h-DQN is presented, a framework to integrate hierarchical value functions, operating at different temporal scales, with intrinsically motivated deep reinforcement learning, and allows for flexible goal specifications, such as functions over entities and relations.

Modular Multitask Reinforcement Learning with Policy Sketches

- Computer Science
- ICML
- 2017

Experiments show that using the approach to learn policies guided by sketches gives better performance than existing techniques for learning task-specific or shared policies, while naturally inducing a library of interpretable primitive behaviors that can be recombined to rapidly adapt to new tasks.

Improving Stability in Deep Reinforcement Learning with Weight Averaging

- 2018

Deep reinforcement learning (RL) methods are notoriously unstable during training. In this paper, we focus on model-free RL algorithms where we observe that the average reward is unstable throughout…

Strategic Attentive Writer for Learning Macro-Actions

- Computer Science
- NIPS
- 2016

A novel deep recurrent neural network architecture that learns to build implicit plans in an end-to-end manner by purely interacting with an environment in a reinforcement learning setting, which is at the same time a general algorithm that can be applied to any sequence data.

Logically-Constrained Neural Fitted Q-Iteration

- Computer Science, Mathematics
- AAMAS
- 2019

We propose a method for efficient training of Q-functions for continuous-state Markov Decision Processes (MDPs) such that the traces of the resulting policies satisfy a given Linear Temporal Logic…

Temporal abstraction in reinforcement learning

- Computer Science
- ICML 2000
- 2000

A general framework is developed for prediction, control and learning at multiple temporal scales, along with the way in which multi-time models can be used to produce plans of behavior very quickly, using classical dynamic programming or reinforcement learning techniques.

Playing Atari with Deep Reinforcement Learning

- Computer Science
- ArXiv
- 2013

This work presents the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning, which outperforms all previous approaches on six of the games and surpasses a human expert on three of them.