Episodic Memory

Graph-based memory used by the self-supervised robot pythia

1. Idea

The memory should be able to replace the classical "episodic buffer" commonly used in reinforcement learning settings.

A graph-based memory allows storing sequences of (action, observation) tuples in such a way that planning algorithms can run on top of it (Search on the Replay Buffer). It can also be used as a goal-space memory (Learning Latent Plans from Play).
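
To make the planning idea concrete, here is a minimal sketch (not this repository's API) of how storing (action, observation) transitions in a multi-directed graph turns planning into a graph-search problem; the observation ids and actions are made up for illustration:

import networkx as nx

# Hypothetical transition graph: nodes are observation ids, edges carry the action taken.
g = nx.MultiDiGraph()
g.add_edge("o0", "o1", action="up")
g.add_edge("o1", "o2", action="right")
g.add_edge("o1", "o3", action="down")

# Planning from a start observation to a goal observation is a shortest-path query.
path = nx.shortest_path(g, source="o0", target="o2")
plan = [list(g.get_edge_data(a, b).values())[0]["action"] for a, b in zip(path, path[1:])]
print(plan)  # ['up', 'right']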

Another problem that a graph-based memory can solve is the storage limit. The classical way of handling it is to remove the oldest tuples from the memory. This is a hard limitation, because such a system becomes subject to catastrophic forgetting. A graph-based memory can instead emulate a "natural decay" which reinforces useful and important memories and progressively discards the other ones.
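
As a rough illustration of the decay idea (a sketch of the principle, not this repository's actual rule), each memory can carry a stability score that grows when the memory is revisited and shrinks otherwise; memories whose score falls below a threshold are forgotten:

# Hypothetical decay rule: stability grows when a memory is reused, shrinks otherwise.
stability = {"s0": 5.0, "s1": 1.2, "s2": 0.4}

def decay_step(revisited, decay=0.9, boost=1.0, threshold=0.5):
    for state in list(stability):
        stability[state] = stability[state] * decay + (boost if state in revisited else 0.0)
        if stability[state] < threshold:
            del stability[state]  # forgotten, freeing space for new memories

decay_step(revisited={"s0"})
print(stability)  # "s2" falls below the threshold and is discarded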

2. Features

  • Store high-dimensional vectors
  • Keep sequences of actions and observations as a multi-directed graph
  • Perform fast approximate nearest-neighbors search to find relevant memories
  • Implement a natural memory decay
  • Sample sequences at random
  • Run planning algorithms on top of the graph

3. How it works

The episodic memory is based on two sub-memories:

a) An index memory

An index of (high-dimensional) memory states that can retrieve the top-k nearest neighbors very fast. This is useful:

  • if we want to know whether we have already experienced a particular state
  • if we want to retrieve the states most similar to the current one (external or imagined)

b) A graph memory

A multi-directed graph that stores sequences of (action, state). It uses a "natural decay" to forget the least useful memories and thus free up space.
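
As a rough sketch of what the index memory (a) provides, a brute-force NumPy search can answer both questions above; the real index uses approximate nearest-neighbor search to stay fast, and the threshold below is purely illustrative:

import numpy as np

vector_dim = 200
stored = np.random.random((500, vector_dim))   # states already in the index
query = np.random.random((vector_dim,))        # current (external or imagined) state

# Top-k most similar stored states by Euclidean distance.
distances = np.linalg.norm(stored - query, axis=1)
top_k = np.argsort(distances)[:5]

# "Have we already experienced this state?" becomes a threshold test on the best match.
already_seen = distances[top_k[0]] < 3.0  # illustrative threshold
print(top_k, already_seen)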

4. How to use

# Install requirements
pip install -r requirements.txt
import numpy as np
import random

from memory import EpisodicMemory

max_size = 10000
sim_threshold = 31
vector_dim = 200
stability_start = 1000
actions = ["up", "down", "left", "right"]

memory = EpisodicMemory(base_path='model_files',
                        max_size=max_size,
                        index_sim_threshold=sim_threshold,
                        vector_dim=vector_dim,
                        stability_start=stability_start)

# simulate some actions / perceptions
state_m1 = np.random.random((vector_dim,))
action_m1 = random.choice(actions)
for it in range(30):
    state = np.random.random((vector_dim,))
    # store the transition (previous state, action taken, resulting state)
    memory.update(state_m1, action_m1, state)
    state_m1 = state
    action_m1 = random.choice(actions)

print(f"states : {memory.n_states}\ttransitions : {memory.n_transitions}\tforgeted states : {memory.forgeted}")

# sample some trajectories
trajectories = memory.tree_memory.sample_trajectories(n=15, horizon=6)
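
sample_trajectories draws n random walks of length horizon from the graph memory. Assuming each trajectory is an ordered sequence of (state, action) steps (an assumption for illustration, not documented behaviour), they could be consumed like this:

# Hypothetical: assumes each trajectory is an ordered sequence of (state, action) steps.
for trajectory in trajectories:
    visited_states = [step[0] for step in trajectory]
    taken_actions = [step[1] for step in trajectory]
    # e.g. feed them to a world model or a goal-conditioned policy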