The class C_Memory is the C++ backend for the memory buffer used in algorithms that store transitions in a buffer. This class contains optimized routines to support the Python front-end class rlpack._C.memory.Memory.
Data Structures | |
struct | C_MemoryData |
The class C_MemoryData keeps references to the data associated with C_Memory. It implements the functions necessary to retrieve that data by de-referencing it. | |
Public Member Functions | |
C_Memory () | |
C_Memory (int64_t bufferSize, const std::string &device, const int32_t &prioritizationStrategyCode, const int32_t &batchSize) | |
void | clear () |
void | delete_item (int64_t index) |
std::map< std::string, torch::Tensor > | get_item (int64_t index) |
void | initialize (C_MemoryData &viewC_Memory) |
void | insert (torch::Tensor &stateCurrent, torch::Tensor &stateNext, torch::Tensor &reward, torch::Tensor &action, torch::Tensor &done, torch::Tensor &priority, torch::Tensor &probability, torch::Tensor &weight, bool isTerminalState) |
int64_t | num_terminal_states () |
std::map< std::string, torch::Tensor > | sample (float_t forceTerminalStateProbability, int64_t parallelismSizeThreshold, float_t alpha=0.0, float_t beta=0.0, int64_t numSegments=0) |
void | set_item (int64_t index, torch::Tensor &stateCurrent, torch::Tensor &stateNext, torch::Tensor &reward, torch::Tensor &action, torch::Tensor &done, torch::Tensor &priority, torch::Tensor &probability, torch::Tensor &weight, bool isTerminalState) |
size_t | size () |
int64_t | tree_height () |
void | update_priorities (torch::Tensor &randomIndices, torch::Tensor &newPriorities) |
C_MemoryData | view () const |
~C_Memory () | |
Data Fields | |
std::shared_ptr< C_MemoryData > | cMemoryData |
Shared Pointer to C_Memory::C_MemoryData. | |
Static Private Member Functions | |
static torch::Tensor | compute_important_sampling_weights (torch::Tensor &probabilities, int64_t currentSize, float_t beta) |
static torch::Tensor | compute_probabilities (torch::Tensor &priorities, float_t alpha) |
Private Attributes | |
std::deque< torch::Tensor > | actions_ |
Deque of torch tensors for actions. | |
int32_t | batchSize_ = 32 |
The batch size set during class initialisation. This many samples are selected during sampling. | |
int64_t | bufferSize_ = 32768 |
Buffer size passed during the class initialisation. Defaults to 32768. | |
torch::Device | device_ = torch::kCPU |
Torch device passed during class initialisation. Defaults to CPU. | |
std::map< std::string, torch::DeviceType > | deviceMap_ |
The map between std::string and torch::DeviceType; it maps the device name given as a string to its DeviceType. | |
std::deque< torch::Tensor > | dones_ |
Deque of torch tensors for dones. | |
std::vector< int64_t > | loadedIndices_ |
Vector of loaded indices. This indicates the indices that have been loaded out of the total capacity of the memory. | |
std::vector< int64_t > | loadedIndicesSlice_ |
The loaded indices slice; the slice of indices that is sampled during the sampling process. In each sampling cycle its size is equal to C_Memory::batchSize_. | |
Offload< float_t > * | offloadFloat_ |
Offload class initialised with float template. | |
Offload< int64_t > * | offloadInt64_ |
Offload class initialised with int64 template. | |
std::deque< torch::Tensor > | priorities_ |
Deque of torch tensors for priorities. | |
std::deque< float_t > | prioritiesFloat_ |
Deque of floats holding the priorities as C++ floats. Values are obtained from C_Memory::priorities_. | |
int32_t | prioritizationStrategyCode_ = 0 |
The prioritization strategy code that is being used. This determines the sampling technique that is employed. Refer to rlpack.dqn.dqn.Dqn.get_prioritization_code. | |
std::deque< torch::Tensor > | probabilities_ |
Deque of torch tensors for probabilities. | |
std::deque< torch::Tensor > | rewards_ |
Deque of torch tensors for rewards. | |
std::vector< torch::Tensor > | sampledActions_ |
The sampled action tensors from C_Memory::actions_. | |
std::vector< torch::Tensor > | sampledDones_ |
The sampled done tensors from C_Memory::dones_. | |
std::vector< torch::Tensor > | sampledIndices_ |
The sampled indices as tensors from C_Memory::loadedIndices_. | |
std::vector< torch::Tensor > | sampledPriorities_ |
The sampled priority tensors from C_Memory::priorities_. | |
std::vector< torch::Tensor > | sampledRewards_ |
The sampled reward tensors from C_Memory::rewards_. | |
std::vector< torch::Tensor > | sampledStateCurrent_ |
The sampled current state tensors from C_Memory::statesCurrent_. | |
std::vector< torch::Tensor > | sampledStateNext_ |
The sampled next state tensors from C_Memory::statesNext_. | |
std::vector< float_t > | seedValues_ |
The seed values generated during each sampling cycle for proportional prioritization. | |
std::vector< int64_t > | segmentQuantileIndices_ |
The quantile segment indices sampled when rank-based prioritization is used. | |
std::deque< torch::Tensor > | statesCurrent_ |
Deque of torch tensors for current states. | |
std::deque< torch::Tensor > | statesNext_ |
Deque of torch tensors for next states. | |
int64_t | stepCounter_ = 0 |
The counter variable that tracks the loaded indices in sync with the total timesteps. Once the memory reaches the buffer size, this will not update. | |
std::shared_ptr< SumTree > | sumTreeSharedPtr_ |
Shared Pointer to SumTree class object. | |
std::deque< int64_t > | terminalStateIndices_ |
Deque of integers indicating the indices of terminal states. | |
std::deque< torch::Tensor > | weights_ |
Deque of torch tensors for weights. | |
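The class C_Memory is the C++ backend for the memory buffer used in algorithms that store transitions in a buffer. This class contains optimized routines to support the Python front-end class rlpack._C.memory.Memory.
A memory index refers to an index that yields a transition from C_Memory. This works by indexing the following variables and grouping them together: C_Memory::statesCurrent_, C_Memory::statesNext_, C_Memory::rewards_, C_Memory::actions_, C_Memory::dones_, C_Memory::priorities_, C_Memory::probabilities_ and C_Memory::weights_.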
C_Memory::C_Memory ()
The default non-parameterised constructor. This constructor allocates memory as per the default-initialised variables. It initialises rlpack._C.memory.Memory.c_memory and is equivalent to rlpack._C.memory.Memory.__init__.
C_Memory::C_Memory (int64_t bufferSize, const std::string &device, const int32_t &prioritizationStrategyCode, const int32_t &batchSize) [explicit]
The class constructor for C_Memory. This constructor initialises the C_Memory class and allocates the required memory as per the input arguments. It initialises rlpack._C.memory.Memory.c_memory and is equivalent to rlpack._C.memory.Memory.__init__.
bufferSize | : The buffer size to be used and allocated for the memory. |
device | : The device to transfer relevant tensors to. |
prioritizationStrategyCode | : The prioritization strategy code. Refer to rlpack.dqn.dqn.Dqn.get_prioritization_code. |
batchSize | : The batch size to be used for sampling. |
C_Memory::~C_Memory ()
The destructor for C_Memory.
void C_Memory::clear ()
Clears the data in C_Memory. This will NOT free the memory since it doesn't perform any memory de-allocation. This is the C++ backend of the rlpack._C.memory.Memory.clear method.
static torch::Tensor C_Memory::compute_important_sampling_weights (torch::Tensor &probabilities, int64_t currentSize, float_t beta) [static, private]
Method to compute the importance sampling (IS) weights for the given probabilities.
probabilities | : The input probabilities for which IS weights are to be computed. |
currentSize | : The current size of the C_Memory (see C_Memory::size). |
beta | : The beta value for prioritization. Refer to C_Memory::sample for more information. |
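The role of probabilities, currentSize and beta can be illustrated with the importance sampling weight commonly used in prioritized experience replay, w_i = (N * P(i))^(-beta), scaled by the largest weight. The sketch below is only an assumption about the general technique: it uses plain std::vector<double> instead of torch::Tensor, the function name is hypothetical, and the max-normalization step may not match what C_Memory actually does.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of C_Memory): computes
// w_i = (currentSize * P(i))^(-beta) and divides by the largest weight.
// Assumes every probability is strictly positive.
std::vector<double> computeImportanceSamplingWeightsSketch(
        const std::vector<double>& probabilities,
        std::int64_t currentSize,
        double beta) {
    std::vector<double> weights;
    weights.reserve(probabilities.size());
    double maxWeight = 0.0;
    for (double probability : probabilities) {
        const double weight =
            std::pow(static_cast<double>(currentSize) * probability, -beta);
        weights.push_back(weight);
        maxWeight = std::max(maxWeight, weight);
    }
    // Normalize so the largest weight is 1 (an assumed convention).
    if (maxWeight > 0.0) {
        for (double& weight : weights) {
            weight /= maxWeight;
        }
    }
    return weights;
}
```

With beta = 0 every weight becomes 1, so no bias correction is applied; larger beta values down-weight transitions that are sampled more often.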
static torch::Tensor C_Memory::compute_probabilities (torch::Tensor &priorities, float_t alpha) [static, private]
Method to compute probabilities when not using the uniform prioritization strategy.
priorities | : The sampled priorities for which probabilities are to be computed. |
alpha | : The alpha value for prioritization. Refer to C_Memory::sample for more information. |
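As a rough illustration of how priorities and alpha could interact, the sketch below applies the usual prioritized replay formula P(i) = p_i^alpha / sum_k p_k^alpha. It is an assumption about the general approach rather than the library's code: it works on std::vector<double> instead of torch::Tensor, and the function name is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper (not part of C_Memory): turns raw priorities into
// sampling probabilities using P(i) = p_i^alpha / sum_k p_k^alpha.
std::vector<double> computeProbabilitiesSketch(
        const std::vector<double>& priorities, double alpha) {
    std::vector<double> probabilities;
    probabilities.reserve(priorities.size());
    double total = 0.0;
    for (double priority : priorities) {
        const double scaled = std::pow(priority, alpha);
        probabilities.push_back(scaled);
        total += scaled;
    }
    // Leave the values unnormalized if everything is zero, to avoid
    // dividing by zero.
    if (total > 0.0) {
        for (double& probability : probabilities) {
            probability /= total;
        }
    }
    return probabilities;
}
```

With alpha = 0 all transitions receive the same probability, which recovers uniform sampling; alpha = 1 samples in direct proportion to the priorities.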
void C_Memory::delete_item (int64_t index)
Deletion method for C_Memory. This is the C++ backend of rlpack._C.memory.Memory.__delitem__, so it can be accessed from the Python side with a simple indexing operation (operator []; del memory[index]).
Deletion is fast if index refers to the first or last element; otherwise it takes O(n) time to move the items after index.
index | : The index of the transition we want to remove. |
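The cost note above follows from how std::deque::erase behaves: erasing at either end is cheap, while erasing in the middle moves the remaining elements. The sketch below shows the idea on two parallel deques of plain values; the struct and its member names are hypothetical and only stand in for the tensor deques of C_Memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <stdexcept>

// Hypothetical stand-in for the parallel deques held by a replay buffer.
struct ParallelDequesSketch {
    std::deque<double> rewards;
    std::deque<int> actions;

    // Removes the transition at `index` from every deque so that the
    // remaining entries stay aligned with each other.
    void deleteItem(std::int64_t index) {
        if (index < 0 || static_cast<std::size_t>(index) >= rewards.size()) {
            throw std::out_of_range("index out of range");
        }
        // Erasing the first or last element is cheap; erasing in the middle
        // moves the elements on one side of the erased position.
        rewards.erase(rewards.begin() + index);
        actions.erase(actions.begin() + index);
    }
};
```

Every deque must be erased at the same position, otherwise a later index would combine fields from different transitions.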
std::map< std::string, torch::Tensor > C_Memory::get_item (int64_t index)
Getter method for C_Memory. This is the C++ backend of the rlpack._C.memory.Memory.__getitem__ method, so it can be accessed from the Python side with a simple indexing operation (operator []; item = memory[index]).
index | : The index from which we want to obtain the transition. |
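A getter of this kind can be pictured as reading the same position from each per-field container and packing the results into a map keyed by field name. The following sketch assumes plain double values instead of torch::Tensor, and the struct name and map keys ("reward", "action", "done") are illustrative guesses, not the keys used by C_Memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a buffer that stores each field separately.
struct TransitionStoreSketch {
    std::deque<double> rewards;
    std::deque<double> actions;
    std::deque<double> dones;

    // Groups the values stored at `index` into one map keyed by field name.
    std::map<std::string, double> getItem(std::int64_t index) const {
        if (index < 0 || static_cast<std::size_t>(index) >= rewards.size()) {
            throw std::out_of_range("index out of range");
        }
        const std::size_t i = static_cast<std::size_t>(index);
        return {{"reward", rewards[i]}, {"action", actions[i]}, {"done", dones[i]}};
    }
};
```

Returning a map keeps the call site independent of the internal storage order of the fields.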
void C_Memory::initialize (C_Memory::C_MemoryData &viewC_MemoryData)
Initialize method for C_Memory for initializing all the data from an object of C_Memory::C_MemoryData. This is the C++ backend of the rlpack._C.memory.Memory.initialize method.
viewC_MemoryData | : An object of C_Memory::C_MemoryData. |
void C_Memory::insert (torch::Tensor &stateCurrent, torch::Tensor &stateNext, torch::Tensor &reward, torch::Tensor &action, torch::Tensor &done, torch::Tensor &priority, torch::Tensor &probability, torch::Tensor &weight, bool isTerminalState)
Insertion method for C_Memory. This is the C++ backend of the rlpack._C.memory.Memory.insert method.
stateCurrent | : Current state from transition. |
stateNext | : Next state from transition. |
reward | : Reward obtained during transition. |
action | : Action taken during transition. |
done | : Flag indicating whether the next state is terminal, packaged in a PyTorch tensor. |
priority | : Priority value associated with the transition. |
probability | : Probability value associated with the transition. |
weight | : Weight value associated with the transition. |
isTerminalState | : Flag indicating whether the next state is terminal. |
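To make the interplay between the buffer size and the step counter more concrete, here is a minimal sketch of a bounded insert. It assumes that the oldest transition is dropped once the buffer is full, which this page does not state explicitly; the struct, its members and the use of plain values instead of torch::Tensor are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical bounded buffer; FIFO eviction is an assumption.
struct BoundedBufferSketch {
    std::size_t bufferSize;
    std::deque<double> rewards;
    std::deque<bool> dones;
    std::int64_t stepCounter = 0;

    explicit BoundedBufferSketch(std::size_t capacity) : bufferSize(capacity) {}

    void insert(double reward, bool done) {
        if (bufferSize == 0) {
            return;  // Nothing can be stored in a zero-capacity buffer.
        }
        if (rewards.size() == bufferSize) {
            // Buffer is full: drop the oldest transition and keep the
            // counter unchanged.
            rewards.pop_front();
            dones.pop_front();
        } else {
            ++stepCounter;
        }
        rewards.push_back(reward);
        dones.push_back(done);
    }
};
```

In this sketch the counter stops growing once the capacity is reached, mirroring the description of C_Memory::stepCounter_.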
int64_t C_Memory::num_terminal_states ()
Method to obtain the number of terminal states currently in C_Memory. This is the C++ backend of rlpack._C.memory.Memory.num_terminal_states method.
std::map< std::string, torch::Tensor > C_Memory::sample (float_t forceTerminalStateProbability, int64_t parallelismSizeThreshold, float_t alpha=0.0, float_t beta=0.0, int64_t numSegments=0)
The sampling method for C_Memory. This is the C++ backend of rlpack._C.memory.Memory.sample. Sampling is done as per the prioritization strategy specified during initialisation of C_Memory.
forceTerminalStateProbability | : The probability to force a terminal state in the final sample. |
parallelismSizeThreshold | : The threshold size of the buffer (from the C_Memory::size method) beyond which OpenMP parallelized routines are used for sampling. |
alpha | : The alpha value for prioritization. This is used to compute probabilities, where a higher alpha indicates more aggressive prioritization. |
beta | : The beta value for prioritization. This is used to compute importance sampling weights, where a higher beta indicates more aggressive bias correction. |
numSegments | : The number of segments to be used for rank-based prioritization (in accordance with Zipf's law). |
Returns a map of the sampled tensors, each with the batch as its leading dimension, i.e. of shape (batchSize, ...).
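For the proportional strategy, one common way to draw a batch is stratified sampling: the total priority mass is split into one equal segment per sample, and a pre-drawn seed in [0, 1) selects a point inside each segment. The sketch below illustrates that idea with a linear prefix-sum scan; it is an assumption about the general technique, not the routine used by C_Memory, and the function name and plain std::vector types are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: picks one index per seed, proportionally to the
// priorities, using one equal-mass segment per requested sample.
// Each seed is expected to lie in [0, 1).
std::vector<std::size_t> sampleProportionalSketch(
        const std::vector<double>& priorities,
        const std::vector<double>& seeds) {
    std::vector<std::size_t> indices;
    if (priorities.empty() || seeds.empty()) {
        return indices;
    }
    const double total =
        std::accumulate(priorities.begin(), priorities.end(), 0.0);
    const double segment = total / static_cast<double>(seeds.size());
    for (std::size_t b = 0; b < seeds.size(); ++b) {
        // Target mass inside the b-th segment.
        const double target = (static_cast<double>(b) + seeds[b]) * segment;
        double running = 0.0;
        std::size_t chosen = priorities.size() - 1;
        for (std::size_t i = 0; i < priorities.size(); ++i) {
            running += priorities[i];
            if (target < running) {
                chosen = i;
                break;
            }
        }
        indices.push_back(chosen);
    }
    return indices;
}
```

The linear scan costs O(n) per sample; a sum tree answers the same prefix-sum query in O(log n), which is the usual reason for keeping one alongside the buffer.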
void C_Memory::set_item (int64_t index, torch::Tensor &stateCurrent, torch::Tensor &stateNext, torch::Tensor &reward, torch::Tensor &action, torch::Tensor &done, torch::Tensor &priority, torch::Tensor &probability, torch::Tensor &weight, bool isTerminalState)
Setter method for C_Memory. This is the C++ backend of the rlpack._C.memory.Memory.__setitem__ method, so it can be accessed from the Python side with a simple indexing operation (operator []; memory[index] = transition). This method modifies the items at the given index.
index | : The index at which we want to set the transition. |
stateCurrent | : Current state from transition. |
stateNext | : Next state from transition. |
reward | : Reward obtained during transition. |
action | : Action taken during transition. |
done | : Flag indicating whether the next state is terminal, packaged in a PyTorch tensor. |
priority | : Priority value associated with the transition. |
probability | : Probability value associated with the transition. |
weight | : Weight value associated with the transition. |
isTerminalState | : Flag indicating whether the next state is terminal. |
size_t C_Memory::size ()
This method obtains the current size of C_Memory. This is the C++ backend of the rlpack._C.memory.Memory.__len__ method, so the length can be obtained with the built-in Python function len(memory).
int64_t C_Memory::tree_height ()
Method to obtain the tree height of the sum tree when using a proportional prioritization strategy. This is the C++ backend of rlpack._C.memory.Memory.tree_height. If a proportional prioritization strategy is not in use, calling this method will throw an error.
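For readers unfamiliar with sum trees, the sketch below shows a minimal array-backed version in which every internal node stores the sum of its two children. It is not the SumTree class used by C_Memory: the class name, the padding of leaves to a power of two, and the convention that the height counts every level from root to leaves are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical array-backed sum tree; node 1 is the root and the leaves
// occupy positions [capacity_, 2 * capacity_).
class SumTreeSketch {
public:
    explicit SumTreeSketch(std::size_t numLeaves) : capacity_(1), height_(1) {
        // Pad the leaf count to a power of two, adding one level per doubling.
        while (capacity_ < numLeaves) {
            capacity_ *= 2;
            ++height_;
        }
        nodes_.assign(2 * capacity_, 0.0);
    }

    // Sets the priority of one leaf and refreshes the sums on the path to
    // the root.
    void update(std::size_t leaf, double priority) {
        if (leaf >= capacity_) {
            throw std::out_of_range("leaf out of range");
        }
        std::size_t node = capacity_ + leaf;
        nodes_[node] = priority;
        for (node /= 2; node >= 1; node /= 2) {
            nodes_[node] = nodes_[2 * node] + nodes_[2 * node + 1];
        }
    }

    double total() const { return nodes_[1]; }

    // Returns the leaf whose cumulative priority range contains `target`,
    // for 0 <= target < total().
    std::size_t find(double target) const {
        std::size_t node = 1;
        while (node < capacity_) {
            const std::size_t left = 2 * node;
            if (target < nodes_[left]) {
                node = left;
            } else {
                target -= nodes_[left];
                node = left + 1;
            }
        }
        return node - capacity_;
    }

    std::size_t height() const { return height_; }

private:
    std::size_t capacity_;
    std::size_t height_;
    std::vector<double> nodes_;
};
```

Both update and find walk a single root-to-leaf path, so their cost grows with the tree height rather than with the number of stored transitions.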
void C_Memory::update_priorities (torch::Tensor &randomIndices, torch::Tensor &newPriorities)
The method to update priorities with the new values computed by the agent as per the prioritization strategy. This is the C++ backend of the rlpack._C.memory.Memory.update_priorities method.
randomIndices | : The random indices at which priorities are to be updated. C_Memory::sample provides these indices. |
newPriorities | : The new priorities computed by the agent as per the prioritization strategy. |
C_Memory::C_MemoryData C_Memory::view () const
Returns a C_Memory::C_MemoryData object that contains references to the data in C_Memory and provides an easy data view. This is the C++ backend of the rlpack._C.memory.Memory.view method.