summaryrefslogtreecommitdiff
path: root/crypto/README-sparse_array.md
diff options
context:
space:
mode:
Diffstat (limited to 'crypto/README-sparse_array.md')
-rw-r--r--crypto/README-sparse_array.md155
1 files changed, 155 insertions, 0 deletions
diff --git a/crypto/README-sparse_array.md b/crypto/README-sparse_array.md
new file mode 100644
index 0000000000..d86a48d9e1
--- /dev/null
+++ b/crypto/README-sparse_array.md
@@ -0,0 +1,155 @@
+The sparse_array.c file contains an implementation of a sparse array that
+attempts to be both space and time efficient.
+
+The sparse array is represented using a tree structure. Each node in the
+tree contains a block of pointers to either the user supplied leaf values or
+to another node.
+
+There are a number of parameters used to define the block size:
+
+ OPENSSL_SA_BLOCK_BITS Specifies the number of bits covered by each block
+ SA_BLOCK_MAX Specifies the number of pointers in each block
+ SA_BLOCK_MASK Specifies a bit mask to perform modulo block size
+ SA_BLOCK_MAX_LEVELS Indicates the maximum possible height of the tree
+
+These constants are inter-related:
+ SA_BLOCK_MAX = 2 ^ OPENSSL_SA_BLOCK_BITS
+ SA_BLOCK_MASK = SA_BLOCK_MAX - 1
+ SA_BLOCK_MAX_LEVELS = number of bits in size_t divided by
+ OPENSSL_SA_BLOCK_BITS rounded up to the next multiple
+ of OPENSSL_SA_BLOCK_BITS
+
+OPENSSL_SA_BLOCK_BITS can be defined at compile time and this overrides the
+built in setting.
+
+As a space and performance optimisation, the height of the tree is usually
+less than the maximum possible height. Only sufficient height is allocated to
+accommodate the largest index added to the data structure.
+
+The largest index used to add a value to the array determines the tree height:
+
+ +----------------------+---------------------+
+ | Largest Added Index | Height of Tree |
+ +----------------------+---------------------+
+ | SA_BLOCK_MAX - 1 | 1 |
+ | SA_BLOCK_MAX ^ 2 - 1 | 2 |
+ | SA_BLOCK_MAX ^ 3 - 1 | 3 |
+ | ... | ... |
+ | size_t max | SA_BLOCK_MAX_LEVELS |
+ +----------------------+---------------------+
+
+The tree height is dynamically increased as needed based on additions.
+
+An empty tree is represented by a NULL root pointer. Inserting a value at
+index 0 results in the allocation of a top level node full of null pointers
+except for the single pointer to the user's data (N = SA_BLOCK_MAX for
+brevity):
+
+ +----+
+ |Root|
+ |Node|
+ +-+--+
+ |
+ |
+ |
+ v
+ +-+-+---+---+---+---+
+ | 0 | 1 | 2 |...|N-1|
+ | |nil|nil|...|nil|
+ +-+-+---+---+---+---+
+ |
+ |
+ |
+ v
+ +-+--+
+ |User|
+ |Data|
+ +----+
+ Index 0
+
+
+Inserting at element 2N+1 creates a new root node and pushes down the old root
+node. It then creates a second second level node to hold the pointer to the
+user's new data:
+
+ +----+
+ |Root|
+ |Node|
+ +-+--+
+ |
+ |
+ |
+ v
+ +-+-+---+---+---+---+
+ | 0 | 1 | 2 |...|N-1|
+ | |nil| |...|nil|
+ +-+-+---+-+-+---+---+
+ | |
+ | +------------------+
+ | |
+ v v
+ +-+-+---+---+---+---+ +-+-+---+---+---+---+
+ | 0 | 1 | 2 |...|N-1| | 0 | 1 | 2 |...|N-1|
+ |nil| |nil|...|nil| |nil| |nil|...|nil|
+ +-+-+---+---+---+---+ +---+-+-+---+---+---+
+ | |
+ | |
+ | |
+ v v
+ +-+--+ +-+--+
+ |User| |User|
+ |Data| |Data|
+ +----+ +----+
+ Index 0 Index 2N+1
+
+
+The nodes themselves are allocated in a sparse manner. Only nodes which exist
+along a path from the root of the tree to an added leaf will be allocated.
+The complexity is hidden and nodes are allocated on an as needed basis.
+Because the data is expected to be sparse this doesn't result in a large waste
+of space.
+
+Values can be removed from the sparse array by setting their index position to
+NULL. The data structure does not attempt to reclaim nodes or reduce the
+height of the tree on removal. For example, now setting index 0 to NULL would
+result in:
+
+ +----+
+ |Root|
+ |Node|
+ +-+--+
+ |
+ |
+ |
+ v
+ +-+-+---+---+---+---+
+ | 0 | 1 | 2 |...|N-1|
+ | |nil| |...|nil|
+ +-+-+---+-+-+---+---+
+ | |
+ | +------------------+
+ | |
+ v v
+ +-+-+---+---+---+---+ +-+-+---+---+---+---+
+ | 0 | 1 | 2 |...|N-1| | 0 | 1 | 2 |...|N-1|
+ |nil|nil|nil|...|nil| |nil| |nil|...|nil|
+ +---+---+---+---+---+ +---+-+-+---+---+---+
+ |
+ |
+ |
+ v
+ +-+--+
+ |User|
+ |Data|
+ +----+
+ Index 2N+1
+
+
+Accesses to elements in the sparse array take O(log n) time where n is the
+largest element. The base of the logarithm is SA_BLOCK_MAX, so for moderately
+small indices (e.g. NIDs), single level (constant time) access is achievable.
+Space usage is O(minimum(m, n log(n)) where m is the number of elements in the
+array.
+
+Note: sparse arrays only include pointers to types. Thus, SPARSE_ARRAY_OF(char)
+can be used to store a string.