Add a Unique Node Table #444
Labels: 📁 internal, ✨ optimisation, 🎓 student project
With Adiar v1.2 we have been able to decrease the threshold at which Adiar is viable in comparison to conventional depth-first BDD packages. Yet, this does not remove the major relative performance decrease for tiny BDDs. The upshot is that Adiar still has very undesirable characteristics in contexts where one manipulates a lot of tiny decision diagrams (e.g. our Picotrav benchmarks).
Interestingly, the threshold we derived in our experiments is somewhat below a threshold in the CAL BDD package [Sanghavi96]: below this size, CAL switches from its breadth-first algorithms to the conventional depth-first ones. Hence, it is likely that we have hit a lower bound on when time-forward processing is a viable option.
For an end-user, it is of interest that Adiar be efficient across the board. That is, we also need to be competitive on the tiniest of instances. To this end, we will (similar to CAL) have to switch to the conventional depth-first approach.
Task: Implement BuDDy's Unique Node Table
From our experiments, the BDD package BuDDy is the fastest implementation. My hypothesis is that this is due to its extremely simple design in terms of code, but also due to the way it lays out its unique node table. Hence, this seems like a great place to start.
The hash table stores values of some node type `T`. It is laid out as a single array of elements `entry<T>`, each of which wraps a `T` and adds two indices: `hash` and `next`. To look up an element of type `T`, its hash value is used as an index into this array; the `hash` member of the entry at said index is the index of the first entry in that bucket. The bucket is then traversed via the `next` pointers until the desired node is found.

To this end, all nodes are initialised with the default value `T{}` and the `next` pointer of element i is set to i+1. This creates one long chain of free nodes. A variable `free` is initially set to i = 0. When a new node is added to the table, the next free node i = `free` is used; `free` is updated to `table[i].next` before `table[i].next` is overwritten (with null or, as described next, the rest of its bucket).

Bucket Ordering
When adding a node to a bucket, it could be placed at many different positions within the chain of `next` pointers. Which position actually is best should be determined based on experiments.
BuDDy adds new nodes at the front of the bucket, as is also done in the sketch below.
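To make the layout concrete, here is a minimal C++ sketch of such a table: the `entry<T>` layout, the free-list initialisation, and insertion at the front of a bucket. All names are hypothetical and details are simplified (e.g. BuDDy packs some of these fields into bit fields, and incrementing the children's reference counts on allocation is elided here):

```cpp
#include <cstdint>
#include <vector>

// Sentinel for "no entry".
constexpr uint32_t NIL = UINT32_MAX;

// An entry wraps a node T and adds the two indices 'hash' and 'next',
// plus a reference count for garbage collection (see below).
template <typename T>
struct entry {
  T data;             // the node itself; T{} while the entry is free
  uint32_t ref_count; // number of references to this node
  uint32_t hash;      // index of the first entry in this slot's bucket
  uint32_t next;      // next entry in the bucket resp. on the free list
};

template <typename T, typename Hash>
class unique_table {
  std::vector<entry<T>> _table;
  uint32_t _free; // index of the first free entry

public:
  unique_table(uint32_t size) : _table(size), _free(0) {
    // Chain all entries into one long free list: next of i is i+1.
    for (uint32_t i = 0; i < size; ++i) {
      _table[i] = { T{}, 0u, NIL, i + 1 };
    }
    _table[size - 1].next = NIL;
  }

  const T& at(uint32_t i) const { return _table[i].data; }

  // Find the index (uid) of 'node', inserting it if not yet present.
  uint32_t find_or_insert(const T &node) {
    const uint32_t bucket = Hash{}(node) % _table.size();

    // Walk the bucket via the 'next' pointers.
    for (uint32_t i = _table[bucket].hash; i != NIL; i = _table[i].next) {
      if (_table[i].data == node) { return i; }
    }

    // Not found: pop the next node off the free list ...
    const uint32_t i = _free;
    _free = _table[i].next;

    // ... and insert it at the front of its bucket (as BuDDy does).
    _table[i].data = node;
    _table[i].next = _table[bucket].hash;
    _table[bucket].hash = i;
    return i;
  }
};
```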
Garbage Collection

For garbage collection, each `entry<T>` also includes a `ref_count` variable. When allocating a node, the reference counts of its children are incremented by one. Here, the `dd` class also updates the reference count of the root in its constructors, assignments, and destructor.

The basic idea behind collection is to free all nodes that have a reference count of zero (recursively decrementing their children's counts). Here, the freed entry's `next` pointer is updated to `free` before the entry is placed at the front of the `free` list (its contents do not need to be reset to `T{}`, as that will be fixed on the next allocation).

BuDDy's garbage collection essentially is a mark-and-sweep in two simple linear scans (which is good for CPU cache performance):

1. In a forward iteration, all non-dead nodes are marked and every hash index is reset back to null.
2. In a backwards iteration, a to-be-kept node is re-inserted at the front of its bucket, whereas a to-be-freed node is prepended to the `free` chain as described above. Since the iteration is backwards, this ensures that all allocated nodes stay clumped together towards the beginning of the array.
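Continuing the hypothetical sketch above, the two scans could look as follows. This assumes the recursive decrementing of dead nodes' children has already happened, so that `ref_count > 0` characterises exactly the nodes to keep:

```cpp
// Inside the hypothetical unique_table<T, Hash> above.
void gc() {
  // Forward scan: reset every bucket head. (BuDDy also marks the live
  // nodes here; with up-to-date reference counts, ref_count > 0 serves
  // as the mark in this sketch.)
  for (uint32_t i = 0; i < _table.size(); ++i) {
    _table[i].hash = NIL;
  }

  // Backward scan: re-insert live nodes into their buckets, free the rest.
  _free = NIL;
  for (uint32_t i = (uint32_t) _table.size(); i-- > 0;) {
    if (_table[i].ref_count > 0) {
      const uint32_t bucket = Hash{}(_table[i].data) % _table.size();
      _table[i].next = _table[bucket].hash; // prepend to its bucket
      _table[bucket].hash = i;
    } else {
      _table[i].next = _free;               // prepend to the free list
      _free = i;
    }
  }
}
```

Since the backward scan prepends dead entries, the free list ends up sorted by increasing index, which is what keeps allocated nodes clumped at the start of the array.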
Task: Sweeping through the Unique Node Table

How do we best switch from the internal memory (depth-first) paradigm to the external memory (time-forward processing) one? And how do we go back to the (depth-first) unique node table when the result of some time-forward processing algorithm turns out to be small enough to fit into it?
We could do so by converting the data from one paradigm's representation to the other's. But this has a computational overhead that, in fact, is not needed: we can run the time-forward processing algorithms directly on inputs from the depth-first algorithms and also feed their results back into the unique node table.
Unique Node Table in Top-down Sweeps

Adiar's top-down algorithms exploit a levelized ordering of the nodes' unique identifiers (uids). Currently, they use a `node_stream` to `seek(...)` (#401) a uid within a file on disk.

For the algorithm to be correct, this node does not have to be read from disk: the uid of a node from the depth-first algorithms is merely its index in the unique node table. Hence, we can replace the `node_stream` within the algorithm with another class that merely does an O(1) lookup in the unique node table on each `seek(...)`.

This can be done merely by templating the algorithm, as was done for v1.2. In fact, the work on v2.0.0-beta.3 already implements an algorithm that changes the order of Apply to exploit random access.
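As a sketch (with hypothetical names, continuing the `unique_table` above), such a replacement class only needs to provide the `seek(...)` interface the templated algorithms expect:

```cpp
// Hypothetical drop-in replacement for the node_stream in the templated
// top-down algorithms: seek(...) becomes an O(1) array lookup rather than
// a (possibly disk-backed) scan.
template <typename T, typename Hash>
class node_random_access {
  const unique_table<T, Hash> &_table;

public:
  node_random_access(const unique_table<T, Hash> &table) : _table(table) {}

  // The uid of a depth-first node is merely its index in the table.
  const T& seek(uint32_t uid) const {
    return _table.at(uid); // O(1); no disk access
  }
};
```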
Unique Node Table in Reduce Sweeps
These changes can, again, be achieved with templates:

The size of the unreduced BDD is an upper bound on the output size of Reduce. We can check whether the unreduced input would fit into the (depth-first) unique node table, assuming none of its current nodes can be reused. If so, we can run the Reduce algorithm with the `node_writer` replaced by the unique node table.

Since we are outputting nodes into the unique node table, we may as well use its much faster hashing to identify duplicate nodes. This entirely skips two sorting steps of the current algorithm, which should (roughly) double its performance.
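A minimal sketch of what such an output policy could look like (hypothetical names, reusing the `unique_table` sketch from above):

```cpp
// Hypothetical node_writer-like output policy for a templated Reduce:
// pushing a node deduplicates via the table's hashing and returns the uid
// (table index) with which the parents should reference it.
template <typename T, typename Hash>
class table_writer {
  unique_table<T, Hash> &_table;

public:
  table_writer(unique_table<T, Hash> &table) : _table(table) {}

  uint32_t push(const T &node) {
    // find_or_insert already identifies duplicate nodes through hashing,
    // so the two sorting steps of the file-based Reduce can be skipped.
    return _table.find_or_insert(node);
  }
};
```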
Level Information
The levelized priority queue needs to know the levels a priori to generate its buckets. Hence, Reduce also needs to create the support of the BDD as a separate And-chain of its variables. Furthermore, the levelized priority queue needs to abstract its way of obtaining the levels, such that it can freely mix and match between files and the table.
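For illustration, such an abstraction could be as simple as the following (hypothetical; in practice, Adiar would presumably use a template parameter rather than virtual functions to avoid overhead):

```cpp
// Hypothetical abstraction over where the level information comes from:
// one implementation reads it from a file on disk (as today), another
// enumerates the support stored in the unique node table. The levelized
// priority queue only sees this interface when generating its buckets.
class level_info_source {
public:
  virtual bool has_next() = 0;       // more levels left?
  virtual uint32_t next_level() = 0; // obtain the next level's label
  virtual ~level_info_source() = default;
};
```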
Garbage Collection
Furthermore, we need to consider when to trigger garbage collection. For now, it might be best to do so at the start of Reduce, if there is not enough space in the table (and some BDDs in it have died since the last collection).
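For example, the decision at the start of Reduce could boil down to a check along these lines (a sketch with hypothetical names and statistics):

```cpp
#include <cstdint>

// Hypothetical statistics exposed by the unique node table.
struct table_stats {
  uint64_t free_entries; // entries currently on the free list
  uint64_t dead_entries; // entries with ref_count == 0, reclaimable by gc()
};

// Returns whether Reduce can output into the unique node table, possibly
// after a garbage collection; otherwise, fall back to the external memory
// (file-based) output. 'unreduced_size' is the node count of the input,
// i.e. an upper bound on the output size of Reduce.
bool fits_in_table(const table_stats &s, uint64_t unreduced_size) {
  if (unreduced_size <= s.free_entries) { return true; }
  // Not enough free space as-is, but collecting dead entries may suffice.
  return unreduced_size <= s.free_entries + s.dead_entries;
}
```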
References
[Sanghavi96] Jagesh V. Sanghavi, Rajeev K. Ranjan, Robert K. Brayton, and Alberto Sangiovanni-Vincentelli. “High performance BDD package by exploiting memory hierarchy”. In: Proceedings of the 33rd Annual Design Automation Conference (1996)
[Pastva2023] Samuel Pastva and Thomas Henzinger. “Binary Decision Diagrams on Modern Hardware”. In: Proceedings of the 23rd Conference on Formal Methods in Computer-Aided Design (2023)