Alloc metadata storage scheme. #282
Conversation
…(using slab occupancy bits space).
This does not implement any kind of feature, so there is not much feedback I can provide. To know whether the architecture works, you need to have the code do something. Only then can we judge the code.
@@ -217,13 +280,22 @@ public:
 		return (bits >> 48) & Mask;
 	}
 
-	uint allocate() {
+	uint allocate(bool isAppendable = false, bool isFinalizable = false) {
This is unlikely to be the API we want. We are going to allocate slots in bulk and store them in a per-thread cache, as per #269, so this API already doesn't work, unless we want to have 4 caches, one for every combination of flags, which doesn't seem workable.
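A rough sketch of the tension being described; ThreadCache, SlotRef and the field names are made up for illustration and are not the actual SDC design. It only shows why allocate-time flags push a per-thread cache toward one bin per flag combination:

// Hypothetical sketch only, not the actual SDC thread cache.
struct SlotRef {
	void* extent; // stand-in for a pointer to the owning Extent
	uint index;
}

struct ThreadCache {
	// With allocate(bool, bool), cached slots carrying different flag
	// combinations are not interchangeable, so a naive design needs one
	// bin per combination -- the 4x blow-up mentioned above.
	SlotRef[][2][2] binsPerFlagCombo; // indexed by [isAppendable][isFinalizable]

	// If the flags are instead applied to the slot after it is handed out,
	// a single bin (per size class) suffices.
	SlotRef[] bin;
}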
Why would it be necessary to have separate caches for the combinations of flags when caching per thread? The flags are only set when a slot is occupied.
Also, I thought we were going to start with the simplest mechanisms and then refine?
What do you gain by increasing the cache size by 4x?
@deadalnix do you have a description of the intended cache scheme somewhere that would explain why using this scheme would require a 4x larger cache? Slabs were not even mentioned in #269 or the linked snippets from jemalloc.
@deadalnix Finally understood what you meant here, following our Slack conversation re: when in the pipeline the metadata ought to be set (i.e. in tcache rather than in Extent or the arena).
sdlib/d/gc/extent.d
Outdated
	}

	@property
	ref _meta extMeta() {
This isn't used anywhere, why add it?
	}

	@property
	bool isFinalizable(uint index) {
I highly doubt this is the API we want. In what context will we want to ask the question in this manner? When collecting, we'll probably go over the whole slab, either asking the question or not depending on the size class, right off the bat. That question would be easy to answer if the code actually did something.
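A rough sketch of the whole-slab approach suggested here, assuming a per-slab decision derived from the size class; sweepSlab, releaseSlot and the E parameter are made-up names, not the actual SDC collection code:

// Illustrative only.
void sweepSlab(E)(ref E extent, uint slotCount, bool slabMayFinalize,
                  scope void delegate(uint index, bool runFinalizer) releaseSlot) {
	foreach (index; 0 .. slotCount) {
		// slabMayFinalize is decided once per slab from the size class;
		// only when it is true do we consult the per-slot flag.
		releaseSlot(index, slabMayFinalize && extent.isFinalizable(index));
	}
}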
isFinalizable is intended to be used when destroying a slab allocation (in which case the finalizer is expected to be present in the tail of the block, and is invoked). isAppendable is intended to be used when carrying out an operation which requires appendability (or invoking the predicate which determines whether such operations may be used) on a slab allocation.
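A minimal sketch of the destroy path described here, assuming the finalizer pointer really is stored in the tail of the block; destroySlabAlloc, Finalizer and the E parameter are illustrative names, not the actual SDC entry points:

// Illustrative only.
alias Finalizer = void function(void* ptr, size_t size);

void destroySlabAlloc(E)(ref E extent, uint index, void* block, size_t slotSize) {
	if (extent.isFinalizable(index)) {
		// The finalizer is assumed to sit at the very end of the block.
		auto f = *cast(Finalizer*) (cast(ubyte*) block + slotSize - Finalizer.sizeof);
		f(block, slotSize - Finalizer.sizeof);
	}

	// ... then the slot would be released back to the slab as usual.
}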
Please note that it is impossible to answer the question of e.g. isFinalizable simply by looking at a slab's size class -- a slab alloc in that class may nonetheless not contain a finalizer. Similarly for isAppendable. I did not exclude these classes from containing "non-special" allocs (i.e. without a finalizer or length appendix).
sdlib/d/gc/extent.d
Outdated
	@property
	ulong slabSlots() const {
		assert(isSlab(), "slabSlots accessed on non slab!");
		return ulong(binInfos[sizeClass].slots);
	}
Every time this is called, you take the risk of taking a cache miss. There is no reason to make it look like member data, and more generally, I doubt it's actually what's needed, because I doubt most of these computations ought to happen in this class. In fact, #283 kinda gives it away, as computation there is redundant with computation here, so even without the code actually doing anything, it's already suspicious.
How can one determine which meta bits are valid for a given slab alloc size class without knowing the slot size? And it doesn't seem that there is any convenient place to cache it locally in Extent. And where would you have the computation happen if not in Extent?
Also, there is nothing redundant about this piece vs. #283. In the latter, we determine the effective size of a slab alloc required to support the flags shown here. Let's illustrate with an excerpt from the size class dumper:
id: 0 size: 8 needPages: 1 slots: 512
id: 1 size: 16 needPages: 1 slots: 256
id: 2 size: 24 needPages: 3 slots: 512
id: 3 size: 32 needPages: 1 slots: 128
id: 4 size: 40 needPages: 5 slots: 512
id: 5 size: 48 needPages: 3 slots: 256
id: 6 size: 56 needPages: 7 slots: 512
id: 7 size: 64 needPages: 1 slots: 64
id: 8 size: 80 needPages: 5 slots: 256
id: 9 size: 96 needPages: 3 slots: 128
id: 10 size: 112 needPages: 7 slots: 256
With the given scheme, classes 0, 2, 4, and 6 leave no room for any flags.
Classes 1, 5, 8, and 10 allow only one flag per slab slot.
All others permit two flags per slot.
In #283 we compute any required up-sizing of a slab alloc so that it can carry the requested flags, whereas here we access the flags themselves.
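The arithmetic implied by these numbers, written out as a sketch; the 512-bit total is an assumption inferred from the dump (the 512-slot classes leave no spare bits at all), not something stated in this PR:

// Sketch only; flagsPerSlot is a made-up helper.
uint flagsPerSlot(uint slots) {
	enum totalBits = 512;  // assumed size of the per-slab occupancy bit space
	enum definedFlags = 2; // isAppendable and isFinalizable

	auto bitsPerSlot = totalBits / slots;
	auto spare = bitsPerSlot - 1; // one bit per slot is the occupancy bit itself

	return spare < definedFlags ? spare : definedFlags;
}

unittest {
	assert(flagsPerSlot(512) == 0); // classes 0, 2, 4, 6
	assert(flagsPerSlot(256) == 1); // classes 1, 5, 8, 10
	assert(flagsPerSlot(128) == 2); // classes 3, 9
	assert(flagsPerSlot(64) == 2);  // class 7
}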
Thinking about it, I can hardcode the classes in the predicates in Extent; this is probably what you meant when you said "redundant". But this would leave less flexibility to modify the size classes later. Is this what we want?
Which means you can do (sc & 0x01) || (sc > 6) to know which size classes are suitable for appendable allocations. You can use a unittest to ensure the implementation matches the intended behavior (256 entries or less).
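A sketch of such a unittest, covering only the eleven size classes dumped above; hasRoomForFlag is a made-up name for the hardcoded predicate:

// Sketch only.
bool hasRoomForFlag(uint sc) {
	return (sc & 0x01) || (sc > 6);
}

unittest {
	// Slot counts for classes 0 .. 10 as printed by the size class dumper above.
	immutable uint[11] slots =
		[512, 256, 512, 128, 512, 256, 512, 64, 256, 128, 256];

	foreach (sc, slotCount; slots) {
		// Classes with 512 slots have no spare bits; all others have at least one.
		assert(hasRoomForFlag(cast(uint) sc) == (slotCount < 512));
	}
}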
@deadalnix Excellent point, thank you! I will add this and some tests.
I'm going to close this. There are too many PRs in parallel that are not moving.