Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reentrant allocator #16

Merged
merged 11 commits into from
Oct 21, 2018
Merged

Reentrant allocator #16

merged 11 commits into from
Oct 21, 2018

Conversation

TheCount
Copy link

@TheCount TheCount commented Oct 5, 2018

This is an early preparation for a fix to #12.

It basically does two things:

  • Remove allocate.h from the yaep install/interface: it's not actually needed for using the yaep library.
  • Rewrite the allocate.h interface internally so the allocator becomes reentrant. Incidentally, this should remove the remaining memory leak.

Note that the new allocate.h internal interface is not compatible to the old one, and the fact that the allocator is now reentrant makes it necessary to carry an alloc handle around, which had ramifications throughout the code.

Bonus feature: yaep now has a convenience function to free a returned parse tree.

Free parser list when freeing grammar.
Fix a memory leak where the memory used for YAEP_NIL and YAEP_ERROR
nodes would not be freed if they were not used in the parse tree.
C++ already has error detection by throwing std::bad_alloc in case of
allocation failure.
@sanssecours
Copy link
Contributor

Thank you for working on this.

The first thing I noticed is that the unit tests still fail after I apply this pull request. I really think the tests should work before and after someone adds code to the main branch of the repository.

I also noticed that I do not know how to use the new memory management code. If I do not change my code, then the compiler reports the following errors:

In file included from src/plugins/yawn/convert.cpp:24:
In file included from src/plugins/yawn/memory.hpp:13:
/usr/local/include/yaep/objstack.h:346:2: error: unknown type name 'YaepAllocator'
        YaepAllocator * os_alloc;
        ^
/usr/local/include/yaep/objstack.h:355:15: error: unknown type name 'YaepAllocator'
        explicit os (YaepAllocator * allocator, size_t initial_segment_length = OS_DEFAULT_SEGMENT_LENGTH);
                     ^
/usr/local/include/yaep/objstack.h:505:17: error: unknown type name 'YaepAllocator'
        friend os::os (YaepAllocator *, size_t);

. I fixed the problem by replacing the memory management function with malloc. Since you are certainly more familiar with the code than me, could you maybe tell me if that is a good approach? What allocation function would you use as argument for the method yaep::parse?

To end this on a positive note, AddressSanitizer does not report any memory leaks if the OS creates a yaep object any more. The method yaep::parse still seems to leak memory though.

@TheCount
Copy link
Author

TheCount commented Oct 7, 2018

The first thing I noticed is that the unit tests still fail after I apply this pull request. I really think the tests should work before and after someone adds code to the main branch of the repository.

On my machine, all the tests succeed. To debug what's going wrong with the Travis tests, we need more info (the generated yaep_tyaep.c and contents of the two generated outputs). I don't know Travis very well. Is there some option to obtain this info?

Just a thought: which branch are you applying the pull request to? vnmakarov/main or some other branch? Your PR #15 is quite possibly incompatible with my PR #16 (see below).

I also noticed that I do not know how to use the new memory management code. If I do not change my code, then the compiler reports the following errors: […]

In making the new allocator, I changed the API of OS, VLO, and hash_table. I didn't think much of it since these were private interfaces anyway (I actually even removed allocate.h, since yaep doesn't need to make that public). However, almost simultaneously, you created #15, making (at least) OS public. Hence, boom.

I fixed the problem by replacing the memory management function with malloc. Since you are certainly more familiar with the code than me, could you maybe tell me if that is a good approach?

I'm not sure what you mean here. If you replace all calls of yaep_malloc() with malloc(), and so on, you lose the error recovery feature of allocate.h. As long as there is enough memory, you won't notice a difference, though.

What allocation function would you use as argument for the method yaep::parse?

I'd like to say malloc() and free() are certainly good choices, except that the interface is a little botched because parse_alloc() expects an int instead of a size_t. Maybe a good approach would be to extend yaep_parse() to accept null pointers for parse_alloc() and parse_free(), using some sensible internal default in this case.

The method yaep::parse still seems to leak memory though.

If you call yaep_free_grammar() after yaep_parse(), the only memory that should remain is the parse tree (and to free that, I've added a new function yaep_free_tree()). Do you see other leakage?


So, how to go on from here?

  • Why/how are you using OS? If it's outside the context of yaep, then maybe it should be a library of its own.
  • I'm planning to do some work on the tests (see Build and testing system needs an overhaul #17). This should make it easier to deal with test failures.
  • Your thoughts?

@sanssecours
Copy link
Contributor

Thank you for the quick, extensive and helpful reply.

On my machine, all the tests succeed.

Interesting, on my machine the test suite fails, just like it does on Travis.

Is there some option to obtain this info?

Yep, just sign up with your GitHub account at http://travis-ci.com and enable testing for your version of the YAEP repository. For more information, please take a look here. Afterwards just change the file .travis.yml to retrieve the information you need.

Just a thought: which branch are you applying the pull request to? vnmakarov/main or some other branch?

I crated a branch based on this PR, uninstalled YAEP and then installed YAEP via ./configure && make -C src install again.

Your PR #15 is quite possibly incompatible with my PR #16 (see below).

I assumed as much. Anyway, I did not apply your code on my pull request or vice versa. I just used the code of this PR to build and install YAEP. I also removed objstack.h from /usr/local/include.

I'm not sure what you mean here.

The C++ method yaep::parse contains the argument void *(*parse_alloc) (int nmemb). Unfortunately, I need to provide a memory allocation function for this argument or parsing will not work.

If you replace all calls of yaep_malloc() with malloc(), and so on, you lose the error recovery feature of allocate.h. As long as there is enough memory, you won't notice a difference, though.

Interesting, I did not know that such a feature exists.

Maybe a good approach would be to extend yaep_parse() to accept null pointers for parse_alloc() and parse_free(), using some sensible internal default in this case.

That would be really great, especially if the C++ interface would provide the same feature.

If you call yaep_free_grammar() after yaep_parse(), the only memory that should remain is the parse tree (and to free that, I've added a new function yaep_free_tree()). Do you see other leakage?

As far as I can tell, yaep_free_grammar will be called automatically by the destructor of yaep. I guess I still have to free the parse tree then. Unfortunately yaep_free_tree is not accessible from within C++. According to the documentation, the function has also to be called after yaep_free_grammar(). I do not know how I can do that, and still be able to use RAII. I guess I have to live with the memory leaks in my code for now.

Why/how are you using OS?

I use OS because the tests (yaep++.tst) also use this code. Since the documentation of YAEP is not very extensive, I just copied the code from one of the tests, and modified it until I had a minimal working example. Afterwards I used what I had learned to create parsing code for a very small subset of YAML.

Your thoughts?

I would prefer that YAEP does not require that I provide a memory allocation function at all. As a user of YAEP I (currently) do not care what code the library uses to allocate and deallocate memory. In my opinion an interface that makes memory management easier would be really great. For example, I think using unique_ptr to manage the memory for the parse tree would be a nice improvement.

@TheCount
Copy link
Author

TheCount commented Oct 7, 2018

yaep_free_tree() not having a counterpart in the C++ interface is an oversight on my part, which I will fix by adding a commit on top of this pull request.

Regarding yaep_parse()/yaep::parse(), the specification currently says that parse_alloc is used for allocation, and that parse_free is the counterpart to parse_alloc unless parse_free is a null pointer, in which case the memory is not freed. So it's possible to extend the semantics as follows without breaking backwards compatibility:

  • If both parse_alloc and parse_free are null pointers, default versions are provided by the library.
  • It is an error if parse_alloc is a null pointer and parse_free is not.
  • If parse_alloc is not a null pointer, but parse_free is, the memory will not be deallocated.
    As for yaep_free_tree(), we can say that a null pointer for parse_free provides a default version, and if you don't want the memory to be freed, you just don't call yaep_free_tree(), period.

I'll also add this on top of this PR.

Thank you for the pointers regarding Travis. I'll look into that and into your OS usage after I have fixed the above things.

@TheCount
Copy link
Author

TheCount commented Oct 7, 2018

Added two commits, see #16 (comment)

@sanssecours
Copy link
Contributor

Wow, you are really fast 😊. I was able to fix the memory leaks using the latest code from your PR. Thank you very much.

@TheCount
Copy link
Author

TheCount commented Oct 7, 2018

OK, I've also got travis to run and see this: https://travis-ci.org/TheCount/yaep/jobs/438340845

What happens here is essentially that yaep_read_grammar() frees the grammar on error even though it does not "own" it. Later attempts to use the grammar's allocator lead to a segmentation fault.

The API description does not mention this behaviour, so I'll just remove it.

TheCount added a commit to TheCount/yaep that referenced this pull request Oct 7, 2018
When yaep_read_grammar() fails, it must not free the grammar; this needs
to be done by the caller of yaep_create_grammar().

See vnmakarov#16 (comment)
When yaep_read_grammar() fails, it must not free the grammar; this needs
to be done by the caller of yaep_create_grammar().

See vnmakarov#16 (comment)
@vnmakarov
Copy link
Owner

Thank you for the patches. It is a big work. I'll try to review the patches on the next week and, if they are ok, I'll merge them into the repository.

YAEP was originally implemented on C. For some reasons, I decided to add C++ interface. But basically it was a wrap up around C.

YAEP was a part of COCOM toolset and as many other tools it used a lot of common data-structures (os, vlo, hashtab etc). I don't work on COCOM anymore (some tool designs are too outdated) but on the base of it programming language DINO implemented http://dino-lang.github.io/ (it is my language to do a research on dynamic programming languages implementations). Earley parser is part of dino standard library. So YAEP code is very old and probably need some modernization.

@vnmakarov vnmakarov merged commit 96aec83 into vnmakarov:master Oct 21, 2018
vnmakarov added a commit that referenced this pull request Oct 21, 2018
@vnmakarov
Copy link
Owner

Alexander, thank you again for the patches. I've merged the pull request.

YAEP code uses GNU standard style. So I've changed the format of your changes after the merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants