-
-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix issue #11189 part 00 refactor citation relation tab logic #11845
base: main
Are you sure you want to change the base?
Fix issue #11189 part 00 refactor citation relation tab logic #11845
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your code currently does not meet JabRef's code guidelines.
We use Checkstyle to identify issues.
The tool reviewdog already placed comments on GitHub to indicate the places. See the tab "Files" in you PR.
Please carefully follow the setup guide for the codestyle.
Afterwards, please run checkstyle locally and fix the issues.
You can check review dog's comments at the tab "Files changed" of your pull request.
9fe8522
to
cbe9e96
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your code currently does not meet JabRef's code guidelines.
We use Checkstyle to identify issues.
The tool reviewdog already placed comments on GitHub to indicate the places. See the tab "Files" in you PR.
Please carefully follow the setup guide for the codestyle.
Afterwards, please run checkstyle locally and fix the issues.
You can check review dog's comments at the tab "Files changed" of your pull request.
cbe9e96
to
8231340
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your code currently does not meet JabRef's code guidelines.
We use OpenRewrite to ensure "modern" Java coding practices.
The issues found can be automatically fixed.
Please execute the gradle task rewriteRun
, check the results, commit, and push.
You can check the detailed error output by navigating to your pull request, selecting the tab "Checks", section "Tests" (on the left), subsection "OpenRewrite".
8231340
to
33967c2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your code currently does not meet JabRef's code guidelines.
We use OpenRewrite to ensure "modern" Java coding practices.
The issues found can be automatically fixed.
Please execute the gradle task rewriteRun
, check the results, commit, and push.
You can check the detailed error output by navigating to your pull request, selecting the tab "Checks", section "Tests" (on the left), subsection "OpenRewrite".
33967c2
to
592d4d7
Compare
d94f4d3
to
3155242
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add code explanations to the PR
src/main/java/org/jabref/gui/entryeditor/citationrelationtab/BibEntryRelationsCache.java
Show resolved
Hide resolved
src/main/java/org/jabref/gui/entryeditor/citationrelationtab/BibEntryRelationsRepository.java
Show resolved
Hide resolved
src/main/java/org/jabref/gui/entryeditor/citationrelationtab/CitationRelationsTab.java
Show resolved
Hide resolved
src/main/java/org/jabref/gui/entryeditor/citationrelationtab/CitationRelationsTab.java
Show resolved
Hide resolved
src/main/java/org/jabref/gui/entryeditor/citationrelationtab/CitationRelationsTab.java
Show resolved
Hide resolved
src/main/java/org/jabref/logic/citation/repository/LRUBibEntryRelationsCache.java
Outdated
Show resolved
Hide resolved
src/main/java/org/jabref/logic/citation/repository/LRUBibEntryRelationsCache.java
Outdated
Show resolved
Hide resolved
src/main/java/org/jabref/logic/citation/repository/LRUBibEntryRelationsRepository.java
Outdated
Show resolved
Hide resolved
src/main/java/org/jabref/logic/citation/service/SearchCitationsRelationsService.java
Outdated
Show resolved
Hide resolved
Please no force push if not needed. All commits will be squashed when merged |
* Move repository, cache, and fetcher to logic package * Move citations model to model/citations/semanticscholar package
* Introduce service layer * Rename LRU cache implementation * Add tests helpers for repository
* Move logic from repository to service * Refactor repositories * Update tab configuration
3155242
to
18db75e
Compare
Sorry, just re-based main branch locally. |
…lation-tab-logic # Conflicts: # src/main/java/org/jabref/gui/entryeditor/citationrelationtab/CitationRelationsTab.java
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general looks good. Some minor comments.
Sorry for delay. Please go ahead with everything.
src/main/java/org/jabref/logic/citation/service/SearchCitationsRelationsService.java
Outdated
Show resolved
Hide resolved
src/main/java/org/jabref/gui/entryeditor/citationrelationtab/CitationRelationsTab.java
Outdated
Show resolved
Hide resolved
src/main/java/org/jabref/gui/entryeditor/citationrelationtab/CitationRelationsTab.java
Show resolved
Hide resolved
src/test/java/org/jabref/logic/citation/service/SearchCitationsRelationsServiceTest.java
Outdated
Show resolved
Hide resolved
Small other comments - IntelliJ proposed to extract a method
in the tests - maybe you can also include that. |
@alexandre-cremieux Please pull before you continue working on it - I merged |
src/main/java/org/jabref/logic/citation/repository/LRUBibEntryRelationsRepository.java
Outdated
Show resolved
Hide resolved
src/main/java/org/jabref/logic/citation/service/SearchCitationsRelationsService.java
Outdated
Show resolved
Hide resolved
src/main/java/org/jabref/logic/citation/service/SearchCitationsRelationsService.java
Outdated
Show resolved
Hide resolved
Thanks for the review and the merge. I will resume the work on this branch and apply the changes. |
@alexandre-cremieux Sorry for the merge conflicts - can you handle them? I was always happy with IntelliJ's "resolve merge conflicts" dialog. Hope, it works in this case, too. |
Hello @koppor . Seems that we have new conflicts to resolve to be able to merge main. But I will do that when the feature will be fully developed. Was quite busy last month, I resumed the work this week. PR comments were addressed. |
/** | ||
* TODO: Make the method return a callable and let the calling method create the background task. | ||
*/ | ||
private BackgroundTask<List<BibEntry>> createBackGroundTask( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: Background (without uppercase G)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The highlighted TODO: Which benefts are this - and why should it implemetned lateR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This TODO follows a discussion we had earlier that you can read here: #11845 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But what stops you from returning a runnable?
If it's used in many places and that makes you rewrite a lot of code to include BackgroundTask
constructor, then maybe returning a BackgroundTask
right away was not a bad code decision 😃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just in theory, this could be needed to keep the level of abstration. Should be decided on where the method is actually used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Callable
is expected by the BackgroundTask
. Also, Callable
has less overhead than a BackgroundTask
, it is more convenient to use this type in order to:
- Ensure that the same task is not already being executed (for same tab for instance) => why kill a task that is already running if we want to minimize the application's footprint ?
- Insert it into a waiting queue if we want to provide a proper task management logic for this tab (see code of this class => we have one executor per tab instance but the tasks are shared between all those instances...)
- Decouple the callable implementation from the task itself
As mentioned, this improvement could indeed imply several code changes (in this class):
- refresh the screen properly => is the user focus on the tab for which a task was ended ?
- set up a queue would need to override
equals
andhashCode
and so provide a proper implementation of a callable - proper testing, etc.
At least, I think we agree that this should not be part of this PR. As I saw that the current implementation could be improved (stopping a task that maybe shouldn't be stopped, avoid same task but different executors for each instances of the tab, etc). I proposed to take a look to this after merging this PR. Reason why the TODO is there (after Oliver's accepted that at that time).
But I can remove completely the TODO also if it bother you (and I understand that cause I do not like TODO also). However, I already have another to issue opened for which I volunteered related to same tab...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just add a link your reasoning in your comment in the to-do?
public SearchCitationsRelationsService( | ||
CitationFetcher citationFetcher, BibEntryRelationsRepository repository | ||
) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Codestyle seems off
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check style test ran and is green (as reported bellow).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was late here, and my energy was drawn by checking the CHANGELOG.md. Sorry for that. I will try to review properly the next days.
CHANGELOG.md
Outdated
@@ -11,11 +11,14 @@ Note that this project **does not** adhere to [Semantic Versioning](https://semv | |||
|
|||
### Added | |||
|
|||
- We introduced a settings parameters to manage citations' relations local storage time-to-live. [#11189](https://github.com/JabRef/jabref/issues/11189) | |||
- We added a 'Copy to' context menu option with features for cross-reference inclusion/exclusion, as well as the ability to save user preferences. [#12374](https://github.com/JabRef/jabref/pull/12374) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really in this PR? - Think, this is from a merge conflict and needs to be removed.
(The next line is the feature)
The line at hand is from https://github.com/JabRef/jabref/pull/12374/files#diff-06572a96a58dc510037d5efa622f9bec8519bc1beab13c9f251e97e657a9d4ed - but really seems to be reworded to the next one.
"Proof": Same link.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to be an update merge artifact
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, thank. It is indeed a merge artifact.
CHANGELOG.md
Outdated
@@ -11,11 +11,14 @@ Note that this project **does not** adhere to [Semantic Versioning](https://semv | |||
|
|||
### Added | |||
|
|||
- We introduced a settings parameters to manage citations' relations local storage time-to-live. [#11189](https://github.com/JabRef/jabref/issues/11189) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the default value? (in case users read the CHANGELOG, it should more-or-less be self contained)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the default value in the description.
CHANGELOG.md
Outdated
@@ -25,6 +28,7 @@ Note that this project **does not** adhere to [Semantic Versioning](https://semv | |||
- We fixed an issue where a bib file with UFF-8 charset was wrongly loaded with a different charset [forum#5369](https://discourse.jabref.org/t/jabref-5-15-opens-bib-files-with-shift-jis-encoding-instead-of-utf-8/5369/) | |||
- We fixed an issue where new entries were inserted in the middle of the table instead of at the end. [#12371](https://github.com/JabRef/jabref/pull/12371) | |||
- We fixed an issue where removing the sort from the table did not restore the original order. [#12371](https://github.com/JabRef/jabref/pull/12371) | |||
- We fixed the citations relations merge logic. [#11189](https://github.com/JabRef/jabref/issues/11189) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what that means; it is not in the issue, is it?
Either remove that (because we typically have one changelog entry per PR) - or link this PR directly.
If it is kept, it should be more user facing. I think, users cold not merge citations relations, could they?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous LRU caching implementation's merging logic was in fact overwriting the cache instead of `cachingAndMerging'.
But, you are right: this is an implementation detail over which the user has no control. I will remove this line.
Here is the original code (that is now updated and covered by a test): #11845 (comment)
@@ -175,6 +176,9 @@ public void initialize() { | |||
dialogService, | |||
taskExecutor); | |||
Injector.setModelOrService(AiService.class, aiService); | |||
|
|||
var searchRelationsService = new SearchCitationsRelationsService(preferences.getImporterPreferences()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The variable name does not match the class name - i thought, this would be auto generated
(but no need to fix if everyhting else is OK)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I edited the CHANGE_LOG, I am fixing this small flavor of my own ;)
|
||
import org.jabref.model.entry.BibEntry; | ||
|
||
public class BibEntryRelationsRepositoryHelpersForTest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A small JavaDoc could help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
cbe5acc
to
eb3dfd8
Compare
…logic' of https://github.com/alexandre-cremieux/jabref into fix-issue-11189-part-00-refactor-citation-relation-tab-logic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I started to dive deeper. I rewrote code as soon as I found a "mistake".
The most huge mistakes are:
- MVStore is opened on each read --> it has to be started upon open of the library tab
- A caching layer before MVStore --> this is not necesasry. MVSTore can cache. See https://www.h2database.com/html/mvstore.html -->
cacheSize: the cache size in MB.
- The "ubitios langauge" is not used. At
org.jabref.logic.importer.fetcher.CitationFetcher.SearchType
, we "agreed" to useCITES
andCITED_BY
, but the code variables usereference
andcitation
, which are confusing at some places. - The logic currently only supports DOI-based entries to be handled. There is no handling of entries without DOI.
The last point can be left to future work (as it involves storing citation information in the BibEntry itself and merging this)
Fix required:
- One global MV store loaded on start of JabRef
- the names tables in the mvstore are the path of .bib suffixed with -cites and -citing
- Store closed on shutdown of JabRef
I also added JavaDoc comments. This is a good practicse in all cases. |
I am thinking, if a global store of citation relations would be suffice somehow? The code assumes that citations are bound to a DOI (*). Maybe, we should store the citations of one DOI centrally and use that information in each library. IMHO, the Postgres backend can be used to search for the BibEntry belonging to a DOI (in a library). (*) This is a fine assumption as we do not incoroporate other services. |
Hello @koppor, Many thanks for the review. However, I am a bit confused. Let me try to explain my choices based on your comments:
Yes indeed, this is to prevent any error in case of interrupt that would lead to data loss and possibly to file corruption (see the documentation of the MVStore and reported error). By the way, it is impossible to handle gracefully a SIGKILL, so closing the file ASAP seems to be the safer way to avoid such kind of error without having to implement all the bells and whistle that come with a database engine. The first caching layer is here to avoid accessing the disk when citations are already fetched from the store. This avoid losing memory space cause the MVStore cache is fixed. Let's say that we allocate enough RAM to the MVStore to store 10 average entries. If the cache is full and you need to access a new entry that is not cached but does have a size that corresponds to 1 and a half average entry. Then you will remove 2 average entries from the cache to add the new one. Then, instead of having 10 entries in the cache, you will have only 9 (it can be worst than that). From my point of view, interest of the JVM is that it handles and optimise this for you. By the way, the MVStore is using a LRU caching algorithm (LRIS), so there won't be much changes compared to the first layer of cache. By the way, in a database, the entries are often of a fixed size. We are not in this situation here. Another thing with the MVStore is that it needs serialization/deserialization, so you will loose time with those operations even if it is cached. This is not to say that the MVStore cache is implemented for H2 database needs (reentrant locks, memory paging, thread pools in the background, etc.). Do we really need this level of complexity for this use case ? All this reasons made me keep the LRU implementation as a first level cache, cause it is a simple solution.
This is true. However, the By the way, talking about
Okay, does this means that you are not sure about the technology to use here ?
But I did this on my free time to help the project. I will not get anything from it as a reward (I do not do that for this). But before I do the changes you asked, I want to be sure that you are not going to push me back. Could you please tell me exactly what you want and if this design is okay for you before I add more code ? |
Did you do measurements regarding performance? What is the real impact on the users side? Starting from which count of cites are we touching MV Stores limits so that we need to add a custom cache layer in front? I have worked with terrabytes of data. I don't see that we touch this limit here. As far as I understood, the citation information needs to be persisted. |
Its a free time project for all of us... The reward we get is food and sometimes the money for thr train tickets to our yearly convention. Where we work in our free time in the tool. 😅 Happy to see that you are also joining this club somehow.
Sometimes one sees the implication of a decision when seeing the real code. Maybe I just experiment myself, since I need that to really describe what I think is best 😅. Seeing something which causes to think: "That can be done better (whatever metrics for that is applied)" is simpler than describing the specification, the software design, and skeletons of methods... |
High level description
This contributions aims to provide a local storage solution for citations relations using MVStore. Please see #11189 for more information.
Implementation details
No new dependency added.
Caching strategy
The implemented caching logic is following a 2 levels side-caching strategy with:
The interaction between the two levels of cache is backed by a responsibility chain pattern. This helps to:
Search caching high level logic
Serialization
MVStoreDAO uses the canonical representation to serialize a BibEntry and the
BibtexParser
to deserialize it.Configuration
Citations Relations
tab using the JabRef IoC providercitations
folder under JabRef's app directorySetting translation
The the local storage TTL setting message has been traduced in:
Refactoring
This contributions simplifies the citations/references fetching and caching logic by introducing two layers:
This should help to make this feature more extendable without modifying orchestration logic following open/close principle.
Screen shots
Mandatory checks
CHANGELOG.md
described in a way that is understandable for the average user (if applicable)