Improve content scanning #149

Abestanis · 2024-07-21T11:14:07Z

This includes #145 and is based on top of #148.

This makes content scanning more robust: exceptions are now caught and reported at the item level, so a single failing item no longer fails the entire content search. This implements the first checkbox on #127.

I think it is valuable to report these as non-fatal errors, so we can see if we are too strict and need to make a content attribute nullable.
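
For readers skimming the description, here is a minimal sketch of the per-item approach described above. It is not the code from this PR: `ScanResult` and `parseAll` are hypothetical names made up for illustration, and the actual change reports the failures as non-fatal errors rather than only collecting them in a list.

```dart
/// Hypothetical result holder, not a type from this PR.
class ScanResult<T> {
  final List<T> items;
  final List<Object> errors;
  ScanResult(this.items, this.errors);
}

/// Parses every raw map on its own, so one bad entry cannot abort the scan.
ScanResult<T> parseAll<T>(
  Iterable<Map<String, dynamic>> rawItems,
  T Function(Map<String, dynamic>) parse,
) {
  final items = <T>[];
  final errors = <Object>[];
  for (final raw in rawItems) {
    try {
      items.add(parse(raw));
    } catch (error) {
      // A failed cast (TypeError) or any other parsing failure ends up here.
      // In the sketch we only collect it; reporting is left to the caller.
      errors.add(error);
    }
  }
  return ScanResult(items, errors);
}

void main() {
  final raw = [
    {'id': 1, 'title': 'A'},
    {'id': 'oops', 'title': 'B'}, // wrong type for id
    {'id': 3, 'title': 'C'},
  ];
  final result = parseAll(
    raw,
    (map) => '${map['id'] as int}: ${map['title'] as String}',
  );
  print(result.items); // [1: A, 3: C]
  print(result.errors.length); // 1
}
```

The second entry fails its cast, is recorded as an error, and the other two entries are still returned.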

nt4f04uNd

General comment: the solution is fine, but the tests are hard to read. If we could make them a bit simpler or more readable, that would be great.

nt4f04uNd · 2024-10-19T20:10:29Z

test/logic/models/content_test.dart

+          albumWith(id: 19, lastYear: 2000, firstYear: 3000),
+        ),
+      ];
+      final propertiesThatCanBeMissing = ['albumArt', 'artistId', 'firstYear', 'lastYear', 'numberOfSongs'];


If we add a new property, this will likely not be updated

The way I see it, that's not going to be a problem. The tests verify the current behaviour, so any change will (and should) break them. If we add a nullable property, the subSets function below will generate album maps where that property is null/missing, so the test will fail: we expect parsing of these maps to fail, but they get accepted.

nt4f04uNd · 2024-10-19T20:16:11Z

test/logic/models/content_test.dart

+        ...validSong.copyWithout(propertiesThatCanBeMissing).subSets(),
+        ...validSong
+            .withWrongTypes()
+            .whereNot((map) => propertiesThatCanBeMissing.any((property) => map[property] == null))


What does it do?

Generally it's hard to understand what is the output of these function calls, how they work, and why we use them

> Generally it's hard to understand what is the output of these function calls, how they work, and why we use them

I agree.

The point of this is to automatically create all possible bad variations of inputs to the parse function.
There are two types of problems that this covers:

Missing data: The subSets function gives us variations of the validSong map (without the values that are allowed to be null) with one or more missing elements. We generate all possible subsets instead of just checking if it's ok if individual elements are missing because sometimes we fall back to another element in that case (like Song.dateModified falling back to Song.dateAdded).

Invalid data types: The withWrongTypes function gives us variations of the validSong map where one element of the map has a different type from the expected one. For example, the title property has the type String, so withWrongTypes will generate one variation where the title is null, one where the title is true, one where it's 0 and so on. We only generate variations where a single element is changed and don't bother with variations where multiple element types have changed, because there would be a large number of variations and because I don't believe it is very relevant.
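
To make the second point more concrete, here is a rough sketch of what a helper like withWrongTypes could look like. It is an illustration under assumptions rather than the extension from the test file: the `_samples` list and the extension name `WrongTypes` are made up here, and the real helper may pick its replacement values differently.

```dart
/// Replacement values of different runtime types (an assumption for this sketch).
const _samples = <Object?>[null, true, 0, 0.5, 'text'];

extension WrongTypes on Map<String, Object?> {
  /// Yields copies of this map in which exactly one value has been replaced
  /// by a sample whose runtime type differs from the original value's type.
  Iterable<Map<String, Object?>> withWrongTypes() sync* {
    for (final key in keys) {
      for (final sample in _samples) {
        // Skip samples of the same type, they would not be "wrong".
        if (sample.runtimeType == this[key].runtimeType) continue;
        yield {...this, key: sample};
      }
    }
  }
}

void main() {
  // Hypothetical, heavily shortened stand-in for validSong.
  final validSong = <String, Object?>{'id': 1, 'title': 'A'};
  for (final variation in validSong.withWrongTypes()) {
    print(variation);
  }
}
```

Every yielded map differs from the valid one in exactly one entry, which keeps the number of variations proportional to the number of properties.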

I agree that this is not very easy to parse from the code, but I'm not quite sure how to improve the situation other than adding comments and maybe changing the names of withWrongTypes and subSets. Do you have any recommendations?

I've added some helper extension methods to improve the clarity of the tests.

I think splitting the tests into the valid and invalid cases improves clarity a lot.

nt4f04uNd · 2024-10-19T20:22:58Z

test/logic/models/content_test.dart

+      ];
+      final propertiesThatCanBeMissing = ['albumArt', 'artistId', 'firstYear', 'lastYear', 'numberOfSongs'];
+      final invalidAlbums = [
+        ...validAlbum.copyWithout(propertiesThatCanBeMissing).subSets(),


What is the point of generating a model, then removing by hand all of its properties?

subSets generates all possible variations of the map where one or more elements are missing to generate a list of invalid maps that can't be parsed into a valid album. The copyWithout removes the elements of the map that are allowed to be missing and would therefore successfully parse.

The reason to not hard code the list of properties is that if we add a new property to Album it will be automatically included here.

> The copyWithout removes the elements of the map that are allowed to be missing and would therefore successfully parse.

But if it removes those fields from the map, subSets gets a map without those fields. How would it know it needs to generate some variations of those non-existent fields, am I missing something here?

> How would it know it needs to generate some variations of those non-existent fields, am I missing something here?

The idea is for it to not generate any variations with the removed fields. We want to generate invalid maps here, but since these fields are nullable, it's valid for them to be missing.
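
The interplay of copyWithout and subSets discussed in this thread could be sketched roughly as follows. This is a simplified illustration, not the helpers from the test file: the extension name `MapVariations` and the required keys 'id', 'album' and 'artist' are assumptions, and whether the real subSets also yields the empty map is not stated here.

```dart
extension MapVariations<K, V> on Map<K, V> {
  /// Returns a copy of this map without the given keys.
  Map<K, V> copyWithout(Iterable<K> removed) {
    final drop = removed.toSet();
    return {
      for (final entry in entries)
        if (!drop.contains(entry.key)) entry.key: entry.value,
    };
  }

  /// Yields every proper subset of this map, that is every copy with at
  /// least one entry missing (in this sketch including the empty map).
  Iterable<Map<K, V>> subSets() sync* {
    final keyList = keys.toList();
    final full = (1 << keyList.length) - 1;
    // Each bit of the mask decides whether the key at that index is kept.
    for (var mask = 0; mask < full; mask++) {
      yield {
        for (var i = 0; i < keyList.length; i++)
          if ((mask & (1 << i)) != 0) keyList[i]: this[keyList[i]] as V,
      };
    }
  }
}

void main() {
  // Hypothetical, shortened stand-in for validAlbum.
  final validAlbum = <String, Object?>{
    'id': 1,
    'album': 'A',
    'artist': 'B',
    'albumArt': null,
  };
  final required = validAlbum.copyWithout(['albumArt']);
  final invalidAlbums = required.subSets().toList();
  print(invalidAlbums.length); // 7
  print(invalidAlbums.any((map) => map.containsKey('albumArt'))); // false
}
```

Because the nullable keys are stripped first, every generated map is missing at least one required key, so each one is expected to fail parsing.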

Abestanis added the enhancement New feature or request label Jul 21, 2024

Abestanis requested a review from nt4f04uNd July 21, 2024 11:14

Abestanis marked this pull request as draft July 21, 2024 11:14

Abestanis force-pushed the feature/improve_content_scanning branch 2 times, most recently from 96b3628 to 3e31748 Compare July 21, 2024 11:39

Abestanis mentioned this pull request Jul 24, 2024

Catch errors on content scan #150

Draft

Abestanis force-pushed the feature/improve_content_scanning branch 2 times, most recently from cff81ae to 75de00c Compare October 14, 2024 16:21

Abestanis marked this pull request as ready for review October 15, 2024 10:40

Abestanis added 10 commits October 19, 2024 20:28

Allow data factories to fail parsing entries from the native media store

8ab1bad

Catch type errors when parsing mediastore content and report an error

c6126d5

Add FakeFirebaseApp and CrashlyticsObserver to the tests

23e156d

Allow to specify raw content in the FakeSweyerPluginPlatform

428a314

Add tests for mediastore content parsing

df4634f

Remove extensions that are now included in dart3

385b77c

Don't use the tuple library in tests

606a278

Fix formatting

9c7e75a

Adjust tests for properties that can be null now

3286984

Remove unused extension

2e33f71

Abestanis force-pushed the feature/improve_content_scanning branch from 4ae4508 to 2e33f71 Compare October 19, 2024 18:31

nt4f04uNd reviewed Oct 19, 2024

Abestanis added 2 commits October 19, 2024 23:06

Adjust the tests for the fact that Playlist.dateModified can be null

b2f9d6d

Make test extensions private

5aebbef

nt4f04uNd reviewed Oct 19, 2024

Abestanis added 4 commits October 19, 2024 23:18

Make some variables final

3a47403

Remove names of test extensions

13ee254

Improve content tests

7342904

Split the content tests into valid and invalid test cases

2259f93
