-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for StringIO #295
base: master
Are you sure you want to change the base?
Support for StringIO #295
Conversation
Reviewer's Guide by SourceryThis PR adds support for StringIO to handle string data alongside BytesIO in the DALiuGE engine. The implementation includes type-safe handling of pydata fields, preservation of data types when copying between drops, and UTF-8 string handling capabilities. The changes span across multiple components including data drops, IO handling, and data parsing. Sequence diagram for data reading and writing with StringIOsequenceDiagram
participant User
participant DataDROP
participant DataIO
participant StringIO
User->>DataDROP: Request to read data
DataDROP->>DataIO: getIO()
DataIO->>StringIO: open(OpenMode.OPEN_READ)
StringIO-->>DataIO: Return data
DataIO-->>DataDROP: Return data
DataDROP-->>User: Return data
User->>DataDROP: Request to write data
DataDROP->>DataIO: getIO()
DataIO->>StringIO: open(OpenMode.OPEN_WRITE)
StringIO-->>DataIO: Write data
DataIO-->>DataDROP: Confirm write
DataDROP-->>User: Confirm write
Class diagram for updated IO handlingclassDiagram
class DataIO {
+open(mode: OpenMode)
+read(count: int)
+write(data)
+close()
}
class MemoryIO {
+_open()
+_write(data)
+exists()
+delete()
}
class FileIO {
+_read(count: int)
+_write(data)
}
DataIO <|-- MemoryIO
DataIO <|-- FileIO
class DropParser {
<<enumeration>>
+PATH
+DATAURL
+UTF8
}
class DropLoader {
+load_utf8(drop: DataDROP)
}
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @awicenec - I've reviewed your changes - here's some feedback:
Overall Comments:
- Please add unit tests covering the new StringIO functionality and type checking behavior between MemoryDrops to prevent potential regressions.
- The new type checking behavior between MemoryDrops needs to be documented to prevent unexpected runtime failures. Please add documentation explaining this functionality to users.
Here's what I looked at during the review
- 🟡 General issues: 2 issues found
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
daliuge-engine/dlg/apps/pyfunc.py
Outdated
raise Exception( | ||
"Can't write data of this type: ", type(r).__name__ | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (code-quality): Raise a specific error instead of the general Exception
or BaseException
(raise-specific-error
)
Explanation
If a piece of code raises a specific exception type rather than the generic [`BaseException`](https://docs.python.org/3/library/exceptions.html#BaseException) or [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception), the calling code can:- get more information about what type of error it is
- define specific exception handling for it
This way, callers of the code can handle the error appropriately.
How can you solve this?
- Use one of the built-in exceptions of the standard library.
- Define your own error class that subclasses
Exception
.
So instead of having code raising Exception
or BaseException
like
if incorrect_input(value):
raise Exception("The input is incorrect")
you can have code raising a specific error like
if incorrect_input(value):
raise ValueError("The input is incorrect")
or
class IncorrectInputError(Exception):
pass
if incorrect_input(value):
raise IncorrectInputError("The input is incorrect")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @awicenec, I've added some suggestions for the Exceptions. Everything looks good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@awicenec the code is fine as it is if you'd like to get these changes merged. I think we can update the exception handling at a later date.
This commit has established per-commit encoding, whilst also updating/clarifying/polishing some existing PyFunc and Translator code.
Both unit tests pass. Updated setup.py to use the LIU-392 branch of eagle_test_graphs. This will need to be reverted when tests pass through the GitHub workflows.
Also added quickstart for README.rst to help people get started ASAP, based on user feedback.
Sourcery updates. Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
This commit has established per-commit encoding, whilst also updating/clarifying/polishing some existing PyFunc and Translator code.
JIRA Ticket
liu-418
Type
Problem/Issue
Provide ability to read and write plain strings if requested by the user.
Solution
The basic solution is to use StringIO rather than BytesIO if requested by the user, but the IO implementation of DALiuGE is quite complex and requires a number of changes to make this happen. As a side-effect the functionality to provide data for a memory drop through the pydata field is now sort of type-safe and obeying the type specified in the pull-down menu. Moreover, when copying from one drop to another the original type of the data is preserved. As a side effect, when copying from one MemoryDrop to another the types of those drops need to match. This is a bit of a hidden feature right now and can lead to unexpected graph execution failures, since the way to specify the type of the target MemoryDrop is by using the pydata type pull-down menu, but leaving the pydata value empty. We might want to think abut a more explicit way of specifying the type of a MemoryDrop in the future.
Checklist
No specific documentation has been added.
Summary by Sourcery
Add support for StringIO to enable reading and writing of plain strings, enhancing type safety and data type preservation during data operations.
New Features:
Enhancements: