NPI-3653 - Fix errors in parsing IGS log updates #68
I've run this on a whole directory of log files using the |
…g. add: Unit-tests to test parsing and gathering of files
result_v1 = _REGEX_LOG_VERSION_1.search(data)
if result_v1:
    return "v1.0"
For efficiency and robustness, I'd suggest trimming the binary string to the first line, before running the regex over it.
- result_v1 = _REGEX_LOG_VERSION_1.search(data)
- if result_v1:
-     return "v1.0"
+ # Remove leading newline if present (can show up when reading from a file), truncate to first line
+ first_line_bytes = data.lstrip(b"\n").split(b"\n")[0]
+ result_v1 = _REGEX_LOG_VERSION_1.search(first_line_bytes)
+ if result_v1:
+     return "v1.0"
if result_v1:
    return "v1.0"

- result_v2 = _REGEX_VERSION_2.search(data)
+ result_v2 = _REGEX_LOG_VERSION_2.search(data)
- result_v2 = _REGEX_LOG_VERSION_2.search(data)
+ result_v2 = _REGEX_LOG_VERSION_2.search(first_line_bytes)
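For reference, the two suggestions above could combine into something like the sketch below. The two patterns are copied from this PR's diff; the function name `detect_log_version`, the "v2.0" return value, and the `ValueError` on no match are assumptions for illustration and may not match the actual code.

```python
import re

# Patterns as they appear in this PR's diff.
# Note: the "." in the v2 pattern is unescaped, so it matches any character.
_REGEX_LOG_VERSION_1 = re.compile(rb"""(site log\))""")
_REGEX_LOG_VERSION_2 = re.compile(rb"""(site log v2.0)""")


def detect_log_version(data: bytes) -> str:
    """Hypothetical helper: return the log version found on the first line.

    Assumes the version marker always sits on the first non-empty line.
    """
    # Remove leading newlines (can show up when reading from a file),
    # then keep only the first line so the regexes scan a short string.
    first_line_bytes = data.lstrip(b"\n").split(b"\n")[0]

    if _REGEX_LOG_VERSION_1.search(first_line_bytes):
        return "v1.0"
    if _REGEX_LOG_VERSION_2.search(first_line_bytes):
        return "v2.0"  # assumed return value, mirroring the v1 branch

    # Assumed failure mode; the real code may handle this differently.
    raise ValueError("Could not determine site log version from first line")
```

Restricting the search to the first line also means a stray "site log)" further down the file can no longer be mistaken for the header.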
_REGEX_LOG_VERSION_1 = _re.compile(rb"""(site log\))""")
_REGEX_LOG_VERSION_2 = _re.compile(rb"""(site log v2.0)""")

_REGEX_ID_V1 = _re.compile(
    rb"""
    (?:Four\sCharacter\sID|Site\sID)\s+\:\s*(\w{4}).*\W+
This named capture group might read more clearly with underscores rather than escaped spaces(?)
Fixing various type hints and adding helpful comments Co-authored-by: Nathan <95725385+treefern@users.noreply.github.com>
The previous PR on this branch had a couple of errors that meant the write_meta_gather_master() function (and the gather_metadata() function) were not working correctly.
Ultimately, the arguments were being fed in the wrong order to the new extract_... functions, and therefore an "incorrect version" flag was being shown (because a station code was being fed in instead).
This PR fixes my initial oversight - arguments are explicitly passed in