Change history
Richard Lehane edited this page Dec 22, 2015 · 6 revisions
- measure time elapsed with -log time
- bugfix: percent encode file URIs in droid output
- bugfix: long windows directory paths (further work on bug fixed in 1.4.2); reported by Ross Spencer
- bugfix: mscfb panic; reported by Ross Spencer
- bugfix: TIFF mis-identifications due to an early halt error
- new -throttle flag; requested by Ross Spencer
- errors logged to stderr by default (to quieten use -log ""); requested by Ross Spencer
- mscfb update: lazy reading
- webarchive update: decode Transfer-Encoding and Content-Encoding; requested by Dragan Espenschied
- bugfix: long windows paths; reported by Ross Spencer
- bugfix: 32-bit file size overflow; reported by Ross Spencer
- -log replaces -debug, -slow, -unknown and -known flags (see usage above)
- highlight empty file/stream with error and warning
- negative text match overrides extension-only plain text match
- new MIME matcher; requested by Dragan Espenschied
- support warc continuations
- add all.json and tiff.json sets
- minor speed-up
- report less redundant basis information
- report error on empty file/stream
- scan within warc and arc files with -z flag; requested by Dragan Espenschied
- quit scanning earlier on known unknowns
- don't include byte signatures where formats have container signatures (unless -doubleup flag is given); fixes a mis-identification reported by Ross Spencer
- sf -slow FILE | DIR reports slow signatures
- sf -debug output simplified
- sf -version describes signature file; requested by Michelle Lindlar
- roy -limit and -exclude now operate on text and default zip matches
- roy -nopriority re-configured to return more results
- bugfix: upgraded versions of sf panic when attempting to read old signature files; reported by Stefan
- bugfix: panic mmap'ing files over 1GB on Win32; reported by Duncan
- bugfix: reporting extensions for folders with "."; reported by Ross Spencer
- add -noext flag to roy to suppress extension matching; requested by Greg Lepore
- -known and -unknown flags for sf to output lists of recognised and unknown files respectively; requested by Greg Lepore
- support annotation of sets.json files; requested by Greg Lepore
- add warning when -extendc is used without -extend
- bugfix: report container extensions in details; reported by Ross Spencer
- text matcher (i.e. sf README will now report a 'Plain Text File' result)
- -notext flag to suppress text matcher (roy build -notext)
- all outputs now include file last modified time
- -hash flag with choice of md5, sha1, sha256, sha512, crc (e.g. sf -hash md5 FILE)
- -droid flag to mimic droid output (sf -droid FILE)
- bugfix: detect encoding of zip filenames; reported by Dragan Espenschied
- bugfix: mscfb; reported by Dragan Espenschied
- scan within archive formats (zip, tar, gzip) with -z flag
- format sets (e.g. roy build -exclude @pdfa)
- leaner, faster signature format
- support bitmask patterns
- mirror bof patterns as eof patterns where both roy -bof and -eof limits set
- bugfix: mscfb; reported by Pascal Aantz
- bugfix: race condition in scorer (affected tip golang)
- user documentation
- bugfixes (mscfb, match/wac and sf)
- QA using comparator
- json output
- server mode
- bugfix: single quote YAML output
- optimisations (mmap, multithread, etc.)
- csv output
- periodic priority checking to stop searches earlier
- range/distance/choices bugfix
- change to signature file format
- roy (r2d2 rename) signature customisation
- parse Droid signature (not just PRONOM reports)
- support extension signatures
- support multiple identifiers
- config package
- mscfb bugfixes
- license info in srcs (no change to license; this allows for attributing authorship for non-Richard contribs)
- default home change to "$HOME/siegfried" (no longer ".siegfried")
- container matching
- cross-compile was broken (because of use of os/user). Now doing native builds on the three platforms so the download binaries should all work now.
- bug in processing code caused really bad matching profile for MP3 sigs. No need to update the tool for this, but please do a sieg -update to get the latest signature file.
- sf command line: descriptive output in YAML, including basis for matches
- optimisations inc. initial BOF loop before main matching loop
- sf command line changes: -version and -update flags now enabled
- over-the-wire updates of signature files from www.itforarchivists.com/siegfried
- replaced ac matcher with wac matcher
- re-write of bytematcher code
- some benchmarks slower but fewer really poor edge cases (see cmd/sieg/testdata/bench_results.txt)... so a win!
- but still too slow!
- benchmarks (cmd/sieg/testdata)
- an Identifier type that controls the matching process and stops on best possible match (i.e. no longer require a full file scan for all files)
- name/extension matching
- a custom reader (pkg/core/siegreader)
- simplifications to the sieg command and signature file
- optimisations that have boosted performance (see cmd/sieg/testdata/bench_results.txt). But still too slow!
First release. Parses PRONOM signatures and performs byte matching. Bare bones CLI. Glacially slow!