author: Seth Morton <seth.m.morton@gmail.com> 2022-01-29 17:20:14 -0800
committer: Seth Morton <seth.m.morton@gmail.com> 2022-01-29 17:31:37 -0800
commit: 9aad50ddc877cf97b9141604cd198b3b06d88e63
tree: 0437ec097b1cc1fa8494bd6c6a0609af7b7388fb /CHANGELOG.md
parent: 961d3bbd28d134280ebff30552ae209ee4f26b5e
Add some limiting heuristics to the PATH suffix splitting
The prior algorithm went as follows: Obtain ALL suffixes from the base component of the filename. Then, starting from the back, keep the suffixes split until a suffix is encountered that begins with a match for the regular expression /\.\d/ (a dot followed by a digit). Such a suffix was assumed to be the fractional part of a floating point number, and not an extension, and thus the splitting would stop at that point.

Some input has been seen where the filenames are composed nearly entirely of Word.then.dot.and.then.dot. One entry amongst them contained Word.then.dot.5.then.dot. Because of the ".5", this one entry was split differently from the rest of the entries, and the sorting order was not as expected.

The new algorithm is as follows: Obtain a maximum of two suffixes. Keep these suffixes until one of them has a length greater than 4 or begins with a match for the regular expression /\.\d/.

This heuristic is of course not bullet-proof, but it will do a better job on most real-world filenames than the previous algorithm.
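The difference between the two algorithms can be sketched as follows. This is only an illustration of the description above, not the code changed by this commit: the function names `split_old` and `split_new` and the helper `_split` are hypothetical, the suffix length is assumed to include the leading dot, and the two candidate suffixes are assumed to be examined from the back, as in the prior algorithm.

```python
import re
from pathlib import PurePath

# Matches a suffix that starts with a dot followed by a digit, e.g. ".5".
DECIMAL_SUFFIX = re.compile(r"\.\d")


def _split(filename, candidates, stop):
    """Walk the candidate suffixes from the back, keeping each one
    until `stop` says that a suffix should not be split off."""
    kept = []
    for suffix in reversed(candidates):
        if stop(suffix):
            break
        kept.insert(0, suffix)
    base = filename[: len(filename) - sum(len(s) for s in kept)]
    return base, kept


def split_old(filename):
    # Prior behaviour: consider ALL suffixes, stop only at ".<digit>".
    suffixes = PurePath(filename).suffixes
    return _split(filename, suffixes,
                  lambda s: DECIMAL_SUFFIX.match(s) is not None)


def split_new(filename, max_suffixes=2, max_length=4):
    # New behaviour: consider at most two suffixes, and also stop at
    # a suffix longer than `max_length` (assumed to count the dot).
    suffixes = PurePath(filename).suffixes[-max_suffixes:]
    return _split(
        filename, suffixes,
        lambda s: len(s) > max_length or DECIMAL_SUFFIX.match(s) is not None,
    )


# The old rule splits the two similar names very differently ...
print(split_old("Word.then.dot.and.then.dot"))
print(split_old("Word.then.dot.5.then.dot"))
# ... while the new rule splits off the same suffixes from both.
print(split_new("Word.then.dot.and.then.dot"))
print(split_new("Word.then.dot.5.then.dot"))
```

Under these assumptions the old rule reduces the first name to the base "Word" but stops at ".5" in the second, whereas the new rule splits off only ".dot" from both, so the two entries are treated alike.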
Diffstat (limited to 'CHANGELOG.md')
0 files changed, 0 insertions, 0 deletions