This cuts varcon lookup times in half but I still suspect slower than
phf. Like with bsearch and unlike, the cost is consistent between hits
and misses.
At least this doesn't have the compile hit of PHF + unicase. Maybe I
should experiment with integrating a non-const-fn variant of unicase
with PHF and give up on all of this extra complexity.
Before, only some dicts did we guarentee were pre-sorted. Now, all are
for-sure pre-sorted.
This also gives each dict the size-check to avoid lookup.
But this is really about refactoring in prep for playing with other
lookup options, like tries.
This skips a lot of validation for being "good enough" (comment
open/closes matching, etc).
This has a chance of incorrectly matching in languages with `@` as an
operator, like Python, but Python encourages spaces arround operators,
so hopefully this won't be a problem.
For now, we hardcoded a min length of 90 bytes to ensure to avoid
ambiguity with math operations on variables (generally people use
whitespace anyways).
Fixes#287
We might be able to make this bail our earlier and not accidentally
detect the wrong thing by checking if the hex values are lowercase. RFC
4122 says that UUIDs must be generated lowecase, while input accepts
any case. The main issues are risk on the "input" part and the extra
annoyance of writing a custm `is_hex_digit` function.
This is prep for other items to be ignored
BREAKING CHANGE: `TokenizerBuilder` no longer takes config for ignoring
tokens. Related, we now ignore token-ignore config flags.
For no additional build dependencies (besides Python >= 3.6), and faster
install.
The setup.py here is a copy of the one at
https://github.com/shellcheck-py/shellcheck-py with changes kept at the
bare minimum for future diffability.
We don't support checksumming at least yet, because there's no access to
them from cargo-release at the moment.
Closes#285