71 commits

Author SHA1 Message Date
Ed Page
f60e798a2a chore: Release 2021-07-27 15:31:01 -05:00
Ed Page
3b43272724 refactor(dict): Separate dictgen concerns 2021-07-01 11:00:33 -05:00
Ed Page
bbbf985777 perf(dict): Switch varcon to a burst-trie
This cuts varcon lookup times in half, but I suspect it is still slower
than phf.  Like bsearch and unlike phf, the cost is consistent between
hits and misses.

At least this doesn't have the compile hit of PHF + unicase.  Maybe I
should experiment with integrating a non-const-fn variant of unicase
with PHF and give up on all of this extra complexity.
2021-06-30 21:03:57 -05:00
Ed Page
908f9d44eb refactor(dict): Be more cache conscious 2021-06-30 19:56:03 -05:00
Ed Page
a1e95bc7c0 refactor(dict): Pull out table-lookup logic
Before, we only guaranteed that some dicts were pre-sorted.  Now, all
are guaranteed to be pre-sorted.

This also gives each dict the size-check to avoid lookup.

But this is really about refactoring in prep for playing with other
lookup options, like tries.
2021-06-30 10:12:17 -05:00
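The table lookup described in a1e95bc7c0 (a pre-sorted table plus a size check that avoids the lookup entirely) might look roughly like the sketch below. This is a minimal illustration, not the code from the commit: `SortedTable`, `EXAMPLE`, and the sample entries are hypothetical names and data chosen for the example.

```rust
use std::ops::RangeInclusive;

/// Hypothetical table type: entries are sorted by key at generation time,
/// and `len_range` records the shortest and longest key lengths.
pub struct SortedTable {
    entries: &'static [(&'static str, &'static str)],
    len_range: RangeInclusive<usize>,
}

impl SortedTable {
    pub fn find(&self, word: &str) -> Option<&'static str> {
        // Size check: a word shorter or longer than every key cannot match,
        // so the search is skipped entirely.
        if !self.len_range.contains(&word.len()) {
            return None;
        }
        // The binary search relies on `entries` being sorted by key.
        self.entries
            .binary_search_by(|(key, _)| (*key).cmp(word))
            .ok()
            .map(|idx| self.entries[idx].1)
    }
}

/// Illustrative data only; keys are listed in sorted order.
pub static EXAMPLE: SortedTable = SortedTable {
    entries: &[
        ("abandonned", "abandoned"),
        ("recieve", "receive"),
        ("teh", "the"),
    ],
    len_range: 3..=10,
};
```

Keeping the lookup behind one type like this is what would make it possible to swap the binary search for another structure, such as a trie, without touching callers.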
Ed Page
bfa7888f82 chore: Skip more releases 2021-06-29 15:39:28 -05:00
Ed Page
9149c4765d chore: Release 2021-06-29 15:05:18 -05:00
Ed Page
04f5d40e57 chore: Release 2021-06-05 14:39:37 -05:00
Ed Page
2b1f565eaa refactor(varcon): Remove reliance on const-fn
This dropped RSS (memory usage) from 4GB to 1.5GB when compiling.

The extra `match` could impact performance, but I'm not too concerned
since the default is to not look within vars.
2021-06-04 15:01:08 -05:00
Ed Page
b1cf03c7eb refactor(varcon): Move away from PHF
This is mostly to give implementation flexibility for changing out how
we store the data to reduce compilation memory usage.

This does have a performance impact, jumping from ~220ns to ~320ns for a
dict lookup, according to our micro benchmarks.
2021-06-04 14:59:46 -05:00
Ed Page
3e66a99674 chore: Release 2021-05-21 20:41:02 -05:00
Ed Page
3995745362 chore: Release 2021-05-21 20:39:12 -05:00
Ed Page
b99f32dea8 perf(dict): Bypass vars when possible
Variant support slows us down by 10-50%.  I assume most people will run
with `en`, so most of this overhead is wasted.  Instead of merging vars
with dict, let's get a quick win by just skipping vars when we don't
need them.  If the assumptions behind this change over time, or if there
is a need to speed up a specific locale, we can re-address this.

Before:
```
check_file/Typos/code   time:   [35.860 us 36.021 us 36.187 us]
                        thrpt:  [8.0117 MiB/s 8.0486 MiB/s 8.0846 MiB/s]
check_file/Typos/corpus time:   [26.966 ms 27.215 ms 27.521 ms]
                        thrpt:  [21.127 MiB/s 21.365 MiB/s 21.562 MiB/s]
```
After:
```
check_file/Typos/code   time:   [33.837 us 33.928 us 34.031 us]
                        thrpt:  [8.5191 MiB/s 8.5452 MiB/s 8.5680 MiB/s]
check_file/Typos/corpus time:   [17.521 ms 17.620 ms 17.730 ms]
                        thrpt:  [32.794 MiB/s 32.999 MiB/s 33.184 MiB/s]
```

This puts us in line with `--no-default-features --features dict`.

Fixes #253
2021-05-19 13:55:41 -05:00
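The idea behind b99f32dea8, skipping the variant tables unless a specific locale was requested, can be sketched as follows. This is a hedged illustration under assumed names: `Dict`, its fields, `Dict::example`, and the sample words are hypothetical and do not reflect the project's real types or data.

```rust
use std::collections::HashMap;

/// Hypothetical dictionary with a main typo table and a variant table.
pub struct Dict {
    words: HashMap<&'static str, &'static str>,
    vars: HashMap<&'static str, &'static str>,
    /// `None` stands for plain `en`, where every dialect is accepted.
    locale: Option<&'static str>,
}

impl Dict {
    /// Builds a tiny dictionary with made-up entries for illustration.
    pub fn example(locale: Option<&'static str>) -> Self {
        let words = HashMap::from([("teh", "the")]);
        let vars = HashMap::from([("color", "colour")]);
        Self { words, vars, locale }
    }

    pub fn correct(&self, word: &str) -> Option<&'static str> {
        // Only pay for the variant lookup when a dialect was requested.
        if self.locale.is_some() {
            if let Some(fix) = self.vars.get(word) {
                return Some(*fix);
            }
        }
        self.words.get(word).copied()
    }
}
```

With `Dict::example(None)`, `correct("color")` never touches `vars`, which is the fast path the commit message is after. With `Dict::example(Some("en-gb"))`, the variant table is consulted first.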
Ed Page
7c803681c4 chore: Release 2021-05-13 09:58:09 -05:00
Ed Page
3b9061dece
Merge pull request #240 from crate-ci/dependabot/cargo/codegenrs-1.0.0
chore(deps): Bump codegenrs from 0.1.5 to 1.0.0
2021-05-01 09:04:51 -05:00
dependabot[bot]
d72fa7acba
chore(deps): Bump codegenrs from 0.1.5 to 1.0.0
Bumps [codegenrs](https://github.com/crate-ci/codegenrs) from 0.1.5 to 1.0.0.
- [Release notes](https://github.com/crate-ci/codegenrs/releases)
- [Changelog](https://github.com/crate-ci/codegenrs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/crate-ci/codegenrs/compare/v0.1.5...v1.0.0)

Signed-off-by: dependabot[bot] <support@github.com>
2021-05-01 07:01:59 +00:00
Ed Page
6216fa0837 fix(dict)!: Clarify word sizes with Ranges
The code was generated with separate min / max, rather than using a
Range and ensuring the API is used correctly.
2021-04-30 21:33:33 -05:00
Ed Page
1f4c587692 chore({{crate_name}}): Release {{version}} 2021-04-14 19:13:25 -05:00
dependabot-preview[bot]
b8d3190ce9
chore(deps): bump itertools from 0.9.0 to 0.10.0
Bumps [itertools](https://github.com/bluss/rust-itertools) from 0.9.0 to 0.10.0.
- [Release notes](https://github.com/bluss/rust-itertools/releases)
- [Changelog](https://github.com/rust-itertools/itertools/blob/master/CHANGELOG.md)
- [Commits](https://github.com/bluss/rust-itertools/compare/v0.9.0...v0.10.0)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2021-01-03 03:40:45 +00:00
Ed Page
6bdbd821e3 perf(dict): Avoid hashing unknown words 2020-11-10 20:57:04 -06:00
Bypass hashing when we know (through str::len) that a word won't be in
the dict.

Master:
```
real    0m26.675s
user    0m33.683s
sys     0m4.535s
```

With this change
```
real    0m24.432s
user    0m32.492s
sys     0m4.190s
```
2020-11-10 20:57:04 -06:00
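The `str::len` shortcut from 6bdbd821e3 can be illustrated with a hash map that records the range of key lengths when it is built. This is only a sketch using `std::collections::HashMap` rather than whatever map the project used at the time; `LenGuardedMap` and its sample data are hypothetical.

```rust
use std::collections::HashMap;
use std::ops::RangeInclusive;

/// Hypothetical wrapper that remembers the shortest and longest key.
pub struct LenGuardedMap {
    map: HashMap<&'static str, &'static str>,
    len_range: RangeInclusive<usize>,
}

impl LenGuardedMap {
    pub fn new(pairs: &[(&'static str, &'static str)]) -> Self {
        // For an empty table, `1..=0` is an empty range, so every lookup
        // is rejected by the length check.
        let min = pairs.iter().map(|(k, _)| k.len()).min().unwrap_or(1);
        let max = pairs.iter().map(|(k, _)| k.len()).max().unwrap_or(0);
        Self {
            map: pairs.iter().copied().collect(),
            len_range: min..=max,
        }
    }

    pub fn get(&self, word: &str) -> Option<&'static str> {
        // `str::len` is a cheap field read; hashing walks the whole word.
        if !self.len_range.contains(&word.len()) {
            return None;
        }
        self.map.get(word).copied()
    }
}
```

For a table built from `[("teh", "the"), ("recieve", "receive")]`, any word shorter than 3 bytes or longer than 7 bytes is rejected without being hashed.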
Ed Page
ab4a5bbdaf feat: Support English dialects
The goal is to be as accepting and unobtrusive to new code bases as
possible.  To this end, we correct typos into the closest English
dialect.

If someone wants to opt in, they can have typos corrected to a specific
English dialect.

Fixes #52
Fixes #22
2020-08-20 19:37:37 -05:00