Commit graph

31 commits

Author SHA1 Message Date
Ed Page
fb0dac4297 refactor(dict): Allow 0..n corrections in BuiltIn
The main use case is taking `ther` -> `there` and adding `the` and
`their`.
2021-05-18 12:02:03 -05:00
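As a rough illustration of the `ther` case above, a built-in lookup that returns zero or more corrections might look like the sketch below. The function name `corrections` and the `match`-based table are hypothetical and are not taken from the crate.

```rust
/// Hypothetical lookup: maps a typo to zero or more suggested corrections.
/// An empty slice means no suggestion is available for the word.
pub fn corrections(typo: &str) -> &'static [&'static str] {
    match typo {
        // One typo can have several plausible fixes.
        "ther" => &["there", "the", "their"],
        "teh" => &["the"],
        // Anything not listed has no suggestion.
        _ => &[],
    }
}
```

Returning a slice lets callers handle the zero, one, and many cases with the same code path.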
Ed Page
04e55e4e85 fix(dict): Correctly connect dict with varcon
We had a bug where `finallizes` with EnGb would not correct to
`finalises`
2021-05-17 21:23:12 -05:00
Ed Page
b830872ad0 chore: Update enumflags2 2021-05-13 10:20:15 -05:00
Ed Page
cec850890c Merge pull request #238 from epage/range
fix(dict)!: Clarify word sizes with Ranges
2021-05-01 08:54:08 -05:00
Ed Page
6216fa0837 fix(dict)!: Clarify word sizes with Ranges
The code was generated with separate min / max, rather than using a
Range and ensuring the API is used correctly.
2021-04-30 21:33:33 -05:00
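A minimal sketch of the idea in the commit above: a single range replaces separate min / max values, so the bounds cannot be mixed up at the call site. The names `WORD_RANGE` and `could_be_in_dict` and the bounds `2..=24` are assumptions for illustration, not the generated code.

```rust
use std::ops::RangeInclusive;

/// Hypothetical bounds: the shortest and longest word length in a dictionary.
/// A generator would derive these from the word list.
pub const WORD_RANGE: RangeInclusive<usize> = 2..=24;

/// Returns true when `word` has a byte length that a dictionary entry could have.
pub fn could_be_in_dict(word: &str) -> bool {
    WORD_RANGE.contains(&word.len())
}
```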
Ed Page
2fc1f5468e chore(cli): Allow building without expensive parts
The obvious case is building for docs.rs but this can be helpful for
special use cases or faster development iteration.
2021-04-30 21:31:25 -05:00
Ed Page
9bfb506c6d fix(typos)!: Clarify Case::Uppers name
`Scream` was referring to `SCREAMING_CASE` but outside of that context, I
think `Upper` is more accurate.
2021-04-21 20:36:35 -05:00
Ed Page
b17f9c3a12 feat: Const some fns 2021-03-29 20:27:06 -05:00
Ed Page
ce16d38cfd perf(dict): Skip checking numbers 2020-11-11 18:52:23 -06:00
Ed Page
482d320407 fix(dict): Ensure we fall through to built-in dict 2020-11-11 12:22:29 -06:00
Ed Page
6bdbd821e3 perf(dict): Avoid hashing unknown words
Bypass hashing when we know (through str::len) that a word won't be in
the dict.

Master:
```
real    0m26.675s
user    0m33.683s
sys     0m4.535s
```

With this change
```
real    0m24.432s
user    0m32.492s
sys     0m4.190s
```
2020-11-10 20:57:04 -06:00
Ed Page
beaa0f4091 perf(dict): Avoid hashing unknown words
Bypass hashing when we know (through str::len) that a word won't be in
the dict.

Master:
```
real    0m26.675s
user    0m33.683s
sys     0m4.535s
```

With this change:
```
real    0m24.060s
user    0m31.559s
sys     0m4.258s
```
2020-11-10 20:57:00 -06:00
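The length check described above can be sketched as follows. This is an assumption-laden illustration: it uses a plain `std::collections::HashMap` rather than whatever map the crate actually generates, and the type `LengthGatedDict` and its methods are hypothetical.

```rust
use std::collections::HashMap;
use std::ops::RangeInclusive;

/// Hypothetical dictionary that remembers the byte lengths of its keys.
pub struct LengthGatedDict {
    words: HashMap<&'static str, &'static str>,
    lengths: RangeInclusive<usize>,
}

impl LengthGatedDict {
    pub fn new(entries: &[(&'static str, &'static str)]) -> Self {
        // Record the shortest and longest key so lookups can be rejected early.
        let min = entries.iter().map(|(typo, _)| typo.len()).min().unwrap_or(0);
        let max = entries.iter().map(|(typo, _)| typo.len()).max().unwrap_or(0);
        Self {
            words: entries.iter().copied().collect(),
            lengths: min..=max,
        }
    }

    pub fn correct(&self, word: &str) -> Option<&'static str> {
        // `str::len` is O(1); skip hashing when no key has this length.
        if !self.lengths.contains(&word.len()) {
            return None;
        }
        self.words.get(word).copied()
    }
}
```

Words whose length falls inside the range still pay for one hash, so the saving depends on how many checked words are shorter or longer than every dictionary entry.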
Ed Page
18e31fa578 perf: Avoid hashing without custom dict
`HashMap::get` (at least hashbrown) hashes before getting and doesn't
check if dict is empty.  For the custom dict, a common use case will
have the dict be empty.

Master:
```
real    0m26.675s
user    0m33.683s
sys     0m4.535s
```

Bypassing `HashMap::get`
```
real    0m16.415s
user    0m14.519s
sys     0m4.118s
```

On a moderately sized repo.
2020-11-10 20:56:54 -06:00
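A minimal sketch of bypassing `HashMap::get` for an empty custom dict, assuming the custom dict is a `HashMap<String, String>`. The function name `lookup_custom` is hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical lookup into a user-provided dictionary.
pub fn lookup_custom<'a>(custom: &'a HashMap<String, String>, word: &str) -> Option<&'a str> {
    // Checking emptiness first avoids hashing `word` when there is nothing to find.
    if custom.is_empty() {
        return None;
    }
    custom.get(word).map(String::as_str)
}
```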
Ed Page
150c5bfdc1 perf: Hash faster for custom dicts
If we have to hash for the custom dict, we might as well be fast about
it.  We do not need a cryptographically secure algorithm since the
content is fixed for the user.

Master:
```
real    0m26.675s
user    0m33.683s
sys     0m4.535s
```

With ahash:
```
real    0m23.993s
user    0m30.800s
sys     0m4.440s
```
2020-11-10 20:56:49 -06:00
Ed Page
527b9837b4 feat: Custom dictionary support
Switching `valid-*` to just `*` where you map typo to correction, with
support for always-valid and never-valid.

Fixes #9
2020-10-27 21:15:25 -05:00
Ed Page
043692afe0 feat(dict): Override builtin dictionary
Sometimes you just have to live with a typo or it's done intentionally
(like weird company names).  With this commit, a user can now identify
blessed identifiers and words.

This is mostly what is needed for #9 but sometimes people will have
common typos that they'll want to provide corrections for.
2020-09-02 20:24:54 -05:00
Ed Page
ab4a5bbdaf feat: Support English dialects
The goal is to be as accepting and unobtrusive to new code bases as
possible.  To this end, we correct typos into the closest English
dialect.

If someone wants to opt-in, they can have typos correct to a specific
English dialect.

Fixes #52
Fixes #22
2020-08-20 19:37:37 -05:00
Ed Page
bc1302f01b feat: Support multiple, valid corrections
Some of the other spell checkers already do this. While I've not checked
where we might need it for our dictionary, this will be important for
dialects.
2020-07-04 20:52:48 -05:00
Ed Page
ce1ef2ca30 refactor!: Move dict implementation into CLI 2019-10-28 11:00:47 -06:00
Ed Page
164ee9cb84 refactor: Split bin/lib into separate crates 2019-08-08 10:04:51 -05:00
Ed Page
adcbe68621 refactor(dict): Split out a trait 2019-07-27 19:50:36 -06:00
Ed Page
953064e7d1 fix(dict): Fix should match typo's case
Fixes #10
2019-06-26 07:22:59 -06:00
Ed Page
a5b8636bdb refactor(dict): Allow for owned corrections 2019-06-24 21:46:40 -06:00
Ed Page
859769b835 refactor: Rename Symbol to Identifier
This is more descriptive
2019-06-24 21:46:39 -06:00
Ed Page
3d1fb3b1ae feat(parse): Process words composing symbols 2019-06-15 22:21:40 -06:00
Ed Page
905de9bd8d chore(CI): Fighting clippy 2019-06-14 14:53:34 -06:00
Ed Page
f1e3163ba2 fix: Clippy 2019-06-14 07:04:58 -06:00
Ed Page
9ccfc9c27d fix: Clippy 2019-06-14 06:51:22 -06:00
Ed Page
af66072272 feat(dict): Perform case-insensitive comparisons 2019-06-13 19:55:21 -06:00
Ed Page
b6aabc9392 refactor: Switch to bytes for symbol lookup 2019-04-16 18:15:12 -06:00
Ed Page
85ee5cfac9 fix(api): Split lib 2019-01-24 08:24:20 -07:00