typos

mirror of https://github.com/crate-ci/typos.git synced 2024-11-22 17:11:07 -05:00

Author	SHA1	Message	Date
Ed Page	5ae7bda8eb	style: Silence clippy	2022-05-16 09:09:17 -05:00
Ed Page	5c83dec07b	style: Remove unused variable	2021-12-14 15:41:52 -06:00
Ed Page	c8d1058a71	refactor(dict): Change typos-dict to trie This is +/- 15%, depending on the benchmark.	2021-07-01 10:41:56 -05:00
Ed Page	bbbf985777	perf(dict): Switch varcon to a burst-trie This cuts varcon lookup times in half but I still suspect slower than phf. Like with bsearch and unlike, the cost is consistent between hits and misses. At least this doesn't have the compile hit of PHF + unicase. Maybe I should experiment with integrating a non-const-fn variant of unicase with PHF and give up on all of this extra complexity.	2021-06-30 21:03:57 -05:00
Ed Page	a1e95bc7c0	refactor(dict): Pull out table-lookup logic Before, we only guaranteed that some dicts were pre-sorted. Now, all are for-sure pre-sorted. This also gives each dict the size-check to avoid lookup. But this is really about refactoring in prep for playing with other lookup options, like tries.	2021-06-30 10:12:17 -05:00
Ed Page	b1cf03c7eb	refactor(varcon): Move away from PHF This is mostly to give implementation flexibility for changing out how we store the data to reduce compilation memory usage. This does have performance impact, jumping from ~220ns to ~320ns for a dict lookup, according to our micro benchmarks.	2021-06-04 14:59:46 -05:00
Ed Page	b99f32dea8	perf(dict): Bypass vars when possible Variant support slows us down by 10-50%. I assume most people will run with `en` and so most of this overhead is wasted. So instead of merging vars with dict, let's instead get a quick win by just skipping vars when we don't need to. If the assumptions behind this change over time or if there is need for speeding up a specific locale, we can re-address this. Before: ``` check_file/Typos/code time: [35.860 us 36.021 us 36.187 us] thrpt: [8.0117 MiB/s 8.0486 MiB/s 8.0846 MiB/s] check_file/Typos/corpus time: [26.966 ms 27.215 ms 27.521 ms] thrpt: [21.127 MiB/s 21.365 MiB/s 21.562 MiB/s] ``` After: ``` check_file/Typos/code time: [33.837 us 33.928 us 34.031 us] thrpt: [8.5191 MiB/s 8.5452 MiB/s 8.5680 MiB/s] check_file/Typos/corpus time: [17.521 ms 17.620 ms 17.730 ms] thrpt: [32.794 MiB/s 32.999 MiB/s 33.184 MiB/s] ``` This puts us in line with `--no-default-features --features dict` Fixes #253	2021-05-19 13:55:41 -05:00
Ed Page	d65fa79d0e	refactor(dict): Make feature flag paths clearer	2021-05-18 19:45:11 -05:00
Ed Page	639e65b88a	fix(dict): Handle cases from Linux These were found while running `typos` on Linux and inspecting a sampling of the results. #249 represents additional changes to make. There were some identifiers, that looked like hardware registers, that I'm unsure of what can be done for them.	2021-05-18 12:02:03 -05:00
Ed Page	fb0dac4297	refactor(dict): Allow 0..n corrections in BuiltIn The main use case is taking `ther` -> `there` and adding `the` and `their`.	2021-05-18 12:02:03 -05:00
Ed Page	04e55e4e85	fix(dict): Correctly connect dict with varcon We had a bug where `finallizes` with EnGb would not correct to `finalises`	2021-05-17 21:23:12 -05:00
Ed Page	b830872ad0	chore: Update enumflags2	2021-05-13 10:20:15 -05:00
Ed Page	cec850890c	Merge pull request #238 from epage/range fix(dict)!: Clarify word sizes with Ranges	2021-05-01 08:54:08 -05:00
Ed Page	6216fa0837	fix(dict)!: Clarify word sizes with Ranges The code was generated with separate min / max, rather than using a Range and ensuring the API is used correctly.	2021-04-30 21:33:33 -05:00
Ed Page	2fc1f5468e	chore(cli): Allow building without expensive parts The obvious case is building for docs.rs but this can be helpful for special use cases or faster development iteration.	2021-04-30 21:31:25 -05:00
Ed Page	9bfb506c6d	fix(typos)!: Clarify `Case::Upper`s name `Scream` was referring to `SCREAMING_CASE` but outside of that context, I think `Upper` is more accurate.	2021-04-21 20:36:35 -05:00
Ed Page	b17f9c3a12	feat: Const some fns	2021-03-29 20:27:06 -05:00
Ed Page	ce16d38cfd	perf(dict): Skip checking numbers	2020-11-11 18:52:23 -06:00
Ed Page	482d320407	fix(dict): Ensure we fall through to built-in dict	2020-11-11 12:22:29 -06:00
Ed Page	6bdbd821e3	perf(dict): Avoid hashing unknown words Bypass hashing when we know (through str::len) that a word won't be in the dict. Master: ``` real 0m26.675s user 0m33.683s sys 0m4.535s ``` With this change ``` real 0m24.432s user 0m32.492s sys 0m4.190s ```	2020-11-10 20:57:04 -06:00
Ed Page	beaa0f4091	perf(dict): Avoid hashing unknown words Bypass hashing when we know (through str::len) that a word won't be in the dict. Master: ``` real 0m26.675s user 0m33.683s sys 0m4.535s ``` With this change: ``` real 0m24.060s user 0m31.559s sys 0m4.258s ```	2020-11-10 20:57:00 -06:00
Ed Page	18e31fa578	perf: Avoid hashing without custom dict `HashMap::get` (at least hashbrown) hashes before getting and doesn't check if dict is empty. For the custom dict, a common use case will have the dict be empty. Master: ``` real 0m26.675s user 0m33.683s sys 0m4.535s ``` Bypassing `HashMap::get` ``` real 0m16.415s user 0m14.519s sys 0m4.118s ``` On a moderately sized repo.	2020-11-10 20:56:54 -06:00
Ed Page	150c5bfdc1	perf: Hash faster for custom dicts If we have to hash for the custom dict, we might as well be fast about it. We do not need a cryptographically secure algorithm since the content is fixed for the user. Master: ``` real 0m26.675s user 0m33.683s sys 0m4.535s ``` With ahash: ``` real 0m23.993s user 0m30.800s sys 0m4.440s ```	2020-11-10 20:56:49 -06:00
Ed Page	527b9837b4	feat: Custom dictionary support Switching `valid-` to just `` where you map typo to correction, with support for always-valid and never-valid. Fixes #9	2020-10-27 21:15:25 -05:00
Ed Page	043692afe0	feat(dict): Override builtin dictionary Sometimes you just have to live with a typo or it's done intentionally (like weird company names). With this commit, a user can now identify blessed identifiers and words. This is mostly what is needed for #9 but sometimes people will have common typos that they'll want to provide corrections for.	2020-09-02 20:24:54 -05:00
Ed Page	ab4a5bbdaf	feat: Support English dialects The goal is to be as accepting and unobtrusive to new code bases as possible. To this end, we correct typos into the closest English dialect. If someone wants to opt-in, they can have typos correct to a specific English dialect. Fixes #52 Fixes #22	2020-08-20 19:37:37 -05:00
Ed Page	bc1302f01b	feat: Support multiple, valid corrections Some of the other spell checkers already do this. While I've not checked where we might need it for our dictionary, this will be important for dialects.	2020-07-04 20:52:48 -05:00
Ed Page	ce1ef2ca30	refactor!: Move dict implementation into CLI	2019-10-28 11:00:47 -06:00
Ed Page	164ee9cb84	refactor: Split bin/lib into separate crates	2019-08-08 10:04:51 -05:00
Ed Page	adcbe68621	refactor(dict): Split out a trait	2019-07-27 19:50:36 -06:00
Ed Page	953064e7d1	fix(dict): Fix should match typo's case Fixes #10	2019-06-26 07:22:59 -06:00
Ed Page	a5b8636bdb	refactor(dict): Allow for owned corrections	2019-06-24 21:46:40 -06:00
Ed Page	859769b835	refactor: Rename Symbol to Identifier This is more descriptive	2019-06-24 21:46:39 -06:00
Ed Page	3d1fb3b1ae	feat(parse): Process words composing symbols	2019-06-15 22:21:40 -06:00
Ed Page	905de9bd8d	chore(CI): Fighting clippy	2019-06-14 14:53:34 -06:00
Ed Page	f1e3163ba2	fix: Clippy	2019-06-14 07:04:58 -06:00
Ed Page	9ccfc9c27d	fix: Clippy	2019-06-14 06:51:22 -06:00
Ed Page	af66072272	feat(dict): Perform case-insensitive comparisons	2019-06-13 19:55:21 -06:00
Ed Page	b6aabc9392	refactor: Switch to bytes for symbol lookup	2019-04-16 18:15:12 -06:00
Ed Page	85ee5cfac9	fix(api): Split lib	2019-01-24 08:24:20 -07:00

40 commits
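
Several of the `perf(dict)` commits above describe the same idea: avoid paying for a lookup when a cheap check already rules the word out. The sketch below illustrates three of those checks together (skipping an empty custom dict, a `str::len` range check, and a search over a pre-sorted table with 0..n corrections). It is not the code from the typos repository; the names `BUILTIN`, `BUILTIN_LEN`, `lookup_builtin`, and `lookup`, and the sample entries, are assumptions made for illustration only.

```rust
use std::collections::HashMap;
use std::ops::RangeInclusive;

/// Hypothetical built-in table of (typo, corrections), kept sorted by typo
/// so it can be binary-searched. Each typo may map to several corrections.
const BUILTIN: &[(&str, &[&str])] = &[
    ("finallizes", &["finalizes"]),
    ("teh", &["the"]),
    ("ther", &["there", "the", "their"]),
];

/// Shortest and longest key (in bytes) in `BUILTIN`, written as a range.
/// In a generated table this would be computed by the code generator.
const BUILTIN_LEN: RangeInclusive<usize> = 3..=10;

/// Look a word up in the built-in table, skipping the search entirely
/// when the word's length means it cannot be a key.
pub fn lookup_builtin(word: &str) -> Option<&'static [&'static str]> {
    if !BUILTIN_LEN.contains(&word.len()) {
        return None;
    }
    BUILTIN
        .binary_search_by(|(typo, _)| (*typo).cmp(word))
        .ok()
        .map(|index| BUILTIN[index].1)
}

/// Check the user's custom dict first, then fall through to the built-in one.
/// The `is_empty` test avoids hashing the word when no custom dict is set.
pub fn lookup<'d>(
    custom: &'d HashMap<String, Vec<String>>,
    word: &str,
) -> Option<Vec<&'d str>> {
    if !custom.is_empty() {
        if let Some(corrections) = custom.get(word) {
            return Some(corrections.iter().map(String::as_str).collect());
        }
    }
    lookup_builtin(word).map(|corrections| corrections.to_vec())
}
```

With an empty custom dict, `lookup(&custom, "ther")` goes straight to the sorted table and yields all three corrections, while a one-letter word such as `"a"` is rejected by the length check before any comparison happens. Whether each shortcut pays off in practice depends on the workload, which is why the commit messages pair every change with timings.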