typos

mirror of https://github.com/crate-ci/typos.git synced 2024-11-23 09:30:57 -05:00

Author	SHA1	Message	Date
Ed Page	9149c4765d	chore: Release	2021-06-29 15:05:18 -05:00
Ed Page	c83f655109	feat(parser): Ignore URLs Fixes #288	2021-06-29 14:14:58 -05:00
Ed Page	b673b81146	fix(parser): Ensure we get full base64 We greedily matched separators, including ones that might be part of base64. This impacts the length calculation, so we want as much as possible.	2021-06-29 13:55:46 -05:00
Ed Page	6915d85c0b	feat(parser): Ignore emails This skips a lot of validation for being "good enough" (comment open/closes matching, etc). This has a chance of incorrectly matching in languages with `@` as an operator, like Python, but Python encourages spaces around operators, so hopefully this won't be a problem.	2021-06-29 13:42:27 -05:00
Ed Page	2a1e6ca0f6	feat(parser): Ignore base64 For now, we hardcoded a min length of 90 bytes to avoid ambiguity with math operations on variables (generally people use whitespace anyways). Fixes #287	2021-06-29 13:25:10 -05:00
Ed Page	23b6ad5796	feat(parser): Ignore SHA-1+ Fixes #270	2021-06-29 12:20:08 -05:00
Ed Page	8566b31f7b	fix(parser): Go ahead and do lower UUIDs I need this for hash support anyways	2021-06-29 12:13:21 -05:00
Ed Page	85082cdbb1	feat(parser): Ignore UUIDs We might be able to make this bail out earlier and not accidentally detect the wrong thing by checking if the hex values are lowercase. RFC 4122 says that UUIDs must be generated lowercase, while input accepts any case. The main issues are risk on the "input" part and the extra annoyance of writing a custom `is_hex_digit` function.	2021-06-29 12:11:50 -05:00
Ed Page	32f5e6c682	refactor(typos)!: Bake ignores into parser This is prep for other items to be ignored BREAKING CHANGE: `TokenizerBuilder` no longer takes config for ignoring tokens. Related, we now ignore token-ignore config flags.	2021-06-29 11:41:25 -05:00
Ed Page	ded90f2387	perf(parser): Auto-detect unicode For smaller, ascii-only content, this seems to be taking ~30% less time for parsing.	2021-06-29 05:28:17 -05:00
Ed Page	95417f3a41	refactor(parser): Consolidate utf8/ascii logic	2021-06-29 05:10:02 -05:00
Ed Page	83b2804623	fix(ci): Don't fail codegen checks	2021-06-28 14:06:47 -05:00
Ed Page	4066d21790	style: Address clippy	2021-06-28 13:51:06 -05:00
Ed Page	3a4d039c4f	chore: Reduce code-gen memory usage More `const fn` removals to reduce compilation memory use	2021-06-07 08:58:34 -05:00
Ed Page	04f5d40e57	chore: Release	2021-06-05 14:39:37 -05:00
Ed Page	2b1f565eaa	refactor(varcon): Remove reliance on const-fn This dropped RSS (memory usage) from 4GB to 1.5GB when compiling. The extra `match` could impact performance but not too concerned since the default is to not look within vars.	2021-06-04 15:01:08 -05:00
Ed Page	b1cf03c7eb	refactor(varcon): Move away from PHF This is mostly to give implementation flexibility for changing out how we store the data to reduce compilation memory usage. This does have performance impact, jumping from ~220ns to ~320ns for a dict lookup, according to our micro benchmarks.	2021-06-04 14:59:46 -05:00
Ed Page	1cb9b37120	chore: Update codespell dict Based on 2ed354c at https://github.com/codespell-project/codespell	2021-05-22 21:44:56 -05:00
Ed Page	3e66a99674	chore: Release	2021-05-21 20:41:02 -05:00
Ed Page	3995745362	chore: Release	2021-05-21 20:39:12 -05:00
Ed Page	b99f32dea8	perf(dict): Bypass vars when possible Variant support slows us down by 10-50%. I assume most people will run with `en` and so most of this overhead is to waste. So instead of merging vars with dict, let's instead get a quick win by just skipping vars when we don't need to. If the assumptions behind this change over time or if there is need for speeding up a specific locale, we can re-address this. Before: ``` check_file/Typos/code time: [35.860 us 36.021 us 36.187 us] thrpt: [8.0117 MiB/s 8.0486 MiB/s 8.0846 MiB/s] check_file/Typos/corpus time: [26.966 ms 27.215 ms 27.521 ms] thrpt: [21.127 MiB/s 21.365 MiB/s 21.562 MiB/s] ``` After: ``` check_file/Typos/code time: [33.837 us 33.928 us 34.031 us] thrpt: [8.5191 MiB/s 8.5452 MiB/s 8.5680 MiB/s] check_file/Typos/corpus time: [17.521 ms 17.620 ms 17.730 ms] thrpt: [32.794 MiB/s 32.999 MiB/s 33.184 MiB/s] ``` This puts us in line with `--no-default-features --features dict` Fixes #253	2021-05-19 13:55:41 -05:00
Ed Page	639e65b88a	fix(dict): Handle cases from Linux These were found while running `typos` on Linux and inspecting a sampling of the results. #249 represents additional changes to make. There were some identifiers, that looked like hardware registers, that I'm unsure of what can be done for them.	2021-05-18 12:02:03 -05:00
Ed Page	fb0dac4297	refactor(dict): Allow 0..n corrections in BuiltIn The main use case is taking `ther` -> `there` and adding `the` and `their`.	2021-05-18 12:02:03 -05:00
Ed Page	77cfccb392	refactor(varcon): Clarify check's meanings	2021-05-15 19:29:27 -05:00
Ed Page	b830872ad0	chore: Update enumflags2	2021-05-13 10:20:15 -05:00
Ed Page	7c803681c4	chore: Release	2021-05-13 09:58:09 -05:00
Ed Page	3b9061dece	Merge pull request #240 from crate-ci/dependabot/cargo/codegenrs-1.0.0 chore(deps): Bump codegenrs from 0.1.5 to 1.0.0	2021-05-01 09:04:51 -05:00
dependabot[bot]	d72fa7acba	chore(deps): Bump codegenrs from 0.1.5 to 1.0.0 Bumps [codegenrs](https://github.com/crate-ci/codegenrs) from 0.1.5 to 1.0.0. - [Release notes](https://github.com/crate-ci/codegenrs/releases) - [Changelog](https://github.com/crate-ci/codegenrs/blob/master/CHANGELOG.md) - [Commits](https://github.com/crate-ci/codegenrs/compare/v0.1.5...v1.0.0) Signed-off-by: dependabot[bot] <support@github.com>	2021-05-01 07:01:59 +00:00
Ed Page	6216fa0837	fix(dict)!: Clarify word sizes with Ranges The code was generated with separate min / max, rather than using a Range and ensuring the API is used correctly.	2021-04-30 21:33:33 -05:00
Ed Page	f40ed5a328	style: Address clippy	2021-04-30 11:37:16 -05:00
Ed Page	517da7ecd2	perf(parser): Allow people to bypass unicode cost	2021-04-29 21:07:59 -05:00
Ed Page	09d2124d0f	perf(parser): Limit inner-loop asserts	2021-04-29 18:31:05 -05:00
Ed Page	287c4cbfe9	refactor(parser): Give more impl flexibility	2021-04-29 18:31:05 -05:00
Ed Page	9cbc7410a4	fix(parser)!: Defer to Unicode XID for identifiers This saves us from having to have configuration for every detail. If people need more control, we can offer it later. Fixes #225	2021-04-29 18:30:57 -05:00
Ed Page	f15cc58f71	fix(parser): Flip leading digits to work correctly	2021-04-29 18:30:14 -05:00
Ed Page	4b94352b7a	perf(parser): Try hand-rolled number parsing	2021-04-29 18:30:14 -05:00
Ed Page	6b92e345cc	perf(parser): Speed up UTF-8 validation	2021-04-27 21:17:46 -05:00
Ed Page	819702c82f	refactor(parser): Unify str/bytes code paths The main goal is to support replacing the parser with `nom` where I need access to `str` only functionality. With crates like simdutf8, this might also offer up performance gains since they see the biggest benefit when doing large blocks of validation.	2021-04-27 21:17:43 -05:00
Ed Page	fce11d6c35	refactor(parser)!: Allow short-circuiting word splitting This is prep for experiments with getting this information ahead of time. See #224	2021-04-27 21:17:38 -05:00
Ed Page	9bfb506c6d	fix(typos)!: Clarify `Case::Upper`s name `Scream` was referring to `SCREAMING_CASE` but outside of that context, I think `Upper` is more accurate.	2021-04-21 20:36:35 -05:00
Ed Page	1f4c587692	chore({{crate_name}}): Release {{version}}	2021-04-14 19:13:25 -05:00
Ed Page	b4459bef33	chore: Fix readme paths in Cargo.toml	2021-04-13 21:36:47 -05:00
Ed Page	d7978658d4	test(cli): Ensure we apply corrections	2021-04-10 19:13:48 -05:00
Ed Page	b5f606f201	refactor(typos): Simplify the top-level API	2021-03-01 11:50:23 -06:00
Ed Page	1010d2ffe5	refactor(tokenizer): Remove stale function	2021-03-01 11:50:23 -06:00
dependabot-preview[bot]	b8d3190ce9	chore(deps): bump itertools from 0.9.0 to 0.10.0 Bumps [itertools](https://github.com/bluss/rust-itertools) from 0.9.0 to 0.10.0. - [Release notes](https://github.com/bluss/rust-itertools/releases) - [Changelog](https://github.com/rust-itertools/itertools/blob/master/CHANGELOG.md) - [Commits](https://github.com/bluss/rust-itertools/compare/v0.9.0...v0.10.0) Signed-off-by: dependabot-preview[bot] <support@dependabot.com>	2021-01-03 03:40:45 +00:00
Ed Page	67222e9338	style: Address clippy	2021-01-02 13:49:28 -06:00
Ed Page	692f0ac095	refactor(typos): Focus API on primary use case	2021-01-02 13:10:40 -06:00
Ed Page	aba85df435	docs(typos): Clarify intent	2021-01-02 13:10:40 -06:00
Ed Page	48112a47e9	refactor(parser): Abstract over lifetimes	2021-01-02 13:10:30 -06:00

87 commits