1289 commits

Author SHA1 Message Date
Ed Page
15210c928c
Merge pull request #66 from epage/digits
perf: Use standard identifier rules to avoid doing number checks
2019-11-02 19:55:34 -06:00
Ed Page
19321d9e48 style: Fix benchmark names 2019-11-02 19:40:07 -06:00
Ed Page
107308a655 perf: Use standard identifier rules to avoid doing number checks 2019-11-02 19:40:06 -06:00
Ed Page
ed00f3ceae docs: Fix typo 2019-11-02 08:57:07 -06:00
Ed Page
c05ab4f9dc
Merge pull request #65 from epage/digits
fix: Ignore numbers as identifiers
2019-11-01 20:13:21 -06:00
Ed Page
68cd36d0de perf: Only do hex check if digits are in identifiers 2019-11-01 16:29:35 -06:00
Ed Page
a00831c847 fix: Ignore numbers as identifiers 2019-11-01 16:29:35 -06:00
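
The two commits in pull request #65 describe skipping number-like identifiers and running the hex check only when an identifier contains digits. A minimal sketch of that idea follows. The function names, the `0x` prefix rule, and the handling of `_` separators are assumptions made for illustration and are not taken from the project's code.

```rust
/// Hypothetical helper (not the project's API): returns true when an
/// identifier looks like a number and so should not be spell checked.
fn looks_like_number(ident: &str) -> bool {
    // Cheap pre-check: an identifier without any ASCII digit cannot be a
    // number under the rules below, so the costlier checks are skipped.
    if !ident.bytes().any(|b| b.is_ascii_digit()) {
        return false;
    }
    is_decimal(ident) || is_hex(ident)
}

/// Assumed rule: only ASCII digits and `_` separators.
fn is_decimal(ident: &str) -> bool {
    !ident.is_empty() && ident.bytes().all(|b| b.is_ascii_digit() || b == b'_')
}

/// Assumed rule: a `0x` prefix followed by at least one hex digit.
fn is_hex(ident: &str) -> bool {
    match ident.strip_prefix("0x") {
        Some(digits) => {
            !digits.is_empty()
                && digits.bytes().all(|b| b.is_ascii_hexdigit() || b == b'_')
        }
        None => false,
    }
}
```

Under these assumed rules, `0xDEADBEEF` and `12345` are skipped, while `deadbeef` and `utf8` are still treated as ordinary identifiers.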
Ed Page
cc4b53a1b4
Merge pull request #64 from epage/debug
feat: Dump files, identifiers, and words
2019-10-31 11:40:43 -06:00
Ed Page
ce365ae12e feat: Dump files, identifiers, and words
This will help people debug their configurations.

Fixes #41
2019-10-31 10:44:23 -06:00
Ed Page
a48a457cc3 fix: Improve the organization of --help 2019-10-30 11:02:02 -06:00
Ed Page
975dab8514 chore(benches): Fix compilation errors 2019-10-30 07:20:52 -06:00
Ed Page
b4c4bdd1c5 fix!: Use a simple error trait in lib 2019-10-29 13:37:48 -06:00
Ed Page
06db6fc693 refactor!: Move off of failure 2019-10-29 11:36:50 -06:00
Ed Page
559718cb22 chore: Remove unused dependency 2019-10-29 11:21:10 -06:00
Ed Page
0867a6a6eb docs: Pre-built binaries!
Fixes #18
2019-10-29 09:42:26 -06:00
Ed Page
ea93373b37 chore(typos-cli): Release 0.1.3 2019-10-29 09:09:53 -06:00
Ed Page
c258d5cd6c chore(CI): Fix creating pre-built binaries 2019-10-29 09:09:18 -06:00
Ed Page
403cba8c78 perf: v0.1.2 benchmark results 2019-10-29 09:05:27 -06:00
Ed Page
cca956d3c9 tests: Fix ru_small benchmarks 2019-10-29 09:00:58 -06:00
Ed Page
0ff060f5f2 tests: Remove ru benchmark; too slow 2019-10-29 08:55:41 -06:00
Ed Page
ac90322c37 chore(typos-cli): Release 0.1.2 2019-10-29 08:36:00 -06:00
Ed Page
b10b7756f6 chore(typos): Release 0.2.0 2019-10-29 08:33:35 -06:00
Ed Page
0208dfadba chore(typos-dict): Release 0.2.0 2019-10-29 08:31:56 -06:00
Ed Page
2684b9b228
Merge pull request #63 from epage/dict
Prepare for dict cleanup
2019-10-29 08:19:17 -06:00
Ed Page
2e95e5e1f6 style(misspell): Collapse case 2019-10-29 08:07:27 -06:00
Ed Page
cec3ad07f1 style(misspell): Make contract explicit 2019-10-29 08:06:17 -06:00
Ed Page
ed004e7df9 chore(clippy): Ignore lints about code-genned code 2019-10-29 08:06:17 -06:00
Ed Page
dc327e0d51 style: Address clippy 2019-10-28 16:27:18 -06:00
Ed Page
8a8007d353 fix(dict): Hold off on publishing non-typos dicts
This postpones worrying about the names or anything else, for now.  The
main reason they exist is for help in building up `typos` dict.
2019-10-28 16:07:23 -06:00
Ed Page
0f06e602cb feat: Expose wikipedia's dict to Rust 2019-10-28 13:41:00 -06:00
Ed Page
3daafd1ea7 feat: Expose client9/misspell's dict to Rust 2019-10-28 13:41:00 -06:00
Ed Page
ce1ef2ca30 refactor!: Move dict implementation into CLI 2019-10-28 11:00:47 -06:00
Ed Page
5de368ac9d refactor(codegen): Hard code data 2019-10-28 11:00:47 -06:00
Ed Page
1cbdb3a77a feat: Expose codespell's dict to Rust 2019-10-28 11:00:47 -06:00
Ed Page
8f428b8fec refactor(dict): Prepare for more dicts 2019-10-28 11:00:47 -06:00
Ed Page
03fa6f8b8a
Merge pull request #62 from epage/error
Refactor to prepare for errors
2019-10-28 07:03:39 -06:00
Ed Page
4049a1e625 refactor(bench): Make it easier to change benchsuite 2019-10-26 20:32:35 -06:00
Ed Page
86b22d1f49 fix(dict)!: Make dictionary usable across threads 2019-10-26 20:32:16 -06:00
Ed Page
0a2f865d0f refactor: Change error strategy for future thread use 2019-10-26 20:31:10 -06:00
Ed Page
158f83b29c docs: Update install instructions for crate split 2019-10-26 09:04:25 -06:00
Ed Page
a1a8ba2268
Merge pull request #61 from epage/overhead
Look into processing overhead
2019-10-25 16:00:26 -06:00
Ed Page
a60ab52c56 test: Add benchmarks for real-world processing 2019-10-25 15:48:39 -06:00
Ed Page
09513fdc13 refactor: Update naming 2019-10-25 15:17:24 -06:00
Ed Page
3bbd9b1c72 refactor: Update naming 2019-10-25 15:14:03 -06:00
Ed Page
c20e8f6880 perf: Speed up detection of text files
We reduce how much of the buffer we walk twice which should speed up
large files.  We still load the entire file into memory which will still
hurt binary files.

This is part of #34.
2019-10-25 15:05:36 -06:00
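
The commit message above says less of the buffer is walked twice while the whole file is still loaded into memory. One way that could look is sketched below. The prefix size, the NUL-byte heuristic, and the function name are assumptions for illustration, not the project's implementation.

```rust
/// Assumed prefix size for the binary sniff, chosen for illustration only.
const SNIFF_LEN: usize = 1024;

/// Hypothetical helper: decide whether a fully loaded buffer looks like text.
fn looks_like_text(buffer: &[u8]) -> bool {
    // Only the leading prefix is scanned for NUL bytes, so at most
    // SNIFF_LEN bytes are visited twice.
    let prefix = &buffer[..buffer.len().min(SNIFF_LEN)];
    if prefix.contains(&0) {
        return false;
    }
    // UTF-8 validation walks the entire buffer once.
    std::str::from_utf8(buffer).is_ok()
}
```

In this sketch a large binary file is still read in full before it is rejected, which matches the caveat in the commit message.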
Ed Page
af49b6af86
Merge pull request #60 from epage/lazy
perf: Speed up identifier splitting
2019-10-25 15:01:42 -06:00
Ed Page
ff8fce5fb6 fix: Don't check words if ident gets a hit 2019-10-25 14:58:08 -06:00
Ed Page
a8629916b4 refactor: Delegate to rust for IO 2019-10-25 14:54:05 -06:00
Ed Page
979b42ed6f perf: Speed up identifier splitting
Before
```
test process_code         ... bench:      25,627 ns/iter (+/- 2,062)
test process_corpus       ... bench:  20,192,253 ns/iter (+/- 603,029)
test process_empty        ... bench:       7,418 ns/iter (+/- 707)
test process_no_tokens    ... bench:       8,788 ns/iter (+/- 1,065)
test process_sherlock     ... bench:      30,420 ns/iter (+/- 2,699)
test process_single_token ... bench:       9,426 ns/iter (+/- 811)
test symbol_split_lowercase_long  ... bench:       2,763 ns/iter (+/- 246)
test symbol_split_lowercase_short ... bench:         110 ns/iter (+/- 12)
test symbol_split_mixed_long      ... bench:       7,373 ns/iter (+/- 1,111)
test symbol_split_mixed_short     ... bench:         357 ns/iter (+/- 86)
```

After
```
test process_code         ... bench:      20,973 ns/iter (+/- 1,717)
test process_corpus       ... bench:  15,826,059 ns/iter (+/- 1,016,628)
test process_empty        ... bench:       7,364 ns/iter (+/- 616)
test process_no_tokens    ... bench:       8,858 ns/iter (+/- 632)
test process_sherlock     ... bench:      24,707 ns/iter (+/- 3,482)
test process_single_token ... bench:       9,339 ns/iter (+/- 706)
test symbol_split_lowercase_long  ... bench:       2,727 ns/iter (+/- 151)
test symbol_split_lowercase_short ... bench:          46 ns/iter (+/- 2)
test symbol_split_mixed_long      ... bench:       5,753 ns/iter (+/- 441)
test symbol_split_mixed_short     ... bench:          76 ns/iter (+/- 3)
```

Fixes #33
2019-10-25 14:47:58 -06:00
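
To compare the before and after figures in the commit message above, the relative change per benchmark can be computed as `(before - after) / before`. The sketch below does that for the numbers as listed. The helper names are made up for this example, and the variance columns are ignored.

```rust
/// Percentage reduction in ns/iter going from `before_ns` to `after_ns`.
/// A negative result means the benchmark got slower.
fn reduction_percent(before_ns: u64, after_ns: u64) -> f64 {
    (before_ns as f64 - after_ns as f64) / before_ns as f64 * 100.0
}

/// (benchmark, before ns/iter, after ns/iter), copied from the commit message.
const RESULTS: &[(&str, u64, u64)] = &[
    ("process_code", 25_627, 20_973),
    ("process_corpus", 20_192_253, 15_826_059),
    ("process_empty", 7_418, 7_364),
    ("process_no_tokens", 8_788, 8_858),
    ("process_sherlock", 30_420, 24_707),
    ("process_single_token", 9_426, 9_339),
    ("symbol_split_lowercase_long", 2_763, 2_727),
    ("symbol_split_lowercase_short", 110, 46),
    ("symbol_split_mixed_long", 7_373, 5_753),
    ("symbol_split_mixed_short", 357, 76),
];

/// One formatted line per benchmark, for example to print in a report.
fn report() -> Vec<String> {
    RESULTS
        .iter()
        .map(|(name, before, after)| {
            format!("{}: {:.1}% reduction", name, reduction_percent(*before, *after))
        })
        .collect()
}
```

By this measure `process_code` drops by roughly 18% and `symbol_split_mixed_short` by roughly 79%. `process_no_tokens` comes out slightly negative, a difference smaller than its reported variance.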
Ed Page
2ae1a0bca6 docs: Explain rg in benchmarks 2019-10-25 10:59:20 -06:00