Commit graph

865 commits

Author SHA1 Message Date
Ed Page
2e95e5e1f6 style(misspell): Collapse case 2019-10-29 08:07:27 -06:00
Ed Page
cec3ad07f1 style(misspell): Make contract explicit 2019-10-29 08:06:17 -06:00
Ed Page
ed004e7df9 chore(clippy): Ignore lints about code-genned code 2019-10-29 08:06:17 -06:00
Ed Page
dc327e0d51 style: Address clippy 2019-10-28 16:27:18 -06:00
Ed Page
8a8007d353 fix(dict): Hold off on publishing non-typos dicts
This postpones worrying about the names or anything else for now.  They
mainly exist to help build up the `typos` dict.
2019-10-28 16:07:23 -06:00
Ed Page
0f06e602cb feat: Expose wikipedia's dict to Rust 2019-10-28 13:41:00 -06:00
Ed Page
3daafd1ea7 feat: Expose client9/misspell's dict to Rust 2019-10-28 13:41:00 -06:00
Ed Page
ce1ef2ca30 refactor!: Move dict implementation into CLI 2019-10-28 11:00:47 -06:00
Ed Page
5de368ac9d refactor(codegen): Hard code data 2019-10-28 11:00:47 -06:00
Ed Page
1cbdb3a77a feat: Expose codespell's dict to Rust 2019-10-28 11:00:47 -06:00
Ed Page
8f428b8fec refactor(dict): Prepare for more dicts 2019-10-28 11:00:47 -06:00
Ed Page
03fa6f8b8a Merge pull request #62 from epage/error
Refactor to prepare for errors
2019-10-28 07:03:39 -06:00
Ed Page
4049a1e625 refactor(bench): Make it easier to change benchsuite 2019-10-26 20:32:35 -06:00
Ed Page
86b22d1f49 fix(dict)!: Make dictionary usable across threads 2019-10-26 20:32:16 -06:00
Ed Page
0a2f865d0f refactor: Change error strategy for future thread use 2019-10-26 20:31:10 -06:00
Ed Page
158f83b29c docs: Update install instructions for crate split 2019-10-26 09:04:25 -06:00
Ed Page
a1a8ba2268 Merge pull request #61 from epage/overhead
Look into processing overhead
2019-10-25 16:00:26 -06:00
Ed Page
a60ab52c56 test: Add benchmarks for real-world processing 2019-10-25 15:48:39 -06:00
Ed Page
09513fdc13 refactor: Update naming 2019-10-25 15:17:24 -06:00
Ed Page
3bbd9b1c72 refactor: Update naming 2019-10-25 15:14:03 -06:00
Ed Page
c20e8f6880 perf: Speed up detection of text files
We reduce how much of the buffer we walk twice, which should speed up
large files.  We still load the entire file into memory, which will
continue to hurt binary files.

This is part of #34.
2019-10-25 15:05:36 -06:00
Ed Page
af49b6af86 Merge pull request #60 from epage/lazy
perf: Speed up identifier splitting
2019-10-25 15:01:42 -06:00
Ed Page
ff8fce5fb6 fix: Don't check words if ident gets a hit 2019-10-25 14:58:08 -06:00
Ed Page
a8629916b4 refactor: Delegate to rust for IO 2019-10-25 14:54:05 -06:00
Ed Page
979b42ed6f perf: Speed up identifier splitting
Before
```
test process_code         ... bench:      25,627 ns/iter (+/- 2,062)
test process_corpus       ... bench:  20,192,253 ns/iter (+/- 603,029)
test process_empty        ... bench:       7,418 ns/iter (+/- 707)
test process_no_tokens    ... bench:       8,788 ns/iter (+/- 1,065)
test process_sherlock     ... bench:      30,420 ns/iter (+/- 2,699)
test process_single_token ... bench:       9,426 ns/iter (+/- 811)
test symbol_split_lowercase_long  ... bench:       2,763 ns/iter (+/- 246)
test symbol_split_lowercase_short ... bench:         110 ns/iter (+/- 12)
test symbol_split_mixed_long      ... bench:       7,373 ns/iter (+/- 1,111)
test symbol_split_mixed_short     ... bench:         357 ns/iter (+/- 86)
```

After
```
test process_code         ... bench:      20,973 ns/iter (+/- 1,717)
test process_corpus       ... bench:  15,826,059 ns/iter (+/- 1,016,628)
test process_empty        ... bench:       7,364 ns/iter (+/- 616)
test process_no_tokens    ... bench:       8,858 ns/iter (+/- 632)
test process_sherlock     ... bench:      24,707 ns/iter (+/- 3,482)
test process_single_token ... bench:       9,339 ns/iter (+/- 706)
test symbol_split_lowercase_long  ... bench:       2,727 ns/iter (+/- 151)
test symbol_split_lowercase_short ... bench:          46 ns/iter (+/- 2)
test symbol_split_mixed_long      ... bench:       5,753 ns/iter (+/- 441)
test symbol_split_mixed_short     ... bench:          76 ns/iter (+/- 3)
```
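
The relative change between the two runs above can be worked out with a few lines of Rust. This is only an illustrative sketch: the helper name `percent_faster` is hypothetical and not part of the repository, and the numbers are copied from the "Before" and "After" tables. Changes smaller than the reported `+/-` spread (for example `process_empty`) should be read as noise rather than as a real difference.

```rust
/// Percentage reduction in time per iteration going from `before_ns` to `after_ns`.
/// (Hypothetical helper for illustration; not taken from the typos code base.)
fn percent_faster(before_ns: f64, after_ns: f64) -> f64 {
    (before_ns - after_ns) / before_ns * 100.0
}

fn main() {
    // (benchmark name, before ns/iter, after ns/iter), copied from the tables above.
    let results = [
        ("process_code", 25_627.0, 20_973.0),
        ("process_corpus", 20_192_253.0, 15_826_059.0),
        ("process_sherlock", 30_420.0, 24_707.0),
        ("symbol_split_lowercase_short", 110.0, 46.0),
        ("symbol_split_mixed_long", 7_373.0, 5_753.0),
        ("symbol_split_mixed_short", 357.0, 76.0),
    ];

    for (name, before, after) in results {
        // A positive value means less time per iteration after the change.
        println!("{:<30} {:>6.1}%", name, percent_faster(before, after));
    }
}
```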

Fixes #33
2019-10-25 14:47:58 -06:00
Ed Page
2ae1a0bca6 docs: Explain rg in benchmarks 2019-10-25 10:59:20 -06:00
Ed Page
70e9b5d191 Merge pull request #59 from epage/update
chore: Update dependencies
2019-10-25 10:38:30 -06:00
Ed Page
00ac204492 chore: Configure commit lints 2019-10-25 10:28:28 -06:00
Ed Page
12e6cadbe7 chore: Update dependencies 2019-10-25 10:25:59 -06:00
Ed Page
2ae125536d chore(typos-codegen): Release 1.0.1 2019-10-25 10:12:07 -06:00
Ed Page
04775f6679 chore(typos-cli): Release 0.1.1 2019-10-25 10:10:55 -06:00
Ed Page
bdbb826478 chore(typos-dict): Release 0.1.1 2019-10-25 10:06:02 -06:00
Ed Page
de77cdcb5b chore(typos): Release 0.1.1 2019-10-25 10:00:54 -06:00
Ed Page
38b51bec05 docs: Fix links 2019-10-25 09:58:24 -06:00
Ed Page
f163b8a56c chore: Fix readme links 2019-10-25 09:51:37 -06:00
Ed Page
7d94e4952b chore: Add release config 2019-10-25 09:48:58 -06:00
Ed Page
ed89557e96 chore: Update dependencies 2019-10-25 09:47:02 -06:00
Ed Page
0396c5942a chore: Remove dead feature 2019-10-25 09:47:02 -06:00
Ed Page
52926d8cd1 docs: Fix source precedence 2019-10-25 09:39:55 -06:00
Ed Page
5863158a31 docs: Add reference 2019-10-25 08:17:58 -06:00
Ed Page
851336b931 docs: Fix links 2019-10-25 08:06:05 -06:00
Ed Page
9005b93dd3 docs: Link to new CI 2019-10-25 07:49:41 -06:00
Ed Page
a859cee1eb chore(CI): Fix code-gen verification 2019-10-25 07:47:10 -06:00
Ed Page
1c56aa6883 chore(CI): Verify code-gen 2019-10-25 07:41:36 -06:00
Ed Page
2d2cbe166a chore(CI): Fix indentation 2019-10-25 07:37:41 -06:00
Ed Page
8811ebf75a chore(CI): Switch to AzDO 2019-10-25 07:32:34 -06:00
Ed Page
f5af748146 docs: Add link to benchmarks 2019-10-25 06:15:40 -06:00
Ed Page
8ef836a51f Merge pull request #51 from epage/bench
perf: Benchmark 0.1
2019-10-25 06:14:00 -06:00
Ed Page
ca78fed347 perf: Benchmark 0.1 2019-10-24 21:04:43 -06:00
Ed Page
a3fabbd855 perf: Create end-to-end benchmark suite
Fixtures were taken from ripgrep.  The framework was rewritten to be
more composable (rather than a single Python script that had both
generic fixtures and selection of units-under-test).

One of the goals was to completely generate a report that would include
all relevant information for reproducing the results or adding nuance
for when results change.

Having problems with subtitles_en, so it's not fully included at the moment.
2019-10-24 21:04:43 -06:00