Commit Graph

7731 Commits

Author SHA1 Message Date
Roman Domrachev
505c6a2f2f Completely cleanup tokenizer cache
Tokenizer cache can have be different keys than string

That modification can slow down tokenizer and need to be measured
2017-11-15 17:55:48 +03:00
Roman Domrachev
3e21680814 Use safer method to get string without hit 2017-11-14 22:58:46 +03:00
Roman Domrachev
a33d5a068d Try to hold origin data instead of restore it 2017-11-14 22:40:03 +03:00
Roman Domrachev
91e2fa6561 Clean all caches 2017-11-14 21:15:04 +03:00
Roman Domrachev
4e378dc4a4 Remove all obsolete code and test only initial problem 2017-11-14 20:45:04 +03:00
Roman
47ce2347b0 Create test that fails when actual cleanup caused 2017-11-14 20:28:13 +03:00
Roman
caae77f72d Update strings.pyx 2017-11-14 19:44:40 +03:00
Roman Domrachev
3d247d2bb8 Get back previous testcase 2017-11-14 18:01:37 +03:00
Roman Domrachev
870defa815 Swap keys in proper place
Remove unnecessary clear of the hits
2017-11-14 17:56:30 +03:00
Roman Domrachev
86ca434c93 Merge github.com:explosion/spaCy 2017-11-14 17:46:22 +03:00
Roman Domrachev
a2745b0e84 StringStore now actually cleaned
Do not lose docs in ref tracking
2017-11-14 17:45:50 +03:00
Ines Montani
48b6cfe59e Merge pull request #1569 from KMLDS/patch-1
trivial typo in docs
2017-11-14 01:46:34 +01:00
KMLDS
d5b20ac3b6 Update span.jade 2017-11-13 19:27:20 -05:00
Ines Montani
ea6c85c67a Merge pull request #1566 from MathiasDesch/master (resolves #1248)
Add exceptions to tokenizer and norm
2017-11-13 19:05:22 +01:00
Matthew Honnibal
1b348389bb Merge branch 'master' of https://github.com/explosion/spaCy 2017-11-13 18:18:48 +01:00
Matthew Honnibal
ca73d0d8fe Cleanup states after beam parsing, explicitly 2017-11-13 18:18:26 +01:00
Matthew Honnibal
63ef9a2e73 Remove __dealloc__ from ParserBeam 2017-11-13 18:18:08 +01:00
Mathias Deschamps
d82f868e1c Ignore pycharm project files 2017-11-13 17:46:05 +01:00
Mathias Deschamps
c0691b2ab4 Add tokenizer exceptions for ing verbs
Extend list of tokenizing exceptions introduced in 123810b
2017-11-13 17:46:05 +01:00
Mathias Deschamps
288298ead9 Add norm exception for ing verbs
Some ing verbs are sometimes written in or in'. Make the NORM form correct
2017-11-13 17:46:05 +01:00
ines
0e5642593e Merge branch 'master' of https://github.com/explosion/spaCy 2017-11-13 17:00:07 +01:00
ines
bc79274706 Fix typo 2017-11-13 17:00:03 +01:00
Ines Montani
339675c9fb Merge pull request #1565 from DuyguA/patch-2
added contributor agreement for DuyguA
2017-11-13 16:21:50 +01:00
Ines Montani
6ef702f79f Merge pull request #1563 from abhi18av/patch-2
improved upon the list of included stop_words
2017-11-13 16:14:28 +01:00
Duygu Altinok
c263c3acce added contributor agreement for DuyguA 2017-11-13 15:45:13 +01:00
Abhinav Sharma
4dd34058a2 Create abhi18av.md 2017-11-13 17:23:05 +05:30
Abhinav Sharma
59f5740ede improved upon the list of included stop_words 2017-11-13 17:13:49 +05:30
ines
7a7b01feb1 Update links 2017-11-13 08:30:06 +01:00
ines
b3e502a076 Add videos section to resources 2017-11-13 08:29:57 +01:00
ines
f2b6b98b75 Fix typo in code example (resolves #1556) 2017-11-13 08:29:16 +01:00
Matthew Honnibal
f0e28e8ae5 Make fasttext reader accommodate whitespace 2017-11-12 12:07:13 +01:00
Ines Montani
94d8b711a3 Update CONTRIBUTING.md 2017-11-12 12:06:59 +01:00
Matthew Honnibal
6e641f46d4 Create a preprocess function that gets bigrams 2017-11-12 00:43:41 +01:00
Matthew Honnibal
86d37301c9 Merge pull request #1552 from ligser/master
Try to add ability to clean up StringStore in pipe
2017-11-11 18:39:48 +01:00
Matthew Honnibal
c9251d79e3 Edit comment 2017-11-11 18:38:32 +01:00
Matthew Honnibal
dd1678eab3 Edit comment 2017-11-11 18:37:08 +01:00
ines
ceb2c596f1 Update conda details 2017-11-11 13:07:00 +01:00
Roman Domrachev
378280039b Fill contributer agreement 2017-11-11 11:39:31 +03:00
Roman Domrachev
ee60a52ee7 Fix test imports and last batch cleanup 2017-11-11 11:32:16 +03:00
Roman Domrachev
4a6b094e09 Remove unused import 2017-11-11 03:13:05 +03:00
Roman Domrachev
3c600adf23 Try to fix StringStore clean up (see #1506) 2017-11-11 03:11:27 +03:00
ines
ee97fd3cb4 Add regression test for #1547 2017-11-11 00:14:03 +01:00
ines
2df27db671 Add unicode declaration 2017-11-11 00:13:56 +01:00
ines
f36fab39b0 Don't rename component in intent parser example (resolves #1551)
Otherwise, the default saved model won't know that it's supposed to create spaCy's 'parser'.
2017-11-10 23:35:38 +01:00
ines
35653bef3a Add missing import (fixes #1546) 2017-11-10 19:05:18 +01:00
ines
4a97def06a Update features 2017-11-10 19:05:10 +01:00
ines
dea5636d6c Fix broken links 2017-11-10 13:06:38 +01:00
Ines Montani
5a5d46e0ab Merge pull request #1542 from Wahib/patch-1
Fix typo. Add missing '='.
2017-11-10 12:53:27 +01:00
Wahib Faizi
0da56f8ef8 Fix typo. Add missing '='. 2017-11-10 14:51:24 +03:00
Ines Montani
1a23a0f87e Remove broken link (resolves #1541) 2017-11-10 12:28:39 +01:00