Matthew Honnibal
4b2e5e59ed
Add flush_cache method to tokenizer, to fix #1061
...
The tokenizer caches output for common chunks, for efficiency. This
cache is be invalidated when the tokenizer rules change, e.g. when a new
special-case rule is introduced. That's what was causing #1061 .
When the cache is flushed, we free the intermediate token chunks.
I *think* this is safe --- but if we start getting segfaults, this patch
is to blame. The resolution would be to simply not free those bits of
memory. They'll be freed when the tokenizer exits anyway.
2017-07-22 15:06:50 +02:00
Matthew Honnibal
d9b85675d7
Rename regression test
2017-07-22 14:14:35 +02:00
Matthew Honnibal
dfbc7e49de
Add test for Issue #1207
2017-07-22 14:14:01 +02:00
Matthew Honnibal
0ae3807d7d
Fix gaps in Lexeme API. Closes #1031
2017-07-22 13:53:48 +02:00
Matthew Honnibal
24c4c51f13
Try to make test999 less flakey
2017-04-26 18:42:06 +02:00
Matthew Honnibal
c4be9c36fe
Fix unicode header in tests
2017-04-24 10:09:01 +02:00
Matthew Honnibal
65f10b53e5
Fix test
2017-04-24 00:25:55 +02:00
Matthew Honnibal
70a43858e1
Fix flakey test
2017-04-24 00:06:30 +02:00
Matthew Honnibal
3973af2d15
Make training test less flakey
2017-04-23 22:59:34 +02:00
Matthew Honnibal
874a3cbb07
Add test for Issue #955
2017-04-23 17:57:01 +02:00
Matthew Honnibal
5d8af40445
Add test for Issue #999
2017-04-23 17:06:30 +02:00
Matthew Honnibal
040751ad17
Remove xfail on Test #910
2017-04-23 16:28:55 +02:00
Matthew Honnibal
1dca7eeb03
Add unicode declaration on new regression test
2017-04-07 18:09:23 +02:00
ines
887827fc6a
Merge branch 'develop'
2017-04-07 17:36:23 +02:00
ines
bf0f15e762
Add / to tokenizer infixes ( resolves #891 )
2017-04-07 17:30:44 +02:00
ines
00b9011a49
Fix whitespace
2017-04-07 17:29:59 +02:00
Matthew Honnibal
cc36c308f4
Fix noun_chunk rules around coordination
...
Closes #693 .
2017-04-07 17:06:40 +02:00
Matthew Honnibal
83dca920d4
Rename test #913 -> #957 , comment
...
Make test for #957 reference correct bug. Add comment.
Previous commit closes #957 .
2017-04-07 15:54:25 +02:00
Matthew Honnibal
5887383fc0
Add test for Issue #913 : Hang from bad regex
2017-04-07 15:47:27 +02:00
Matthew Honnibal
cfff4e0f61
Improve test
2017-03-31 13:59:32 +02:00
Matthew Honnibal
e854f28304
Add test for Issue #758
...
Issue #758 occurs when no actions are available for a single token
doc after merging.
2017-03-31 13:26:25 +02:00
Matthew Honnibal
b94286de30
Fix regression test
2017-03-25 22:35:07 +01:00
Matthew Honnibal
4f400fa486
Prevent lemmatization of base nouns
...
Update lemmatizer's base-form check, for change in morphology class.
Closes #903 .
2017-03-25 21:51:12 +01:00
Matthew Honnibal
4454c1b23f
Block lemmatization of base-form adjectives
...
Fixes check that an adjective is a base form (as opposed to a
comparative or superlative), so that it's not lemmatized.
e.g. inner -!> inn. Closes #912 .
2017-03-25 21:29:57 +01:00
Matthew Honnibal
f40fbc3710
Add test for Issue #910 : Resuming entity training
2017-03-23 23:38:57 +01:00
ines
fe0ff00fe1
Fix spacing
2017-03-19 11:55:37 +01:00
ines
5712da6095
Add regression test for #891
2017-03-19 11:48:01 +01:00
ines
aefb898e37
Add title-case version of morph rules ( resolves #686 )
2017-03-18 17:27:11 +01:00
ines
d0b85faf69
Pass regression test for #401 ( resolves #401 )
...
Fixed in new English models.
2017-03-18 17:06:49 +01:00
Matthew Honnibal
de0e6385b4
Merge branch 'master' of https://github.com/explosion/spaCy
2017-03-18 16:17:28 +01:00
Matthew Honnibal
fe442cac53
Fix #717 : Set correct lemma for contracted verbs
2017-03-18 16:16:10 +01:00
ines
ad934a9abd
Add regression test for #693
2017-03-18 16:12:30 +01:00
ines
f57c616830
Add regression test for #704 and test new model ( resolves #704 )
...
(using new English model)
2017-03-18 16:04:14 +01:00
Matthew Honnibal
413138de79
Fix #719 : Lemmatizer can no longer output empty string
2017-03-18 16:02:06 +01:00
Matthew Honnibal
db51abf685
Fix tests
2017-03-16 18:53:47 -05:00
Matthew Honnibal
fea9fe08af
Merge pull request #866 from juanmirocks/master
...
Fix lemmatization of OOV words
2017-03-16 23:37:36 +01:00
ines
42ba740dde
Revert "Merge branch 'debug'"
...
This reverts commit 89b79d1178
, reversing
changes made to 02bdf490a1
.
2017-03-13 20:11:52 +01:00
ines
4c5f51e49e
Update regression test
2017-03-13 15:16:11 +01:00
ines
02bdf490a1
Remove regression test to see if it caused pytest Travis error
2017-03-13 13:00:22 +01:00
ines
17018750ac
Add regression test for #717
2017-03-13 12:58:22 +01:00
ines
2883ebfca2
Remove print statement
2017-03-13 12:30:42 +01:00
ines
98c13d8aa9
Add regression test for #401
2017-03-13 12:28:41 +01:00
ines
444d665f9d
Add regression test for #686
2017-03-13 12:23:35 +01:00
ines
46b17e5b51
Add regression test for #719
2017-03-13 12:17:35 +01:00
ines
c8ae682ff9
Add regression test for #636
2017-03-13 12:08:31 +01:00
ines
337f9601f2
Add missing unicode declaration
2017-03-13 12:08:19 +01:00
ines
d70386ec6e
Update docstring in #886 regression test
2017-03-13 12:00:38 +01:00
ines
51ba3ef0a8
Add regression test for #886
2017-03-13 11:44:58 +01:00
ines
66c1f194f9
Use consistent unicode declarations
2017-03-12 13:07:28 +01:00
Matthew Honnibal
5b0b968d13
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-03-08 15:03:10 +01:00