Motoki Wu
b818afaa0e
Added failing test for Issue #1207 .
...
The noun chunk iterator should work for `Doc` but not for `Span`.
2017-11-17 17:04:27 -08:00
ines
a3d4dd1a5d
Test adding of lots of pipeline components (see #1585 )
...
Just to make sure that there's no error now or in the future with adding a large number of pipeline components.
2017-11-15 17:28:06 +01:00
Roman Domrachev
505c6a2f2f
Completely cleanup tokenizer cache
...
Tokenizer cache can have be different keys than string
That modification can slow down tokenizer and need to be measured
2017-11-15 17:55:48 +03:00
Roman Domrachev
3e21680814
Use safer method to get string without hit
2017-11-14 22:58:46 +03:00
Roman Domrachev
4e378dc4a4
Remove all obsolete code and test only initial problem
2017-11-14 20:45:04 +03:00
Roman
47ce2347b0
Create test that fails when actual cleanup caused
2017-11-14 20:28:13 +03:00
Roman Domrachev
3d247d2bb8
Get back previous testcase
2017-11-14 18:01:37 +03:00
Roman Domrachev
a2745b0e84
StringStore now actually cleaned
...
Do not lose docs in ref tracking
2017-11-14 17:45:50 +03:00
Roman Domrachev
ee60a52ee7
Fix test imports and last batch cleanup
2017-11-11 11:32:16 +03:00
Roman Domrachev
3c600adf23
Try to fix StringStore clean up (see #1506 )
2017-11-11 03:11:27 +03:00
ines
ee97fd3cb4
Add regression test for #1547
2017-11-11 00:14:03 +01:00
ines
2df27db671
Add unicode declaration
2017-11-11 00:13:56 +01:00
ines
1c218397f6
Ensure path in Doc.to_disk/from_disk (resolves ##1521)
...
Also add Doc serialization tests with both Path and string path options
2017-11-09 02:29:03 +01:00
Matthew Honnibal
a5ea0fdf5a
Fix #1518 : vocab.vectors.resize() didn't work
2017-11-08 22:18:37 +01:00
Matthew Honnibal
4194bc5744
Xfail flakey serialization test
2017-11-08 13:55:13 +01:00
ines
42a0fbf291
Fix textcat simple train example
2017-11-07 01:25:54 +01:00
ines
5f43953536
Move test
2017-11-06 23:14:10 +01:00
Matthew Honnibal
1831dbd065
Add test of simple textcat workflow
2017-11-06 22:04:29 +01:00
Matthew Honnibal
2f7e9f390d
Make test less flakey
2017-11-06 17:34:50 +01:00
Matthew Honnibal
407b08017e
Make test less flakey
2017-11-06 17:31:40 +01:00
Matthew Honnibal
102f797933
Fix lemma ordering in test
2017-11-06 17:02:17 +01:00
Matthew Honnibal
63c6ae4191
Fix lemmatizer test
2017-11-06 11:57:06 +01:00
Matthew Honnibal
00435d8f0c
Add extra beam parsing test
2017-11-05 14:39:57 +01:00
ines
5e7d98f72a
Remove test for #1491
2017-11-03 22:10:57 +01:00
ines
718f1c50fb
Add regression test for #1491
2017-11-03 21:11:20 +01:00
Matthew Honnibal
144a93c2a5
Back-off to tensor for similarity if no vectors
2017-11-03 20:56:33 +01:00
Matthew Honnibal
d6e831bf89
Fix lemmatizer tests
2017-11-03 19:46:34 +01:00
ines
eef930c73e
Assert instead of print
2017-11-03 18:50:57 +01:00
ines
f0986df94b
Add test for #1488 (passes on v2.0.0a18?)
2017-11-03 14:44:36 +01:00
Matthew Honnibal
711278b667
Make test less flakey
2017-11-03 14:36:08 +01:00
Matthew Honnibal
0a534ae96a
Fix test for backprop d_pad
2017-11-03 14:04:16 +01:00
Matthew Honnibal
a22f96c3f1
Add test for backpropagating padding
2017-11-03 00:48:54 +01:00
ines
3af281a334
Update test model name
2017-11-01 23:02:00 +01:00
ines
8c2260e18c
Move span tests to /doc
2017-11-01 16:56:35 +01:00
ines
260cb37224
Catch deprecation warning
2017-11-01 16:49:18 +01:00
ines
5914faafbb
Fix .merge tests to not use deprecated API
2017-11-01 16:49:11 +01:00
Matthew Honnibal
9e0ebee81c
Add Token.is_sent_start property, so can deprecate Token.sent_start
2017-11-01 13:27:14 +01:00
Matthew Honnibal
c047498f87
Fix vectors test
2017-11-01 13:24:47 +01:00
Matthew Honnibal
86eba61fae
Fix token.vector when vectors are missing
2017-11-01 00:47:35 +01:00
Ines Montani
d11659463b
Merge pull request #1152 from jimregan/develop-irish
...
[WIP] attempt a port from #1147
2017-11-01 00:23:43 +01:00
Jim O'Regan
08b0bfd153
merge
2017-10-31 22:55:59 +00:00
Jim O'Regan
00ecfa5417
Ó, not O
2017-10-31 22:54:42 +00:00
Ines Montani
25b1d6cd91
Fix syntax error
2017-10-31 22:36:03 +01:00
Matthew Honnibal
92dc127569
Fix test for Python 3
2017-10-31 22:21:55 +01:00
Jim O'Regan
fe4b10346a
replace example sentence until I get around to adding a punctuation.py
2017-10-31 20:24:53 +00:00
Matthew Honnibal
77d8f5de9a
Revise and simplify Vectors class
2017-10-31 18:25:08 +01:00
Jim O'Regan
d4a8160c36
change quotes
2017-10-31 15:15:44 +00:00
Jim O'Regan
34ca59691b
no idea what is wrong here
2017-10-31 14:50:13 +00:00
Jim O'Regan
41dd29e48e
merge
2017-10-31 14:07:45 +00:00
Matthew Honnibal
cb5217012f
Fix vector remapping
2017-10-31 11:40:46 +01:00
Matthew Honnibal
9c11ee4a1c
WIP on vectors fixes
2017-10-31 11:22:56 +01:00
Matthew Honnibal
368fdb389a
WIP on refactoring and fixing vectors
2017-10-31 02:00:26 +01:00
Explosion Bot
72aea8f105
Update vectors.add() to allow setting keys to rows
2017-10-30 10:03:08 +01:00
Matthew Honnibal
64e4ff7c4b
Merge 'tidy-up' changes into branch. Resolve conflicts
2017-10-28 13:16:06 +02:00
Ines Montani
4033e70c71
Merge pull request #1461 from explosion/feature/disable-pipes
...
💫 Add Language.disable_pipes(), to temporarily edit pipeline and update code examples
2017-10-27 12:21:40 +02:00
Matthew Honnibal
b0f3ea2200
Fix names of pipeline components
...
NeuralDependencyParser --> DependencyParser
NeuralEntityRecognizer --> EntityRecognizer
TokenVectorEncoder --> Tensorizer
NeuralLabeller --> MultitaskObjective
2017-10-26 12:38:23 +02:00
ines
de1e5f35d5
Merge branch 'develop' into feature/disable-pipes
2017-10-25 16:33:12 +02:00
ines
c0b55ebdac
Fix PhraseMatcher.__contains__ and add more tests
2017-10-25 16:31:11 +02:00
ines
657a4d91bc
Merge branch 'develop' into feature/disable-pipes
2017-10-25 15:19:05 +02:00
ines
1a722dac31
Merge branch 'develop' into feature/disable-pipes
2017-10-25 15:18:18 +02:00
Matthew Honnibal
b5de768852
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-10-25 14:44:16 +02:00
Matthew Honnibal
094512fd47
Fix model-mark on regression test.
2017-10-25 14:44:00 +02:00
Matthew Honnibal
e70f80f29e
Add Language.disable_pipes()
2017-10-25 13:46:41 +02:00
Ines Montani
d3bf488e16
Merge pull request #1171 from mollerhoj/support-danish
...
Improve basic support for Danish
2017-10-24 20:29:57 +02:00
Matthew Honnibal
908809d488
Update tests
2017-10-24 17:05:15 +02:00
Matthew Honnibal
30e67fa808
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-10-24 16:08:23 +02:00
Matthew Honnibal
63f0bde749
Add test for #1250 : Tokenizer cache clobbered special-case attrs
2017-10-24 16:07:18 +02:00
ines
090aed940a
Add test for currently failing span.as_doc case
2017-10-24 16:00:56 +02:00
ines
4ef81a9ebc
Fix whitespace
2017-10-24 16:00:56 +02:00
Matthew Honnibal
4bea65a1a8
Fix Issue #1450 : Off-by-1 in * and ? matches
...
Patterns that end in variable-length operators e.g. * and ? now end on
the correct token. Previously, they were off by 1: the next token was
pulled into the match, even if that's where the pattern failed.
2017-10-24 14:26:27 +02:00
Matthew Honnibal
391d5ef0d1
Normalize imports in regression test
2017-10-24 14:25:49 +02:00
Matthew Honnibal
b66b8f028b
Fix #1375 -- out-of-bounds on token.nbor()
2017-10-24 12:10:39 +02:00
Matthew Honnibal
a68d89a4f3
Add failing test for bug #1375 -- no out-of-bounds error for token.nbor()
2017-10-24 12:05:25 +02:00
Ines Montani
facf77e541
Merge branch 'develop' into support-danish
2017-10-24 11:53:19 +02:00
Matthew Honnibal
ccd2ab1a62
Merge pull request #1443 from ramananbalakrishnan/develop-get-lca-matrix
...
Add LCA matrix for spans and docs
2017-10-24 11:22:46 +02:00
Matthew Honnibal
ef3e5a361b
Merge pull request #1442 from explosion/feature/fix-sp
...
💫 Fix SP tag, tweak Vectors.__init__, fix Morphology
2017-10-24 10:24:07 +02:00
Matthew Honnibal
fdf25d10ba
Merge pull request #1440 from ramananbalakrishnan/develop
...
Support single value for attribute list in doc.to_array
2017-10-24 10:23:12 +02:00
Matthew Honnibal
490ad3eaf0
Check that empty strings are handled. Closes #1242
2017-10-21 00:52:14 +02:00
Ramanan Balakrishnan
d2fe56a577
Add LCA matrix for spans and docs
2017-10-20 23:58:00 +05:30
Matthew Honnibal
d8391b1c4d
Fix #1434 : Matcher failed on ending ? if no token
2017-10-20 16:49:36 +02:00
Matthew Honnibal
f111b228e0
Fix re-parsing of previously parsed text
...
If a Doc object had been previously parsed, it was possible for
invalid parses to be added. There were two problems:
1) The parse was only being partially erased
2) The RightArc action was able to create a 1-cycle.
This patch fixes both errors, and avoids resetting the parse if one is
present. In theory this might allow a better parse to be predicted by
running the parser twice.
Closes #1253 .
2017-10-20 16:27:36 +02:00
Matthew Honnibal
ebecaddb76
Make 'data_or_width' two keyword args in Vectors.__init__
...
Previously the data and width options were one argument in Vectors,
which meant you couldn't say vectors = Vectors(strings, width=300).
It's better to have two keywords.
2017-10-20 14:17:15 +02:00
Ramanan Balakrishnan
b3ab124fc5
Support strings for attribute list in doc.to_array
2017-10-20 11:46:57 +05:30
ines
bf415fd778
Add test for serializing extension attrs (see #1085 )
2017-10-19 00:53:08 +02:00
Matthew Honnibal
fe844148f6
Test pickling hooks
2017-10-17 19:43:52 +02:00
Matthew Honnibal
374819edf8
Test user_data deserialization, re #1085
2017-10-17 19:28:54 +02:00
Matthew Honnibal
8ca97f32a3
Fix doc pickling test
2017-10-17 18:19:57 +02:00
Matthew Honnibal
45d1dd90b1
Add tests for pickling doc
2017-10-17 17:20:58 +02:00
Matthew Honnibal
4174477161
Fix equality check in test
2017-10-16 19:50:35 +02:00
Matthew Honnibal
010a7309ff
Merge pull request #1402 from explosion/feature/fix-matcher-operators
...
💫 Fix Matcher variable-length operators
2017-10-16 17:53:19 +02:00
Matthew Honnibal
c29927d2e7
Fix matcher test
2017-10-16 17:22:18 +02:00
Matthew Honnibal
a928ae2f35
Merge branch 'develop' into feature/fix-matcher-operators
2017-10-16 13:38:36 +02:00
Matthew Honnibal
748d525801
Add more matcher operator tests
2017-10-16 13:38:01 +02:00
ines
3516aa0cea
Port over changes from #1389
2017-10-14 13:32:55 +02:00
ines
cd6a29dce7
Port over changes from #1294
2017-10-14 13:28:46 +02:00
ines
38c756fd85
Port over changes from #1287
2017-10-14 13:16:21 +02:00
ines
612224c10d
Port over changes from #1157
2017-10-14 13:11:39 +02:00
ines
9b3f8f9ec3
Fix formatting and add comment on languages
2017-10-14 13:11:18 +02:00
ines
a4d974d97b
Port over URL pattern changes from #1411
2017-10-14 12:58:07 +02:00
Matthew Honnibal
cf6da9301a
Update lemmatizer test
2017-10-12 22:50:52 +02:00