Matthew Honnibal
90d1d9b230
Remove obsolete parser code
2017-10-26 13:22:45 +02:00
Matthew Honnibal
33f8c58782
Remove obsolete parser.pyx
2017-10-26 12:42:05 +02:00
Matthew Honnibal
a8abc47811
Rename BaseThincComponent --> Pipe
2017-10-26 12:40:40 +02:00
Matthew Honnibal
b0f3ea2200
Fix names of pipeline components
...
NeuralDependencyParser --> DependencyParser
NeuralEntityRecognizer --> EntityRecognizer
TokenVectorEncoder --> Tensorizer
NeuralLabeller --> MultitaskObjective
2017-10-26 12:38:23 +02:00
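A minimal sketch of how the four renames in the commit above could be recorded for code that still refers to the old class names. The table `RENAMED_COMPONENTS` and the helper `current_name` are hypothetical illustrations, not part of spaCy.

```python
# Old -> new class names, copied from the commit message above.
# The dict and helper are illustrative only; spaCy does not ship them.
RENAMED_COMPONENTS = {
    "NeuralDependencyParser": "DependencyParser",
    "NeuralEntityRecognizer": "EntityRecognizer",
    "TokenVectorEncoder": "Tensorizer",
    "NeuralLabeller": "MultitaskObjective",
}


def current_name(name):
    """Return the post-rename class name, or `name` unchanged if it was not renamed."""
    return RENAMED_COMPONENTS.get(name, name)
```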
Matthew Honnibal
b6b4f1aaf7
Merge pull request #1462 from explosion/feature/vector-meta-data
...
💫 Add vector meta data to model meta.json on train/package and show in docs
2017-10-26 11:39:41 +02:00
Ines Montani
090bd00369
Merge pull request #1464 from mayukh18/develop_bengali_pronouns
...
added the bengali pronouns for v2.0
2017-10-25 21:55:25 +02:00
mayukh18
1bc07758fa
added few bengali pronouns
2017-10-25 22:24:40 +05:30
ines
728b609bf9
Merge branch 'develop' into feature/vector-meta-data
2017-10-25 16:32:22 +02:00
ines
c0b55ebdac
Fix PhraseMatcher.__contains__ and add more tests
2017-10-25 16:31:11 +02:00
ines
91beacf5e3
Fix Matcher.__contains__
2017-10-25 16:19:38 +02:00
ines
11e3f19764
Fix vectors data added after training (see #1457)
2017-10-25 16:08:26 +02:00
ines
057954695b
Read pipeline and vector data off model in --generate-meta
2017-10-25 16:03:26 +02:00
ines
273e638183
Add vector data to model meta after training (see #1457)
2017-10-25 16:03:05 +02:00
ines
18aae423fb
Remove import of non-existing function
2017-10-25 15:54:10 +02:00
ines
5117a7d24d
Fix whitespace
2017-10-25 15:54:02 +02:00
Matthew Honnibal
b5de768852
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-10-25 14:44:16 +02:00
Matthew Honnibal
094512fd47
Fix model-mark on regression test.
2017-10-25 14:44:00 +02:00
ines
72497c8cb2
Remove comments and add TODO
2017-10-25 12:15:43 +02:00
ines
4d97efc3b5
Add missing docstrings
2017-10-25 12:10:16 +02:00
ines
1262aa0bf9
Implement PhraseMatcher.__contains__
2017-10-25 12:10:04 +02:00
ines
9c733a8849
Implement PhraseMatcher.__len__
2017-10-25 12:09:56 +02:00
ines
7eebeeaf85
Fix Matcher.__contains__
2017-10-25 12:09:47 +02:00
ines
7bcec57462
Remove unused attribute
2017-10-25 12:08:54 +02:00
ines
0b1dcbac14
Remove unused function
2017-10-25 12:08:46 +02:00
ines
3484174e48
Add Language.path
2017-10-25 11:57:43 +02:00
Ines Montani
d3bf488e16
Merge pull request #1171 from mollerhoj/support-danish
...
Improve basic support for Danish
2017-10-24 20:29:57 +02:00
Matthew Honnibal
d9bb1e5de8
Increment version
2017-10-24 17:06:19 +02:00
Matthew Honnibal
908809d488
Update tests
2017-10-24 17:05:15 +02:00
Matthew Honnibal
66766c1454
Restore SP tag to English tag_map, until models migrate
2017-10-24 17:05:00 +02:00
Matthew Honnibal
30e67fa808
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-10-24 16:08:23 +02:00
Matthew Honnibal
b0f6fd3f1d
Disable tokenizer cache for special-cases. Fixes #1250
2017-10-24 16:08:05 +02:00
Matthew Honnibal
63f0bde749
Add test for #1250: Tokenizer cache clobbered special-case attrs
2017-10-24 16:07:18 +02:00
ines
8492d5be6d
Always make lemmatizer return a list of lemmas, not a set
2017-10-24 16:00:56 +02:00
ines
95f866f99f
Add lookup argument to Lemmatizer.load
2017-10-24 16:00:56 +02:00
ines
95f6174516
Remove tensorizer from model pipeline example in spacy package
2017-10-24 16:00:56 +02:00
ines
090aed940a
Add test for currently failing span.as_doc case
2017-10-24 16:00:56 +02:00
ines
4ef81a9ebc
Fix whitespace
2017-10-24 16:00:56 +02:00
Matthew Honnibal
18f1c1d0ba
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-10-24 14:29:43 +02:00
Matthew Honnibal
4bea65a1a8
Fix Issue #1450: Off-by-1 in * and ? matches
...
Patterns that end in variable-length operators (e.g. * and ?) now end on
the correct token. Previously, they were off by 1: the next token was
pulled into the match, even if that's where the pattern failed.
2017-10-24 14:26:27 +02:00
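The end-of-match behaviour described in the commit above can be illustrated with a toy greedy matcher. This is a simplified sketch over plain strings, not spaCy's Matcher implementation. The function `match_end` and the `(text, op)` pattern format are assumptions made for the example.

```python
def match_end(tokens, pattern, start=0):
    """Return the exclusive end index of a greedy match starting at `start`.

    `pattern` is a list of (text, op) pairs, where op is "1" (exactly one),
    "?" (zero or one) or "*" (zero or more). Returns None if a required
    token does not match. Toy illustration only, with no backtracking.
    """
    i = start
    for text, op in pattern:
        if op == "1":
            if i < len(tokens) and tokens[i] == text:
                i += 1
            else:
                return None
        elif op == "?":
            # Consume the token only if it matches; otherwise stay put.
            if i < len(tokens) and tokens[i] == text:
                i += 1
        elif op == "*":
            # Stop at the first non-matching token and do not consume it.
            while i < len(tokens) and tokens[i] == text:
                i += 1
        else:
            raise ValueError("unknown operator: %r" % op)
    return i


tokens = ["a", "b", "b", "c"]
end = match_end(tokens, [("a", "1"), ("b", "*")])
print(tokens[:end])  # ['a', 'b', 'b']
```

The token "c" is where the trailing `*` stops matching, so it stays outside the match. An off-by-1 implementation would have returned an end index one higher and pulled "c" in.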
Matthew Honnibal
391d5ef0d1
Normalize imports in regression test
2017-10-24 14:25:49 +02:00
ines
c55db0a4a1
Add example sentences for Japanese and Chinese (see #1107)
2017-10-24 13:02:24 +02:00
ines
66f8f9d4a0
Fix Japanese tokenizer
...
JapaneseTokenizer now returns a Doc, not individual words
2017-10-24 13:02:19 +02:00
Matthew Honnibal
dd5b2d8fa3
Check for out-of-memory when calling calloc. Closes #1446
2017-10-24 12:40:47 +02:00
Matthew Honnibal
b66b8f028b
Fix #1375 -- out-of-bounds on token.nbor()
2017-10-24 12:10:39 +02:00
Matthew Honnibal
a68d89a4f3
Add failing test for bug #1375 -- no out-of-bounds error for token.nbor()
2017-10-24 12:05:25 +02:00
Ines Montani
facf77e541
Merge branch 'develop' into support-danish
2017-10-24 11:53:19 +02:00
Matthew Honnibal
ccd2ab1a62
Merge pull request #1443 from ramananbalakrishnan/develop-get-lca-matrix
...
Add LCA matrix for spans and docs
2017-10-24 11:22:46 +02:00
Matthew Honnibal
ef3e5a361b
Merge pull request #1442 from explosion/feature/fix-sp
...
💫 Fix SP tag, tweak Vectors.__init__, fix Morphology
2017-10-24 10:24:07 +02:00
Matthew Honnibal
fdf25d10ba
Merge pull request #1440 from ramananbalakrishnan/develop
...
Support single value for attribute list in doc.to_array
2017-10-24 10:23:12 +02:00
ines
a31f048b4d
Fix formatting
2017-10-23 10:38:06 +02:00