ines
7459ecfa87
Port over contributor agreements
2017-10-24 20:13:34 +02:00
ines
d71702b827
Fix formatting
2017-10-24 20:11:04 +02:00
Matthew Honnibal
d9bb1e5de8
Increment version
2017-10-24 17:06:19 +02:00
Matthew Honnibal
908809d488
Update tests
2017-10-24 17:05:15 +02:00
Matthew Honnibal
66766c1454
Restore SP tag to English tag_map, until models migrate
2017-10-24 17:05:00 +02:00
ines
b51dcee3ce
Fix unicode in lightning tour example (resolves #1356)
2017-10-24 16:25:49 +02:00
ines
ebd2e5ff54
Fix matcher docs (resolves #1453)
2017-10-24 16:22:46 +02:00
ines
90601cf1b3
Fix formatting
2017-10-24 16:22:37 +02:00
ines
0e081d0167
Update JSON training format docs (resolves #1291)
2017-10-24 16:17:54 +02:00
ines
91dbee1b8f
Add BILUO docs to NER annotation scheme
2017-10-24 16:17:03 +02:00
ines
fdd8dacb75
Fix compilation of color utility class names
2017-10-24 16:13:52 +02:00
ines
a2e7e9be98
Update landing
2017-10-24 16:12:47 +02:00
Matthew Honnibal
30e67fa808
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-10-24 16:08:23 +02:00
Matthew Honnibal
b0f6fd3f1d
Disable tokenizer cache for special-cases. Fixes #1250
2017-10-24 16:08:05 +02:00
Matthew Honnibal
63f0bde749
Add test for #1250: Tokenizer cache clobbered special-case attrs
2017-10-24 16:07:18 +02:00
ines
8492d5be6d
Always make lemmatizer return a list of lemmas, not a set
2017-10-24 16:00:56 +02:00
ines
95f866f99f
Add lookup argument to Lemmatizer.load
2017-10-24 16:00:56 +02:00
ines
95f6174516
Remove tensorizer from model pipeline example in spacy package
2017-10-24 16:00:56 +02:00
ines
6686e53530
Allow GitHub embeds to specify optional language
2017-10-24 16:00:56 +02:00
ines
56a47f137f
Add title description for tokenizer
2017-10-24 16:00:56 +02:00
ines
3944c1d6e7
Document lemmatizer
2017-10-24 16:00:56 +02:00
ines
c9dc88ddfc
Document current JSON format for training
2017-10-24 16:00:56 +02:00
ines
2b8e7c45e0
Use better training data JSON example
2017-10-24 16:00:56 +02:00
ines
090aed940a
Add test for currently failing span.as_doc case
2017-10-24 16:00:56 +02:00
ines
4ef81a9ebc
Fix whitespace
2017-10-24 16:00:56 +02:00
Matthew Honnibal
18f1c1d0ba
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-10-24 14:29:43 +02:00
Matthew Honnibal
4bea65a1a8
Fix Issue #1450: Off-by-1 in * and ? matches
Patterns that end in variable-length operators, e.g. * and ?, now end on
the correct token. Previously, they were off by 1: the next token was
pulled into the match, even if that was where the pattern failed.
2017-10-24 14:26:27 +02:00
Matthew Honnibal
391d5ef0d1
Normalize imports in regression test
2017-10-24 14:25:49 +02:00
ines
c55db0a4a1
Add example sentences for Japanese and Chinese (see #1107)
2017-10-24 13:02:24 +02:00
ines
66f8f9d4a0
Fix Japanese tokenizer
JapaneseTokenizer now returns a Doc, not individual words
2017-10-24 13:02:19 +02:00
Matthew Honnibal
5ae0b8613a
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-10-24 12:41:07 +02:00
Matthew Honnibal
dd5b2d8fa3
Check for out-of-memory when calling calloc. Closes #1446
2017-10-24 12:40:47 +02:00
ines
9bf5751064
Pretty-print JSON
2017-10-24 12:22:17 +02:00
Matthew Honnibal
0f9d966317
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-10-24 12:10:58 +02:00
Matthew Honnibal
b66b8f028b
Fix #1375 -- out-of-bounds on token.nbor()
2017-10-24 12:10:39 +02:00
Matthew Honnibal
a68d89a4f3
Add failing test for bug #1375 -- no out-of-bounds error for token.nbor()
2017-10-24 12:05:25 +02:00
ines
6675755005
Add training data JSON example
2017-10-24 12:05:10 +02:00
Ines Montani
facf77e541
Merge branch 'develop' into support-danish
2017-10-24 11:53:19 +02:00
Matthew Honnibal
1b64a44d85
Add dependency patterns example
2017-10-24 11:48:20 +02:00
Matthew Honnibal
8775efbfdf
Merge pull request #1120 from raphael0202/pattern
Implementation of dependency pattern-matching algorithm
2017-10-24 11:44:23 +02:00
Matthew Honnibal
ccd2ab1a62
Merge pull request #1443 from ramananbalakrishnan/develop-get-lca-matrix
Add LCA matrix for spans and docs
2017-10-24 11:22:46 +02:00
Matthew Honnibal
ef3e5a361b
Merge pull request #1442 from explosion/feature/fix-sp
💫 Fix SP tag, tweak Vectors.__init__, fix Morphology
2017-10-24 10:24:07 +02:00
Matthew Honnibal
fdf25d10ba
Merge pull request #1440 from ramananbalakrishnan/develop
Support single value for attribute list in doc.to_array
2017-10-24 10:23:12 +02:00
Matthew Honnibal
4ad24abb7e
Merge pull request #1447 from mayukh18/bengali_pronouns
added a few bengali pronouns
2017-10-24 10:22:17 +02:00
Matthew Honnibal
72a48dec14
Merge pull request #1454 from jnothman/patch-1
DOC "OP" key in token spec
2017-10-24 10:08:46 +02:00
Joel Nothman
80a9652617
DOC "OP" key in token spec
2017-10-24 15:48:22 +11:00
Matthew Honnibal
e7556ff048
Fix non-maxout parser
2017-10-23 18:16:23 +02:00
ines
7701984f13
Document Span.as_doc
2017-10-23 10:38:27 +02:00
ines
db15902e84
Tidy up
2017-10-23 10:38:21 +02:00
ines
3f0a157b33
Fix typo
2017-10-23 10:38:13 +02:00