Ines Montani
36ac044937
Update README.md [ci skip]
2019-08-07 13:38:59 +02:00
Ines Montani
3e60afacf9
Add Serbian to languages [ci skip]
2019-08-07 13:38:25 +02:00
Ines Montani
1dc28a9ecb
Update Binder version [ci skip]
2019-08-07 13:38:12 +02:00
Ines Montani
6bec24cdd0
Require downloaded model in pkg_resources ( #4090 )
2019-08-07 13:18:11 +02:00
Ines Montani
8b4a0fabbb
Adjust docs example [ci skip]
2019-08-07 00:46:47 +02:00
adrianeboyd
69aca7d839
Add validate option to EntityRuler ( #4089 )
* Add validate option to EntityRuler
* Add validate to EntityRuler, passed to Matcher and PhraseMatcher
* Add validate to usage and API docs
* Update website/docs/usage/rule-based-matching.md
Co-Authored-By: Ines Montani <ines@ines.io>
* Update website/docs/usage/rule-based-matching.md
Co-Authored-By: Ines Montani <ines@ines.io>
2019-08-07 00:40:53 +02:00
Ines Montani
4ae320e5c2
Use consistent casing for entity ruler patterns (see #4063 ) [ci skip]
2019-08-06 12:20:22 +02:00
Ines Montani
223bde5cf6
Improve docs on matcher attributes [ci skip] ( closes #4063 )
2019-08-06 12:13:42 +02:00
Ines Montani
2bfae0b167
Auto-format
2019-08-06 12:13:31 +02:00
Jeno
15be09ceb0
Raise error if annotation dict in simple training style has unexpected keys #4074 ( #4079 )
* adding enhancement #4074 .
* modified behavior to strictly require top level dictionary keys - issue #4074
* pass expected keys to error message and add links as expected top level key
2019-08-06 11:01:25 +02:00
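A minimal sketch of the strict key check described in the commit above (#4079), assuming annotations in the simple training style arrive as a plain dict. The function name `check_annotation_keys` and the exact tuple of expected keys are illustrative assumptions, not spaCy's actual code; only `links` is named in the commit message.

```python
# Hypothetical list of accepted top-level keys (an assumption for illustration).
EXPECTED_KEYS = ("words", "tags", "heads", "deps", "entities", "cats", "links")


def check_annotation_keys(annotations, expected=EXPECTED_KEYS):
    """Raise ValueError if the dict has top-level keys outside `expected`."""
    unexpected = sorted(set(annotations) - set(expected))
    if unexpected:
        # Include the expected keys in the message so the caller can spot typos.
        raise ValueError(
            "Unexpected keys {} in annotations. Expected one of: {}".format(
                unexpected, list(expected)
            )
        )
    return annotations
```

With this check, a misspelled key such as `"entites"` fails loudly instead of being silently ignored during training.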
Sofie Van Landeghem
ad09b0d6f3
fetch norm from lex if necessary for matching ( #4080 )
2019-08-05 23:51:04 +02:00
Ines Montani
7f3212e2f5
💫 Sync branches ( #4084 ) [ci skip]
* Update from master
* Re-added Universe readme (#3688 ) (closes #3680 )
* Fix typo
* Add version tag to `--base-model` argument (closes #3720 )
* fixing regex matcher examples (#3708 ) (#3719 )
* Improve Token.prob and Lexeme.prob docs (resolves #3701 )
* Fix DependencyParser.predict docs (resolves #3561 )
* Update languages.json
Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>
Co-authored-by: Aaron Kub <aaronkub@gmail.com>
2019-08-05 14:32:54 +02:00
Ines Montani
0f740fad1a
Update universe.json [ci skip]
2019-08-05 14:30:07 +02:00
Pavle Vidanović
e1a935d71c
Stopwords for Serbian language. ( #4078 )
* Serbian stopwords added. (cyrillic alphabet)
* spaCy Contribution agreement included.
* Test initialize updated
2019-08-05 10:22:27 +02:00
Sebastian Jordan
878302a55d
Fix typo in requirements section of pyproject.toml ( #4081 )
2019-08-05 10:21:14 +02:00
veer-bains
874bd8c8dd
Fixed syntax error in lang/ko when using python 2 ( #4082 ) ( closes #4068 )
* fixed syntax error in declaring variables with python 2.7 in spacy/lang/ko/__init__.py
* Update __init__.py
* Create veer-bains.md
* Update __init__.py
fixed syntax errors in variable datatype assignment when calling spacy.blank("ko") with python 2.7
2019-08-05 10:19:32 +02:00
Ines Montani
87ddbdc33e
Fix handling of kwargs in Language.evaluate
Makes it consistent with other methods
2019-08-04 13:44:21 +02:00
Muhammad Irfan
d1d30b0442
added missing punctuation following conventions. ( #4066 )
2019-08-04 13:41:18 +02:00
Anastassia
33b14724a5
Update gold corpus code to properly ingest a directory of jsonl… ( #4067 )
* Update gold corpus code to properly ingest a directory of jsonlines files
In response to: https://github.com/explosion/spaCy/issues/3975
* Update spacy/gold.pyx
Co-Authored-By: Ines Montani <ines@ines.io>
2019-08-02 09:58:51 +02:00
Ines Montani
0f76e0022d
Update .tensor docs [ci skip]
2019-08-01 18:37:09 +02:00
Ines Montani
3072eb28c2
Support and render Markdown in model meta [ci skip]
2019-08-01 18:33:10 +02:00
Matthew Honnibal
944a66c326
Add span.tensor and token.tensor attributes
2019-08-01 18:30:50 +02:00
Matthew Honnibal
d3071ecdbc
Set version to v2.1.7
2019-08-01 18:09:19 +02:00
Matthew Honnibal
97c51ef93b
Set version to v2.1.7.dev1
2019-08-01 17:29:25 +02:00
Matthew Honnibal
4632c597e7
Fix Pipe base class
2019-08-01 17:29:01 +02:00
Ines Montani
8718ca8b1f
Fix init_model if there's no vocab ( closes #4048 ) ( #4049 )
2019-08-01 17:26:09 +02:00
adrianeboyd
925a852bb6
Improve NER per type scoring ( #4052 )
* Improve NER per type scoring
* include all gold labels in per type scoring, not only when recall > 0
* improve efficiency of per type scoring
* Create Scorer tests, initially with NER tests
* move regression test #3968 (per type NER scoring) to Scorer tests
* add new test for per type NER scoring with imperfect P/R/F and per type P/R/F including a case where R == 0.0
2019-08-01 17:15:36 +02:00
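The per type scoring change in the commit above (#4052) can be illustrated with a small standalone sketch. It assumes entities are given as `(start, end, label)` tuples; the function name `per_type_prf` is hypothetical and this is not spaCy's `Scorer` implementation. The point it shows is that every label seen in the gold data gets an entry, even when its recall is 0.0.

```python
from collections import defaultdict


def per_type_prf(gold_spans, pred_spans):
    """Compute precision/recall/F per label from (start, end, label) spans."""
    gold, pred = set(gold_spans), set(pred_spans)
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for span in pred:
        # A predicted span is a true positive only on an exact match.
        counts[span[2]]["tp" if span in gold else "fp"] += 1
    for span in gold - pred:
        # Missed gold spans still register their label, so R == 0.0 is reported.
        counts[span[2]]["fn"] += 1
    scores = {}
    for label, c in counts.items():
        p = c["tp"] / (c["tp"] + c["fp"]) if c["tp"] + c["fp"] else 0.0
        r = c["tp"] / (c["tp"] + c["fn"]) if c["tp"] + c["fn"] else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        scores[label] = {"p": p, "r": r, "f": f}
    return scores
```

Guarding each division keeps labels with no predictions or no matches from raising `ZeroDivisionError`.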
Sofie Van Landeghem
f7d950de6d
ensure the lang of vocab and nlp stay consistent ( #4057 )
* ensure the language of vocab and nlp stay consistent across serialization
* equality with =
2019-08-01 17:13:01 +02:00
Björn Böing
a83c0add2e
Add links to tokenizer API docs to refer relevant information. ( #4064 )
* Add links to tokenizer API docs to refer relevant information.
* Add suggested changes
Co-Authored-By: Ines Montani <ines@ines.io>
2019-08-01 14:28:38 +02:00
Ejar
2cdf7d39e7
Corrected imported function ( #4062 )
The example showed an incorrect import
2019-08-01 12:43:36 +02:00
Mohammed Daudali
23ec07debd
Correct typo for AllenAI url on homepage ( #4050 )
* Typo fix for AllenAI url
Changed incorrect home page url for AllenAI from appenai.org to allenai.org
* Sign contributor agreement
* Change date format
2019-07-31 00:16:33 +02:00
Sofie Van Landeghem
7de3b129ab
Resolve edge case when calling textcat.predict with empty doc ( #4035 )
* resolve edge case where no doc has tokens when calling textcat.predict
* more explicit value test
2019-07-30 14:58:01 +02:00
Ines Montani
fcd2f7f656
Fix version introducing Span.ents ( closes #4045 ) [ci skip]
2019-07-30 10:32:33 +02:00
Matthew Honnibal
89c92c65fb
Update version
2019-07-28 17:56:38 +02:00
Matthew Honnibal
06eb428ed1
Make pipe base class a bit less presumptuous
2019-07-28 17:56:11 +02:00
Matthew Honnibal
16b5144095
Don't raise NotImplemented in Pipe.update
2019-07-28 17:54:11 +02:00
Ines Montani
fc69da0acb
💫 Support simple training format in nlp.evaluate and add tests ( #4033 )
* Support simple training format in nlp.evaluate and add tests
* Update docs [ci skip]
2019-07-27 17:30:18 +02:00
Ines Montani
a3723f439c
Fix formatting [ci skip]
2019-07-27 16:35:42 +02:00
Ines Montani
d5bce35fb1
Fix bug in Span.similarity when called via hook
2019-07-27 15:33:27 +02:00
Ines Montani
109b5e1798
Fix bug in Token.similarity when called via hook
2019-07-27 15:26:01 +02:00
Ines Montani
e000b5ed82
Also support "requirements" in model.json
2019-07-27 13:34:57 +02:00
Ines Montani
307ffe472d
Support custom language factory setting in meta.json ( #4031 )
2019-07-27 13:17:43 +02:00
Ines Montani
b7cd58c736
Tidy up and auto-format [ci skip]
2019-07-27 12:19:35 +02:00
Bae Yong-Ju
05fbf5d976
Fix error when Korean text contains regexp special characters. ( #4022 )
2019-07-25 17:53:33 +02:00
Ines Montani
bd39e5e630
Add "Processing text" section [ci skip]
2019-07-25 17:38:03 +02:00
Ines Montani
a5e3d2f318
Improve section on disabling pipes [ci skip]
2019-07-25 14:25:34 +02:00
Ines Montani
02e444ec7c
Add section on special tokenizer component [ci skip]
2019-07-25 14:25:03 +02:00
Ines Montani
1fa6d6ba55
Improve consistency of docs examples [ci skip]
2019-07-25 14:24:56 +02:00
adrianeboyd
784a5f4284
Update GoldParse attributes in API docs ( #4023 )
* add `words`
* update name of entity list to `ner`
I think it might be a bit more consistent to have `ner` named `entities`
or `ents` (and `ents` is actually set somewhere to `None`, which is a
bit confusing), but it looks like renaming it would be a non-trivial
decision.
2019-07-25 12:14:02 +02:00
Matthew Honnibal
73e095923f
💫 Improve error message when model.from_bytes() dies ( #4014 )
* Improve error message when model.from_bytes() dies
When Thinc's model.from_bytes() is called with a mismatched model, often
we get a particularly ungraceful error,
e.g. "AttributeError: FunctionLayer has no attribute G"
This is because we're trying to load the parameters for something like
a LayerNorm layer, and the model architecture has some other layer there
instead. This is obviously terrible, especially since the error *type*
is wrong.
I've changed it to raise a ValueError. The error message is still
probably a bit terse, but it's hard to be sure exactly what's gone
wrong.
* Update spacy/pipeline/pipes.pyx
* Update spacy/pipeline/pipes.pyx
* Update spacy/pipeline/pipes.pyx
* Update spacy/syntax/nn_parser.pyx
* Update spacy/syntax/nn_parser.pyx
* Update spacy/pipeline/pipes.pyx
Co-Authored-By: Matthew Honnibal <honnibal+gh@gmail.com>
* Update spacy/pipeline/pipes.pyx
Co-Authored-By: Matthew Honnibal <honnibal+gh@gmail.com>
Co-authored-by: Ines Montani <ines@ines.io>
2019-07-24 11:27:34 +02:00
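The error handling described in the commit above (#4014) follows a common pattern: catch the low-level `AttributeError` and re-raise it as a `ValueError` with a clearer message. The sketch below is a standalone illustration under that assumption; `MismatchedLayer` and `load_model_bytes` are hypothetical stand-ins, not Thinc or spaCy code, and the message wording is invented.

```python
class MismatchedLayer:
    """Stand-in for a layer whose architecture doesn't match the saved weights."""

    def from_bytes(self, data):
        # Simulates the ungraceful failure quoted in the commit message.
        raise AttributeError("FunctionLayer has no attribute G")


def load_model_bytes(model, data):
    """Load parameters, converting architecture mismatches into ValueError."""
    try:
        model.from_bytes(data)
    except AttributeError as err:
        # Chain the original error so the low-level detail is still visible.
        raise ValueError(
            "Could not load model parameters. The model architecture may not "
            "match the saved weights."
        ) from err
    return model
```

Chaining with `from err` keeps the original traceback available while giving callers an exception type that matches the actual problem.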