adrianeboyd
8fe7bdd0fa
Improve token pattern checking without validation (#4105)
...
* Fix typo in rule-based matching docs
* Improve token pattern checking without validation
Add more detailed token pattern checks without full JSON pattern validation and
provide more detailed error messages.
Addresses #4070 (also related: #4063, #4100).
* Check whether top-level attributes in patterns and attr for PhraseMatcher are
in token pattern schema
* Check whether attribute value types are supported in general (as opposed to
per attribute with full validation)
* Report various internal error types (OverflowError, AttributeError, KeyError)
as ValueError with standard error messages
* Check for tagger/parser in PhraseMatcher pipeline for attributes TAG, POS,
LEMMA, and DEP
* Add error messages with relevant details on how to use validate=True or nlp()
instead of nlp.make_doc()
* Support attr=TEXT for PhraseMatcher
* Add NORM to schema
* Expand tests for pattern validation, Matcher, PhraseMatcher, and EntityRuler
* Remove unnecessary .keys()
* Rephrase error messages
* Add another type check to Matcher
Add another type check to Matcher for more understandable error messages
in some rare cases.
* Support phrase_matcher_attr=TEXT for EntityRuler
* Don't use spacy.errors in examples and bin scripts
* Fix error code
* Auto-format
Also try to get Azure Pipelines to finally start a build :(
* Update errors.py
Co-authored-by: Ines Montani <ines@ines.io>
Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
2019-08-21 14:00:37 +02:00
Ines Montani
3134a9b6e0
Add section on expanding regex match to token boundaries (see #4158) [ci skip]
2019-08-21 12:53:31 +02:00
Ines Montani
66aba2d676
Improve regex matching docs [ci skip]
2019-08-19 13:59:41 +02:00
Sofie Van Landeghem
cc66f47893
Make enabling/disabling jupyter mode more explicit (#4144)
...
* make enabling/disabling jupyter mode more explicit
* markup fix
2019-08-19 11:53:34 +02:00
Ines Montani
e520eb3f6c
Make visualized NER examples more clear (closes #4104) [ci skip]
2019-08-18 16:29:29 +02:00
Ines Montani
1362f793cf
Improve docs on phrase pattern attributes (closes #4100) [ci skip]
2019-08-11 11:13:49 +02:00
Ines Montani
8b4a0fabbb
Adjust docs example [ci skip]
2019-08-07 00:46:47 +02:00
adrianeboyd
69aca7d839
Add validate option to EntityRuler (#4089)
...
* Add validate option to EntityRuler
* Add validate to EntityRuler, passed to Matcher and PhraseMatcher
* Add validate to usage and API docs
* Update website/docs/usage/rule-based-matching.md
Co-Authored-By: Ines Montani <ines@ines.io>
* Update website/docs/usage/rule-based-matching.md
Co-Authored-By: Ines Montani <ines@ines.io>
2019-08-07 00:40:53 +02:00
Ines Montani
4ae320e5c2
Use consistent casing for entity ruler patterns (see #4063) [ci skip]
2019-08-06 12:20:22 +02:00
Ines Montani
223bde5cf6
Improve docs on matcher attributes [ci skip] (closes #4063)
2019-08-06 12:13:42 +02:00
Ines Montani
2bfae0b167
Auto-format
2019-08-06 12:13:31 +02:00
Ines Montani
bd39e5e630
Add "Processing text" section [ci skip]
2019-07-25 17:38:03 +02:00
Ines Montani
a5e3d2f318
Improve section on disabling pipes [ci skip]
2019-07-25 14:25:34 +02:00
Ines Montani
02e444ec7c
Add section on special tokenizer component [ci skip]
2019-07-25 14:25:03 +02:00
Ines Montani
1fa6d6ba55
Improve consistency of docs examples [ci skip]
2019-07-25 14:24:56 +02:00
Ines Montani
1167c303a0
Fix typos [ci skip]
2019-07-19 13:08:18 +02:00
Ines Montani
c3ead02ea5
Adjust wording [ci skip]
2019-07-17 16:06:25 +02:00
Ines Montani
1d5ff3e455
Add infobox
2019-07-17 15:29:36 +02:00
Ines Montani
114cb18892
Improve wording
2019-07-17 15:27:53 +02:00
Ines Montani
7522beef9e
Add "Things to try" prompts
2019-07-17 15:25:02 +02:00
Ines Montani
9f02e3c027
Adjust example
...
Not actually supported in this alignment interpretation
2019-07-17 15:13:50 +02:00
Ines Montani
1ea472468a
Add usage docs for aligning tokenization
2019-07-17 15:08:33 +02:00
pmbaumgartner
9a86d95ea2
fix custom attribute links
2019-07-14 20:23:54 -04:00
Ines Montani
ebe58e7fa1
Document gold.docs_to_json [ci skip]
2019-07-10 10:27:33 +02:00
Ines Montani
881f5bc401
Auto-format
2019-07-10 10:27:29 +02:00
Ines Montani
d361e380b8
Fix matcher callback example (closes #3862)
2019-06-26 14:47:26 +02:00
Alejandro Alcalde
4866a7ee9e
Changed learning rate by its param name. (#3855)
...
* Changed learning rate by its param name.
I've been searching for a while for how the learning rate parameter is named. With `beta1` and `beta2` it's easy, as they are marked as code, but the learning rate wasn't. I think writing the actual parameter name would be helpful.
* Signing SCA
2019-06-20 10:29:20 +02:00
Ramanan Balakrishnan
eb12703d10
minor fix to broken link in documentation (#3819) [ci skip]
2019-06-04 11:15:35 +02:00
Ines Montani
0c74506c9c
Fix typos in docs (closes #3802) [ci skip]
2019-06-01 11:35:01 +02:00
mak
89379a7fa4
Corrected example model URL in requirements.txt (#3786)
...
The URL used to show how to add a model to the requirements.txt had the old release path (excl. explosion).
2019-05-29 10:51:55 +02:00
Aaron Kub
719a15f23d
fixing regex matcher examples (#3708) (#3719)
2019-05-10 14:23:52 +02:00
张晓飞
ba1ff00370
update response after calling add_pipe (#3661)
...
* update response after calling add_pipe
component:print_info is appended last, so it needs to be shown at the end of the pipeline
* Create henry860916.md
2019-05-01 12:02:18 +02:00
Ramiro Gómez
8ee4100f8f
Remove dangling M (#3657)
...
I assume this is a typo. Sorry if it has a meaning that I'm not aware of.
2019-04-29 19:44:43 +02:00
Amit Chaudhary
167d63af31
Fix broken link to Dive Into Python 3 website (#3656)
...
* Fix broken link to Dive Into Python 3 website
* Sign spaCy Contributor Agreement
2019-04-29 19:44:00 +02:00
Ivan Tham
fa94f83697
Improve redundant variable name (#3643)
...
* Improve redundant variable name
* Apply suggestions from code review
Co-Authored-By: pickfire <pickfire@riseup.net>
2019-04-26 16:50:14 +02:00
Ines Montani
0dce4585b1
Add course to 101
2019-04-19 15:59:51 +02:00
Ines Montani
38395d9518
Merge branch 'spacy.io'
2019-04-19 15:26:20 +02:00
Ines Montani
7ac5bb0a7b
Update landing and feature overview
2019-04-19 15:23:08 +02:00
fizban99
f2f2df6e78
entity types for colors should be in uppercase (#3599)
...
Although the text indicates the entity types should be in lowercase, the sample code shows uppercase, which is the correct format.
2019-04-17 11:22:56 +02:00
Ines Montani
9e7deeaf48
Remove Datacamp
2019-04-13 17:46:32 +02:00
Ines Montani
2f0f439c54
Remove non-existent example (closes #3533)
2019-04-03 09:59:17 +02:00
Ines Montani
200d8bdb3c
Merge branch 'spacy.io' [ci skip]
2019-03-23 16:46:34 +01:00
Ines Montani
06bf130890
💫 Add better and serializable sentencizer (#3471)
...
* Add better serializable sentencizer component
* Replace default factory
* Add tests
* Tidy up
* Pass test
* Update docs
2019-03-23 15:45:02 +01:00
Ines Montani
b532386a60
Fix typo [ci skip]
2019-03-22 18:36:17 +01:00
Ines Montani
5073ce63fd
Merge branch 'spacy.io' [ci skip]
2019-03-22 15:17:11 +01:00
Ines Montani
0712efc6b3
Update version requirements [ci skip]
2019-03-21 10:23:54 +01:00
Ines Montani
d4eed4a84f
Add note on unicode build to troubleshooting guide (see #3421) [ci skip]
2019-03-19 10:27:02 +01:00
Ines Montani
a611b32fbf
Update model docs [ci skip]
2019-03-17 11:48:18 +01:00
Ines Montani
cbcba699dd
Fix missing ids
2019-03-14 17:56:53 +01:00
Ines Montani
4cfe4aa224
Fix small issues in the docs [ci skip]
2019-03-12 22:57:15 +01:00