Matthew Honnibal
6aa1c53b1b
Call resume_training for base model in train CLI
2019-10-17 21:09:41 +02:00
Matthew Honnibal
3f26c50a4d
Refactor some of tok2vec
2019-10-17 17:58:00 +02:00
Matthew Honnibal
e63f28079a
Try 3 NER features
2019-10-07 16:51:03 +02:00
Matthew Honnibal
2d55ccdd27
Support option of three NER features
2019-10-07 16:50:44 +02:00
Matthew Honnibal
c8857181f8
Fix get labels for textcat
2019-10-07 16:50:15 +02:00
Matthew Honnibal
a6a2ff217f
Fix char_embed for gpu
2019-10-07 16:49:32 +02:00
Matthew Honnibal
f4040a98f0
Fix passing of cats in gold.pyx
2019-10-07 16:49:00 +02:00
Matthew Honnibal
a132da1558
Fix gold-preproc training mode
2019-10-07 02:07:03 +02:00
Matthew Honnibal
63ff233ba2
Enable GPU in PyTorch in use_gpu function
2019-10-06 19:24:21 +02:00
Matthew Honnibal
9dbaea1ab4
Use cosine loss in Cloze multitask
2019-10-06 19:23:46 +02:00
Matthew Honnibal
157d3d769b
Support bilstm_depth arg in spacy pretrain
2019-10-06 19:22:26 +02:00
Matthew Honnibal
615ebe584f
Add option to ignore zero vectors in get_cossim_loss
2019-10-06 19:20:54 +02:00
adrianeboyd
cbc2cee2c8
Improve URL_PATTERN and handling in tokenizer (#4374)
* Move prefix and suffix detection for URL_PATTERN
Move prefix and suffix detection for `URL_PATTERN` into the tokenizer.
Remove associated lookahead and lookbehind from `URL_PATTERN`.
Fix tokenization for Hungarian given the new handling of prefixes
and suffixes.
* Match a wider range of URI schemes
2019-10-05 13:00:09 +02:00
Ines Montani
e65dffd80b
Clarify serialization of extension attributes (closes #4377) [ci skip]
2019-10-05 11:58:00 +02:00
Ines Montani
fec9433044
Make PhraseMatcher.vocab consistent with Matcher.vocab (closes #4373)
2019-10-04 12:18:41 +02:00
Ines Montani
e7ddc6f662
Add conda install for lookups [ci skip]
2019-10-03 17:52:53 +02:00
Matthew Honnibal
37ef874d8b
Set version to v2.2.1
2019-10-03 14:50:39 +02:00
Sofie Van Landeghem
4e7259c6cf
Bugfix initializing DocBin with attributes (#4368)
* docbin init fix + documentation fix + unit tests
* newline
* try with zlib instead of gzip (python 2 incompatibilities)
2019-10-03 14:48:45 +02:00
Ines Montani
ce1d441de5
Add docs for Vectors.most_similar [ci skip]
2019-10-03 14:29:47 +02:00
Ben Taylor
1db79a33cb
most_similar() returns the k most similar vectors (#4364)
* most_similar return n-most similar vectors
* updated most_similar comment
* add bintay contributor agreement
* sign bintay contributor agreement
* fix most_similar documentation typo
* fixed error in prune_vectors
* updated prune_vectors test
2019-10-03 14:09:44 +02:00
Ines Montani
4159936720
Update README.md [ci skip]
2019-10-02 19:15:22 +02:00
Ines Montani
e4782feae9
Update README.md [ci skip]
2019-10-02 18:49:55 +02:00
Ines Montani
80cf385f65
Update v2-2.md [ci skip]
2019-10-02 16:58:21 +02:00
Ines Montani
f8e606c303
Update README.md [ci skip]
2019-10-02 16:47:10 +02:00
Ines Montani
12a941d841
Update binder version [ci skip]
2019-10-02 16:47:01 +02:00
Matthew Honnibal
2eb31012e7
Set version to v2.2.0
2019-10-02 14:40:06 +02:00
Matthew Honnibal
796072e560
Set version to v2.2.0.dev19
2019-10-02 12:51:29 +02:00
Sofie Van Landeghem
9d3ce7cba2
Ensure training doesn't crash with empty batches (#4360)
* unit test for previously resolved unflatten issue
* prevent a batch of empty docs from causing problems
2019-10-02 12:50:47 +02:00
Ines Montani
52b5912dbf
Tidy up [ci skip]
2019-10-02 12:05:59 +02:00
adrianeboyd
d82241218a
Make the default NER labels less model-specific [ci skip] (#4361)
2019-10-02 12:05:17 +02:00
adrianeboyd
dda86118bd
Update Ukrainian lemmatizer with new lookups (#4359)
* Update Ukrainian lemmatizer with new lookups
* Add missing import
Co-authored-by: Ines Montani <ines@ines.io>
2019-10-02 12:04:06 +02:00
Ines Montani
b6670bf0c2
Use consistent spelling
2019-10-02 10:37:39 +02:00
Ines Montani
208629615d
Auto-format
2019-10-02 10:37:04 +02:00
Ines Montani
867e93aae2
Add Streamlit example [ci skip]
2019-10-02 01:21:20 +02:00
Matthew Honnibal
38b6e69389
Merge branch 'master' of https://github.com/explosion/spaCy
2019-10-01 22:28:25 +02:00
Matthew Honnibal
d4b63bb6dd
Set version to v2.2.0
2019-10-01 22:28:13 +02:00
Ines Montani
9885b5ae68
Update spacy_lookups_data version [ci skip]
2019-10-01 22:21:21 +02:00
Ines Montani
475e3188ce
Add docs on filtering overlapping spans for merging (resolves #4352) [ci skip]
2019-10-01 21:59:50 +02:00
Matthew Honnibal
667f294627
Merge branch 'master' of https://github.com/explosion/spaCy
2019-10-01 21:37:25 +02:00
Ines Montani
0dd127bb00
Update v2-2.md [ci skip]
2019-10-01 21:37:06 +02:00
Matthew Honnibal
64a9577d43
Set version to v2.2.0.dev17
2019-10-01 21:36:59 +02:00
Ines Montani
cf65a80f36
Refactor lemmatizer and data table integration (#4353)
* Move test
* Allow default in Lookups.get_table
* Start with blank tables in Lookups.from_bytes
* Refactor lemmatizer to hold instance of Lookups
* Get lookups table within the lemmatization methods to make sure it references the correct table (even if the table was replaced or modified, e.g. when loading a model from disk)
* Deprecate other arguments on Lemmatizer.__init__ and expect Lookups for consistency
* Remove old and unsupported Lemmatizer.load classmethod
* Refactor language-specific lemmatizers to inherit as much as possible from base class and override only what they need
* Update tests and docs
* Fix more tests
* Fix lemmatizer
* Upgrade pytest to try and fix weird CI errors
* Try pytest 4.6.5
2019-10-01 21:36:03 +02:00
Ines Montani
3297a19545
Warn in Tagger.begin_training if no lemma tables are available (#4351)
2019-10-01 15:13:55 +02:00
Ines Montani
bc7e7db208
Fix wording [ci skip]
2019-10-01 14:20:44 +02:00
Ines Montani
2a3a4565cd
Update infobox [ci skip]
2019-10-01 14:19:34 +02:00
Ines Montani
66aa0d479f
Update v2.2 page [ci skip]
2019-10-01 14:11:05 +02:00
Ines Montani
a8a1800f2a
Update lemma data documentation [ci skip]
2019-10-01 13:22:13 +02:00
Ines Montani
932ad9cb91
Fix typos and formatting [ci skip]
2019-10-01 12:30:04 +02:00
Ines Montani
ca0b20ae8b
Make prereleases less verbose [ci skip]
2019-10-01 12:29:14 +02:00
Matthew Honnibal
2fb05482dd
Set version to v2.2.0
2019-10-01 03:50:13 +02:00