Frederic R. Hopp
78bb4275d5
Update universe.json
...
Added more detailed description to eMFDscore project
2021-05-14 19:59:34 +09:00
Frederic R. Hopp
93d9860cba
Update universe.json
2021-05-14 19:59:33 +09:00
Julien Salinas
c496f78245
Add NLP Cloud to Universe.
2021-05-14 11:13:44 +02:00
Julien Salinas
a176d2209a
Sign contributors agreement.
2021-05-14 11:00:27 +02:00
Paul O'Leary McCann
2dc6db53fd
Merge pull request #8072 from medianeuroscience/master
...
Added eMFDscore to universe.json
2021-05-14 11:58:30 +09:00
Frederic R. Hopp
c5962b9fba
Update universe.json
...
fixed typo
2021-05-13 07:40:05 -07:00
Frederic R. Hopp
a9ca221e03
Update universe.json
...
Added more detailed description to eMFDscore project
2021-05-12 09:20:17 -07:00
svlandeg
235e9f5488
call replace_listener_cfg attr if it's available
2021-05-12 17:19:38 +02:00
svlandeg
44a3a58599
call replace_listener attr if it's available
2021-05-12 16:01:02 +02:00
svlandeg
ece8be4fec
extend test to training with replaced tok2vec layer
2021-05-12 11:32:22 +02:00
Frederic R. Hopp
7bba9cdc14
Update universe.json
2021-05-11 19:18:19 -07:00
Adriane Boyd
d5bbd1f94f
Handle partial entities in Span.as_doc (#8055)
...
* Handle partial entities in Span.as_doc
In `Span.as_doc` replace partial entities at the beginning or end of the
span with missing entity annotation.
Fixes a bug where invalid entity annotation (no initial `B`) was
returned for an initial partial entity.
* Check for empty span in ents conversion
Note: `Span.as_doc()` will still fail on an empty span due to failures
in `Span.vector`.
2021-05-11 17:10:16 +02:00
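To illustrate the `Span.as_doc` change described above, here is a minimal sketch. It is not the code from #8055: it works on plain lists of IOB tags instead of spaCy objects, and the function name `slice_iob` and the empty-string `MISSING` marker are assumptions made for this example.

```python
from typing import List

MISSING = ""  # hypothetical marker for "no entity annotation"


def slice_iob(tags: List[str], start: int, end: int) -> List[str]:
    """Slice IOB tags and blank out entities cut by the slice boundaries."""
    sliced = list(tags[start:end])
    # Leading partial entity: the slice begins with "I" tags that have no "B".
    i = 0
    while i < len(sliced) and sliced[i] == "I":
        sliced[i] = MISSING
        i += 1
    # Trailing partial entity: the entity continues past the end of the slice.
    if end < len(tags) and tags[end] == "I":
        j = len(sliced) - 1
        while j >= 0 and sliced[j] == "I":
            sliced[j] = MISSING
            j -= 1
        if j >= 0 and sliced[j] == "B":
            sliced[j] = MISSING
    return sliced


tags = ["B", "I", "I", "O", "B", "I", "I"]
print(slice_iob(tags, 1, 6))
```

In this toy version an empty slice simply returns an empty list; the note above about `Span.as_doc()` failing on empty spans concerns `Span.vector`, which is not modelled here.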
Ines Montani
77ee7c872b
Fix default transformer in quickstart generator (resolves #8018) [ci skip]
2021-05-11 11:27:30 +10:00
Ines Montani
3883d49446
Fix default transformer in quickstart generator (resolves #8018) [ci skip]
2021-05-11 11:27:08 +10:00
Paul O'Leary McCann
bdeaf3a18b
Fix/fix en ordinals (#8028)
...
* Fix #8019
"th" is not the only ordinal ending.
* Add some more ordinal tests
2021-05-07 10:26:42 +02:00
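The ordinal fix above ("th" is not the only ordinal ending) can be sketched as follows. This is an illustrative check under stated assumptions, not necessarily the code merged in #8028; the names `ORDINAL_SUFFIXES` and `looks_like_ordinal` are hypothetical.

```python
ORDINAL_SUFFIXES = ("st", "nd", "rd", "th")  # assumed set of English endings


def looks_like_ordinal(text: str) -> bool:
    """Return True for digit strings followed by an English ordinal ending."""
    lowered = text.lower()
    return lowered.endswith(ORDINAL_SUFFIXES) and lowered[:-2].isdigit()


for token in ["1st", "22nd", "3rd", "10th", "first", "th"]:
    print(token, looks_like_ordinal(token))
```

The check is deliberately loose: it does not verify that the ending agrees with the number, so a string such as "1th" is also accepted.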
Adriane Boyd
40ca23bde0
Fix new version for match_alignments (#8021)
2021-05-07 09:56:22 +02:00
Adriane Boyd
71c2a3ab47
Fix new version for match_alignments (#8021)
2021-05-07 09:55:20 +02:00
Jeno Pizarro
7cc8df1a28
Update negspacy example code for spaCy 3.0 (#8022)
2021-05-07 09:35:07 +02:00
Jeno Pizarro
5cf76ab608
Update negspacy example code for spaCy 3.0 (#8022)
2021-05-07 09:33:21 +02:00
Adriane Boyd
6788d90f61
Preserve existing ENT_KB_ID annotation in NER (#7988)
...
* Preserve existing ENT_KB_ID annotation in NER
Preserve `ent_kb_id` annotation on existing entity spans, which is not
preserved by the transition system.
* Simplify kb_id assignment
* Simplify further
2021-05-06 18:49:55 +10:00
Sofie Van Landeghem
02a6a5fea0
Fix 'debug model' for transformers + generalize (#7973)
...
* add overrides to docs
* fix debug model with transformer
* assume training data is set in config
2021-05-06 18:43:32 +10:00
Adriane Boyd
cc5aeaed29
Add Chinese PTB tags to glossary (#7993)
2021-05-06 18:43:03 +10:00
Adriane Boyd
0a22fed634
Fix span offsets for Matcher(as_spans) on spans (#7992)
...
Fix returned span offsets for `Matcher(as_spans=True)(span)`.
2021-05-06 18:42:44 +10:00
Adriane Boyd
7d5db41ac3
Skip vector ngram backoff if minn is not set (#7925)
2021-05-06 18:34:35 +10:00
Sofie Van Landeghem
e9037d8fc0
make EntityLinker robust for nO=None (#7930)
2021-05-06 18:14:47 +10:00
Paul O'Leary McCann
66bfabd839
Fix pretraining objectives fragment (#8005)
...
* Fix pretraining objectives fragment
The fragment here is reused from a heading higher up, so you couldn't
link to this section.
* Fix section link to new fragment
2021-05-06 08:27:36 +02:00
Adriane Boyd
a71194362f
Fix Doc.from_docs for all empty docs (#8009)
2021-05-05 18:44:14 +02:00
meghanabhange
46311cf03f
Update details in universe denomme | Multilingual Name Detection (#7982)
...
* Add denomme
* spaCy contributor agreement
* Update install and thumb
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2021-05-05 17:14:14 +02:00
meghanabhange
debaab7021
Update details in universe denomme | Multilingual Name Detection (#7982)
...
* Add denomme
* spaCy contributor agreement
* Update install and thumb
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2021-05-05 17:12:13 +02:00
Adriane Boyd
31528f62ed
Add / to nb infixes (#7991)
2021-05-04 11:00:10 +02:00
Santiago Castro
e99ff6f255
Fix typo in Language docstrings (#7958)
2021-05-03 14:44:09 +02:00
Ines Montani
62a01956c3
Fix quickstart default checked of conditional fields [ci skip]
2021-05-03 14:04:45 +02:00
Ines Montani
12d3d0fedd
Fix quickstart default checked of conditional fields [ci skip]
2021-05-03 11:48:12 +10:00
Adriane Boyd
ffaa0d6b9b
Fix Transformer.initialize example (#7963)
2021-04-30 12:21:59 +02:00
Adriane Boyd
2320791f6d
Fix Transformer.initialize example (#7963)
2021-04-30 12:21:31 +02:00
Adriane Boyd
cf032ec31e
Update to catalogue>=2.0.4 (#7951)
2021-04-29 19:11:28 +02:00
Adriane Boyd
7cf5bd072f
Refactor util.to_ternary_int (#7944)
...
* Refactor to avoid literal comparison with `is`
* Extend tests
2021-04-29 16:58:54 +02:00
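For context on the `util.to_ternary_int` refactor above: comparing against a literal with `is` (for example `val is 1`) depends on object identity and triggers a `SyntaxWarning` on Python 3.8 and later. The sketch below shows one way to write such a conversion without it. The exact mapping of values to 1/0/-1 is an assumption made for this example, not taken from the commit.

```python
from typing import Any


def to_ternary_int(val: Any) -> int:
    """Map a value to 1, 0 or -1 (assumed mapping, for illustration only)."""
    # Identity checks are fine for the singletons True, False and None.
    if val is True:
        return 1
    elif val is None:
        return 0
    elif val is False:
        return -1
    # Numbers are compared with ==, not `is`. The False check above has to
    # come first, because False == 0 is True in Python.
    elif val == 1:
        return 1
    elif val == 0:
        return 0
    else:
        return -1


print([to_ternary_int(v) for v in (True, None, False, 1, 1.0, 0, 0.0, 2, "x")])
```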
Sevdimali
49aed683cc
Azerbaijani language added (#7911)
2021-04-28 14:42:02 +02:00
Adriane Boyd
f4080983ea
Extend to cupy 9.0.0 (#7914)
2021-04-28 10:18:24 +02:00
Paul O'Leary McCann
8007d5c814
Check if the resume path points to a directory (#7919)
...
This came up in #7878: if `--resume-path` is a directory, loading the
weights will fail. On Linux this gives a straightforward error message,
but on Windows it gives "Permission Denied", which is confusing.
2021-04-28 09:17:15 +02:00
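The resume-path check described above could look roughly like the sketch below. The function name `check_resume_path` and the error text are hypothetical; the actual change in #7919 may be structured differently.

```python
from pathlib import Path
from typing import Optional, Union


def check_resume_path(resume_path: Optional[Union[str, Path]]) -> None:
    """Fail early with a clear message if the resume path is a directory."""
    if resume_path is None:
        return
    if Path(resume_path).is_dir():
        # Raising here avoids the platform-specific error (for example
        # "Permission Denied" on Windows) that comes from opening a
        # directory as if it were a weights file.
        raise ValueError(
            f"--resume-path should point to a weights file, "
            f"but got a directory: {resume_path}"
        )


check_resume_path(None)  # nothing to validate
```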
Paul O'Leary McCann
de6b5ed14d
Fix percent unk display in debug data (#7886)
...
* Fix percent unk display
This was showing (ratio %), so 10% would show as 0.10%. Fix by
multiplying the ratio by 100.
Might want to add a warning if this is over a threshold.
* Only show whole-integer percents
2021-04-27 09:16:35 +02:00
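The percent display bug described above comes down to formatting a ratio as if it were already a percentage. A minimal sketch, with hypothetical function names and not the actual `debug data` code:

```python
def format_percent_buggy(ratio: float) -> str:
    """Old behaviour: prints the raw ratio followed by a percent sign."""
    return f"{ratio:.2f}%"


def format_percent(ratio: float) -> str:
    """Fixed behaviour: scale to a percentage and show whole integers only."""
    return f"{round(ratio * 100)}%"


print(format_percent_buggy(0.10))  # 0.10%
print(format_percent(0.10))        # 10%
```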
Janis Klaise
b33fb9ac1e
Update load_lookups return type and docstring (#7907)
...
* Update load_lookups return type and docstring
* Add contributor agreement
2021-04-27 09:14:59 +02:00
Janis Klaise
1690595e4d
Update load_lookups return type and docstring (#7907)
...
* Update load_lookups return type and docstring
* Add contributor agreement
2021-04-27 09:13:39 +02:00
Adriane Boyd
946a4284be
Set spacy-legacy to >=3.0.5 (#7897)
...
Set `spacy-legacy` to `>=3.0.5` due to `spacy.StaticVectors.v1` init bug.
2021-04-26 18:25:39 +02:00
Adriane Boyd
874cd02539
Set spacy-legacy to >=3.0.5 (#7897)
...
Set `spacy-legacy` to `>=3.0.5` due to `spacy.StaticVectors.v1` init bug.
2021-04-26 17:06:32 +02:00
Adriane Boyd
ae855a4625
Clean up Morphology imports and definitions (#7441)
...
* Clean up Morphology imports and definitions
* Whitespace formatting
2021-04-26 16:54:23 +02:00
Adriane Boyd
ceee1ecf17
Replace cpdef variables with cdef (#7834)
2021-04-26 16:54:02 +02:00
Adriane Boyd
95c0833656
Add training option to set annotations on update (#7767)
...
* Add training option to set annotations on update
Add a `[training]` option called `set_annotations_on_update` to specify
a list of components for which the predicted annotations should be set
on `example.predicted` immediately after that component has been
updated. The predicted annotations can be accessed by later components
in the pipeline during the processing of the batch in the same `update`
call.
* Rename to annotates / annotating_components
* Add test for `annotating_components` when training from config
* Add documentation
2021-04-26 16:53:53 +02:00
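As a rough illustration of the `[training]` option described above, the sketch below reads an `annotating_components` list from a config fragment. The component name `tagger` is a made-up example, and the standard-library `configparser` is only a stand-in: spaCy loads configs with its own config system.

```python
import configparser
import json

# Hypothetical config fragment; "tagger" is an example component name.
CONFIG_TEXT = """
[training]
annotating_components = ["tagger"]
"""

# Stand-in parser for illustration only, not spaCy's config loader.
parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

# The value is written as a JSON list, so json.loads can decode it.
annotating_components = json.loads(parser["training"]["annotating_components"])
print(annotating_components)
```

With a setting like this, the predictions of the listed components would be set on `example.predicted` during `update`, so later components in the pipeline can use them within the same batch.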
Jacopo Farina
c105ed10fd
Remove torino from stop words (#7634)
...
Torino is the proper name of a city and the token has no other meaning
2021-04-26 16:53:43 +02:00
Sofie Van Landeghem
e0b29f8ef7
Fix scoring normalization (#7629)
...
* fix scoring normalization
* score weights by total sum instead of per component
* cleanup
* more cleanup
2021-04-26 16:53:38 +02:00
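The idea of scoring weights "by total sum instead of per component" can be sketched as below. This is a simplified illustration under assumptions, not the code from #7629: the function name `normalize_by_total` and the score names are hypothetical, and special cases such as disabled (null) weights are ignored.

```python
from typing import Dict, List


def normalize_by_total(
    weights_per_component: List[Dict[str, float]],
) -> Dict[str, float]:
    """Merge per-component score weights and divide by their overall sum."""
    combined: Dict[str, float] = {}
    for weights in weights_per_component:
        combined.update(weights)
    total = sum(combined.values())
    if total == 0:
        return combined
    # One shared denominator, so all weights together sum to 1.0.
    return {name: weight / total for name, weight in combined.items()}


weights = normalize_by_total(
    [{"tag_acc": 1.0}, {"dep_uas": 0.5, "dep_las": 0.5}]
)
print(weights)
```

In this example `tag_acc` ends up at 0.5 and the two parser scores at 0.25 each, because every weight is divided by the same total of 2.0.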