Adriane Boyd
d777d9cc38
Extend v2.3 migration guide (#5653)
* Extend preloaded vocab section
* Add section on tag maps
2020-06-26 14:13:01 +02:00
Adriane Boyd
c4d0209472
Extend v2.3 migration guide (#5653)
* Extend preloaded vocab section
* Add section on tag maps
2020-06-26 14:12:29 +02:00
PluieElectrique
90c7eb0e2f
Reduce memory usage of Lookup's BloomFilter (#5606)
* Reduce memory usage of Lookup's BloomFilter
* Remove extra Table update
2020-06-26 14:09:10 +02:00
Adriane Boyd
b7107ac89f
Disregard special tag _SP in check for new tag map (#5641)
* Skip special tag _SP in check for new tag map
In `Tagger.begin_training()` check for new tags aside from `_SP` in the
new tag map initialized from the provided gold tuples when determining
whether to reinitialize the morphology with the new tag map.
* Simplify _SP check
2020-06-26 09:23:21 +02:00
Ines Montani
5d235fb767
Merge branch 'develop' into feature/project-cli
2020-06-25 12:27:58 +02:00
Ines Montani
01c394eb23
Update to latest Typer and remove hacks
2020-06-25 12:27:19 +02:00
Ines Montani
82a03ee18e
Replace python with sys.executable
2020-06-25 12:26:53 +02:00
Adriane Boyd
a2660bd9c6
Fix backslashes in warnings config diff (#5640)
Fix backslashes in warnings config diff in v2.3 migration section.
2020-06-24 10:26:57 +02:00
Adriane Boyd
fd4287c178
Fix backslashes in warnings config diff (#5640)
Fix backslashes in warnings config diff in v2.3 migration section.
2020-06-24 10:26:12 +02:00
Adriane Boyd
6fe6e761de
Skip vocab in component config overrides (#5624)
2020-06-23 23:21:11 +02:00
Adriane Boyd
4f73ced914
Extend what's new in v2.3 with vocab / is_oov (#5635)
2020-06-23 16:50:43 +02:00
Adriane Boyd
7ce451c211
Extend what's new in v2.3 with vocab / is_oov (#5635)
2020-06-23 16:48:59 +02:00
Adriane Boyd
d94e961f14
Fix polarity of Token.is_oov and Lexeme.is_oov (#5634)
Fix `Token.is_oov` and `Lexeme.is_oov` so they return `True` when the
lexeme does **not** have a vector.
2020-06-23 13:29:51 +02:00
Richard Liaw
0ef78bad93
contribute (#5632)
2020-06-23 08:53:58 +02:00
Ines Montani
8131a65dee
Update __init__.py
2020-06-22 16:09:09 +02:00
Ines Montani
2ad7a02400
Merge branch 'develop' into feature/project-cli
2020-06-22 15:33:11 +02:00
Ines Montani
83b4aa05c9
Merge pull request #5626 from explosion/feature/typer
2020-06-22 06:29:03 -07:00
Ines Montani
0ee6d7a4d1
Remove project stuff from this branch
2020-06-22 14:54:38 +02:00
Ines Montani
a6b76440b7
Update project CLI
2020-06-22 14:53:31 +02:00
Adriane Boyd
fcdecefacf
Add warnings example in v2.3 migration guide (#5627)
2020-06-22 14:38:06 +02:00
Adriane Boyd
bc1cb30b21
Add warnings example in v2.3 migration guide (#5627)
2020-06-22 14:37:24 +02:00
Hiroshi Matsuda
150a39ccca
Japanese model: add user_dict entries and small refactor (#5573)
* user_dict fields: adding inflections, reading_forms, sub_tokens
deleting: unidic_tags
improve code readability around the token alignment procedure
* add test cases, replace fugashi with sudachipy in conftest
* move bunsetu.py to spaCy Universe as a pipeline component BunsetuRecognizer
* tag is space -> both surface and tag are spaces
* consider len(text)==0
2020-06-22 14:32:25 +02:00
Ines Montani
3f2f5f9cb3
Remove ml_datasets from install dependencies
2020-06-22 12:14:51 +02:00
Ines Montani
ea9fd3abcd
Replace plac with typer [ci skip]
2020-06-22 12:04:41 +02:00
Ines Montani
95cc9d657d
Update srsly pin [ci skip]
2020-06-22 11:57:46 +02:00
Ines Montani
34d59b494f
Merge pull request #5619 from explosion/master-tmp
2020-06-22 02:36:08 -07:00
Rameshh
c34420794a
Add Nepali Language (#5622)
* added support for nepali lang
* added examples and test files
* added spacy contributor agreement
2020-06-22 10:25:46 +02:00
Karen Hambardzumyan
66a4834e56
Some changes for Armenian (#5616)
* Fixing numericals
* We need an Armenian question mark to make the sentence a question
2020-06-22 08:50:34 +02:00
Ines Montani
dc5d535659
Tidy up info
2020-06-22 01:17:11 +02:00
Ines Montani
189ed56777
Fix and simplify info
2020-06-22 01:07:48 +02:00
Ines Montani
fca3907d4e
Add correct uppercase variants for boolean flags
2020-06-22 00:57:28 +02:00
Ines Montani
79dd824906
Tidy up
2020-06-22 00:45:40 +02:00
Ines Montani
1e5b4d8524
Fix DVC check
2020-06-22 00:30:05 +02:00
Ines Montani
5ba1df5e78
Update project CLI
2020-06-22 00:15:06 +02:00
Ines Montani
ef5f548fb0
Tidy up and auto-format
2020-06-21 22:38:04 +02:00
Ines Montani
f77e0bc028
Merge branch 'develop' into master-tmp
2020-06-21 22:34:15 +02:00
Ines Montani
40bb918a4c
Remove unicode declarations and tidy up
2020-06-21 22:34:10 +02:00
Ines Montani
e0c16c0577
Update wasabi pin
2020-06-21 22:25:34 +02:00
Ines Montani
275bab62df
Refactor CLI
2020-06-21 21:35:01 +02:00
Ines Montani
c12713a8be
Port CLI to Typer and add project stubs
2020-06-21 13:44:00 +02:00
svlandeg
689600e17d
add additional test back in (it works now)
2020-06-20 23:23:57 +02:00
svlandeg
2f6062a8a4
add line that got removed from EntityLinker
2020-06-20 23:14:45 +02:00
svlandeg
12dc8ab208
remove redundant code from master in EntityLinker
2020-06-20 23:07:42 +02:00
svlandeg
6179774278
fix test_build_dependencies by ignoring new libs
2020-06-20 22:49:37 +02:00
svlandeg
256d4c27c8
fix tagger begin_training being called without examples
2020-06-20 22:38:00 +02:00
svlandeg
5cb812e0ab
fix NER warn empty lookups (cf PR #5588)
2020-06-20 22:04:18 +02:00
svlandeg
c9242e9bf4
fix entity linker (cf PR #5548)
2020-06-20 21:47:23 +02:00
svlandeg
dc069e90b3
fix token.morph_ for v.3 (cf PR #5517)
2020-06-20 21:13:11 +02:00
Ines Montani
988d2a4eda
Add --code-path option to train CLI (#5618)
2020-06-20 18:43:12 +02:00
Ines Montani
5424b70e51
Remove v2 test
2020-06-20 16:18:53 +02:00