Adriane Boyd
81d3a1edb1
Use tokenizer URL_MATCH pattern in LIKE_URL (#8765)
2021-07-27 12:07:01 +02:00
Adriane Boyd
4f28190afe
Merge pull request #8813 from adrianeboyd/chore/develop-v3.2
...
Update develop for v3.2
2021-07-27 11:26:18 +02:00
Ines Montani
7f21c7dfa2
Merge pull request #8794 from explosion/autoblack
...
Auto-format code with black
2021-07-27 12:17:15 +10:00
Ines Montani
34c401f04f
Merge pull request #8801 from polm/fix/respect-no-skip (fixes #8796)
...
Respect the no_skip value
2021-07-27 12:16:47 +10:00
Ines Montani
cf3855ae05
Merge pull request #8806 from Ledenel/master [ci skip]
...
fix typo
2021-07-27 12:15:44 +10:00
Ines Montani
5c762e08d7
Merge pull request #8808 from kevinlu1248/master [ci skip]
...
Changed a CLI command in data-formats.md due to erroneous information
2021-07-27 12:15:35 +10:00
Ines Montani
134cb06af3
Merge pull request #8808 from kevinlu1248/master [ci skip]
...
Changed a CLI command in data-formats.md due to erroneous information
2021-07-27 12:15:16 +10:00
Ines Montani
9bf0d6f2fd
Merge pull request #8806 from Ledenel/master [ci skip]
...
fix typo
2021-07-27 12:14:22 +10:00
Kevin Lu
4a8e9e4e4e
Update data-formats.md
2021-07-25 22:58:53 -07:00
Ledenel
413f745c68
fix broken example in spaCy universe Chatterbot
2021-07-25 15:53:32 +00:00
Paul O'Leary McCann
284b530c63
Respect the no_skip value
...
Seems like the logic for this was just left out. See #8796.
2021-07-24 15:31:17 +09:00
explosion-bot
a58ab6ea22
Auto-format code with black
2021-07-23 08:04:09 +00:00
Adriane Boyd
6bbc2b1956
Reload train corpus in debug data after initialize (#8776)
2021-07-21 22:38:40 +02:00
svlandeg
f4f270940c
Merge remote-tracking branch 'upstream/master' into spacy.io
2021-07-20 16:14:16 +02:00
Adriane Boyd
d48c01a6f7
Remove extraneous grc test file (#8768)
2021-07-20 15:51:15 +02:00
Sofie Van Landeghem
ffaead8fe0
bump to 3.1.1
2021-07-19 14:48:27 +02:00
Sofie Van Landeghem
83e27d262e
negative tag annotation (#8731)
...
* unit test to unlearn tag via negative annotation
* bump thinc to 8.0.8
2021-07-19 14:39:11 +02:00
Adriane Boyd
0e4b96c97e
Update lexeme ranks for loaded vectors (#8640)
...
Update the ranks for any lexemes that have been added to the vocab
before the vectors are added to the model.
2021-07-19 18:25:54 +10:00
Adriane Boyd
e532c69475
Update Language.replace_pipe for disabled components (#8729)
...
* Fix the index where the replacement is inserted to account for
disabled components
* Allow `Language.replace_pipe` to replace disabled components
2021-07-19 18:06:12 +10:00
Kenneth Enevoldsen
2880ae70b0
removed outdated spacy version for spacymoji
...
From the documentation of spacymoji (and the requirements.txt) it seems like it is not only for version 2.
2021-07-18 19:19:55 +09:00
Kenneth Enevoldsen
812746464b
fixed GitHub link and thumbnail
...
Sorry, I seem to have misunderstood that the GitHub reference shouldn't be a link.
2021-07-18 19:19:37 +09:00
Paul O'Leary McCann
d717593eb7
Merge pull request #8754 from KennethEnevoldsen/patch-1
...
[minor] removed outdated spacy version for spacymoji
2021-07-18 19:17:33 +09:00
Paul O'Leary McCann
ac67639eaf
Merge pull request #8755 from KennethEnevoldsen/patch-2
...
fixed GitHub link and thumbnail
2021-07-18 19:14:57 +09:00
Kenneth Enevoldsen
5d6aed0773
fixed GitHub link and thumbnail
...
Sorry, I seem to have misunderstood that the GitHub reference shouldn't be a link.
2021-07-18 10:22:00 +02:00
Ines Montani
f90482d077
Tidy up and auto-format
2021-07-18 15:44:56 +10:00
Ines Montani
98cf872e11
Fix JSON [ci skip]
2021-07-18 13:21:43 +10:00
Ines Montani
313f55e560
Fix JSON [ci skip]
2021-07-18 13:21:33 +10:00
Ines Montani
a792e1119f
Merge pull request #8702 from KennethEnevoldsen/master [ci skip]
2021-07-18 13:19:09 +10:00
Ines Montani
51e5903d6f
Merge pull request #8702 from KennethEnevoldsen/master [ci skip]
2021-07-18 13:18:42 +10:00
Kenneth Enevoldsen
8546948fba
removed outdated spacy version for spacymoji
...
From the documentation of spacymoji (and the requirements.txt) it seems like it is not only for version 2.
2021-07-17 15:19:43 +02:00
Kenneth Enevoldsen
a0e0ccdb46
Update website/meta/universe.json
...
Co-authored-by: Ines Montani <ines@ines.io>
2021-07-17 07:14:46 +02:00
Ines Montani
c0f436efbc
Merge pull request #8735 from explosion/autoblack
2021-07-17 13:46:17 +10:00
Ines Montani
483f3175cb
Tidy up [ci skip]
2021-07-17 13:43:15 +10:00
Ines Montani
15e6578f7d
Adjust formatting
2021-07-17 10:49:13 +10:00
Mario Šaško
47c5a63a83
Add TakeLab/spacy-udpipe to Universe (#8698)
...
* Add TakeLab/spacy-udpipe to universe
* Add SCA
* Sign SCA
2021-07-16 11:18:09 +02:00
Mario Šaško
1ba2e8a646
Add TakeLab/spacy-udpipe to Universe (#8698)
...
* Add TakeLab/spacy-udpipe to universe
* Add SCA
* Sign SCA
2021-07-16 11:15:52 +02:00
explosion-bot
eff3d1088b
Auto-format code with black
2021-07-16 08:03:36 +00:00
Adriane Boyd
e76e2addd1
Remove TrainablePipe as base class for Lemmatizer in API docs (#8725)
2021-07-15 16:42:14 +02:00
Adriane Boyd
f5acc48111
Remove TrainablePipe as base class for Lemmatizer in API docs (#8725)
2021-07-15 16:41:36 +02:00
Adriane Boyd
ac45c7c045
Add pre-commit to ignored requirements (#8728)
2021-07-15 16:41:15 +02:00
jmyerston
993b0fab0e
Added ancient Greek language support (#8606)
...
* Add ancient Greek language support
Initial commit
* Contributor Agreement
* grc tokenizer test added and files formatted with black, unnecessary import removed
Co-Authored-By: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Commas in lists fixed. __init__.py added to test
* Update lex_attrs.py
* Update stop_words.py
* Update stop_words.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-07-15 10:27:17 +02:00
Sofie Van Landeghem
77859beb99
spacy.ngram_range_suggester.v1 (#8699)
2021-07-15 10:01:22 +02:00
Julien Rossi
e117573822
Adding noun_chunks to the DUTCH language model (nl) (#8529)
...
* ✨ implement noun_chunks for dutch language
* copy/paste FR and SV syntax iterators to accommodate UD tags
* added tests with dutch text
* signed contributor agreement
* 🐛 fix noun chunks generator
* built from scratch
* define noun chunk as a single Noun-Phrase
* includes some corner cases debugging (incorrect POS tagging)
* test with provided annotated sample (POS, DEP)
* ✅ fix failing test
* CI pipeline did not like the added sample file
* add the sample as a pytest fixture
* Update spacy/lang/nl/syntax_iterators.py
* Update spacy/lang/nl/syntax_iterators.py
Code readability
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update spacy/tests/lang/nl/test_noun_chunks.py
correct comment
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* finalize code
* change "if next_word" into "if next_word is not None"
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-07-14 14:01:02 +02:00
Ines Montani
8ca6c58625
Merge pull request #8703 from thomashacker/update/spacy-stanza [ci skip]
...
Update spacy-stanza universe.json
2021-07-13 19:03:56 +10:00
Ines Montani
2a8eeed5da
Merge pull request #8703 from thomashacker/update/spacy-stanza [ci skip]
...
Update spacy-stanza universe.json
2021-07-13 19:03:42 +10:00
thomashacker
aafb89df78
Update universe.json code_example
2021-07-13 10:22:49 +02:00
KennethEnevoldsen
e5127992a0
added agreement
2021-07-13 10:11:02 +02:00
Kenneth Enevoldsen
94ce904e10
added missing comma
2021-07-13 09:59:34 +02:00
Kenneth Enevoldsen
a81fcc81b0
added dacy to universe
2021-07-13 09:54:08 +02:00
Adriane Boyd
f9fd2889b7
Use 0-vector for OOV lexemes (#8639)
2021-07-13 14:48:12 +10:00