Paul O'Leary McCann
d717593eb7
Merge pull request #8754 from KennethEnevoldsen/patch-1
...
[minor] removed outdated spacy version for spacymoji
2021-07-18 19:17:33 +09:00
Paul O'Leary McCann
a4531be099
Add simple mention test
2021-07-18 19:15:32 +09:00
Paul O'Leary McCann
ac67639eaf
Merge pull request #8755 from KennethEnevoldsen/patch-2
...
fixed GitHub link and thumbnail
2021-07-18 19:14:57 +09:00
Kenneth Enevoldsen
5d6aed0773
fixed GitHub link and thumbnail
...
Sorry, I seem to have misunderstood that the GitHub reference shouldn't be a link.
2021-07-18 10:22:00 +02:00
Ines Montani
f90482d077
Tidy up and auto-format
2021-07-18 15:44:56 +10:00
Ines Montani
98cf872e11
Fix JSON [ci skip]
2021-07-18 13:21:43 +10:00
Ines Montani
313f55e560
Fix JSON [ci skip]
2021-07-18 13:21:33 +10:00
Ines Montani
a792e1119f
Merge pull request #8702 from KennethEnevoldsen/master [ci skip]
2021-07-18 13:19:09 +10:00
Ines Montani
51e5903d6f
Merge pull request #8702 from KennethEnevoldsen/master [ci skip]
2021-07-18 13:18:42 +10:00
Kenneth Enevoldsen
8546948fba
removed outdated spacy version for spacymoji
...
From the documentation of spacymoji (and the requirements.txt) it seems like it is not only for version 2.
2021-07-17 15:19:43 +02:00
Kenneth Enevoldsen
a0e0ccdb46
Update website/meta/universe.json
...
Co-authored-by: Ines Montani <ines@ines.io>
2021-07-17 07:14:46 +02:00
Ines Montani
c0f436efbc
Merge pull request #8735 from explosion/autoblack
2021-07-17 13:46:17 +10:00
Ines Montani
483f3175cb
Tidy up [ci skip]
2021-07-17 13:43:15 +10:00
Ines Montani
15e6578f7d
Adjust formatting
2021-07-17 10:49:13 +10:00
Mario Šaško
47c5a63a83
Add TakeLab/spacy-udpipe to Universe (#8698)
...
* Add TakeLab/spacy-udpipe to universe
* Add SCA
* Sign SCA
2021-07-16 11:18:09 +02:00
Mario Šaško
1ba2e8a646
Add TakeLab/spacy-udpipe to Universe (#8698)
...
* Add TakeLab/spacy-udpipe to universe
* Add SCA
* Sign SCA
2021-07-16 11:15:52 +02:00
explosion-bot
eff3d1088b
Auto-format code with black
2021-07-16 08:03:36 +00:00
Adriane Boyd
e76e2addd1
Remove TrainablePipe as base class for Lemmatizer in API docs (#8725)
2021-07-15 16:42:14 +02:00
Adriane Boyd
f5acc48111
Remove TrainablePipe as base class for Lemmatizer in API docs (#8725)
2021-07-15 16:41:36 +02:00
Adriane Boyd
ac45c7c045
Add pre-commit to ignored requirements (#8728)
2021-07-15 16:41:15 +02:00
Paul O'Leary McCann
9b63cbb775
Add extract spans import
2021-07-15 18:16:53 +09:00
jmyerston
993b0fab0e
Added ancient Greek language support (#8606)
...
* Add ancient Greek language support
Initial commit
* Contributor Agreement
* grc tokenizer test added and files formatted with black, unnecessary import removed
Co-Authored-By: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Commas in lists fixed. __init__.py added to test
* Update lex_attrs.py
* Update stop_words.py
* Update stop_words.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-07-15 10:27:17 +02:00
Sofie Van Landeghem
77859beb99
spacy.ngram_range_suggester.v1 (#8699)
2021-07-15 10:01:22 +02:00
Julien Rossi
e117573822
Adding noun_chunks to the DUTCH language model (nl) (#8529)
...
* ✨ implement noun_chunks for dutch language
* copy/paste FR and SV syntax iterators to accommodate UD tags
* added tests with dutch text
* signed contributor agreement
* 🐛 fix noun chunks generator
* built from scratch
* define noun chunk as a single Noun-Phrase
* includes some corner cases debugging (incorrect POS tagging)
* test with provided annotated sample (POS, DEP)
* ✅ fix failing test
* CI pipeline did not like the added sample file
* add the sample as a pytest fixture
* Update spacy/lang/nl/syntax_iterators.py
* Update spacy/lang/nl/syntax_iterators.py
Code readability
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update spacy/tests/lang/nl/test_noun_chunks.py
correct comment
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* finalize code
* change "if next_word" into "if next_word is not None"
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-07-14 14:01:02 +02:00
Paul O'Leary McCann
e9626e38c1
Fix serialization test
...
This test was failing not because the thing it was testing wasn't
working, but because of the way span equality works. Span equality
relies on doc equality, and doc equality is object identity, so spans
from different docs will never be equal.
2021-07-14 18:37:34 +09:00
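The equality behaviour described in e9626e38c1 can be sketched with plain Python classes. This is a minimal illustration, not spaCy's implementation: `ToyDoc`, `ToySpan`, and `span_key` are hypothetical names, and the sketch assumes only what the commit message states (span equality relies on doc equality, and doc equality is object identity).

```python
class ToyDoc:
    """Hypothetical stand-in for a doc. It defines no __eq__, so equality
    falls back to object identity."""

    def __init__(self, words):
        self.words = list(words)


class ToySpan:
    """Hypothetical span whose equality requires the very same doc object."""

    def __init__(self, doc, start, end):
        self.doc = doc
        self.start = start
        self.end = end

    def __eq__(self, other):
        if not isinstance(other, ToySpan):
            return NotImplemented
        # Identity check on the doc: two docs with equal contents still differ.
        return self.doc is other.doc and (self.start, self.end) == (other.start, other.end)

    def __hash__(self):
        return hash((id(self.doc), self.start, self.end))


def span_key(span):
    """Doc-independent view of a span, usable for cross-doc comparison."""
    return (span.start, span.end, tuple(span.doc.words[span.start:span.end]))


original = ToyDoc(["She", "said", "hello"])
restored = ToyDoc(original.words)  # simulates a serialization round trip

a = ToySpan(original, 0, 1)
b = ToySpan(restored, 0, 1)

same_doc_equal = a == ToySpan(original, 0, 1)  # same doc object, same offsets
cross_doc_equal = a == b                       # different doc objects
keys_equal = span_key(a) == span_key(b)        # compare offsets and text instead
```

Under these assumptions, a round-trip test has to compare something like `span_key` (offsets and text) rather than the span objects themselves.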
Paul O'Leary McCann
4a9dc00d86
Use relative indices for mentions
...
Was using batch absolute indices to manage mentions, but extract_spans
expects doc-relative ones.
2021-07-14 18:36:18 +09:00
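The index mismatch described in 4a9dc00d86 can be sketched as a conversion from batch-absolute to doc-relative offsets. This is an illustration only: `to_doc_relative` is a hypothetical helper, and it assumes mentions are `(start, end)` token offsets with an exclusive end and that no mention crosses a doc boundary. It does not use or reproduce the actual `extract_spans` call.

```python
from itertools import accumulate


def to_doc_relative(mentions, doc_lengths):
    """Convert (start, end) offsets counted across a whole batch into
    (doc_index, start, end) offsets counted from the start of each doc."""
    # offsets[i] is the batch position of the first token of doc i.
    offsets = [0] + list(accumulate(doc_lengths))
    relative = []
    for start, end in mentions:
        for doc_index, length in enumerate(doc_lengths):
            doc_start = offsets[doc_index]
            doc_end = doc_start + length
            if doc_start <= start < doc_end:
                if end > doc_end:
                    raise ValueError(f"mention {(start, end)} crosses a doc boundary")
                relative.append((doc_index, start - doc_start, end - doc_start))
                break
        else:
            raise ValueError(f"mention {(start, end)} is outside the batch")
    return relative


# A batch of two docs with 4 and 3 tokens.
doc_lengths = [4, 3]
batch_mentions = [(0, 2), (3, 4), (4, 6), (6, 7)]
relative_mentions = to_doc_relative(batch_mentions, doc_lengths)
# relative_mentions == [(0, 0, 2), (0, 3, 4), (1, 0, 2), (1, 2, 3)]
```

For the first doc the two index schemes agree, which is one way a bug of this kind can go unnoticed with single-doc batches.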
Paul O'Leary McCann
3684f7fdfd
Remove comment from fixed test
2021-07-14 18:22:14 +09:00
Paul O'Leary McCann
f1796e4af7
Fix mention list bug
...
There was an off-by-one error in how mentions are generated that would
affect mentions at the end of a sentence. This was pretty nasty.
2021-07-14 18:19:00 +09:00
Ines Montani
8ca6c58625
Merge pull request #8703 from thomashacker/update/spacy-stanza [ci skip]
...
Update spacy-stanza universe.json
2021-07-13 19:03:56 +10:00
Ines Montani
2a8eeed5da
Merge pull request #8703 from thomashacker/update/spacy-stanza [ci skip]
...
Update spacy-stanza universe.json
2021-07-13 19:03:42 +10:00
thomashacker
aafb89df78
Update universe.json code_example
2021-07-13 10:22:49 +02:00
KennethEnevoldsen
e5127992a0
added agreement
2021-07-13 10:11:02 +02:00
Kenneth Enevoldsen
94ce904e10
added missing comma
2021-07-13 09:59:34 +02:00
Kenneth Enevoldsen
a81fcc81b0
added dacy to universe
2021-07-13 09:54:08 +02:00
Adriane Boyd
f9fd2889b7
Use 0-vector for OOV lexemes (#8639)
2021-07-13 14:48:12 +10:00
Edward
8233359225
Fix preservation of spacy package meta (#8663)
...
* update package meta with existing_meta and nlp_meta
* Add spaCy contributor agreement
* Added more info when creating readme
2021-07-12 11:18:52 +02:00
Paul O'Leary McCann
80a17071d3
Remove unused code
2021-07-11 18:46:39 +09:00
Paul O'Leary McCann
447c7070e3
Fix loss
...
Accidentally deleted it
2021-07-10 22:45:25 +09:00
Paul O'Leary McCann
c25ec292a9
Cleanup
2021-07-10 22:42:55 +09:00
Paul O'Leary McCann
e00bd422d9
Fix span embeds
...
Some of the lengths and backprop weren't right.
Also various cleanup.
2021-07-10 21:38:53 +09:00
Paul O'Leary McCann
d7d317a1b5
Clean up span embedding code
...
This is now cleaner and significantly faster. There are still some messy
parts in the code (particularly variable names); will get to those later.
2021-07-10 19:59:08 +09:00
Paul O'Leary McCann
dc1f974d39
Merge branch 'master' into feature/coref
2021-07-10 18:10:40 +09:00
Paul O'Leary McCann
f34915c1e8
Use scatter_add to speed up span embed backprop
...
This was the slowest part of the code, and using scatter_add here
probably reduces the runtime by 50%.
2021-07-10 18:08:51 +09:00
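The scatter-add operation named in f34915c1e8 can be sketched in pure Python to show its semantics. This is not the code from the commit: the `scatter_add` function below is a hypothetical loop-based version, and the sketch assumes, purely for illustration, that each span embedding is the sum of its token vectors, so each span's gradient flows unchanged to every token it covers.

```python
def scatter_add(table, indices, values):
    """Add values[i] to table[indices[i]] in place.

    Repeated indices accumulate, which is what backprop needs when one
    token contributes to several spans.
    """
    for row, value in zip(indices, values):
        for col, v in enumerate(value):
            table[row][col] += v
    return table


# Three tokens, two spans: span 0 covers tokens 0-1, span 1 covers tokens 1-2.
span_token_indices = [0, 1, 1, 2]

# Gradient with respect to each span embedding (width 2).
d_spans = [[1.0, 2.0], [10.0, 20.0]]

# One gradient row per (span, token) pair, aligned with span_token_indices.
d_rows = [d_spans[0], d_spans[0], d_spans[1], d_spans[1]]

# Gradient with respect to the token vectors, accumulated in place.
d_tokens = [[0.0, 0.0] for _ in range(3)]
scatter_add(d_tokens, span_token_indices, d_rows)
# d_tokens == [[1.0, 2.0], [11.0, 22.0], [10.0, 20.0]]
```

Token 1 belongs to both spans, so it receives the sum of both gradients. The loop above only shows the semantics; any speedup comes from doing the same accumulation as a single vectorized call in an array library (NumPy's `numpy.add.at` is one such operation).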
Paul O'Leary McCann
1c70c87daf
Fix autoblack
...
The conditional needs double equals.
2021-07-10 16:02:39 +09:00
Ines Montani
616f4de034
Merge pull request #8674 from polm/fix/autoblack-no-forks [ci skip]
...
Make the autoblack job not run on forks
2021-07-10 16:41:59 +10:00
Paul O'Leary McCann
b8cdbb4bb6
Make the autoblack job not run on forks
...
The autoblack job is an occasional cleanup job. If it runs on forks and
those PRs are accepted, the git history will be weird, and that doesn't
help anyone.
The way to make the job not run on forks is a little non-obvious, but it is
based on this thread:
https://github.com/prisma/prisma/issues/3539
2021-07-10 15:38:20 +09:00
Ines Montani
d8ae5750a6
Merge pull request #8665 from rynoV/patch-1 [ci skip]
2021-07-10 10:52:39 +10:00
Ines Montani
d4fecdfb82
Merge pull request #8665 from rynoV/patch-1 [ci skip]
2021-07-10 10:52:15 +10:00
Ines Montani
50000d37e4
Avoid double parentheses [ci skip]
2021-07-10 10:52:01 +10:00
Calum Sieppert
e2d53aa1a6
Typo fixes
2021-07-09 10:25:56 -06:00