Adriane Boyd
2e91e07388
Update cythonize
2022-10-18 16:06:10 +02:00
Adriane Boyd
dca663a2ef
Update CI for v2.x ( #8290 )
...
* Remove Travis CI for python 2.7
* Move download CLI test to separate step
* Switch to ubuntu-18.04
* Remove duplicate CI download tests
* Restrict download test to linux python 3.9
2021-06-07 10:25:36 +02:00
Adriane Boyd
cae72e46dd
Set version to v2.3.7 ( #8289 )
...
* Set version to v2.3.7
* Add download test to CI
2021-06-04 19:33:42 +02:00
Adriane Boyd
22287c89c0
Fix pip args in download CLI ( #8287 )
2021-06-04 19:02:02 +02:00
Adriane Boyd
2c1de4b9a4
Set version to v2.3.6 ( #8117 )
2021-05-17 17:55:19 +02:00
Adriane Boyd
5e7e7cda94
Fix range in Span.get_lca_matrix ( #8115 )
...
Fix the adjusted token index / lca matrix index ranges for
`_get_lca_matrix` for spans.
* The range for `k` should correspond to the adjusted indices in
`lca_matrix` with the `start` indexed at `0`
2021-05-17 16:54:10 +02:00
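The index adjustment described in this commit can be illustrated with a small pure-Python sketch. It is not spaCy's Cython implementation: the `heads` list (absolute head index per token, with the root pointing at itself) and the helper names `ancestors` and `span_lca_matrix` are assumptions made for illustration. The point is that both matrix axes run over `0 .. end - start`, with the span's `start` token mapped to index `0`.

```python
def ancestors(heads, i):
    """Return the chain i, head(i), ... up to the root (the root is its own head)."""
    chain = [i]
    while heads[i] != i:
        i = heads[i]
        chain.append(i)
    return chain


def span_lca_matrix(heads, start, end):
    """Hypothetical helper: LCA matrix for the span [start, end).

    Rows and columns use span-relative indices, so token `start` is index 0.
    An entry is -1 when the lowest common ancestor lies outside the span.
    """
    n = end - start
    matrix = [[-1] * n for _ in range(n)]
    for j in range(n):
        anc_j = ancestors(heads, start + j)
        for k in range(n):  # k ranges over adjusted indices, not document indices
            anc_k = set(ancestors(heads, start + k))
            lca = next((a for a in anc_j if a in anc_k), -1)
            if start <= lca < end:
                matrix[j][k] = lca - start
    return matrix


# Toy tree: token 1 is the root; token 2 attaches to 3; tokens 0, 3, 4 attach to 1.
heads = [1, 1, 3, 1, 1]
print(span_lca_matrix(heads, 2, 5))
```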
Ines Montani
6ce9f0469f
Merge pull request #7261 from adrianeboyd/docs/v2-model-details
...
Limit to v2 models on v2.spacy.io
2021-03-03 23:12:59 +11:00
Adriane Boyd
6ffb395d68
Limit to v2 models on v2.spacy.io
2021-03-03 09:34:45 +01:00
Ines Montani
c70e6ee72d
Fix code branch for v2.x site [ci skip]
2021-02-01 11:48:35 +11:00
Ines Montani
6daf2381fa
Update meta [ci skip]
2021-01-30 20:18:01 +11:00
Ines Montani
fba7550537
Set to legacy [ci skip]
2021-01-30 19:57:14 +11:00
Ines Montani
44dc987d85
Fix icon [ci skip]
2021-01-30 18:27:55 +11:00
Ines Montani
8d293a4c4b
Update website to support legacy state [ci skip]
2021-01-30 18:27:31 +11:00
Ines Montani
8ddf53f8e1
Merge pull request #6857 from tupui/patch-1
2021-01-30 12:07:05 +11:00
Pamphile ROY
e496b8623f
SCA tupui
2021-01-29 15:46:53 +01:00
Pamphile ROY
41ee75ac6d
Remove --no-cache-dir when downloading models
...
When `--no-cache-dir` is present, it prevents caching from functioning properly.
If the user still wants this behavior, they can pass the option through `user_pip_args`.
But options like these should not be enforced. In my case this prevents some Docker builds (using BuildKit caching) from caching models properly.
2021-01-29 15:37:44 +01:00
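A rough sketch of the behavior this commit argues for: the download command no longer forces `--no-cache-dir`, and the flag appears only if the caller supplies it. The function name `build_pip_command` and its parameters are hypothetical and do not reproduce spaCy's actual download code.

```python
import sys


def build_pip_command(download_url, user_pip_args=None):
    """Hypothetical helper: assemble a `pip install` command for a model URL.

    No caching-related flag is added by default; callers opt in via
    `user_pip_args` if they really want to bypass the pip cache.
    """
    cmd = [sys.executable, "-m", "pip", "install"]
    if user_pip_args:
        cmd.extend(user_pip_args)
    cmd.append(download_url)
    return cmd


# Default: pip's cache stays enabled.
print(build_pip_command("https://example.invalid/model.tar.gz"))
# Opt-in: the user explicitly disables the cache.
print(build_pip_command("https://example.invalid/model.tar.gz", ["--no-cache-dir"]))
```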
Adriane Boyd
4096a79de7
Add alignment mode error and fix Doc.char_span docs ( #6820 )
...
* Raise an error on an unrecognized alignment mode rather than
defaulting to `strict`
* Fix the `Doc.char_span` API doc alignment mode details
2021-01-27 23:40:42 +11:00
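The first bullet of this commit amounts to validating the mode up front instead of silently falling back to `strict`. A minimal sketch, assuming the three documented `Doc.char_span` alignment modes (`strict`, `contract`, `expand`); the helper name and the error wording are invented for illustration and are not spaCy's actual error code.

```python
ALIGNMENT_MODES = ("strict", "contract", "expand")


def check_alignment_mode(mode):
    """Hypothetical helper: reject unknown alignment modes instead of
    treating them as "strict"."""
    if mode not in ALIGNMENT_MODES:
        raise ValueError(
            "Unrecognized alignment mode {!r}; expected one of {}".format(
                mode, ", ".join(ALIGNMENT_MODES)
            )
        )
    return mode


check_alignment_mode("expand")  # passes through unchanged
try:
    check_alignment_mode("inside")  # a typo is now reported loudly
except ValueError as err:
    print(err)
```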
Ines Montani
d5ef245bb1
Merge pull request #6822 from jganseman/master [ci skip]
2021-01-27 13:04:30 +11:00
Ines Montani
560b7acece
Merge pull request #6802 from jumasheff/add-ky
2021-01-27 13:02:54 +11:00
jganseman
907bce7a78
Merge pull request #1 from jganseman/patch-1
...
Patch 1
2021-01-26 11:12:30 +01:00
jganseman
8bc57ec372
also update is_oov in lexeme docs
2021-01-26 11:09:16 +01:00
jganseman
c9103d60fa
Create jganseman.md
2021-01-26 11:02:31 +01:00
jganseman
1f2b0ec168
proposing a more concise explanation for is_oov
2021-01-26 10:53:39 +01:00
muratjumashev
2b19ebad59
Remove Kyrgyz chars from char_classes since the Tatar ones already cover them
2021-01-25 00:46:45 +06:00
muratjumashev
7d0154a36e
Added language meta data
2021-01-25 00:42:19 +06:00
muratjumashev
79327197d1
Add contributor agreement
2021-01-25 00:34:12 +06:00
muratjumashev
87168eb81f
Add tests
2021-01-24 20:56:16 +06:00
muratjumashev
53abf759ad
Fix punctuation
2021-01-24 20:54:22 +06:00
muratjumashev
2a2646362b
Fix language subclass
2021-01-23 22:00:50 +06:00
muratjumashev
fe3b5b8ff5
Add kyrgyz to char_classes
2021-01-23 21:53:41 +06:00
muratjumashev
e30bbf5432
Add examples
2021-01-23 21:49:08 +06:00
muratjumashev
2f385385a9
Remove comment
2021-01-23 21:36:28 +06:00
muratjumashev
d53724ba1d
Add lex_attrs
2021-01-23 21:35:25 +06:00
muratjumashev
4418ec2eee
Add punctuation
2021-01-23 21:31:31 +06:00
muratjumashev
101d265778
Add stopwords
2021-01-23 21:25:28 +06:00
muratjumashev
28d06ab860
Add tokenizer_exceptions
2021-01-22 23:08:41 +06:00
Sofie Van Landeghem
5ace559201
ensure span.text works for an empty span ( #6772 )
2021-01-21 23:18:46 +08:00
Sofie Van Landeghem
fdf8c77630
support IS_SENT_START in PhraseMatcher ( #6771 )
...
* support IS_SENT_START in PhraseMatcher
* add unit test and friendlier error
* use IDS.get instead
2021-01-21 09:59:17 +01:00
Adriane Boyd
bc7d83d4be
Skip 0-length matches ( #6759 )
...
Add hack to prevent matcher from returning 0-length matches.
2021-01-19 07:38:11 +08:00
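The effect of skipping 0-length matches can be pictured as a filter over match tuples. This is only an illustration under the assumption that matches are `(match_id, start, end)` tuples with an exclusive `end`; the commit itself applies its workaround inside the matcher, and the function name here is hypothetical.

```python
def drop_empty_matches(matches):
    """Hypothetical helper: keep only matches that cover at least one token."""
    return [(match_id, start, end) for match_id, start, end in matches if end > start]


matches = [(1, 0, 2), (1, 3, 3), (2, 4, 5)]
print(drop_empty_matches(matches))  # the (1, 3, 3) entry spans no tokens
```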
Santiago Castro
28256522c8
Fix spacy.util.minibatch when the size iterator is finished ( #6745 )
2021-01-17 19:48:43 +08:00
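For context, a batching generator that accepts either a fixed size or an iterator of sizes has to cope with the size iterator running out. The sketch below is not the patched `spacy.util.minibatch`; it is one way to end iteration cleanly in that case. Stopping (rather than, say, reusing the last size) is an assumption made for this example.

```python
import itertools


def minibatch(items, size):
    """Sketch: yield batches of `items`; `size` is an int or an iterable of ints.

    Iterating over the sizes with a `for` loop means an exhausted size
    iterator simply ends the generator instead of raising inside it.
    """
    sizes = itertools.repeat(size) if isinstance(size, int) else iter(size)
    items = iter(items)
    for batch_size in sizes:
        batch = list(itertools.islice(items, int(batch_size)))
        if not batch:
            return
        yield batch


print(list(minibatch(range(5), 2)))
print(list(minibatch(range(5), [2, 1])))  # sizes run out before the items do
```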
Adriane Boyd
e649242927
Prevent overlapping noun chunks for Spanish ( #6712 )
...
* Prevent overlapping noun chunks in Spanish noun chunk iterator
* Clean up similar code in Danish noun chunk iterator
2021-01-14 17:33:31 +11:00
Adriane Boyd
9957ed7897
Override language defaults for null token and URL match ( #6705 )
...
* Override language defaults for null token and URL match
When the serialized `token_match` or `url_match` is `None`, override the
language defaults to preserve `None` on deserialization.
* Fix fixtures in tests
2021-01-14 17:31:29 +11:00
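The distinction this commit relies on is between a setting that is absent from the serialized data and one that was explicitly saved as `None`. A minimal sketch of that idea, with a plain dict standing in for the deserialized tokenizer data; the helper `restore_setting` and the placeholder default are hypothetical.

```python
def restore_setting(data, key, language_default):
    """Hypothetical helper: prefer the serialized value, even when it is None.

    Only a missing key falls back to the language default, so an explicit
    `None` survives a save/load round trip.
    """
    if key in data:
        return data[key]
    return language_default


def default_url_match(text):
    # Placeholder for a language-level default matcher.
    return text.startswith("http")


print(restore_setting({"url_match": None}, "url_match", default_url_match))  # None
print(restore_setting({}, "url_match", default_url_match) is default_url_match)  # True
```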
Ines Montani
29c3ca7e34
Fix SVG integration [ci skip]
2021-01-14 13:33:41 +11:00
Antonio Miras
b4bd8f347a
spaCy Universe: New project; SpacyDotNet ( #6702 )
...
* Universe: SpacyDotNet a .NET Core spaCy wrapper
* Signed contributor agreement
Co-authored-by: Antonio Miras <antonio@amiras.net>
2021-01-13 12:47:30 +11:00
Alex Combessie
9cc880014c
Remove questionable French stopwords ( #6310 )
...
* Remove questionable French stopwords
* Create alexcombessie.md
2021-01-08 11:36:22 +11:00
Cristiana S Parada
7a0222f260
Update stop_words.py in Portuguese (a,o,e) ( #6345 )
...
* Update stop_words.py
Added three additional stopwords: "a" and "o", which mean "the", and "e", which means "and"
* Create cristianasp.md
* zero edit to push CI
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-01-08 11:35:38 +11:00
Lorena Ciutacu
f11002f1f1
add new Romanian stopwords ( #6621 )
...
* add contributor agreement
* update ro stopwords list
* add new stopwords
2021-01-08 11:34:47 +11:00
ophelielacroix
e3222fdec9
Add (noun chunks) syntax iterators for Danish ( #6246 )
...
* add syntax iterators for danish
* add test noun chunks for danish syntax iterators
* add contributor agreement
* update da syntax iterators to remove nested chunks
* add tests for da noun chunks
* Fix test
* add missing import
* fix example
* Prevent overlapping noun chunks
Prevent overlapping noun chunks by tracking the end index of the
previous noun chunk span.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2021-01-07 16:33:00 +11:00
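The "track the end index of the previous noun chunk" idea from this commit can be sketched independently of the dependency parse. The example assumes candidate chunks arrive in document order as `(start, end)` token offsets with an exclusive `end`; the function name is hypothetical and the real iterators work on tokens and their left edges rather than on plain tuples.

```python
def non_overlapping_chunks(candidates):
    """Hypothetical helper: drop any candidate that starts before the
    previously emitted chunk has ended."""
    prev_end = -1
    for start, end in candidates:
        if start < prev_end:
            continue  # overlaps the previous chunk, so skip it
        prev_end = end
        yield (start, end)


candidates = [(0, 2), (1, 3), (3, 5), (4, 6)]
print(list(non_overlapping_chunks(candidates)))
```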
Sofie Van Landeghem
6f7e7d88b9
remove cause without apostrophe from norm exceptions ( #6636 )
2021-01-06 12:30:30 +08:00
Sofie Van Landeghem
87562e470d
fix backticks in docs ( #6635 )
2020-12-27 22:12:37 +01:00