Commit Graph

11683 Commits

Author SHA1 Message Date
Álvaro Abella Bascarán
7111b9de2e Fix in docs: pipe(docs) instead of pipe(texts) (#5680)
Very minor fix in docs, specifically in this part:

```
matcher = PhraseMatcher(nlp.vocab)
for doc in matcher.pipe(texts, batch_size=50):
    pass
```

`texts` suggests the input is an iterable of strings. I replaced it with `docs`.
2020-06-30 20:01:12 +02:00
Matthias Hertel
305221f3e5 Website: fixed the token span in the text about the rule-based matching example (#5669)
* fixed token span in pattern matcher example

* contributor agreement
2020-06-30 19:58:55 +02:00
Adriane Boyd
d777d9cc38 Extend v2.3 migration guide (#5653)
* Extend preloaded vocab section

* Add section on tag maps
2020-06-26 14:13:01 +02:00
Adriane Boyd
a2660bd9c6 Fix backslashes in warnings config diff (#5640)
Fix backslashes in warnings config diff in v2.3 migration section.
2020-06-24 10:26:57 +02:00
Adriane Boyd
4f73ced914 Extend what's new in v2.3 with vocab / is_oov (#5635) 2020-06-23 16:50:43 +02:00
Adriane Boyd
fcdecefacf Add warnings example in v2.3 migration guide (#5627) 2020-06-22 14:38:06 +02:00
Adriane Boyd
66889de166 Warning for sudachipy 0.4.5 (#5611) 2020-06-19 13:45:23 +02:00
Ines Montani
959bc616dd Merge branch 'master' into spacy.io 2020-06-16 22:50:11 +02:00
Ines Montani
6d712f3e06 Merge pull request #5599 from adrianeboyd/docs/v2.3.0-minor 2020-06-16 13:49:25 -07:00
Adriane Boyd
02369f91d3 Fix spacy convert argument 2020-06-16 20:41:17 +02:00
Adriane Boyd
f0fd77648f Change example title to Dr.
Change example title to Dr. so the current model does exclude the title
in the initial example.
2020-06-16 20:36:21 +02:00
Adriane Boyd
a6abdfbc3c Fix numpy.zeros() dtype for Doc.from_array 2020-06-16 20:35:45 +02:00
Adriane Boyd
9aff317ca7 Update POS in tagging example 2020-06-16 20:26:57 +02:00
Adriane Boyd
457babfa0c Update alignment example for new gold.align 2020-06-16 20:22:03 +02:00
Ines Montani
19b9ea0436 Fix languages.json 2020-06-16 18:34:11 +02:00
Ines Montani
ed240458f6 Try and upgrade gatsby 2020-06-16 18:28:24 +02:00
Ines Montani
0faabf3325 Merge branch 'master' into spacy.io 2020-06-16 18:13:44 +02:00
Ines Montani
41003a5117 Update Binder version [ci skip] 2020-06-16 17:41:23 +02:00
Ines Montani
19be89b2ce Merge branch 'master' into spacy.io 2020-06-16 17:36:14 +02:00
Ines Montani
fd89f44c0c Update Binder URL [ci skip] 2020-06-16 17:34:26 +02:00
Ines Montani
ec6e35c1c2 Merge branch 'master' into spacy.io 2020-06-16 17:13:49 +02:00
Ines Montani
44af53bdd9 Add pkuseg warnings and auto-format [ci skip] 2020-06-16 17:13:35 +02:00
Ines Montani
ec26180b8f Merge branch 'master' into spacy.io 2020-06-16 16:38:55 +02:00
Ines Montani
a9e5b840ee Fix typos and auto-format [ci skip] 2020-06-16 16:38:45 +02:00
Ines Montani
1d3e8b7578 Merge pull request #5595 from explosion/v2.3.x 2020-06-16 07:37:10 -07:00
Ines Montani
e9d3e177f0 Merge branch 'master' into v2.3.x 2020-06-16 16:31:38 +02:00
Ines Montani
e9711c2f17 Merge branch 'master' into spacy.io 2020-06-16 16:10:28 +02:00
Ines Montani
bb54f54369 Fix model accuracy table [ci skip] 2020-06-16 16:10:12 +02:00
Adriane Boyd
d5110ffbf2 Documentation updates for v2.3.0 (#5593)
* Update website models for v2.3.0

* Add docs for Chinese word segmentation

* Tighten up Chinese docs section

* Merge branch 'master' into docs/v2.3.0 [ci skip]

* Merge branch 'master' into docs/v2.3.0 [ci skip]

* Auto-format and update version

* Update matcher.md

* Update languages and sorting

* Typo in landing page

* Infobox about token_match behavior

* Add meta and basic docs for Japanese

* POS -> TAG in models table

* Add info about lookups for normalization

* Updates to API docs for v2.3

* Update adding norm exceptions for adding languages

* Add --omit-extra-lookups to CLI API docs

* Add initial draft of "What's New in v2.3"

* Add new in v2.3 tags to Chinese and Japanese sections

* Add tokenizer to migration section

* Add new in v2.3 flags to init-model

* Typo

* More what's new in v2.3

Co-authored-by: Ines Montani <ines@ines.io>
2020-06-16 15:37:35 +02:00
Matthew Honnibal
7ff447c5a0 Set version to v2.3.0 2020-06-15 18:22:25 +02:00
Adriane Boyd
0d8405aafa Updates to docstrings (#5589) 2020-06-15 14:58:36 +02:00
Adriane Boyd
e867e9fa8f Fix and add warnings related to spacy-lookups-data (#5588)
* Fix warning message for lemmatization tables

* Add a warning when the `lexeme_norm` table is empty. (Given the
relatively lang-specific loading for `Lookups`, it seemed like too much
overhead to dynamically extract the list of languages, so for now it's
hard-coded.)
2020-06-15 14:58:29 +02:00
Arvind Srinivasan
f698007907 Added Tamil Example Sentences (#5583)
* Added Examples for Tamil Sentences

#### Description
This PR adds example sentences for the Tamil language, which were missing as per issue #1107

#### Type of Change
This is an enhancement.

* Accepting spaCy Contributor Agreement

* Signed on my behalf as an individual
2020-06-15 14:58:21 +02:00
Adriane Boyd
c94f7d0e75 Updates to docstrings (#5589) 2020-06-15 14:56:51 +02:00
Adriane Boyd
c482f20778 Fix and add warnings related to spacy-lookups-data (#5588)
* Fix warning message for lemmatization tables

* Add a warning when the `lexeme_norm` table is empty. (Given the
relatively lang-specific loading for `Lookups`, it seemed like too much
overhead to dynamically extract the list of languages, so for now it's
hard-coded.)
2020-06-15 14:56:04 +02:00
Arvind Srinivasan
aa5b40fa64 Added Tamil Example Sentences (#5583)
* Added Examples for Tamil Sentences

#### Description
This PR adds example sentences for the Tamil language, which were missing as per issue #1107

#### Type of Change
This is an enhancement.

* Accepting spaCy Contributor Agreement

* Signed on my behalf as an individual
2020-06-13 15:56:26 +02:00
theudas
3f5e2f9d99 Added Parameter to NEL to take n sentences into account (#5548)
* added setting for neighbour sentence in NEL

* added spaCy contributor agreement

* added multi sentence also for training

* made the try-except block smaller
2020-06-12 15:15:03 +02:00
adrianeboyd
4724fa4cf4 Expand Japanese requirements warning (#5572)
Include explicit install instructions in Japanese requirements warning.
2020-06-12 15:14:55 +02:00
adrianeboyd
44967a3f9c Update pytest conf for sudachipy with Japanese (#5574) 2020-06-12 15:14:47 +02:00
theudas
fa46e0bef2 Added Parameter to NEL to take n sentences into account (#5548)
* added setting for neighbour sentence in NEL

* added spaCy contributor agreement

* added multi sentence also for training

* made the try-except block smaller
2020-06-12 02:03:23 +02:00
Sofie Van Landeghem
18c6dc8093 removing label both on comment and on close 2020-06-11 14:09:40 +02:00
adrianeboyd
556895177e Expand Japanese requirements warning (#5572)
Include explicit install instructions in Japanese requirements warning.
2020-06-11 13:47:37 +02:00
adrianeboyd
fe167fcf7d Update pytest conf for sudachipy with Japanese (#5574) 2020-06-11 10:23:50 +02:00
Jones Martins
bab30e4ad2 Add "c'mon" token exception (#5570)
* Add "c'mon" exception

* Fix typo in "C'mon" exception
2020-06-10 21:54:06 +02:00
Jones Martins
28db7dd5d9 Add missing pronouns/determiners (#5569)
* Add missing pronouns/determiners

* Add test for missing pronouns

* Add contributor file
2020-06-10 18:47:04 +02:00
Sofie Van Landeghem
12c1965070 set delay to 7 days 2020-06-10 10:46:12 +02:00
adrianeboyd
0a70bd6281 Bump version to 2.3.0.dev1 (#5567) 2020-06-09 15:47:31 +02:00
adrianeboyd
b7e6e1b9a7 Disable sentence segmentation in ja tokenizer (#5566) 2020-06-09 12:00:59 +02:00
Sofie Van Landeghem
86112d2168 update issue manager's version 2020-06-09 08:57:38 +02:00
adrianeboyd
f162815f45 Handle empty and whitespace-only docs for Japanese (#5564)
Handle empty and whitespace-only docs in the custom alignment method
used by the Japanese tokenizer.
2020-06-08 21:09:23 +02:00