Commit Graph

12029 Commits

Author SHA1 Message Date
Adriane Boyd
f0fd77648f Change example title to Dr.
Change example title to Dr. so the current model does exclude the title
in the initial example.
2020-06-16 20:36:21 +02:00
Adriane Boyd
a6abdfbc3c Fix numpy.zeros() dtype for Doc.from_array 2020-06-16 20:35:45 +02:00
Adriane Boyd
9aff317ca7 Update POS in tagging example 2020-06-16 20:26:57 +02:00
Adriane Boyd
457babfa0c Update alignment example for new gold.align 2020-06-16 20:22:03 +02:00
svlandeg
8b66c11ff2 add spaces to json output format 2020-06-16 19:30:03 +02:00
svlandeg
ba80ad7efd fixed some tests + WIP roundtrip unit test 2020-06-16 18:26:50 +02:00
Ines Montani
41003a5117 Update Binder version [ci skip] 2020-06-16 17:41:23 +02:00
Ines Montani
fd89f44c0c Update Binder URL [ci skip] 2020-06-16 17:34:26 +02:00
Ines Montani
44af53bdd9 Add pkuseg warnings and auto-format [ci skip] 2020-06-16 17:13:35 +02:00
Ines Montani
a9e5b840ee Fix typos and auto-format [ci skip] 2020-06-16 16:38:45 +02:00
Ines Montani
1d3e8b7578 Merge pull request #5595 from explosion/v2.3.x 2020-06-16 07:37:10 -07:00
Ines Montani
e9d3e177f0 Merge branch 'master' into v2.3.x 2020-06-16 16:31:38 +02:00
Ines Montani
bb54f54369 Fix model accuracy table [ci skip] 2020-06-16 16:10:12 +02:00
Adriane Boyd
d5110ffbf2 Documentation updates for v2.3.0 (#5593)
* Update website models for v2.3.0

* Add docs for Chinese word segmentation

* Tighten up Chinese docs section

* Merge branch 'master' into docs/v2.3.0 [ci skip]

* Merge branch 'master' into docs/v2.3.0 [ci skip]

* Auto-format and update version

* Update matcher.md

* Update languages and sorting

* Typo in landing page

* Infobox about token_match behavior

* Add meta and basic docs for Japanese

* POS -> TAG in models table

* Add info about lookups for normalization

* Updates to API docs for v2.3

* Update adding norm exceptions for adding languages

* Add --omit-extra-lookups to CLI API docs

* Add initial draft of "What's New in v2.3"

* Add new in v2.3 tags to Chinese and Japanese sections

* Add tokenizer to migration section

* Add new in v2.3 flags to init-model

* Typo

* More what's new in v2.3

Co-authored-by: Ines Montani <ines@ines.io>
2020-06-16 15:37:35 +02:00
svlandeg
43d41d6bb6 allow None as BILUO annotation 2020-06-16 15:30:05 +02:00
svlandeg
44a0f9c2c8 test_gold_biluo_different_tokenization works 2020-06-16 15:21:20 +02:00
svlandeg
1c35b8efcd fix spaces 2020-06-16 12:08:25 +02:00
svlandeg
6fea5fa4bd attempt to fix cases with weird spaces 2020-06-16 11:52:29 +02:00
svlandeg
0702a1d3fb fix test for misaligned 2020-06-15 23:10:47 +02:00
svlandeg
a28f8f369e Fix many-to-one IOB codes 2020-06-15 23:06:22 +02:00
svlandeg
12886b787b fixing NER one-to-many alignment 2020-06-15 22:44:17 +02:00
Matthew Honnibal
7ff447c5a0 Set version to v2.3.0 2020-06-15 18:22:25 +02:00
Matthew Honnibal
a0bf73a5dd Merge branch 'whatif/arrow' of https://github.com/explosion/spaCy into whatif/arrow 2020-06-15 18:16:01 +02:00
Matthew Honnibal
c66f93299e Remove TokenAnnotation code from nonproj 2020-06-15 18:14:47 +02:00
Matthew Honnibal
c95494739c Fix import 2020-06-15 18:11:10 +02:00
Matthew Honnibal
8f978f2031 Fix import 2020-06-15 18:10:47 +02:00
Matthew Honnibal
95de7efaad Draft create_gold_state for arc_eager oracle 2020-06-15 18:10:19 +02:00
svlandeg
68986a252e additional tests for new get_aligned function 2020-06-15 17:42:40 +02:00
svlandeg
41d29983a7 start testing get_aligned 2020-06-15 17:16:01 +02:00
svlandeg
fd5f199feb fixing language and scoring tests 2020-06-15 15:02:05 +02:00
Adriane Boyd
0d8405aafa Updates to docstrings (#5589) 2020-06-15 14:58:36 +02:00
Adriane Boyd
e867e9fa8f Fix and add warnings related to spacy-lookups-data (#5588)
* Fix warning message for lemmatization tables

* Add a warning when the `lexeme_norm` table is empty. (Given the
relatively lang-specific loading for `Lookups`, it seemed like too much
overhead to dynamically extract the list of languages, so for now it's
hard-coded.)
2020-06-15 14:58:29 +02:00
Arvind Srinivasan
f698007907 Added Tamil Example Sentences (#5583)
* Added Examples for Tamil Sentences

#### Description
This PR adds example sentences for the Tamil language, which were missing as per issue #1107.

#### Type of Change
This is an enhancement.

* Accepting spaCy Contributor Agreement

* Signed on my behalf as an individual
2020-06-15 14:58:21 +02:00
Adriane Boyd
c94f7d0e75 Updates to docstrings (#5589) 2020-06-15 14:56:51 +02:00
Adriane Boyd
c482f20778 Fix and add warnings related to spacy-lookups-data (#5588)
* Fix warning message for lemmatization tables

* Add a warning when the `lexeme_norm` table is empty. (Given the
relatively lang-specific loading for `Lookups`, it seemed like too much
overhead to dynamically extract the list of languages, so for now it's
hard-coded.)
2020-06-15 14:56:04 +02:00
svlandeg
b4d914ec77 fix error catching 2020-06-15 12:56:32 +02:00
svlandeg
b9c9cbb2cd informative error when calling to_array with wrong field 2020-06-15 11:53:31 +02:00
svlandeg
ff231e1cdd fix merge conflict 2020-06-15 09:04:19 +02:00
svlandeg
a48553c1ed fix error numbers 2020-06-15 08:51:31 +02:00
Matthew Honnibal
3c0fc10dc4 Remove beam for now (maybe)
Remove beam_utils

Update setup.py

Remove beam
2020-06-14 19:53:29 +02:00
Matthew Honnibal
98ca14f577 Remove GoldParse
WIP on removing goldparse

Get ArcEager compiling after GoldParse excise

Update setup.py

Get spacy.syntax compiling after removing GoldParse

Rename NewExample -> Example and clean up

Clean html files

Start updating tests

Update Morphologizer
2020-06-14 19:53:30 +02:00
Matthew Honnibal
d53723aa4f Merge from whatif/arrow 2020-06-14 17:43:59 +02:00
Matthew Honnibal
380cce9d8b Update errors 2020-06-14 17:40:05 +02:00
Matthew Honnibal
706e652820 Merge from develop 2020-06-14 17:35:01 +02:00
Matthew Honnibal
9296d71a54 More GoldParse excise 2020-06-14 17:26:54 +02:00
Matthew Honnibal
60d4e5a9e0 WIP on updating transition-system 2020-06-14 17:22:14 +02:00
Matthew Honnibal
7d65615625 WIP start excising GoldParse 2020-06-14 17:11:41 +02:00
Matthew Honnibal
4362ec7084 Hack Language.evaluate 2020-06-13 23:37:42 +02:00
Matthew Honnibal
7de997c0a5 Update test 2020-06-13 23:11:45 +02:00
Matthew Honnibal
8f941ef527 Update GoldParse 2020-06-13 23:11:29 +02:00