mirror of https://github.com/explosion/spaCy.git synced 2025-04-23 18:41:59 +03:00
Commit Graph

10026 Commits

Author SHA1 Message Date
fizban99
f2f2df6e78 entity types for colors should be in uppercase ()
Although the text says the entity types should be in lowercase, the sample code uses uppercase, which is the correct format.
2019-04-17 11:22:56 +02:00
fizban99
57d4a8bf3d Create fizban99.md () 2019-04-17 11:22:19 +02:00
Matthew Honnibal
83511972d3 Set version to v2.1.4.dev0 2019-04-16 14:17:26 +02:00
Matthew Honnibal
8b5ae0733e Merge branch 'master' of https://github.com/explosion/spaCy 2019-04-16 12:29:46 +02:00
Matthew Honnibal
d59b2e8a0c Fix issue : Upper case lemmas
If the Morphology class tries to lemmatize a word that's not in the
string store, it's forced to just return it as-is. While loading
exceptions, the class could hit a case where these strings weren't in
the string store yet. The resulting lemmas could then be cached, leading
to some words receiving upper-case lemmas. Closes .
2019-04-16 12:27:15 +02:00
BreakBB
5b8dbe4975 Fix symlink creation to show error message on failure () (resolves )
* Fix symlink creation to show error message on failure. Update tests to reflect those changes.

* Fix test to succeed on non-Windows systems.
2019-04-16 11:58:31 +02:00
Krzysztof Kowalczyk
cc1516ec26 Improved training and evaluation ()
* Add early stopping

* Add return_score option to evaluate

* Fix missing str to path conversion

* Fix import + old python compatibility

* Fix bad beam_width setting during cpu evaluation in spacy train with gpu option turned on
2019-04-15 12:04:36 +02:00
Shikhar Chauhan
bbf6f9f764 Change default output format from jsonl to json for cli convert () (closes )
* Changing default output format from jsonl to json for cli convert

* Adding Contributor Agreement
2019-04-12 11:31:23 +02:00
Omer Celik
531c0869b2 Added Turkish Lira symbol (₺) ()
Added Turkish Lira symbol (₺)
https://en.wikipedia.org/wiki/Turkish_lira
2019-04-11 11:32:28 +02:00
Omer Celik
034a1f458b Signed agreement () 2019-04-11 11:31:27 +02:00
Ivan Tham
71710e2454 Add myself to contributors () 2019-04-11 11:31:04 +02:00
oterrier
2854724e69 Added project gracyql to Universe () (resolves )
As discussed with Ines in https://github.com/explosion/spaCy/issues/3568, adding a new project proposal for the community on the spaCy Universe website.

GracyQL is a tiny GraphQL wrapper around spaCy using Graphene and Starlette.

## Description
Change only in universe.json file to add a new project

### Types of change
New project reference in Universe

## Checklist
- [x] I have submitted the spaCy Contributor Agreement.
- [x] I ran the tests, and all new and existing tests passed.
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
2019-04-10 17:54:42 +02:00
Santiago Castro
86e4b68aa9 Fix website docs for Vectors.from_glove ()
* Fix website docs for Vectors.from_glove

* Add myself as a contributor
2019-04-10 15:23:27 +02:00
Ines Montani
4d198a7e92 Ensure match pattern error isn't raised on empty errors (closes ) 2019-04-09 12:50:43 +02:00
Ines Montani
3ddb799f27 Merge branch 'master' of https://github.com/explosion/spaCy 2019-04-09 11:40:28 +02:00
Ines Montani
145c0b7e88 Tidy up and auto-format 2019-04-09 11:40:19 +02:00
Bharat Raghunathan
72820896d4 Fix typo in web docs cli.md () 2019-04-09 11:40:03 +02:00
Ines Montani
5f005adf61 Add xfailing test for 2019-04-09 11:07:14 +02:00
Ines Montani
6ae3b5699e Make sure path is string (resolves ) 2019-04-08 12:53:41 +02:00
Ines Montani
d0f5e015cb Auto-format 2019-04-08 12:53:16 +02:00
pierremonico
0d26bfe677 Removes duplicate in table ()
* Removes duplicate in table

Just fixing typos.

* Remove newline


Co-authored-by: Ines Montani <ines@ines.io>
2019-04-08 10:30:42 +02:00
Piero Molino
5198aa4ae6 Added Ludwig among the projects () [ci skip]
* Added Ludwig among the projects

* Create w4nderlust.md

* Add Uber to logo wall
2019-04-07 13:01:26 +02:00
Dobita21
8bf6967eb7 Update Thai stop words ()
* test sPacy commit to git fri 04052019 10:54

* change Data format from my format to master format

* ทัทั้งนี้ ---> ทั้งนี้

* delete stop_word translate from Eng

* Adjust formatting and readability
2019-04-05 12:06:38 +02:00
jeannefukumaru
f67d881b30 fix typos in tag_map flagged by python -m debug-data ()
## Checklist
- [ ] I have submitted the spaCy Contributor Agreement.
- [ ] I ran the tests, and all new and existing tests passed.
- [ ] My changes don't require a change to the documentation, or if they do, I've added all required information.


Co-authored-by: Ines Montani <ines@ines.io>
2019-04-05 12:06:09 +02:00
Ines Montani
cd21778bef Merge pull request from jeannefukumaru/master
Added tags previously missing from Indonesian `tag_map.py`
2019-04-04 11:57:03 +02:00
Jeanne Choo
b6c9807431 Merge remote-tracking branch 'upstream/master' 2019-04-04 14:21:50 +08:00
Jeanne Choo
80e15af76c fixed tag_map.py merge conflict 2019-04-04 14:18:27 +08:00
jeannefukumaru
eba4f77526 Merge pull request from jeannefukumaru/update_indonesian_tag_map
updated tag map with missing tags
2019-04-04 06:49:04 +08:00
jeannefukumaru
876ce01567 updated tag map with missing tags 2019-04-03 23:09:11 +08:00
jeannefukumaru
99e04c4ce2 Merge pull request from jeannefukumaru/added-indonesian-tag-map
Added indonesian tag map
2019-04-03 23:05:05 +08:00
Ines Montani
4faf62d515 Merge pull request from svlandeg/fix/issue_3521
Allow English stopwords with any type of apostrophe
2019-04-03 14:14:03 +02:00
Yves Peirsman
951825532c Improved Dutch language resources and Dutch lemmatization ()
* Improved Dutch language resources and Dutch lemmatization

* Fix conftest

* Update punctuation.py

* Auto-format

* Format and fix tests

* Remove unused test file

* Re-add deleted test

* removed redundant infix regex pattern for ','; note: brackets + simple hyphen remain

* Cleaner lemmatization files
2019-04-03 14:13:26 +02:00
svlandeg
4ff786e113 addressed all comments by Ines 2019-04-03 13:50:33 +02:00
Ines Montani
6a4575a56c Don't make "settings" or "title" required in displaCy data (closes ) 2019-04-03 10:13:16 +02:00
Ines Montani
2f0f439c54 Remove non-existent example (closes ) 2019-04-03 09:59:17 +02:00
Kamolsit Mongkolsrisawat
dcc67f3f51 Update Thai tokenizer_exception list ()
* add tokenizer_exceptions word (ก-น) from https://goo.gl/JpJ2qq

* update tokenizer_exceptions word list

* add contributor file
2019-04-03 09:13:36 +02:00
ivigamberdiev
5e5641616d Update links and http -> https ()
* update links and http -> https

* SCA
2019-04-02 17:36:22 +02:00
svlandeg
85b4319f33 specify encoding in files 2019-04-02 15:05:31 +02:00
svlandeg
673c81bbb4 unicode string for python 2.7 2019-04-02 13:52:07 +02:00
svlandeg
eca9cc5417 fixing Issue by adding all hyphen variants for each stopword 2019-04-02 13:24:59 +02:00
svlandeg
e7062cf699 failing test for Issue 2019-04-02 13:15:35 +02:00
svlandeg
1424b12b09 failing test for Issue 2019-04-02 13:06:37 +02:00
Ines Montani
24cecdb44f Update compatibility [ci skip] 2019-04-01 16:25:16 +02:00
jeannefukumaru
6cdb7b2e04 added tag_map for indonesian ()
* added tag_map for indonesian

* changed tag map from .py to .txt to see if tests pass

* added symbols import

* added utf8 encoding flag

* added missing SCONJ symbol

* Auto-format

* Remove unused imports

* Make tag map available in Indonesian defaults
2019-04-01 12:27:48 +02:00
Ines Montani
c23e234d65 Auto-format 2019-04-01 12:11:27 +02:00
Ines Montani
5821b020d5 Merge branch 'spacy.io' 2019-04-01 11:47:59 +02:00
Ines Montani
0a0b1087b0 Make tag map available in Indonesian defaults 2019-04-01 11:46:51 +02:00
Ines Montani
5d9212c44c Remove unused imports 2019-04-01 11:46:25 +02:00
Ines Montani
8d6b544632 Auto-format 2019-04-01 11:45:43 +02:00
jeannefukumaru
6567f27849 added missing SCONJ symbol 2019-04-01 17:02:53 +08:00