Matthew Honnibal
|
9dceb97570
|
Extend morphanalysis API
|
2019-03-08 01:38:34 +01:00 |
|
Matthew Honnibal
|
322b64dca0
|
Allow lookup of morphology by attribute name
|
2019-03-08 01:38:15 +01:00 |
|
Matthew Honnibal
|
3c32590243
|
Add test for morph analysis
|
2019-03-08 00:10:07 +01:00 |
|
Matthew Honnibal
|
3300e3d7ab
|
Implement more MorphAnalysis API
|
2019-03-08 00:09:16 +01:00 |
|
Matthew Honnibal
|
9a2d1cc6e0
|
Add length attribute to MorphAnalysisC
|
2019-03-08 00:08:57 +01:00 |
|
Matthew Honnibal
|
b5f2b7b454
|
Add list_features() helper, clean up
|
2019-03-08 00:08:35 +01:00 |
|
Ines Montani
|
daaeeb7a2b
|
Merge branch 'master' into develop
|
2019-03-07 22:07:31 +01:00 |
|
Matthew Honnibal
|
a40d73cb2a
|
Build out morphological analysis API
|
2019-03-07 21:59:25 +01:00 |
|
Matthew Honnibal
|
dd9ea478c5
|
Fix intify_attrs function for obsolete data
|
2019-03-07 21:59:03 +01:00 |
|
Matthew Honnibal
|
987ee6e884
|
Fix data reading in morphology
|
2019-03-07 21:58:43 +01:00 |
|
Matthew Honnibal
|
00cfadbf63
|
Fix obsolete data in English tokenizer exceptions
|
2019-03-07 21:58:16 +01:00 |
|
Matthew Honnibal
|
7afe56a360
|
Fix morphological features in en tag_map
|
2019-03-07 21:57:56 +01:00 |
|
Matthew Honnibal
|
3a667833d1
|
Fix morphological features in de tag_map
|
2019-03-07 21:57:43 +01:00 |
|
Adrien Ball
|
88909a9adb
|
Fix egg fragments in direct download (#3369)
## Description
The egg fragment in the URL must be of the form `#egg=package_name==version` instead of `#egg=package_name-version`.
One consequence of a wrong egg fragment is that `pip` does not recognize the package and its version properly, so it re-downloads the package every time.
I'm not sure how this should be tested properly.
Here is what I had before the fix when running the same direct download twice:
```
$ python -m spacy download en_core_web_sm-2.0.0 --direct
Looking in indexes: https://pypi.python.org/simple/
Collecting en_core_web_sm-2.0.0 from https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz#egg=en_core_web_sm-2.0.0
Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz (37.4MB)
100% |████████████████████████████████| 37.4MB 1.6MB/s
Generating metadata for package en-core-web-sm-2.0.0 produced metadata for project name en-core-web-sm. Fix your #egg=en-core-web-sm-2.0.0 fragments.
Installing collected packages: en-core-web-sm
Running setup.py install for en-core-web-sm ... done
Successfully installed en-core-web-sm-2.0.0
$ python -m spacy download en_core_web_sm-2.0.0 --direct
Looking in indexes: https://pypi.python.org/simple/
Collecting en_core_web_sm-2.0.0 from https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz#egg=en_core_web_sm-2.0.0
Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz (37.4MB)
100% |████████████████████████████████| 37.4MB 919kB/s
Generating metadata for package en-core-web-sm-2.0.0 produced metadata for project name en-core-web-sm. Fix your #egg=en-core-web-sm-2.0.0 fragments.
Requirement already satisfied (use --upgrade to upgrade): en-core-web-sm from https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz#egg=en_core_web_sm-2.0.0 in ./venv3/lib/python3.6/site-packages
```
And after the fix:
```
$ python -m spacy download en_core_web_sm-2.0.0 --direct
Looking in indexes: https://pypi.python.org/simple/
Collecting en_core_web_sm==2.0.0 from https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz#egg=en_core_web_sm==2.0.0
Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz (37.4MB)
100% |████████████████████████████████| 37.4MB 1.1MB/s
Installing collected packages: en-core-web-sm
Running setup.py install for en-core-web-sm ... done
Successfully installed en-core-web-sm-2.0.0
$ python -m spacy download en_core_web_sm-2.0.0 --direct
Looking in indexes: https://pypi.python.org/simple/
Requirement already satisfied: en_core_web_sm==2.0.0 from https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz#egg=en_core_web_sm==2.0.0 in ./venv3/lib/python3.6/site-packages (2.0.0)
```
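As a rough illustration of the fragment format described above, here is a minimal sketch of building such a direct-download URL. The helper names `split_model_release` and `direct_download_url` are hypothetical and are not taken from the spaCy code base; the base URL is the one shown in the logs above, and the sketch assumes the release string always ends in `-<version>`.

```python
# Hypothetical helpers for illustration only; not spaCy's actual implementation.
BASE_URL = "https://github.com/explosion/spacy-models/releases/download"


def split_model_release(release):
    """Split a release string such as 'en_core_web_sm-2.0.0' into (name, version).

    Assumes the version is everything after the last hyphen.
    """
    name, _, version = release.rpartition("-")
    if not name or not version:
        raise ValueError("Expected '<name>-<version>', got: {!r}".format(release))
    return name, version


def direct_download_url(release):
    """Build a download URL whose egg fragment uses 'name==version'."""
    name, version = split_model_release(release)
    return "{base}/{release}/{release}.tar.gz#egg={name}=={version}".format(
        base=BASE_URL, release=release, name=name, version=version
    )


if __name__ == "__main__":
    print(direct_download_url("en_core_web_sm-2.0.0"))
```

The resulting URL ends in `#egg=en_core_web_sm==2.0.0`, which is the form `pip` reports as "Requirement already satisfied" in the second run above.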
### Types of change
This is an enhancement: it avoids unnecessary downloads of (potentially large) spaCy models that have already been downloaded.
## Checklist
- [x] I have submitted the spaCy Contributor Agreement.
- [x] I ran the tests, and all new and existing tests passed.
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
|
2019-03-07 21:07:19 +01:00 |
|
Matthew Honnibal
|
1a10bf29bc
|
Remove morph_key from token api
|
2019-03-07 18:33:17 +01:00 |
|
Matthew Honnibal
|
c1888b05d2
|
Export helper functions for morphology
|
2019-03-07 18:33:06 +01:00 |
|
Matthew Honnibal
|
357066ee2f
|
Work on morphanalysis class
|
2019-03-07 18:32:51 +01:00 |
|
Matthew Honnibal
|
2669190b85
|
Normalize props for morph exceptions
|
2019-03-07 18:32:36 +01:00 |
|
Matthew Honnibal
|
e585b50458
|
Fix features in English tag map
|
2019-03-07 18:32:09 +01:00 |
|
Matthew Honnibal
|
0ad09b16ad
|
Add header for morphanalysis
|
2019-03-07 17:24:57 +01:00 |
|
Matthew Honnibal
|
fed0371db7
|
Remove enums from morphology
|
2019-03-07 17:14:57 +01:00 |
|
Matthew Honnibal
|
932d7dde1c
|
Fix compile error
|
2019-03-07 14:34:54 +01:00 |
|
Matthew Honnibal
|
b9ade7d4e0
|
Add MorphAnalysisC struct
|
2019-03-07 14:03:07 +01:00 |
|
Matthew Honnibal
|
b69013e2d7
|
Fix passing of morphological features to lemmatizer
|
2019-03-07 13:11:38 +01:00 |
|
Matthew Honnibal
|
74db1d9602
|
Revert "Space out symbols enum, to make maintaining easier"
This reverts commit be5235369c.
|
2019-03-07 12:52:30 +01:00 |
|
Matthew Honnibal
|
c773b5011c
|
Revert "Fix StringStore after symbols changes"
This reverts commit bcfe3bd312.
|
2019-03-07 12:52:15 +01:00 |
|
Matthew Honnibal
|
bcfe3bd312
|
Fix StringStore after symbols changes
|
2019-03-07 12:51:11 +01:00 |
|
Ines Montani
|
96b91a8898
|
Fix noqa [ci skip]
|
2019-03-07 12:25:00 +01:00 |
|
Ines Montani
|
d63672f48d
|
Merge branch 'develop' into spacy.io
|
2019-03-07 12:23:39 +01:00 |
|
Ines Montani
|
fa7314b221
|
Clarify train_path and dev_path format (see #3366) [ci skip]
|
2019-03-07 12:23:27 +01:00 |
|
Matthew Honnibal
|
d0ca64bb07
|
Fix imports in morphanalysis
|
2019-03-07 12:14:53 +01:00 |
|
Matthew Honnibal
|
6734cfec88
|
Add comment
|
2019-03-07 12:14:37 +01:00 |
|
Matthew Honnibal
|
be5235369c
|
Space out symbols enum, to make maintaining easier
|
2019-03-07 12:14:23 +01:00 |
|
Matthew Honnibal
|
34651c8ddf
|
Fix lemmatizer
|
2019-03-07 12:13:47 +01:00 |
|
Matthew Honnibal
|
8805966460
|
Fix moved Morphologizer class
|
2019-03-07 10:46:27 +01:00 |
|
Matthew Honnibal
|
ef3110a444
|
Fix compile error
|
2019-03-07 10:45:55 +01:00 |
|
Matthew Honnibal
|
21008ad2d8
|
Draft API for morphological analysis class
|
2019-03-07 10:45:24 +01:00 |
|
Matthew Honnibal
|
fc1cc4c529
|
Move morphologizer under spacy/pipes
|
2019-03-07 01:36:26 +01:00 |
|
Matthew Honnibal
|
bfa52d9d8a
|
Move morphologizer within spacy/pipes
|
2019-03-07 01:34:32 +01:00 |
|
Matthew Honnibal
|
98dfe5e433
|
Fix ud_train.py
|
2019-03-07 01:31:23 +01:00 |
|
Matthew Honnibal
|
ae7c728c5f
|
Fix json dependency
|
2019-03-07 01:17:19 +01:00 |
|
Ines Montani
|
9d6ca18a10
|
Tidy up and only use self.vector once
|
2019-03-07 01:06:12 +01:00 |
|
Ines Montani
|
a8f1efd2f5
|
Merge branch 'master' into develop
|
2019-03-07 00:56:31 +01:00 |
|
Matthew Honnibal
|
010f846d5f
|
Fix dependencies in morphologizer
|
2019-03-07 00:16:51 +01:00 |
|
Matthew Honnibal
|
3993f41cc4
|
Update morphology branch from develop
|
2019-03-07 00:14:43 +01:00 |
|
Daniel King
|
5f40229397
|
Don't use numpy directly for similarity (#3362)
* Don't use numpy directly for similarity
* Contributor agreement
|
2019-03-06 22:58:38 +00:00 |
|
svlandeg
|
173d45ec5f
|
adding kb_id as field to token, el as nlp pipeline component
|
2019-03-06 19:34:18 +01:00 |
|
Ines Montani
|
0c09831227
|
Merge branch 'develop' into spacy.io
|
2019-03-06 14:41:25 +01:00 |
|
Ines Montani
|
e9babd9973
|
Update hyperparameters section (see #3352)
|
2019-03-06 14:40:30 +01:00 |
|
Ines Montani
|
6bd34e9d54
|
Expose Japanese stop words (closes #3346)
|
2019-03-06 14:21:15 +01:00 |
|