Commit Graph

11427 Commits

Author SHA1 Message Date
Ines Montani
1c212215cd
Merge pull request #5064 from adrianeboyd/feature/german-tokenization
Improve German tokenization
2020-02-26 13:41:44 +01:00
Ines Montani
f39ddda193
Merge pull request #5062 from svlandeg/bugfix/merge-conflicts
Fix sync between master and develop
2020-02-26 13:41:16 +01:00
Ines Montani
56978f5cd8
Merge pull request #5060 from svlandeg/feature/update-thinc
update thinc
2020-02-26 13:40:23 +01:00
Adriane Boyd
d1f703d78d Improve German tokenization
Improve German tokenization with respect to Tiger.
2020-02-26 13:06:52 +01:00
Ines Montani
54da6a2a07 Update pyproject.toml 2020-02-26 12:51:53 +01:00
Ines Montani
ed9358420e Merge branch 'master' into pr/5060 2020-02-26 12:51:29 +01:00
adrianeboyd
ff184b7a9c
Add tag_map argument to CLI debug-data and train (#4750) (#5038)
Add an argument for a path to a JSON-formatted tag map, which is used to
update and extend the default language tag map.
2020-02-26 12:10:38 +01:00
svlandeg
18ff97589d update spacy to 2.2.4.dev0 2020-02-26 10:50:05 +01:00
svlandeg
62406a9513 update from thinc 7.4.0.dev2 to 7.4.0 2020-02-26 10:30:35 +01:00
svlandeg
fc6e34c3a1 fix bugs from porting master to develop 2020-02-26 08:44:22 +01:00
Ines Montani
c7e3c034d2
Merge pull request #5061 from explosion/fix/pyproject-toml-master
Update pyproject.toml
2020-02-25 20:22:26 +01:00
Ines Montani
192b8d45a1
Merge pull request #5008 from svlandeg/fix/build_dependencies
Re-add pyproject.toml and add tests for dependency version consistency
2020-02-25 16:52:18 +01:00
Ines Montani
dc36ec98a4 Update pyproject.toml 2020-02-25 16:46:14 +01:00
Ines Montani
b6a6cff708 Add blis to pyproject.toml 2020-02-25 16:17:23 +01:00
Ines Montani
912572e04a Only copy if file exists (not if installed from sdist etc.) 2020-02-25 16:01:58 +01:00
Ines Montani
436b26fe0f Revert other changes 2020-02-25 15:48:29 +01:00
Ines Montani
c1a5ece65f Tidy up setup and update requirements tests 2020-02-25 15:46:39 +01:00
Ines Montani
5d21d3e8b9 Merge branch 'develop' into pr/5008 2020-02-25 15:24:47 +01:00
Ines Montani
acb4e3c7ba
Merge pull request #5039 from adrianeboyd/typo/website-token-api-shape
Fix formatting in Token API
2020-02-25 14:57:25 +01:00
Ines Montani
d50152b917
Merge pull request #5019 from questoph/master
Optimizing tokenization for Luxembourgish (dealing with apostrophe infixes)
2020-02-25 14:48:50 +01:00
Ines Montani
4440a072d2
Merge pull request #5006 from svlandeg/bugfix/multiproc-underscore
load Underscore state when multiprocessing
2020-02-25 14:46:02 +01:00
Ines Montani
38fc05986c
Merge pull request #5058 from bryant1410/patch-1
Add missing comma in a dependency specification
2020-02-25 14:44:29 +01:00
svlandeg
d848a68340 thinc 7.4.0.dev2 2020-02-25 12:07:42 +01:00
Santiago Castro
54d8665ff7
Add missing comma in a dependency specification
Conda is complaining that it can't parse that line otherwise.
2020-02-24 16:15:28 -05:00
svlandeg
d5bfebe1c5 it's moving day 2020-02-24 10:04:24 +01:00
svlandeg
217c16c7a9 running tests BEFORE deleting them ? 2020-02-24 09:38:43 +01:00
svlandeg
6f846c2cbf removing --pyargs for testing purposes 2020-02-24 09:19:08 +01:00
svlandeg
d821c95eb0 debugging prints 2020-02-23 17:38:33 +01:00
svlandeg
58568bd0cd fix 2020-02-23 16:45:37 +01:00
svlandeg
0f55e51704 assert we found the root_dir 2020-02-23 16:33:58 +01:00
svlandeg
783da088ea avoid try except 2020-02-23 16:21:21 +01:00
svlandeg
b49a3afd0c use clean_underscore fixture 2020-02-23 15:49:20 +01:00
Ines Montani
4890db6339 Auto-format and fix image [ci skip] 2020-02-23 13:56:50 +01:00
Tom Keefe
ddf63b97a8
make idx available via to_array (#5030) 2020-02-22 14:13:06 +01:00
Sofie Van Landeghem
44f4142ce4
add two abbreviations and some additional unit tests (#5040) 2020-02-22 14:12:32 +01:00
Sofie Van Landeghem
479bd8d09f
add lemma option to displacy 'dep' visualiser (#5041)
* add lemma option to displacy 'dep' visualiser
* more compact list comprehension
* add option to doc
* fix test and add lemmas to util.get_doc
* fix capital
* remove lemma from get_doc
* cleanup
2020-02-22 14:11:51 +01:00
Adriane Boyd
3853d385fa Fix formatting in Token API 2020-02-20 13:41:24 +01:00
adrianeboyd
2164e71ea8
Improved Romanian tokenization for UD RRT (#5036)
Modifications to Romanian tokenization to improve tokenization for
UD_Romanian-RRT.
2020-02-19 16:15:59 +01:00
svlandeg
9f1447bf71 where areth thou, file ? 2020-02-19 17:09:29 +02:00
svlandeg
9834527f2c hack to switch between CLI folder setup and local setup 2020-02-19 16:22:48 +02:00
svlandeg
5c2f645470 root dir one level up 2020-02-19 16:15:56 +02:00
svlandeg
303c4bcd4c include requirements in manifest 2020-02-19 15:52:55 +02:00
svlandeg
b20351792a assert prints for more clarity 2020-02-19 15:51:53 +02:00
Ines Montani
8137b24928
Merge pull request #5028 from explosion/refactor/remove-symlinks
Remove symlinks, data dir and related stuff
2020-02-19 00:20:23 +01:00
Ines Montani
a3335d36b8 Merge branch 'develop' into refactor/remove-symlinks 2020-02-18 17:22:20 +01:00
Ines Montani
a138acb220
Merge pull request #5027 from explosion/chore/sync-develop-master
Sync develop with master, tidy up, auto-format
2020-02-18 17:22:03 +01:00
Ines Montani
09cbeaef27 Remove symlinks, data dir and related stuff 2020-02-18 17:20:17 +01:00
Ines Montani
e3f40a6a0f Tidy up and auto-format 2020-02-18 15:38:18 +01:00
Ines Montani
1278161f47 Tidy up and fix issues 2020-02-18 15:17:03 +01:00
Ines Montani
de11ea753a Merge branch 'master' into develop 2020-02-18 14:47:23 +01:00