Commit Graph

8485 Commits

Author SHA1 Message Date
ines
5750f62d0a Fix autolinking failure on fresh model install (resolves #1138)
On fresh install via subprocess, pip.get_installed_distributions()
won't show new model, so is_package check in link command fails.
Solution for now is to get model package path explicitly and pass it to
link command.
2017-08-14 12:13:26 +02:00
Matthew Honnibal
dbbfe595a5 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-08-14 12:09:28 +02:00
Matthew Honnibal
ac6c25f762 Check SGD is not None in update 2017-08-14 12:09:18 +02:00
Matthew Honnibal
0ae045256d Fix beam training 2017-08-13 18:02:05 -05:00
Matthew Honnibal
6a42cc16ff Fix beam parser, improve efficiency of non-beam 2017-08-13 12:37:26 +02:00
Matthew Honnibal
4363b4aa4a Fix redundant tokvecs updates during update 2017-08-13 12:36:55 +02:00
Matthew Honnibal
12de263813 Bug fixes to beam parsing. Learns small sample 2017-08-13 09:33:39 +02:00
Matthew Honnibal
4ae0d5e1e6 Set defaults for convert command 2017-08-13 09:03:38 +02:00
Matthew Honnibal
92ebab6073 Update beam-update tests 2017-08-13 08:56:02 +02:00
Matthew Honnibal
17874fe491 Disable beam parsing 2017-08-12 19:35:40 -05:00
Matthew Honnibal
69f21867b5 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-08-12 19:25:56 -05:00
Matthew Honnibal
3e30712b62 Improve defaults 2017-08-12 19:24:17 -05:00
Matthew Honnibal
28e930aae0 Fixes for beam parsing. Not working 2017-08-12 19:22:52 -05:00
Matthew Honnibal
c96d769836 Fix beam parse. Not sure if working 2017-08-12 18:21:54 -05:00
Matthew Honnibal
24b45b45c6 Add test for beam update 2017-08-12 17:15:28 -05:00
Matthew Honnibal
4638f4b869 Fix beam update 2017-08-12 17:15:16 -05:00
Matthew Honnibal
d4308d2363 Initialize State offset to 0 2017-08-12 17:14:39 -05:00
Matthew Honnibal
b353e4d843 Work on parser beam training 2017-08-12 14:47:45 -05:00
ines
d4f2baf7dd Add create_meta option to package command
Re-create meta.json in model directory, even if it exists. Especially
useful when updating existing spaCy models or training with Prodigy.
Ensures user won't end up with multiple "en_core_web_sm" models, and
offers easy way to change the model's name and settings without having
to edit the meta.json file.
2017-08-12 21:44:18 +02:00
Matthew Honnibal
4ab0c8c8e9 Try different drop_layer structure in Tok2Vec 2017-08-12 08:56:57 -05:00
Matthew Honnibal
cd5ecedf6a Try drop_layer in parser 2017-08-12 08:56:33 -05:00
Matthew Honnibal
8870d491f1 Remove redundant pickling during training 2017-08-12 08:55:53 -05:00
Matthew Honnibal
680043ebca Improve efficiency of tagger.set_annotations for GPU 2017-08-12 08:54:21 -05:00
Matthew Honnibal
ebe0f7f641 Pass embed size correctly in tagger, and cache embeddings for efficiency 2017-08-12 05:45:20 -05:00
Matthew Honnibal
1a59db1c86 Fix dropout and learn rate in parser 2017-08-12 05:44:39 -05:00
Ines Montani
b40bc20b12 Merge pull request #1252 from nkruglikov/patch-1
Fix small typo in documentation
2017-08-10 12:11:31 +02:00
Nikolai Kruglikov
d42a03b8de Fix small typo in documentation 2017-08-10 14:38:30 +05:00
Matthew Honnibal
d01dc3704a Adjust parser model 2017-08-09 20:06:33 -05:00
Matthew Honnibal
f37528ef58 Pass embed size for parser fine-tune. Use SELU 2017-08-09 17:52:53 -05:00
Matthew Honnibal
f93f2bed58 Revert use of layer normalization in Tok2Vec 2017-08-09 17:47:03 -05:00
Matthew Honnibal
20944dd8aa Fix conflict in parser fine-tuning 2017-08-09 16:43:05 -05:00
Matthew Honnibal
ac2de6dced Switch to ReLu layers in Tok2Vec 2017-08-09 16:41:25 -05:00
Matthew Honnibal
bbace204be Gate parser fine-tuning behind feature flag 2017-08-09 16:40:42 -05:00
Matthew Honnibal
a59a1deac4 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-08-09 16:23:19 -05:00
Matthew Honnibal
bcce6f7de0 Fix parser fine tuning 2017-08-09 16:23:12 -05:00
ines
495e042429 Add entry point-style auto alias for "spacy"
Simplest way to run commands as spacy xxx instead of python -m spacy
xxx, while avoiding environment conflicts
2017-08-09 12:17:30 +02:00
ines
764540a6dd Don't ignore /bin directory 2017-08-09 12:16:30 +02:00
ines
28e2fec23b Fix autolinking failure on fresh model install (resolves #1138)
On fresh install via subprocess, pip.get_installed_distributions()
won't show new model, so is_package check in link command fails.
Solution for now is to get model package path explicitly and pass it to
link command.
2017-08-09 11:52:38 +02:00
Jim Geovedi
c62b49b7cc Merge remote-tracking branch 'upstream/develop' into indonesian 2017-08-09 09:17:46 +07:00
Matthew Honnibal
dbdd8afc4b Fix parser fine-tune training 2017-08-08 15:46:07 -05:00
Matthew Honnibal
88bf1cf87c Update parser for fine tuning 2017-08-08 15:34:17 -05:00
Jim O'Regan
c069b4acb5 fix in UD submitted; map either way 2017-08-08 19:22:14 +01:00
Jim O'Regan
76c22dec4d UD Irish tag mapping 2017-08-08 19:04:52 +01:00
Jim O'Regan
95921d7d4c Merge branch 'develop' into develop-irish 2017-08-08 17:21:27 +01:00
Ines Montani
a9465271a7 Merge pull request #1245 from delirious-lettuce/fix_typos
Fix typos
2017-08-07 23:11:20 +02:00
Paul O'Leary McCann
6e9e686568 Sample implementation of Japanese Tagger (ref #1214)
This is far from complete but it should be enough to check some things.

1. Mecab transition. Janome doesn't support Unidic, only IPAdic, but UD
tag mappings are based on Unidic. This switches out Mecab for Janome to
get around that.

2. Raw tag extension. A simple tag map can't meet the specifications for
UD tag mappings, so this adds an extra field to ambiguous cases. For
this demo it just deals with the simplest case, which only needs to look
at the literal token. (In reality it may be necessary to look at the
whole sentence, but that's another issue.)

3. General code structure. Seems nobody else has implemented a custom
Tagger yet, so still not sure this is the correct way to pass the
vocabulary around, for example.

Any feedback would be greatly appreciated. -POLM
2017-08-08 01:27:15 +09:00
Matthew Honnibal
5d837c3776 Add mix weights on fine_tune 2017-08-07 06:32:59 -05:00
Delirious Lettuce
d3b03f0544 Fix typos:
* `auxillary` -> `auxiliary`
  * `consistute` -> `constitute`
  * `earlist` -> `earliest`
  * `prefered` -> `preferred`
  * `direcory` -> `directory`
  * `reuseable` -> `reusable`
  * `idiosyncracies` -> `idiosyncrasies`
  * `enviroment` -> `environment`
  * `unecessary` -> `unnecessary`
  * `yesteday` -> `yesterday`
  * `resouces` -> `resources`
2017-08-06 21:31:39 -06:00
Matthew Honnibal
42bd26f6f3 Give parser its own tok2vec weights 2017-08-06 18:33:46 +02:00
Matthew Honnibal
3ed203de25 Use LayerNorm and SELU in Tok2Vec 2017-08-06 18:33:18 +02:00