Matthew Honnibal
4363b4aa4a
Fix redundant tokvecs updates during update
2017-08-13 12:36:55 +02:00
Matthew Honnibal
12de263813
Bug fixes to beam parsing. Learns small sample
2017-08-13 09:33:39 +02:00
Matthew Honnibal
4ae0d5e1e6
Set defaults for convert command
2017-08-13 09:03:38 +02:00
Matthew Honnibal
92ebab6073
Update beam-update tests
2017-08-13 08:56:02 +02:00
Matthew Honnibal
17874fe491
Disable beam parsing
2017-08-12 19:35:40 -05:00
Matthew Honnibal
69f21867b5
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-08-12 19:25:56 -05:00
Matthew Honnibal
3e30712b62
Improve defaults
2017-08-12 19:24:17 -05:00
Matthew Honnibal
28e930aae0
Fixes for beam parsing. Not working
2017-08-12 19:22:52 -05:00
Matthew Honnibal
c96d769836
Fix beam parse. Not sure if working
2017-08-12 18:21:54 -05:00
Matthew Honnibal
24b45b45c6
Add test for beam update
2017-08-12 17:15:28 -05:00
Matthew Honnibal
4638f4b869
Fix beam update
2017-08-12 17:15:16 -05:00
Matthew Honnibal
d4308d2363
Initialize State offset to 0
2017-08-12 17:14:39 -05:00
Matthew Honnibal
b353e4d843
Work on parser beam training
2017-08-12 14:47:45 -05:00
ines
d4f2baf7dd
Add create_meta option to package command
Re-create meta.json in model directory, even if it exists. Especially
useful when updating existing spaCy models or training with Prodigy.
Ensures user won't end up with multiple "en_core_web_sm" models, and
offers easy way to change the model's name and settings without having
to edit the meta.json file.
2017-08-12 21:44:18 +02:00
Matthew Honnibal
4ab0c8c8e9
Try different drop_layer structure in Tok2Vec
2017-08-12 08:56:57 -05:00
Matthew Honnibal
cd5ecedf6a
Try drop_layer in parser
2017-08-12 08:56:33 -05:00
Matthew Honnibal
8870d491f1
Remove redundant pickling during training
2017-08-12 08:55:53 -05:00
Matthew Honnibal
680043ebca
Improve efficiency of tagger.set_annotations for GPU
2017-08-12 08:54:21 -05:00
Matthew Honnibal
ebe0f7f641
Pass embed size correctly in tagger, and cache embeddings for efficiency
2017-08-12 05:45:20 -05:00
Matthew Honnibal
1a59db1c86
Fix dropout and learn rate in parser
2017-08-12 05:44:39 -05:00
Matthew Honnibal
d01dc3704a
Adjust parser model
2017-08-09 20:06:33 -05:00
Matthew Honnibal
f37528ef58
Pass embed size for parser fine-tune. Use SELU
2017-08-09 17:52:53 -05:00
Matthew Honnibal
f93f2bed58
Revert use of layer normalization in Tok2Vec
2017-08-09 17:47:03 -05:00
Matthew Honnibal
20944dd8aa
Fix conflict in parser fine-tuning
2017-08-09 16:43:05 -05:00
Matthew Honnibal
ac2de6dced
Switch to ReLU layers in Tok2Vec
2017-08-09 16:41:25 -05:00
Matthew Honnibal
bbace204be
Gate parser fine-tuning behind feature flag
2017-08-09 16:40:42 -05:00
Matthew Honnibal
a59a1deac4
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-08-09 16:23:19 -05:00
Matthew Honnibal
bcce6f7de0
Fix parser fine tuning
2017-08-09 16:23:12 -05:00
ines
28e2fec23b
Fix autolinking failure on fresh model install (resolves #1138)
On fresh install via subprocess, pip.get_installed_distributions()
won't show new model, so is_package check in link command fails.
Solution for now is to get model package path explicitly and pass it to
link command.
2017-08-09 11:52:38 +02:00
Matthew Honnibal
dbdd8afc4b
Fix parser fine-tune training
2017-08-08 15:46:07 -05:00
Matthew Honnibal
88bf1cf87c
Update parser for fine tuning
2017-08-08 15:34:17 -05:00
Matthew Honnibal
5d837c3776
Add mix weights on fine_tune
2017-08-07 06:32:59 -05:00
Matthew Honnibal
42bd26f6f3
Give parser its own tok2vec weights
2017-08-06 18:33:46 +02:00
Matthew Honnibal
3ed203de25
Use LayerNorm and SELU in Tok2Vec
2017-08-06 18:33:18 +02:00
Matthew Honnibal
78498a072d
Return Transition for missing actions in lookup_action
2017-08-06 14:16:36 +02:00
Matthew Honnibal
4a5cc89138
Fix tagger 'fine_tune', to keep private CNN weights
2017-08-06 14:15:48 +02:00
Matthew Honnibal
3cb8f06881
Fix NeuralLabeller
2017-08-06 14:15:14 +02:00
Matthew Honnibal
0acce0521b
Fix Language.update for pipeline
2017-08-06 14:13:03 +02:00
Matthew Honnibal
bfffdeabb2
Fix parser batch-size bug introduced during cleanup
2017-08-06 14:10:48 +02:00
Matthew Honnibal
0eec7c9e9b
Fix Language.evaluate
2017-08-06 02:18:31 +02:00
Matthew Honnibal
0a566dc320
Add update_tensors flag to Language.update. Experimental, re #1182
2017-08-06 02:18:12 +02:00
Matthew Honnibal
cc19ea0e7c
Add update_tensors flag to Language.update. Experimental, re #1182
2017-08-06 02:17:10 +02:00
Matthew Honnibal
4cfb7a54e7
Fix tagger
2017-08-06 01:53:31 +02:00
Matthew Honnibal
e9ab800e15
Fix tagging model
2017-08-06 01:50:08 +02:00
Matthew Honnibal
468c138ab3
WIP: Add fine-tuning logic to tagger model, re #1182
2017-08-06 01:13:23 +02:00
Matthew Honnibal
7f876a7a82
Clean up some unused code in parser
2017-08-06 00:00:21 +02:00
Matthew Honnibal
ae1ad81069
Increment version
2017-08-05 18:09:32 +02:00
Matthew Honnibal
5c323daa1a
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-08-01 22:10:37 +02:00
Matthew Honnibal
2e00361522
Fix update when 0 docs
2017-08-01 22:10:17 +02:00
Matthew Honnibal
8fce187de4
Fix ArcEager for missing values
2017-08-01 22:10:05 +02:00
ines
78e262140f
Add workaround for displaCy server on Python 2/3 (resolves #1227)
Make sure status and headers are bytes on Python 2 and strings on
Python 3
2017-08-01 01:11:35 +02:00
Matthew Honnibal
27abc56e98
Add method to get beam entities
2017-07-29 21:59:02 +02:00
Matthew Honnibal
ec63f4fe7b
Add option to control how missing entities are handled when getting NER tags
2017-07-29 21:58:37 +02:00
Matthew Honnibal
aff325b7e0
Increment version
2017-07-25 19:41:20 +02:00
Matthew Honnibal
6780132821
Fix tagger loading
2017-07-25 19:41:11 +02:00
Matthew Honnibal
fd20a4af55
Increment version
2017-07-25 18:58:34 +02:00
Matthew Honnibal
523b0df2c9
Update text classification model
2017-07-25 18:57:59 +02:00
Matthew Honnibal
7c7fac9337
Add spacy.blank() loading function
2017-07-25 18:56:37 +02:00
Matthew Honnibal
5771bd1ff8
Increment version
2017-07-23 14:18:38 +02:00
Matthew Honnibal
c4a81a47a4
Fix deserialization
2017-07-23 14:11:07 +02:00
Matthew Honnibal
2df563ad24
Remove optimization for textcat that caused loading problem
2017-07-23 14:10:51 +02:00
Matthew Honnibal
4fe77bced2
Add cfg attr to pipeline components
2017-07-23 00:52:47 +02:00
Matthew Honnibal
d8aa721664
Compute Language.meta with a property
2017-07-23 00:50:18 +02:00
Matthew Honnibal
a88a7deffe
Fix save/load of textcat config
2017-07-23 00:33:43 +02:00
Matthew Honnibal
9bae0ddc50
Fix minibatching
2017-07-22 20:14:49 +02:00
Matthew Honnibal
ded0df5e2f
Expose hyper-param as keyword arg
2017-07-22 20:14:37 +02:00
Matthew Honnibal
f5de8deeec
Increment version
2017-07-22 20:04:53 +02:00
Matthew Honnibal
b55714d5d1
Make gold_tuples arg optional in begin_training
2017-07-22 20:04:43 +02:00
Matthew Honnibal
ed6c85fa3c
Fix loading of text categories in GoldParse
2017-07-22 20:04:03 +02:00
Matthew Honnibal
6ffec9dfea
Update _ml, for textcat model
2017-07-22 20:03:40 +02:00
Matthew Honnibal
d6a5c2c85a
Add test for NER
2017-07-22 01:48:58 +02:00
Matthew Honnibal
28244df4da
Add test for beam parsing
2017-07-22 01:48:35 +02:00
Matthew Honnibal
c86445bdfd
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-07-22 01:14:28 +02:00
Matthew Honnibal
b3a749610e
Fix name of TextCategorizer
2017-07-22 01:14:07 +02:00
Matthew Honnibal
2424493970
Remove unnecessary import of Mock
2017-07-22 01:13:54 +02:00
Matthew Honnibal
baa3d81c35
Add text categorizer to Language
2017-07-22 01:13:36 +02:00
Matthew Honnibal
a6a2159969
Add slot for text categories to Doc
2017-07-22 00:34:15 +02:00
Matthew Honnibal
374ab3ecfb
Increment alpha version
2017-07-22 00:32:49 +02:00
Matthew Honnibal
289f23df51
Test beam parsing
2017-07-20 15:03:10 +02:00
Matthew Honnibal
3da1063b36
Add beam decoding to parser, to allow NER uncertainties
2017-07-20 15:02:55 +02:00
Matthew Honnibal
0ca5832427
Improve negative example handling in NER oracle
2017-07-20 00:18:49 +02:00
Matthew Honnibal
a231b56d40
Add text-classification hook to pipeline
2017-07-20 00:18:15 +02:00
Matthew Honnibal
7ea50182a5
Add support for text-classification labels to GoldParse
2017-07-20 00:17:47 +02:00
Matthew Honnibal
727481377e
Add text-classifier thinc models
2017-07-20 00:17:17 +02:00
Matthew Honnibal
f014138c11
Fix parser tests
2017-07-20 00:16:52 +02:00
Ines Montani
c91642efd5
Port over changes from #1168
2017-07-01 11:43:54 +02:00
Jim Regan
d81ceb0cd5
Merge branch 'develop' into polish
2017-06-26 22:42:27 +01:00
Jim O'Regan
2f84c73585
a start
2017-06-26 22:40:04 +01:00
Jim O'Regan
28d7f0a672
reference
2017-06-26 22:38:28 +01:00
Matthew Honnibal
91e52543ef
Merge pull request #1118 from Gregory-Howard/patch-2
Update _tokenizer_exceptions_list (adding cities)
2017-06-20 11:16:07 +02:00
Matthew Honnibal
8ea785e01a
Merge pull request #1119 from oroszgy/patch-3
Fixed conllu converter
2017-06-20 11:14:41 +02:00
Tpt
7745b3ae04
Adds noun chunks to French syntax iterators
2017-06-12 15:29:58 +02:00
Tpt
57e8254f63
Adds function to extract French noun chunks
2017-06-12 15:20:49 +02:00
György Orosz
62dbf9025c
Fixed conllu converter
2017-06-09 22:53:56 +02:00
Grégory Howard
cd974b32b7
Update _tokenizer_exceptions_list (adding cities)
2017-06-09 17:58:18 +02:00
ines
34a2eecb17
Add simple "naughty strings" test (see #1107)
2017-06-06 17:43:51 +02:00
ines
045574a936
Update package name and increment version
2017-06-05 20:41:30 +02:00
Matthew Honnibal
1f5874a927
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-06-05 20:20:00 +02:00
ines
03db56f48c
Detect spaCy version and add package title
Package title allows customised package names (like spacy-nightly)
2017-06-05 20:11:02 +02:00
Matthew Honnibal
c0d90f52f7
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-06-05 19:20:13 +02:00