Matthew Honnibal
|
a3b69bcb3d
|
Add low_data mode in textcat
|
2017-09-02 14:56:30 +02:00 |
|
Matthew Honnibal
|
ead78c7b9b
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-09-02 12:55:25 +02:00 |
|
Matthew Honnibal
|
5e6a9e7dcc
|
Add rule-based SBD
|
2017-09-02 12:53:38 +02:00 |
|
Matthew Honnibal
|
a824cf8f9a
|
Adjust text classification model
|
2017-09-02 11:41:00 +02:00 |
|
Matthew Honnibal
|
ac040b99bb
|
Add support for pre-trained vectors in text classifier
|
2017-09-01 16:39:55 +02:00 |
|
Matthew Honnibal
|
7742a6d559
|
Add GloVe vectors reader
|
2017-09-01 16:39:22 +02:00 |
|
Matthew Honnibal
|
789e1a3980
|
Use 13 parser features, not 8
|
2017-08-31 14:13:00 -05:00 |
|
Matthew Honnibal
|
30e35d9666
|
Fix syntax error
|
2017-08-30 17:35:39 -05:00 |
|
Matthew Honnibal
|
4ceebde523
|
Fix gradient bug in parser
|
2017-08-30 17:32:56 -05:00 |
|
ines
|
173089a45a
|
Add more validation for model meta
|
2017-08-29 11:21:46 +02:00 |
|
Matthew Honnibal
|
2e28982e28
|
Merge pull request #1288 from geovedi/indonesian
Indonesian language support
|
2017-08-26 21:31:13 +02:00 |
|
ines
|
7e04b7f89c
|
Fix info text on pipeline in package cli
|
2017-08-26 18:30:59 +02:00 |
|
ines
|
40afa13a8a
|
Increment version
|
2017-08-26 18:30:49 +02:00 |
|
Matthew Honnibal
|
876f38c548
|
Merge pull request #1279 from oroszgy/model_cli_v2
Added vector loading to model cli
|
2017-08-26 15:57:50 +02:00 |
|
Matthew Honnibal
|
cfc055734e
|
Split % in units, for compatibility with corpus
|
2017-08-25 20:03:37 -05:00 |
|
Matthew Honnibal
|
4bb6bc3f9e
|
Add support for sent_start to GoldParse
|
2017-08-25 20:03:14 -05:00 |
|
Matthew Honnibal
|
44589fb38c
|
Fix Break oracle
|
2017-08-25 19:50:55 -05:00 |
|
Matthew Honnibal
|
6d4e8e14ca
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-08-25 12:37:16 -05:00 |
|
Matthew Honnibal
|
4ce5531389
|
Use layer norm instead of batch norm
|
2017-08-25 12:37:10 -05:00 |
|
Matthew Honnibal
|
20dd66ddc2
|
Constrain sentence boundaries to IS_PUNCT and IS_SPACE tokens
|
2017-08-25 19:35:47 +02:00 |
|
Jim Geovedi
|
58d8078971
|
Merge remote-tracking branch 'upstream/develop' into indonesian
|
2017-08-25 09:21:49 +08:00 |
|
Matthew Honnibal
|
6ceb0f0518
|
Allow Lexeme.rank to be set
|
2017-08-24 21:43:00 +02:00 |
|
Matthew Honnibal
|
44a1fa80d3
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-08-23 13:02:16 +02:00 |
|
ines
|
bb1abbeba5
|
Only link model if download was successful
|
2017-08-23 12:36:31 +02:00 |
|
Matthew Honnibal
|
bb2541ffd3
|
Fix PROB attr for OOV words
|
2017-08-23 12:11:52 +02:00 |
|
Matthew Honnibal
|
1c5c256e58
|
Fix fine_tune when optimizer is None
|
2017-08-23 10:51:33 +02:00 |
|
Matthew Honnibal
|
9c580ad28a
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-08-22 17:02:04 -05:00 |
|
Matthew Honnibal
|
a4633fff6f
|
Restore use of batch norm in model
|
2017-08-22 17:01:58 -05:00 |
|
Matthew Honnibal
|
03b5b9727a
|
Fix Doc.vector for empty doc objects
|
2017-08-22 19:52:19 +02:00 |
|
Matthew Honnibal
|
0551b7b03a
|
Fix doc.vector
|
2017-08-22 19:46:52 +02:00 |
|
Matthew Honnibal
|
83f8e98450
|
Fix retrieval of OOV vectors
|
2017-08-22 19:46:35 +02:00 |
|
Matthew Honnibal
|
df2745eb08
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-08-22 19:00:43 +02:00 |
|
Matthew Honnibal
|
5b329acbf2
|
Fix vectors_length property in vocab
|
2017-08-22 19:00:27 +02:00 |
|
Matthew Honnibal
|
1fe605dfe5
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-08-21 19:18:31 -05:00 |
|
Matthew Honnibal
|
18b64e79ec
|
Fix fine tuning
|
2017-08-21 19:18:26 -05:00 |
|
Matthew Honnibal
|
682346dd66
|
Restore optimized hidden_depth=0 for parser
|
2017-08-21 19:18:04 -05:00 |
|
Matthew Honnibal
|
a21d8f3f0b
|
Add predict paths to _ml models
|
2017-08-21 23:23:45 +02:00 |
|
Matthew Honnibal
|
cec76801dc
|
Add profile command to CLI
|
2017-08-21 23:23:05 +02:00 |
|
Matthew Honnibal
|
7be5f30f17
|
Add profile function
|
2017-08-21 23:22:49 +02:00 |
|
ines
|
a68dc891ea
|
Port over changes from #1281
|
2017-08-21 23:19:18 +02:00 |
|
Matthew Honnibal
|
5e50a65252
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-08-21 14:15:46 -05:00 |
|
Matthew Honnibal
|
80acbc5f1f
|
Fix fine-tune weight mixture
|
2017-08-21 14:15:29 -05:00 |
|
ines
|
d15775c3ad
|
Fix typos and commands in alpha docs
|
2017-08-21 13:40:11 +02:00 |
|
Gyorgy Orosz
|
b3576bfc86
|
Added vector loading to model cli
|
2017-08-20 23:16:12 +02:00 |
|
Matthew Honnibal
|
c10f63bf10
|
Initialize fine tuning to 0.5
|
2017-08-20 15:59:48 -05:00 |
|
Matthew Honnibal
|
62878e50db
|
Fix misalignment caused by filtering inputs at wrong point in parser
|
2017-08-20 15:59:28 -05:00 |
|
Matthew Honnibal
|
78a5f842e9
|
Fix update when update_shared=False
|
2017-08-20 15:58:34 -05:00 |
|
Matthew Honnibal
|
7a6edeea68
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-08-20 12:55:39 -05:00 |
|
Matthew Honnibal
|
f2f9229964
|
Fix name of update_shared flag
|
2017-08-20 18:19:06 +02:00 |
|
Matthew Honnibal
|
8a59718fd6
|
Fix fine-tuning
|
2017-08-20 18:17:35 +02:00 |
|
Matthew Honnibal
|
80a5146ec2
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-08-20 11:07:08 -05:00 |
|
Matthew Honnibal
|
84bb543e4d
|
Add gold_preproc flag to cli/train
|
2017-08-20 11:07:00 -05:00 |
|
Matthew Honnibal
|
3fe0d76e6d
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-08-20 14:50:01 +02:00 |
|
Matthew Honnibal
|
c1d3ff517a
|
Track loss in tagger
|
2017-08-20 14:42:23 +02:00 |
|
Matthew Honnibal
|
8875590081
|
Add optimizer in Language.update if sgd=None
|
2017-08-20 14:42:07 +02:00 |
|
Matthew Honnibal
|
84b7ed49e4
|
Ensure updates aren't made if no gold available
|
2017-08-20 14:41:38 +02:00 |
|
Ines Montani
|
c2bbd393af
|
Merge pull request #1276 from oroszgy/model_cli_v2
Ported model cli from v1
|
2017-08-20 11:52:59 +02:00 |
|
Jim Geovedi
|
f77443ab68
|
reworked
|
2017-08-20 13:43:21 +07:00 |
|
Jim Geovedi
|
fbc62a09c7
|
added {pre,suf,in}fix tests
|
2017-08-20 13:43:00 +07:00 |
|
Jim Geovedi
|
713d7c0aa0
|
added indonesian lang test
|
2017-08-20 12:17:14 +07:00 |
|
Jim Geovedi
|
b7d83f37c8
|
indonesian abbr.
|
2017-08-20 12:16:50 +07:00 |
|
Jim Geovedi
|
7193c47f0b
|
direct lookup
|
2017-08-20 11:57:52 +07:00 |
|
Jim Geovedi
|
fdf802d505
|
added examples
|
2017-08-20 11:57:10 +07:00 |
|
Jim Geovedi
|
fa544e6c9a
|
Merge remote-tracking branch 'upstream/develop' into indonesian
|
2017-08-20 11:49:40 +07:00 |
|
Matthew Honnibal
|
42fa84075f
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-08-19 22:42:50 +02:00 |
|
Matthew Honnibal
|
aefef6fd28
|
Prevent strings from being lost during from_disk and from_bytes
|
2017-08-19 22:42:17 +02:00 |
|
ines
|
281e7e58b3
|
Don't escape forward slashes on ujson.dumps
|
2017-08-19 22:32:16 +02:00 |
|
ines
|
2d126a00ae
|
Fix typo
|
2017-08-19 22:32:07 +02:00 |
|
Matthew Honnibal
|
41c2218c53
|
Fix test for vectors
|
2017-08-19 22:09:12 +02:00 |
|
Matthew Honnibal
|
b8e1603cc4
|
Fix load fail for missing vectors
|
2017-08-19 22:07:00 +02:00 |
|
Matthew Honnibal
|
a3c51a0355
|
Fix creation of pipeline
|
2017-08-19 21:58:57 +02:00 |
|
Gyorgy Orosz
|
e5344b83a3
|
Ported model cli from v1
|
2017-08-19 21:45:23 +02:00 |
|
Matthew Honnibal
|
6a94648373
|
Fix serialization
|
2017-08-19 21:27:35 +02:00 |
|
Matthew Honnibal
|
1157294434
|
Improve vector handling
|
2017-08-19 20:35:33 +02:00 |
|
Matthew Honnibal
|
ef87562741
|
Restore vectors test utils
|
2017-08-19 20:35:16 +02:00 |
|
Matthew Honnibal
|
1391f9da37
|
Restore vectors tests
|
2017-08-19 20:34:58 +02:00 |
|
Matthew Honnibal
|
8cfeeb4884
|
Increment version
|
2017-08-19 19:52:58 +02:00 |
|
Matthew Honnibal
|
93fb8b64e9
|
Fix vector loading
|
2017-08-19 19:52:25 +02:00 |
|
Matthew Honnibal
|
49a615e7d9
|
Create Vectors object in Vocab
|
2017-08-19 18:50:16 +02:00 |
|
Matthew Honnibal
|
3d049af563
|
Improve vectors to/from disk
|
2017-08-19 18:42:11 +02:00 |
|
Matthew Honnibal
|
d55d6e1cfa
|
Fix comparison of Token from different docs. Closes #1257
|
2017-08-19 16:39:32 +02:00 |
|
Matthew Honnibal
|
9b6a5df15e
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-08-19 16:24:57 +02:00 |
|
Matthew Honnibal
|
4fda02c7e6
|
Add test for new Span.to_array method
|
2017-08-19 16:24:38 +02:00 |
|
Matthew Honnibal
|
dea229c634
|
Fix Span.to_array method
|
2017-08-19 16:24:28 +02:00 |
|
Matthew Honnibal
|
c606b4a42c
|
Add test for Doc.char_span
|
2017-08-19 16:18:23 +02:00 |
|
Matthew Honnibal
|
8b7ac77c23
|
Allow span label to be string in Doc.char_span
|
2017-08-19 16:18:09 +02:00 |
|
Matthew Honnibal
|
7c47e38c12
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-08-19 09:03:15 -05:00 |
|
Matthew Honnibal
|
ab28f911b4
|
Fix parser learning rates
|
2017-08-19 09:02:57 -05:00 |
|
ines
|
1fe5e1a4d1
|
Add language example sentences (see #1107)
da, de, en, es, fr, he, it, nb, pl, pt, sv
|
2017-08-19 12:22:29 +02:00 |
|
Matthew Honnibal
|
97aabafb5f
|
Document as_tuples keyword arg of Language.pipe
|
2017-08-19 12:21:33 +02:00 |
|
Matthew Honnibal
|
80236116a6
|
Add Doc.char_span method, to get a span by character offset
|
2017-08-19 12:21:09 +02:00 |
|
Matthew Honnibal
|
482bba1722
|
Add Span.to_array method
|
2017-08-19 12:20:45 +02:00 |
|
Matthew Honnibal
|
19c495f451
|
Fix vectors deserialization
|
2017-08-19 04:33:03 +02:00 |
|
Matthew Honnibal
|
42d47c1e5c
|
Fix tagger serialization
|
2017-08-19 04:16:32 +02:00 |
|
Matthew Honnibal
|
2da96a0ec7
|
Fix beam test
|
2017-08-19 04:15:46 +02:00 |
|
Matthew Honnibal
|
a7309a217d
|
Update tagger serialization
|
2017-08-18 23:12:05 +02:00 |
|
Matthew Honnibal
|
bae59bf92f
|
Remove BiLSTM import
|
2017-08-18 22:46:59 +02:00 |
|
Matthew Honnibal
|
c307a0ffb8
|
Restore patches from nn-beam-parser to spacy/syntax
|
2017-08-18 22:38:59 +02:00 |
|
Matthew Honnibal
|
fe90dfc390
|
Restore changes from nn-beam-parser to spacy/_ml
|
2017-08-18 22:38:28 +02:00 |
|
Matthew Honnibal
|
de7e8703e3
|
Restore tests for beam parser
|
2017-08-18 22:27:42 +02:00 |
|
Matthew Honnibal
|
11c31d285c
|
Restore changes from nn-beam-parser
|
2017-08-18 22:26:12 +02:00 |
|
Matthew Honnibal
|
ce321b0322
|
Restore changes from nn-beam-parser to spacy/_ml
|
2017-08-18 22:24:46 +02:00 |
|
Matthew Honnibal
|
5f81d700ff
|
Restore patches from nn-beam-parser to spacy/syntax
|
2017-08-18 22:23:03 +02:00 |
|
Matthew Honnibal
|
ec482580b5
|
Restore changes to pipeline.pyx from nn-beam-parser branch
|
2017-08-18 22:02:35 +02:00 |
|
Matthew Honnibal
|
931509d96a
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-08-18 21:57:15 +02:00 |
|
Matthew Honnibal
|
ed95009b5c
|
Fix data loading on Python 2
|
2017-08-18 21:57:06 +02:00 |
|
Matthew Honnibal
|
baf36d0588
|
Add compat function for importlib.util
|
2017-08-18 21:56:47 +02:00 |
|
Matthew Honnibal
|
263366729e
|
Don't import BiLSTM
|
2017-08-18 21:56:31 +02:00 |
|
Matthew Honnibal
|
28162290b3
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-08-18 14:55:40 -05:00 |
|
Matthew Honnibal
|
85794c1167
|
Restore state of _ml.py
|
2017-08-18 14:55:23 -05:00 |
|
Matthew Honnibal
|
d456d2efe1
|
Fix conflicts in nn_parser
|
2017-08-18 20:55:58 +02:00 |
|
Matthew Honnibal
|
1cec1efca7
|
Fix merge conflicts in nn_parser from beam stuff
|
2017-08-18 20:50:49 +02:00 |
|
Matthew Honnibal
|
69bcacdc09
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-08-18 20:47:13 +02:00 |
|
Matthew Honnibal
|
2993b54fff
|
Load vectors in vocab
|
2017-08-18 20:46:56 +02:00 |
|
Matthew Honnibal
|
a1ec41298c
|
Restore CFile loader
|
2017-08-18 20:46:16 +02:00 |
|
Matthew Honnibal
|
ed4fb991dc
|
Work on vectors loading
|
2017-08-18 20:45:48 +02:00 |
|
Matthew Honnibal
|
426f84937f
|
Resolve conflicts when merging new beam parsing stuff
|
2017-08-18 13:38:32 -05:00 |
|
Matthew Honnibal
|
5181e8bedb
|
Fix merge conflict in _ml
|
2017-08-18 13:35:51 -05:00 |
|
Matthew Honnibal
|
f75420ae79
|
Unhack beam parsing, moving it under options instead of global flags
|
2017-08-18 13:31:15 -05:00 |
|
Jim Geovedi
|
7ae45bffcf
|
Merge remote-tracking branch 'upstream/develop' into indonesian
|
2017-08-18 10:14:46 +07:00 |
|
Dan O'Huiginn
|
ebf5a3ce59
|
Allow loading with python < 3.6
Don't rely on recent python features to load models
Fixes Issue #1271
|
2017-08-17 15:15:47 +00:00 |
|
Matthew Honnibal
|
0209a06b4e
|
Update beam parser
|
2017-08-16 18:25:49 -05:00 |
|
Matthew Honnibal
|
4b1e7bd6d8
|
Improve tensorizer model
|
2017-08-16 18:25:20 -05:00 |
|
Matthew Honnibal
|
a6d8d7c82e
|
Add is_gold_parse method to transition system
|
2017-08-16 18:24:09 -05:00 |
|
Matthew Honnibal
|
3533bb61cb
|
Add option of 8 feature parse state
|
2017-08-16 18:23:27 -05:00 |
|
Matthew Honnibal
|
1cb2f15d65
|
Clean up unused predict_confidences function
|
2017-08-16 18:22:26 -05:00 |
|
Matthew Honnibal
|
210f6d5175
|
Fix efficiency error in batch parse
|
2017-08-15 03:19:03 -05:00 |
|
Matthew Honnibal
|
23537a011d
|
Tweaks to beam parser
|
2017-08-15 03:15:28 -05:00 |
|
Matthew Honnibal
|
500e92553d
|
Fix memory error when copying scores in beam
|
2017-08-15 03:15:04 -05:00 |
|
Matthew Honnibal
|
a8e4064dd8
|
Fix tensor gradient in parser
|
2017-08-15 03:14:36 -05:00 |
|
Matthew Honnibal
|
e420e0366c
|
Remove use of hash function in beam parser
|
2017-08-15 03:13:57 -05:00 |
|
Matthew Honnibal
|
6259490347
|
Fix mixture weights in fine_tune
|
2017-08-14 17:55:18 -05:00 |
|
Matthew Honnibal
|
335fa8b05c
|
Fix gradient in fine_tune
|
2017-08-14 14:55:47 -05:00 |
|
Matthew Honnibal
|
d9f82f6b50
|
Increment version
|
2017-08-14 14:55:26 +02:00 |
|
ines
|
a29f132ffd
|
Change python -m spacy to spacy
Reflects latest change to entry point or auto-alias
|
2017-08-14 13:04:48 +02:00 |
|
ines
|
65bf80302c
|
Increment version
|
2017-08-14 13:04:30 +02:00 |
|
Matthew Honnibal
|
52c180ecf5
|
Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop"
This reverts commit ea8de11ad5, reversing
changes made to 08e443e083.
|
2017-08-14 13:00:23 +02:00 |
|
Matthew Honnibal
|
dbbfe595a5
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-08-14 12:09:28 +02:00 |
|
Matthew Honnibal
|
ac6c25f762
|
Check SGD is not None in update
|
2017-08-14 12:09:18 +02:00 |
|
Matthew Honnibal
|
0ae045256d
|
Fix beam training
|
2017-08-13 18:02:05 -05:00 |
|
Matthew Honnibal
|
6a42cc16ff
|
Fix beam parser, improve efficiency of non-beam
|
2017-08-13 12:37:26 +02:00 |
|
Matthew Honnibal
|
4363b4aa4a
|
Fix redundant tokvecs updates during update
|
2017-08-13 12:36:55 +02:00 |
|
Matthew Honnibal
|
12de263813
|
Bug fixes to beam parsing. Learns small sample
|
2017-08-13 09:33:39 +02:00 |
|
Matthew Honnibal
|
4ae0d5e1e6
|
Set defaults for convert command
|
2017-08-13 09:03:38 +02:00 |
|
Matthew Honnibal
|
92ebab6073
|
Update beam-update tests
|
2017-08-13 08:56:02 +02:00 |
|
Matthew Honnibal
|
17874fe491
|
Disable beam parsing
|
2017-08-12 19:35:40 -05:00 |
|
Matthew Honnibal
|
69f21867b5
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-08-12 19:25:56 -05:00 |
|
Matthew Honnibal
|
3e30712b62
|
Improve defaults
|
2017-08-12 19:24:17 -05:00 |
|
Matthew Honnibal
|
28e930aae0
|
Fixes for beam parsing. Not working
|
2017-08-12 19:22:52 -05:00 |
|
Matthew Honnibal
|
c96d769836
|
Fix beam parse. Not sure if working
|
2017-08-12 18:21:54 -05:00 |
|