Matthew Honnibal
ffda38356a
Add util function to enable GPU
2017-09-20 19:16:35 -05:00
Matthew Honnibal
24e85c2048
Pass values for CNN maxout pieces option
2017-09-20 19:16:12 -05:00
Matthew Honnibal
b832f89ff8
Add resume_training function
2017-09-20 19:15:20 -05:00
Matthew Honnibal
f5144f04be
Add argument for CNN maxout pieces
2017-09-20 19:14:41 -05:00
Matthew Honnibal
842e21de9f
Fix int type error for Python 2
2017-09-20 23:55:30 +02:00
Matthew Honnibal
0c93c73e49
Add __reduce__ method for PhraseMatcher
2017-09-20 22:26:40 +02:00
Matthew Honnibal
cc408fc189
Make PhraseMatcher API like Matcher API
2017-09-20 22:20:35 +02:00
Matthew Honnibal
43ad250dd5
Update matcher tests
2017-09-20 21:54:49 +02:00
Matthew Honnibal
828cc91545
Fix PhraseMatcher for spaCy 2
2017-09-20 21:54:31 +02:00
Matthew Honnibal
78301b2d29
Avoid comparison to None in Tok2Vec
2017-09-20 00:19:34 +02:00
Matthew Honnibal
b36a38f63d
Fix serialization of pretrained_dims property
2017-09-19 23:42:27 +02:00
Matthew Honnibal
2489dcaccf
Fix serialization of parser
2017-09-19 23:42:12 +02:00
Matthew Honnibal
40837b275d
Fix tensorizer with pretrained vectors
2017-09-18 18:05:38 -05:00
Matthew Honnibal
a0c4b33d03
Support resuming a model during spacy train
2017-09-18 18:04:47 -05:00
Matthew Honnibal
c858927271
Copy vectors to GPU on begin training
2017-09-18 18:04:16 -05:00
Matthew Honnibal
3fa76c17d1
Refactor Tok2Vec
2017-09-18 15:00:05 -05:00
Matthew Honnibal
217e7891cd
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-09-18 11:36:21 -05:00
Matthew Honnibal
7b3f391f80
Try dropping the Affine layer, conditionally
2017-09-18 11:35:59 -05:00
ines
2480f8f521
Add missing return in Doc.from_disk() (closes #1330)
2017-09-18 15:32:00 +02:00
Matthew Honnibal
2148ae605b
Don't use iterated convolutions
2017-09-17 17:36:04 -05:00
Matthew Honnibal
c013e5996f
Fix parser test
2017-09-17 13:13:20 -05:00
Matthew Honnibal
8f42f8d305
Remove unused 'preprocess' argument in Tok2Vec
2017-09-17 12:30:16 -05:00
Matthew Honnibal
039d609362
Remove hard-coded default vectors width
2017-09-17 12:29:39 -05:00
Matthew Honnibal
4f38a67a89
Make width default to 0 in vectors.pyx
2017-09-17 12:29:14 -05:00
Matthew Honnibal
16122f566e
Fix cpdef enum in attrs.pyx
2017-09-17 12:28:53 -05:00
Matthew Honnibal
b159e0eb50
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-09-17 05:47:50 -05:00
Matthew Honnibal
2b0efc77ae
Fix wiring of pre-trained vectors in parser loading
2017-09-17 05:47:34 -05:00
Matthew Honnibal
31c2e91c35
Fix wiring of pre-trained vectors in parser loading
2017-09-17 05:46:55 -05:00
Matthew Honnibal
8f913a74ca
Fix defaults and args to build_tagger_model
2017-09-17 05:46:36 -05:00
Matthew Honnibal
c003c561c3
Revert NER action loading change, for model compatibility
2017-09-17 05:46:03 -05:00
Matthew Honnibal
43210abacc
Resolve fine-tuning conflict
2017-09-17 05:30:04 -05:00
ines
ece30c28a8
Don't split hyphenated words in German
This way, the tokenizer matches the tokenization in German treebanks
2017-09-16 20:40:15 +02:00
ines
68f66aebf8
Use pkg_resources instead of pip for is_package (resolves #1293)
2017-09-16 20:27:59 +02:00
Matthew Honnibal
5ff2491f24
Pass option for pre-trained vectors in parser
2017-09-16 12:47:21 -05:00
Matthew Honnibal
8665a77f48
Fix feature error in NER
2017-09-16 12:46:57 -05:00
Matthew Honnibal
e37a50a436
Pass documents to tensorizer, not 'features'
2017-09-16 12:46:36 -05:00
Matthew Honnibal
84e637e2e6
Pass option for pretrained vectors in pipeline
2017-09-16 12:46:02 -05:00
Matthew Honnibal
2a93404da6
Support optional pre-trained vectors in tensorizer model
2017-09-16 12:45:37 -05:00
Matthew Honnibal
e0a2aa9289
Support having word vectors data on GPU
2017-09-16 12:45:09 -05:00
Matthew Honnibal
ebf8942564
Fix test for Python3
2017-09-16 16:22:38 +02:00
Matthew Honnibal
8c945310fb
Excuse emoji failure on narrow unicode builds
2017-09-16 16:21:13 +02:00
Matthew Honnibal
11f2a05ede
Fix code explosion from long enum in Python 3, Cython 0.24+
2017-09-16 12:20:04 +02:00
Matthew Honnibal
3fa5b40b5c
Add test for hash consistency
2017-09-16 11:21:35 +02:00
Matthew Honnibal
f730d07e4e
Fix prange error for Windows
2017-09-16 00:25:33 +02:00
Matthew Honnibal
4b2065430e
Merge branch 'feature/parser-history' into develop
2017-09-15 10:42:20 +02:00
Matthew Honnibal
2f08489694
Remove AddHistory layer -- didn't work as planned
2017-09-15 10:41:40 +02:00
Matthew Honnibal
8b481e0465
Remove redundant brackets
2017-09-15 10:38:08 +02:00
Matthew Honnibal
d84607f6bb
Vectorize update in AddHistory
2017-09-14 20:34:40 +02:00
Ines Montani
bd3da3d6fb
Port over change from #1323 and tidy up
2017-09-14 19:23:13 +02:00
Matthew Honnibal
18347ab69c
Implement AddHistory layer wrapper
2017-09-14 19:07:35 +02:00
Matthew Honnibal
d4ca6cef9e
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-09-14 17:00:07 +02:00
Matthew Honnibal
8c503487af
Fix lookup of missing NER actions
2017-09-14 16:59:45 +02:00
Matthew Honnibal
664c5af745
Revert padding in parser
2017-09-14 16:59:25 +02:00
Matthew Honnibal
8496d76224
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-09-14 09:21:20 -05:00
Matthew Honnibal
d1518027a9
Increment version
2017-09-14 16:18:46 +02:00
Matthew Honnibal
70da88a3a7
Update comment on Language.begin_training
2017-09-14 16:18:30 +02:00
Matthew Honnibal
c6395b057a
Improve parser feature extraction, for missing values
2017-09-14 16:18:02 +02:00
Matthew Honnibal
daf869ab3b
Fix add_action for NER, so labelled 'O' actions aren't added
2017-09-14 16:16:41 +02:00
Matthew Honnibal
9cb2aef587
Remove print statement
2017-09-14 13:38:28 +02:00
Matthew Honnibal
ba23d63c35
Fix minibatch function, for fixed batch size
2017-09-14 13:37:41 +02:00
Matthew Honnibal
456bb8a74c
Unxfail and close #1305
2017-09-06 19:14:17 +02:00
Matthew Honnibal
99e44fbdbb
Update regression test
2017-09-06 19:13:51 +02:00
Matthew Honnibal
5c3ff06924
Fix lemmatizer rules
2017-09-06 19:13:24 +02:00
Matthew Honnibal
dd9cab0faf
Fix type-check for int/long
2017-09-06 19:03:05 +02:00
Matthew Honnibal
497a9308a8
Xfail new lemmatizer test
2017-09-06 18:41:22 +02:00
Matthew Honnibal
dcbf866970
Merge parser changes
2017-09-06 18:41:05 +02:00
Matthew Honnibal
5384fff5ce
Add test for 1305: Incorrect lemmatization of VBZ for English
2017-09-06 18:40:18 +02:00
Matthew Honnibal
24ff6b0ad9
Fix parsing and tok2vec models
2017-09-06 05:50:58 -05:00
Matthew Honnibal
1b65115bc2
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-09-04 20:02:53 -05:00
Matthew Honnibal
33fa91feb7
Restore correctness of parser model
2017-09-04 21:19:30 +02:00
Matthew Honnibal
e88a42e460
Increment version
2017-09-04 21:14:39 +02:00
Matthew Honnibal
9d65d67985
Preserve model compatibility in parser, for now
2017-09-04 16:46:22 +02:00
Matthew Honnibal
d5fbf27335
Fix test
2017-09-04 16:45:11 +02:00
Matthew Honnibal
7fdafcc4c4
Fix config loading in tagger
2017-09-04 16:38:49 +02:00
Matthew Honnibal
058372d120
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-09-04 16:27:53 +02:00
Matthew Honnibal
16e25ce3b5
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-09-04 09:26:53 -05:00
Matthew Honnibal
9f512e657a
Fix drop_layer calculation
2017-09-04 09:26:38 -05:00
Matthew Honnibal
cb4839033c
Fix loader for EN tests
2017-09-04 15:19:18 +02:00
Matthew Honnibal
382ce566eb
Fix deserialization bug
2017-09-04 15:19:01 +02:00
Matthew Honnibal
bfddf50081
Fix #1296: Incorrect lemmatization of base form verbs
2017-09-04 15:18:41 +02:00
Matthew Honnibal
b29e6bff46
Improve lemmatization rule for am|VBP
2017-09-04 15:18:10 +02:00
Matthew Honnibal
644d6c9e1a
Improve lemmatization tests, re #1296
2017-09-04 15:17:44 +02:00
Matthew Honnibal
3cf3fa1704
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-09-02 12:46:11 -05:00
Matthew Honnibal
e920885676
Fix pickle during train
2017-09-02 12:46:01 -05:00
Matthew Honnibal
c0eaba8b28
Fix low-data textcat
2017-09-02 15:17:32 +02:00
Matthew Honnibal
9e378bdac5
Fix textcat serialization
2017-09-02 15:17:20 +02:00
Matthew Honnibal
e3ea6ee02b
Increment version
2017-09-02 15:17:01 +02:00
Matthew Honnibal
a3b69bcb3d
Add low_data mode in textcat
2017-09-02 14:56:30 +02:00
Matthew Honnibal
ead78c7b9b
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-09-02 12:55:25 +02:00
Matthew Honnibal
5e6a9e7dcc
Add rule-based SBD
2017-09-02 12:53:38 +02:00
Matthew Honnibal
a824cf8f9a
Adjust text classification model
2017-09-02 11:41:00 +02:00
Matthew Honnibal
ac040b99bb
Add support for pre-trained vectors in text classifier
2017-09-01 16:39:55 +02:00
Matthew Honnibal
7742a6d559
Add GloVe vectors reader
2017-09-01 16:39:22 +02:00
Matthew Honnibal
789e1a3980
Use 13 parser features, not 8
2017-08-31 14:13:00 -05:00
Matthew Honnibal
30e35d9666
Fix syntax error
2017-08-30 17:35:39 -05:00
Matthew Honnibal
4ceebde523
Fix gradient bug in parser
2017-08-30 17:32:56 -05:00
ines
173089a45a
Add more validation for model meta
2017-08-29 11:21:46 +02:00
Matthew Honnibal
2e28982e28
Merge pull request #1288 from geovedi/indonesian
Indonesian language support
2017-08-26 21:31:13 +02:00
ines
7e04b7f89c
Fix info text on pipeline in package cli
2017-08-26 18:30:59 +02:00
ines
40afa13a8a
Increment version
2017-08-26 18:30:49 +02:00
Matthew Honnibal
876f38c548
Merge pull request #1279 from oroszgy/model_cli_v2
Added vector loading to model cli
2017-08-26 15:57:50 +02:00
Matthew Honnibal
cfc055734e
Split % in units, for compatibility with corpus
2017-08-25 20:03:37 -05:00
Matthew Honnibal
4bb6bc3f9e
Add support for sent_start to GoldParse
2017-08-25 20:03:14 -05:00
Matthew Honnibal
44589fb38c
Fix Break oracle
2017-08-25 19:50:55 -05:00
Matthew Honnibal
6d4e8e14ca
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-08-25 12:37:16 -05:00
Matthew Honnibal
4ce5531389
Use layer norm instead of batch norm
2017-08-25 12:37:10 -05:00
Matthew Honnibal
20dd66ddc2
Constrain sentence boundaries to IS_PUNCT and IS_SPACE tokens
2017-08-25 19:35:47 +02:00
Jim Geovedi
58d8078971
Merge remote-tracking branch 'upstream/develop' into indonesian
2017-08-25 09:21:49 +08:00
Matthew Honnibal
6ceb0f0518
Allow Lexeme.rank to be set
2017-08-24 21:43:00 +02:00
Matthew Honnibal
44a1fa80d3
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-08-23 13:02:16 +02:00
ines
bb1abbeba5
Only link model if download was successful
2017-08-23 12:36:31 +02:00
Matthew Honnibal
bb2541ffd3
Fix PROB attr for OOV words
2017-08-23 12:11:52 +02:00
Matthew Honnibal
1c5c256e58
Fix fine_tune when optimizer is None
2017-08-23 10:51:33 +02:00
Matthew Honnibal
9c580ad28a
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-08-22 17:02:04 -05:00
Matthew Honnibal
a4633fff6f
Restore use of batch norm in model
2017-08-22 17:01:58 -05:00
Matthew Honnibal
03b5b9727a
Fix Doc.vector for empty doc objects
2017-08-22 19:52:19 +02:00
Matthew Honnibal
0551b7b03a
Fix doc.vector
2017-08-22 19:46:52 +02:00
Matthew Honnibal
83f8e98450
Fix retrieval of OOV vectors
2017-08-22 19:46:35 +02:00
Matthew Honnibal
df2745eb08
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-08-22 19:00:43 +02:00
Matthew Honnibal
5b329acbf2
Fix vectors_length property in vocab
2017-08-22 19:00:27 +02:00
Matthew Honnibal
1fe605dfe5
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-08-21 19:18:31 -05:00
Matthew Honnibal
18b64e79ec
Fix fine tuning
2017-08-21 19:18:26 -05:00
Matthew Honnibal
682346dd66
Restore optimized hidden_depth=0 for parser
2017-08-21 19:18:04 -05:00
Matthew Honnibal
a21d8f3f0b
Add predict paths to _ml models
2017-08-21 23:23:45 +02:00
Matthew Honnibal
cec76801dc
Add profile command to CLI
2017-08-21 23:23:05 +02:00
Matthew Honnibal
7be5f30f17
Add profile function
2017-08-21 23:22:49 +02:00
ines
a68dc891ea
Port over changes from #1281
2017-08-21 23:19:18 +02:00
Matthew Honnibal
5e50a65252
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-08-21 14:15:46 -05:00
Matthew Honnibal
80acbc5f1f
Fix fine-tune weight mixture
2017-08-21 14:15:29 -05:00
ines
d15775c3ad
Fix typos and commands in alpha docs
2017-08-21 13:40:11 +02:00
Gyorgy Orosz
b3576bfc86
Added vector loading to model cli
2017-08-20 23:16:12 +02:00
Matthew Honnibal
c10f63bf10
Initialize fine tuning to 0.5
2017-08-20 15:59:48 -05:00
Matthew Honnibal
62878e50db
Fix misalignment caused by filtering inputs at wrong point in parser
2017-08-20 15:59:28 -05:00
Matthew Honnibal
78a5f842e9
Fix update when update_shared=False
2017-08-20 15:58:34 -05:00
Matthew Honnibal
7a6edeea68
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-08-20 12:55:39 -05:00
Matthew Honnibal
f2f9229964
Fix name of update_shared flag
2017-08-20 18:19:06 +02:00
Matthew Honnibal
8a59718fd6
Fix fine-tuning
2017-08-20 18:17:35 +02:00
Matthew Honnibal
80a5146ec2
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-08-20 11:07:08 -05:00
Matthew Honnibal
84bb543e4d
Add gold_preproc flag to cli/train
2017-08-20 11:07:00 -05:00
Matthew Honnibal
3fe0d76e6d
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-08-20 14:50:01 +02:00
Matthew Honnibal
c1d3ff517a
Track loss in tagger
2017-08-20 14:42:23 +02:00
Matthew Honnibal
8875590081
Add optimizer in Language.update if sgd=None
2017-08-20 14:42:07 +02:00
Matthew Honnibal
84b7ed49e4
Ensure updates aren't made if no gold available
2017-08-20 14:41:38 +02:00
Ines Montani
c2bbd393af
Merge pull request #1276 from oroszgy/model_cli_v2
Ported model cli from v1
2017-08-20 11:52:59 +02:00
Jim Geovedi
f77443ab68
reworked
2017-08-20 13:43:21 +07:00
Jim Geovedi
fbc62a09c7
added {pre,suf,in}fix tests
2017-08-20 13:43:00 +07:00
Jim Geovedi
713d7c0aa0
added Indonesian lang test
2017-08-20 12:17:14 +07:00
Jim Geovedi
b7d83f37c8
Indonesian abbr.
2017-08-20 12:16:50 +07:00
Jim Geovedi
7193c47f0b
direct lookup
2017-08-20 11:57:52 +07:00
Jim Geovedi
fdf802d505
added examples
2017-08-20 11:57:10 +07:00
Jim Geovedi
fa544e6c9a
Merge remote-tracking branch 'upstream/develop' into indonesian
2017-08-20 11:49:40 +07:00
Matthew Honnibal
42fa84075f
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-08-19 22:42:50 +02:00
Matthew Honnibal
aefef6fd28
Prevent strings from being lost during from_disk and from_bytes
2017-08-19 22:42:17 +02:00
ines
281e7e58b3
Don't escape forward slashes on ujson.dumps
2017-08-19 22:32:16 +02:00
ines
2d126a00ae
Fix typo
2017-08-19 22:32:07 +02:00
Matthew Honnibal
41c2218c53
Fix test for vectors
2017-08-19 22:09:12 +02:00
Matthew Honnibal
b8e1603cc4
Fix load fail for missing vectors
2017-08-19 22:07:00 +02:00
Matthew Honnibal
a3c51a0355
Fix creation of pipeline
2017-08-19 21:58:57 +02:00
Gyorgy Orosz
e5344b83a3
Ported model cli from v1
2017-08-19 21:45:23 +02:00
Matthew Honnibal
6a94648373
Fix serialization
2017-08-19 21:27:35 +02:00
Matthew Honnibal
1157294434
Improve vector handling
2017-08-19 20:35:33 +02:00
Matthew Honnibal
ef87562741
Restore vectors test utils
2017-08-19 20:35:16 +02:00
Matthew Honnibal
1391f9da37
Restore vectors tests
2017-08-19 20:34:58 +02:00
Matthew Honnibal
8cfeeb4884
Increment version
2017-08-19 19:52:58 +02:00
Matthew Honnibal
93fb8b64e9
Fix vector loading
2017-08-19 19:52:25 +02:00
Matthew Honnibal
49a615e7d9
Create Vectors object in Vocab
2017-08-19 18:50:16 +02:00
Matthew Honnibal
3d049af563
Improve vectors to/from disk
2017-08-19 18:42:11 +02:00
Matthew Honnibal
d55d6e1cfa
Fix comparison of Token from different docs. Closes #1257
2017-08-19 16:39:32 +02:00
Matthew Honnibal
9b6a5df15e
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-08-19 16:24:57 +02:00
Matthew Honnibal
4fda02c7e6
Add test for new Span.to_array method
2017-08-19 16:24:38 +02:00
Matthew Honnibal
dea229c634
Fix Span.to_array method
2017-08-19 16:24:28 +02:00
Matthew Honnibal
c606b4a42c
Add test for Doc.char_span
2017-08-19 16:18:23 +02:00
Matthew Honnibal
8b7ac77c23
Allow span label to be string in Doc.char_span
2017-08-19 16:18:09 +02:00
Matthew Honnibal
7c47e38c12
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-08-19 09:03:15 -05:00
Matthew Honnibal
ab28f911b4
Fix parser learning rates
2017-08-19 09:02:57 -05:00
ines
1fe5e1a4d1
Add language example sentences (see #1107)
da, de, en, es, fr, he, it, nb, pl, pt, sv
2017-08-19 12:22:29 +02:00
Matthew Honnibal
97aabafb5f
Document as_tuples keyword arg of Language.pipe
2017-08-19 12:21:33 +02:00
Matthew Honnibal
80236116a6
Add Doc.char_span method, to get a span by character offset
2017-08-19 12:21:09 +02:00
Matthew Honnibal
482bba1722
Add Span.to_array method
2017-08-19 12:20:45 +02:00
Matthew Honnibal
19c495f451
Fix vectors deserialization
2017-08-19 04:33:03 +02:00
Matthew Honnibal
42d47c1e5c
Fix tagger serialization
2017-08-19 04:16:32 +02:00
Matthew Honnibal
2da96a0ec7
Fix beam test
2017-08-19 04:15:46 +02:00
Matthew Honnibal
a7309a217d
Update tagger serialization
2017-08-18 23:12:05 +02:00
Matthew Honnibal
bae59bf92f
Remove BiLSTM import
2017-08-18 22:46:59 +02:00
Matthew Honnibal
c307a0ffb8
Restore patches from nn-beam-parser to spacy/syntax
2017-08-18 22:38:59 +02:00
Matthew Honnibal
fe90dfc390
Restore changes from nn-beam-parser to spacy/_ml
2017-08-18 22:38:28 +02:00
Matthew Honnibal
de7e8703e3
Restore tests for beam parser
2017-08-18 22:27:42 +02:00
Matthew Honnibal
11c31d285c
Restore changes from nn-beam-parser
2017-08-18 22:26:12 +02:00
Matthew Honnibal
ce321b0322
Restore changes from nn-beam-parser to spacy/_ml
2017-08-18 22:24:46 +02:00
Matthew Honnibal
5f81d700ff
Restore patches from nn-beam-parser to spacy/syntax
2017-08-18 22:23:03 +02:00
Matthew Honnibal
ec482580b5
Restore changes to pipeline.pyx from nn-beam-parser branch
2017-08-18 22:02:35 +02:00
Matthew Honnibal
931509d96a
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-08-18 21:57:15 +02:00
Matthew Honnibal
ed95009b5c
Fix data loading on Python 2
2017-08-18 21:57:06 +02:00
Matthew Honnibal
baf36d0588
Add compat function for importlib.util
2017-08-18 21:56:47 +02:00
Matthew Honnibal
263366729e
Don't import BiLSTM
2017-08-18 21:56:31 +02:00
Matthew Honnibal
28162290b3
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-08-18 14:55:40 -05:00
Matthew Honnibal
85794c1167
Restore state of _ml.py
2017-08-18 14:55:23 -05:00
Matthew Honnibal
d456d2efe1
Fix conflicts in nn_parser
2017-08-18 20:55:58 +02:00
Matthew Honnibal
1cec1efca7
Fix merge conflicts in nn_parser from beam stuff
2017-08-18 20:50:49 +02:00
Matthew Honnibal
69bcacdc09
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-08-18 20:47:13 +02:00
Matthew Honnibal
2993b54fff
Load vectors in vocab
2017-08-18 20:46:56 +02:00
Matthew Honnibal
a1ec41298c
Restore CFile loader
2017-08-18 20:46:16 +02:00
Matthew Honnibal
ed4fb991dc
Work on vectors loading
2017-08-18 20:45:48 +02:00
Matthew Honnibal
426f84937f
Resolve conflicts when merging new beam parsing stuff
2017-08-18 13:38:32 -05:00
Matthew Honnibal
5181e8bedb
Fix merge conflict in _ml
2017-08-18 13:35:51 -05:00
Matthew Honnibal
f75420ae79
Unhack beam parsing, moving it under options instead of global flags
2017-08-18 13:31:15 -05:00
Jim Geovedi
7ae45bffcf
Merge remote-tracking branch 'upstream/develop' into indonesian
2017-08-18 10:14:46 +07:00
Dan O'Huiginn
ebf5a3ce59
Allow loading with python < 3.6
Don't rely on recent python features to load models
Fixes Issue #1271
2017-08-17 15:15:47 +00:00
Matthew Honnibal
0209a06b4e
Update beam parser
2017-08-16 18:25:49 -05:00
Matthew Honnibal
4b1e7bd6d8
Improve tensorizer model
2017-08-16 18:25:20 -05:00
Matthew Honnibal
a6d8d7c82e
Add is_gold_parse method to transition system
2017-08-16 18:24:09 -05:00
Matthew Honnibal
3533bb61cb
Add option of 8 feature parse state
2017-08-16 18:23:27 -05:00
Matthew Honnibal
1cb2f15d65
Clean up unused predict_confidences function
2017-08-16 18:22:26 -05:00
Matthew Honnibal
210f6d5175
Fix efficiency error in batch parse
2017-08-15 03:19:03 -05:00
Matthew Honnibal
23537a011d
Tweaks to beam parser
2017-08-15 03:15:28 -05:00
Matthew Honnibal
500e92553d
Fix memory error when copying scores in beam
2017-08-15 03:15:04 -05:00
Matthew Honnibal
a8e4064dd8
Fix tensor gradient in parser
2017-08-15 03:14:36 -05:00
Matthew Honnibal
e420e0366c
Remove use of hash function in beam parser
2017-08-15 03:13:57 -05:00
Matthew Honnibal
6259490347
Fix mixture weights in fine_tune
2017-08-14 17:55:18 -05:00
Matthew Honnibal
335fa8b05c
Fix gradient in fine_tune
2017-08-14 14:55:47 -05:00
Matthew Honnibal
d9f82f6b50
Increment version
2017-08-14 14:55:26 +02:00
ines
a29f132ffd
Change python -m spacy to spacy
Reflects latest change to entry point or auto-alias
2017-08-14 13:04:48 +02:00
ines
65bf80302c
Increment version
2017-08-14 13:04:30 +02:00
Matthew Honnibal
52c180ecf5
Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop"
This reverts commit ea8de11ad5, reversing
changes made to 08e443e083.
2017-08-14 13:00:23 +02:00
Matthew Honnibal
dbbfe595a5
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-08-14 12:09:28 +02:00
Matthew Honnibal
ac6c25f762
Check SGD is not None in update
2017-08-14 12:09:18 +02:00
Matthew Honnibal
0ae045256d
Fix beam training
2017-08-13 18:02:05 -05:00
Matthew Honnibal
6a42cc16ff
Fix beam parser, improve efficiency of non-beam
2017-08-13 12:37:26 +02:00
Matthew Honnibal
4363b4aa4a
Fix redundant tokvecs updates during update
2017-08-13 12:36:55 +02:00
Matthew Honnibal
12de263813
Bug fixes to beam parsing. Learns small sample
2017-08-13 09:33:39 +02:00
Matthew Honnibal
4ae0d5e1e6
Set defaults for convert command
2017-08-13 09:03:38 +02:00
Matthew Honnibal
92ebab6073
Update beam-update tests
2017-08-13 08:56:02 +02:00
Matthew Honnibal
17874fe491
Disable beam parsing
2017-08-12 19:35:40 -05:00
Matthew Honnibal
69f21867b5
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-08-12 19:25:56 -05:00
Matthew Honnibal
3e30712b62
Improve defaults
2017-08-12 19:24:17 -05:00
Matthew Honnibal
28e930aae0
Fixes for beam parsing. Not working
2017-08-12 19:22:52 -05:00
Matthew Honnibal
c96d769836
Fix beam parse. Not sure if working
2017-08-12 18:21:54 -05:00
Matthew Honnibal
24b45b45c6
Add test for beam update
2017-08-12 17:15:28 -05:00
Matthew Honnibal
4638f4b869
Fix beam update
2017-08-12 17:15:16 -05:00
Matthew Honnibal
d4308d2363
Initialize State offset to 0
2017-08-12 17:14:39 -05:00
Matthew Honnibal
b353e4d843
Work on parser beam training
2017-08-12 14:47:45 -05:00
ines
d4f2baf7dd
Add create_meta option to package command
Re-create meta.json in model directory, even if it exists. Especially
useful when updating existing spaCy models or training with Prodigy.
Ensures user won't end up with multiple "en_core_web_sm" models, and
offers easy way to change the model's name and settings without having
to edit the meta.json file.
2017-08-12 21:44:18 +02:00
Matthew Honnibal
4ab0c8c8e9
Try different drop_layer structure in Tok2Vec
2017-08-12 08:56:57 -05:00
Matthew Honnibal
cd5ecedf6a
Try drop_layer in parser
2017-08-12 08:56:33 -05:00
Matthew Honnibal
8870d491f1
Remove redundant pickling during training
2017-08-12 08:55:53 -05:00
Matthew Honnibal
680043ebca
Improve efficiency of tagger.set_annotations for GPU
2017-08-12 08:54:21 -05:00
Matthew Honnibal
ebe0f7f641
Pass embed size correctly in tagger, and cache embeddings for efficiency
2017-08-12 05:45:20 -05:00
Matthew Honnibal
1a59db1c86
Fix dropout and learn rate in parser
2017-08-12 05:44:39 -05:00
Matthew Honnibal
d01dc3704a
Adjust parser model
2017-08-09 20:06:33 -05:00
Matthew Honnibal
f37528ef58
Pass embed size for parser fine-tune. Use SELU
2017-08-09 17:52:53 -05:00