ines
0c2343d73a
Tidy up language data
2017-10-11 02:22:49 +02:00
Matthew Honnibal
73bca3d382
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-10-10 12:51:37 -05:00
Matthew Honnibal
5156074df1
Make loading code more consistent in train command
2017-10-10 12:51:20 -05:00
Matthew Honnibal
d70fba6807
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-10-10 19:33:10 +02:00
Matthew Honnibal
8143618497
Set prefix length back to 1
2017-10-10 19:32:54 +02:00
Matthew Honnibal
97c9b5db8b
Patch spacy.train for new pipeline management
2017-10-09 23:41:16 -05:00
Matthew Honnibal
a635240398
Add conll_ner2json converter
2017-10-09 22:03:26 -05:00
Matthew Honnibal
dce8afb9cf
Set prefix length to 3
2017-10-09 21:55:55 -05:00
Matthew Honnibal
8265b90c83
Update parser defaults
2017-10-09 21:55:20 -05:00
Matthew Honnibal
dd2b0601d1
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-10-09 21:30:46 -05:00
Matthew Honnibal
09d61ada5e
Merge pull request #1396 from explosion/feature/pipeline-management
💫 Improve pipeline and factory management
2017-10-10 04:29:54 +02:00
Matthew Honnibal
19136fd155
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-10-10 03:58:30 +02:00
Matthew Honnibal
8978212ee5
Patch serialization bug raised in #1105
2017-10-10 03:58:12 +02:00
Matthew Honnibal
f0f2739ae3
Add test for serialization issue raised in #1105
2017-10-10 03:57:58 +02:00
Matthew Honnibal
735d18654d
Add NER converter for CoNLL 2003 data
2017-10-09 20:06:28 -05:00
Matthew Honnibal
808d8740d6
Remove print statement
2017-10-09 08:45:20 -05:00
Matthew Honnibal
0f41b25f60
Add speed benchmarks to metadata
2017-10-09 08:05:37 -05:00
Matthew Honnibal
d8a2506023
Merge pull request #1401 from explosion/feature/add-parser-action
💫 Allow labels to be added to pre-trained parser and NER models
2017-10-09 04:57:51 +02:00
Matthew Honnibal
689349e32f
Merge pull request #1400 from explosion/feature/sentence-parsing
💫 Force parser to respect preset sentence boundaries
2017-10-09 04:31:43 +02:00
Matthew Honnibal
e79fc41ff8
Merge pull request #1391 from explosion/feature/multilabel-textcat
💫 Fix multi-label support for text classification
2017-10-09 04:22:31 +02:00
Matthew Honnibal
fad2b8315f
Merge branch 'develop' into feature/add-parser-action
2017-10-09 04:13:04 +02:00
Matthew Honnibal
6c79841c0d
Fix tests for history features
2017-10-09 04:12:24 +02:00
Matthew Honnibal
dde87e6b0d
Add tests for adding parser actions
2017-10-09 03:42:35 +02:00
Matthew Honnibal
b2b8506f2c
Remove whitespace
2017-10-09 03:35:57 +02:00
Matthew Honnibal
d43a83e37a
Allow parser.add_label for pretrained models
2017-10-09 03:35:40 +02:00
Matthew Honnibal
81a64119db
Fix string-to-unicode problem
2017-10-09 00:59:49 +02:00
Matthew Honnibal
02c2af7119
Fix test
2017-10-09 00:29:37 +02:00
Matthew Honnibal
4cc84b0234
Prohibit Break when sent_start < 0
2017-10-09 00:02:45 +02:00
Matthew Honnibal
5a67efeccc
Add tests for sentence segmentation presetting
2017-10-09 00:02:23 +02:00
Matthew Honnibal
e938bce320
Adjust parsing transition system to allow preset sentence segments.
2017-10-08 23:53:34 +02:00
Matthew Honnibal
080afd4924
Add ternary value setting to Token.sent_start
2017-10-08 23:51:58 +02:00
Matthew Honnibal
7ae67ec6a1
Add Span.as_doc method
2017-10-08 23:50:20 +02:00
Matthew Honnibal
20309fb9db
Make history features default to zero
2017-10-08 20:32:14 +02:00
Matthew Honnibal
e74c8d2fad
Merge remote-tracking branch 'origin/develop' into feature/sentence-parsing
2017-10-08 20:20:41 +02:00
Matthew Honnibal
18063803de
Make TokenC.sent_start an int, to allow a ternary value
2017-10-08 19:58:54 +02:00
Matthew Honnibal
be4f0b6460
Update defaults
2017-10-08 02:08:12 -05:00
Matthew Honnibal
42b401d08b
Change default hidden depth to 1
2017-10-07 21:05:21 -05:00
Matthew Honnibal
9d66a915da
Update training defaults
2017-10-07 21:02:38 -05:00
Matthew Honnibal
d163115e91
Add non-linearity after history features
2017-10-07 21:00:43 -05:00
Matthew Honnibal
92c5d78b42
Unhack NER.add_action
2017-10-07 19:02:40 +02:00
Matthew Honnibal
f2b590f672
Increment version
2017-10-07 19:01:01 +02:00
Matthew Honnibal
eb0595bea9
Merge pull request #1392 from explosion/feature/parser-history-model
💫 Parser history features
2017-10-07 15:07:02 +02:00
Matthew Honnibal
3d22ccf495
Update default hyper-parameters
2017-10-07 07:16:41 -05:00
Matthew Honnibal
09442d25ec
Merge remote-tracking branch 'origin/develop' into feature/parser-history-model
2017-10-07 07:05:04 -05:00
Matthew Honnibal
3b67eabfea
Allow empty dictionaries to match any token in Matcher

Patterns often need to match "any token". A clean way to denote this
is with the empty dict {}: it sets no constraints on the token,
so it should always match.

The problem was that an attribute list of length 0 was used as an
end-of-array signal, so the matcher didn't handle this case correctly.

This patch compiles empty token spec dicts into a constraint
NULL_ATTR==0. The NULL_ATTR attribute, 0, is always set to 0 on the
lexeme, so this constraint always matches.
2017-10-07 03:36:15 +02:00
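The sentinel problem described in the commit above can be sketched in plain Python. This is only an illustration under stated assumptions, not spaCy's actual Cython implementation: the helper names (compile_pattern, token_matches, match_at), the ATTR_IDS table, and the dict-based token representation are all hypothetical. Only NULL_ATTR, its ID of 0, and the "always 0 on the lexeme" convention come from the commit message.

```python
# Hypothetical sketch; names other than NULL_ATTR are invented for illustration.
NULL_ATTR = 0              # attribute ID 0, whose value is always 0 on a lexeme
ATTR_IDS = {"LOWER": 1}    # assumed attribute-name-to-ID table


def compile_pattern(token_specs):
    """Turn a list of {attr_name: value} dicts into lists of (attr_id, value)."""
    compiled = []
    for spec in token_specs:
        constraints = [(ATTR_IDS[name], value) for name, value in spec.items()]
        if not constraints:
            # An empty constraint list would be mistaken for the end-of-pattern
            # sentinel, so compile {} into the always-true check NULL_ATTR == 0.
            constraints = [(NULL_ATTR, 0)]
        compiled.append(constraints)
    compiled.append([])    # sentinel: zero constraints marks the end of the pattern
    return compiled


def token_matches(token, constraints):
    """A token is a dict of attr_id -> value; unset attributes read as 0."""
    return all(token.get(attr_id, 0) == value for attr_id, value in constraints)


def match_at(tokens, start, compiled):
    """Return the end index of a match starting at `start`, or None."""
    i = start
    for constraints in compiled:
        if not constraints:            # reached the sentinel: pattern matched
            return i
        if i >= len(tokens) or not token_matches(tokens[i], constraints):
            return None
        i += 1
    return None


def make_token(text):
    return {NULL_ATTR: 0, ATTR_IDS["LOWER"]: text.lower()}


doc = [make_token(w) for w in ["Hello", "big", "world"]]
pattern = compile_pattern([{"LOWER": "hello"}, {}, {"LOWER": "world"}])
print(match_at(doc, 0, pattern))
```

Without the substitution in compile_pattern, the empty dict in the middle of the pattern would compile to an empty list, and match_at would treat it as the sentinel and report a match after the first token.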
ines
0adadcb3f0
Fix beam parse model test
2017-10-07 02:15:15 +02:00
ines
b38a8f4a94
Fix and update pipe methods tests
2017-10-07 02:06:23 +02:00
Matthew Honnibal
0384f08218
Trigger nonproj.deprojectivize as a postprocess
2017-10-07 02:00:47 +02:00
Matthew Honnibal
3a65a0c970
Start adding tests for new pipeline management
2017-10-07 01:48:23 +02:00
ines
e43530269c
Update docstrings
2017-10-07 01:04:50 +02:00