Matthew Honnibal
774f5732bd
Fix dimensionality of textcat when no vectors available
2017-10-04 14:55:15 +02:00
Ines Montani
28ba0b9b51
Merge pull request #1385 from explosion/feature/new-website
💫 New spaCy website
2017-10-04 14:35:52 +02:00
Matthew Honnibal
af75b74208
Unset LayerNorm backwards compat hack
2017-10-03 20:47:10 -05:00
ines
73ac0aa0b5
Update spacy evaluate and add displaCy option
2017-10-04 00:03:15 +02:00
Matthew Honnibal
246612cb53
Merge remote-tracking branch 'origin/develop' into feature/parser-history-model
2017-10-03 16:56:42 -05:00
Matthew Honnibal
f24c2e3a8a
Fix evaluate for non-GPU
2017-10-03 22:47:31 +02:00
Matthew Honnibal
5cbefcba17
Set backwards compatibility flag
2017-10-03 20:29:58 +02:00
Matthew Honnibal
5454b20cd7
Update thinc imports for 6.9
2017-10-03 20:07:17 +02:00
Matthew Honnibal
4a59f6358c
Fix thinc imports
2017-10-03 19:21:26 +02:00
Matthew Honnibal
e514d6aa0a
Import thinc modules more explicitly, to avoid cycles
2017-10-03 18:49:25 +02:00
Matthew Honnibal
338e1fda0e
Unbreak merge artefact
2017-10-03 09:41:05 -05:00
Matthew Honnibal
1289187279
Fix circular import
2017-10-03 09:33:21 -05:00
Matthew Honnibal
a44c4c3a5b
Add timer to evaluate
2017-10-03 09:15:35 -05:00
Matthew Honnibal
96da86b3e5
Add support for verbose flag to Language
2017-10-03 09:14:57 -05:00
Matthew Honnibal
02586a5243
Add timing to spacy evaluate command
2017-10-03 09:14:34 -05:00
ines
e49cd7aeaf
Move import into load to avoid circular imports
2017-10-03 15:22:19 +02:00
ines
b0dfa059db
Update docs link in about.py
2017-10-03 15:19:55 +02:00
Matthew Honnibal
dc3c791947
Fix history size option
2017-10-03 13:41:23 +02:00
Matthew Honnibal
278a4c17c6
Fix history features
2017-10-03 13:27:10 +02:00
Matthew Honnibal
b770f4e108
Fix embed class in history features
2017-10-03 13:26:55 +02:00
Matthew Honnibal
b50a359e11
Add support for history features in parsing models
2017-10-03 12:44:01 +02:00
Matthew Honnibal
ee41e4fea7
Support history features in stateclass
2017-10-03 12:43:48 +02:00
Matthew Honnibal
6aa6a5bc25
Add a layer type for history features
2017-10-03 12:43:09 +02:00
Matthew Honnibal
8902df44de
Fix component disabling during training
2017-10-02 21:07:23 +02:00
Matthew Honnibal
c617d288d8
Update pipeline component names in spaCy train
2017-10-02 17:20:19 +02:00
Matthew Honnibal
f942903429
Improve sentence merging in iob2json
2017-10-02 17:02:10 +02:00
Matthew Honnibal
31681d20e0
Fix concatenation in iob2json converter
2017-10-02 16:50:26 +02:00
Matthew Honnibal
4896ce3320
Remove misleading comment
2017-10-02 00:09:14 +02:00
Matthew Honnibal
d90cc917fa
Merge vectors.pyx doc strings
2017-10-01 17:05:54 -05:00
Matthew Honnibal
b2a8b9be77
Fix inconsistency of Vectors class API
2017-10-01 17:00:34 -05:00
Matthew Honnibal
e38089d598
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-10-01 22:10:54 +02:00
Matthew Honnibal
97c409b602
Add docstrings for spacy.vectors
2017-10-01 22:10:33 +02:00
ines
b776f48e58
Fix typo
2017-10-01 21:58:45 +02:00
Matthew Honnibal
94df115a81
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-10-01 14:06:23 -05:00
Matthew Honnibal
2cf0f4622f
Fix loading of models with pre-trained vectors
2017-10-01 14:05:32 -05:00
Matthew Honnibal
69c7c642c2
Add spacy evaluate
2017-10-01 14:05:04 -05:00
ines
8dbe49ecb8
Always compare lowercase package names
Otherwise, is_package will return False if model name contains
uppercase characters. See this issue:
https://support.prodi.gy/t/saving-a-trained-ner-model-as-a-loadable-module/46/6
2017-09-29 20:55:17 +02:00
ines
153c2589d4
Revert "Always compare lowercase package names"
This reverts commit 7d77dc490f.
2017-09-29 20:53:36 +02:00
ines
fd1a9225d8
Handle conversion of pipeline components correctly
Allow both comma and comma + whitespace as separators
2017-09-29 20:52:56 +02:00
ines
7d77dc490f
Always compare lowercase package names
Otherwise, is_package will return False if model name contains
uppercase characters. See this issue:
https://support.prodi.gy/t/saving-a-trained-ner-model-as-a-loadable-module/46/6
2017-09-29 20:52:28 +02:00
Matthew Honnibal
cdb2d83e16
Pass dropout in parser
2017-09-28 18:47:13 -05:00
Matthew Honnibal
158e177cae
Fix default embed size
2017-09-28 08:25:23 -05:00
Matthew Honnibal
f6330d69e6
Default embed size to 7000
2017-09-28 08:07:41 -05:00
Matthew Honnibal
ac8481a7b0
Print NER loss
2017-09-28 08:05:31 -05:00
Matthew Honnibal
542ebfa498
Improve defaults
2017-09-27 18:54:37 -05:00
Matthew Honnibal
dcb86bdc43
Default batch size to 32
2017-09-27 11:48:19 -05:00
Matthew Honnibal
1a37a2c0a0
Update training defaults
2017-09-27 11:48:07 -05:00
Matthew Honnibal
13d7a97f3a
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-09-27 11:44:37 -05:00
Matthew Honnibal
66c388ee01
Remove unhelpful multitask objectives
2017-09-27 11:44:16 -05:00
Matthew Honnibal
983201a83a
Fix hard-coded vector width
2017-09-27 11:43:58 -05:00
Ines Montani
959c46eabe
Merge pull request #1365 from wannaphongcom/develop
Add Thai language for spaCy v2
2017-09-26 23:43:05 +02:00
Matthew Honnibal
1ef4236f8e
Merge pull request #1343 from explosion/feature/phrasematcher
Update PhraseMatcher for spaCy 2
2017-09-26 20:44:23 +02:00
Wannaphong Phatthiyaphaibun
7b5263ffa4
fix thai test
2017-09-26 23:54:15 +07:00
ines
1ff62eaee7
Fix option shortcut to avoid conflict
2017-09-26 17:59:34 +02:00
Wannaphong Phatthiyaphaibun
3d5046c499
fix import in th
2017-09-26 22:41:20 +07:00
ines
7fdfb78141
Add version option to cli.train
2017-09-26 17:34:52 +02:00
Wannaphong Phatthiyaphaibun
a63f790b8c
fix thai tag_map
2017-09-26 22:28:57 +07:00
Wannaphong Phatthiyaphaibun
2ea27d07f4
fix tokenizer_exceptions in thai
2017-09-26 22:14:47 +07:00
Matthew Honnibal
41cc5c4c17
Merge branch 'develop' into feature/phrasematcher
2017-09-26 09:59:17 -05:00
Matthew Honnibal
c2e2f81773
Merge pull request #1355 from explosion/feature/noshare
Make pipeline components independent
2017-09-26 16:58:09 +02:00
Wannaphong Phatthiyaphaibun
a2bf4cc7bf
fix newline in file
2017-09-26 21:49:43 +07:00
ines
bb5c631402
Implement like_num getter for French (via #1161)
2017-09-26 16:47:45 +02:00
ines
15479b3bae
Add comment to like_num re: future work
2017-09-26 16:43:28 +02:00
ines
adda08fe14
Implement like_num getter for Dutch (via #1177)
2017-09-26 16:39:15 +02:00
ines
5ee10379db
Port over changes from #1340
2017-09-26 16:38:08 +02:00
Wannaphong Phatthiyaphaibun
5cba67146c
add thai in spacy2
2017-09-26 21:36:27 +07:00
ines
10d291f129
Port over change from #1351
2017-09-26 16:11:41 +02:00
Matthew Honnibal
3274b46a0d
Try to fix compile error on Windows
2017-09-26 09:05:53 -05:00
Matthew Honnibal
19c7c09bf7
Fix PhraseMatcher.__contains__
2017-09-26 08:35:53 -05:00
Matthew Honnibal
d02a41a8c9
Merge remote-tracking branch 'origin/develop' into feature/phrasematcher
2017-09-26 08:32:55 -05:00
Matthew Honnibal
698fc0d016
Remove merge artefact
2017-09-26 08:31:37 -05:00
Matthew Honnibal
defb68e94f
Update feature/noshare with recent develop changes
2017-09-26 08:15:14 -05:00
Matthew Honnibal
ca28590ddd
Use dep and ent multi-task objectives for parser
2017-09-26 08:13:52 -05:00
Matthew Honnibal
9bfd585a11
Fix parameter name in .pxd file
2017-09-26 07:28:50 -05:00
Matthew Honnibal
74f08e1ad5
Update test
2017-09-26 06:45:56 -05:00
Matthew Honnibal
5aaef3e7b8
Don't link vectors in vocab deserialize
2017-09-26 06:45:47 -05:00
Matthew Honnibal
18a27c7579
Fix typo in tensorizer serialization
2017-09-26 06:45:14 -05:00
Matthew Honnibal
5056743ad5
Fix parser serialization
2017-09-26 06:44:56 -05:00
Ines Montani
7123139b2b
Add __contains__ to PhraseMatcher
2017-09-26 13:13:27 +02:00
Ines Montani
50ad50f96a
Update matcher.pyx
2017-09-26 13:11:17 +02:00
Matthew Honnibal
e34e70673f
Allow tagger models to be built with pre-defined tok2vec layer
2017-09-26 05:51:52 -05:00
Matthew Honnibal
bf917225ab
Allow multi-task objectives during training
2017-09-26 05:42:52 -05:00
Matthew Honnibal
4ae9ea7684
Remove unused argument in Language
2017-09-26 05:41:35 -05:00
ines
edf7e4881d
Add meta.json option to cli.train and add relevant properties
Add accuracy scores to meta.json instead of accuracy.json and replace
all relevant properties like lang, pipeline, spacy_version in existing
meta.json. If not present, also add name and version placeholders to
make it packagable.
2017-09-25 19:00:47 +02:00
ines
d2d35b63b7
Fix formatting
2017-09-25 18:37:13 +02:00
Matthew Honnibal
8eb0b7b779
Add docstrings for Pipe API
2017-09-25 16:22:07 +02:00
Matthew Honnibal
39f390dba7
Add docstrings for Pipe API
2017-09-25 16:20:49 +02:00
Matthew Honnibal
8716ffe57d
Serialize vocab last
2017-09-24 05:01:45 -05:00
Matthew Honnibal
72bbcc0871
Handle lemmatization for unknown string IDs
2017-09-24 05:01:31 -05:00
Matthew Honnibal
204b58c864
Fix evaluation during training
2017-09-24 05:01:03 -05:00
Matthew Honnibal
dc3a623d00
Remove unused update_shared argument
2017-09-24 05:00:37 -05:00
Matthew Honnibal
63bd87508d
Don't use iterated convolutions
2017-09-23 04:39:17 -05:00
Matthew Honnibal
5a7fd0fd36
Fix vector linkage
2017-09-22 20:11:52 -05:00
Matthew Honnibal
4348c479fc
Merge pre-trained vectors and noshare patches
2017-09-22 20:07:28 -05:00
Matthew Honnibal
7dc61b3f43
Whitespace
2017-09-22 20:00:50 -05:00
Matthew Honnibal
e93d43a43a
Fix training with preset vectors
2017-09-22 20:00:40 -05:00
Matthew Honnibal
0795857dcb
Fix beam parsing
2017-09-23 02:59:53 +02:00
Matthew Honnibal
4bd6a12b1f
Fix Tok2Vec
2017-09-23 02:58:54 +02:00
Matthew Honnibal
386c1a5bd8
Fix tagger training
2017-09-23 02:58:06 +02:00
Matthew Honnibal
a2357cce3f
Set random seed in train script
2017-09-23 02:57:31 +02:00