Commit Graph

8216 Commits

Author SHA1 Message Date
ines
edf7e4881d Add meta.json option to cli.train and add relevant properties
Add accuracy scores to meta.json instead of accuracy.json and replace
all relevant properties like lang, pipeline, spacy_version in existing
meta.json. If not present, also add name and version placeholders to
make it packagable.
2017-09-25 19:00:47 +02:00
ines
d2d35b63b7 Fix formatting 2017-09-25 18:37:13 +02:00
Matthew Honnibal
8eb0b7b779 Add docstrings for Pipe API 2017-09-25 16:22:07 +02:00
Matthew Honnibal
39f390dba7 Add docstrings for Pipe API 2017-09-25 16:20:49 +02:00
Matthew Honnibal
2f8d535f65 Merge pull request #1351 from hscspring/patch-4
Update punctuation.py
2017-09-24 12:16:39 +02:00
Matthew Honnibal
8716ffe57d Serialize vocab last 2017-09-24 05:01:45 -05:00
Matthew Honnibal
72bbcc0871 Handle lemmatization for unknown string IDs 2017-09-24 05:01:31 -05:00
Matthew Honnibal
204b58c864 Fix evaluation during training 2017-09-24 05:01:03 -05:00
Matthew Honnibal
dc3a623d00 Remove unused update_shared argument 2017-09-24 05:00:37 -05:00
Matthew Honnibal
63bd87508d Don't use iterated convolutions 2017-09-23 04:39:17 -05:00
Matthew Honnibal
5a7fd0fd36 Fix vector linkage 2017-09-22 20:11:52 -05:00
Matthew Honnibal
4348c479fc Merge pre-trained vectors and noshare patches 2017-09-22 20:07:28 -05:00
Matthew Honnibal
7dc61b3f43 Whitespace 2017-09-22 20:00:50 -05:00
Matthew Honnibal
e93d43a43a Fix training with preset vectors 2017-09-22 20:00:40 -05:00
Matthew Honnibal
0795857dcb Fix beam parsing 2017-09-23 02:59:53 +02:00
Matthew Honnibal
4bd6a12b1f Fix Tok2Vec 2017-09-23 02:58:54 +02:00
Matthew Honnibal
386c1a5bd8 Fix tagger training 2017-09-23 02:58:06 +02:00
Matthew Honnibal
a2357cce3f Set random seed in train script 2017-09-23 02:57:31 +02:00
Matthew Honnibal
05596159bf Fix serialization when pre-trained vectors 2017-09-22 15:33:27 -05:00
Matthew Honnibal
980fb6e854 Refactor Tok2Vec 2017-09-22 09:38:36 -05:00
Matthew Honnibal
d9124f1aa3 Add link_vectors_to_models function 2017-09-22 09:38:22 -05:00
Matthew Honnibal
a186596307 Add 'reapply' combinator, for iterated CNN 2017-09-22 09:37:03 -05:00
Matthew Honnibal
9177313063 Merge pull request #1352 from hscspring/patch-5
Update customizing-tokenizer.jade
2017-09-22 16:11:49 +02:00
Matthew Honnibal
1dbc2285b8 Merge pull request #1350 from hscspring/patch-3
Update word-vectors-similarities.jade
2017-09-22 16:11:05 +02:00
Yam
54855f0eee Update customizing-tokenizer.jade 2017-09-22 12:15:48 +08:00
Yam
6f450306c3 Update customizing-tokenizer.jade
update some codes:    
- `me` -> `-PRON`
- `TAG` -> `POS`
- `create_tokenizer` function
2017-09-22 10:53:22 +08:00
Yam
923c4c2fb2 Update punctuation.py
add `……`
2017-09-22 09:50:46 +08:00
Yam
425c09488d Update word-vectors-similarities.jade
add
```    
import spacy
nlp = spacy.load('en') ```
2017-09-22 08:56:34 +08:00
Matthew Honnibal
40a4873b70 Fix serialization of model options 2017-09-21 13:07:26 -05:00
Matthew Honnibal
0a9016cade Fix serialization during training 2017-09-21 13:06:45 -05:00
Matthew Honnibal
20193371f5 Don't share CNN, to reduce complexities 2017-09-21 14:59:48 +02:00
Wannaphong Phatthiyaphaibun
1abf472068 add th test 2017-09-21 12:56:58 +07:00
Matthew Honnibal
1d73dec8b1 Refactor train script 2017-09-20 19:17:10 -05:00
Matthew Honnibal
ffda38356a Add util function to enable GPU 2017-09-20 19:16:35 -05:00
Matthew Honnibal
24e85c2048 Pass values for CNN maxout pieces option 2017-09-20 19:16:12 -05:00
Matthew Honnibal
b832f89ff8 Add resume_training function 2017-09-20 19:15:20 -05:00
Matthew Honnibal
f5144f04be Add argument for CNN maxout pieces 2017-09-20 19:14:41 -05:00
Matthew Honnibal
ea2732469b Merge pull request #1340 from hscspring/patch-1
Update punctuation.py
2017-09-20 23:57:00 +02:00
Matthew Honnibal
842e21de9f Fix int type error for Python 2 2017-09-20 23:55:30 +02:00
Matthew Honnibal
f92ab03dc8 Rename phrase matcher example 2017-09-20 22:51:58 +02:00
Matthew Honnibal
01858e9b59 Fix PhraseMatcher example 2017-09-20 22:51:41 +02:00
Matthew Honnibal
0c93c73e49 Add __reduce__ method for PhraseMatcher 2017-09-20 22:26:40 +02:00
Matthew Honnibal
cc408fc189 Make PhraseMatcher API like Matcher API 2017-09-20 22:20:35 +02:00
Matthew Honnibal
43ad250dd5 Update matcher tests 2017-09-20 21:54:49 +02:00
Matthew Honnibal
828cc91545 Fix PhraseMatcher for spaCy 2 2017-09-20 21:54:31 +02:00
Wannaphong Phatthiyaphaibun
39bb5690f0 update th 2017-09-21 00:36:02 +07:00
Wannaphong Phatthiyaphaibun
44291f6697 add thai 2017-09-20 23:26:34 +07:00
Yam
978b24ccd4 Update punctuation.py
In Chinese, `~` and `——` is hyphens,   
`·` is intermittent symbol
2017-09-20 23:02:22 +08:00
Matthew Honnibal
78301b2d29 Avoid comparison to None in Tok2Vec 2017-09-20 00:19:34 +02:00
Matthew Honnibal
b36a38f63d Fix serialization of pretrained_dims property 2017-09-19 23:42:27 +02:00