Matthew Honnibal
3376d4d6e8
Update the train script, fixing GPU memory leak
2017-05-19 18:15:50 -05:00
Matthew Honnibal
08766240c3
Add incomplete iob converter
2017-05-19 13:27:51 -05:00
Matthew Honnibal
09a877886b
WIP on iob converter
2017-05-19 13:24:39 -05:00
Matthew Honnibal
ca70b08661
Fix GPU training and evaluation
2017-05-18 08:30:33 -05:00
Matthew Honnibal
fc8d3a112c
Add util.env_opt support: Can set hyper params through environment variables.
2017-05-18 04:36:53 -05:00
Matthew Honnibal
55dab77de8
Add conversion rule for .conll
2017-05-17 13:13:48 +02:00
Matthew Honnibal
793430aa7a
Get spaCy train command working with neural network
* Integrate models into pipeline
* Add basic serialization (maybe incorrect)
* Fix pickle on vocab
2017-05-17 12:04:50 +02:00
Matthew Honnibal
3bf4a28d8d
Use tag in CoNLL converter, not POS
2017-05-17 12:04:33 +02:00
Matthew Honnibal
8cf097ca88
Redesign training to integrate NN components
* Obsolete .parser, .entity etc names in favour of .pipeline
* Components no longer create models on initialization
* Models created by loading method (from_disk(), from_bytes() etc), or .begin_training()
* Add .predict(), .set_annotations() methods in components
* Pass state through pipeline, to allow components to share information more flexibly.
2017-05-16 16:17:30 +02:00
Matthew Honnibal
5211645af3
Get data flowing through pipeline. Needs redesign
2017-05-16 11:21:59 +02:00
Matthew Honnibal
a9edb3aa1d
Improve integration of NN parser, to support unified training API
2017-05-15 21:53:27 +02:00
ines
9d85cda8e4
Fix models error message and use about.__docs_models__ (see #1051)
2017-05-13 13:05:47 +02:00
ines
4eefb288e3
Port over PR #1055
2017-05-13 03:25:32 +02:00
ines
95edd9e896
Let parse_package_meta take full path
2017-05-08 15:30:48 +02:00
ines
59c3b9d4dd
Tidy up CLI and fix print functions
2017-05-07 23:25:29 +02:00
ines
527d51ac9a
Fetch shortcuts from GitHub and improve error handling
2017-04-26 18:00:28 +02:00
Matthew Honnibal
4f9657b42b
Fix reporting if no dev data with train
2017-04-23 22:27:10 +02:00
ines
3a9710f356
Pass dev_scores to print_progress correctly (resolves #1008)
Only read scores attribute if command is used with dev_data, otherwise
default dev_scores to empty dict.
2017-04-23 15:58:40 +02:00
ines
25c70b4cc5
Move fix_text to spacy.compat (see #1002)
2017-04-20 15:47:17 +02:00
Gyorgy Orosz
4a06a2572c
Using ftfy for handling broken encoded strings.
2017-04-20 13:34:51 +02:00
ines
48da244058
Use spacy.compat.json_dumps for Python 2/3 compatibility (resolves #991)
2017-04-19 11:50:36 +02:00
ines
82f5f1f98f
Replace str with compat.unicode_
2017-04-17 01:29:54 +02:00
Matthew Honnibal
17c9fffb9e
Fix naked except
2017-04-16 15:28:16 -05:00
ines
6145b7c153
Remove redundant Path
2017-04-16 20:53:25 +02:00
Matthew Honnibal
89a4f262fc
Fix training methods
2017-04-16 13:00:37 -05:00
ines
8191e33cf1
Update link error message with info on permissions
2017-04-16 13:32:31 +02:00
ines
a3ddbc0444
Add note about --force flag to error message
2017-04-16 13:14:36 +02:00
ines
e3de035814
Add meta validation to check for required settings
Complain if no "lang", "name" or "version" is found (those settings are
used in directory / package names). Package will still build without,
but it'll inevitably fail somewhere down the line.
2017-04-16 13:13:17 +02:00
ines
a7574b7572
Add more options to read in meta data in package command
Add meta option to supply path to meta.json. If no meta path is set,
check if meta.json exists in input directory and use it. Otherwise,
prompt for details on the command line.
2017-04-16 13:06:02 +02:00
ines
13c8a42d2b
Fix typos
2017-04-16 13:03:58 +02:00
ines
35fb4febe2
Fix whitespace
2017-04-15 12:13:45 +02:00
ines
c05ec4b89a
Add compat functions and remove old workarounds
Add ensure_path util function to handle checking instance of path
2017-04-15 12:11:16 +02:00
ines
d24589aa72
Clean up imports, unused code, whitespace, docstrings
2017-04-15 12:05:47 +02:00
ines
561f2a3eb4
Use consistent formatting for docstrings
2017-04-15 11:59:21 +02:00
ines
84341c2975
Only compile list of models if data_path exists
2017-04-14 16:48:02 +02:00
Gyorgy Orosz
dd3244c08a
Made json dump to produce unicode strings in py2
2017-04-13 23:30:47 +02:00
Gyorgy Orosz
a9469c8173
Fixed typo
2017-04-13 15:24:14 +02:00
ines
41037f0f07
Remove unused imports
2017-04-13 13:52:11 +02:00
ines
1b92c8d5d5
Use unicode paths on Windows/Python 2 and catch other errors (resolves #970)
try/except here is quite dirty, but it'll at least make sure users see
an error message that explains what's going on
2017-04-10 17:49:51 +02:00
ines
7ea1673072
Fix whitespace
2017-04-07 13:28:48 +02:00
ines
255650dbc2
Add connlu2json converter from explosion/spacy-dev-resources/#11
2017-04-07 13:05:12 +02:00
ines
789ce8a45e
Add convert command
2017-04-07 13:04:17 +02:00
ines
9952d3b08a
Fix whitespace
2017-04-07 13:02:05 +02:00
ines
dcf8ab0c47
Merge branch 'develop'
2017-04-07 12:00:09 +02:00
Joshua Reeter
564daf6dec
Issue #934 symlink should not convert paths as_posix under windows.
2017-03-30 23:47:45 -05:00
ines
4759fd437d
Merge branch 'master' into develop
2017-03-29 10:37:13 +02:00
Grégory Howard
9c2996b27f
correction of package.py (encoding on open instead of write)
2017-03-29 09:11:02 +02:00
ines
7198cf1c8a
Remove unused import
2017-03-26 20:56:05 +02:00
ines
7ceaa1614b
Add experimental model init command
2017-03-26 20:51:40 +02:00
Matthew Honnibal
2efdbc08ff
Make training work with directories
2017-03-26 08:46:44 -05:00