Commit Graph

200 Commits

Author SHA1 Message Date
Kirill Bulygin
7b064542f7 Making lang/th/test_tokenizer.py pass by creating ThaiTokenizer (#3078) 2019-01-10 15:40:37 +01:00
Matthew Honnibal
6430b1fe64 Restore encoding arg on msgpack-numpy 2018-09-27 15:58:21 +02:00
Matthew Honnibal
8809dc4514 Remove deprecated encoding argument to msgpack 2018-09-27 12:56:23 +02:00
Matthew Honnibal
e0caf3ae8c Fix msgpack for new version 2018-07-20 17:32:00 +02:00
ansgar-t
9732988951 escape html in displacy.render (#2378) (closes #2361)
## Description
Fix for issue #2361 :
replace &, <, >, " with &amp;, &lt;, &gt;, &quot; before rendering svg

## Checklist
- [x] I have submitted the spaCy Contributor Agreement.
- [ ] I ran the tests, and all new and existing tests passed.
(As discussed in the comments to #2361)
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
2018-05-28 18:36:41 +02:00
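The replacement described in commit 9732988951 can be sketched roughly as follows. This is a minimal illustration and not the actual displaCy code; `escape_html` is a hypothetical name. The ampersand is handled first so that the entities produced by the later replacements are not escaped a second time.

```python
def escape_html(text):
    """Replace &, <, > and " with their HTML entities.

    Hypothetical helper for illustration; not the displaCy implementation.
    """
    # Escape "&" first, otherwise the "&" in "&lt;" etc. would be escaped twice.
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    text = text.replace('"', "&quot;")
    return text


print(escape_html('<b>"Tom & Jerry"</b>'))
# prints: &lt;b&gt;&quot;Tom &amp; Jerry&quot;&lt;/b&gt;
```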
ines
5768df4f09 Add SimpleFrozenDict util to use as default function argument 2018-05-20 15:13:37 +02:00
Ines Montani
3141e04822 💫 New system for error messages and warnings (#2163)
* Add spacy.errors module

* Update deprecation and user warnings

* Replace errors and asserts with new error message system

* Remove redundant asserts

* Fix whitespace

* Add messages for print/util.prints statements

* Fix typo

* Fix typos

* Move CLI messages to spacy.cli._messages

* Add decorator to display error code with message

An implementation like this is nice because it only modifies the string when it's retrieved from the containing class – so we don't have to worry about manipulating tracebacks etc.

* Remove unused link in spacy.about

* Update errors for invalid pipeline components

* Improve error for unknown factories

* Add displaCy warnings

* Update formatting consistency

* Move error message to spacy.errors

* Update errors and check if doc returned by component is None
2018-04-03 15:50:31 +02:00
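The decorator mentioned in commit 3141e04822 ("display error code with message") might look something like the sketch below. The names `add_codes`, `Errors` and `E001` and the message text are assumptions made for illustration. The point it tries to show is the one the commit message makes: the code is prepended only when the string is retrieved, so the stored messages stay untouched.

```python
def add_codes(err_cls):
    """Class decorator: prefix each message with its attribute name as a code."""

    class ErrorsWithCodes(object):
        def __getattribute__(self, code):
            # Look up the raw message on the decorated class and format it
            # on retrieval; the stored string itself is never modified.
            msg = getattr(err_cls, code)
            return "[{code}] {msg}".format(code=code, msg=msg)

    return ErrorsWithCodes()


@add_codes
class Errors(object):
    # Hypothetical message, for illustration only.
    E001 = "No component '{name}' found in pipeline."


print(Errors.E001.format(name="parser"))
# prints: [E001] No component 'parser' found in pipeline.
```

Because the prefix is added at attribute access, callers can keep writing `raise ValueError(Errors.E001.format(...))` and no traceback handling is involved.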
Matthew Honnibal
8308bbc617 Get msgpack and msgpack_numpy via Thinc, to avoid potential version conflicts 2018-03-29 00:14:55 +02:00
Johannes Dollinger
012e874d09 Add contributor agreement for emulbreh 2018-02-13 13:40:33 +01:00
Johannes Dollinger
bf94c13382 Don't fix random seeds on import 2018-02-13 12:42:23 +01:00
ines
35653bef3a Add missing import (fixes #1546) 2017-11-10 19:05:18 +01:00
Matthew Honnibal
726f689da4 Fix missing import 2017-11-07 13:20:12 +01:00
ines
8fb48b9b91 Update and document new util functions 2017-11-07 00:22:43 +01:00
Matthew Honnibal
1cab703bba Move minibatch function to util 2017-11-06 23:45:36 +01:00
ines
39e0586192 Add deprecated helper
Uses warning to show DeprecationWarning and custom stack trace
2017-11-01 16:32:36 +01:00
Matthew Honnibal
a7bf38bf31 Remove misleading comment on util.get_cuda_stream() 2017-11-01 13:57:25 +01:00
ines
ea4a41c8fb Tidy up util and helpers 2017-10-27 14:39:09 +02:00
Matthew Honnibal
9baa8fe7ec Convert closure to functools.partial, to promote pickling 2017-10-17 18:20:52 +02:00
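Why commit 9baa8fe7ec swaps a closure for `functools.partial` can be illustrated with a small sketch. The function names here are hypothetical and unrelated to the actual spaCy code; the example only shows that the standard `pickle` module rejects a locally defined function but accepts a partial of an importable one.

```python
import functools
import operator
import pickle


def make_scaler_closure(factor):
    # The inner function is a local object, so pickle cannot look it up
    # by name and refuses to serialize it.
    def scale(value):
        return value * factor
    return scale


def make_scaler_partial(factor):
    # A partial of an importable function pickles fine: pickle stores a
    # reference to operator.mul plus the bound argument.
    return functools.partial(operator.mul, factor)


def is_picklable(obj):
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


print(is_picklable(make_scaler_closure(3)))  # prints: False
print(is_picklable(make_scaler_partial(3)))  # prints: True
restored = pickle.loads(pickle.dumps(make_scaler_partial(3)))
print(restored(5))  # prints: 15
```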
Matthew Honnibal
df488274b1 Fix deserialization of vectors 2017-10-16 20:55:00 +02:00
ines
d5418553eb Fix whitespace 2017-10-16 18:30:04 +02:00
ines
6ceadcdb5c Make sure from_disk passes string to numpy (see #1421)
If path is a WindowsPath, numpy does not recognise it as a path and as
a result, doesn't open the file.
https://github.com/numpy/numpy/blob/master/numpy/lib/npyio.py#L369
2017-10-16 18:29:56 +02:00
ines
b39409173e Add disable option and True/False/None values for pipeline 2017-10-07 00:29:08 +02:00
ines
212c8f0711 Implement new Language methods and pipeline API 2017-10-07 00:25:54 +02:00
Matthew Honnibal
f24c2e3a8a Fix evaluate for non-GPU 2017-10-03 22:47:31 +02:00
ines
8dbe49ecb8 Always compare lowercase package names
Otherwise, is_package will return False if model name contains
uppercase characters. See this issue:
https://support.prodi.gy/t/saving-a-trained-ner-model-as-a-loadable-module/46/6
2017-09-29 20:55:17 +02:00
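The behaviour commit 8dbe49ecb8 describes can be sketched as below. This is a simplified stand-in, not the real `is_package`: the installed package names are passed in as a list (an assumption that keeps the example self-contained), whereas the real check reads them from the environment.

```python
def is_package(name, installed_names):
    """Check whether `name` is among the installed package names.

    Hypothetical stand-in for illustration; the list of installed names is
    supplied by the caller instead of being read from the environment.
    """
    # Lowercase both sides so a model name containing uppercase
    # characters still matches its installed package.
    name = name.lower()
    return any(name == installed.lower() for installed in installed_names)


installed = ["en-core-web-sm", "my-custom-model"]
print(is_package("My-Custom-Model", installed))  # prints: True
print(is_package("missing-model", installed))    # prints: False
```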
ines
153c2589d4 Revert "Always compare lowercase package names"
This reverts commit 7d77dc490f.
2017-09-29 20:53:36 +02:00
ines
7d77dc490f Always compare lowercase package names
Otherwise, is_package will return False if model name contains
uppercase characters. See this issue:
https://support.prodi.gy/t/saving-a-trained-ner-model-as-a-loadable-module/46/6
2017-09-29 20:52:28 +02:00
Matthew Honnibal
ffda38356a Add util function to enable GPU 2017-09-20 19:16:35 -05:00
ines
68f66aebf8 Use pkg_resources instead of pip for is_package (resolves #1293) 2017-09-16 20:27:59 +02:00
Matthew Honnibal
30e35d9666 Fix syntax error 2017-08-30 17:35:39 -05:00
ines
173089a45a Add more validation for model meta 2017-08-29 11:21:46 +02:00
Matthew Honnibal
ed95009b5c Fix data loading on Python 2 2017-08-18 21:57:06 +02:00
Dan O'Huiginn
ebf5a3ce59 Allow loading with python < 3.6
Don't rely on recent python features to load models

Fixes Issue #1271
2017-08-17 15:15:47 +00:00
ines
ea167e14db Fix model package loading from link 2017-06-05 13:10:49 +02:00
ines
dd6dc4c120 Update spacy.load() helper functions 2017-06-05 13:02:31 +02:00
ines
7db1a0e83e Make sure printed values are always strings 2017-06-04 21:27:20 +02:00
ines
070e026ed9 Ensure path on read_json 2017-06-04 20:44:37 +02:00
ines
e1e73936b1 Raise correct error 2017-06-04 20:44:27 +02:00
ines
4c2bbc3ccc Add add_lookups util function 2017-06-03 19:44:47 +02:00
ines
924c58bde3 Fix serialization of optional elements 2017-06-02 18:18:17 +02:00
Matthew Honnibal
1d18cedae8 Fiddle with msgpack bytes vs unicode 2017-06-01 10:48:43 -05:00
Matthew Honnibal
3ff7d7fcef Merge for updated requirements 2017-06-01 04:57:47 -05:00
Matthew Honnibal
ae8010b526 Move weight serialization to Thinc 2017-06-01 02:56:12 -05:00
Matthew Honnibal
c8a58cfcf8 Fix Python2/3 load bug 2017-05-31 15:21:44 -05:00
Matthew Honnibal
8dfb9546f0 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-05-31 07:21:14 -05:00
Matthew Honnibal
92f9e5cc9a Silence env_opt, and fix serialization for GPU 2017-05-31 07:14:11 -05:00
Matthew Honnibal
33e5ec737f Fix to/from disk methods 2017-05-31 13:43:10 +02:00
Matthew Honnibal
2a061e2777 Fix serialisation, for reals this time 2017-05-29 17:52:08 -05:00
Matthew Honnibal
35d981241f Fix model deserialization 2017-05-29 14:46:31 -05:00
Matthew Honnibal
5b29f227ae Fix serialization 2017-05-29 14:35:53 -05:00