Commit Graph

692 Commits

Author SHA1 Message Date
ines
34a2eecb17 Add simple "naughty strings" test (see #1107) 2017-06-06 17:43:51 +02:00
ines
cc9c5dc7a3 Fix noun chunks test 2017-06-05 16:39:04 +02:00
Matthew Honnibal
b4cdd05466 Add vectors.pyx in setup 2017-06-05 12:45:29 +02:00
Matthew Honnibal
30369d580f Start testing Vectors class 2017-06-05 12:32:49 +02:00
ines
51d7414e94 Make sure sents are a list 2017-06-05 12:30:13 +02:00
ines
a0f4592f0a Update tests 2017-06-05 02:26:13 +02:00
ines
3e105bcd36 Update tests 2017-06-05 02:09:27 +02:00
ines
078232932c Fix tokenizer fixture scope 2017-06-05 01:06:34 +02:00
Matthew Honnibal
58be0e1f6f Update tests 2017-06-04 16:35:06 -05:00
Matthew Honnibal
bb98d45a63 Fix tests 2017-06-04 16:00:44 -05:00
Matthew Honnibal
55d0621532 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-06-04 15:53:25 -05:00
Matthew Honnibal
5b9f116aca Update tests 2017-06-04 15:53:17 -05:00
ines
8a29308d0b Remove unused imports 2017-06-04 22:39:29 +02:00
Ines Montani
112c5787eb Merge pull request #1101 from oroszgy/hu_tokenizer_fix
More robust Hungarian tokenizer.
2017-06-04 22:37:51 +02:00
ines
96867a24ae Fix typo 2017-06-04 22:36:40 +02:00
ines
f432bb4b48 Fix fixture scopes 2017-06-04 22:34:31 +02:00
ines
a66cf24ee8 xfail tokenizer serialization tests for now
Tests pass locally, but not on Travis – needs more investigation
2017-06-04 13:58:20 +02:00
ines
e47eef5e03 Update German tokenizer exceptions and tests 2017-06-03 21:07:44 +02:00
ines
d77c2cc8bb Add tests for English norm exceptions 2017-06-03 20:59:50 +02:00
ines
3152ee5ca2 Update serialization tests for tokenizer 2017-06-03 17:05:28 +02:00
ines
1ebd0d3f27 Add assert_packed_msg_equal util function 2017-06-03 17:04:30 +02:00
ines
de974f7bef Add serializer tests for tokenizer 2017-06-03 13:26:34 +02:00
ines
d21459f87d Update serializer tests 2017-06-02 21:42:26 +02:00
ines
d86e7cde93 Add entity recognizer to parser serialization tests 2017-06-02 18:40:06 +02:00
ines
0051c05964 Add tests for serializing parser 2017-06-02 18:37:19 +02:00
ines
cef547a9f0 Add serialization tests for tensorizer 2017-06-02 18:18:30 +02:00
ines
f74a45c1fe Remove unnecessary argument 2017-06-02 18:17:46 +02:00
ines
43b4d63f85 Add serialization tests for tagger 2017-06-02 17:29:34 +02:00
ines
acd65c00f6 Add serialization tests for StringStore and Vocab 2017-06-02 10:57:42 +02:00
ines
9692c98f57 Add test utils for temp file and temp dir 2017-06-02 10:56:09 +02:00
Matthew Honnibal
4c97371051 Fixes for thinc 6.7 2017-06-01 04:22:16 -05:00
Gyorgy Orosz
f0c3b09242 More robust Hungarian tokenizer. 2017-05-31 22:28:40 +02:00
ines
5e1c361270 Update tests README with info on model tests 2017-05-31 12:22:58 +02:00
Ines Montani
e6cf3c7e1c Merge pull request #1093 from oroszgy/hu_emoji_fix
Fixed emoji handling for Hungarian
2017-05-31 11:33:24 +02:00
Matthew Honnibal
6937e311a4 Update doc tests 2017-05-30 23:34:23 +02:00
Gyorgy Orosz
8c0b4b850e Fixed emoji handling for Hungarian 2017-05-30 21:34:46 +02:00
Matthew Honnibal
b127645afc Fix test_misc merge conflict 2017-05-29 18:31:44 -05:00
Matthew Honnibal
e0e8eae7c7 Tweak package test 2017-05-29 18:30:42 -05:00
ines
20a7003c0d Update model fixtures and reorganise tests 2017-05-29 22:14:31 +02:00
ines
795fe43a4d Add load_test_model function with importorskip()
Loads model only if it can be imported, i.e. if it's installed as a
package.
2017-05-29 22:11:31 +02:00
ines
6e3937efc5 Check for arguments of model markers to specify models to test
Lets user set --models --en for only English models
2017-05-29 22:10:16 +02:00
Matthew Honnibal
f4aafca222 Merge changes to test_misc 2017-05-29 12:26:02 +02:00
Matthew Honnibal
ff26aa6c37 Work on to/from bytes/disk serialization methods 2017-05-29 11:45:45 +02:00
ines
df920ba0e7 Add tests for displaCy and util functions and fix util typo 2017-05-29 10:51:19 +02:00
ines
c5714d4fb2 xfail matcher test for now until setting norm via Span.merge works 2017-05-29 10:51:02 +02:00
Matthew Honnibal
c91b121aeb Move serialization functions to util 2017-05-29 10:13:42 +02:00
Matthew Honnibal
1fa2bfb600 Add model_to_bytes and model_from_bytes helpers. Probably belong in thinc. 2017-05-29 09:27:04 +02:00
Matthew Honnibal
6dad4117ad Work on serialization for models 2017-05-29 01:37:57 +02:00
ines
7b1ddcc04d Add test for vocab serialization 2017-05-29 01:09:52 +02:00
ines
00b2094dc3 Fix typos, long integers and tests 2017-05-29 01:09:52 +02:00
ines
804dbb8d25 Add StringStore test for API docs 2017-05-29 01:09:52 +02:00
Matthew Honnibal
92dbf28c1e Hack a fixture in the vectors tests, for xfail 2017-05-28 20:28:32 +02:00
Matthew Honnibal
fe11564b8e Finish stringstore change. Also xfail vectors tests 2017-05-28 15:10:22 +02:00
Matthew Honnibal
b007a2b0d3 Update stringstore tests 2017-05-28 14:08:09 +02:00
Matthew Honnibal
84e66ca6d4 WIP on stringstore change. 27 failures 2017-05-28 14:06:40 +02:00
Matthew Honnibal
fe4a746300 Accomodate symbols in new string scheme 2017-05-28 13:03:16 +02:00
Matthew Honnibal
a5606c3eda Work on changing StringStore to return hashes. 2017-05-28 12:36:27 +02:00
ines
a8e58e04ef Add symbols class to punctuation rules to handle emoji (see #1088)
Currently doesn't work for Hungarian, because of conflicts with the
custom punctuation rules. Also doesn't take multi-character emoji like
👩🏽‍💻 into account.
2017-05-27 17:57:10 +02:00
Matthew Honnibal
4917cbb484 Include sent_start test 2017-05-23 18:40:37 +02:00
ines
fb0ff0272f xfail neural parser tests for now and remove test for deprecated method 2017-05-23 12:40:37 +02:00
Matthew Honnibal
5418bcf5d7 Resolve conflict on test 2017-05-23 04:37:16 -05:00
ines
e6acd3bbf2 Fix matcher tests and matcher docs 2017-05-23 11:36:02 +02:00
ines
d0c6d4f76d Fix formatting 2017-05-23 11:32:00 +02:00
Matthew Honnibal
3959d778ac Revert "Revert "WIP on improving parser efficiency""
This reverts commit 532afef4a8.
2017-05-23 03:06:53 -05:00
Matthew Honnibal
532afef4a8 Revert "WIP on improving parser efficiency"
This reverts commit bdaac7ab44.
2017-05-23 03:05:25 -05:00
Matthew Honnibal
bdaac7ab44 WIP on improving parser efficiency 2017-05-23 02:59:31 -05:00
ines
b3c7ee0148 Fix tests and use the new Matcher API 2017-05-22 13:54:20 +02:00
Matthew Honnibal
187f370734 Update tests for matcher changes 2017-05-22 12:59:50 +02:00
Matthew Honnibal
7e2cdc0c81 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-05-22 12:39:34 +02:00
Matthew Honnibal
2f78413a02 PseudoProjectivity->nonproj 2017-05-22 05:39:03 -05:00
Matthew Honnibal
d8bb5bb959 Implement StringStore serialization, and update tests 2017-05-22 12:38:00 +02:00
Matthew Honnibal
5db89053aa Merge docstrings 2017-05-21 13:46:23 -05:00
Matthew Honnibal
836fe1d880 Update neural net tests 2017-05-19 18:11:29 -05:00
ines
a804045597 Use is_ancestor instead of deprecated is_ancestor_of 2017-05-19 20:23:40 +02:00
Matthew Honnibal
793430aa7a Get spaCy train command working with neural network
* Integrate models into pipeline
* Add basic serialization (maybe incorrect)
* Fix pickle on vocab
2017-05-17 12:04:50 +02:00
Matthew Honnibal
c9a5d5d24b Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-05-16 16:22:05 +02:00
Matthew Honnibal
8cf097ca88 Redesign training to integrate NN components
* Obsolete .parser, .entity etc names in favour of .pipeline
* Components no longer create models on initialization
* Models created by loading method (from_disk(), from_bytes() etc), or
    .begin_training()
* Add .predict(), .set_annotations() methods in components
* Pass state through pipeline, to allow components to share information
    more flexibly.
2017-05-16 16:17:30 +02:00
Matthew Honnibal
221b4c1ee8 Fix test for Python 3 2017-05-16 13:06:30 +02:00
Matthew Honnibal
1d7c18e58a Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-05-15 21:53:47 +02:00
Matthew Honnibal
a9edb3aa1d Improve integration of NN parser, to support unified training API 2017-05-15 21:53:27 +02:00
ines
b462076d80 Merge load_lang_class and get_lang_class 2017-05-14 01:31:10 +02:00
ines
5858857a78 Update languages list in conftest 2017-05-13 15:37:54 +02:00
ines
8c2a0c026d Fix parse_tree test 2017-05-13 12:32:45 +02:00
Matthew Honnibal
ee1d35bdb0 Fix merge conflict 2017-05-13 03:20:19 +02:00
Matthew Honnibal
b2540d2379 Merge Kengz's tree_print patch 2017-05-13 03:18:49 +02:00
Matthew Honnibal
7253b4e649 Remove old serialization tests 2017-05-09 18:12:58 +02:00
Matthew Honnibal
f9327343ce Start updating serializer test 2017-05-09 18:12:03 +02:00
ines
2c3bdd09b1 Add English test for like_num 2017-05-09 11:06:34 +02:00
ines
22375eafb0 Fix and merge attrs and lex_attrs tests 2017-05-09 11:06:25 +02:00
ines
c714841cc8 Move language-specific tests to tests/lang 2017-05-09 00:02:37 +02:00
ines
bd57b611cc Update conftest to lazy load languages 2017-05-09 00:02:21 +02:00
ines
3c0f85de8e Remove imports in /lang/__init__.py 2017-05-08 23:58:07 +02:00
ines
be5541bd16 Fix import and tokenizer exceptions 2017-05-08 16:20:14 +02:00
ines
2324788970 Remove bad tests 2017-05-08 16:15:27 +02:00
Gregory Howard
c0afcd22bb Merge remote-tracking branch 'remotes/upstream/master' 2017-04-27 14:42:54 +02:00
Gregory Howard
8ff4682255 correcting tokenizer exception.
Adding tests for lemmatization
2017-04-27 11:52:14 +02:00
Ines Montani
7da9cefd25 Merge pull request #1022 from luvogels/master
Initial support for Norwegian Bokmål
2017-04-27 11:16:06 +02:00
Gregory Howard
44cb486849 Adding unitest for tokenization in french (with title) 2017-04-27 10:59:38 +02:00
luvogels
d12a0b6431 Hooked up tokenizer tests 2017-04-26 23:21:41 +02:00
luvogels
8de59ce3b9 Added tokenizer tests 2017-04-26 19:10:18 +02:00