ines
34a2eecb17
Add simple "naughty strings" test (see #1107)
2017-06-06 17:43:51 +02:00
ines
cc9c5dc7a3
Fix noun chunks test
2017-06-05 16:39:04 +02:00
Matthew Honnibal
b4cdd05466
Add vectors.pyx in setup
2017-06-05 12:45:29 +02:00
Matthew Honnibal
30369d580f
Start testing Vectors class
2017-06-05 12:32:49 +02:00
ines
51d7414e94
Make sure sents are a list
2017-06-05 12:30:13 +02:00
ines
a0f4592f0a
Update tests
2017-06-05 02:26:13 +02:00
ines
3e105bcd36
Update tests
2017-06-05 02:09:27 +02:00
ines
078232932c
Fix tokenizer fixture scope
2017-06-05 01:06:34 +02:00
Matthew Honnibal
58be0e1f6f
Update tests
2017-06-04 16:35:06 -05:00
Matthew Honnibal
bb98d45a63
Fix tests
2017-06-04 16:00:44 -05:00
Matthew Honnibal
55d0621532
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-06-04 15:53:25 -05:00
Matthew Honnibal
5b9f116aca
Update tests
2017-06-04 15:53:17 -05:00
ines
8a29308d0b
Remove unused imports
2017-06-04 22:39:29 +02:00
Ines Montani
112c5787eb
Merge pull request #1101 from oroszgy/hu_tokenizer_fix
...
More robust Hungarian tokenizer.
2017-06-04 22:37:51 +02:00
ines
96867a24ae
Fix typo
2017-06-04 22:36:40 +02:00
ines
f432bb4b48
Fix fixture scopes
2017-06-04 22:34:31 +02:00
ines
a66cf24ee8
xfail tokenizer serialization tests for now
...
Tests pass locally, but not on Travis – needs more investigation
2017-06-04 13:58:20 +02:00
ines
e47eef5e03
Update German tokenizer exceptions and tests
2017-06-03 21:07:44 +02:00
ines
d77c2cc8bb
Add tests for English norm exceptions
2017-06-03 20:59:50 +02:00
ines
3152ee5ca2
Update serialization tests for tokenizer
2017-06-03 17:05:28 +02:00
ines
1ebd0d3f27
Add assert_packed_msg_equal util function
2017-06-03 17:04:30 +02:00
ines
de974f7bef
Add serializer tests for tokenizer
2017-06-03 13:26:34 +02:00
ines
d21459f87d
Update serializer tests
2017-06-02 21:42:26 +02:00
ines
d86e7cde93
Add entity recognizer to parser serialization tests
2017-06-02 18:40:06 +02:00
ines
0051c05964
Add tests for serializing parser
2017-06-02 18:37:19 +02:00
ines
cef547a9f0
Add serialization tests for tensorizer
2017-06-02 18:18:30 +02:00
ines
f74a45c1fe
Remove unnecessary argument
2017-06-02 18:17:46 +02:00
ines
43b4d63f85
Add serialization tests for tagger
2017-06-02 17:29:34 +02:00
ines
acd65c00f6
Add serialization tests for StringStore and Vocab
2017-06-02 10:57:42 +02:00
ines
9692c98f57
Add test utils for temp file and temp dir
2017-06-02 10:56:09 +02:00
Matthew Honnibal
4c97371051
Fixes for thinc 6.7
2017-06-01 04:22:16 -05:00
Gyorgy Orosz
f0c3b09242
More robust Hungarian tokenizer.
2017-05-31 22:28:40 +02:00
ines
5e1c361270
Update tests README with info on model tests
2017-05-31 12:22:58 +02:00
Ines Montani
e6cf3c7e1c
Merge pull request #1093 from oroszgy/hu_emoji_fix
...
Fixed emoji handling for Hungarian
2017-05-31 11:33:24 +02:00
Matthew Honnibal
6937e311a4
Update doc tests
2017-05-30 23:34:23 +02:00
Gyorgy Orosz
8c0b4b850e
Fixed emoji handling for Hungarian
2017-05-30 21:34:46 +02:00
Matthew Honnibal
b127645afc
Fix test_misc merge conflict
2017-05-29 18:31:44 -05:00
Matthew Honnibal
e0e8eae7c7
Tweak package test
2017-05-29 18:30:42 -05:00
ines
20a7003c0d
Update model fixtures and reorganise tests
2017-05-29 22:14:31 +02:00
ines
795fe43a4d
Add load_test_model function with importorskip()
...
Loads model only if it can be imported, i.e. if it's installed as a
package.
2017-05-29 22:11:31 +02:00
ines
6e3937efc5
Check for arguments of model markers to specify models to test
...
Lets user set --models --en for only English models
2017-05-29 22:10:16 +02:00
Matthew Honnibal
f4aafca222
Merge changes to test_misc
2017-05-29 12:26:02 +02:00
Matthew Honnibal
ff26aa6c37
Work on to/from bytes/disk serialization methods
2017-05-29 11:45:45 +02:00
ines
df920ba0e7
Add tests for displaCy and util functions and fix util typo
2017-05-29 10:51:19 +02:00
ines
c5714d4fb2
xfail matcher test for now until setting norm via Span.merge works
2017-05-29 10:51:02 +02:00
Matthew Honnibal
c91b121aeb
Move serialization functions to util
2017-05-29 10:13:42 +02:00
Matthew Honnibal
1fa2bfb600
Add model_to_bytes and model_from_bytes helpers. Probably belong in thinc.
2017-05-29 09:27:04 +02:00
Matthew Honnibal
6dad4117ad
Work on serialization for models
2017-05-29 01:37:57 +02:00
ines
7b1ddcc04d
Add test for vocab serialization
2017-05-29 01:09:52 +02:00
ines
00b2094dc3
Fix typos, long integers and tests
2017-05-29 01:09:52 +02:00
ines
804dbb8d25
Add StringStore test for API docs
2017-05-29 01:09:52 +02:00
Matthew Honnibal
92dbf28c1e
Hack a fixture in the vectors tests, for xfail
2017-05-28 20:28:32 +02:00
Matthew Honnibal
fe11564b8e
Finish stringstore change. Also xfail vectors tests
2017-05-28 15:10:22 +02:00
Matthew Honnibal
b007a2b0d3
Update stringstore tests
2017-05-28 14:08:09 +02:00
Matthew Honnibal
84e66ca6d4
WIP on stringstore change. 27 failures
2017-05-28 14:06:40 +02:00
Matthew Honnibal
fe4a746300
Accommodate symbols in new string scheme
2017-05-28 13:03:16 +02:00
Matthew Honnibal
a5606c3eda
Work on changing StringStore to return hashes.
2017-05-28 12:36:27 +02:00
ines
a8e58e04ef
Add symbols class to punctuation rules to handle emoji (see #1088)
...
Currently doesn't work for Hungarian, because of conflicts with the
custom punctuation rules. Also doesn't take multi-character emoji like
👩🏽💻 into account.
2017-05-27 17:57:10 +02:00
Matthew Honnibal
4917cbb484
Include sent_start test
2017-05-23 18:40:37 +02:00
ines
fb0ff0272f
xfail neural parser tests for now and remove test for deprecated method
2017-05-23 12:40:37 +02:00
Matthew Honnibal
5418bcf5d7
Resolve conflict on test
2017-05-23 04:37:16 -05:00
ines
e6acd3bbf2
Fix matcher tests and matcher docs
2017-05-23 11:36:02 +02:00
ines
d0c6d4f76d
Fix formatting
2017-05-23 11:32:00 +02:00
Matthew Honnibal
3959d778ac
Revert "Revert "WIP on improving parser efficiency""
...
This reverts commit 532afef4a8.
2017-05-23 03:06:53 -05:00
Matthew Honnibal
532afef4a8
Revert "WIP on improving parser efficiency"
...
This reverts commit bdaac7ab44.
2017-05-23 03:05:25 -05:00
Matthew Honnibal
bdaac7ab44
WIP on improving parser efficiency
2017-05-23 02:59:31 -05:00
ines
b3c7ee0148
Fix tests and use the new Matcher API
2017-05-22 13:54:20 +02:00
Matthew Honnibal
187f370734
Update tests for matcher changes
2017-05-22 12:59:50 +02:00
Matthew Honnibal
7e2cdc0c81
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-05-22 12:39:34 +02:00
Matthew Honnibal
2f78413a02
PseudoProjectivity->nonproj
2017-05-22 05:39:03 -05:00
Matthew Honnibal
d8bb5bb959
Implement StringStore serialization, and update tests
2017-05-22 12:38:00 +02:00
Matthew Honnibal
5db89053aa
Merge docstrings
2017-05-21 13:46:23 -05:00
Matthew Honnibal
836fe1d880
Update neural net tests
2017-05-19 18:11:29 -05:00
ines
a804045597
Use is_ancestor instead of deprecated is_ancestor_of
2017-05-19 20:23:40 +02:00
Matthew Honnibal
793430aa7a
Get spaCy train command working with neural network
...
* Integrate models into pipeline
* Add basic serialization (maybe incorrect)
* Fix pickle on vocab
2017-05-17 12:04:50 +02:00
Matthew Honnibal
c9a5d5d24b
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-05-16 16:22:05 +02:00
Matthew Honnibal
8cf097ca88
Redesign training to integrate NN components
...
* Obsolete .parser, .entity etc names in favour of .pipeline
* Components no longer create models on initialization
* Models created by loading method (from_disk(), from_bytes() etc), or
.begin_training()
* Add .predict(), .set_annotations() methods in components
* Pass state through pipeline, to allow components to share information
more flexibly.
2017-05-16 16:17:30 +02:00
Matthew Honnibal
221b4c1ee8
Fix test for Python 3
2017-05-16 13:06:30 +02:00
Matthew Honnibal
1d7c18e58a
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-05-15 21:53:47 +02:00
Matthew Honnibal
a9edb3aa1d
Improve integration of NN parser, to support unified training API
2017-05-15 21:53:27 +02:00
ines
b462076d80
Merge load_lang_class and get_lang_class
2017-05-14 01:31:10 +02:00
ines
5858857a78
Update languages list in conftest
2017-05-13 15:37:54 +02:00
ines
8c2a0c026d
Fix parse_tree test
2017-05-13 12:32:45 +02:00
Matthew Honnibal
ee1d35bdb0
Fix merge conflict
2017-05-13 03:20:19 +02:00
Matthew Honnibal
b2540d2379
Merge Kengz's tree_print patch
2017-05-13 03:18:49 +02:00
Matthew Honnibal
7253b4e649
Remove old serialization tests
2017-05-09 18:12:58 +02:00
Matthew Honnibal
f9327343ce
Start updating serializer test
2017-05-09 18:12:03 +02:00
ines
2c3bdd09b1
Add English test for like_num
2017-05-09 11:06:34 +02:00
ines
22375eafb0
Fix and merge attrs and lex_attrs tests
2017-05-09 11:06:25 +02:00
ines
c714841cc8
Move language-specific tests to tests/lang
2017-05-09 00:02:37 +02:00
ines
bd57b611cc
Update conftest to lazy load languages
2017-05-09 00:02:21 +02:00
ines
3c0f85de8e
Remove imports in /lang/__init__.py
2017-05-08 23:58:07 +02:00
ines
be5541bd16
Fix import and tokenizer exceptions
2017-05-08 16:20:14 +02:00
ines
2324788970
Remove bad tests
2017-05-08 16:15:27 +02:00
Gregory Howard
c0afcd22bb
Merge remote-tracking branch 'remotes/upstream/master'
2017-04-27 14:42:54 +02:00
Gregory Howard
8ff4682255
correcting tokenizer exception.
...
Adding tests for lemmatization
2017-04-27 11:52:14 +02:00
Ines Montani
7da9cefd25
Merge pull request #1022 from luvogels/master
...
Initial support for Norwegian Bokmål
2017-04-27 11:16:06 +02:00
Gregory Howard
44cb486849
Adding unit test for tokenization in French (with title)
2017-04-27 10:59:38 +02:00
luvogels
d12a0b6431
Hooked up tokenizer tests
2017-04-26 23:21:41 +02:00
luvogels
8de59ce3b9
Added tokenizer tests
2017-04-26 19:10:18 +02:00