ines
c5714d4fb2
xfail matcher test for now until setting norm via Span.merge works
2017-05-29 10:51:02 +02:00
Matthew Honnibal
c91b121aeb
Move serialization functions to util
2017-05-29 10:13:42 +02:00
Matthew Honnibal
1fa2bfb600
Add model_to_bytes and model_from_bytes helpers. Probably belong in thinc.
2017-05-29 09:27:04 +02:00
Matthew Honnibal
6dad4117ad
Work on serialization for models
2017-05-29 01:37:57 +02:00
ines
7b1ddcc04d
Add test for vocab serialization
2017-05-29 01:09:52 +02:00
ines
00b2094dc3
Fix typos, long integers and tests
2017-05-29 01:09:52 +02:00
ines
804dbb8d25
Add StringStore test for API docs
2017-05-29 01:09:52 +02:00
Matthew Honnibal
92dbf28c1e
Hack a fixture in the vectors tests, for xfail
2017-05-28 20:28:32 +02:00
Matthew Honnibal
fe11564b8e
Finish stringstore change. Also xfail vectors tests
2017-05-28 15:10:22 +02:00
Matthew Honnibal
b007a2b0d3
Update stringstore tests
2017-05-28 14:08:09 +02:00
Matthew Honnibal
84e66ca6d4
WIP on stringstore change. 27 failures
2017-05-28 14:06:40 +02:00
Matthew Honnibal
fe4a746300
Accommodate symbols in new string scheme
2017-05-28 13:03:16 +02:00
Matthew Honnibal
a5606c3eda
Work on changing StringStore to return hashes.
2017-05-28 12:36:27 +02:00
ines
a8e58e04ef
Add symbols class to punctuation rules to handle emoji (see #1088)
Currently doesn't work for Hungarian, because of conflicts with the
custom punctuation rules. Also doesn't take multi-character emoji like
👩🏽💻 into account.
2017-05-27 17:57:10 +02:00
Matthew Honnibal
4917cbb484
Include sent_start test
2017-05-23 18:40:37 +02:00
ines
fb0ff0272f
xfail neural parser tests for now and remove test for deprecated method
2017-05-23 12:40:37 +02:00
Matthew Honnibal
5418bcf5d7
Resolve conflict on test
2017-05-23 04:37:16 -05:00
ines
e6acd3bbf2
Fix matcher tests and matcher docs
2017-05-23 11:36:02 +02:00
ines
d0c6d4f76d
Fix formatting
2017-05-23 11:32:00 +02:00
Matthew Honnibal
3959d778ac
Revert "Revert "WIP on improving parser efficiency""
This reverts commit 532afef4a8.
2017-05-23 03:06:53 -05:00
Matthew Honnibal
532afef4a8
Revert "WIP on improving parser efficiency"
This reverts commit bdaac7ab44.
2017-05-23 03:05:25 -05:00
Matthew Honnibal
bdaac7ab44
WIP on improving parser efficiency
2017-05-23 02:59:31 -05:00
ines
b3c7ee0148
Fix tests and use the new Matcher API
2017-05-22 13:54:20 +02:00
Matthew Honnibal
187f370734
Update tests for matcher changes
2017-05-22 12:59:50 +02:00
Matthew Honnibal
7e2cdc0c81
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-05-22 12:39:34 +02:00
Matthew Honnibal
2f78413a02
PseudoProjectivity->nonproj
2017-05-22 05:39:03 -05:00
Matthew Honnibal
d8bb5bb959
Implement StringStore serialization, and update tests
2017-05-22 12:38:00 +02:00
Matthew Honnibal
5db89053aa
Merge docstrings
2017-05-21 13:46:23 -05:00
Matthew Honnibal
836fe1d880
Update neural net tests
2017-05-19 18:11:29 -05:00
ines
a804045597
Use is_ancestor instead of deprecated is_ancestor_of
2017-05-19 20:23:40 +02:00
Matthew Honnibal
793430aa7a
Get spaCy train command working with neural network
* Integrate models into pipeline
* Add basic serialization (maybe incorrect)
* Fix pickle on vocab
2017-05-17 12:04:50 +02:00
Matthew Honnibal
c9a5d5d24b
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-05-16 16:22:05 +02:00
Matthew Honnibal
8cf097ca88
Redesign training to integrate NN components
* Obsolete .parser, .entity etc names in favour of .pipeline
* Components no longer create models on initialization
* Models created by loading method (from_disk(), from_bytes() etc), or .begin_training()
* Add .predict(), .set_annotations() methods in components
* Pass state through pipeline, to allow components to share information more flexibly.
2017-05-16 16:17:30 +02:00
Matthew Honnibal
221b4c1ee8
Fix test for Python 3
2017-05-16 13:06:30 +02:00
Matthew Honnibal
1d7c18e58a
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-05-15 21:53:47 +02:00
Matthew Honnibal
a9edb3aa1d
Improve integration of NN parser, to support unified training API
2017-05-15 21:53:27 +02:00
ines
b462076d80
Merge load_lang_class and get_lang_class
2017-05-14 01:31:10 +02:00
ines
5858857a78
Update languages list in conftest
2017-05-13 15:37:54 +02:00
ines
8c2a0c026d
Fix parse_tree test
2017-05-13 12:32:45 +02:00
Matthew Honnibal
ee1d35bdb0
Fix merge conflict
2017-05-13 03:20:19 +02:00
Matthew Honnibal
b2540d2379
Merge Kengz's tree_print patch
2017-05-13 03:18:49 +02:00
Matthew Honnibal
7253b4e649
Remove old serialization tests
2017-05-09 18:12:58 +02:00
Matthew Honnibal
f9327343ce
Start updating serializer test
2017-05-09 18:12:03 +02:00
ines
2c3bdd09b1
Add English test for like_num
2017-05-09 11:06:34 +02:00
ines
22375eafb0
Fix and merge attrs and lex_attrs tests
2017-05-09 11:06:25 +02:00
ines
c714841cc8
Move language-specific tests to tests/lang
2017-05-09 00:02:37 +02:00
ines
bd57b611cc
Update conftest to lazy load languages
2017-05-09 00:02:21 +02:00
ines
3c0f85de8e
Remove imports in /lang/__init__.py
2017-05-08 23:58:07 +02:00
ines
be5541bd16
Fix import and tokenizer exceptions
2017-05-08 16:20:14 +02:00
ines
2324788970
Remove bad tests
2017-05-08 16:15:27 +02:00
Gregory Howard
c0afcd22bb
Merge remote-tracking branch 'remotes/upstream/master'
2017-04-27 14:42:54 +02:00
Gregory Howard
8ff4682255
correcting tokenizer exception.
Adding tests for lemmatization
2017-04-27 11:52:14 +02:00
Ines Montani
7da9cefd25
Merge pull request #1022 from luvogels/master
Initial support for Norwegian Bokmål
2017-04-27 11:16:06 +02:00
Gregory Howard
44cb486849
Adding unit test for tokenization in French (with title)
2017-04-27 10:59:38 +02:00
luvogels
d12a0b6431
Hooked up tokenizer tests
2017-04-26 23:21:41 +02:00
luvogels
8de59ce3b9
Added tokenizer tests
2017-04-26 19:10:18 +02:00
Matthew Honnibal
4d98511db7
Make Span hashable. Closes #1019
2017-04-26 19:01:05 +02:00
Matthew Honnibal
24c4c51f13
Try to make test999 less flakey
2017-04-26 18:42:06 +02:00
Gregory Howard
ed5f094451
Adding insensitive lemmatisation test
2017-04-25 18:07:02 +02:00
ghoward
26e31afc18
Renaming tests
2017-04-25 17:46:01 +02:00
ghoward
c085c2d391
Adding some unit tests
2017-04-25 17:44:16 +02:00
Matthew Honnibal
c4be9c36fe
Fix unicode header in tests
2017-04-24 10:09:01 +02:00
Matthew Honnibal
65f10b53e5
Fix test
2017-04-24 00:25:55 +02:00
Matthew Honnibal
70a43858e1
Fix flakey test
2017-04-24 00:06:30 +02:00
Matthew Honnibal
3973af2d15
Make training test less flakey
2017-04-23 22:59:34 +02:00
ines
42305bc519
Remove unnecessary test
2017-04-23 21:21:41 +02:00
ines
012ea594d1
Add file for misc tests
2017-04-23 21:06:51 +02:00
ines
83f66947dc
Rename test_download to test_cli
2017-04-23 21:06:50 +02:00
Matthew Honnibal
874a3cbb07
Add test for Issue #955
2017-04-23 17:57:01 +02:00
Matthew Honnibal
5d8af40445
Add test for Issue #999
2017-04-23 17:06:30 +02:00
Matthew Honnibal
040751ad17
Remove xfail on Test #910
2017-04-23 16:28:55 +02:00
Ben Eyal
e90e8a3f10
Enable test
2017-04-20 02:25:24 +03:00
ines
2bd89e7ade
Tidy up Hebrew tests and test for punctuation (see #995)
2017-04-19 19:28:03 +02:00
ines
13d30b6c01
xfail lemmatizer test that's causing problems (see #546)
2017-04-16 21:18:39 +02:00
ines
0084466a66
Remove unused utf8open util and replace os.path with ensure_path
2017-04-16 20:37:45 +02:00
Matthew Honnibal
1dca7eeb03
Add unicode declaration on new regression test
2017-04-07 18:09:23 +02:00
ines
887827fc6a
Merge branch 'develop'
2017-04-07 17:36:23 +02:00
ines
444dd511c5
Fix xpassing URL test case
2017-04-07 17:36:05 +02:00
ines
bf0f15e762
Add / to tokenizer infixes (resolves #891)
2017-04-07 17:30:44 +02:00
ines
00b9011a49
Fix whitespace
2017-04-07 17:29:59 +02:00
Matthew Honnibal
0513c43bf0
Merge branch 'master' of https://github.com/explosion/spaCy
2017-04-07 17:07:10 +02:00
Matthew Honnibal
cc36c308f4
Fix noun_chunk rules around coordination
Closes #693.
2017-04-07 17:06:40 +02:00
Matthew Honnibal
ab846256cf
Merge pull request #966 from recognai/master
Prepare Spanish language for training models, including configuration, rich-UD tag map and tests
2017-04-07 16:12:29 +02:00
Matthew Honnibal
83dca920d4
Rename test #913 -> #957, comment
Make test for #957 reference correct bug. Add comment.
Previous commit closes #957.
2017-04-07 15:54:25 +02:00
Matthew Honnibal
5887383fc0
Add test for Issue #913: Hang from bad regex
2017-04-07 15:47:27 +02:00
oeg
c693d40791
feature(model): Add support for creating the Spanish model, including rich tagset, configuration, and basic tests
2017-04-06 18:48:45 +02:00
Matthew Honnibal
cfff4e0f61
Improve test
2017-03-31 13:59:32 +02:00
Matthew Honnibal
e854f28304
Add test for Issue #758
Issue #758 occurs when no actions are available for a single token
doc after merging.
2017-03-31 13:26:25 +02:00
Matthew Honnibal
0fefdfcbda
Merge pull request #935 from ericzhao28/master
Add option to use label=ent_type in doc.merge arguments (Bug fix for issue #862)
2017-03-30 02:51:24 +02:00
Eric Zhao
aafdf6ffb8
Add option to use label kwarg to determine ent_type in doc.merge
2017-03-28 23:35:03 -07:00
Matthew Honnibal
b94286de30
Fix regression test
2017-03-25 22:35:07 +01:00
Matthew Honnibal
4f400fa486
Prevent lemmatization of base nouns
Update lemmatizer's base-form check, for change in morphology class.
Closes #903.
2017-03-25 21:51:12 +01:00
Matthew Honnibal
4454c1b23f
Block lemmatization of base-form adjectives
Fixes check that an adjective is a base form (as opposed to a
comparative or superlative), so that it's not lemmatized.
e.g. inner -!> inn. Closes #912.
2017-03-25 21:29:57 +01:00
Ines Montani
97cb4d5e3c
Merge branch 'master' into master
2017-03-25 10:03:47 +01:00
Iddo Berger
da135bd823
Add Hebrew tokenizer
2017-03-24 18:27:44 +03:00
Matthew Honnibal
f40fbc3710
Add test for Issue #910: Resuming entity training
2017-03-23 23:38:57 +01:00
ines
f830213c4c
Remove compatibility check test
Will only cause problems when incrementing version and not updating
table. Also depends on external URL, which is bad.
2017-03-20 13:20:26 +01:00
Ines Montani
b6ee241e26
Fix print statements
2017-03-20 11:46:37 +01:00
ines
fe0ff00fe1
Fix spacing
2017-03-19 11:55:37 +01:00
ines
5712da6095
Add regression test for #891
2017-03-19 11:48:01 +01:00