Ines Montani
|
d5d774413a
|
Update comments on EN and DE fixtures
|
2017-01-12 22:03:07 +01:00 |
|
Ines Montani
|
9b4bea1df9
|
Tidy up and rename regression tests and remove unnecessary imports
|
2017-01-12 22:00:37 +01:00 |
|
Ines Montani
|
5e1b6178e3
|
Fix formatting and consistency
|
2017-01-12 22:00:06 +01:00 |
|
Ines Montani
|
a3fd32455e
|
Remove redundant language loading integration tests
|
2017-01-12 21:59:48 +01:00 |
|
Ines Montani
|
61f1ca09c2
|
Modernise serializer codecs tests
|
2017-01-12 21:58:55 +01:00 |
|
Ines Montani
|
5dbc6e59f6
|
Modernise Huffman tests
|
2017-01-12 21:58:40 +01:00 |
|
Ines Montani
|
edeeeccea5
|
Modernise packer tests and don't depend on models where possible
|
2017-01-12 21:58:07 +01:00 |
|
Ines Montani
|
d084676cd0
|
Modernise and merge serialization tests
|
2017-01-12 21:57:19 +01:00 |
|
Ines Montani
|
442237787c
|
Add assert_docs_equal util to compare two docs
|
2017-01-12 21:56:52 +01:00 |
|
Ines Montani
|
eac3f700fb
|
Add fixture for entity recognizer
|
2017-01-12 21:56:32 +01:00 |
|
Ines Montani
|
b438cfddbc
|
Modernise matcher tests and split into two files
|
2017-01-12 17:51:46 +01:00 |
|
Ines Montani
|
27482ebed8
|
Move matcher tests for #188 and #242 to regression tests
Modernise tests and remove unnecessary imports
|
2017-01-12 17:33:57 +01:00 |
|
Ines Montani
|
0a4dc632bd
|
Update test to not create redundant Doc object
|
2017-01-12 17:33:18 +01:00 |
|
Ines Montani
|
a2526e66d8
|
Fix formatting, naming and unicode declaration
|
2017-01-12 16:51:13 +01:00 |
|
Ines Montani
|
052cdff07d
|
Modernise vector similarity tests
|
2017-01-12 16:51:13 +01:00 |
|
Ines Montani
|
bd20ec0a6a
|
Add get_cosine util function
|
2017-01-12 16:51:13 +01:00 |
|
Ines Montani
|
51ef75f629
|
Fix regression test for #615 and remove unnecessary imports
|
2017-01-12 16:51:12 +01:00 |
|
Ines Montani
|
aeb747e10c
|
Adjust formatting
|
2017-01-12 16:51:12 +01:00 |
|
Ines Montani
|
8e3e58a7e6
|
Modernise and merge lexeme vocab tests
|
2017-01-12 16:51:12 +01:00 |
|
Ines Montani
|
c3d4516fc2
|
Move test for #361 to regression tests
|
2017-01-12 16:51:12 +01:00 |
|
Ines Montani
|
7cb3d74426
|
Modernise span tests and don't depend on models
|
2017-01-12 15:30:49 +01:00 |
|
Ines Montani
|
92e3d8b3ee
|
Modernise vocab API tests and remove old xfailing tests
|
2017-01-12 15:27:46 +01:00 |
|
Ines Montani
|
7ea87684cd
|
Rename test_vocab.py to test_vocab_api.py
|
2017-01-12 15:12:21 +01:00 |
|
Ines Montani
|
0da2ee5c68
|
Merge flag features tests into orth tests in tests root
|
2017-01-12 15:12:00 +01:00 |
|
Ines Montani
|
03c136cfd3
|
Remove StringStore tests from vocab tests
|
2017-01-12 15:11:15 +01:00 |
|
Ines Montani
|
d7bd57abdf
|
Modernise add vectors vocab test
|
2017-01-12 15:09:49 +01:00 |
|
Ines Montani
|
89525ef345
|
Use consistent test names
|
2017-01-12 15:09:21 +01:00 |
|
Ines Montani
|
f8803808ce
|
Remove old unused tests and conftest files
|
2017-01-12 15:09:05 +01:00 |
|
Ines Montani
|
4d0bfebcd9
|
Move Pragmatic Segmenter test cases (currently unused) to parser tests
|
2017-01-12 15:08:02 +01:00 |
|
Ines Montani
|
26d018d874
|
Add tests for StringStore
|
2017-01-12 15:07:31 +01:00 |
|
Ines Montani
|
9b6784bab5
|
Add fixture for StringStore
|
2017-01-12 15:05:40 +01:00 |
|
Ines Montani
|
99d66d613a
|
Modernise tests for merging spans and don't depend on models
|
2017-01-12 12:26:26 +01:00 |
|
Ines Montani
|
fa8f67596d
|
Remove unused old test
|
2017-01-12 12:26:08 +01:00 |
|
Ines Montani
|
359f73a96b
|
Move test for #54 to regression tests
|
2017-01-12 12:25:51 +01:00 |
|
Ines Montani
|
3f3a46722c
|
Remove unused conftest
|
2017-01-12 12:25:24 +01:00 |
|
Ines Montani
|
c2406e92bc
|
Allow setting ents in get_doc
|
2017-01-12 12:25:10 +01:00 |
|
Ines Montani
|
c5914c6fe5
|
Fix and pass regression test for #736
|
2017-01-12 11:48:56 +01:00 |
|
Ines Montani
|
a6790b6694
|
Rename tags to pos in get_doc and allow adding tags to tokens
|
2017-01-12 11:18:36 +01:00 |
|
Ines Montani
|
1add8ace67
|
Merge lemmatizer tests
|
2017-01-12 11:16:53 +01:00 |
|
Ines Montani
|
3bc082abdf
|
Modernise morph exceptions test and don't depend on models
|
2017-01-12 11:14:29 +01:00 |
|
Ines Montani
|
ec7739b76e
|
Add regression test for #736
|
2017-01-12 11:12:44 +01:00 |
|
Ines Montani
|
6c1c564891
|
Move language-specific tests out of redundant tokenizer directories
|
2017-01-12 02:17:18 +01:00 |
|
Ines Montani
|
8fecedac3a
|
Tidy up
|
2017-01-12 02:16:37 +01:00 |
|
Ines Montani
|
ae7edd30e7
|
Move text file back to tokenizer tests directory
|
2017-01-12 02:10:23 +01:00 |
|
Ines Montani
|
ffcaba9017
|
Remove old and/or redundant tests
|
2017-01-12 02:10:18 +01:00 |
|
Ines Montani
|
19c4132097
|
Modernise space attachment parser tests and don't depend on models
|
2017-01-12 01:54:44 +01:00 |
|
Ines Montani
|
69778924c8
|
Modernise and merge parser tests and don't depend on models
|
2017-01-12 01:07:29 +01:00 |
|
Ines Montani
|
178c147612
|
Modernise nonprojectivity tests and don't depend on models
|
2017-01-12 01:06:36 +01:00 |
|
Ines Montani
|
1a3984742c
|
Modernise sentence boundary detection tests and don't depend on models (where possible)
|
2017-01-11 23:53:08 +01:00 |
|
Ines Montani
|
0cdb6ea61d
|
Remove old unused pickle test
|
2017-01-11 23:52:28 +01:00 |
|
Ines Montani
|
c9671329dc
|
Move test for #309 to regression tests
|
2017-01-11 23:52:13 +01:00 |
|
Ines Montani
|
d0e37b5670
|
Modernise parser tests and don't depend on models
|
2017-01-11 21:30:27 +01:00 |
|
Ines Montani
|
342cb41782
|
Add apply_transition_sequence util function to utils
|
2017-01-11 21:30:14 +01:00 |
|
Ines Montani
|
09807addff
|
Add en_parser fixture
|
2017-01-11 21:29:59 +01:00 |
|
Ines Montani
|
55d151aa61
|
Modernise Doc parse tree navigation tests and don't depend on models
|
2017-01-11 21:14:15 +01:00 |
|
Ines Montani
|
7262421bb2
|
Use consistent test names
|
2017-01-11 19:00:52 +01:00 |
|
Ines Montani
|
33800c9367
|
Rename "tokens" tests to "doc"
|
2017-01-11 18:59:01 +01:00 |
|
Ines Montani
|
3a9c6a9563
|
Remove old unused files
|
2017-01-11 18:58:38 +01:00 |
|
Ines Montani
|
8e962de39f
|
Remove old word vector tests
|
2017-01-11 18:55:08 +01:00 |
|
Ines Montani
|
e027936920
|
Modernise Doc noun chunks tests
|
2017-01-11 18:54:56 +01:00 |
|
Ines Montani
|
439f396acd
|
Modernise Doc array tests and don't depend on models
|
2017-01-11 18:54:46 +01:00 |
|
Ines Montani
|
05447be884
|
Modernise test for adding entities
|
2017-01-11 18:54:24 +01:00 |
|
Ines Montani
|
6e883f4c00
|
Modernise Doc API tests and don't depend on models
|
2017-01-11 18:05:36 +01:00 |
|
Ines Montani
|
8bf3bb5c44
|
Make words optional for get_doc
|
2017-01-11 18:05:10 +01:00 |
|
Ines Montani
|
928db7e419
|
Fix StringIO import for Python 3
|
2017-01-11 14:07:48 +01:00 |
|
Ines Montani
|
69998f216b
|
Rename test_tokens_api.py to test_doc_api.py
|
2017-01-11 13:58:56 +01:00 |
|
Ines Montani
|
d94dea1b18
|
Merge token tests into token API tests
|
2017-01-11 13:57:02 +01:00 |
|
Ines Montani
|
eb23424ab0
|
Modernise token API tests and don't depend on loading models
|
2017-01-11 13:56:54 +01:00 |
|
Ines Montani
|
c682b8ca90
|
Merge conftests into one cohesive file
|
2017-01-11 13:56:32 +01:00 |
|
Ines Montani
|
909f24d7df
|
Add test utils and get_doc helper function
Create Doc object from given vocab, words and annotations to allow
tests not to depend on loading the models.
|
2017-01-11 13:55:33 +01:00 |
|
Ines Montani
|
3e6e1f0251
|
Tidy up regression tests
|
2017-01-10 19:24:10 +01:00 |
|
Ines Montani
|
869963c3c4
|
Mark extensive prefix/suffix tests as slow
|
2017-01-10 15:57:35 +01:00 |
|
Ines Montani
|
487e020ebe
|
Add simple test for surrounding brackets
|
2017-01-10 15:57:26 +01:00 |
|
Ines Montani
|
0ba5cf51d2
|
Assert length first
|
2017-01-10 15:57:00 +01:00 |
|
Ines Montani
|
2185d31907
|
Adjust names and formatting
|
2017-01-10 15:56:35 +01:00 |
|
Ines Montani
|
e10d4ca964
|
Remove semi-redundant URLs and punctuation for faster testing
|
2017-01-10 15:54:25 +01:00 |
|
Ines Montani
|
3a3cb2c90c
|
Add unicode declaration
|
2017-01-10 15:53:15 +01:00 |
|
Matthew Honnibal
|
64f747cb65
|
Token comparison test
|
2017-01-09 19:12:00 +01:00 |
|
Matthew Honnibal
|
18c3c2d05c
|
Add tests for token comparison, re Issue #631
|
2017-01-09 19:09:59 +01:00 |
|
Matthew Honnibal
|
42cd598f57
|
Use correct fixtures in URL tokenizer
|
2017-01-09 14:10:40 +01:00 |
|
Ines Montani
|
aa876884f0
|
Revert "Revert "Merge remote-tracking branch 'origin/master'""
This reverts commit fb9d3bb022.
|
2017-01-09 13:28:13 +01:00 |
|
Ines Montani
|
d5c72c40eb
|
Remove old tests for old website example code
|
2017-01-08 22:28:53 +01:00 |
|
Ines Montani
|
5d28664fc5
|
Don't test Hungarian for numbers and hyphens for now
Reinvestigate behaviour of case affixes given reorganised tokenizer
patterns.
|
2017-01-08 20:45:40 +01:00 |
|
Ines Montani
|
abb09782f9
|
Move sun.txt to original location and fix path to not break parser tests
|
2017-01-08 20:32:54 +01:00 |
|
Ines Montani
|
8328925e1f
|
Add newlines to long German text
|
2017-01-05 18:13:30 +01:00 |
|
Ines Montani
|
55b46d7cf6
|
Add tokenizer tests for German
|
2017-01-05 18:11:25 +01:00 |
|
Ines Montani
|
5bb4081f52
|
Remove redundant test_tokenizer.py for English
|
2017-01-05 18:11:11 +01:00 |
|
Ines Montani
|
8216ba599b
|
Add tests for longer and mixed English texts
|
2017-01-05 18:11:04 +01:00 |
|
Ines Montani
|
65f937d5c6
|
Move basic contraction tests to test_contractions.py
|
2017-01-05 18:09:53 +01:00 |
|
Ines Montani
|
bbe7cab3a1
|
Move non-English-specific tests back to general tokenizer tests
|
2017-01-05 18:09:29 +01:00 |
|
Ines Montani
|
038002d616
|
Reformat HU tokenizer tests and adapt to general style
Improve readability of test cases and add conftest.py with fixture
|
2017-01-05 18:06:44 +01:00 |
|
Ines Montani
|
637f785036
|
Add general sanity tests for all tokenizers
|
2017-01-05 16:25:38 +01:00 |
|
Ines Montani
|
c5f2dc15de
|
Move English tokenizer tests to directory /en
|
2017-01-05 16:25:04 +01:00 |
|
Ines Montani
|
8b45363b4d
|
Modernize and merge general tokenizer tests
|
2017-01-05 13:17:05 +01:00 |
|
Ines Montani
|
02cfda48c9
|
Modernize and merge tokenizer tests for string loading
|
2017-01-05 13:16:55 +01:00 |
|
Ines Montani
|
a11f684822
|
Modernize and merge tokenizer tests for whitespace
|
2017-01-05 13:16:33 +01:00 |
|
Ines Montani
|
8b284fc6f1
|
Modernize and merge tokenizer tests for text from file
|
2017-01-05 13:15:52 +01:00 |
|
Ines Montani
|
2c2e878653
|
Modernize and merge tokenizer tests for punctuation
|
2017-01-05 13:14:16 +01:00 |
|
Ines Montani
|
8a74129cdf
|
Modernize and merge tokenizer tests for prefixes/suffixes/infixes
|
2017-01-05 13:13:12 +01:00 |
|
Ines Montani
|
0e65dca9a5
|
Modernize and merge tokenizer tests for exception and emoticons
|
2017-01-05 13:11:31 +01:00 |
|
Ines Montani
|
34c47bb20d
|
Fix formatting
|
2017-01-05 13:10:51 +01:00 |
|
Ines Montani
|
2e72683baa
|
Add missing docstrings
|
2017-01-05 13:10:21 +01:00 |
|
Ines Montani
|
da10a049a6
|
Add unicode declarations
|
2017-01-05 13:09:48 +01:00 |
|
Ines Montani
|
58adae8774
|
Remove unused file
|
2017-01-05 13:09:22 +01:00 |
|
Ines Montani
|
c6e5a5349d
|
Move regression test for #360 into own file
|
2017-01-04 00:49:31 +01:00 |
|
Ines Montani
|
8279993a6f
|
Modernize and merge tokenizer tests for punctuation
|
2017-01-04 00:49:20 +01:00 |
|
Ines Montani
|
550630df73
|
Update tokenizer tests for contractions
|
2017-01-04 00:48:42 +01:00 |
|
Ines Montani
|
109f202e8f
|
Update conftest fixture
|
2017-01-04 00:48:21 +01:00 |
|
Ines Montani
|
ee6b49b293
|
Modernize tokenizer tests for emoticons
|
2017-01-04 00:47:59 +01:00 |
|
Ines Montani
|
f09b5a5dfd
|
Modernize tokenizer tests for infixes
|
2017-01-04 00:47:42 +01:00 |
|
Ines Montani
|
59059fed27
|
Move regression test for #351 to own file
|
2017-01-04 00:47:11 +01:00 |
|
Ines Montani
|
667051375d
|
Modernize tokenizer tests for whitespace
|
2017-01-04 00:46:35 +01:00 |
|
Ines Montani
|
aafc894285
|
Modernize tokenizer tests for contractions
Use @pytest.mark.parametrize.
|
2017-01-03 23:02:21 +01:00 |
|
Ines Montani
|
fb9d3bb022
|
Revert "Merge remote-tracking branch 'origin/master'"
This reverts commit d3b181cdf1, reversing
changes made to b19cfcc144.
|
2017-01-03 18:21:36 +01:00 |
|
Matthew Honnibal
|
3ba7c167a8
|
Fix URL tests
|
2016-12-30 17:10:08 -06:00 |
|
Matthew Honnibal
|
9936a1b9b5
|
Merge branch 'tokenization_w_exception_patterns' of https://github.com/oroszgy/spaCy.hu into oroszgy-tokenization_w_exception_patterns
|
2016-12-30 14:53:40 -06:00 |
|
kengz
|
73a38bd4d1
|
Merge remote-tracking branch 'upstream/master'
|
2016-12-30 12:19:59 -05:00 |
|
kengz
|
da44183ae1
|
move parse_tree logic to a new tokens/printers.py file
|
2016-12-30 12:19:18 -05:00 |
|
Matthew Honnibal
|
3e8d9c772e
|
Test interaction of token_match and punctuation
Check that the new token_match function applies after punctuation is split off.
|
2016-12-31 00:52:17 +11:00 |
|
Gyorgy Orosz
|
45e045a87b
|
Unicode/UTF8 compatibility for Python2
|
2016-12-24 00:21:00 +01:00 |
|
Gyorgy Orosz
|
72b61b6d03
|
Typo fix.
|
2016-12-24 00:10:29 +01:00 |
|
Gyorgy Orosz
|
1748549aeb
|
Added exception pattern mechanism to the tokenizer.
|
2016-12-21 23:16:19 +01:00 |
|
Gyorgy Orosz
|
ab2f6ea46c
|
Removed data files from tests.
|
2016-12-21 20:22:09 +01:00 |
|
Gyorgy Orosz
|
3d5306acb9
|
Added further testcases.
|
2016-12-20 23:49:35 +01:00 |
|
Gyorgy Orosz
|
23956e72ff
|
Improved partial support for tokenizing Hungarian numbers
|
2016-12-20 23:36:59 +01:00 |
|
Gyorgy Orosz
|
6add156075
|
Refactored language data structure
|
2016-12-20 22:28:20 +01:00 |
|
Gyorgy Orosz
|
366b3f8685
|
Merge branch 'master' into hu_tokenizer
|
2016-12-20 20:53:31 +01:00 |
|
Gyorgy Orosz
|
c035928156
|
Partial Hungarian number tokenization is added.
|
2016-12-20 20:46:20 +01:00 |
|
Matthew Honnibal
|
f38eb25fe1
|
Fix test for word vector
|
2016-12-18 23:31:55 +01:00 |
|
Matthew Honnibal
|
e4c951c153
|
Merge branch 'organize-language-data' of ssh://github.com/explosion/spaCy into organize-language-data
|
2016-12-18 17:01:08 +01:00 |
|
Ines Montani
|
d1c1d3f9cd
|
Fix tokenizer test
|
2016-12-18 16:55:32 +01:00 |
|
Matthew Honnibal
|
bdcecb3c96
|
Add import in regression test
|
2016-12-18 16:51:31 +01:00 |
|
Ines Montani
|
77cf2fb0f6
|
Remove unnecessary argument in test
|
2016-12-18 14:06:27 +01:00 |
|
Ines Montani
|
121c310566
|
Remove trailing whitespace
|
2016-12-18 14:06:27 +01:00 |
|
Matthew Honnibal
|
0595cc0635
|
Change test595 to mock data, instead of requiring model.
|
2016-12-18 13:28:51 +01:00 |
|
Ines Montani
|
f2c48ef504
|
Resolve stopwords conflict to merge Dutch
|
2016-12-17 13:08:16 +01:00 |
|
Janneke van der Zwaan
|
4a3fdcce8a
|
Merge github.com:explosion/spaCy into dutch
|
2016-12-13 09:25:23 +01:00 |
|
Gyorgy Orosz
|
0cf2144d24
|
Adding partial hyphen and quote handling support.
|
2016-12-11 00:14:36 +01:00 |
|
Gyorgy Orosz
|
2051726fd3
|
Passing Hungarian abbrev tests.
|
2016-12-10 23:37:58 +01:00 |
|
Gyorgy Orosz
|
0289b8ceaa
|
Additional abbreviation tests.
|
2016-12-08 12:17:44 +01:00 |
|
Gyorgy Orosz
|
5b00039955
|
First steps towards the Hungarian tokenizer code.
|
2016-12-07 23:07:43 +01:00 |
|
Ines Montani
|
8350d65695
|
Change morphology and lemmatizer API
Take morphology features as object instead of keyword arguments
|
2016-12-07 21:12:49 +01:00 |
|
Ines Montani
|
52e7d634df
|
Remove trailing whitespace
|
2016-12-07 21:12:19 +01:00 |
|
Ines Montani
|
07f0efb102
|
Add test for tokenizer regular expressions
|
2016-12-07 20:33:28 +01:00 |
|
Matthew Honnibal
|
f6e356aada
|
Add (and test) Span.sentiment attribute. By default we average token.span, but can override with custom hook. Re Issue #667
|
2016-12-02 11:05:50 +01:00 |
|
Janneke van der Zwaan
|
88869e0e07
|
Merge github.com:explosion/spaCy into dutch
|
2016-11-30 17:13:39 +01:00 |
|
Matthew Honnibal
|
6652f2a135
|
Test #656, #624: special case rules for tokenizer with attributes.
|
2016-11-25 12:44:13 +01:00 |
|
Matthew Honnibal
|
53d8ca8f51
|
Add spacy.attrs.intify_attrs function, to normalize strings in token attribute dictionaries.
|
2016-11-25 11:34:30 +01:00 |
|
dafnevk
|
3db8b0d322
|
Added language class and some language data (with some TODOs) for Dutch
|
2016-11-24 15:56:38 +01:00 |
|
Matthew Honnibal
|
e01c1875ee
|
Work on test for #615
|
2016-11-23 23:48:41 +01:00 |
|
Matthew Honnibal
|
e86f440ca6
|
Fix test for issue 617
|
2016-11-10 22:48:10 +01:00 |
|
Matthew Honnibal
|
faa7610c56
|
Merge branch 'master' of ssh://github.com/explosion/spaCy
|
2016-11-10 22:46:38 +01:00 |
|
Matthew Honnibal
|
a2c7de8329
|
spacy/tests/regression/test_issue617.py
Test Issue #617
|
2016-11-10 22:46:23 +01:00 |
|
tiago
|
2a3e342c1f
|
Added a test case to cover the span.merge returning values
|
2016-11-09 18:57:50 +00:00 |
|
Dmitry Sadovnychyi
|
86c056ba64
|
Add basic test for PhraseMatcher
#613
|
2016-11-09 00:10:32 +08:00 |
|
Matthew Honnibal
|
3ea15b257f
|
Fix test for 605
|
2016-11-06 11:59:26 +01:00 |
|
Matthew Honnibal
|
efe7790439
|
Test #590: Order dependence in Matcher rules.
|
2016-11-06 11:21:36 +01:00 |
|
Matthew Honnibal
|
75805397dd
|
Test Issue #605
|
2016-11-06 10:42:32 +01:00 |
|
Matthew Honnibal
|
4a8a2b6001
|
Test #595 -- Bug in lemmatization of base forms.
|
2016-11-04 00:27:32 +01:00 |
|
Matthew Honnibal
|
72b9bd57ec
|
Test Issue #588: Matcher accepts invalid, empty patterns.
|
2016-11-03 00:09:35 +01:00 |
|
Matthew Honnibal
|
b6b01d4680
|
Remove deprecated tokens_from_list test.
|
2016-11-02 23:47:21 +01:00 |
|
Matthew Honnibal
|
3d6c79e595
|
Test Issue #599: .is_tagged and .is_parsed attributes not reflected after deserialization for empty documents.
|
2016-11-02 23:40:11 +01:00 |
|
Matthew Honnibal
|
125c910a8d
|
Test Issue #600
|
2016-11-02 23:24:13 +01:00 |
|
Matthew Honnibal
|
80824f6d29
|
Fix test
|
2016-11-02 20:48:40 +01:00 |
|
Matthew Honnibal
|
c09a8ce5bb
|
Add test for French tokenizer
|
2016-11-02 20:40:31 +01:00 |
|
Matthew Honnibal
|
b012ae3044
|
Add test for loading languages
|
2016-11-02 20:38:48 +01:00 |
|
Matthew Honnibal
|
d8db648ebf
|
Add __init__.py file for regression tests
|
2016-11-01 13:45:06 +01:00 |
|
Matthew Honnibal
|
6977a2b8cd
|
Add test for Issue #589
|
2016-11-01 12:33:36 +01:00 |
|
Matthew Honnibal
|
7e5f63a595
|
Improve test slightly
|
2016-10-28 17:41:16 +02:00 |
|
Matthew Honnibal
|
782e4814f4
|
Test Issue #587: Matcher segfaults on particular input
|
2016-10-28 16:38:32 +02:00 |
|
Matthew Honnibal
|
afea6505f3
|
Test Issue 429: No valid actions for NER after matcher adds a new entity label.
|
2016-10-27 18:01:34 +02:00 |
|
Matthew Honnibal
|
6c47048912
|
Fix test, after IOB tweak.
|
2016-10-26 17:22:03 +02:00 |
|
Matthew Honnibal
|
d3a617aa99
|
Test workaround for Issue #285: Streaming data memory growth
|
2016-10-24 13:48:06 +02:00 |
|
Matthew Honnibal
|
64e5f02cf7
|
Update test
|
2016-10-23 21:08:07 +02:00 |
|
Matthew Honnibal
|
66d7a6eca2
|
Update test
|
2016-10-23 21:02:05 +02:00 |
|
Matthew Honnibal
|
90bf797125
|
Update test
|
2016-10-23 20:54:17 +02:00 |
|
Matthew Honnibal
|
5e76320ffe
|
Update test
|
2016-10-23 20:44:54 +02:00 |
|
Matthew Honnibal
|
aa105927f3
|
Update test
|
2016-10-23 20:31:25 +02:00 |
|
Matthew Honnibal
|
e120561294
|
Fix vector_norm test.
|
2016-10-23 19:56:16 +02:00 |
|
Matthew Honnibal
|
c05cd2356e
|
Fix similarity test for Python 3
|
2016-10-23 18:16:56 +02:00 |
|
Matthew Honnibal
|
79aa03fe98
|
Test Issue #514: Serializer fails when new entity type has been added.
|
2016-10-23 17:41:44 +02:00 |
|
Matthew Honnibal
|
f97548c6f1
|
Fix broken test, re Issue #461
|
2016-10-23 17:02:23 +02:00 |
|
Matthew Honnibal
|
4de30a8e38
|
Test Issue #514: Serialization fails after adding a new entity label.
|
2016-10-23 16:40:27 +02:00 |
|
Matthew Honnibal
|
e99b3f5322
|
Test Issue #459: Fail to deserialize empty doc
|
2016-10-23 16:30:22 +02:00 |
|
Matthew Honnibal
|
99ff8b902f
|
Test that huffman codec works with empty freqs dict
|
2016-10-23 16:27:45 +02:00 |
|
Matthew Honnibal
|
e5627134d9
|
Test Issue #461: ent_iob tag incorrect after setting entities.
|
2016-10-23 15:50:04 +02:00 |
|
Matthew Honnibal
|
2989072aac
|
Add tests to verify that Issue #442 is fixed in 1.1
|
2016-10-23 14:33:13 +02:00 |
|
Matthew Honnibal
|
e838b6d53f
|
Add tests for using the new Entity ID tracking in the rule matcher
|
2016-10-23 14:04:01 +02:00 |
|
Matthew Honnibal
|
e7af75e0a9
|
Add test for vector resizing, re Issue #544
|
2016-10-21 17:07:21 +02:00 |
|
Matthew Honnibal
|
c3a8a1cf51
|
Update serializer test.
|
2016-10-18 16:18:46 +02:00 |
|
Matthew Honnibal
|
7d446e5094
|
Revert "Update matcher test, to reflect character offset return instead of token offset."
This reverts commit f8d3e3bcfe.
|
2016-10-17 16:49:49 +02:00 |
|
Matthew Honnibal
|
4bf2c53c13
|
Revert "Hack on matcher tests, for new implementation."
This reverts commit dbe60644ab.
|
2016-10-17 16:49:48 +02:00 |
|
Matthew Honnibal
|
dbe60644ab
|
Hack on matcher tests, for new implementation.
|
2016-10-17 16:12:22 +02:00 |
|
Matthew Honnibal
|
f8d3e3bcfe
|
Update matcher test, to reflect character offset return instead of token offset.
|
2016-10-17 16:00:10 +02:00 |
|
Matthew Honnibal
|
be48a7b4f3
|
Fix conftest for website tests.
|
2016-10-17 01:54:26 +02:00 |
|
Matthew Honnibal
|
8951bf6989
|
Update matcher tests
|
2016-10-17 01:53:24 +02:00 |
|
Matthew Honnibal
|
0cf4aff470
|
Set default path in EN/DE tests.
|
2016-10-17 01:52:49 +02:00 |
|
Matthew Honnibal
|
cd71b6b0a9
|
Remove test of parser pickle
|
2016-10-17 01:52:10 +02:00 |
|
kengz
|
fb92e2d061
|
activate parse_tree test, use from_array, test for root correctness
|
2016-10-16 15:12:08 -04:00 |
|
kengz
|
17b7832419
|
mark test as needing models
|
2016-10-16 14:39:07 -04:00 |
|
kengz
|
f046e0d7c8
|
add parse_tree method to language, separate from __call__ for efficiency, but will use __call__ to get the doc
|
2016-10-16 14:20:23 -04:00 |
|
Matthew Honnibal
|
5444d38cc6
|
Update test for biluo tags
|
2016-10-16 11:42:45 +02:00 |
|
Matthew Honnibal
|
47afef7d6b
|
Add init.py for gold tests
|
2016-10-15 21:51:28 +02:00 |
|
Matthew Honnibal
|
2163fd238f
|
Add tests for entity->biluo transformation
|
2016-10-15 21:50:43 +02:00 |
|
Matthew Honnibal
|
2516382106
|
Fix loading of English in span test
|
2016-10-15 14:44:37 +02:00 |
|
Matthew Honnibal
|
049197e0ae
|
Update tests, somewhat messily.
|
2016-10-15 14:14:04 +02:00 |
|
Matthew Honnibal
|
1e1a1d9517
|
Update matcher test
|
2016-10-15 14:13:41 +02:00 |
|
Matthew Honnibal
|
9cc9ce0f14
|
Load with default path=False in tests.
|
2016-10-15 14:13:23 +02:00 |
|
Matthew Honnibal
|
788657f062
|
Ensure words are added to vocab before test, so that the lexicon is updated correctly.
|
2016-10-15 14:12:18 +02:00 |
|
Matthew Honnibal
|
2cc515b2ed
|
Add add_flag method to Vocab, re Issue #504.
|
2016-10-14 12:15:38 +02:00 |
|
Matthew Honnibal
|
a42fbcf946
|
Require model for test_is_properties
|
2016-10-12 19:35:18 +02:00 |
|
Matthew Honnibal
|
20c948361b
|
Use local path in test_lemmatizer
|
2016-10-12 19:35:00 +02:00 |
|
Matthew Honnibal
|
1318d0bc65
|
Test with the non-loaded versions of the English and German pipelines.
|
2016-10-12 19:13:31 +02:00 |
|
Matthew Honnibal
|
bd7fe6420c
|
Revert "Changes to test for new string-store"
This reverts commit 21e90d7d0b.
|
2016-09-30 20:11:01 +02:00 |
|
Matthew Honnibal
|
21e90d7d0b
|
Changes to test for new string-store
|
2016-09-30 20:00:58 +02:00 |
|
Matthew Honnibal
|
81a47c01d8
|
Fix test for empty sentence string.
|
2016-09-27 19:21:22 +02:00 |
|
Matthew Honnibal
|
fc4a7ad794
|
Test and fix Issue #411: IndexError when .sents property is used on empty string.
|
2016-09-27 18:49:14 +02:00 |
|
Matthew Honnibal
|
3d370b7d45
|
Add test for Issue #445, fixed in 3cb4d455d, with improved lemmatizer logic
|
2016-09-27 18:39:46 +02:00 |
|
Matthew Honnibal
|
9c8ac91d72
|
Add test for Issue #435
|
2016-09-27 13:52:38 +02:00 |
|
Matthew Honnibal
|
e233328d38
|
Fix Issue #371: Lexeme objects were unhashable.
|
2016-09-27 13:22:30 +02:00 |
|
Matthew Honnibal
|
2debc4e0a2
|
Add .blank() method to Parser. Start housing default dep labels and entity types within the Defaults class.
|
2016-09-26 11:57:54 +02:00 |
|
Matthew Honnibal
|
95aaea0d3f
|
Refactor so that the tokenizer data is read from Python data, rather than from disk
|
2016-09-25 14:49:53 +02:00 |
|
Matthew Honnibal
|
fd65cf6cbb
|
Finish refactoring data loading
|
2016-09-24 20:26:17 +02:00 |
|
Matthew Honnibal
|
83e364188c
|
Mostly finished loading refactoring. Design is in place, but doesn't work yet.
|
2016-09-24 15:42:01 +02:00 |
|
Matthew Honnibal
|
b00f683a0c
|
Fix matcher test
|
2016-09-24 11:20:58 +02:00 |
|
Matthew Honnibal
|
939a791a52
|
Update tests
|
2016-09-24 01:17:03 +02:00 |
|
Matthew Honnibal
|
f6e587b1c7
|
Fix matcher tests
|
2016-09-21 20:45:20 +02:00 |
|
Matthew Honnibal
|
58e83fe34b
|
Initial, limited support for quantified patterns in Matcher, and tracking of ent_id attribute in Token and Span. The quantifiers need a lot more testing, and there are some known problems. The main known problem is that the zero-plus and one-plus quantifiers won't work if a token can match both the quantified pattern expression AND the tail of the match.
|
2016-09-21 14:54:55 +02:00 |
|
Matthew Honnibal
|
cc8bf62208
|
* Fix Issue #360: Tokenizer failed when the infix regex matched the start of the string while trying to tokenize multi-infix tokens.
|
2016-05-09 13:23:47 +02:00 |
|
Matthew Honnibal
|
5d86c30f0b
|
* Fix Issue #367: Missing has_vector property on Doc and Span objects
|
2016-05-09 12:36:14 +02:00 |
|
Matthew Honnibal
|
26095f9722
|
* Add span.sent property, re Issue #366
|
2016-05-06 00:17:38 +02:00 |
|
Matthew Honnibal
|
a6a25166ba
|
* Remove print from test
|
2016-05-05 11:10:59 +02:00 |
|
Matthew Honnibal
|
7441ca30ee
|
* Add tests for Issue #361: Lexeme rich comparison
|
2016-05-05 01:31:58 +02:00 |
|
Matthew Honnibal
|
72564213e3
|
* Add test for Issue #309
|
2016-05-04 16:00:28 +02:00 |
|
Matthew Honnibal
|
76f1d871da
|
Merge branch 'master' of ssh://github.com/spacy-io/spaCy
|
2016-05-04 15:54:00 +02:00 |
|
Matthew Honnibal
|
b4bfc6ae55
|
* Add test for Issue #351: Indices off when leading whitespace
|
2016-05-04 15:53:17 +02:00 |
|
Wolfgang Seeker
|
a06fca9fdf
|
German noun chunk iterator now doesn't return tokens more than once
|
2016-05-03 16:58:59 +02:00 |
|
Wolfgang Seeker
|
7825b75548
|
add tests for German noun chunker
|
2016-05-03 15:01:28 +02:00 |
|
Wolfgang Seeker
|
7b246c13cb
|
reformulate noun chunk tests for English
|
2016-05-03 14:24:35 +02:00 |
|
Wolfgang Seeker
|
1786331cd8
|
add model sanity test
|
2016-05-03 12:51:47 +02:00 |
|
Matthew Honnibal
|
308a28c26c
|
* Whitespace
|
2016-05-02 16:08:11 +02:00 |
|
Matthew Honnibal
|
c1c11a8ae0
|
* Fix formatting on serializer tests
|
2016-05-02 16:07:21 +02:00 |
|
Matthew Honnibal
|
902a389d85
|
* Fix merge conflict in test_parse
|
2016-05-02 15:28:07 +02:00 |
|
Matthew Honnibal
|
02c23cc1d0
|
* Fix sentence boundary test
|
2016-05-02 15:26:07 +02:00 |
|
Matthew Honnibal
|
d2f469b809
|
* Fix parsing tests, so that labels are added if they're missing, and so that the branching test values are correct
|
2016-05-02 15:25:27 +02:00 |
|
Wolfgang Seeker
|
b11cbb06c6
|
remove old tests for sentence boundary detection
|
2016-05-02 14:36:35 +02:00 |
|
Matthew Honnibal
|
508fd1f6dc
|
* Refactor noun chunk iterators, so that they're simple functions. Install the iterator when the Doc is created, but allow users to write to the noun_chunk_iterator attribute. The iterator functions accept an object and yield (int start, int end, int label) triples.
|
2016-05-02 14:25:10 +02:00 |
|
Wolfgang Seeker
|
fa961ea694
|
add tests for serialization bug
|
2016-05-02 11:01:56 +02:00 |
|
Wolfgang Seeker
|
1003e7ccec
|
remove debug output from tests
|
2016-04-25 12:12:40 +02:00 |
|
Wolfgang Seeker
|
f57f843e85
|
fix bug in updating tree structure when introducing additional roots
|
2016-04-25 12:01:19 +02:00 |
|