Matthew Honnibal
|
fea9fe08af
|
Merge pull request #866 from juanmirocks/master
Fix lemmatization of OOV words
|
2017-03-16 23:37:36 +01:00 |
|
Matthew Honnibal
|
28bb546939
|
Merge pull request #883 from ericzhao28/master
Add `lower_` and `upper_` properties to `Span` class
|
2017-03-16 23:35:47 +01:00 |
|
Matthew Honnibal
|
8843b84bd1
|
Merge remote-tracking branch 'origin/develop-downloads'
|
2017-03-16 12:00:42 -05:00 |
|
ines
|
4cfc8ffbd2
|
Reformat pickle tests
|
2017-03-15 17:39:54 +01:00 |
|
ines
|
2a0fcf1354
|
Add tests for new download module
|
2017-03-15 17:39:43 +01:00 |
|
Matthew Honnibal
|
4cab8ac136
|
Update morph exceptions test
|
2017-03-15 09:31:34 -05:00 |
|
ines
|
42ba740dde
|
Revert "Merge branch 'debug'"
This reverts commit 89b79d1178, reversing
changes made to 02bdf490a1.
|
2017-03-13 20:11:52 +01:00 |
|
ines
|
4c5f51e49e
|
Update regression test
|
2017-03-13 15:16:11 +01:00 |
|
ines
|
02bdf490a1
|
Remove regression test to see if it caused pytest Travis error
|
2017-03-13 13:00:22 +01:00 |
|
ines
|
17018750ac
|
Add regression test for #717
|
2017-03-13 12:58:22 +01:00 |
|
ines
|
2883ebfca2
|
Remove print statement
|
2017-03-13 12:30:42 +01:00 |
|
ines
|
98c13d8aa9
|
Add regression test for #401
|
2017-03-13 12:28:41 +01:00 |
|
ines
|
444d665f9d
|
Add regression test for #686
|
2017-03-13 12:23:35 +01:00 |
|
ines
|
46b17e5b51
|
Add regression test for #719
|
2017-03-13 12:17:35 +01:00 |
|
ines
|
c8ae682ff9
|
Add regression test for #636
|
2017-03-13 12:08:31 +01:00 |
|
ines
|
337f9601f2
|
Add missing unicode declaration
|
2017-03-13 12:08:19 +01:00 |
|
ines
|
d70386ec6e
|
Update docstring in #886 regression test
|
2017-03-13 12:00:38 +01:00 |
|
ines
|
51ba3ef0a8
|
Add regression test for #886
|
2017-03-13 11:44:58 +01:00 |
|
ines
|
1da29a7146
|
Use new Lemmatizer data and remove file import
Since there's currently only an English lemmatizer, the global
Lemmatizer imports from spacy.en. This is unideal and still needs to be
fixed.
|
2017-03-12 13:58:22 +01:00 |
|
ines
|
c89e30d1a3
|
Add test for English time exceptions ("1a.m." etc.)
|
2017-03-12 13:58:22 +01:00 |
|
ines
|
66c1f194f9
|
Use consistent unicode declarations
|
2017-03-12 13:07:28 +01:00 |
|
Em
|
9c809efc25
|
Removed mapStr
|
2017-03-11 16:23:26 -08:00 |
|
Matthew Honnibal
|
ea2592879f
|
Merge branch 'master' of https://github.com/explosion/spaCy
|
2017-03-11 11:13:37 -06:00 |
|
Em
|
426d17167f
|
Added string manipulation for spans
|
2017-03-10 16:50:02 -08:00 |
|
ines
|
10e29189ac
|
Adjust URL testcases and xfail problems (instead of comment)
|
2017-03-10 14:22:50 +01:00 |
|
Matthew Honnibal
|
ea53647362
|
Merge branch 'develop'
|
2017-03-10 02:49:39 -06:00 |
|
Dan Rapp
|
123d3f2d38
|
Fix error in test case parameterization
|
2017-03-09 12:18:21 -07:00 |
|
Dan Rapp
|
b9307dfcd7
|
Merge branch 'master' into rappdw/tokenizer_exceptions_url_fix
|
2017-03-09 11:42:14 -07:00 |
|
Dan Rapp
|
3b1df3808d
|
Issue #840 - URL pattern too broad
|
2017-03-09 11:39:39 -07:00 |
|
Matthew Honnibal
|
5b0b968d13
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-03-08 15:03:10 +01:00 |
|
Matthew Honnibal
|
0ac3d27689
|
Fix handling of trailing whitespace
Fix off-by-one error that meant trailing spaces were being dropped.
Closes #792
|
2017-03-08 15:01:40 +01:00 |
|
ines
|
c2e3e651b8
|
Re-add regression test for #859
|
2017-03-08 14:36:09 +01:00 |
|
Matthew Honnibal
|
16670d3251
|
Xfail the vocab pickling for now
|
2017-03-07 21:43:28 +01:00 |
|
Matthew Honnibal
|
a89c3500f6
|
Fixes to hacky vocab pickling
|
2017-03-07 20:58:55 +01:00 |
|
Matthew Honnibal
|
3edb8ae207
|
Whitespace
|
2017-03-07 17:16:26 +01:00 |
|
Matthew Honnibal
|
5de7e712b7
|
Add support for pickling StringStore.
|
2017-03-07 17:15:18 +01:00 |
|
Matthew Honnibal
|
4e75e74247
|
Update regression test for variable-length pattern problem in the matcher.
|
2017-03-07 16:08:32 +01:00 |
|
Matthew Honnibal
|
6d67213b80
|
Add test for 850: Matcher fails on zero-or-more.
|
2017-03-07 15:55:28 +01:00 |
|
Aniruddha Adhikary
|
696215a3fb
|
add tests for Bengali
|
2017-03-05 11:25:12 +06:00 |
|
ines
|
8dff040032
|
Revert "Add regression test for #859"
This reverts commit c4f16c66d1.
|
2017-03-01 21:56:20 +01:00 |
|
Juan Miguel Cejuela
|
a8cfde46d3
|
#781 Fix test: colocalizes is lemmatized to colocaliz and colocalize
|
2017-03-01 21:43:08 +01:00 |
|
Juan Miguel Cejuela
|
a471114eb2
|
#781 add regression test, failing previous bug fix
|
2017-03-01 21:30:51 +01:00 |
|
ines
|
c4f16c66d1
|
Add regression test for #859
|
2017-03-01 16:07:27 +01:00 |
|
Matthew Honnibal
|
34bcc8706d
|
Merge branch 'french-tokenizer-exceptions'
|
2017-02-27 11:21:21 +01:00 |
|
Matthew Honnibal
|
0aaa546435
|
Fix test after updating the French tokenizer stuff
|
2017-02-27 11:20:47 +01:00 |
|
ines
|
376c5813a7
|
Remove print statements from test
|
2017-02-24 18:26:32 +01:00 |
|
ines
|
7c1260e98c
|
Add regression test
|
2017-02-24 18:22:49 +01:00 |
|
ines
|
51eb190ef4
|
Remove print statements from test
|
2017-02-24 17:41:12 +01:00 |
|
Matthew Honnibal
|
db5ada3995
|
Merge branch 'master' of https://github.com/explosion/spaCy
|
2017-02-24 14:28:12 +01:00 |
|
Matthew Honnibal
|
8f94897d07
|
Add 1 operator to matcher, and make sure open patterns are closed at end of document. Closes Issue #766
|
2017-02-24 14:27:02 +01:00 |
|
ines
|
67991b6e5f
|
Add more test cases to #775 regression test to cover #847
|
2017-02-18 14:10:44 +01:00 |
|
ines
|
44de3c7642
|
Reformat test and use text_file fixture
|
2017-02-16 23:49:19 +01:00 |
|
ines
|
3dd22e9c88
|
Mark vectors test as xfail (temporary)
|
2017-02-16 23:28:51 +01:00 |
|
ines
|
85d249d451
|
Revert "Revert "Merge pull request #836 from raphael0202/load_vectors (closes #834)""
This reverts commit ea05f78660.
|
2017-02-16 23:26:25 +01:00 |
|
ines
|
ea05f78660
|
Revert "Merge pull request #836 from raphael0202/load_vectors (closes #834)"
This reverts commit 7d8c9eee7f, reversing
changes made to f6b69babcc.
|
2017-02-16 15:27:12 +01:00 |
|
Raphaël Bournhonesque
|
06a71d22df
|
Fix test failure by using unicode literals
|
2017-02-16 14:48:00 +01:00 |
|
Raphaël Bournhonesque
|
3ba109622c
|
Add regression test with non ' ' space character as token
|
2017-02-16 12:23:27 +01:00 |
|
ines
|
21f09d10d7
|
Revert "Revert "Merge pull request #818 from raphael0202/tokenizer_exceptions""
This reverts commit f02a2f9322.
|
2017-02-10 13:17:05 +01:00 |
|
ines
|
f02a2f9322
|
Revert "Merge pull request #818 from raphael0202/tokenizer_exceptions"
This reverts commit b95afdf39c, reversing
changes made to b0ccf32378.
|
2017-02-09 17:07:21 +01:00 |
|
Raphaël Bournhonesque
|
309da78bf0
|
Merge branch 'master' into tokenizer_exceptions
|
2017-02-09 16:32:12 +01:00 |
|
Raphaël Bournhonesque
|
4ce0bbc6b6
|
Update unit tests
|
2017-02-09 16:30:43 +01:00 |
|
ines
|
654fe447b1
|
Add Swedish tokenizer tests (see #807)
|
2017-02-05 11:47:07 +01:00 |
|
Michael Wallin
|
35100c8bdd
|
[issue 805] Add regression test and the required fixture
|
2017-02-04 16:21:34 +02:00 |
|
Michael Wallin
|
1a1952afa5
|
[finnish] Add initial tests for tokenizer
|
2017-02-04 13:54:10 +02:00 |
|
Ines Montani
|
afc6365388
|
Update regression test for #801 to match current expected behaviour
|
2017-02-02 16:23:05 +01:00 |
|
Ines Montani
|
13a4ab37e0
|
Add regression test for #801
|
2017-02-02 15:33:52 +01:00 |
|
Raphaël Bournhonesque
|
85f951ca99
|
Add tokenizer exceptions for French
|
2017-02-02 08:36:16 +01:00 |
|
Ines Montani
|
e4875834fe
|
Fix formatting
|
2017-01-31 15:19:33 +01:00 |
|
Ines Montani
|
c304834e45
|
Add missing import
|
2017-01-31 15:18:30 +01:00 |
|
Ines Montani
|
e6465b9ca3
|
Parametrize test cases and mark as xfail
|
2017-01-31 15:14:42 +01:00 |
|
latkins
|
e4c84321a5
|
Added regression test for Issue #792.
|
2017-01-31 13:47:42 +00:00 |
|
Ines Montani
|
19501f3340
|
Add regression test for #775
|
2017-01-25 13:16:52 +01:00 |
|
Raphaël Bournhonesque
|
1be9c0e724
|
Add fr tokenization unit tests
|
2017-01-24 10:57:37 +01:00 |
|
Ines Montani
|
0967eb07be
|
Add regression test for #768
|
2017-01-23 21:25:46 +01:00 |
|
Ines Montani
|
5f6f48e734
|
Add regression test for #759
|
2017-01-20 15:11:48 +01:00 |
|
Ines Montani
|
d704cfa60d
|
Fix typo
|
2017-01-16 21:30:33 +01:00 |
|
Matthew Honnibal
|
2c60d0cb1e
|
Test #743: Tokens unhashable.
|
2017-01-16 13:27:26 +01:00 |
|
Ines Montani
|
50878ef598
|
Exclude "were" and "Were" from tokenizer exceptions and add regression test (resolves #744)
|
2017-01-16 13:10:38 +01:00 |
|
Ines Montani
|
e053c7693b
|
Fix formatting
|
2017-01-16 13:09:52 +01:00 |
|
Ines Montani
|
116c675c3c
|
Merge pull request #742 from oroszgy/hu_tokenizer_fix
Improved Hungarian tokenizer
|
2017-01-14 23:52:44 +01:00 |
|
Gyorgy Orosz
|
92345b6a41
|
Further numeric test.
|
2017-01-14 22:44:19 +01:00 |
|
Gyorgy Orosz
|
b4df202bfa
|
Better error handling
|
2017-01-14 22:24:58 +01:00 |
|
Gyorgy Orosz
|
b03a46792c
|
Better error handling
|
2017-01-14 22:09:29 +01:00 |
|
Ines Montani
|
332ce2d758
|
Update README.md
|
2017-01-14 21:12:11 +01:00 |
|
Gyorgy Orosz
|
9505c6a72b
|
Passing all old tests.
|
2017-01-14 20:39:21 +01:00 |
|
Gyorgy Orosz
|
63037e79af
|
Fixed hyphen handling in the Hungarian tokenizer.
|
2017-01-14 16:30:11 +01:00 |
|
Gyorgy Orosz
|
f77c0284d6
|
Maintaining compatibility with other spacy tokenizers.
|
2017-01-14 16:19:15 +01:00 |
|
Gyorgy Orosz
|
1be5da1ac6
|
Fixed Hungarian tokenizer for numbers
|
2017-01-14 15:51:59 +01:00 |
|
Ines Montani
|
a89e269a5a
|
Fix test formatting and consistency
|
2017-01-14 13:41:19 +01:00 |
|
Ines Montani
|
3424e3a7e5
|
Update README.md
|
2017-01-13 15:54:54 +01:00 |
|
Ines Montani
|
49186b34a1
|
Mark lemmatizer tests as models since they use installed data
|
2017-01-13 15:12:07 +01:00 |
|
Ines Montani
|
138deb80a1
|
Modernise vector tests, use add_vecs_to_vocab and don't depend on models
|
2017-01-13 15:12:07 +01:00 |
|
Ines Montani
|
96f0caa28a
|
Fix test name for consistency
|
2017-01-13 15:12:07 +01:00 |
|
Ines Montani
|
dc2bb1259f
|
Add util function to add vectors to vocab
|
2017-01-13 15:12:07 +01:00 |
|
Ines Montani
|
db9b25663d
|
Reformat add_docs_equal and add docstring
|
2017-01-13 15:12:07 +01:00 |
|
Ines Montani
|
62ce0a0073
|
Add README.md to tests to explain organisation and conventions
|
2017-01-13 15:11:18 +01:00 |
|
Ines Montani
|
38d60f6b90
|
Modernise serializer I/O tests and don't depend on models where possible
|
2017-01-13 02:24:56 +01:00 |
|
Ines Montani
|
4bb5b89ee4
|
Add text_file_b fixture using BytesIO
|
2017-01-13 02:23:50 +01:00 |
|
Ines Montani
|
49febd8c62
|
Modernise noun chunks tests and don't depend on models
|
2017-01-13 02:01:00 +01:00 |
|
Ines Montani
|
3ee97b5686
|
Rename test_parser to test_noun_chunks
|
2017-01-13 01:36:33 +01:00 |
|
Ines Montani
|
a308703f47
|
Remove old tests
|
2017-01-13 01:34:48 +01:00 |
|
Ines Montani
|
12eb8edf26
|
Move parser tests from unit to parser
|
2017-01-13 01:34:38 +01:00 |
|
Ines Montani
|
138c53ff2e
|
Merge tokenizer tests
|
2017-01-13 01:34:14 +01:00 |
|
Ines Montani
|
01f36ca3ff
|
Move attrs tests from unit to root and modernise
|
2017-01-13 01:33:50 +01:00 |
|
Ines Montani
|
3610d27967
|
Move alignment tests from munge to gold and modernise
|
2017-01-13 01:33:31 +01:00 |
|
Ines Montani
|
094ff7396a
|
Reformat and rename Pragmatic Segmenter tests and mark xfails
|
2017-01-13 01:30:20 +01:00 |
|
Ines Montani
|
affcf1b19d
|
Modernise lemmatizer tests
|
2017-01-12 23:41:17 +01:00 |
|
Ines Montani
|
33d9cf87f9
|
Modernise tagger tests and fix xpassing test
|
2017-01-12 23:40:52 +01:00 |
|
Ines Montani
|
33e5f8dc2e
|
Create basic and extended test set for URLs
|
2017-01-12 23:40:02 +01:00 |
|
Ines Montani
|
5e4f5ebfc8
|
Modernise BILUO tests
|
2017-01-12 23:39:18 +01:00 |
|
Ines Montani
|
09acfbca01
|
Add Lemmatizer fixture
|
2017-01-12 23:38:55 +01:00 |
|
Ines Montani
|
514bfa2597
|
Add path fixture for spaCy data path
|
2017-01-12 23:38:47 +01:00 |
|
Ines Montani
|
e9e99a5670
|
Add regression test for #740
|
2017-01-12 22:57:38 +01:00 |
|
Ines Montani
|
6935d55409
|
Fix formatting
|
2017-01-12 22:56:20 +01:00 |
|
Ines Montani
|
5f0d196a31
|
Modernise and merge matcher tests
|
2017-01-12 22:23:11 +01:00 |
|
Ines Montani
|
d5d774413a
|
Update comments on EN and DE fixtures
|
2017-01-12 22:03:07 +01:00 |
|
Ines Montani
|
9b4bea1df9
|
Tidy up and rename regression tests and remove unnecessary imports
|
2017-01-12 22:00:37 +01:00 |
|
Ines Montani
|
5e1b6178e3
|
Fix formatting and consistency
|
2017-01-12 22:00:06 +01:00 |
|
Ines Montani
|
a3fd32455e
|
Remove redundant language loading integration tests
|
2017-01-12 21:59:48 +01:00 |
|
Ines Montani
|
61f1ca09c2
|
Modernise serializer codecs tests
|
2017-01-12 21:58:55 +01:00 |
|
Ines Montani
|
5dbc6e59f6
|
Modernise Huffman tests
|
2017-01-12 21:58:40 +01:00 |
|
Ines Montani
|
edeeeccea5
|
Modernise packer tests and don't depend on models where possible
|
2017-01-12 21:58:07 +01:00 |
|
Ines Montani
|
d084676cd0
|
Modernise and merge serialization tests
|
2017-01-12 21:57:19 +01:00 |
|
Ines Montani
|
442237787c
|
Add assert_docs_equal util to compare two docs
|
2017-01-12 21:56:52 +01:00 |
|
Ines Montani
|
eac3f700fb
|
Add fixture for entity recognizer
|
2017-01-12 21:56:32 +01:00 |
|
Ines Montani
|
b438cfddbc
|
Modernise matcher tests and split into two files
|
2017-01-12 17:51:46 +01:00 |
|
Ines Montani
|
27482ebed8
|
Move matcher tests for #188 and #242 to regression tests
Modernise tests and remove unnecessary imports
|
2017-01-12 17:33:57 +01:00 |
|
Ines Montani
|
0a4dc632bd
|
Update test to not create redundant Doc object
|
2017-01-12 17:33:18 +01:00 |
|
Ines Montani
|
a2526e66d8
|
Fix formatting, naming and unicode declaration
|
2017-01-12 16:51:13 +01:00 |
|
Ines Montani
|
052cdff07d
|
Modernise vector similarity tests
|
2017-01-12 16:51:13 +01:00 |
|
Ines Montani
|
bd20ec0a6a
|
Add get_cosine util function
|
2017-01-12 16:51:13 +01:00 |
|
Ines Montani
|
51ef75f629
|
Fix regression test for #615 and remove unnecessary imports
|
2017-01-12 16:51:12 +01:00 |
|
Ines Montani
|
aeb747e10c
|
Adjust formatting
|
2017-01-12 16:51:12 +01:00 |
|
Ines Montani
|
8e3e58a7e6
|
Modernise and merge lexeme vocab tests
|
2017-01-12 16:51:12 +01:00 |
|
Ines Montani
|
c3d4516fc2
|
Move test for #361 to regression tests
|
2017-01-12 16:51:12 +01:00 |
|
Ines Montani
|
7cb3d74426
|
Modernise span tests and don't depend on models
|
2017-01-12 15:30:49 +01:00 |
|
Ines Montani
|
92e3d8b3ee
|
Modernise vocab API tests and remove old xfailing tests
|
2017-01-12 15:27:46 +01:00 |
|
Ines Montani
|
7ea87684cd
|
Rename test_vocab.py to test_vocab_api.py
|
2017-01-12 15:12:21 +01:00 |
|
Ines Montani
|
0da2ee5c68
|
Merge flag features tests into orth tests in tests root
|
2017-01-12 15:12:00 +01:00 |
|
Ines Montani
|
03c136cfd3
|
Remove StringStore tests from vocab tests
|
2017-01-12 15:11:15 +01:00 |
|
Ines Montani
|
d7bd57abdf
|
Modernise add vectors vocab test
|
2017-01-12 15:09:49 +01:00 |
|
Ines Montani
|
89525ef345
|
Use consistent test names
|
2017-01-12 15:09:21 +01:00 |
|
Ines Montani
|
f8803808ce
|
Remove old unused tests and conftest files
|
2017-01-12 15:09:05 +01:00 |
|
Ines Montani
|
4d0bfebcd9
|
Move Pragmatic Segmenter test cases (currently unused) to parser tests
|
2017-01-12 15:08:02 +01:00 |
|
Ines Montani
|
26d018d874
|
Add tests for StringStore
|
2017-01-12 15:07:31 +01:00 |
|
Ines Montani
|
9b6784bab5
|
Add fixture for StringStore
|
2017-01-12 15:05:40 +01:00 |
|
Ines Montani
|
99d66d613a
|
Modernise tests for merging spans and don't depend on models
|
2017-01-12 12:26:26 +01:00 |
|
Ines Montani
|
fa8f67596d
|
Remove unused old test
|
2017-01-12 12:26:08 +01:00 |
|
Ines Montani
|
359f73a96b
|
Move test for #54 to regression tests
|
2017-01-12 12:25:51 +01:00 |
|
Ines Montani
|
3f3a46722c
|
Remove unused conftest
|
2017-01-12 12:25:24 +01:00 |
|