Wolfgang Seeker
690c5acabf
adjust train.py to train both English and German models
2016-03-03 15:21:00 +01:00
Wolfgang Seeker
3448cb40a4
integrated pseudo-projective parsing into parser
...
- nonproj.pyx holds a class PseudoProjectivity which currently holds
all functionality to implement Nivre & Nilsson 2005's pseudo-projective
parsing using the HEAD decoration scheme
- changed lefts/rights in Token to account for possible non-projective
structures
2016-03-01 10:09:08 +01:00
Henning Peters
f3df736e0a
remove unidecode-related test
2016-02-24 18:22:22 +01:00
Wolfgang Seeker
4b2297d5d4
add class PseudoProjective for pseudo-projective parsing
...
PseudoProjective() implements the algorithm from Nivre & Nilsson 2005
using their HEAD decoration scheme.
2016-02-24 11:26:25 +01:00
Wolfgang Seeker
8d531c958b
replace tests for non-projectivity
...
- add functions to find non-projective edges
- add test file for non-projectivity functions
2016-02-22 14:40:40 +01:00
Henning Peters
9d8966a2c0
Update test_tokenizer.py
2016-02-10 19:24:37 +01:00
Henning Peters
3b5f1e753b
py26 compatibility
2016-02-10 14:32:54 +01:00
Henning Peters
ee1f1ac300
mark test_sentence_space() as model test
2016-02-10 07:49:11 +01:00
Matthew Honnibal
c6623889c1
* Add test for Issue #251: Incorrect right edges, caused by bad update to r_edge in del_arc, triggered from non-monotonic left-arc
2016-02-06 23:47:51 +01:00
Matthew Honnibal
161b01d4c0
* Tweak usage example for multi-processing
2016-02-06 14:44:11 +01:00
Matthew Honnibal
7f24229f10
* Don't try to pickle the tokenizer
2016-02-06 14:09:05 +01:00
Matthew Honnibal
e66d45bf66
* Restore previous patch to Span.root, as it seems it wasn't the cause of the problem.
2016-02-06 13:37:41 +01:00
Matthew Honnibal
031b00cb91
* Fix Span.root calculation
2016-02-05 20:12:09 +01:00
Matthew Honnibal
1cf0100bf6
* Add test for multithreading
2016-02-05 19:38:22 +01:00
Matthew Honnibal
1ef84a0557
* Merge master into rethinc2
2016-02-05 12:55:59 +01:00
Matthew Honnibal
c0e63feccc
* xfail pickle tests
2016-02-05 12:46:58 +01:00
Matthew Honnibal
48ce09687d
* Skip pickling the vocab in the tests
2016-02-04 15:51:19 +01:00
Matthew Honnibal
ee975d36d0
* Add stubs to test is_bracket/is_quote/is_left_punct/is_right_punct functions
2016-02-04 13:02:25 +01:00
Matthew Honnibal
907e8cf07d
* Add u prefix to string in web example
2016-01-25 15:51:38 +01:00
Matthew Honnibal
eba03695ef
* Comment out pickle tests
2016-01-25 15:51:13 +01:00
Matthew Honnibal
de94e6c525
* Mark pickle tests as xfail, due to temp files problem
2016-01-25 15:24:17 +01:00
Matthew Honnibal
87172a15c6
* Fix runtime error bug that arose from updated Span.root function.
2016-01-25 15:22:42 +01:00
Matthew Honnibal
2c8dd91785
* Fix first code example on the website
2016-01-23 18:09:19 +01:00
Matthew Honnibal
82d011ac43
* Fix test for whitespace
2016-01-19 20:38:26 +01:00
Matthew Honnibal
e89069dcae
* Fix matcher test
2016-01-19 20:24:01 +01:00
Matthew Honnibal
e1282b7f2f
* Require user-custom NER classes to work without adding the label.
2016-01-19 20:11:03 +01:00
Matthew Honnibal
f0f92793f6
* Add test for user NER classes in matcher blocking the NER model. Re Issue #178 and Issue #217
2016-01-19 19:23:16 +01:00
Matthew Honnibal
515493c675
* Add xfail test for Issue #225: tokenization with non-whitespace delimiters
2016-01-19 13:20:14 +01:00
Matthew Honnibal
04177debd0
* Unwind limit to sentence boundary detection that prevents it from inserting boundaries on whitespace. Replace it with a check for whitespace in StateClass.fast_forward, so that whitespace is LeftArced when it's on the stack. This should prevent the previous problem of whitespace-only sentences. Should fix Issue #184, but may cause further problems. Needs testing.
2016-01-19 02:54:15 +01:00
Matthew Honnibal
7893de3203
* Add test for Issue #184: Whitespace at sentence boundary causes sentence boundary error.
2016-01-18 23:04:38 +01:00
Matthew Honnibal
e825fd9554
* Make some of the website tests work without models
2016-01-18 18:14:44 +01:00
Matthew Honnibal
bed36ab0ff
* Fix import of HEAD attribute
2016-01-18 17:34:43 +01:00
Matthew Honnibal
28c659c1fe
* Fix import for numpy
2016-01-18 17:25:04 +01:00
Matthew Honnibal
fc36bcf458
* Fix import for English
2016-01-18 17:14:40 +01:00
Matthew Honnibal
cc4c335e14
* Set heads for test_merge_tokens, to make the test run without models
2016-01-18 17:00:11 +01:00
Matthew Honnibal
714cbc03d5
* Add test for Issue #203: nested noun chunks.
2016-01-16 18:02:30 +01:00
Matthew Honnibal
4e2253170c
* Move test for doc.merge to tokens_api file, to avoid name conflicts which upset pytest
2016-01-16 18:01:36 +01:00
Matthew Honnibal
34a157511f
* Move test_merge_hang to test_tokens_api
2016-01-16 18:00:26 +01:00
Matthew Honnibal
4a16dbfeca
* Add test for Issue #203: noun chunks should be flat, but sometimes are nested
2016-01-16 17:41:25 +01:00
Matthew Honnibal
223d2b3484
* Add test for Issue #154: Additional whitespace introduced when string ends with a whitespace token.
2016-01-16 17:08:07 +01:00
Matthew Honnibal
3dc398b727
* Fix merge conflict in requirements.txt
2016-01-16 16:20:49 +01:00
Matthew Honnibal
fc5962a77d
* Improve test for root token in Span
2016-01-16 16:19:09 +01:00
Matthew Honnibal
aa0dd79f52
* Delete test_token_references, which checked a flaky strategy for preventing orphan tokens from a while ago. Now orphan tokens simply hold a reference to Pool, preventing the memory from being freed underneath them. This means that we don't need to run this slow test.
2016-01-16 16:03:35 +01:00
Matthew Honnibal
c1039fa4b4
* Add test for Issue #214. Resolved in change to Span.root
2016-01-16 15:37:47 +01:00
Henning Peters
235f094534
untangle data_path/via
2016-01-16 12:23:45 +01:00
Matthew Honnibal
478a79a3d5
* Add test for Issue #220: Whitespace being tagged as noun
2016-01-15 16:17:07 +01:00
Henning Peters
bc229790ac
integrate with sputnik
2016-01-13 19:46:17 +01:00
Matthew Honnibal
3fbfba575a
* xfail the contractions test
2015-12-31 13:16:28 +01:00
Matthew Honnibal
3bd910ccad
* Merge therell test
2015-12-31 11:55:18 +01:00
Matthew Honnibal
eaf2ad59f1
* Fix use of mock Package object
2015-12-31 04:13:15 +01:00
Matthew Honnibal
a6ba43ecaf
* Fix errors in packaging revision
2015-12-29 18:37:26 +01:00
Matthew Honnibal
4b4eec8b47
* Fix Issue #201: Tokenization of there'll
2015-12-29 18:09:09 +01:00
Matthew Honnibal
86ee9d046d
* Remove test that belongs to a change for master
2015-12-29 18:07:23 +01:00
Matthew Honnibal
aec130af56
Use util.Package class for io
...
The previous Sputnik integration caused an API change: Vocab, Tagger, etc.
were loaded via a from_package classmethod that required a
sputnik.Package instance. This forced users to first create a
sputnik.Sputnik() instance in order to acquire a Package via
sp.pool().
Instead I've created a small file-system shim, util.Package, which
allows classes to have a .load() classmethod that accepts either
util.Package objects or strings. We can later gut the internals
of this and make it a proxy for Sputnik if we need more functionality
that should live in the Sputnik library.
Sputnik is now only used to download and install the data, in
spacy.en.download.
2015-12-29 18:00:48 +01:00
Matthew Honnibal
8b61d45ed0
* Fix merge conflicts for headers branch
2015-12-27 17:46:25 +01:00
Matthew Honnibal
6bb9c7f311
Merge pull request #202 from henningpeters/sputnik
...
access model via sputnik
2015-12-28 03:29:53 +11:00
Henning Peters
7f7299cafb
Merge branch 'tmpdir' into headers
2015-12-18 12:25:25 +01:00
Henning Peters
cfa187aaf0
fix tests
2015-12-18 10:58:02 +01:00
Henning Peters
8359bd4d93
strip data/ from package, friendlier Language invocation, make data_dir backward/forward-compatible
2015-12-18 09:52:55 +01:00
Henning Peters
4f3efb8eaf
avoid writing to /tmp (not cross-platform compatible)
2015-12-16 19:56:40 +01:00
Henning Peters
4ada39f472
avoid writing to /tmp (not cross-platform compatible)
2015-12-16 19:53:06 +01:00
Henning Peters
ac318b568c
new approach to dependency headers
2015-12-13 11:49:17 +01:00
Henning Peters
9027cef3bc
access model via sputnik
2015-12-07 06:01:28 +01:00
Matthew Honnibal
ec7d36c3a4
* Add test for matcher end-point problem
2015-11-12 05:00:40 +11:00
Matthew Honnibal
d309622a27
* Add test for matcher end-point problem
2015-11-12 04:59:11 +11:00
Matthew Honnibal
56ea20a886
* Add test for matcher end-point problem
2015-11-12 04:58:53 +11:00
Matthew Honnibal
cfa4062147
* Add test for matcher end-point problem
2015-11-12 04:56:07 +11:00
Matthew Honnibal
d67d7d5a86
* Add test for NER inconsistency bug
2015-11-08 16:19:33 +01:00
Matthew Honnibal
fde9a22ec2
* Add new test for ner
2015-11-08 13:57:15 +01:00
Matthew Honnibal
31da42eb27
* Mark tests that require models
2015-11-07 19:27:38 +11:00
Matthew Honnibal
8e26a28616
* Mark tests that require models
2015-11-07 19:10:56 +11:00
Matthew Honnibal
15eab7354f
* Remove extraneous test files
2015-11-07 18:45:13 +11:00
Matthew Honnibal
06f26d258e
* Fix test_basic_create
2015-11-07 10:04:37 +11:00
Matthew Honnibal
1d3884c46d
* Fix test_basic_create
2015-11-07 10:03:56 +11:00
Andreas Grivas
83ca4e0b93
* use old merge tests - add more
2015-11-07 07:57:04 +11:00
Matthew Honnibal
3c162dcac3
* Refactor away from the _ml module, to use thinc 4.0. Still some work needs to be done, e.g. to add __reduce__ to the models, more testing, etc.
2015-11-07 03:24:30 +11:00
Matthew Honnibal
ee3f9ba581
* Fix test of serializer
2015-11-03 19:45:16 +11:00
Matthew Honnibal
d06ba26371
* Fix test of serializer
2015-11-03 19:43:27 +11:00
Matthew Honnibal
85372468e3
* Fix serialize test
2015-11-03 08:51:33 +01:00
Matthew Honnibal
389a373807
Merge branch 'master' of ssh://github.com/honnibal/spaCy
2015-11-03 18:07:25 +11:00
Matthew Honnibal
3f44b3e43f
* Mark serializer test as requiring models
2015-11-03 18:07:08 +11:00
Matthew Honnibal
25ed7be8f8
Merge branch 'master' of https://github.com/honnibal/spaCy
2015-11-03 07:58:17 +01:00
Matthew Honnibal
5e040855a5
* Ensure morphological features and lemmas are loaded in from_array, re Issue #152
2015-11-03 17:56:50 +11:00
Matthew Honnibal
5668feb235
* Fix pickle test for python3
2015-11-03 04:57:02 +01:00
Andreas Grivas
d418f00eb1
fixed error when printing unicode
2015-11-02 20:23:18 +02:00
Matthew Honnibal
1c0356e4c2
* Set test file mode to w+t
2015-10-26 22:40:48 +11:00
Matthew Honnibal
0fe98f358b
* Fix mode on text file for Python3 in strings test
2015-10-26 22:25:16 +11:00
Matthew Honnibal
8ba9cf905e
* Fix mode on text file for Python3 in strings test
2015-10-26 21:44:34 +11:00
Matthew Honnibal
a0730699b1
* Fix mode on text file for Python3 in strings test
2015-10-26 21:25:56 +11:00
Matthew Honnibal
725344d349
* Fix tempfile in test
2015-10-26 21:08:18 +11:00
Matthew Honnibal
a824a98312
* Add tests for pickling vectors, re: Issue #125
2015-10-26 12:31:05 +11:00
Matthew Honnibal
4e16f9e435
* Move tests underneath spacy/
2015-10-26 00:07:31 +11:00