spaCy

mirror of https://github.com/explosion/spaCy.git synced 2024-12-27 18:36:36 +03:00

Author	SHA1	Message	Date
Matthew Honnibal	04d0209be9	* Recognise multiple infixes in a token.	2016-04-13 18:38:26 +10:00
Matthew Honnibal	6df3858dbc	* Fix Issue #323: Incorrect semantics of Token.__str__ built-in. Add flag to allow users to switch the old semantics back on, to ease transition.	2016-04-12 13:17:59 +10:00
Wolfgang Seeker	80bea62842	bugfix in unit test	2016-04-08 16:46:44 +02:00
Matthew Honnibal	26622f0ffc	Merge branch 'master' of ssh://github.com/honnibal/spaCy	2016-03-29 14:31:52 +11:00
Matthew Honnibal	b1fe41b45d	* Extend infix test, commenting on limitation of tokenizer w.r.t. infixes at the moment.	2016-03-29 14:31:05 +11:00
Matthew Honnibal	9c73983bdd	* Add test for hyphenation problem in Issue #302	2016-03-29 14:27:13 +11:00
Matthew Honnibal	4a37fdcee1	Merge pull request #287 from wbwseeker/deproj_sentbnd_bug: add function to Token for setting head and dep (and dep_)	2016-03-25 09:47:45 +11:00
Henning Peters	c12d3dd200	add __init__.py to empty package dirs	2016-03-14 11:28:03 +01:00
Wolfgang Seeker	46e3f979f1	add function for setting head and label to token; change PseudoProjectivity.deprojectivize to use these functions	2016-03-11 17:31:06 +01:00
Matthew Honnibal	963fe5258e	* Add missing __contains__ method to vocab	2016-03-08 15:49:10 +00:00
Wolfgang Seeker	9d1e6de4a0	make a proper list from zip iterator	2016-03-03 19:51:01 +01:00
Wolfgang Seeker	49f9d1c085	change test_nonproj.py to not use zip inside numpy.asarray	2016-03-03 19:42:09 +01:00
Matthew Honnibal	fcaa0ad7ce	Merge pull request #280 from wbwseeker/german_parser: German parser	2016-03-04 03:27:42 +11:00
Wolfgang Seeker	690c5acabf	adjust train.py to train both english and german models	2016-03-03 15:21:00 +01:00
Wolfgang Seeker	3448cb40a4	integrated pseudo-projective parsing into parser - nonproj.pyx holds a class PseudoProjectivity which currently holds all functionality to implement Nivre & Nilsson 2005's pseudo-projective parsing using the HEAD decoration scheme - changed lefts/rights in Token to account for possible non-projective structures	2016-03-01 10:09:08 +01:00
Henning Peters	f3df736e0a	remove unidecode-related test	2016-02-24 18:22:22 +01:00
Wolfgang Seeker	4b2297d5d4	add class PseudoProjective for pseudo-projective parsing. PseudoProjective() implements the algorithm from Nivre & Nilsson 2005 using their HEAD decoration scheme.	2016-02-24 11:26:25 +01:00
Wolfgang Seeker	8d531c958b	replace tests for non-projectivity - add functions to find non-projective edges - add test file for non-projectivity functions	2016-02-22 14:40:40 +01:00
Henning Peters	9d8966a2c0	Update test_tokenizer.py	2016-02-10 19:24:37 +01:00
Henning Peters	3b5f1e753b	py26 compatibility	2016-02-10 14:32:54 +01:00
Henning Peters	ee1f1ac300	mark test_sentence_space() as model test	2016-02-10 07:49:11 +01:00
Matthew Honnibal	c6623889c1	* Add test for Issue #251: Incorrect right edges, caused by bad update to r_edge in del_arc, triggered from non-monotonic left-arc	2016-02-06 23:47:51 +01:00
Matthew Honnibal	161b01d4c0	* Tweak usage example for multi-processing	2016-02-06 14:44:11 +01:00
Matthew Honnibal	7f24229f10	* Don't try to pickle the tokenizer	2016-02-06 14:09:05 +01:00
Matthew Honnibal	e66d45bf66	* Restore previous patch to Span.root, as it seems it wasn't the cause of the problem.	2016-02-06 13:37:41 +01:00
Matthew Honnibal	031b00cb91	* Fix Span.root calculation	2016-02-05 20:12:09 +01:00
Matthew Honnibal	1cf0100bf6	* Add test for multithreading	2016-02-05 19:38:22 +01:00
Matthew Honnibal	1ef84a0557	* Merge master into rethinc2	2016-02-05 12:55:59 +01:00
Matthew Honnibal	c0e63feccc	* xfail pickle tests	2016-02-05 12:46:58 +01:00
Matthew Honnibal	48ce09687d	* Skip pickling the vocab in the tests	2016-02-04 15:51:19 +01:00
Matthew Honnibal	ee975d36d0	* Add stubs to test is_bracket/is_quote/is_left_punct/is_right_punct functions	2016-02-04 13:02:25 +01:00
Matthew Honnibal	907e8cf07d	* Add u prefix to string in web example	2016-01-25 15:51:38 +01:00
Matthew Honnibal	eba03695ef	* Comment out pickle tests	2016-01-25 15:51:13 +01:00
Matthew Honnibal	de94e6c525	* Mark pickle tests as xfail, due to temp files problem	2016-01-25 15:24:17 +01:00
Matthew Honnibal	87172a15c6	* Fix runtime error bug that arose from updated Span.root function.	2016-01-25 15:22:42 +01:00
Matthew Honnibal	2c8dd91785	* Fix first code example on the website	2016-01-23 18:09:19 +01:00
Matthew Honnibal	82d011ac43	* Fix test for whitespace	2016-01-19 20:38:26 +01:00
Matthew Honnibal	e89069dcae	* Fix matcher test	2016-01-19 20:24:01 +01:00
Matthew Honnibal	e1282b7f2f	* Require user-custom NER classes to work without adding the label.	2016-01-19 20:11:03 +01:00
Matthew Honnibal	f0f92793f6	* Add test for user NER classes in matcher blocking the NER model. Re Issue #178 and Issue #217	2016-01-19 19:23:16 +01:00
Matthew Honnibal	515493c675	* Add xfail test for Issue #225: tokenization with non-whitespace delimiters	2016-01-19 13:20:14 +01:00
Matthew Honnibal	04177debd0	* Unwind limit to sentence boundary detection that prevents it from inserting boundaries on whitespace. Replace it with a check for whitespace in StateClass.fast_forward, so that whitespace is LeftArced when it's on the stack. This should prevent the previous problem of whitespace-only sentences. Should fix Issue #184, but may cause further problems. Needs testing.	2016-01-19 02:54:15 +01:00
Matthew Honnibal	7893de3203	* Add test for Issue #184: Whitespace at sentence boundary causes sentence boundary error.	2016-01-18 23:04:38 +01:00
Matthew Honnibal	e825fd9554	* Make some of the website tests work without models	2016-01-18 18:14:44 +01:00
Matthew Honnibal	bed36ab0ff	* Fix import of HEAD attribute	2016-01-18 17:34:43 +01:00
Matthew Honnibal	28c659c1fe	* Fix import for numpy	2016-01-18 17:25:04 +01:00
Matthew Honnibal	fc36bcf458	* Fix import for English	2016-01-18 17:14:40 +01:00
Matthew Honnibal	cc4c335e14	* Set heads for test_merge_tokens, to make the test run without models	2016-01-18 17:00:11 +01:00
Matthew Honnibal	714cbc03d5	* Add test for Issue #203: nested noun chunks.	2016-01-16 18:02:30 +01:00
Matthew Honnibal	4e2253170c	* Move test for doc.merge to tokens_api file, to avoid name conflicts which upset pytest	2016-01-16 18:01:36 +01:00

105 Commits