spaCy

mirror of https://github.com/explosion/spaCy.git synced 2026-03-06 04:41:32 +03:00

Author	SHA1	Message	Date
Matthew Honnibal	ed1907b4df	* Add pragmatic sentence boundary detection tests, from that Ruby gem. Not automatically run, as they can arbitrarily fail based on model changes. Currently 8/15 fail.	2015-04-12 04:46:40 +02:00
Matthew Honnibal	0c25001325	* Fix specials.json	2015-04-12 04:45:41 +02:00
Matthew Honnibal	1629b33082	* Fix copying of tokenizer data in init_model	2015-04-12 04:45:31 +02:00
Matthew Honnibal	248a2b4b0f	* Remove Spans class	2015-04-12 04:07:29 +02:00
Matthew Honnibal	1d05e6da00	* Add ne_iob and ne_type features to NER	2015-04-10 19:07:08 +02:00
Matthew Honnibal	4df8a3d90f	* Add ne_iob and ne_type attributes to context vector	2015-04-10 05:02:15 +02:00
Matthew Honnibal	8c354c432b	* Add ValueError condition to ner_tag reading	2015-04-10 04:59:59 +02:00
Matthew Honnibal	435cccf098	* Add read_conll03_file function to conll.pyx	2015-04-10 04:59:11 +02:00
Matthew Honnibal	99c9ecfc18	* Fix bug in prefix, suffix and word shape features in parser and NER	2015-04-10 03:53:33 +02:00
Matthew Honnibal	a6ac92f077	* Respect the model_dir input parameter to train.py	2015-04-08 22:48:26 +02:00
Matthew Honnibal	ed8942a096	* Add train function to fabfile	2015-04-08 22:47:59 +02:00
Matthew Honnibal	baff0f8ad8	* Add docstring explaining script a bit, and add handling of word vectors	2015-04-08 08:20:15 +02:00
Matthew Honnibal	c0a3e25b43	* Upd gitignore	2015-04-08 07:48:04 +02:00
Matthew Honnibal	156b70ed82	* Add new script to replace make_lexicon, that does full setup of data	2015-04-08 07:46:53 +02:00
Matthew Honnibal	e775e05313	* Use merge_mwe=False in evaluation in train.py	2015-04-08 00:35:19 +02:00
Matthew Honnibal	cff2b13fef	* Fix Issue #44 : Broken Token.string attribute when single word sentence	2015-04-07 06:08:25 +02:00
Matthew Honnibal	085574ccc1	* Add test for Issue #44	2015-04-07 06:05:18 +02:00
Matthew Honnibal	6640386b25	* Fix Issue #43 : TAG attr not supported. Also add DEP attr, while I'm at it. Need better way of ensuring future changes don't break in similar way.	2015-04-07 06:00:57 +02:00
Matthew Honnibal	b64b2bd910	* Fix Issue #43 : TAG attr not supported. Also add DEP attr, while I'm at it. Need better way of ensuring future changes don't break in similar way.	2015-04-07 06:00:30 +02:00
Matthew Honnibal	6674d719a5	* Test for Issue #43 : TAG attribute not working in array export	2015-04-07 05:53:50 +02:00
Matthew Honnibal	f9e510a893	* Whitespace	2015-04-07 04:53:59 +02:00
Matthew Honnibal	66c7ccf6cc	* Fix Spans.orth_	2015-04-07 04:53:40 +02:00
Matthew Honnibal	3b5ea3731a	* Add tests for Span stuff	2015-04-07 04:52:25 +02:00
Matthew Honnibal	c2b9a61ee2	* Upd merge test	2015-04-07 04:51:31 +02:00
Matthew Honnibal	b8d34531c4	* Add support for units to English.__init__, by loading and applying regular expressions	2015-04-07 04:02:32 +02:00
Matthew Honnibal	0ea5af88b6	* Add multi-word expression RegexMatcher	2015-04-07 03:45:40 +02:00
Matthew Honnibal	2fee67cfa3	* Add regular expressions for English multi-word expressions	2015-04-07 03:45:18 +02:00
Matthew Honnibal	5a075ea3fc	* Ensure NER moves are available for single-word tokens	2015-04-05 22:30:58 +02:00
Matthew Honnibal	a60a366b2c	* Support 'punct' dep label in conll.pyx	2015-04-05 22:30:19 +02:00
Matthew Honnibal	021c972137	* Print parse if verbose in scorer	2015-04-05 22:29:30 +02:00
Matthew Honnibal	f26f381b0e	* Add simple ner_tag script	2015-04-03 17:26:58 +02:00
Matthew Honnibal	bb27979352	* Add prepare_vecs script	2015-04-02 06:19:39 +02:00
Matthew Honnibal	fbf19049cf	* Add ent_type_ property	2015-03-31 02:01:29 +02:00
Matthew Honnibal	3f1e17bd3c	* Add tests for new merge() method	2015-03-30 01:37:57 +02:00
Matthew Honnibal	e70b87efeb	* Add merge() method to Tokens, with fairly brittle/hacky implementation, but quite easy to test. Passing minimal tests. Still need to fix left/right deps in C data	2015-03-30 01:37:41 +02:00
Matthew Honnibal	557856e84c	* Allow regular expressions to specify labels for merged spans	2015-03-27 17:40:52 +01:00
Matthew Honnibal	a3af6b7c3d	* Left-Arc from Root, to allow non-monotonic reduce to compete with left-arc when the stack is not empty.	2015-03-27 17:39:16 +01:00
Matthew Honnibal	db5a43318c	* Improve print_state debug printer	2015-03-27 17:29:58 +01:00
Matthew Honnibal	1705eccbbe	* Remove whitespace	2015-03-27 15:22:39 +01:00
Matthew Honnibal	3feb52374c	* Break apart a condition, for ease of debug printing	2015-03-27 15:21:38 +01:00
Matthew Honnibal	b32f581acb	* Fix bug in ArcEager.get_labels	2015-03-27 15:21:06 +01:00
Matthew Honnibal	cd054c6c9f	* Remove stray print statement	2015-03-27 15:20:42 +01:00
Matthew Honnibal	5f2a4ff36d	* Fix spans.lemma_	2015-03-26 16:45:38 +01:00
Matthew Honnibal	f4cc222ec3	* Fix NER scoring	2015-03-26 16:45:38 +01:00
Matthew Honnibal	1320bd19db	* Move Span class to own file	2015-03-26 16:45:38 +01:00
Matthew Honnibal	6f47a667cf	* Move Span class to own file	2015-03-26 16:45:38 +01:00
Matthew Honnibal	f02c39dfaf	* Compare to is not None, for more robustness	2015-03-26 16:44:48 +01:00
Matthew Honnibal	8f68b864c4	* Move Span/Spans to separate files. Currently duplicates lots of Tokens functionality. Should probably be integrated into Tokens	2015-03-26 16:44:48 +01:00
Matthew Honnibal	056c672caf	* Bug fixes to tokenization, and support for times	2015-03-26 16:44:48 +01:00
Matthew Honnibal	ee385b439a	* Ensure StringStore is dumped during training	2015-03-26 16:44:47 +01:00

974 Commits