spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-03-03 19:08:06 +03:00

Author	SHA1	Message	Date
Matthew Honnibal	6eef0bf9ab	* Break up tokens.pyx into tokens/doc.pyx, tokens/token.pyx, tokens/spans.pyx	2015-07-13 20:20:58 +02:00
Matthew Honnibal	ff9ff6f3fa	* Ensure unseen words are given low log probability	2015-07-12 01:31:09 +02:00
Matthew Honnibal	89a91ad726	* Add SPACE part-of-speech tag, and train tagger to assign it. Also train tagger not to make whitespace an entity	2015-07-09 13:30:41 +02:00
Matthew Honnibal	6ddb2f5e45	* Restore merge_mwe in English class	2015-07-08 19:35:30 +02:00
Matthew Honnibal	6859f6adac	* Restore merge_mwe in English class	2015-07-08 19:34:55 +02:00
Matthew Honnibal	e3c53f5ecd	* Fix mention of Tokens in docstring	2015-07-08 18:56:27 +02:00
Matthew Honnibal	bb522496dd	* Rename Tokens to Doc	2015-07-08 18:53:00 +02:00
Matthew Honnibal	4e4fac452b	* Refactor __init__ for simplicity. Allow parse=True, tag=True etc flags to be passed at top-level. Do not lazy-load parser.	2015-07-08 12:35:29 +02:00
Matthew Honnibal	1d2deb4616	* Work on refactoring default arguments to English.__init__	2015-07-07 15:53:25 +02:00
Matthew Honnibal	6788c86b2f	* Begin refactor	2015-07-07 14:00:07 +02:00
Matthew Honnibal	9af86b0b0b	* Fix attrs.pxd	2015-06-30 18:16:30 +02:00
Matthew Honnibal	5d595b5a8c	* Inc versions	2015-06-30 18:11:06 +02:00
Matthew Honnibal	d2eeba6667	* Start wiring up color and emotion lexicons. Hopefully we get to use them.	2015-06-30 16:22:23 +02:00
Matthew Honnibal	b266a63f2c	* Inc version of downloadble data	2015-06-24 04:53:08 +02:00
Matthew Honnibal	7d265a9c62	* Revert to wget in spacy.en.download	2015-06-08 00:48:56 +02:00
Matthew Honnibal	1515862861	* Fix download.py	2015-06-08 00:08:05 +02:00
Matthew Honnibal	7e9e8f654a	* Use urllib in spacy.en.download	2015-06-07 23:51:38 +02:00
Matthew Honnibal	80cff41a9c	* Upd download.py	2015-06-07 19:13:28 +02:00
Matthew Honnibal	58d5ac0944	* Add beam search capabilities to Parser. Rename GreedyParser to Parser.	2015-06-02 00:28:02 +02:00
Matthew Honnibal	62424e6c76	* Remove unused regularize argument from _ml.Model	2015-06-02 00:27:07 +02:00
Matthew Honnibal	04bda8648d	* Pass parameter for regularization to model	2015-05-27 03:16:58 +02:00
Matthew Honnibal	eba7b34f66	* Add flag to disable loading of word vectors	2015-05-25 01:02:42 +02:00
Matthew Honnibal	03ebf70a66	* Inc version to 0.84	2015-05-12 02:38:51 +02:00
Matthew Honnibal	fb8d50b3d5	Merge branch 'master' of ssh://github.com/honnibal/spaCy	2015-04-30 12:45:15 +02:00
Matthew Honnibal	378c2a6435	* Fix POS model: make it use tag instead of pos in history features	2015-04-29 00:02:53 +02:00
Jordan Suchow	3a8d9b37a6	Remove trailing whitespace	2015-04-19 13:01:38 -07:00
Matthew Honnibal	cc4e395927	* Add some ad hoc regexes, for multi-word location prepositions	2015-04-17 04:44:24 +02:00
Matthew Honnibal	684d0e5e85	* Download updated data	2015-04-16 04:29:15 +02:00
Matthew Honnibal	42617548af	* Disable merge_mwes by default	2015-04-16 04:20:31 +02:00
Matthew Honnibal	77d0700caf	* Add on X way regexes	2015-04-16 01:35:46 +02:00
Matthew Honnibal	c6707778dd	* Fix Issue #51 : Handle non-ascii lemmas correctly	2015-04-13 22:28:59 +02:00
Matthew Honnibal	761a19113a	* Fix /tmp moving thing in download.py	2015-04-12 07:04:10 +02:00
Matthew Honnibal	b64b2bd910	* Fix Issue #43 : TAG attr not supported. Also add DEP attr, while I'm at it. Need better way of ensuring future changes don't break in similar way.	2015-04-07 06:00:30 +02:00
Matthew Honnibal	b8d34531c4	* Add support for units to English.__init__, by loading and applying regular expressions	2015-04-07 04:02:32 +02:00
Matthew Honnibal	2fee67cfa3	* Add regular expressions for English multi-word expressions	2015-04-07 03:45:18 +02:00
Matthew Honnibal	567388e38d	* Use values encoded by StringStore in POS tagging, rather than indices into a list of tags	2015-03-26 16:44:45 +01:00
Matthew Honnibal	801bf14f4f	* Clean up handling of dep_strings and ent_strings, using StringStore to encode the label names.	2015-03-26 16:44:45 +01:00
Matthew Honnibal	f21ab2d7fb	* Fix bug in ugly ent_strings hack on English class	2015-03-26 16:44:45 +01:00
Matthew Honnibal	8057a95f20	* NER seems to be working, scoring 69 F. Need to add decision-history features --- currently only use current word, 2 words context. Need refactoring.	2015-03-26 16:44:44 +01:00
Matthew Honnibal	220ce8bfed	* Prepare English class for NER	2015-03-26 16:44:44 +01:00
Matthew Honnibal	179b7eb0a7	* Specify parser transition system in language	2015-03-26 16:44:43 +01:00
Matthew Honnibal	8cc3524dc9	* Ws	2015-03-26 16:44:41 +01:00
Matthew Honnibal	2e8d0e5d45	* Upd download script	2015-03-03 05:47:16 -05:00
Matthew Honnibal	caf046b220	* Hastily add method to apply tags from a list of strings, instead of predicting the tags.	2015-02-23 15:40:17 -05:00
Matthew Honnibal	64645a1c2f	* Improve docstring on English	2015-02-11 15:13:20 -05:00
Matthew Honnibal	594e50bd45	* Add option to download speech-parsing data set.	2015-02-11 14:20:29 -05:00
Matthew Honnibal	0b7e769211	* Add POS tags to support SWBD tag set	2015-02-11 14:08:28 -05:00
Matthew Honnibal	312b3a45f3	* Fix issue #19 : Allow parsing/pos tagging of empty strings	2015-02-10 10:15:58 -05:00
Matthew Honnibal	2a0615104b	* Upd download script	2015-02-09 10:22:59 -05:00
Matthew Honnibal	5c3513583d	* Clear buffered python tokens when modifying the Tokens object. Need to clean this up, and modify via a method on Tokens.	2015-02-09 03:57:10 -05:00

113 Commits