spaCy

mirror of https://github.com/explosion/spaCy.git synced 2026-02-05 14:59:59 +03:00

Matthew Honnibal afdc9b7ac2 * More performance fiddling, particularly moving the specials into the cache, so that we can just lookup the cache in _tokenize		2014-09-13 00:59:34 +02:00
__init__.py	* Basic punct tests updated and passing	2014-08-27 19:38:57 +02:00
en.pxd	* Fix bug with trailing punct on contractions. Reduced efficiency, and slightly hacky implementation.	2014-09-12 18:26:26 +02:00
en.pyx	* Fix performance issues by implementing a better cache. Add own String struct to help	2014-09-12 23:50:37 +02:00
lang.pxd	* Efficiency tweaks	2014-09-13 00:14:05 +02:00
lang.pyx	* More performance fiddling, particularly moving the specials into the cache, so that we can just lookup the cache in _tokenize	2014-09-13 00:59:34 +02:00
lexeme.pxd	* Restoring Lexeme-as-struct	2014-09-10 20:41:37 +02:00
lexeme.pyx	* Restoring Lexeme-as-struct	2014-09-10 20:41:37 +02:00
orth.py	* Refactor to use tokens class.	2014-09-10 18:27:44 +02:00
ptb3.pxd	* Adding PTB3 tokenizer back in, so can understand how much boilerplate is in the docs for multiple tokenizers	2014-08-29 02:30:27 +02:00
ptb3.pyx	* Adding PTB3 tokenizer back in, so can understand how much boilerplate is in the docs for multiple tokenizers	2014-08-29 02:30:27 +02:00
tokens.pxd	* Fiddle with token features	2014-09-12 15:49:36 +02:00
tokens.pyx	* Fiddle with token features	2014-09-12 15:49:36 +02:00
word.pxd	* Moving back to lexeme structs	2014-09-10 20:41:47 +02:00
word.pyx	* Only store LexemeC structs in the vocabulary, transforming them to Lexeme objects for output. Moving away from Lexeme objects for Tokens soon.	2014-09-11 12:28:38 +02:00