spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-10-04 02:46:40 +03:00

Matthew Honnibal 0152831c89 * Refactor tokenization, enable cache, and ensure we look up specials correctly even when there's confusing punctuation surrounding the token.		2014-09-16 18:01:46 +02:00
__init__.py	* Basic punct tests updated and passing	2014-08-27 19:38:57 +02:00
_hashing.pxd	* Few nips and tucks to hash table	2014-09-15 05:03:44 +02:00
_hashing.pyx	* Few nips and tucks to hash table	2014-09-15 05:03:44 +02:00
en.pxd	* Refactor tokenization, splitting it into a clearer life-cycle.	2014-09-16 13:16:02 +02:00
en.pyx	* Refactor tokenization, enable cache, and ensure we look up specials correctly even when there's confusing punctuation surrounding the token.	2014-09-16 18:01:46 +02:00
lang.pxd	* Refactor tokenization, enable cache, and ensure we look up specials correctly even when there's confusing punctuation surrounding the token.	2014-09-16 18:01:46 +02:00
lang.pyx	* Refactor tokenization, enable cache, and ensure we look up specials correctly even when there's confusing punctuation surrounding the token.	2014-09-16 18:01:46 +02:00
lexeme.pxd	* Upd Tokens to use vector, with bounds checking.	2014-09-15 03:22:40 +02:00
lexeme.pyx	* Fiddle with the way strings are interned in lexeme	2014-09-15 06:34:45 +02:00
orth.py	* Refactor to use tokens class.	2014-09-10 18:27:44 +02:00
ptb3.pxd	* Adding PTB3 tokenizer back in, so can understand how much boilerplate is in the docs for multiple tokenizers	2014-08-29 02:30:27 +02:00
ptb3.pyx	* Adding PTB3 tokenizer back in, so can understand how much boilerplate is in the docs for multiple tokenizers	2014-08-29 02:30:27 +02:00
tokens.pxd	* Switch to using a heap-allocated vector in tokens	2014-09-15 03:46:14 +02:00
tokens.pyx	* Switch to using a heap-allocated vector in tokens	2014-09-15 03:46:14 +02:00
word.pxd	* Moving back to lexeme structs	2014-09-10 20:41:47 +02:00
word.pyx	* Only store LexemeC structs in the vocabulary, transforming them to Lexeme objects for output. Moving away from Lexeme objects for Tokens soon.	2014-09-11 12:28:38 +02:00