__init__.py | * Basic punct tests updated and passing | 2014-08-27 19:38:57 +02:00
de.pxd | * Add German tokenizer files | 2014-09-25 18:29:13 +02:00
de.pyx | * Add German tokenizer files | 2014-09-25 18:29:13 +02:00
en.pxd | * Refactor tokenization, splitting it into a clearer life-cycle. | 2014-09-16 13:16:02 +02:00
en.pyx | * Use PointerHash instead of locally provided _hashing module | 2014-09-25 18:23:35 +02:00
lang.pxd | * Add i attribute to lexeme, giving lexemes sequential IDs. | 2014-10-09 13:50:05 +11:00
lang.pyx | * Update Lexicon class to expect a list of lexeme dict descriptions | 2014-10-09 14:51:35 +11:00
lexeme.pxd | * Update Lexicon class to expect a list of lexeme dict descriptions | 2014-10-09 14:51:35 +11:00
lexeme.pyx | * Add serialize/deserialize functions for lexeme, transport to/from python dict. | 2014-10-09 14:10:46 +11:00
orth.py | * Refactor to use tokens class. | 2014-09-10 18:27:44 +02:00
ptb3.pxd | * Adding PTB3 tokenizer back in, so can understand how much boilerplate is in the docs for multiple tokenizers | 2014-08-29 02:30:27 +02:00
ptb3.pyx | * Switch to using a Python ref counted gateway to malloc/free, to prevent memory leaks | 2014-09-17 20:02:26 +02:00
tokens.pxd | * Switch to using a heap-allocated vector in tokens | 2014-09-15 03:46:14 +02:00
tokens.pyx | * Switch to using a Python ref counted gateway to malloc/free, to prevent memory leaks | 2014-09-17 20:02:26 +02:00
typedefs.pxd | * Add typedefs file | 2014-09-17 23:10:32 +02:00
util.py | * Update Lexicon class to expect a list of lexeme dict descriptions | 2014-10-09 14:51:35 +11:00
word.pxd | * Moving back to lexeme structs | 2014-09-10 20:41:47 +02:00
word.pyx | * Switch to using a Python ref counted gateway to malloc/free, to prevent memory leaks | 2014-09-17 20:02:26 +02:00