Matthew Honnibal | 0074ae2fc0 | * Switch to dynamically allocating array, based on the document length | 2014-07-07 08:05:29 +02:00
Matthew Honnibal | ff1869ff07 | * Fixed major efficiency problem, from not quite grokking pass by reference in cython c++ | 2014-07-07 07:36:43 +02:00
Matthew Honnibal | 0c76143b72 | * Give value for assert | 2014-07-07 05:10:46 +02:00
Matthew Honnibal | d5bef02c72 | * Reorganized, moving language-independent stuff to spacy. The functions in spacy ask for the dictionaries and split function on input, but the language-specific modules are curried versions that use the globals | 2014-07-07 04:21:06 +02:00
Matthew Honnibal | a62c38e1ef | * Working tokenization. en doesn't match PTB perfectly. Need to reorganize before adding more schemes. | 2014-07-07 01:15:59 +02:00
Matthew Honnibal | 556f6a18ca | * Initial commit. Tests passing for punctuation handling. Need contractions, file transport, tokenize function, etc. | 2014-07-05 20:51:42 +02:00