Matthew Honnibal
|
985bc68327
|
* Fix bug with trailing punct on contractions. Reduced efficiency, and slightly hacky implementation.
|
2014-09-12 18:26:26 +02:00 |
|
Matthew Honnibal
|
4817277d66
|
* Replace main lexicon dict with dense_hash_map. May be unsuitable, if strings need recovery.
|
2014-09-12 04:29:09 +02:00 |
|
Matthew Honnibal
|
8b20e9ad97
|
* Delete unused _split method
|
2014-09-12 04:03:52 +02:00 |
|
Matthew Honnibal
|
a4863686ec
|
* Changed cache to use a linked-list data structure, to take out Python list code. Taking 6-7 mins for gigaword.
|
2014-09-12 03:30:50 +02:00 |
|
Matthew Honnibal
|
e096f30161
|
* Tweak signatures and refactor slightly. Processing gigaword taking 8-9 mins. Tests passing, but some sort of memory bug on exit.
|
2014-09-12 02:43:36 +02:00 |
|
Matthew Honnibal
|
073ee0de63
|
* Restore dense_hash_map for cache dictionary. Seems to double efficiency
|
2014-09-12 02:23:51 +02:00 |
|
Matthew Honnibal
|
c8f7c8bfde
|
* Moving to storing LexemeC structs internally
|
2014-09-11 21:54:34 +02:00 |
|
Matthew Honnibal
|
563047e90f
|
* Switch to returning a Tokens object
|
2014-09-11 21:37:32 +02:00 |
|
Matthew Honnibal
|
cf412adba8
|
* Refactoring to use Tokens object
|
2014-09-10 18:11:13 +02:00 |
|
Matthew Honnibal
|
45a22d6b2c
|
* Docs coming together
|
2014-08-29 01:59:23 +02:00 |
|
Matthew Honnibal
|
c282e6d5fb
|
* Redesign proceeding
|
2014-08-28 19:45:09 +02:00 |
|
Matthew Honnibal
|
fdaf24604a
|
* Basic punct tests updated and passing
|
2014-08-27 19:38:57 +02:00 |
|
Matthew Honnibal
|
e9a62b6eba
|
* Refactoring with Lexeme as a class now compiles. Basic design seems to work
|
2014-08-27 17:15:39 +02:00 |
|
Matthew Honnibal
|
68bae2fec6
|
* More refactoring
|
2014-08-25 16:42:22 +02:00 |
|