Matthew Honnibal
|
90c143bd85
|
* Fix orth import
|
2015-01-05 18:49:19 +11:00 |
|
Matthew Honnibal
|
7689dccd0f
|
* Remove unused import
|
2015-01-05 18:48:48 +11:00 |
|
Matthew Honnibal
|
3f1944d688
|
* Make PyPy work
|
2015-01-05 17:54:38 +11:00 |
|
Matthew Honnibal
|
a510d9f677
|
* Another assertion removed
|
2015-01-05 13:01:40 +11:00 |
|
Matthew Honnibal
|
2856946a66
|
* Remove assertion that doesn't work on Python 3
|
2015-01-05 12:51:16 +11:00 |
|
Matthew Honnibal
|
94034f1112
|
* Fix encoding in lemmatization
|
2015-01-05 11:54:29 +11:00 |
|
Matthew Honnibal
|
b132b3caa6
|
* Fix unicode error in lemmatizer
|
2015-01-05 11:53:54 +11:00 |
|
Matthew Honnibal
|
477e7fbffe
|
* Fix data reading for lemmatizer
|
2015-01-05 06:01:32 +11:00 |
|
Matthew Honnibal
|
58f75abaca
|
* Fix unicode error in orth
|
2015-01-05 05:53:08 +11:00 |
|
Matthew Honnibal
|
4e085d5166
|
* Fix lemmatizer for Python3
|
2015-01-05 05:51:26 +11:00 |
|
Matthew Honnibal
|
ae7c811fd1
|
* Use Exception instead of StandardError
|
2015-01-04 01:22:12 +11:00 |
|
Matthew Honnibal
|
0e4c2ba036
|
* Fix loading of special morph words
|
2015-01-03 23:13:00 +11:00 |
|
Matthew Honnibal
|
f5d41028b5
|
* Move around data files for test release
|
2015-01-03 01:59:22 +11:00 |
|
Matthew Honnibal
|
a24321b63a
|
* Add downloader
|
2015-01-02 21:44:41 +11:00 |
|
Matthew Honnibal
|
5d9a096e2f
|
* Some minor clean-up after HastyModel
|
2014-12-31 19:46:04 +11:00 |
|
Matthew Honnibal
|
aafaf58cbe
|
* Refactor _ml.Model, and finish implementing HastyModel. So far not worthwhile.
|
2014-12-31 19:40:59 +11:00 |
|
Matthew Honnibal
|
bcd038e7b6
|
* Implement HastyModel
|
2014-12-31 01:16:47 +11:00 |
|
Matthew Honnibal
|
1a075f77ff
|
* Don't override pre-loaded POS tags, if set by special-cases
|
2014-12-30 23:26:32 +11:00 |
|
Matthew Honnibal
|
785c7ba76a
|
* Embed signature on attrs
|
2014-12-30 23:25:31 +11:00 |
|
Matthew Honnibal
|
30e5805656
|
* Lazy-load tagger and parser
|
2014-12-30 23:25:09 +11:00 |
|
Matthew Honnibal
|
9976aa976e
|
* Messily fix morphology and POS tags on special tokens.
|
2014-12-30 23:24:37 +11:00 |
|
Matthew Honnibal
|
c1ef3febee
|
* Embedsignature in tokens.pyx
|
2014-12-30 21:22:00 +11:00 |
|
Matthew Honnibal
|
aac5028b6e
|
* Move tagger to _ml
|
2014-12-30 21:21:38 +11:00 |
|
Matthew Honnibal
|
1ffb0229ed
|
* Import tokens in parser.pxd
|
2014-12-30 21:21:17 +11:00 |
|
Matthew Honnibal
|
bb0b00f819
|
* Repurpose the Tagger class as a generic Model, wrapping thinc's interface
|
2014-12-30 21:20:15 +11:00 |
|
Matthew Honnibal
|
fe2a5e0370
|
* Work on docstrings
|
2014-12-27 21:46:04 +11:00 |
|
Matthew Honnibal
|
bb80937544
|
* Upd docstrings
|
2014-12-27 18:45:16 +11:00 |
|
Matthew Honnibal
|
b8b65903fc
|
* Tmp
|
2014-12-24 17:42:00 +11:00 |
|
Matthew Honnibal
|
ab61673edd
|
* Fix api of array method
|
2014-12-23 15:18:48 +11:00 |
|
Matthew Honnibal
|
7708d0e24a
|
* Move lemmatizer to en dir
|
2014-12-23 15:16:57 +11:00 |
|
Matthew Honnibal
|
98eb4c0426
|
* Fix path to parser model
|
2014-12-23 15:09:09 +11:00 |
|
Matthew Honnibal
|
b00bc01d8c
|
* All tests now passing for reorg
|
2014-12-23 13:18:59 +11:00 |
|
Matthew Honnibal
|
73f200436f
|
* Tests passing except for morphology/lemmatization stuff
|
2014-12-23 11:40:32 +11:00 |
|
Matthew Honnibal
|
cf8d26c3d2
|
* POS tagger training working after reorg
|
2014-12-22 08:54:47 +11:00 |
|
Matthew Honnibal
|
4c4aa2c5c9
|
* Work on train
|
2014-12-22 07:25:43 +11:00 |
|
Matthew Honnibal
|
61df50b598
|
* Add English-subclass POS tagger
|
2014-12-21 20:59:07 +11:00 |
|
Matthew Honnibal
|
9f3f07cab6
|
* Add attrs file for English
|
2014-12-21 11:29:11 +11:00 |
|
Matthew Honnibal
|
2a89d70429
|
* Add vocab.pyx to setup, and ensure we can import spacy.en.lang
|
2014-12-21 06:03:53 +11:00 |
|
Matthew Honnibal
|
b34a1325d3
|
* Everything compiling after reorg. About to start testing.
|
2014-12-21 05:42:23 +11:00 |
|
Matthew Honnibal
|
e1c1a4b868
|
* Tmp
|
2014-12-21 05:36:29 +11:00 |
|
Matthew Honnibal
|
d11c1edf8c
|
* Import slice_unicode from strings.pyx
|
2014-12-20 07:56:26 +11:00 |
|
Matthew Honnibal
|
be1bdcbd85
|
* Move lang.pyx to tokenizer.pyx
|
2014-12-20 07:55:40 +11:00 |
|
Matthew Honnibal
|
89a1cc1a48
|
* Move murmurhash to .pxd in strings file
|
2014-12-20 07:41:08 +11:00 |
|
Matthew Honnibal
|
d5a942c4a4
|
* Rename lang.pyx to tokenizer.pyx
|
2014-12-20 07:30:39 +11:00 |
|
Matthew Honnibal
|
a60ae261ae
|
* Move tokenizer to its own file, and refactor
|
2014-12-20 07:29:16 +11:00 |
|
Matthew Honnibal
|
867a4a000c
|
* Export set_morph_from_dict function
|
2014-12-20 07:28:27 +11:00 |
|
Matthew Honnibal
|
4e30195c6d
|
* Refactor morphology.pyx
|
2014-12-20 07:27:28 +11:00 |
|
Matthew Honnibal
|
4c6ce7ee84
|
* Update tokens.pyx as part of reorg
|
2014-12-20 07:03:26 +11:00 |
|
Matthew Honnibal
|
116f7f3bc1
|
* Rename Lexicon to Vocab, and move it to its own file
|
2014-12-20 06:54:03 +11:00 |
|
Matthew Honnibal
|
780cbd68b1
|
* Move all struct definitions to structs.pxd, to avoid circular dependencies
|
2014-12-20 06:51:33 +11:00 |
|