Matthew Honnibal
|
62a01dd41d
|
* Fix issue #92: lexemes.bin read error on 32-bit platforms.
|
2015-09-08 14:23:58 +02:00 |
|
Matthew Honnibal
|
55ed3b3a63
|
Merge pull request #85 from NSchrading/master
Add a script to generate the specials.json file
|
2015-09-07 09:05:19 +10:00 |
|
jxs8172
|
85f01c5e16
|
Add contributor agreement. Add exception to 'it' so that 'its' and 'Its' aren't generated (its =/= it's)
|
2015-08-24 18:20:06 -04:00 |
|
Matthew Honnibal
|
25f29232ca
|
Merge pull request #86 from vsolovyov/fix-c-ext-in-setuppy
Correctly pass link_args in c_ext() in setup.py
|
2015-08-24 20:18:49 +10:00 |
|
Vsevolod Solovyov
|
bbdb973398
|
Add contributor agreement for vsolovyov
|
2015-08-24 13:09:23 +03:00 |
|
Vsevolod Solovyov
|
39cfe28f33
|
Correctly pass link_args in c_ext() in setup.py
|
2015-08-24 12:52:05 +03:00 |
|
jxs8172
|
5876248109
|
Add missing we've and hardcoded 's and 'S
|
2015-08-21 22:57:47 -04:00 |
|
jxs8172
|
a5e0a0073b
|
Add a script to generate the specials.json file, to take care of handling uppercase and missing apostrophe contractions
|
2015-08-21 22:39:33 -04:00 |
|
Matthew Honnibal
|
bb910cff92
|
* Fix Python3 problem in align_raw
|
2015-07-28 16:06:53 +02:00 |
|
Matthew Honnibal
|
dcafb181b9
|
* Fix Python3 problem in align_raw
|
2015-07-28 15:52:10 +02:00 |
|
Matthew Honnibal
|
c609ea18f0
|
* Increment version in download script
|
2015-07-28 15:22:17 +02:00 |
|
Matthew Honnibal
|
9c4d0aae62
|
* Switch to better Python2/3 compatible unicode handling
|
2015-07-28 14:45:37 +02:00 |
|
Matthew Honnibal
|
7606d9936f
|
* Python3 correction for GoldParse
|
2015-07-28 14:44:53 +02:00 |
|
Matthew Honnibal
|
ddc1a5cfe5
|
* Fix training under python3
|
2015-07-28 14:09:30 +02:00 |
|
Matthew Honnibal
|
a8bbd7312c
|
* Hackishly patch long dependencies problem
|
2015-07-28 00:14:29 +02:00 |
|
Matthew Honnibal
|
bb583f7f09
|
* Hackishly patch long dependencies problem
|
2015-07-27 23:14:33 +02:00 |
|
Matthew Honnibal
|
b96bf9b8cc
|
Merge branch 'master' of ssh://github.com/honnibal/spaCy
|
2015-07-27 22:57:48 +02:00 |
|
Matthew Honnibal
|
aa7a964a4f
|
* Add a type declaration for doc.from_array
|
2015-07-27 22:57:22 +02:00 |
|
Matthew Honnibal
|
9034f8a1cf
|
* Update test_docs
|
2015-07-27 22:15:19 +02:00 |
|
Matthew Honnibal
|
25a8774f42
|
* Fix regression in packer
|
2015-07-27 21:53:38 +02:00 |
|
Matthew Honnibal
|
174ed1ad20
|
* Tighten the frequency filter in init_model
|
2015-07-27 21:44:51 +02:00 |
|
Matthew Honnibal
|
1601e488ee
|
* Fix bug in decoding non-ascii characters
|
2015-07-27 21:43:58 +02:00 |
|
Matthew Honnibal
|
6deb1e84b6
|
* Upd serialization tests
|
2015-07-27 21:25:48 +02:00 |
|
Matthew Honnibal
|
6a95409cd2
|
* Fix type on bits
|
2015-07-27 21:16:49 +02:00 |
|
Matthew Honnibal
|
a296d72b54
|
* Fix en/attrs
|
2015-07-27 21:16:33 +02:00 |
|
Matthew Honnibal
|
45460f505c
|
* Fix data type on read32 in BitArray
|
2015-07-27 21:12:13 +02:00 |
|
Matthew Honnibal
|
3d43f49f69
|
* Revert prev change
|
2015-07-27 10:58:15 +02:00 |
|
Matthew Honnibal
|
6b586cdad4
|
* Change lexemes.bin format. Add a header specifying size of LexemeC and number of lexemes, and don't have the redundant orth information.
|
2015-07-27 08:31:51 +02:00 |
|
Matthew Honnibal
|
6047f2aa35
|
* Fix path to freqs.txt
|
2015-07-27 02:22:35 +02:00 |
|
Matthew Honnibal
|
4a0f40ec2d
|
* Ensure data is packaged in vocab
|
2015-07-27 02:14:36 +02:00 |
|
Matthew Honnibal
|
af6ed18f2a
|
* Ensure we don't use orth_encode on OOV words.
|
2015-07-27 02:12:01 +02:00 |
|
Matthew Honnibal
|
912511f0aa
|
* Update prebuild command, for shell bug
|
2015-07-27 01:52:04 +02:00 |
|
Matthew Honnibal
|
b532f4eaa2
|
* Ensure serialize is packaged.
|
2015-07-27 01:51:37 +02:00 |
|
Matthew Honnibal
|
8535d872e8
|
* Set is_oov property in get_flags
|
2015-07-27 01:51:24 +02:00 |
|
Matthew Honnibal
|
0f4d0d51ab
|
* Test is_oov property
|
2015-07-27 01:50:34 +02:00 |
|
Matthew Honnibal
|
8e4c69ee8c
|
* Add is_oov property, and fix up handling of attributes
|
2015-07-27 01:50:06 +02:00 |
|
Matthew Honnibal
|
fc268f03eb
|
* Assert against null pointer exceptions in vocab
|
2015-07-27 01:00:10 +02:00 |
|
Matthew Honnibal
|
2b5cde87fd
|
* Add prebuild command, to test clean builds
|
2015-07-26 22:40:04 +02:00 |
|
Matthew Honnibal
|
0368889d6c
|
* Support gzipped frequencies in init_model
|
2015-07-26 22:39:22 +02:00 |
|
Matthew Honnibal
|
62da5eb338
|
* Inc version
|
2015-07-26 22:22:54 +02:00 |
|
Matthew Honnibal
|
b997b1122b
|
* Mark test_io as requiring the model
|
2015-07-26 21:36:22 +02:00 |
|
Matthew Honnibal
|
0f093fdb30
|
* Fix get_by_orth for py3
|
2015-07-26 19:26:41 +02:00 |
|
Matthew Honnibal
|
ceeda5a739
|
* Fix get_by_orth for py3
|
2015-07-26 18:39:27 +02:00 |
|
Matthew Honnibal
|
5c9b8d05e4
|
* Upd test_docs
|
2015-07-26 17:41:13 +02:00 |
|
Matthew Honnibal
|
609f729cc5
|
* Fix infix test
|
2015-07-26 17:32:55 +02:00 |
|
Matthew Honnibal
|
3cfe3d8c1c
|
* Revert bad infix change
|
2015-07-26 17:32:37 +02:00 |
|
Matthew Honnibal
|
460b4c3207
|
* Add more infix tests
|
2015-07-26 17:30:34 +02:00 |
|
Matthew Honnibal
|
bd608559bc
|
* Fix infix-period tokenization
|
2015-07-26 17:14:52 +02:00 |
|
Matthew Honnibal
|
94f314c271
|
* Fix tokenization of email addresses.
|
2015-07-26 16:38:08 +02:00 |
|
Matthew Honnibal
|
48a4d15264
|
* Test token properties
|
2015-07-26 16:37:39 +02:00 |
|