Author | Commit | Message | Date
Jim Geovedi | 2572a9ddf0 | Merge remote-tracking branch 'upstream/develop' into indonesian | 2017-07-30 21:24:16 +07:00
Jim Geovedi | bb08d696f9 | added hashtag rule and fixed currency rules | 2017-07-30 21:23:28 +07:00
Jim Geovedi | e9af79a803 | added u-\d+ rules (sports team) | 2017-07-30 21:23:01 +07:00
Matthew Honnibal | c16ef0a85c | Clarify train textcat example | 2017-07-29 21:59:27 +02:00
Matthew Honnibal | 27abc56e98 | Add method to get beam entities | 2017-07-29 21:59:02 +02:00
Matthew Honnibal | ec63f4fe7b | Add option to control how missing entities are handled when getting NER tags | 2017-07-29 21:58:37 +02:00
Jim Geovedi | e5adc26c72 | simplified rules | 2017-07-29 18:21:32 +07:00
Jim Geovedi | 783f7d8b86 | added test set for Indonesian language | 2017-07-29 18:21:07 +07:00
Jim Geovedi | 4d04898dea | updated regexp | 2017-07-29 17:44:57 +07:00
Jim Geovedi | 7d96d477ea | updated like_num | 2017-07-29 17:44:46 +07:00
Jim Geovedi | 3cca4ed798 | added lex attrs rules | 2017-07-29 17:22:21 +07:00
Jim Geovedi | 8b814c63f1 | more exceptions | 2017-07-27 19:46:30 +07:00
Jim Geovedi | 6c725e8dcf | updated lemma | 2017-07-27 19:46:21 +07:00
Jim Geovedi | c194f7ae26 | Merge remote-tracking branch 'upstream/develop' into indonesian | 2017-07-27 10:55:34 +07:00
Jim Geovedi | 547973b92a | wip syntax iterators | 2017-07-27 10:51:34 +07:00
Jim Geovedi | bbc75da38d | enable syntax iterator and lemma lookup | 2017-07-27 10:51:15 +07:00
Jim Geovedi | 24a8c8bf28 | added wip lemma dict | 2017-07-26 21:39:54 +07:00
Jim Geovedi | 63f14ba46b | added hyphen-suffix rules | 2017-07-26 19:28:57 +07:00
Jim Geovedi | f288964441 | removed -el from suffix rules | 2017-07-26 19:28:38 +07:00
Jim Geovedi | 6eee7a7411 | updated tokenizer exceptions | 2017-07-26 19:13:47 +07:00
Jim Geovedi | edec51b1b1 | update punctuation rules | 2017-07-26 19:13:36 +07:00
Jim Geovedi | 62443d495a | enable token match | 2017-07-26 19:13:14 +07:00
Jim Geovedi | c97f5ae0bb | updated tokenizer exceptions | 2017-07-26 19:12:52 +07:00
Matthew Honnibal | aff325b7e0 | Increment version | 2017-07-25 19:41:20 +02:00
Matthew Honnibal | 6780132821 | Fix tagger loading | 2017-07-25 19:41:11 +02:00
Matthew Honnibal | fd20a4af55 | Increment version | 2017-07-25 18:58:34 +02:00
Matthew Honnibal | ff7418b0d9 | Update requirements | 2017-07-25 18:58:15 +02:00
Matthew Honnibal | 523b0df2c9 | Update text classification model | 2017-07-25 18:57:59 +02:00
Matthew Honnibal | 7c7fac9337 | Add spacy.blank() loading function | 2017-07-25 18:56:37 +02:00
Jim Geovedi | 73f6ac9d9b | added hyhen | 2017-07-24 15:56:31 +07:00
Jim Geovedi | 68454c40bf | added missing import | 2017-07-24 14:12:34 +07:00
Jim Geovedi | eaf9cbd708 | cursed of copy & paste | 2017-07-24 14:11:51 +07:00
Jim Geovedi | 7aad6718bc | enable tokenizer exceptions | 2017-07-24 14:11:10 +07:00
Jim Geovedi | ad56c9179a | added tokenizer exceptions list | 2017-07-24 14:10:16 +07:00
Jim Geovedi | c1f3fe99fe | updated punctuation rules | 2017-07-24 13:57:21 +07:00
Jim Geovedi | 37fa2c8c80 | punctution rules | 2017-07-24 06:17:18 +07:00
Jim Geovedi | 082e94ac1c | added inflix rules | 2017-07-24 06:17:07 +07:00
Jim Geovedi | d0ec484725 | reverted | 2017-07-24 06:16:29 +07:00
Jim Geovedi | 0e590c711f | added prefix & suffix rules | 2017-07-23 23:46:40 +07:00
Jim Geovedi | ba922e30e8 | added ampere hour unit | 2017-07-23 23:46:18 +07:00
Jim Geovedi | 3b17eba27b | added frequency units | 2017-07-23 23:10:52 +07:00
Jim Geovedi | d5fd32a572 | added known currencies | 2017-07-23 22:56:48 +07:00
Jim Geovedi | f6f15678fb | added lex_attrs | 2017-07-23 22:55:22 +07:00
Jim Geovedi | bed8162d00 | added tokenizer_exceptions | 2017-07-23 22:55:05 +07:00
Jim Geovedi | b80c35bc9a | added norm_exceptions | 2017-07-23 22:54:49 +07:00
Jim Geovedi | b5de329ea3 | added norm_exceptions | 2017-07-23 22:54:19 +07:00
Jim Geovedi | 082e9ade46 | fixed typo | 2017-07-23 21:30:34 +07:00
Jim Geovedi | e2efeb186e | added stopwords | 2017-07-23 20:52:37 +07:00
Jim Geovedi | da98676839 | use template | 2017-07-23 20:51:31 +07:00
Jim Geovedi | c2b4dd7809 | start working on Indonesian language | 2017-07-23 20:50:56 +07:00