Jim Geovedi
|
c62b49b7cc
|
Merge remote-tracking branch 'upstream/develop' into indonesian
|
2017-08-09 09:17:46 +07:00 |
|
Matthew Honnibal
|
dbdd8afc4b
|
Fix parser fine-tune training
|
2017-08-08 15:46:07 -05:00 |
|
Matthew Honnibal
|
88bf1cf87c
|
Update parser for fine tuning
|
2017-08-08 15:34:17 -05:00 |
|
Matthew Honnibal
|
5d837c3776
|
Add mix weights on fine_tune
|
2017-08-07 06:32:59 -05:00 |
|
Matthew Honnibal
|
42bd26f6f3
|
Give parser its own tok2vec weights
|
2017-08-06 18:33:46 +02:00 |
|
Matthew Honnibal
|
3ed203de25
|
Use LayerNorm and SELU in Tok2Vec
|
2017-08-06 18:33:18 +02:00 |
|
Matthew Honnibal
|
78498a072d
|
Return Transition for missing actions in lookup_action
|
2017-08-06 14:16:36 +02:00 |
|
Matthew Honnibal
|
4a5cc89138
|
Fix tagger 'fine_tune', to keep private CNN weights
|
2017-08-06 14:15:48 +02:00 |
|
Matthew Honnibal
|
3cb8f06881
|
Fix NeuralLabeller
|
2017-08-06 14:15:14 +02:00 |
|
Matthew Honnibal
|
0acce0521b
|
Fix Language.update for pipeline
|
2017-08-06 14:13:03 +02:00 |
|
Matthew Honnibal
|
bfffdeabb2
|
Fix parser batch-size bug introduced during cleanup
|
2017-08-06 14:10:48 +02:00 |
|
Matthew Honnibal
|
0eec7c9e9b
|
Fix Language.evaluate
|
2017-08-06 02:18:31 +02:00 |
|
Matthew Honnibal
|
0a566dc320
|
Add update_tensors flag to Language.update. Experimental, re #1182
|
2017-08-06 02:18:12 +02:00 |
|
Matthew Honnibal
|
cc19ea0e7c
|
Add update_tensors flag to Language.update. Experimental, re #1182
|
2017-08-06 02:17:10 +02:00 |
|
Matthew Honnibal
|
4cfb7a54e7
|
Fix tagger
|
2017-08-06 01:53:31 +02:00 |
|
Matthew Honnibal
|
e9ab800e15
|
Fix tagging model
|
2017-08-06 01:50:08 +02:00 |
|
Matthew Honnibal
|
468c138ab3
|
WIP: Add fine-tuning logic to tagger model, re #1182
|
2017-08-06 01:13:23 +02:00 |
|
Matthew Honnibal
|
7f876a7a82
|
Clean up some unused code in parser
|
2017-08-06 00:00:21 +02:00 |
|
Matthew Honnibal
|
ae1ad81069
|
Increment version
|
2017-08-05 18:09:32 +02:00 |
|
Jim Geovedi
|
cc4772cac2
|
reworks
|
2017-08-03 13:08:38 +07:00 |
|
Jim Geovedi
|
37f19f5ed2
|
added more currencies based on corpus data
|
2017-08-03 13:03:25 +07:00 |
|
Jim Geovedi
|
30fd068d42
|
hashtag prefix should be handled somewhere else
|
2017-08-03 13:03:02 +07:00 |
|
Jim Geovedi
|
4705ae19ba
|
Merge remote-tracking branch 'upstream/develop' into indonesian
|
2017-08-03 12:40:19 +07:00 |
|
Jim Geovedi
|
ba07e23c87
|
added USD in currency rules
|
2017-08-02 22:42:47 +07:00 |
|
Matthew Honnibal
|
5c323daa1a
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-08-01 22:10:37 +02:00 |
|
Matthew Honnibal
|
2e00361522
|
Fix update when 0 docs
|
2017-08-01 22:10:17 +02:00 |
|
Matthew Honnibal
|
8fce187de4
|
Fix ArcEager for missing values
|
2017-08-01 22:10:05 +02:00 |
|
ines
|
78e262140f
|
Add workaround for displaCy server on Python 2/3 (resolves #1227)
Make sure status and headers are bytes on Python 2 and strings on Python 3
|
2017-08-01 01:11:35 +02:00 |
|
Jim Geovedi
|
2572a9ddf0
|
Merge remote-tracking branch 'upstream/develop' into indonesian
|
2017-07-30 21:24:16 +07:00 |
|
Jim Geovedi
|
bb08d696f9
|
added hashtag rule and fixed currency rules
|
2017-07-30 21:23:28 +07:00 |
|
Jim Geovedi
|
e9af79a803
|
added u-\d+ rules (sports team)
|
2017-07-30 21:23:01 +07:00 |
|
Matthew Honnibal
|
c16ef0a85c
|
Clarify train textcat example
|
2017-07-29 21:59:27 +02:00 |
|
Matthew Honnibal
|
27abc56e98
|
Add method to get beam entities
|
2017-07-29 21:59:02 +02:00 |
|
Matthew Honnibal
|
ec63f4fe7b
|
Add option to control how missing entities are handled when getting NER tags
|
2017-07-29 21:58:37 +02:00 |
|
Jim Geovedi
|
e5adc26c72
|
simplified rules
|
2017-07-29 18:21:32 +07:00 |
|
Jim Geovedi
|
783f7d8b86
|
added test set for Indonesian language
|
2017-07-29 18:21:07 +07:00 |
|
Jim Geovedi
|
4d04898dea
|
updated regexp
|
2017-07-29 17:44:57 +07:00 |
|
Jim Geovedi
|
7d96d477ea
|
updated like_num
|
2017-07-29 17:44:46 +07:00 |
|
Jim Geovedi
|
3cca4ed798
|
added lex attrs rules
|
2017-07-29 17:22:21 +07:00 |
|
Jim Geovedi
|
8b814c63f1
|
more exceptions
|
2017-07-27 19:46:30 +07:00 |
|
Jim Geovedi
|
6c725e8dcf
|
updated lemma
|
2017-07-27 19:46:21 +07:00 |
|
Jim Geovedi
|
c194f7ae26
|
Merge remote-tracking branch 'upstream/develop' into indonesian
|
2017-07-27 10:55:34 +07:00 |
|
Jim Geovedi
|
547973b92a
|
wip syntax iterators
|
2017-07-27 10:51:34 +07:00 |
|
Jim Geovedi
|
bbc75da38d
|
enable syntax iterator and lemma lookup
|
2017-07-27 10:51:15 +07:00 |
|
Jim Geovedi
|
24a8c8bf28
|
added wip lemma dict
|
2017-07-26 21:39:54 +07:00 |
|
Jim Geovedi
|
63f14ba46b
|
added hyphen-suffix rules
|
2017-07-26 19:28:57 +07:00 |
|
Jim Geovedi
|
f288964441
|
removed -el from suffix rules
|
2017-07-26 19:28:38 +07:00 |
|
Jim Geovedi
|
6eee7a7411
|
updated tokenizer exceptions
|
2017-07-26 19:13:47 +07:00 |
|
Jim Geovedi
|
edec51b1b1
|
update punctuation rules
|
2017-07-26 19:13:36 +07:00 |
|
Jim Geovedi
|
62443d495a
|
enable token match
|
2017-07-26 19:13:14 +07:00 |
|