| Author | Commit | Message | Date |
|---|---|---|---|
| Matthew Honnibal | 19c1e83d3d | Work on draft Italian tokenizer | 2016-11-02 19:56:32 +01:00 |
| Matthew Honnibal | 9efe568177 | Add missing unicode_literals to spacy.util. I think this was messing up the tokenizer regex for non-ascii characters in Python 2. Re Issue #596 | 2016-11-02 12:31:34 +01:00 |
| Matthew Honnibal | d8db648ebf | Add __init__.py file for regression tests | 2016-11-01 13:45:06 +01:00 |
| Matthew Honnibal | 11664b9f20 | Fix variable error in token | 2016-11-01 13:28:00 +01:00 |
| Matthew Honnibal | 8c4d1b46ce | Fix variable error in Span | 2016-11-01 13:27:44 +01:00 |
| Matthew Honnibal | e7af6b937f | Fix syntax error while fixing doc strings | 2016-11-01 13:27:32 +01:00 |
| Matthew Honnibal | 62fc6b1afa | Use 32 bit hashes for OOV, re Issue #589, Issue #285 | 2016-11-01 13:27:13 +01:00 |
| Matthew Honnibal | 6977a2b8cd | Add test for Issue #589 | 2016-11-01 12:33:36 +01:00 |
| Matthew Honnibal | b86f8af0c1 | Fix doc strings | 2016-11-01 12:25:36 +01:00 |
| Matthew Honnibal | d563f1eadb | Fix Issue #587: Segfault in Matcher, due to simple error in the state machine. | 2016-10-28 17:42:00 +02:00 |
| Matthew Honnibal | 7e5f63a595 | Improve test slightly | 2016-10-28 17:41:16 +02:00 |
| Matthew Honnibal | 782e4814f4 | Test Issue #587: Matcher segfaults on particular input | 2016-10-28 16:38:32 +02:00 |
| Matthew Honnibal | 708ea22208 | Infer types in transition_system.pyx | 2016-10-27 18:08:13 +02:00 |
| Matthew Honnibal | 18590eba94 | Fix training evaluate method | 2016-10-27 18:02:19 +02:00 |
| Matthew Honnibal | 301f3cc898 | Fix Issue #429. Add an initialize_state method to the named entity recogniser that adds missing entity types. This is a messy place to add this, because it's strange to have the method mutate state. A better home for this logic could be found. | 2016-10-27 18:01:55 +02:00 |
| Matthew Honnibal | afea6505f3 | Test Issue 429: No valid actions for NER after matcher adds a new entity label. | 2016-10-27 18:01:34 +02:00 |
| Matthew Honnibal | 03a520ec4f | Change signature of Parser.parseC, so that nr_class is read from the transition system. This allows the transition system to modify the number of actions in initialize_state. | 2016-10-27 17:58:56 +02:00 |
| Matthew Honnibal | 6c47048912 | Fix test, after IOB tweak. | 2016-10-26 17:22:03 +02:00 |
| Matthew Honnibal | 4ca31b4d87 | Fix clobbering of 'missing' named ent values after assigning ents. | 2016-10-26 13:13:56 +02:00 |
| Matthew Honnibal | cb49189477 | Remove dead code | 2016-10-26 13:11:07 +02:00 |
| Matthew Honnibal | a209b10579 | Improve error message when oracle fails for non-projective trees, re Issue #571. | 2016-10-24 20:31:30 +02:00 |
| Matthew Honnibal | b2d43b93d2 | Fix Python 3 basestring error | 2016-10-24 14:22:51 +02:00 |
| Matthew Honnibal | 276478fe0f | Update strings.pxd | 2016-10-24 14:00:35 +02:00 |
| Matthew Honnibal | d8134817ff | Workaround Issue #285: Allow the StringStore to be 'frozen', in which case strings will be pushed into an OOV map. We can then flush this OOV map, freeing all of the OOV strings. | 2016-10-24 13:49:03 +02:00 |
| Matthew Honnibal | d3a617aa99 | Test workaround for Issue #285: Streaming data memory growth | 2016-10-24 13:48:06 +02:00 |
| Matthew Honnibal | 64e5f02cf7 | Update test | 2016-10-23 21:08:07 +02:00 |
| Matthew Honnibal | 66d7a6eca2 | Update test | 2016-10-23 21:02:05 +02:00 |
| Matthew Honnibal | 90bf797125 | Update test | 2016-10-23 20:54:17 +02:00 |
| Matthew Honnibal | 5e76320ffe | Update test | 2016-10-23 20:44:54 +02:00 |
| Matthew Honnibal | aa105927f3 | Update test | 2016-10-23 20:31:25 +02:00 |
| Matthew Honnibal | 6b9237aa83 | Increment version | 2016-10-23 20:22:53 +02:00 |
| Matthew Honnibal | 150e02d72e | Fix Issue #566 | 2016-10-23 20:19:01 +02:00 |
| Matthew Honnibal | e120561294 | Fix vector_norm test. | 2016-10-23 19:56:16 +02:00 |
| Matthew Honnibal | fefde8aef8 | Make installation print data path. | 2016-10-23 19:46:44 +02:00 |
| Matthew Honnibal | e7414cd064 | Try to fix weird install glitch. | 2016-10-23 19:46:28 +02:00 |
| Matthew Honnibal | 90f7544edd | Increment version | 2016-10-23 19:43:06 +02:00 |
| Matthew Honnibal | 6036ec7c77 | Fix vector norm when loading lexemes. | 2016-10-23 19:40:18 +02:00 |
| Matthew Honnibal | c05cd2356e | Fix similarity test for Python 3 | 2016-10-23 18:16:56 +02:00 |
| Matthew Honnibal | 3e688e6d4b | Fix issue #514 -- serializer fails when new entity type has been added. The fix here is quite ugly. It's best to add the entities ASAP after loading the NLP pipeline, to mitigate the brittleness. | 2016-10-23 17:45:44 +02:00 |
| Matthew Honnibal | 79aa03fe98 | Test Issue #514: Serializer fails when new entity type has been added. | 2016-10-23 17:41:44 +02:00 |
| Matthew Honnibal | f97548c6f1 | Fix broken test, re Issue #461 | 2016-10-23 17:02:23 +02:00 |
| Matthew Honnibal | 4de30a8e38 | Test Issue #514: Serialization fails after adding a new entity label. | 2016-10-23 16:40:27 +02:00 |
| Matthew Honnibal | 936e6246aa | Fix Issue #459 -- failed to deserialize empty doc. | 2016-10-23 16:31:05 +02:00 |
| Matthew Honnibal | e99b3f5322 | Test Issue #459: Fail to deserialize empty doc | 2016-10-23 16:30:22 +02:00 |
| Matthew Honnibal | 49c117960c | Fix bug where huffman codec died if given empty freqs dict. | 2016-10-23 16:28:05 +02:00 |
| Matthew Honnibal | 99ff8b902f | Test that huffman codec works with empty freqs dict | 2016-10-23 16:27:45 +02:00 |
| Matthew Honnibal | 15c9b59f0e | Fix Issue #461: O tag was being clobbered by doc.ents.__set__ | 2016-10-23 15:50:26 +02:00 |
| Matthew Honnibal | e5627134d9 | Test Issue #461: ent_iob tag incorrect after setting entities. | 2016-10-23 15:50:04 +02:00 |
| Matthew Honnibal | f62088d646 | Fix compile error | 2016-10-23 14:50:50 +02:00 |
| Matthew Honnibal | 2c3a67b693 | Fix calculation of vector norm, re Issue #522. Need to consolidate the calculations into a helper function. | 2016-10-23 14:49:31 +02:00 |
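
Commit d8134817ff describes letting the StringStore be 'frozen' so that unseen strings go into a separate OOV map that can later be flushed. The sketch below only illustrates that idea and is not spaCy's implementation: the class `FreezableStringStore`, its methods, and the use of negative counters as OOV keys are all assumptions made for this example.

```python
class FreezableStringStore:
    """Hypothetical illustration of a freezable string table (not spaCy's API)."""

    def __init__(self):
        self._ids = {}          # permanent: string -> non-negative id
        self._strings = []      # permanent: id -> string
        self._oov_ids = {}      # temporary: string -> negative key
        self._oov_strings = {}  # temporary: negative key -> string
        self.frozen = False

    def add(self, string):
        """Return the key for `string`, interning it if necessary."""
        if string in self._ids:
            return self._ids[string]
        if self.frozen:
            # While frozen, new strings never reach the permanent table.
            if string not in self._oov_ids:
                key = -(len(self._oov_ids) + 1)  # assumed key scheme
                self._oov_ids[string] = key
                self._oov_strings[key] = string
            return self._oov_ids[string]
        key = len(self._strings)
        self._ids[string] = key
        self._strings.append(string)
        return key

    def __getitem__(self, key):
        """Look a key up in the permanent table or the OOV map."""
        if key >= 0:
            return self._strings[key]
        return self._oov_strings[key]

    def __len__(self):
        """Number of permanent strings; OOV entries are not counted."""
        return len(self._strings)

    @property
    def oov_size(self):
        return len(self._oov_ids)

    def flush_oov(self):
        """Drop every OOV string; keys handed out while frozen become invalid."""
        self._oov_ids.clear()
        self._oov_strings.clear()
```

In a streaming setting, a caller would freeze the store after loading the vocabulary and call `flush_oov()` between batches, so the permanent table stops growing with the input. In this sketch the negative keys are reused after a flush, so any OOV key must be treated as stale once the map has been flushed.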