Commit Graph

51 Commits

Author SHA1 Message Date
svlandeg
b1911f7105 Errors.E146 for IO error when FP is null 2019-07-22 14:56:13 +02:00
svlandeg
5d544f89ba Errors.E145 for IO errors when reading KB 2019-07-22 14:36:07 +02:00
svlandeg
76184374e2 test corner cases 2019-07-22 13:39:32 +02:00
svlandeg
9f8c1e71a2 fix for Issue #4000 2019-07-22 13:34:12 +02:00
svlandeg
dae8a21282 rename entity frequency 2019-07-19 17:40:28 +02:00
svlandeg
ec55d2fccd filter training data beforehand (+black formatting) 2019-07-18 10:22:24 +02:00
svlandeg
d833d4c358 fixes in kb and gold 2019-07-17 17:18:26 +02:00
svlandeg
4086c6ff60 get vector functionality + unit test 2019-07-17 12:17:02 +02:00
svlandeg
dbc53b9870 rename to KBEntryC 2019-06-26 15:55:26 +02:00
svlandeg
cc9ae28a52 custom error and warning messages 2019-06-19 12:35:26 +02:00
svlandeg
a31648d28b further code cleanup 2019-06-19 09:15:43 +02:00
svlandeg
78dd3e11da write entity linking pipe to file and keep vocab consistent between kb and nlp 2019-06-13 16:25:39 +02:00
svlandeg
a5c061f506 storing NEL training data in GoldParse objects 2019-06-07 12:58:42 +02:00
svlandeg
d8b435ceff pretraining description vectors and storing them in the KB 2019-06-06 19:51:27 +02:00
svlandeg
5c723c32c3 entity vectors in the KB + serialization of them 2019-06-05 18:29:18 +02:00
svlandeg
1ae41daaa9 allow small rounding errors 2019-05-01 23:05:40 +02:00
svlandeg
60b54ae8ce bulk entity writing and experiment with regex wikidata reader to speed up processing 2019-05-01 00:00:38 +02:00
svlandeg
54d0cea062 unit test for KB serialization 2019-04-24 23:52:34 +02:00
svlandeg
3e0cb69065 KB aliases to and from file 2019-04-24 20:24:24 +02:00
svlandeg
ad6c5e581c writing and reading number of entries to/from header 2019-04-24 15:31:44 +02:00
svlandeg
6e3223f234 bulk loading in proper order of entity indices 2019-04-24 11:26:38 +02:00
svlandeg
694fea597a dumping all entryC entries + (inefficient) reading back in 2019-04-23 18:36:50 +02:00
svlandeg
8e70a564f1 custom reader and writer for _EntryC fields (first stab at it - not complete) 2019-04-23 16:33:40 +02:00
svlandeg
10ee8dfea2 poc with few entities and collecting aliases from the WP links 2019-04-18 14:12:17 +02:00
svlandeg
9a7d534b1b enable nogil for cython functions in kb.pxd 2019-04-10 17:25:10 +02:00
svlandeg
61a33f55d2 little fixes 2019-04-10 16:06:09 +02:00
svlandeg
8814b9010d entity as one field instead of both ID and name 2019-03-25 18:10:41 +01:00
svlandeg
46f4eb5db3 error and warning messages 2019-03-22 16:55:05 +01:00
svlandeg
b4cd5d5ee9 property annotations for fields with only a getter 2019-03-22 16:10:49 +01:00
svlandeg
a48241e9a2 use nlp's vocab for stringstore 2019-03-22 11:36:45 +01:00
svlandeg
1ee0e78fd7 select candidate with highest prior probabiity 2019-03-22 11:36:45 +01:00
svlandeg
7b708ab8a4 name per entity 2019-03-22 11:36:45 +01:00
svlandeg
c593607ce2 minimal EL pipe 2019-03-22 11:36:45 +01:00
svlandeg
c71123dd0c ensure no candidates are returned for unknown aliases 2019-03-22 11:36:45 +01:00
svlandeg
b6c3255a9f Entity class 2019-03-22 11:36:45 +01:00
svlandeg
1289cd6e8f property getters and keep track of KB internally 2019-03-22 11:36:45 +01:00
svlandeg
9a46c431c3 store entity hash instead of pointer 2019-03-22 11:36:45 +01:00
svlandeg
9819dca80e create candidate object from entry pointer (not fully functional yet) 2019-03-22 11:36:45 +01:00
svlandeg
a9074e0886 check the length of entities and probabilities vector + unit test 2019-03-22 11:36:45 +01:00
svlandeg
d133ffaff9 correct size, not counting dummy elements in the vector 2019-03-22 11:36:45 +01:00
svlandeg
33f8a0fe2e check and unit test in case prior probs exceed 1 2019-03-22 11:36:45 +01:00
svlandeg
b55baaa1dc avoid value 0 in preshmap and helpful user warnings 2019-03-22 11:36:45 +01:00
svlandeg
20a7b7b1c0 raising error when adding alias for unknown entity + unit test 2019-03-22 11:36:45 +01:00
svlandeg
8843f9279c use StringStore 2019-03-22 11:36:45 +01:00
svlandeg
51560bf0ed bugfix adding aliases 2019-03-22 11:36:45 +01:00
svlandeg
c4ba942765 get candidates by alias 2019-03-22 11:36:45 +01:00
svlandeg
151b855cc8 adding and retrieving aliases 2019-03-22 11:36:45 +01:00
svlandeg
cf34113250 very minimal KB functionality working 2019-03-22 11:36:44 +01:00
svlandeg
af281c5466 adding aliases per entity in the KB 2019-03-22 11:36:44 +01:00
svlandeg
f77b99c103 fix compile errors 2019-03-22 11:36:44 +01:00