| svlandeg | cd6c263fe4 | format offsets | 2019-07-23 11:31:29 +02:00 |
| svlandeg | 9f8c1e71a2 | fix for Issue #4000 | 2019-07-22 13:34:12 +02:00 |
| svlandeg | dae8a21282 | rename entity frequency | 2019-07-19 17:40:28 +02:00 |
| svlandeg | 21176517a7 | have gold.links correspond exactly to doc.ents | 2019-07-19 12:36:15 +02:00 |
| svlandeg | e1213eaf6a | use original gold object in get_loss function | 2019-07-18 13:35:10 +02:00 |
| svlandeg | ec55d2fccd | filter training data beforehand (+black formatting) | 2019-07-18 10:22:24 +02:00 |
| svlandeg | b7a0c9bf60 | fixing the context/prior weight settings | 2019-07-03 17:48:09 +02:00 |
| svlandeg | 8840d4b1b3 | fix for context encoder optimizer | 2019-07-03 13:35:36 +02:00 |
| svlandeg | 3420cbe496 | small fixes | 2019-07-03 10:25:51 +02:00 |
| svlandeg | 2d2dea9924 | experiment with adding NER types to the feature vector | 2019-06-29 14:52:36 +02:00 |
| svlandeg | c664f58246 | adding prior probability as feature in the model | 2019-06-28 16:22:58 +02:00 |
| svlandeg | 1c80b85241 | fix tests | 2019-06-28 08:59:23 +02:00 |
| svlandeg | 68a0662019 | context encoder with Tok2Vec + linking model instead of cosine | 2019-06-28 08:29:31 +02:00 |
| svlandeg | dbc53b9870 | rename to KBEntryC | 2019-06-26 15:55:26 +02:00 |
| svlandeg | 1de61f68d6 | improve speed of prediction loop | 2019-06-26 13:53:10 +02:00 |
| svlandeg | bee23cd8af | try Tok2Vec instead of SpacyVectors | 2019-06-25 16:09:22 +02:00 |
| svlandeg | b58bace84b | small fixes | 2019-06-24 10:55:04 +02:00 |
| svlandeg | a31648d28b | further code cleanup | 2019-06-19 09:15:43 +02:00 |
| svlandeg | 478305cd3f | small tweaks and documentation | 2019-06-18 18:38:09 +02:00 |
| svlandeg | 0d177c1146 | clean up code, remove old code, move to bin | 2019-06-18 13:20:40 +02:00 |
| svlandeg | ffae7d3555 | sentence encoder only (removing article/mention encoder) | 2019-06-18 00:05:47 +02:00 |
| svlandeg | 6332af40de | baseline performances: oracle KB, random and prior prob | 2019-06-17 14:39:40 +02:00 |
| svlandeg | 24db1392b9 | reprocessing all of wikipedia for training data | 2019-06-16 21:14:45 +02:00 |
| svlandeg | 81731907ba | performance per entity type | 2019-06-14 19:55:46 +02:00 |
| svlandeg | b312f2d0e7 | redo training data to be independent of KB and entity-level instead of doc-level | 2019-06-14 15:55:26 +02:00 |
| svlandeg | 0b04d142de | regenerating KB | 2019-06-13 22:32:56 +02:00 |
| svlandeg | 78dd3e11da | write entity linking pipe to file and keep vocab consistent between kb and nlp | 2019-06-13 16:25:39 +02:00 |
| svlandeg | b12001f368 | small fixes | 2019-06-12 22:05:53 +02:00 |
| svlandeg | 6521cfa132 | speeding up training | 2019-06-12 13:37:05 +02:00 |
| svlandeg | 66813a1fdc | speed up predictions | 2019-06-11 14:18:20 +02:00 |
| svlandeg | fe1ed432ef | eval on dev set, varying combo's of prior and context scores | 2019-06-11 11:40:58 +02:00 |
| svlandeg | 83dc7b46fd | first tests with EL pipe | 2019-06-10 21:25:26 +02:00 |
| svlandeg | 7de1ee69b8 | training loop in proper pipe format | 2019-06-07 15:55:10 +02:00 |
| svlandeg | 0486ccabfd | introduce goldparse.links | 2019-06-07 13:54:45 +02:00 |
| svlandeg | a5c061f506 | storing NEL training data in GoldParse objects | 2019-06-07 12:58:42 +02:00 |
| svlandeg | 61f0e2af65 | code cleanup | 2019-06-06 20:22:14 +02:00 |
| svlandeg | d8b435ceff | pretraining description vectors and storing them in the KB | 2019-06-06 19:51:27 +02:00 |
| svlandeg | 5c723c32c3 | entity vectors in the KB + serialization of them | 2019-06-05 18:29:18 +02:00 |
| svlandeg | 9abbd0899f | separate entity encoder to get 64D descriptions | 2019-06-05 00:09:46 +02:00 |
| svlandeg | fb37cdb2d3 | implementing el pipe in pipes.pyx (not tested yet) | 2019-06-03 21:32:54 +02:00 |
| svlandeg | 9e88763dab | 60% acc run | 2019-06-03 08:04:49 +02:00 |
| svlandeg | 268a52ead7 | experimenting with cosine sim for negative examples (not OK yet) | 2019-05-29 16:07:53 +02:00 |
| svlandeg | a761929fa5 | context encoder combining sentence and article | 2019-05-28 18:14:49 +02:00 |
| svlandeg | 992fa92b66 | refactor again to clusters of entities and cosine similarity | 2019-05-28 00:05:22 +02:00 |
| svlandeg | 8c4aa076bc | small fixes | 2019-05-27 14:29:38 +02:00 |
| svlandeg | cfc27d7ff9 | using Tok2Vec instead | 2019-05-26 23:39:46 +02:00 |
| svlandeg | abf9af81c9 | learn rate en epochs | 2019-05-24 22:04:25 +02:00 |
| svlandeg | 86ed771e0b | adding local sentence encoder | 2019-05-23 16:59:11 +02:00 |
| svlandeg | 4392c01b7b | obtain sentence for each mention | 2019-05-23 15:37:05 +02:00 |
| svlandeg | 97241a3ed7 | upsampling and batch processing | 2019-05-22 23:40:10 +02:00 |