mirror of
				https://github.com/explosion/spaCy.git
				synced 2025-10-31 07:57:35 +03:00 
			
		
		
		
	Fix out-of-bounds access in NER training
The helper method state.B(1) returns the index of the second token of the buffer, or -1 if no such token exists. Normally this is safe because we pass the result to functions like state.safe_get(), which returns an empty token for an invalid index. Here we used it directly as an array index, which is not okay! This error may have been the cause of out-of-bounds access errors during training. Similar errors may still be around, so they must be hunted down. Hunting this one down took a long time: I printed out values across training runs and diffed them, looking for points of divergence between runs where no randomness should have been allowed.
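The debugging approach described above (log values from two runs that should be identical, then diff them) can be sketched roughly as follows. This is only an illustration: `first_divergence` and the sample values are hypothetical and are not part of spaCy.

```python
from itertools import zip_longest


def first_divergence(run_a, run_b):
    """Return (step, value_a, value_b) for the first step where two logged
    runs differ, or None if the logs are identical.

    Hypothetical helper for illustration; not part of spaCy.
    """
    # zip_longest pads the shorter log with None, so a run that stops
    # early is also reported as a divergence.
    for step, (a, b) in enumerate(zip_longest(run_a, run_b)):
        if a != b:
            return step, a, b
    return None


# Made-up values logged at each step of two supposedly deterministic runs.
run_a = [0.5, 0.25, 0.125, 0.0625]
run_b = [0.5, 0.25, 0.75, 0.0625]

print(first_divergence(run_a, run_b))  # (2, 0.125, 0.75)
```

With randomness disabled, any reported step points at nondeterminism, such as a read from memory outside the array.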
parent c6a320cad4
commit ad068f51be
@@ -338,7 +338,7 @@ cdef class In:
     @staticmethod
     cdef weight_t cost(StateClass s, const GoldParseC* gold, attr_t label) nogil:
         move = IN
-        cdef int next_act = gold.ner[s.B(1)].move if s.B(0) < s.c.length else OUT
+        cdef int next_act = gold.ner[s.B(1)].move if s.B(1) >= 0 else OUT
         cdef int g_act = gold.ner[s.B(0)].move
         cdef attr_t g_tag = gold.ner[s.B(0)].label
         cdef bint is_sunk = _entity_is_sunk(s, gold.ner)
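The guard in the change above can be illustrated with a minimal Python sketch. The old condition tested s.B(0), not the index that was actually used, so s.B(1) could still be -1 when only one token remained in the buffer. Everything below is a hypothetical stand-in: `buffer_index` only mimics the "-1 if no such token" behaviour described in the commit message, and the `IN`/`OUT` values are made up. Note that in Python an index of -1 silently wraps to the last element, whereas in the C-level code it reads outside the array.

```python
# Hypothetical action codes for illustration; not spaCy's real values.
OUT = 0
IN = 1


def buffer_index(buffer, i):
    """Mimic state.B(i): the index of the i-th buffer token, or -1 if the
    buffer has no such token. Illustrative stand-in, not spaCy's code."""
    return buffer[i] if 0 <= i < len(buffer) else -1


def next_action(gold_moves, buffer):
    """Look up the gold move for the token after the current one,
    falling back to OUT when there is no such token."""
    idx = buffer_index(buffer, 1)
    # Check the index that is actually used, as the fixed line does.
    return gold_moves[idx] if idx >= 0 else OUT


gold_moves = [IN, IN, IN]  # made-up gold moves for a three-token document

# Two tokens left in the buffer: the lookup is valid.
print(next_action(gold_moves, [1, 2]))  # 1

# One token left: B(1) is -1, so the guard falls back to OUT.
print(next_action(gold_moves, [2]))  # 0

# Without the guard, Python would quietly return the last element instead.
print(gold_moves[buffer_index([2], 1)])  # 1
```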