commit b6ebedd09c
Author: Jeffrey Gerard
Date:   2017-09-25 13:13:25 -07:00

    Document Tokenizer(token_match) and clarify tokenizer_pseudo_code

    Closes #835. In `tokenizer_pseudo_code` I put the `special_cases`
    kwarg before `find_prefix` because this now matches the order the
    args are used in the pseudocode, and it also matches spaCy's
    actual code.
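For context, the `tokenizer_pseudo_code` these commits refine can be sketched as a minimal runnable version. This is a simplified sketch, not spaCy's real implementation: the `find_prefix`/`find_suffix` helpers below are toy stand-ins, and infix splitting and `token_match` are omitted.

```python
def tokenizer_pseudo_code(text, special_cases, find_prefix, find_suffix):
    """Simplified sketch of spaCy's tokenization loop: split on
    whitespace, then repeatedly check special cases and peel
    prefixes/suffixes off each chunk."""
    tokens = []
    for substring in text.split():
        suffixes = []
        while substring:
            if substring in special_cases:
                # Special cases (e.g. contractions) expand to fixed splits.
                tokens.extend(special_cases[substring])
                substring = ""
            elif find_prefix(substring):
                n = find_prefix(substring)
                tokens.append(substring[:n])
                substring = substring[n:]
            elif find_suffix(substring):
                n = find_suffix(substring)
                # Suffixes are peeled off the right end, so they come off
                # in reverse; collect them and reverse at the end so the
                # returned tokens match input order.
                suffixes.append(substring[-n:])
                substring = substring[:-n]
            else:
                tokens.append(substring)
                substring = ""
        tokens.extend(reversed(suffixes))
    return tokens


# Toy rules: an opening paren/quote is a one-char prefix,
# closing punctuation is a one-char suffix.
special_cases = {"don't": ["do", "n't"]}
find_prefix = lambda s: 1 if s[0] in "(\"'" else 0
find_suffix = lambda s: 1 if s[-1] in ")\"'!.," else 0

print(tokenizer_pseudo_code("(don't!)", special_cases, find_prefix, find_suffix))
# → ['(', 'do', "n't", '!', ')']
```

Note that `special_cases` precedes `find_prefix` in the signature, matching the order the loop consults them, and that the collected suffixes are reversed before being appended.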
commit 54855f0eee
Author: Yam
Date:   2017-09-22 12:15:48 +08:00

    Update customizing-tokenizer.jade
commit 6f450306c3
Author: Yam
Date:   2017-09-22 10:53:22 +08:00

    Update customizing-tokenizer.jade

    Update some code examples:
    - `me` -> `-PRON`
    - `TAG` -> `POS`
    - `create_tokenizer` function
commit d3b03f0544
Author: Delirious Lettuce
Date:   2017-08-06 21:31:39 -06:00

    Fix typos:

    * `auxillary` -> `auxiliary`
    * `consistute` -> `constitute`
    * `earlist` -> `earliest`
    * `prefered` -> `preferred`
    * `direcory` -> `directory`
    * `reuseable` -> `reusable`
    * `idiosyncracies` -> `idiosyncrasies`
    * `enviroment` -> `environment`
    * `unecessary` -> `unnecessary`
    * `yesteday` -> `yesterday`
    * `resouces` -> `resources`
commit e4a45ae55f
Author: Bart Broere
Date:   2017-06-12 12:28:51 +02:00

    Very minor documentation fix
commit af3d121ec9
Author: Yuval Pinter
Date:   2017-05-22 10:56:03 -04:00

    extend suffixes from first to last

    Reverse the suffix list in `tokenizer_pseudo_code()` so the
    order of the returned tokens matches the input order.
commit 7ec710af0e
Author: Kevin Gao
Date:   2017-01-17 10:38:14 -08:00

    Fix Custom Tokenizer docs

    - Fix mismatched quotations
    - Make it clearer where the ORTH, LEMMA, and POS symbols come from
    - Make strings consistent
    - Fix lemma_ assertion: s/-PRON-/me/
commit ce8bf08223
Author: Ines Montani
Date:   2016-12-18 17:40:20 +01:00

    Fix formatting
commit c20abc8a6d
Author: Ines Montani
Date:   2016-11-05 20:40:11 +01:00

    Add customizing tokenizer and training workflow