Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							eb72eae258 
							
						 
					 
					
						
						
							
							Merge pull request  #1364  from Destygo/master  
						
						
						
						
						Fixed NER model loading bug 
						
					 
					
						2017-09-29 12:29:43 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							58bfe30a12 
							
						 
					 
					
						
						
							
							Merge pull request  #1362  from IamJeffG/docs/custom-tokenizer  
						
						
						
						
						Document Tokenizer(token_match) and clarify tokenizer_pseudo_code 
						
					 
					
						2017-09-26 15:51:15 +02:00 
						 
				 
			
				
					
						
							
							
								Vincent Genty 
							
						 
					 
					
						
						
						
						
							
						
						
							259ed027af 
							
						 
					 
					
						
						
							
							Fixed NER model loading bug  
						
						
						
					 
					
						2017-09-26 15:46:04 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							361211fe26 
							
						 
					 
					
						
						
							
							Merge pull request  #1342  from wannaphongcom/master  
						
						
						
						
						Add Thai language 
						
					 
					
						2017-09-26 15:40:55 +02:00 
						 
				 
			
				
					
						
							
							
								Jeffrey Gerard 
							
						 
					 
					
						
						
						
						
							
						
						
							b6ebedd09c 
							
						 
					 
					
						
						
							
							Document Tokenizer(token_match) and clarify tokenizer_pseudo_code  
						
						
						
						
						Closes  #835 
In the `tokenizer_pseudo_code` I put the `special_cases` kwarg
before `find_prefix` because this now matches the order the args
are used in the pseudocode, and it also matches spaCy's actual code. 
					
						2017-09-25 13:13:25 -07:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2f8d535f65 
							
						 
					 
					
						
						
							
							Merge pull request  #1351  from hscspring/patch-4  
						
						
						
						
						Update punctuation.py 
						
					 
					
						2017-09-24 12:16:39 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							9177313063 
							
						 
					 
					
						
						
							
							Merge pull request  #1352  from hscspring/patch-5  
						
						
						
						
						Update customizing-tokenizer.jade 
						
					 
					
						2017-09-22 16:11:49 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							1dbc2285b8 
							
						 
					 
					
						
						
							
							Merge pull request  #1350  from hscspring/patch-3  
						
						
						
						
						Update word-vectors-similarities.jade 
						
					 
					
						2017-09-22 16:11:05 +02:00 
						 
				 
			
				
					
						
							
							
								Yam 
							
						 
					 
					
						
						
						
						
							
						
						
							54855f0eee 
							
						 
					 
					
						
						
							
							Update customizing-tokenizer.jade  
						
						
						
					 
					
						2017-09-22 12:15:48 +08:00 
						 
				 
			
				
					
						
							
							
								Yam 
							
						 
					 
					
						
						
						
						
							
						
						
							6f450306c3 
							
						 
					 
					
						
						
							
							Update customizing-tokenizer.jade  
						
						
						
						
						Update some code examples:
- `me` -> `-PRON`
- `TAG` -> `POS`
- `create_tokenizer` function 
						
					 
					
						2017-09-22 10:53:22 +08:00 
						 
				 
			
				
					
						
							
							
								Yam 
							
						 
					 
					
						
						
						
						
							
						
						
							923c4c2fb2 
							
						 
					 
					
						
						
							
							Update punctuation.py  
						
						
						
						
						add `……` 
						
					 
					
						2017-09-22 09:50:46 +08:00 
						 
				 
			
				
					
						
							
							
								Yam 
							
						 
					 
					
						
						
						
						
							
						
						
							425c09488d 
							
						 
					 
					
						
						
							
							Update word-vectors-similarities.jade  
						
						
						
						
						add
```
import spacy
nlp = spacy.load('en')
```
						
					 
					
						2017-09-22 08:56:34 +08:00 
						 
				 
			
				
					
						
							
							
								Wannaphong Phatthiyaphaibun 
							
						 
					 
					
						
						
						
						
							
						
						
							1abf472068 
							
						 
					 
					
						
						
							
							add th test  
						
						
						
					 
					
						2017-09-21 12:56:58 +07:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ea2732469b 
							
						 
					 
					
						
						
							
							Merge pull request  #1340  from hscspring/patch-1  
						
						
						
						
						Update punctuation.py 
						
					 
					
						2017-09-20 23:57:00 +02:00 
						 
				 
			
				
					
						
							
							
								Wannaphong Phatthiyaphaibun 
							
						 
					 
					
						
						
						
						
							
						
						
							39bb5690f0 
							
						 
					 
					
						
						
							
							update th  
						
						
						
					 
					
						2017-09-21 00:36:02 +07:00 
						 
				 
			
				
					
						
							
							
								Wannaphong Phatthiyaphaibun 
							
						 
					 
					
						
						
						
						
							
						
						
							44291f6697 
							
						 
					 
					
						
						
							
							add thai  
						
						
						
					 
					
						2017-09-20 23:26:34 +07:00 
						 
				 
			
				
					
						
							
							
								Yam 
							
						 
					 
					
						
						
						
						
							
						
						
							978b24ccd4 
							
						 
					 
					
						
						
							
							Update punctuation.py  
						
						
						
						
						In Chinese, `~` and `——` are hyphens,
and `·` is an interpunct (a separator dot).
						
					 
					
						2017-09-20 23:02:22 +08:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							aa728b33ca 
							
						 
					 
					
						
						
							
							Merge pull request  #1333  from galaxyh/master  
						
						
						
						
						Add Chinese punctuation 
						
					 
					
						2017-09-19 15:09:30 +02:00 
						 
				 
			
				
					
						
							
							
								Yu-chun Huang 
							
						 
					 
					
						
						
						
						
							
						
						
							188b439b25 
							
						 
					 
					
						
						
							
							Add Chinese punctuation  
						
						
						
						
						Add Chinese punctuation. 
						
					 
					
						2017-09-19 16:58:42 +08:00 
						 
				 
			
				
					
						
							
							
								Yu-chun Huang 
							
						 
					 
					
						
						
						
						
							
						
						
							1f1f35dcd0 
							
						 
					 
					
						
						
							
							Add Chinese punctuation  
						
						
						
						
						Add Chinese punctuation. 
						
					 
					
						2017-09-19 16:57:24 +08:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							4bee26188d 
							
						 
					 
					
						
						
							
							Merge pull request  #1323  from galaxyh/master  
						
						
						
						
						Set the "cut_all" parameter in jieba.cut() to False, or jieba will return ALL POSSIBLE word segmentations. 
						
					 
					
						2017-09-14 15:23:41 +02:00 
						 
				 
			
				
					
						
							
							
								Yu-chun Huang 
							
						 
					 
					
						
						
						
						
							
						
						
							7692b8c071 
							
						 
					 
					
						
						
							
							Update __init__.py  
						
						
						
						
						Set the "cut_all" parameter to False, or jieba will return ALL POSSIBLE word segmentations. 
						
					 
					
						2017-09-12 16:23:47 +08:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ddaff6ca56 
							
						 
					 
					
						
						
							
							Merge pull request  #1287  from IamJeffG/feature/1226-more-complete-noun-chunks  
						
						
						
						
						Capture more noun chunks 
						
					 
					
						2017-09-08 07:59:10 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							45029a550e 
							
						 
					 
					
						
						
							
							Fix customized-tokenizer tests  
						
						
						
					 
					
						2017-09-04 20:13:13 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							34c585396a 
							
						 
					 
					
						
						
							
							Merge pull request  #1294  from Vimos/master  
						
						
						
						
						Fix issue #1292  and add test case for the Assertion Error 
						
					 
					
						2017-09-04 19:20:40 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c68f188eb0 
							
						 
					 
					
						
						
							
							Fix error on test  
						
						
						
					 
					
						2017-09-04 18:59:36 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							33313c01ad 
							
						 
					 
					
						
						
							
							Merge pull request  #1298  from ericzhao28/master  
						
						
						
						
						Lowest common ancestor matrix for spans and docs 
						
					 
					
						2017-09-04 18:57:54 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e8a26ebfab 
							
						 
					 
					
						
						
							
							Add efficiency note to new get_lca_matrix() method  
						
						
						
					 
					
						2017-09-04 15:43:52 +02:00 
						 
				 
			
				
					
						
							
							
								Eric Zhao 
							
						 
					 
					
						
						
						
						
							
						
						
							d61c117081 
							
						 
					 
					
						
						
							
							Lowest common ancestor matrix for spans and docs  
						
						
						
						
						Added functionality for spans and docs to get lowest common ancestor
matrix by simply calling: doc.get_lca_matrix() or
doc[:3].get_lca_matrix().
Corresponding unit tests were also added under spacy/tests/doc and
spacy/tests/spans.
Designed to address: https://github.com/explosion/spaCy/issues/969 . 
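
To illustrate what a lowest common ancestor matrix over a dependency parse contains, here is a rough pure-Python sketch. It does not use spaCy and is not the code added in this commit: the function name `lca_matrix` and the `heads` encoding (each token's head index, with a root token pointing to itself) are assumptions made for illustration only.

```python
def lca_matrix(heads):
    """Return an n x n matrix whose [i][j] entry is the index of the lowest
    common ancestor of tokens i and j, or -1 if they share no ancestor.

    `heads[i]` is the index of token i's syntactic head; a root token is
    its own head (a hypothetical encoding chosen for this sketch).
    Assumes the heads form well-formed trees (no cycles other than roots).
    """
    n = len(heads)

    def ancestors(i):
        # Path from token i up to its root, including i itself.
        path = [i]
        while heads[path[-1]] != path[-1]:
            path.append(heads[path[-1]])
        return path

    paths = [ancestors(i) for i in range(n)]
    matrix = [[-1] * n for _ in range(n)]
    for i in range(n):
        on_path_i = set(paths[i])
        for j in range(n):
            # The first node on j's upward path that also lies on i's
            # path is the lowest common ancestor of i and j.
            for node in paths[j]:
                if node in on_path_i:
                    matrix[i][j] = node
                    break
    return matrix


# "She likes green apples": "likes" is the root, "She" and "apples"
# attach to "likes", and "green" attaches to "apples".
heads = [1, 1, 3, 1]
lca = lca_matrix(heads)
```

In this sketch each token is its own lowest common ancestor on the diagonal, and tokens in different trees get -1. Walking every pair of paths is quadratic or worse in sentence length, so it is only meant to show the shape of the result.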
						
					 
					
						2017-09-03 12:22:19 -07:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							9bffcaa73d 
							
						 
					 
					
						
						
							
							Update test to make it slightly more direct  
						
						
						
						
						The `nlp` container should be unnecessary here. If so, we can test the tokenizer class just a little more directly. 
						
					 
					
						2017-09-01 21:16:56 +02:00 
						 
				 
			
				
					
						
							
							
								Vimos Tan 
							
						 
					 
					
						
						
						
						
							
						
						
							a6d9fb5bb6 
							
						 
					 
					
						
						
							
							fix issue  #1292  
						
						
						
					 
					
						2017-08-30 14:49:14 +08:00 
						 
				 
			
				
					
						
							
							
								Jeffrey Gerard 
							
						 
					 
					
						
						
						
						
							
						
						
							884ba168a8 
							
						 
					 
					
						
						
							
							Capture more noun chunks  
						
						
						
					 
					
						2017-08-23 21:18:53 -07:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							dcff10abe9 
							
						 
					 
					
						
						
							
							Add regression test for  #1281  
						
						
						
					 
					
						2017-08-21 16:11:47 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							edc596d9a7 
							
						 
					 
					
						
						
							
							Add missing tokenizer exceptions ( resolves   #1281 )  
						
						
						
					 
					
						2017-08-21 16:11:36 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							c5c3f4c7d9 
							
						 
					 
					
						
						
							
							Use more generous .env ignore rule  
						
						
						
					 
					
						2017-08-21 16:08:40 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							dca026124f 
							
						 
					 
					
						
						
							
							Merge pull request  #1262  from kevinmarsh/patch-1  
						
						
						
						
						Fix broken tutorial link on website 
						
					 
					
						2017-08-16 09:58:07 +02:00 
						 
				 
			
				
					
						
							
							
								Kevin Marsh 
							
						 
					 
					
						
						
						
						
							
						
						
							e3738aba0d 
							
						 
					 
					
						
						
							
							Fix broken tutorial link on website  
						
						
						
					 
					
						2017-08-15 21:50:09 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							a9465271a7 
							
						 
					 
					
						
						
							
							Merge pull request  #1245  from delirious-lettuce/fix_typos  
						
						
						
						
						Fix typos 
						
					 
					
						2017-08-07 23:11:20 +02:00 
						 
				 
			
				
					
						
							
							
								Delirious Lettuce 
							
						 
					 
					
						
						
						
						
							
						
						
							d3b03f0544 
							
						 
					 
					
						
						
							
							Fix typos:  
						
						
						
						
						* `auxillary` -> `auxiliary`
  * `consistute` -> `constitute`
  * `earlist` -> `earliest`
  * `prefered` -> `preferred`
  * `direcory` -> `directory`
  * `reuseable` -> `reusable`
  * `idiosyncracies` -> `idiosyncrasies`
  * `enviroment` -> `environment`
  * `unecessary` -> `unnecessary`
  * `yesteday` -> `yesterday`
  * `resouces` -> `resources` 
						
					 
					
						2017-08-06 21:31:39 -06:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b7b121103f 
							
						 
					 
					
						
						
							
							Merge pull request  #1244  from gideonite/patch-1  
						
						
						
						
						improve pipe, tee, izip explanation 
						
					 
					
						2017-08-06 14:34:07 +02:00 
						 
				 
			
				
					
						
							
							
								Gideon Dresdner 
							
						 
					 
					
						
						
						
						
							
						
						
							7e98a3613c 
							
						 
					 
					
						
						
							
							improve pipe, tee, izip explanation  
						
						
						
						
						Use an example from an old issue https://github.com/explosion/spaCy/issues/172#issuecomment-183963403 . 
						
					 
					
						2017-08-06 13:21:45 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							864cefd3b2 
							
						 
					 
					
						
						
							
							Update README.rst  
						
						
						
					 
					
						2017-07-22 18:29:55 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							e349271506 
							
						 
					 
					
						
						
							
							Increment version  
						
						
						
					 
					
						2017-07-22 18:29:30 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							570964e67f 
							
						 
					 
					
						
						
							
							Update README.rst  
						
						
						
					 
					
						2017-07-22 16:20:19 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							5494605689 
							
						 
					 
					
						
						
							
							Fiddle with regex pin  
						
						
						
					 
					
						2017-07-22 16:09:50 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							78fcf56dd5 
							
						 
					 
					
						
						
							
							Update version pin for regex library  
						
						
						
					 
					
						2017-07-22 15:57:58 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d51d55bba6 
							
						 
					 
					
						
						
							
							Increment version  
						
						
						
					 
					
						2017-07-22 15:43:16 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8ccf154413 
							
						 
					 
					
						
						
							
							Merge branch 'master' of  https://github.com/explosion/spaCy  
						
						
						
					 
					
						2017-07-22 15:42:44 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							796b2f4c1b 
							
						 
					 
					
						
						
							
							Remove print statements in tests  
						
						
						
					 
					
						2017-07-22 15:42:38 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							7c4bf9994d 
							
						 
					 
					
						
						
							
							Add note on requirements and preventing model re-downloads ( closes   #1143 )  
						
						
						
					 
					
						2017-07-22 15:40:12 +02:00