Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							308a28c26c 
							
						 
					 
					
						
						
							
							* Whitespace  
						
						
						
					 
					
						2016-05-02 16:08:11 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							29a114e645 
							
						 
					 
					
						
						
							
							* Don't assign 0-valued tags in Doc.from_array  
						
						
						
					 
					
						2016-05-02 16:07:50 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c1c11a8ae0 
							
						 
					 
					
						
						
							
							* Fix formatting on serializer tests  
						
						
						
					 
					
						2016-05-02 16:07:21 +02:00 
						 
				 
			
				
					
						
							
							
								Wolfgang Seeker 
							
						 
					 
					
						
						
						
						
							
						
						
							dae6bc05eb 
							
						 
					 
					
						
						
							
							define German dummy lemmatizer until morphology is done  
						
						
						
					 
					
						2016-05-02 16:04:53 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6e1f1c4b9e 
							
						 
					 
					
						
						
							
							Merge pull request  #357  from wbwseeker/german_ner  
						
						
						
						
						German ner 
						
					 
					
						2016-05-02 23:39:34 +10:00 
						 
				 
			
				
					
						
							
							
								Wolfgang Seeker 
							
						 
					 
					
						
						
						
						
							
						
						
							b6b96b233c 
							
						 
					 
					
						
						
							
							don't require read_json_file to expect particular annotations  
						
						
						
					 
					
						2016-05-02 15:29:30 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							902a389d85 
							
						 
					 
					
						
						
							
							* Fix merge conflict in test_parse  
						
						
						
					 
					
						2016-05-02 15:28:07 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							276fbe9996 
							
						 
					 
					
						
						
							
							* Fix assignment of iterator on Doc object  
						
						
						
					 
					
						2016-05-02 15:26:24 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							02c23cc1d0 
							
						 
					 
					
						
						
							
							* Fix sentence boundary test  
						
						
						
					 
					
						2016-05-02 15:26:07 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d2f469b809 
							
						 
					 
					
						
						
							
							* Fix parsing tests, so that labels are added if they're missing, and so that the branching test values are correct  
						
						
						
					 
					
						2016-05-02 15:25:27 +02:00 
						 
				 
			
				
					
						
							
							
								Wolfgang Seeker 
							
						 
					 
					
						
						
						
						
							
						
						
							b11cbb06c6 
							
						 
					 
					
						
						
							
							remove old tests for sentence boundary detection  
						
						
						
					 
					
						2016-05-02 14:36:35 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							508fd1f6dc 
							
						 
					 
					
						
						
							
							* Refactor noun chunk iterators, so that they're simple functions. Install the iterator when the Doc is created, but allow users to write to the noun_chunk_iterator attribute. The iterator functions accept an object and yield (int start, int end, int label) triples.  
						
						
						
					 
					
						2016-05-02 14:25:10 +02:00 
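
The design described in commit 508fd1f6dc can be pictured with a small stand-alone sketch. This is not spaCy's implementation: `TinyDoc`, `simple_noun_chunks`, `NP_LABEL`, and the tag-run heuristic are all hypothetical stand-ins. Only the shape comes from the commit message: the iterator is a plain function, it is installed when the document is created, it can be overwritten through the `noun_chunk_iterator` attribute, and it yields `(start, end, label)` integer triples.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Triple = Tuple[int, int, int]  # (start, end, label), end is exclusive

NP_LABEL = 1  # hypothetical integer ID standing in for a noun-phrase label
CHUNK_TAGS = ("DET", "ADJ", "NOUN")  # assumed tag set, for illustration only


def simple_noun_chunks(doc: "TinyDoc") -> Iterator[Triple]:
    """Yield a triple for each run of DET/ADJ/NOUN tags that ends in a NOUN."""
    start = None
    tags = doc.tags
    for i, tag in enumerate(tags):
        if tag not in CHUNK_TAGS:
            start = None  # the run is broken without a closing NOUN
            continue
        if start is None:
            start = i
        run_ends_here = i + 1 == len(tags) or tags[i + 1] not in CHUNK_TAGS
        if tag == "NOUN" and run_ends_here:
            yield (start, i + 1, NP_LABEL)
            start = None


class TinyDoc:
    """Hypothetical stand-in for a document object (not spaCy's Doc)."""

    def __init__(
        self,
        words: Sequence[str],
        tags: Sequence[str],
        noun_chunk_iterator: Callable[["TinyDoc"], Iterator[Triple]] = simple_noun_chunks,
    ) -> None:
        self.words: List[str] = list(words)
        self.tags: List[str] = list(tags)
        # Installed at creation time, but left writable for users.
        self.noun_chunk_iterator = noun_chunk_iterator

    @property
    def noun_chunks(self) -> Iterator[List[str]]:
        # Calls the installed function afresh on every access.
        for start, end, _label in self.noun_chunk_iterator(self):
            yield self.words[start:end]


doc = TinyDoc(
    ["The", "quick", "fox", "saw", "a", "dog"],
    ["DET", "ADJ", "NOUN", "VERB", "DET", "NOUN"],
)
default_triples = list(doc.noun_chunk_iterator(doc))
default_chunks = list(doc.noun_chunks)

# A user can swap in a different function after the document exists.
doc.noun_chunk_iterator = lambda d: iter([(0, 1, NP_LABEL)])
custom_chunks = list(doc.noun_chunks)
```

Because the iterator is an ordinary function stored on the object, replacing it needs no subclassing, and each access to `noun_chunks` starts a fresh iteration.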
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e526be5602 
							
						 
					 
					
						
						
							
							Merge branch 'master' of ssh://github.com/spacy-io/spaCy  
						
						
						
					 
					
						2016-05-02 13:08:08 +02:00 
						 
				 
			
				
					
						
							
							
								Wolfgang Seeker 
							
						 
					 
					
						
						
						
						
							
						
						
							fa961ea694 
							
						 
					 
					
						
						
							
							add tests for serialization bug  
						
						
						
					 
					
						2016-05-02 11:01:56 +02:00 
						 
				 
			
				
					
						
							
							
								Henning Peters 
							
						 
					 
					
						
						
						
						
							
						
						
							9b142d4438 
							
						 
					 
					
						
						
							
							can't work around build issue on windows  
						
						
						
					 
					
						2016-05-01 12:30:59 +02:00 
						 
				 
			
				
					
						
							
							
								Henning Peters 
							
						 
					 
					
						
						
						
						
							
						
						
							749cbd359e 
							
						 
					 
					
						
						
							
							Update LICENSE  
						
						
						
					 
					
						2016-04-29 09:49:28 +02:00 
						 
				 
			
				
					
						
							
							
								Henning Peters 
							
						 
					 
					
						
						
						
						
							
						
						
							fac209cb7e 
							
						 
					 
					
						
						
							
							add stdint.h fallback (vs 2008)  
						
						
						
					 
					
						2016-04-29 00:08:14 +02:00 
						 
				 
			
				
					
						
							
							
								Henning Peters 
							
						 
					 
					
						
						
						
						
							
						
						
							2bf34687ea 
							
						 
					 
					
						
						
							
							add stdint.h fallback (vs 2008)  
						
						
						
					 
					
						2016-04-28 22:10:43 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							97b2bba249 
							
						 
					 
					
						
						
							
							* Merge updated/simplified Break approach  
						
						
						
					 
					
						2016-04-25 19:44:42 +00:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							77609588b6 
							
						 
					 
					
						
						
							
							* Fix assignment of root label to words left as root implicitly, after parsing ends.  
						
						
						
					 
					
						2016-04-25 19:41:59 +00:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							7c2d2deaa7 
							
						 
					 
					
						
						
							
							* Revise transition system so that the Break transition retains sole responsibility for setting sentence boundaries. Re Issue  #322  
						
						
						
					 
					
						2016-04-25 19:41:59 +00:00 
						 
				 
			
				
					
						
							
							
								Wolfgang Seeker 
							
						 
					 
					
						
						
						
						
							
						
						
							c2f76a4024 
							
						 
					 
					
						
						
							
							Merge branch 'master' into german_ner  
						
						
						
					 
					
						2016-04-25 13:21:23 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							feb65fcaa1 
							
						 
					 
					
						
						
							
							Merge pull request  #346  from wbwseeker/sentbnd_bug  
						
						
						
						
						introduce sentence boundaries for additional root tokens 
						
					 
					
						2016-04-25 20:31:27 +10:00 
						 
				 
			
				
					
						
							
							
								Wolfgang Seeker 
							
						 
					 
					
						
						
						
						
							
						
						
							1003e7ccec 
							
						 
					 
					
						
						
							
							remove debug output from tests  
						
						
						
					 
					
						2016-04-25 12:12:40 +02:00 
						 
				 
			
				
					
						
							
							
								Wolfgang Seeker 
							
						 
					 
					
						
						
						
						
							
						
						
							f57f843e85 
							
						 
					 
					
						
						
							
							fix bug in updating tree structure when introducing additional roots  
						
						
						
					 
					
						2016-04-25 12:01:19 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							478a8d1829 
							
						 
					 
					
						
						
							
							* Register Chinese language in spacy/__init__.py  
						
						
						
					 
					
						2016-04-24 18:45:16 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8569dbc2d0 
							
						 
					 
					
						
						
							
							* Add initial stuff for Chinese parsing  
						
						
						
					 
					
						2016-04-24 18:44:24 +02:00 
						 
				 
			
				
					
						
							
							
								Wolfgang Seeker 
							
						 
					 
					
						
						
						
						
							
						
						
							4d7f393fae 
							
						 
					 
					
						
						
							
							don't require json-files to have syntactic annotation  
						
						
						
					 
					
						2016-04-22 16:32:27 +02:00 
						 
				 
			
				
					
						
							
							
								Wolfgang Seeker 
							
						 
					 
					
						
						
						
						
							
						
						
							b6477fc4f4 
							
						 
					 
					
						
						
							
							adjusted tests to Travis Setup  
						
						
						
					 
					
						2016-04-21 17:15:10 +02:00 
						 
				 
			
				
					
						
							
							
								Wolfgang Seeker 
							
						 
					 
					
						
						
						
						
							
						
						
							736ffcb9a2 
							
						 
					 
					
						
						
							
							remove whitespace  
						
						
						
					 
					
						2016-04-21 16:55:55 +02:00 
						 
				 
			
				
					
						
							
							
								Wolfgang Seeker 
							
						 
					 
					
						
						
						
						
							
						
						
							6c7301cc6d 
							
						 
					 
					
						
						
							
							the parser now introduces sentence boundaries properly when predicting dependents with root labels  
						
						
						
					 
					
						2016-04-21 16:50:53 +02:00 
						 
				 
			
				
					
						
							
							
								Wolfgang Seeker 
							
						 
					 
					
						
						
						
						
							
						
						
							12024b0b0a 
							
						 
					 
					
						
						
							
							bugfix: introducing multiple roots now updates original head's properties  
						
						
						
						
						adjust tests to rely less on statistical model 
						
					 
					
						2016-04-20 16:42:41 +02:00 
						 
				 
			
				
					
						
							
							
								Henning Peters 
							
						 
					 
					
						
						
						
						
							
						
						
							c356251f45 
							
						 
					 
					
						
						
							
							Merge branch 'master' of github.com:spacy-io/spaCy  
						
						
						
					 
					
						2016-04-19 19:50:55 +02:00 
						 
				 
			
				
					
						
							
							
								Henning Peters 
							
						 
					 
					
						
						
						
						
							
						
						
							bb3238bcdd 
							
						 
					 
					
						
						
							
							pin numpy to >=1.7, ship headers  
						
						
						
					 
					
						2016-04-19 19:50:42 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							67ce96c9c9 
							
						 
					 
					
						
						
							
							* Make patterns argument to Matcher class optional  
						
						
						
					 
					
						2016-04-17 21:32:24 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8b4677d34d 
							
						 
					 
					
						
						
							
							* Add missing keyword arguments to spacy.load() function  
						
						
						
					 
					
						2016-04-17 21:31:50 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2add5206aa 
							
						 
					 
					
						
						
							
							* Fix description of matcher test  
						
						
						
					 
					
						2016-04-17 15:40:21 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2b419d5b8c 
							
						 
					 
					
						
						
							
							* Update test for Issue  #242  
						
						
						
					 
					
						2016-04-17 15:34:23 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f12b043308 
							
						 
					 
					
						
						
							
							* Add test for Issue  #242 : Overlapping matches not well recognised.  
						
						
						
					 
					
						2016-04-17 15:19:17 +02:00 
						 
				 
			
				
					
						
							
							
								Wolfgang Seeker 
							
						 
					 
					
						
						
						
						
							
						
						
							b98cc3266d 
							
						 
					 
					
						
						
							
							bugfix: iterators now reset properly when called a second time  
						
						
						
					 
					
						2016-04-15 17:49:16 +02:00 
						 
				 
			
				
					
						
							
							
								Wolfgang Seeker 
							
						 
					 
					
						
						
						
						
							
						
						
							e6945c4d0e 
							
						 
					 
					
						
						
							
							bugfix: uppercase attr values before looking them up  
						
						
						
					 
					
						2016-04-15 15:46:31 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c0909afe22 
							
						 
					 
					
						
						
							
							Merge pull request  #312  from wbwseeker/space_head_bug  
						
						
						
						
						add restrictions to L-arc and R-arc to prevent space heads 
						
					 
					
						2016-04-15 20:36:03 +10:00 
						 
				 
			
				
					
						
							
							
								Wolfgang Seeker 
							
						 
					 
					
						
						
						
						
							
						
						
							289b10f441 
							
						 
					 
					
						
						
							
							remove some comments  
						
						
						
					 
					
						2016-04-14 15:37:51 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							fe9299a118 
							
						 
					 
					
						
						
							
							* Fix long-standing issue with coarse-grained tags: proper nouns weren't receiving the PROPN tag, and personal pronouns weren't receiving the PRON tag. This should fix Issue  #191 , and also Issue  #325 , which reported that proper nouns were being lemmatized using the common noun policies. This lemmatization will be prevented if the universal tag is PROPN, not NOUN, as no lemmatization rules are loaded for the PROPN tag.  
						
						
						
					 
					
						2016-04-14 12:46:43 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6f82065761 
							
						 
					 
					
						
						
							
							* Fix infixed commas in tokenizer, re Issue  #326 . Need to benchmark on empirical data, to make sure this doesn't break other cases.  
						
						
						
					 
					
						2016-04-14 11:36:03 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							0f957dd586 
							
						 
					 
					
						
						
							
							Merge branch 'master' of ssh://github.com/honnibal/spaCy  
						
						
						
					 
					
						2016-04-14 10:37:56 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							108aca0e50 
							
						 
					 
					
						
						
							
							* Make Matcher use attrs from the attrs.pyx file, rather than having an incomplete function doing the mapping.  
						
						
						
					 
					
						2016-04-14 10:37:39 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							61d20de35d 
							
						 
					 
					
						
						
							
							* Fix language.py docstring  
						
						
						
					 
					
						2016-04-14 10:36:57 +02:00 
						 
				 
			
				
					
						
							
							
								Wolfgang Seeker 
							
						 
					 
					
						
						
						
						
							
						
						
							d99a9cbce9 
							
						 
					 
					
						
						
							
							different handling of space tokens  
						
						
						
						
						space tokens are now always attached to the previous non-space token
there are two exceptions:
leading space tokens are attached to the first following non-space token
in input that consists exclusively of space tokens, the last space token
is the head of all others. 
						
					 
					
						2016-04-13 15:28:28 +02:00 
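
The attachment rules stated in commit d99a9cbce9 can be illustrated with a short sketch. It is written from the commit message alone and is not the parser's actual code: the function name `space_token_heads` and the boolean-list input are hypothetical. One detail is an assumption rather than something the message states: in all-space input, the last token is mapped to itself to mark it as the head.

```python
from typing import Dict, List


def space_token_heads(is_space: List[bool]) -> Dict[int, int]:
    """Map the index of each space token to the index of its head.

    Rules taken from the commit message:
      * a space token attaches to the previous non-space token;
      * leading space tokens attach to the first following non-space token;
      * if every token is a space token, the last one heads all others.
    """
    n = len(is_space)
    non_space = [i for i, flag in enumerate(is_space) if not flag]

    if not non_space:
        # All-space input. Mapping the last token to itself is an
        # assumption made here to mark it as the head.
        return {i: n - 1 for i in range(n)}

    first_non_space = non_space[0]
    heads: Dict[int, int] = {}
    previous_non_space = None
    for i, flag in enumerate(is_space):
        if not flag:
            previous_non_space = i
        elif previous_non_space is None:
            heads[i] = first_non_space  # leading space token
        else:
            heads[i] = previous_non_space
    return heads


# Tokens: [space, word, space, space, word]
mixed = space_token_heads([True, False, True, True, False])
# Tokens: [space, space, space]
only_spaces = space_token_heads([True, True, True])
```

In the mixed example, the leading space token at index 0 attaches forward to the word at index 1, while the space tokens at indices 2 and 3 attach backward to that same word.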
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							04d0209be9 
							
						 
					 
					
						
						
							
							* Recognise multiple infixes in a token.  
						
						
						
					 
					
						2016-04-13 18:38:26 +10:00