spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-08-21 20:44:56 +03:00

Author	SHA1	Message	Date
Ines Montani	aab299c8ae	Merge pull request #1429 from vishnunekkanti/develop fix syntax error in zh	2017-10-17 14:45:02 +02:00
Anto Binish Kaspar	534240648e	Fix trailing whitespace on morphology features	2017-10-17 17:15:58 +05:30
Anto Binish Kaspar	8f5b60c168	Fix Language.from_disk overwrites the meta.json file.	2017-10-17 17:15:32 +05:30
ines	8ca344712d	Add Language.has_pipe method	2017-10-17 11:20:07 +02:00
ines	485c4f6df5	Add Hungarian examples (see #1107)	2017-10-17 02:37:45 +02:00
Matthew Honnibal	fc797a58de	Merge pull request #1424 from explosion/feature/streaming-data-memory-growth 💫 Fix streaming data memory growth (!!)	2017-10-16 23:08:18 +02:00
Matthew Honnibal	19531bad4c	Merge branch 'develop' into feature/streaming-data-memory-growth	2017-10-16 21:44:11 +02:00
Matthew Honnibal	df488274b1	Fix deserialization of vectors	2017-10-16 20:55:00 +02:00
Matthew Honnibal	4018486d31	Merge remote-tracking branch 'origin/develop' into feature/streaming-data-memory-growth	2017-10-16 20:49:48 +02:00
ines	4cfe259266	Fix formatting	2017-10-16 20:36:41 +02:00
ines	18793efef1	Remove Russian from v2.0 docs for now	2017-10-16 20:36:36 +02:00
ines	d383612225	Add note about word vectors in example (see #1117)	2017-10-16 20:31:58 +02:00
Matthew Honnibal	4174477161	Fix equality check in test	2017-10-16 19:50:35 +02:00
Matthew Honnibal	2bc06e4b22	Bump rolling buffer size to 10k	2017-10-16 19:38:29 +02:00
Matthew Honnibal	66e2eb8f39	Clean up remnant of frozen in StringStore	2017-10-16 19:34:41 +02:00
Matthew Honnibal	a002264fec	Remove caching of Token in Doc, as caused cycle.	2017-10-16 19:34:21 +02:00
Matthew Honnibal	3e037054c8	Remove obsolete is_frozen functionality from StringStore	2017-10-16 19:23:10 +02:00
Matthew Honnibal	5c14f3f033	Create a rolling buffer for the StringStore in Language.pipe()	2017-10-16 19:22:40 +02:00
Matthew Honnibal	59c216196c	Allow weakrefs on Doc objects	2017-10-16 19:22:11 +02:00
ines	d5418553eb	Fix whitespace	2017-10-16 18:30:04 +02:00
ines	6ceadcdb5c	Make sure from_disk passes string to numpy (see #1421). If path is a WindowsPath, numpy does not recognise it as a path and as a result, doesn't open the file. https://github.com/numpy/numpy/blob/master/numpy/lib/npyio.py#L369	2017-10-16 18:29:56 +02:00
Matthew Honnibal	010a7309ff	Merge pull request #1402 from explosion/feature/fix-matcher-operators 💫 Fix Matcher variable-length operators	2017-10-16 17:53:19 +02:00
Matthew Honnibal	c29927d2e7	Fix matcher test	2017-10-16 17:22:18 +02:00
Vishnu Kumar Nekkanti	d3c54cf39a	fixed SyntaxError while checking for jieba	2017-10-16 18:51:33 +05:30
Vishnu Kumar Nekkanti	18ec6610dd	Merge pull request #1 from explosion/develop Develop	2017-10-16 18:34:13 +05:30
ines	63393b4e0d	Update matcher docs to reflect operator changes	2017-10-16 13:44:12 +02:00
Matthew Honnibal	a928ae2f35	Merge branch 'develop' into feature/fix-matcher-operators	2017-10-16 13:38:36 +02:00
Matthew Honnibal	56aa42cc5d	Fix and document matcher operator 'shadowing' behaviour	2017-10-16 13:38:20 +02:00
Matthew Honnibal	748d525801	Add more matcher operator tests	2017-10-16 13:38:01 +02:00
Matthew Honnibal	0433181658	Document operator semantics in Matcher docstring	2017-10-16 12:06:33 +02:00
Matthew Honnibal	cd9378c8f1	Merge pull request #1423 from yuukos/master Fixed Russian tokenizer	2017-10-16 11:45:53 +02:00
Matthew Honnibal	6b0121091c	Merge pull request #1420 from polm/master [ja] Stash tokenizer output for speed	2017-10-16 10:28:22 +02:00
yuukos	34e9c6ddc0	Merge remote-tracking branch 'origin/master'	2017-10-16 13:48:10 +07:00
yuukos	92931a2efd	Merge branch 'russian_language'	2017-10-16 13:46:28 +07:00
yuukos	241d19a3e6	fixed Russian Tokenizer - added trailing space flags for tokens	2017-10-16 13:37:05 +07:00
Paul O'Leary McCann	71ae8013ec	[ja] Use user_details instead of a wrapper class Instead of using a JapaneseDoc wrapper class to store Mecab output, stash it in `user_data`. -POLM	2017-10-16 00:24:34 +09:00
Paul O'Leary McCann	43eedf73f2	[ja] Stash tokenizer output for speed Before this commit, the Mecab tokenizer had to be called twice when creating a Doc- once during tokenization and once during tagging. This creates a JapaneseDoc wrapper class for Doc that stashes the parsed tokenizer output to remove redundant processing. -POLM	2017-10-15 23:33:25 +09:00
ines	15514dc333	Add section on upgrading	2017-10-14 22:14:47 +02:00
ines	c0aceb9fbe	Add Hindi to supported languages	2017-10-14 15:16:41 +02:00
Ines Montani	e00a6c08cf	Merge pull request #1418 from polm/master Contributor agreement	2017-10-14 15:10:58 +02:00
ines	266e7180a7	Add Language class, stop words and basic stemmer that sets NORM	2017-10-14 14:59:52 +02:00
ines	e85e1d571b	Update base punctuation	2017-10-14 14:59:23 +02:00
ines	9d6c8eaa49	Update base norm exceptions with more unicode characters e.g. unicode variations of punctuation used in Chinese	2017-10-14 14:58:52 +02:00
ines	3516aa0cea	Port over changes from #1389	2017-10-14 13:32:55 +02:00
ines	cd6a29dce7	Port over changes from #1294	2017-10-14 13:28:46 +02:00
ines	38c756fd85	Port over changes from #1287	2017-10-14 13:16:21 +02:00
ines	612224c10d	Port over changes from #1157	2017-10-14 13:11:39 +02:00
ines	9b3f8f9ec3	Fix formatting and add comment on languages	2017-10-14 13:11:18 +02:00
ines	a4d974d97b	Port over URL pattern changes from #1411	2017-10-14 12:58:07 +02:00
ines	09aed58140	Port over changes from #1333 and add comments	2017-10-14 12:52:59 +02:00

8809 Commits