spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-08-26 23:14:55 +03:00

Author	SHA1	Message	Date
Jim Geovedi	37fa2c8c80	punctuation rules	2017-07-24 06:17:18 +07:00
Jim Geovedi	082e94ac1c	added infix rules	2017-07-24 06:17:07 +07:00
Jim Geovedi	d0ec484725	reverted	2017-07-24 06:16:29 +07:00
Jim Geovedi	0e590c711f	added prefix & suffix rules	2017-07-23 23:46:40 +07:00
Jim Geovedi	ba922e30e8	added ampere hour unit	2017-07-23 23:46:18 +07:00
Jim Geovedi	3b17eba27b	added frequency units	2017-07-23 23:10:52 +07:00
Jim Geovedi	d5fd32a572	added known currencies	2017-07-23 22:56:48 +07:00
Jim Geovedi	f6f15678fb	added lex_attrs	2017-07-23 22:55:22 +07:00
Jim Geovedi	bed8162d00	added tokenizer_exceptions	2017-07-23 22:55:05 +07:00
Jim Geovedi	b80c35bc9a	added norm_exceptions	2017-07-23 22:54:49 +07:00
Jim Geovedi	b5de329ea3	added norm_exceptions	2017-07-23 22:54:19 +07:00
Jim Geovedi	082e9ade46	fixed typo	2017-07-23 21:30:34 +07:00
Jim Geovedi	e2efeb186e	added stopwords	2017-07-23 20:52:37 +07:00
Jim Geovedi	da98676839	use template	2017-07-23 20:51:31 +07:00
Jim Geovedi	c2b4dd7809	start working on Indonesian language	2017-07-23 20:50:56 +07:00
Matthew Honnibal	5771bd1ff8	Increment version	2017-07-23 14:18:38 +02:00
Matthew Honnibal	c4a81a47a4	Fix deserialization	2017-07-23 14:11:07 +02:00
Matthew Honnibal	2df563ad24	Remove optimization for textcat that caused loading problem	2017-07-23 14:10:51 +02:00
Matthew Honnibal	4fe77bced2	Add cfg attr to pipeline components	2017-07-23 00:52:47 +02:00
Matthew Honnibal	d8aa721664	Compute Language.meta with a property	2017-07-23 00:50:18 +02:00
Matthew Honnibal	54a539a113	Finish text classifier example	2017-07-23 00:34:12 +02:00
Matthew Honnibal	a88a7deffe	Fix save/load of textcat config	2017-07-23 00:33:43 +02:00
Matthew Honnibal	c27fdaef6f	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-07-22 20:15:55 +02:00
Matthew Honnibal	2bc7d87c70	Add example for training text classifier	2017-07-22 20:15:32 +02:00
Matthew Honnibal	9bae0ddc50	Fix minibatching	2017-07-22 20:14:49 +02:00
Matthew Honnibal	ded0df5e2f	Expose hyper-param as keyword arg	2017-07-22 20:14:37 +02:00
Matthew Honnibal	f5de8deeec	Increment version	2017-07-22 20:04:53 +02:00
Matthew Honnibal	b55714d5d1	Make gold_tuples arg optional in begin_training	2017-07-22 20:04:43 +02:00
Matthew Honnibal	ed6c85fa3c	Fix loading of text categories in GoldParse	2017-07-22 20:04:03 +02:00
Matthew Honnibal	6ffec9dfea	Update _ml, for textcat model	2017-07-22 20:03:40 +02:00
ines	864cefd3b2	Update README.rst	2017-07-22 18:29:55 +02:00
ines	e349271506	Increment version	2017-07-22 18:29:30 +02:00
ines	ab8ffbaab7	Add text classification to v2 overview	2017-07-22 17:56:51 +02:00
ines	f085b88f9d	Add TextCategorizer API docs stub	2017-07-22 17:56:33 +02:00
ines	ab1a4e8b3c	Add Tensorizer API docs stub	2017-07-22 17:56:25 +02:00
ines	0fb89dd204	Add text classification usage guide template	2017-07-22 17:56:07 +02:00
ines	d05ab1b3a0	Add text classification to 101 overview and change order	2017-07-22 17:55:53 +02:00
ines	d2a7e5b8e5	Add GoldParse.cats attribute	2017-07-22 17:55:35 +02:00
ines	23d976ed00	Add Doc.cats attribute and missing v2 tag	2017-07-22 17:55:14 +02:00
Ines Montani	570964e67f	Update README.rst	2017-07-22 16:20:19 +02:00
Matthew Honnibal	5494605689	Fiddle with regex pin	2017-07-22 16:09:50 +02:00
Matthew Honnibal	78fcf56dd5	Update version pin for regex library	2017-07-22 15:57:58 +02:00
Matthew Honnibal	d51d55bba6	Increment version	2017-07-22 15:43:16 +02:00
Matthew Honnibal	8ccf154413	Merge branch 'master' of https://github.com/explosion/spaCy	2017-07-22 15:42:44 +02:00
Matthew Honnibal	796b2f4c1b	Remove print statements in tests	2017-07-22 15:42:38 +02:00
ines	7c4bf9994d	Add note on requirements and preventing model re-downloads (closes #1143)	2017-07-22 15:40:12 +02:00
ines	de25bad036	Use lower min version for requests dependency (fixes #1137). Ensure compatibility with docker-compose and other packages	2017-07-22 15:29:10 +02:00
ines	d7560047c5	Fix version	2017-07-22 15:24:33 +02:00
Matthew Honnibal	af945ea8e2	Merge branch 'master' of https://github.com/explosion/spaCy	2017-07-22 15:09:59 +02:00
Matthew Honnibal	4b2e5e59ed	Add flush_cache method to tokenizer, to fix #1061. The tokenizer caches output for common chunks, for efficiency. This cache must be invalidated when the tokenizer rules change, e.g. when a new special-case rule is introduced. That's what was causing #1061. When the cache is flushed, we free the intermediate token chunks. I think this is safe, but if we start getting segfaults, this patch is to blame. The resolution would be to simply not free those bits of memory. They'll be freed when the tokenizer exits anyway.	2017-07-22 15:06:50 +02:00

7669 Commits