commit b6ebedd09c
Author: Jeffrey Gerard
Date:   2017-09-25 13:13:25 -07:00

    Document Tokenizer(token_match) and clarify tokenizer_pseudo_code

    Closes #835. In `tokenizer_pseudo_code` I put the `special_cases`
    kwarg before `find_prefix` because this now matches the order the
    args are used in the pseudocode, and it also matches spaCy's
    actual code.
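For context, the `tokenizer_pseudo_code` these commits refine can be sketched as a minimal runnable version. This is a simplified sketch, not spaCy's real implementation: the `find_prefix`/`find_suffix` helpers below are toy stand-ins, and infix splitting and `token_match` are omitted.

```python
def tokenizer_pseudo_code(text, special_cases, find_prefix, find_suffix):
    """Simplified sketch of spaCy's tokenization loop: split on
    whitespace, then repeatedly check special cases and peel
    prefixes/suffixes off each chunk."""
    tokens = []
    for substring in text.split():
        suffixes = []
        while substring:
            if substring in special_cases:
                # Special cases (e.g. contractions) expand to fixed splits.
                tokens.extend(special_cases[substring])
                substring = ""
            elif find_prefix(substring):
                n = find_prefix(substring)
                tokens.append(substring[:n])
                substring = substring[n:]
            elif find_suffix(substring):
                n = find_suffix(substring)
                # Suffixes are peeled off the right end, so they come off
                # in reverse; collect them and reverse at the end so the
                # returned tokens match input order.
                suffixes.append(substring[-n:])
                substring = substring[:-n]
            else:
                tokens.append(substring)
                substring = ""
        tokens.extend(reversed(suffixes))
    return tokens


# Toy rules: an opening paren/quote is a one-char prefix,
# closing punctuation is a one-char suffix.
special_cases = {"don't": ["do", "n't"]}
find_prefix = lambda s: 1 if s[0] in "(\"'" else 0
find_suffix = lambda s: 1 if s[-1] in ")\"'!.," else 0

print(tokenizer_pseudo_code("(don't!)", special_cases, find_prefix, find_suffix))
# → ['(', 'do', "n't", '!', ')']
```

Note that `special_cases` precedes `find_prefix` in the signature, matching the order the loop consults them, and that the collected suffixes are reversed before being appended.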
commit 54855f0eee
Author: Yam
Date:   2017-09-22 12:15:48 +08:00

    Update customizing-tokenizer.jade
commit 6f450306c3
Author: Yam
Date:   2017-09-22 10:53:22 +08:00

    Update customizing-tokenizer.jade

    Update some code examples:
    - `me` -> `-PRON`
    - `TAG` -> `POS`
    - `create_tokenizer` function
commit d3b03f0544
Author: Delirious Lettuce
Date:   2017-08-06 21:31:39 -06:00

    Fix typos:

    * `auxillary` -> `auxiliary`
    * `consistute` -> `constitute`
    * `earlist` -> `earliest`
    * `prefered` -> `preferred`
    * `direcory` -> `directory`
    * `reuseable` -> `reusable`
    * `idiosyncracies` -> `idiosyncrasies`
    * `enviroment` -> `environment`
    * `unecessary` -> `unnecessary`
    * `yesteday` -> `yesterday`
    * `resouces` -> `resources`
commit e4a45ae55f
Author: Bart Broere
Date:   2017-06-12 12:28:51 +02:00

    Very minor documentation fix
commit af3d121ec9
Author: Yuval Pinter
Date:   2017-05-22 10:56:03 -04:00

    extend suffixes from first to last

    Reverse the suffix list in `tokenizer_pseudo_code()` so the
    order of the returned tokens matches the input order.
commit 7ec710af0e
Author: Kevin Gao
Date:   2017-01-17 10:38:14 -08:00

    Fix Custom Tokenizer docs

    - Fix mismatched quotations
    - Make it clearer where the ORTH, LEMMA, and POS symbols come from
    - Make strings consistent
    - Fix lemma_ assertion: s/-PRON-/me/
commit ce8bf08223
Author: Ines Montani
Date:   2016-12-18 17:40:20 +01:00

    Fix formatting
commit c20abc8a6d
Author: Ines Montani
Date:   2016-11-05 20:40:11 +01:00

    Add customizing tokenizer and training workflow