spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-08-23 05:24:56 +03:00

Author	SHA1	Message	Date
Matthew Honnibal	9bd8191739	Add tests for Underscore	2017-10-07 18:56:19 +02:00
Matthew Honnibal	668a0ea640	Pass extensions into Underscore class	2017-10-07 18:56:01 +02:00
Matthew Honnibal	1289129fd9	Add Underscore class	2017-10-07 18:00:14 +02:00
ines	ca6769fd48	Update spacy functions and remove removed set_factory	2017-10-07 15:28:01 +02:00
ines	743d1df1fe	Update pipelines docs and add user hooks to custom components	2017-10-07 15:27:28 +02:00
Matthew Honnibal	eb0595bea9	Merge pull request #1392 from explosion/feature/parser-history-model 💫 Parser history features	2017-10-07 15:07:02 +02:00
ines	d70cf19158	Fix formatting	2017-10-07 15:06:38 +02:00
Ines Montani	36c68015f3	Merge pull request #1397 from explosion/feature/matcher-wildcard-token 💫 Allow empty dictionaries to match any token in Matcher	2017-10-07 15:05:24 +02:00
ines	c970b4f226	Add missing token attribute	2017-10-07 15:04:16 +02:00
ines	37f755897f	Update rule-based matching docs	2017-10-07 15:04:09 +02:00
Matthew Honnibal	3d22ccf495	Update default hyper-parameters	2017-10-07 07:16:41 -05:00
Matthew Honnibal	e22067e3b5	Document new hyper-parameters	2017-10-07 07:10:10 -05:00
ines	feaf353051	Update processing pipelines usage docs	2017-10-07 14:05:59 +02:00
Matthew Honnibal	09442d25ec	Merge remote-tracking branch 'origin/develop' into feature/parser-history-model	2017-10-07 07:05:04 -05:00
ines	58dfde7c02	Remove redundant deprecation note	2017-10-07 04:54:57 +02:00
Matthew Honnibal	3b67eabfea	Allow empty dictionaries to match any token in Matcher. Often patterns need to match "any token". A clean way to denote this is with the empty dict {}: this sets no constraints on the token, so should always match. The problem was that having attributes length==0 was used as an end-of-array signal, so the matcher didn't handle this case correctly. This patch compiles empty token spec dicts into a constraint NULL_ATTR==0. The NULL_ATTR attribute, 0, is always set to 0 on the lexeme -- so this always matches.	2017-10-07 03:36:15 +02:00
ines	ed8e0085b0	Update docs for spacy.load()	2017-10-07 03:06:55 +02:00
ines	e370332fb1	Update Language API docs	2017-10-07 03:00:20 +02:00
ines	0adadcb3f0	Fix beam parse model test	2017-10-07 02:15:15 +02:00
ines	b38a8f4a94	Fix and update pipe methods tests	2017-10-07 02:06:23 +02:00
Matthew Honnibal	0384f08218	Trigger nonproj.deprojectivize as a postprocess	2017-10-07 02:00:47 +02:00
Matthew Honnibal	3a65a0c970	Start adding tests for new pipeline management	2017-10-07 01:48:23 +02:00
ines	e43530269c	Update docstrings	2017-10-07 01:04:50 +02:00
ines	61a503a611	Fix parser test	2017-10-07 00:38:51 +02:00
ines	b39409173e	Add disable option and True/False/None values for pipeline	2017-10-07 00:29:08 +02:00
ines	2586b61b15	Fix formatting, tidy up and remove unused imports	2017-10-07 00:26:05 +02:00
ines	212c8f0711	Implement new Language methods and pipeline API	2017-10-07 00:25:54 +02:00
Matthew Honnibal	8be46d766e	Remove print statement	2017-10-06 16:19:02 -05:00
ines	3468d535ad	Update model benchmarks	2017-10-06 21:39:06 +02:00
Matthew Honnibal	8e731009fe	Fix parser config serialization	2017-10-06 13:50:52 -05:00
Matthew Honnibal	f4c9a98166	Fix spacy evaluate command on non-GPU	2017-10-06 13:17:47 -05:00
Matthew Honnibal	16ba6aa8a6	Fix parser config serialization	2017-10-06 13:17:31 -05:00
ines	96a4e79d13	Fix PhraseMatcher example	2017-10-06 18:22:10 +02:00
Ines Montani	d33899b60b	Merge pull request #1393 from yuukos/patch-1 Update adding-languages.jade	2017-10-06 18:03:31 +02:00
Ines Montani	e89689a31d	Update CONTRIBUTORS.md	2017-10-06 18:02:40 +02:00
Matthew Honnibal	c66399d8ae	Fix depth definition with history features	2017-10-06 06:20:05 -05:00
Matthew Honnibal	5c750a9c2f	Reserve 0 for 'missing' in history features	2017-10-06 06:10:13 -05:00
Matthew Honnibal	fbba7c517e	Pass dropout through to embed tables	2017-10-06 06:09:18 -05:00
Matthew Honnibal	21d11936fe	Fix significant train/test skew error in history feats	2017-10-06 06:08:50 -05:00
Alex	763b54cbc3	Update adding-languages.jade Fixed misspellings	2017-10-06 16:30:44 +07:00
Matthew Honnibal	555d8c8bff	Fix beam history features	2017-10-05 22:21:50 -05:00
Matthew Honnibal	3db0a32fd6	Fix dropout for history features	2017-10-05 22:21:30 -05:00
Matthew Honnibal	b0618def8d	Add support for 2-token state option	2017-10-05 21:54:12 -05:00
Matthew Honnibal	363aa47b40	Clean up dead parsing code	2017-10-05 21:53:49 -05:00
Matthew Honnibal	ca12764772	Enable history features for beam parser	2017-10-05 21:53:29 -05:00
Matthew Honnibal	fc06b0a333	Fix training when hist_size==0	2017-10-05 21:52:28 -05:00
Matthew Honnibal	0e1adacaff	Merge pull request #1390 from mdcclv/contributor-mdcclv Contributor agreement for Orion Montoya @mdcclv	2017-10-06 02:39:08 +02:00
Matthew Honnibal	e25ffcb11f	Move history size under feature flags	2017-10-05 19:38:13 -05:00
Matthew Honnibal	563f46f026	Fix multi-label support for text classification. The TextCategorizer class is supposed to support multi-label text classification, and allow training data to contain missing values. For this to work, the gradient of the loss should be 0 when labels are missing. Instead, there was no way to actually denote "missing" in the GoldParse class, and so the TextCategorizer class treated the label set within gold.cats as complete. To fix this, we change GoldParse.cats to be a dict instead of a list. The GoldParse.cats dict should map to floats, with 1. denoting 'present' and 0. denoting 'absent'. Gradients are zeroed for categories absent from the gold.cats dict. A nice bonus is that you can also set values between 0 and 1 for partial membership. You can also set numeric values, if you're using a text classification model that uses an appropriate loss function. Unfortunately this is a breaking change, although the functionality was only recently introduced and hasn't been properly documented yet. I've updated the example script accordingly.	2017-10-05 18:43:02 -05:00
Orion Montoya	e04e11070f	Contributor agreement for Orion Montoya @mdcclv	2017-10-05 17:45:45 -04:00
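
Commit 3b67eabfea above describes compiling an empty token spec into a NULL_ATTR==0 constraint so that a zero-length constraint list can keep serving as the end-of-pattern signal. The sketch below is a toy pure-Python model of that idea, not spaCy's actual Matcher code: the helper names (compile_pattern, token_matches, find_matches), the LOWER attribute ID, and the dict-based token representation are all assumptions made for illustration.

```python
# Hypothetical attribute IDs for this toy model (not spaCy's real values,
# apart from NULL_ATTR being 0 as stated in the commit message).
NULL_ATTR = 0
LOWER = 1


def compile_pattern(token_specs):
    """Compile a list of {attr: value} dicts into constraint lists.

    A zero-length constraint list is reserved as the end-of-pattern marker,
    so an empty spec {} is rewritten to the always-true NULL_ATTR == 0.
    """
    compiled = []
    for spec in token_specs:
        constraints = list(spec.items())
        if not constraints:
            constraints = [(NULL_ATTR, 0)]
        compiled.append(constraints)
    compiled.append([])  # end-of-pattern marker
    return compiled


def token_matches(token, constraints):
    """A token is a dict of attr -> value; unset attributes read as 0."""
    return all(token.get(attr, 0) == value for attr, value in constraints)


def find_matches(tokens, compiled):
    """Return (start, end) spans where every constraint list matches."""
    length = 0
    while compiled[length]:  # walk until the zero-length end marker
        length += 1
    spans = []
    for start in range(len(tokens) - length + 1):
        window = tokens[start:start + length]
        if all(token_matches(t, c) for t, c in zip(window, compiled)):
            spans.append((start, start + length))
    return spans


# "hello", then any token, then "world"
pattern = compile_pattern([{LOWER: "hello"}, {}, {LOWER: "world"}])
tokens = [{LOWER: w} for w in "hello , world hello big world".split()]
matches = find_matches(tokens, pattern)
print(matches)  # [(0, 3), (3, 6)]
```

Without the rewrite in compile_pattern, the {} in the middle of the pattern would compile to an empty list and the length scan would stop after the first token, which is the failure the commit message describes.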


11031 Commits
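
Commit 563f46f026 in the log above changes GoldParse.cats to a dict of floats and zeroes the gradient for labels that are not in the dict. The sketch below illustrates only that masking rule in plain Python. It assumes a squared-error-style loss whose gradient is score minus target, and the function name textcat_gradient and the label names are hypothetical; it is not the TextCategorizer implementation.

```python
def textcat_gradient(scores, gold_cats):
    """Per-label gradient for a multi-label classifier with missing values.

    scores:    dict of label -> predicted score in [0, 1]
    gold_cats: dict of label -> target float (1.0 applies, 0.0 does not
               apply, values in between for partial membership).
               Labels missing from the dict are treated as unknown.
    Assumes a squared-error-style loss, so the gradient is score - target.
    """
    gradient = {}
    for label, score in scores.items():
        if label in gold_cats:
            gradient[label] = score - gold_cats[label]
        else:
            # Unknown label: contribute no gradient at all.
            gradient[label] = 0.0
    return gradient


scores = {"SPORTS": 0.75, "POLITICS": 0.25, "TECH": 0.5}
gold_cats = {"SPORTS": 1.0, "POLITICS": 0.0}  # TECH is unannotated
grad = textcat_gradient(scores, gold_cats)
print(grad)  # {'SPORTS': -0.25, 'POLITICS': 0.25, 'TECH': 0.0}
```

Note the two different senses of "absent" in the commit message: a value of 0.0 says the label does not apply and still produces a gradient, while a label left out of the dict produces none.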