spaCy

mirror of https://github.com/explosion/spaCy.git synced 2024-12-27 10:26:35 +03:00

Author	SHA1	Message	Date
Anto Binish Kaspar	8f5b60c168	Fix Language.from_disk overwriting the meta.json file.	2017-10-17 17:15:32 +05:30
ines	8ca344712d	Add Language.has_pipe method	2017-10-17 11:20:07 +02:00
ines	485c4f6df5	Add Hungarian examples (see #1107)	2017-10-17 02:37:45 +02:00
Matthew Honnibal	19531bad4c	Merge branch 'develop' into feature/streaming-data-memory-growth	2017-10-16 21:44:11 +02:00
Matthew Honnibal	df488274b1	Fix deserialization of vectors	2017-10-16 20:55:00 +02:00
Matthew Honnibal	4018486d31	Merge remote-tracking branch 'origin/develop' into feature/streaming-data-memory-growth	2017-10-16 20:49:48 +02:00
Matthew Honnibal	4174477161	Fix equality check in test	2017-10-16 19:50:35 +02:00
Matthew Honnibal	2bc06e4b22	Bump rolling buffer size to 10k	2017-10-16 19:38:29 +02:00
Matthew Honnibal	66e2eb8f39	Clean up remnant of frozen in StringStore	2017-10-16 19:34:41 +02:00
Matthew Honnibal	a002264fec	Remove caching of Token in Doc, as it caused a cycle.	2017-10-16 19:34:21 +02:00
Matthew Honnibal	3e037054c8	Remove obsolete is_frozen functionality from StringStore	2017-10-16 19:23:10 +02:00
Matthew Honnibal	5c14f3f033	Create a rolling buffer for the StringStore in Language.pipe()	2017-10-16 19:22:40 +02:00
Matthew Honnibal	59c216196c	Allow weakrefs on Doc objects	2017-10-16 19:22:11 +02:00
ines	d5418553eb	Fix whitespace	2017-10-16 18:30:04 +02:00
ines	6ceadcdb5c	Make sure from_disk passes string to numpy (see #1421). If path is a WindowsPath, numpy does not recognise it as a path and, as a result, doesn't open the file. https://github.com/numpy/numpy/blob/master/numpy/lib/npyio.py#L369	2017-10-16 18:29:56 +02:00
Matthew Honnibal	010a7309ff	Merge pull request #1402 from explosion/feature/fix-matcher-operators 💫 Fix Matcher variable-length operators	2017-10-16 17:53:19 +02:00
Matthew Honnibal	c29927d2e7	Fix matcher test	2017-10-16 17:22:18 +02:00
Vishnu Kumar Nekkanti	d3c54cf39a	fixed SyntaxError while checking for jieba	2017-10-16 18:51:33 +05:30
Matthew Honnibal	a928ae2f35	Merge branch 'develop' into feature/fix-matcher-operators	2017-10-16 13:38:36 +02:00
Matthew Honnibal	56aa42cc5d	Fix and document matcher operator 'shadowing' behaviour	2017-10-16 13:38:20 +02:00
Matthew Honnibal	748d525801	Add more matcher operator tests	2017-10-16 13:38:01 +02:00
Matthew Honnibal	0433181658	Document operator semantics in Matcher docstring	2017-10-16 12:06:33 +02:00
ines	266e7180a7	Add Language class, stop words and basic stemmer that sets NORM	2017-10-14 14:59:52 +02:00
ines	e85e1d571b	Update base punctuation	2017-10-14 14:59:23 +02:00
ines	9d6c8eaa49	Update base norm exceptions with more unicode characters e.g. unicode variations of punctuation used in Chinese	2017-10-14 14:58:52 +02:00
ines	3516aa0cea	Port over changes from #1389	2017-10-14 13:32:55 +02:00
ines	cd6a29dce7	Port over changes from #1294	2017-10-14 13:28:46 +02:00
ines	38c756fd85	Port over changes from #1287	2017-10-14 13:16:21 +02:00
ines	612224c10d	Port over changes from #1157	2017-10-14 13:11:39 +02:00
ines	9b3f8f9ec3	Fix formatting and add comment on languages	2017-10-14 13:11:18 +02:00
ines	a4d974d97b	Port over URL pattern changes from #1411	2017-10-14 12:58:07 +02:00
ines	09aed58140	Port over changes from #1333 and add comments	2017-10-14 12:52:59 +02:00
Matthew Honnibal	cf6da9301a	Update lemmatizer test	2017-10-12 22:50:52 +02:00
Matthew Honnibal	9b90d235d1	Fix tag check in lemmatizer	2017-10-12 22:50:43 +02:00
Matthew Honnibal	dc01acd821	Escape encoding in validate function	2017-10-12 22:23:21 +02:00
Matthew Honnibal	27b927259a	Add locale_escape compat function	2017-10-12 22:22:04 +02:00
ines	9c6de3dcfa	Merge branch 'develop' into feature/cli-validate	2017-10-12 21:44:28 +02:00
Matthew Honnibal	462caf835a	Fix SBD test	2017-10-12 21:18:22 +02:00
ines	fff1028391	Add validate CLI command	2017-10-12 20:05:06 +02:00
Matthew Honnibal	908f44c3fe	Disable history features by default	2017-10-12 14:56:11 +02:00
Matthew Honnibal	a955843684	Increase default number of epochs	2017-10-12 13:13:01 +02:00
Matthew Honnibal	cecfcc7711	Set default hyper params back to 'slow' settings	2017-10-12 13:12:26 +02:00
Ines Montani	37aa523a8e	Merge pull request #1408 from explosion/feature/dot-underscore 💫 Custom attributes via Doc._, Token._ and Span._	2017-10-11 18:35:56 +02:00
ines	8ce6f96180	Don't make copies of language data components	2017-10-11 15:34:55 +02:00
ines	51519251c2	Fix underscore method test	2017-10-11 13:34:19 +02:00
ines	c6ae49e8bf	Fix formatting	2017-10-11 13:34:11 +02:00
ines	453c47ca24	Add German lemmatizer tests	2017-10-11 13:27:26 +02:00
ines	15fe0fd82d	Fix tests	2017-10-11 13:27:18 +02:00
ines	6dd14dc342	Add lookup lemmas to tokens without POS tags	2017-10-11 13:27:10 +02:00
ines	9620c1a640	Add lemma_lookup to Language defaults	2017-10-11 13:26:05 +02:00
ines	9fd471372a	Add lookup lemmatizer to lemmatizer as lookup() method	2017-10-11 13:25:51 +02:00
ines	e0ff145a8b	Merge branch 'develop' into feature/dot-underscore	2017-10-11 11:57:05 +02:00
ines	c1d6d43c83	Merge branch 'develop' into feature/lemmatizer	2017-10-11 11:56:35 +02:00
Matthew Honnibal	17c467e0ab	Avoid clobbering existing lemmas	2017-10-11 03:33:06 -05:00
Matthew Honnibal	807e109f2b	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-10-11 02:47:59 -05:00
Matthew Honnibal	6e552c9d83	Prune number of non-projective labels more aggressively	2017-10-11 02:46:44 -05:00
Matthew Honnibal	76fe24f44d	Improve embedding defaults	2017-10-11 09:44:17 +02:00
Matthew Honnibal	188f620046	Improve parser defaults	2017-10-11 09:43:48 +02:00
Matthew Honnibal	acba2e1051	Fix metadata in training	2017-10-11 08:55:52 +02:00
Matthew Honnibal	74c2c6a58c	Add default name and lang to meta	2017-10-11 08:49:12 +02:00
Matthew Honnibal	3814a161e6	Avoid clobbering preset lemmas	2017-10-11 08:41:03 +02:00
Matthew Honnibal	fd47f8e89f	Fix failing test	2017-10-11 08:38:34 +02:00
Matthew Honnibal	462b2e26b4	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-10-11 08:23:04 +02:00
Matthew Honnibal	a6ac4699eb	Allow Morphology class to set up tokens. Add Morphology.assign_untagged() C-method, and call it from Doc.push_back() when a token is created. This gives a place to allow the Morphology class to initialize token data.	2017-10-11 03:24:14 +02:00
Matthew Honnibal	3b527fa52b	Call morphology.assign_untagged when pushing token to Doc	2017-10-11 03:23:57 +02:00
Matthew Honnibal	c15d8278cb	Avoid lemmatizing inappropriate tags in English lemmatizer	2017-10-11 03:23:23 +02:00
Matthew Honnibal	d528b6e36d	Add assign_untagged method in Morphology	2017-10-11 03:22:49 +02:00
Matthew Honnibal	2c118ab3a6	Add tests for Doc creation	2017-10-11 03:21:23 +02:00
ines	820bf85075	Move LookupLemmatizer to spacy.lemmatizer	2017-10-11 02:25:13 +02:00
ines	417d45f5d0	Add lemmatizer data as variable on language data. Don't create lookup lemmatizer within Language class and just pass in the data so it can be set on Token creation.	2017-10-11 02:24:58 +02:00
ines	0c2343d73a	Tidy up language data	2017-10-11 02:22:49 +02:00
Matthew Honnibal	d84136b4a9	Update add label test	2017-10-10 22:57:41 +02:00
Matthew Honnibal	3065f12ef2	Make add parser label work for hidden_depth=0	2017-10-10 22:57:31 +02:00
ines	bfd58dd0fc	Merge branch 'develop' into feature/dot-underscore	2017-10-10 22:03:51 +02:00
Matthew Honnibal	73bca3d382	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-10-10 12:51:37 -05:00
Matthew Honnibal	5156074df1	Make loading code more consistent in train command	2017-10-10 12:51:20 -05:00
Matthew Honnibal	d70fba6807	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-10-10 19:33:10 +02:00
Matthew Honnibal	8143618497	Set prefix length back to 1	2017-10-10 19:32:54 +02:00
Matthew Honnibal	97c9b5db8b	Patch spacy.train for new pipeline management	2017-10-09 23:41:16 -05:00
Matthew Honnibal	a635240398	Add conll_ner2json converter	2017-10-09 22:03:26 -05:00
Matthew Honnibal	e0a9b02b67	Merge Span._ and Span.as_doc methods	2017-10-09 22:00:15 -05:00
Matthew Honnibal	dce8afb9cf	Set prefix length to 3	2017-10-09 21:55:55 -05:00
Matthew Honnibal	8265b90c83	Update parser defaults	2017-10-09 21:55:20 -05:00
Matthew Honnibal	dd2b0601d1	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-10-09 21:30:46 -05:00
Matthew Honnibal	09d61ada5e	Merge pull request #1396 from explosion/feature/pipeline-management 💫 Improve pipeline and factory management	2017-10-10 04:29:54 +02:00
ines	67350fa496	Use better logic for auto-generating component name. Instances don't have __name__, so we try __class__.__name__ as well, before giving up and defaulting to repr(component).	2017-10-10 04:23:05 +02:00
ines	3fc4fe61d2	Fix typo	2017-10-10 04:15:14 +02:00
ines	59c4f27499	Add get, set and has methods to Underscore	2017-10-10 04:14:35 +02:00
Matthew Honnibal	19136fd155	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-10-10 03:58:30 +02:00
Matthew Honnibal	8978212ee5	Patch serialization bug raised in #1105	2017-10-10 03:58:12 +02:00
Matthew Honnibal	f0f2739ae3	Add test for serialization issue raised in #1105	2017-10-10 03:57:58 +02:00
Matthew Honnibal	735d18654d	Add NER converter for CoNLL 2003 data	2017-10-09 20:06:28 -05:00
Matthew Honnibal	51d18937af	Partially apply doc/span/token into method. We want methods to act like they're "bound" to the object, so that you can make your method conditional on the `doc`, `span` or `token` instance, like, well, a method. We therefore partially apply the function, which works like this: ``` def partial(unbound_method, constant_arg): def bound_method(*args, **kwargs): return unbound_method(constant_arg, *args, **kwargs) return bound_method ```	2017-10-10 02:21:28 +02:00
Matthew Honnibal	808d8740d6	Remove print statement	2017-10-09 08:45:20 -05:00
Matthew Honnibal	0f41b25f60	Add speed benchmarks to metadata	2017-10-09 08:05:37 -05:00
ines	de374dc72a	Merge branch 'feature/pipeline-management' into feature/dot-underscore	2017-10-09 14:37:51 +02:00
Matthew Honnibal	2534cd57d7	Add bandaid solution to the 'shadowing' problem in #864	2017-10-09 08:59:35 +02:00
Matthew Honnibal	d8a2506023	Merge pull request #1401 from explosion/feature/add-parser-action 💫 Allow labels to be added to pre-trained parser and NER models	2017-10-09 04:57:51 +02:00
Matthew Honnibal	689349e32f	Merge pull request #1400 from explosion/feature/sentence-parsing 💫 Force parser to respect preset sentence boundaries	2017-10-09 04:31:43 +02:00
Matthew Honnibal	e79fc41ff8	Merge pull request #1391 from explosion/feature/multilabel-textcat 💫 Fix multi-label support for text classification	2017-10-09 04:22:31 +02:00
Matthew Honnibal	fad2b8315f	Merge branch 'develop' into feature/add-parser-action	2017-10-09 04:13:04 +02:00
Matthew Honnibal	6c79841c0d	Fix tests for history features	2017-10-09 04:12:24 +02:00
Matthew Honnibal	dde87e6b0d	Add tests for adding parser actions	2017-10-09 03:42:35 +02:00
Matthew Honnibal	b2b8506f2c	Remove whitespace	2017-10-09 03:35:57 +02:00
Matthew Honnibal	d43a83e37a	Allow parser.add_label for pretrained models	2017-10-09 03:35:40 +02:00
Matthew Honnibal	81a64119db	Fix string-to-unicode problem	2017-10-09 00:59:49 +02:00
Matthew Honnibal	02c2af7119	Fix test	2017-10-09 00:29:37 +02:00
Matthew Honnibal	4cc84b0234	Prohibit Break when sent_start < 0	2017-10-09 00:02:45 +02:00
Matthew Honnibal	5a67efeccc	Add tests for sentence segmentation presetting	2017-10-09 00:02:23 +02:00
Matthew Honnibal	e938bce320	Adjust parsing transition system to allow preset sentence segments.	2017-10-08 23:53:34 +02:00
Matthew Honnibal	080afd4924	Add ternary value setting to Token.sent_start	2017-10-08 23:51:58 +02:00
Matthew Honnibal	7ae67ec6a1	Add Span.as_doc method	2017-10-08 23:50:20 +02:00
Matthew Honnibal	20309fb9db	Make history features default to zero	2017-10-08 20:32:14 +02:00
Matthew Honnibal	e74c8d2fad	Merge remote-tracking branch 'origin/develop' into feature/sentence-parsing	2017-10-08 20:20:41 +02:00
Matthew Honnibal	18063803de	Make TokenC.sent_start an int, to allow ternary value	2017-10-08 19:58:54 +02:00
Matthew Honnibal	be4f0b6460	Update defaults	2017-10-08 02:08:12 -05:00
Matthew Honnibal	42b401d08b	Change default hidden depth to 1	2017-10-07 21:05:21 -05:00
Matthew Honnibal	9d66a915da	Update training defaults	2017-10-07 21:02:38 -05:00
Matthew Honnibal	d163115e91	Add non-linearity after history features	2017-10-07 21:00:43 -05:00
Matthew Honnibal	92c5d78b42	Unhack NER.add_action	2017-10-07 19:02:40 +02:00
Matthew Honnibal	f2b590f672	Increment version	2017-10-07 19:01:01 +02:00
Matthew Honnibal	9bd8191739	Add tests for Underscore	2017-10-07 18:56:19 +02:00
Matthew Honnibal	668a0ea640	Pass extensions into Underscore class	2017-10-07 18:56:01 +02:00
Matthew Honnibal	1289129fd9	Add Underscore class	2017-10-07 18:00:14 +02:00
Matthew Honnibal	eb0595bea9	Merge pull request #1392 from explosion/feature/parser-history-model 💫 Parser history features	2017-10-07 15:07:02 +02:00
Matthew Honnibal	3d22ccf495	Update default hyper-parameters	2017-10-07 07:16:41 -05:00
Matthew Honnibal	09442d25ec	Merge remote-tracking branch 'origin/develop' into feature/parser-history-model	2017-10-07 07:05:04 -05:00
Matthew Honnibal	3b67eabfea	Allow empty dictionaries to match any token in Matcher. Often patterns need to match "any token". A clean way to denote this is with the empty dict {}: this sets no constraints on the token, so should always match. The problem was that having attributes length==0 was used as an end-of-array signal, so the matcher didn't handle this case correctly. This patch compiles empty token spec dicts into a constraint NULL_ATTR==0. The NULL_ATTR attribute, 0, is always set to 0 on the lexeme, so this always matches.	2017-10-07 03:36:15 +02:00
ines	0adadcb3f0	Fix beam parse model test	2017-10-07 02:15:15 +02:00
ines	b38a8f4a94	Fix and update pipe methods tests	2017-10-07 02:06:23 +02:00
Matthew Honnibal	0384f08218	Trigger nonproj.deprojectivize as a postprocess	2017-10-07 02:00:47 +02:00
Matthew Honnibal	3a65a0c970	Start adding tests for new pipeline management	2017-10-07 01:48:23 +02:00
ines	e43530269c	Update docstrings	2017-10-07 01:04:50 +02:00
ines	61a503a611	Fix parser test	2017-10-07 00:38:51 +02:00
ines	b39409173e	Add disable option and True/False/None values for pipeline	2017-10-07 00:29:08 +02:00
ines	2586b61b15	Fix formatting, tidy up and remove unused imports	2017-10-07 00:26:05 +02:00
ines	212c8f0711	Implement new Language methods and pipeline API	2017-10-07 00:25:54 +02:00
Matthew Honnibal	8be46d766e	Remove print statement	2017-10-06 16:19:02 -05:00
Matthew Honnibal	8e731009fe	Fix parser config serialization	2017-10-06 13:50:52 -05:00
Matthew Honnibal	f4c9a98166	Fix spacy evaluate command on non-GPU	2017-10-06 13:17:47 -05:00
Matthew Honnibal	16ba6aa8a6	Fix parser config serialization	2017-10-06 13:17:31 -05:00
Matthew Honnibal	c66399d8ae	Fix depth definition with history features	2017-10-06 06:20:05 -05:00
Matthew Honnibal	5c750a9c2f	Reserve 0 for 'missing' in history features	2017-10-06 06:10:13 -05:00
Matthew Honnibal	fbba7c517e	Pass dropout through to embed tables	2017-10-06 06:09:18 -05:00
Matthew Honnibal	21d11936fe	Fix significant train/test skew error in history feats	2017-10-06 06:08:50 -05:00
Matthew Honnibal	555d8c8bff	Fix beam history features	2017-10-05 22:21:50 -05:00
Matthew Honnibal	3db0a32fd6	Fix dropout for history features	2017-10-05 22:21:30 -05:00
Matthew Honnibal	b0618def8d	Add support for 2-token state option	2017-10-05 21:54:12 -05:00
Matthew Honnibal	363aa47b40	Clean up dead parsing code	2017-10-05 21:53:49 -05:00
Matthew Honnibal	ca12764772	Enable history features for beam parser	2017-10-05 21:53:29 -05:00
Matthew Honnibal	fc06b0a333	Fix training when hist_size==0	2017-10-05 21:52:28 -05:00
Matthew Honnibal	e25ffcb11f	Move history size under feature flags	2017-10-05 19:38:13 -05:00
Matthew Honnibal	563f46f026	Fix multi-label support for text classification. The TextCategorizer class is supposed to support multi-label text classification, and allow training data to contain missing values. For this to work, the gradient of the loss should be 0 when labels are missing. Instead, there was no way to actually denote "missing" in the GoldParse class, and so the TextCategorizer class treated the label set within gold.cats as complete. To fix this, we change GoldParse.cats to be a dict instead of a list. The GoldParse.cats dict should map to floats, with 1. denoting 'present' and 0. denoting 'absent'. Gradients are zeroed for categories absent from the gold.cats dict. A nice bonus is that you can also set values between 0 and 1 for partial membership. You can also set numeric values, if you're using a text classification model that uses an appropriate loss function. Unfortunately this is a breaking change, although the functionality was only recently introduced and hasn't been properly documented yet. I've updated the example script accordingly.	2017-10-05 18:43:02 -05:00
Matthew Honnibal	c6cd81f192	Wrap try/except around model saving	2017-10-05 08:14:24 -05:00
Matthew Honnibal	5743b06e36	Wrap model saving in try/except	2017-10-05 08:12:50 -05:00
Matthew Honnibal	fd4baff475	Update tests	2017-10-05 08:12:27 -05:00
Matthew Honnibal	dcdfa071aa	Disable LayerNorm hack	2017-10-04 20:06:52 -05:00
Matthew Honnibal	943af4423a	Make depth setting in parser work again	2017-10-04 20:06:05 -05:00
Matthew Honnibal	bfabc333be	Merge remote-tracking branch 'origin/develop' into feature/parser-history-model	2017-10-04 20:00:36 -05:00
Matthew Honnibal	92066b04d6	Fix Embed and HistoryFeatures	2017-10-04 19:55:34 -05:00
Matthew Honnibal	d903986439	Increment version	2017-10-04 17:14:26 +02:00
Matthew Honnibal	40edb65ee7	Make test work for Python 2.7	2017-10-04 16:36:50 +02:00
Matthew Honnibal	bd8e84998a	Add nO attribute to TextCategorizer model	2017-10-04 16:07:30 +02:00
Matthew Honnibal	f8a0614527	Improve textcat model slightly	2017-10-04 15:15:53 +02:00
Matthew Honnibal	39798b0172	Uncomment layernorm adjustment hack	2017-10-04 15:12:09 +02:00
Matthew Honnibal	b3a7082bf8	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-10-04 14:56:46 +02:00
Matthew Honnibal	db05d4d582	Add test for #1380. Passes without fix?	2017-10-04 14:56:31 +02:00
Matthew Honnibal	774f5732bd	Fix dimensionality of textcat when no vectors available	2017-10-04 14:55:15 +02:00
Ines Montani	28ba0b9b51	Merge pull request #1385 from explosion/feature/new-website 💫 New spaCy website	2017-10-04 14:35:52 +02:00
Matthew Honnibal	af75b74208	Unset LayerNorm backwards compat hack	2017-10-03 20:47:10 -05:00
ines	73ac0aa0b5	Update spacy evaluate and add displaCy option	2017-10-04 00:03:15 +02:00
Matthew Honnibal	246612cb53	Merge remote-tracking branch 'origin/develop' into feature/parser-history-model	2017-10-03 16:56:42 -05:00
Matthew Honnibal	f24c2e3a8a	Fix evaluate for non-GPU	2017-10-03 22:47:31 +02:00
Matthew Honnibal	5cbefcba17	Set backwards compatibility flag	2017-10-03 20:29:58 +02:00
Matthew Honnibal	5454b20cd7	Update thinc imports for 6.9	2017-10-03 20:07:17 +02:00
Matthew Honnibal	4a59f6358c	Fix thinc imports	2017-10-03 19:21:26 +02:00
Matthew Honnibal	e514d6aa0a	Import thinc modules more explicitly, to avoid cycles	2017-10-03 18:49:25 +02:00
Matthew Honnibal	338e1fda0e	Unbreak merge artefact	2017-10-03 09:41:05 -05:00
Matthew Honnibal	1289187279	Fix circular import	2017-10-03 09:33:21 -05:00
Matthew Honnibal	a44c4c3a5b	Add timer to evaluate	2017-10-03 09:15:35 -05:00
Matthew Honnibal	96da86b3e5	Add support for verbose flag to Language	2017-10-03 09:14:57 -05:00
Matthew Honnibal	02586a5243	Add timing to spacy evaluate command	2017-10-03 09:14:34 -05:00
ines	e49cd7aeaf	Move import into load to avoid circular imports	2017-10-03 15:22:19 +02:00
ines	b0dfa059db	Update docs link in about.py	2017-10-03 15:19:55 +02:00
Matthew Honnibal	dc3c791947	Fix history size option	2017-10-03 13:41:23 +02:00
Matthew Honnibal	278a4c17c6	Fix history features	2017-10-03 13:27:10 +02:00
Matthew Honnibal	b770f4e108	Fix embed class in history features	2017-10-03 13:26:55 +02:00
Matthew Honnibal	b50a359e11	Add support for history features in parsing models	2017-10-03 12:44:01 +02:00
Matthew Honnibal	ee41e4fea7	Support history features in stateclass	2017-10-03 12:43:48 +02:00
Matthew Honnibal	6aa6a5bc25	Add a layer type for history features	2017-10-03 12:43:09 +02:00
Matthew Honnibal	8902df44de	Fix component disabling during training	2017-10-02 21:07:23 +02:00
Matthew Honnibal	c617d288d8	Update pipeline component names in spaCy train	2017-10-02 17:20:19 +02:00
Matthew Honnibal	f942903429	Improve sentence merging in iob2json	2017-10-02 17:02:10 +02:00
Matthew Honnibal	31681d20e0	Fix concatenation in iob2json converter	2017-10-02 16:50:26 +02:00
Matthew Honnibal	4896ce3320	Remove misleading comment	2017-10-02 00:09:14 +02:00
Matthew Honnibal	d90cc917fa	Merge vectors.pyx doc strings	2017-10-01 17:05:54 -05:00
Matthew Honnibal	b2a8b9be77	Fix inconsistency of Vectors class API	2017-10-01 17:00:34 -05:00
Matthew Honnibal	e38089d598	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-10-01 22:10:54 +02:00
Matthew Honnibal	97c409b602	Add docstrings for spacy.vectors	2017-10-01 22:10:33 +02:00
ines	b776f48e58	Fix typo	2017-10-01 21:58:45 +02:00
Matthew Honnibal	94df115a81	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-10-01 14:06:23 -05:00
Matthew Honnibal	2cf0f4622f	Fix loading of models with pre-trained vectors	2017-10-01 14:05:32 -05:00
Matthew Honnibal	69c7c642c2	Add spacy evaluate	2017-10-01 14:05:04 -05:00
ines	8dbe49ecb8	Always compare lowercase package names. Otherwise, is_package will return False if model name contains uppercase characters. See this issue: https://support.prodi.gy/t/saving-a-trained-ner-model-as-a-loadable-module/46/6	2017-09-29 20:55:17 +02:00
ines	153c2589d4	Revert "Always compare lowercase package names". This reverts commit `7d77dc490f`.	2017-09-29 20:53:36 +02:00
ines	fd1a9225d8	Handle conversion of pipeline components correctly. Allow both comma and comma + whitespace as separators.	2017-09-29 20:52:56 +02:00
ines	7d77dc490f	Always compare lowercase package names. Otherwise, is_package will return False if model name contains uppercase characters. See this issue: https://support.prodi.gy/t/saving-a-trained-ner-model-as-a-loadable-module/46/6	2017-09-29 20:52:28 +02:00
Matthew Honnibal	cdb2d83e16	Pass dropout in parser	2017-09-28 18:47:13 -05:00
Matthew Honnibal	158e177cae	Fix default embed size	2017-09-28 08:25:23 -05:00
Matthew Honnibal	f6330d69e6	Default embed size to 7000	2017-09-28 08:07:41 -05:00
Matthew Honnibal	ac8481a7b0	Print NER loss	2017-09-28 08:05:31 -05:00
Matthew Honnibal	542ebfa498	Improve defaults	2017-09-27 18:54:37 -05:00
Matthew Honnibal	dcb86bdc43	Default batch size to 32	2017-09-27 11:48:19 -05:00
Matthew Honnibal	1a37a2c0a0	Update training defaults	2017-09-27 11:48:07 -05:00
Matthew Honnibal	13d7a97f3a	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-09-27 11:44:37 -05:00
Matthew Honnibal	66c388ee01	Remove unhelpful multitask objectives	2017-09-27 11:44:16 -05:00
Matthew Honnibal	983201a83a	Fix hard-coded vector width	2017-09-27 11:43:58 -05:00
Ines Montani	959c46eabe	Merge pull request #1365 from wannaphongcom/develop Add Thai language for spaCy v2	2017-09-26 23:43:05 +02:00
Matthew Honnibal	1ef4236f8e	Merge pull request #1343 from explosion/feature/phrasematcher Update PhraseMatcher for spaCy 2	2017-09-26 20:44:23 +02:00
Wannaphong Phatthiyaphaibun	7b5263ffa4	fix thai test	2017-09-26 23:54:15 +07:00
ines	1ff62eaee7	Fix option shortcut to avoid conflict	2017-09-26 17:59:34 +02:00
Wannaphong Phatthiyaphaibun	3d5046c499	fix import in th	2017-09-26 22:41:20 +07:00
ines	7fdfb78141	Add version option to cli.train	2017-09-26 17:34:52 +02:00
Wannaphong Phatthiyaphaibun	a63f790b8c	fix thai tag_map	2017-09-26 22:28:57 +07:00
Wannaphong Phatthiyaphaibun	2ea27d07f4	fix tokenizer_exceptions in thai	2017-09-26 22:14:47 +07:00
Matthew Honnibal	41cc5c4c17	Merge branch 'develop' into feature/phrasematcher	2017-09-26 09:59:17 -05:00
Matthew Honnibal	c2e2f81773	Merge pull request #1355 from explosion/feature/noshare Make pipeline components independent	2017-09-26 16:58:09 +02:00
Wannaphong Phatthiyaphaibun	a2bf4cc7bf	fix newline in file	2017-09-26 21:49:43 +07:00
ines	bb5c631402	Implement like_num getter for French (via #1161)	2017-09-26 16:47:45 +02:00
ines	15479b3bae	Add comment to like_num re: future work	2017-09-26 16:43:28 +02:00
ines	adda08fe14	Implement like_num getter for Dutch (via #1177)	2017-09-26 16:39:15 +02:00
ines	5ee10379db	Port over changes from #1340	2017-09-26 16:38:08 +02:00
Wannaphong Phatthiyaphaibun	5cba67146c	add thai in spacy2	2017-09-26 21:36:27 +07:00
ines	10d291f129	Port over change from #1351	2017-09-26 16:11:41 +02:00
Matthew Honnibal	3274b46a0d	Try to fix compile error on Windows	2017-09-26 09:05:53 -05:00
Matthew Honnibal	19c7c09bf7	Fix PhraseMatcher.__contains__	2017-09-26 08:35:53 -05:00
Matthew Honnibal	d02a41a8c9	Merge remote-tracking branch 'origin/develop' into feature/phrasematcher	2017-09-26 08:32:55 -05:00
Matthew Honnibal	698fc0d016	Remove merge artefact	2017-09-26 08:31:37 -05:00
Matthew Honnibal	defb68e94f	Update feature/noshare with recent develop changes	2017-09-26 08:15:14 -05:00
Matthew Honnibal	ca28590ddd	Use dep and ent multi-task objectives for parser	2017-09-26 08:13:52 -05:00
Matthew Honnibal	9bfd585a11	Fix parameter name in .pxd file	2017-09-26 07:28:50 -05:00
Matthew Honnibal	74f08e1ad5	Update test	2017-09-26 06:45:56 -05:00
Matthew Honnibal	5aaef3e7b8	Don't link vectors in vocab deserialize	2017-09-26 06:45:47 -05:00
Matthew Honnibal	18a27c7579	Fix typo in tensorizer serialization	2017-09-26 06:45:14 -05:00
Matthew Honnibal	5056743ad5	Fix parser serialization	2017-09-26 06:44:56 -05:00
Ines Montani	7123139b2b	Add __contains__ to PhraseMatcher	2017-09-26 13:13:27 +02:00
Ines Montani	50ad50f96a	Update matcher.pyx	2017-09-26 13:11:17 +02:00
Matthew Honnibal	e34e70673f	Allow tagger models to be built with pre-defined tok2vec layer	2017-09-26 05:51:52 -05:00
Matthew Honnibal	bf917225ab	Allow multi-task objectives during training	2017-09-26 05:42:52 -05:00
Matthew Honnibal	4ae9ea7684	Remove unused argument in Language	2017-09-26 05:41:35 -05:00
ines	edf7e4881d	Add meta.json option to cli.train and add relevant properties. Add accuracy scores to meta.json instead of accuracy.json and replace all relevant properties like lang, pipeline, spacy_version in existing meta.json. If not present, also add name and version placeholders to make it packageable.	2017-09-25 19:00:47 +02:00
ines	d2d35b63b7	Fix formatting	2017-09-25 18:37:13 +02:00
Matthew Honnibal	8eb0b7b779	Add docstrings for Pipe API	2017-09-25 16:22:07 +02:00
Matthew Honnibal	39f390dba7	Add docstrings for Pipe API	2017-09-25 16:20:49 +02:00
Matthew Honnibal	8716ffe57d	Serialize vocab last	2017-09-24 05:01:45 -05:00
Matthew Honnibal	72bbcc0871	Handle lemmatization for unknown string IDs	2017-09-24 05:01:31 -05:00
Matthew Honnibal	204b58c864	Fix evaluation during training	2017-09-24 05:01:03 -05:00
Matthew Honnibal	dc3a623d00	Remove unused update_shared argument	2017-09-24 05:00:37 -05:00
Matthew Honnibal	63bd87508d	Don't use iterated convolutions	2017-09-23 04:39:17 -05:00
Matthew Honnibal	5a7fd0fd36	Fix vector linkage	2017-09-22 20:11:52 -05:00
Matthew Honnibal	4348c479fc	Merge pre-trained vectors and noshare patches	2017-09-22 20:07:28 -05:00
Matthew Honnibal	7dc61b3f43	Whitespace	2017-09-22 20:00:50 -05:00
Matthew Honnibal	e93d43a43a	Fix training with preset vectors	2017-09-22 20:00:40 -05:00
Matthew Honnibal	0795857dcb	Fix beam parsing	2017-09-23 02:59:53 +02:00
Matthew Honnibal	4bd6a12b1f	Fix Tok2Vec	2017-09-23 02:58:54 +02:00
Matthew Honnibal	386c1a5bd8	Fix tagger training	2017-09-23 02:58:06 +02:00
Matthew Honnibal	a2357cce3f	Set random seed in train script	2017-09-23 02:57:31 +02:00
Matthew Honnibal	05596159bf	Fix serialization when pre-trained vectors	2017-09-22 15:33:27 -05:00
Matthew Honnibal	980fb6e854	Refactor Tok2Vec	2017-09-22 09:38:36 -05:00
Matthew Honnibal	d9124f1aa3	Add link_vectors_to_models function	2017-09-22 09:38:22 -05:00
Matthew Honnibal	a186596307	Add 'reapply' combinator, for iterated CNN	2017-09-22 09:37:03 -05:00
Matthew Honnibal	40a4873b70	Fix serialization of model options	2017-09-21 13:07:26 -05:00
Matthew Honnibal	0a9016cade	Fix serialization during training	2017-09-21 13:06:45 -05:00
Matthew Honnibal	20193371f5	Don't share CNN, to reduce complexities	2017-09-21 14:59:48 +02:00
Matthew Honnibal	1d73dec8b1	Refactor train script	2017-09-20 19:17:10 -05:00
Matthew Honnibal	ffda38356a	Add util function to enable GPU	2017-09-20 19:16:35 -05:00
Matthew Honnibal	24e85c2048	Pass values for CNN maxout pieces option	2017-09-20 19:16:12 -05:00
Matthew Honnibal	b832f89ff8	Add resume_training function	2017-09-20 19:15:20 -05:00
Matthew Honnibal	f5144f04be	Add argument for CNN maxout pieces	2017-09-20 19:14:41 -05:00
Matthew Honnibal	842e21de9f	Fix int type error for Python 2	2017-09-20 23:55:30 +02:00
Matthew Honnibal	0c93c73e49	Add __reduce__ method for PhraseMatcher	2017-09-20 22:26:40 +02:00
Matthew Honnibal	cc408fc189	Make PhraseMatcher API like Matcher API	2017-09-20 22:20:35 +02:00
Matthew Honnibal	43ad250dd5	Update matcher tests	2017-09-20 21:54:49 +02:00
Matthew Honnibal	828cc91545	Fix PhraseMatcher for spaCy 2	2017-09-20 21:54:31 +02:00
Matthew Honnibal	78301b2d29	Avoid comparison to None in Tok2Vec	2017-09-20 00:19:34 +02:00
Matthew Honnibal	b36a38f63d	Fix serialization of pretrained_dims property	2017-09-19 23:42:27 +02:00
Matthew Honnibal	2489dcaccf	Fix serialization of parser	2017-09-19 23:42:12 +02:00
Matthew Honnibal	40837b275d	Fix tensorizer with pretrained vectors	2017-09-18 18:05:38 -05:00
Matthew Honnibal	a0c4b33d03	Support resuming a model during spacy train	2017-09-18 18:04:47 -05:00
Matthew Honnibal	c858927271	Copy vectors to GPU on begin training	2017-09-18 18:04:16 -05:00
Matthew Honnibal	3fa76c17d1	Refactor Tok2Vec	2017-09-18 15:00:05 -05:00
Matthew Honnibal	217e7891cd	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-09-18 11:36:21 -05:00
Matthew Honnibal	7b3f391f80	Try dropping the Affine layer, conditionally	2017-09-18 11:35:59 -05:00
ines	2480f8f521	Add missing return in Doc.from_disk() (closes #1330)	2017-09-18 15:32:00 +02:00
Matthew Honnibal	2148ae605b	Don't use iterated convolutions	2017-09-17 17:36:04 -05:00
Matthew Honnibal	c013e5996f	Fix parser test	2017-09-17 13:13:20 -05:00
Matthew Honnibal	8f42f8d305	Remove unused 'preprocess' argument in Tok2Vec	2017-09-17 12:30:16 -05:00
Matthew Honnibal	039d609362	Remove hard-coded default vectors width	2017-09-17 12:29:39 -05:00
Matthew Honnibal	4f38a67a89	Make width default to 0 in vectors.pyx	2017-09-17 12:29:14 -05:00
Matthew Honnibal	16122f566e	Fix cpdef enum in attrs.pyx	2017-09-17 12:28:53 -05:00
Matthew Honnibal	b159e0eb50	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-09-17 05:47:50 -05:00
Matthew Honnibal	2b0efc77ae	Fix wiring of pre-trained vectors in parser loading	2017-09-17 05:47:34 -05:00
Matthew Honnibal	31c2e91c35	Fix wiring of pre-trained vectors in parser loading	2017-09-17 05:46:55 -05:00
Matthew Honnibal	8f913a74ca	Fix defaults and args to build_tagger_model	2017-09-17 05:46:36 -05:00
Matthew Honnibal	c003c561c3	Revert NER action loading change, for model compatibility	2017-09-17 05:46:03 -05:00
Matthew Honnibal	43210abacc	Resolve fine-tuning conflict	2017-09-17 05:30:04 -05:00
ines	ece30c28a8	Don't split hyphenated words in German. This way, the tokenizer matches the tokenization in German treebanks.	2017-09-16 20:40:15 +02:00
ines	68f66aebf8	Use pkg_resources instead of pip for is_package (resolves #1293 )	2017-09-16 20:27:59 +02:00
Matthew Honnibal	5ff2491f24	Pass option for pre-trained vectors in parser	2017-09-16 12:47:21 -05:00
Matthew Honnibal	8665a77f48	Fix feature error in NER	2017-09-16 12:46:57 -05:00
Matthew Honnibal	e37a50a436	Pass documents to tensorizer, not 'features'	2017-09-16 12:46:36 -05:00
Matthew Honnibal	84e637e2e6	Pass option for pretrained vectors in pipeline	2017-09-16 12:46:02 -05:00
Matthew Honnibal	2a93404da6	Support optional pre-trained vectors in tensorizer model	2017-09-16 12:45:37 -05:00
Matthew Honnibal	e0a2aa9289	Support having word vectors data on GPU	2017-09-16 12:45:09 -05:00
Matthew Honnibal	ebf8942564	Fix test for Python3	2017-09-16 16:22:38 +02:00
Matthew Honnibal	8c945310fb	Excuse emoji failure on narrow unicode builds	2017-09-16 16:21:13 +02:00
Matthew Honnibal	11f2a05ede	Fix code explosion from long enum in Python 3, Cython 0.24+	2017-09-16 12:20:04 +02:00
Matthew Honnibal	3fa5b40b5c	Add test for hash consistency	2017-09-16 11:21:35 +02:00
Matthew Honnibal	f730d07e4e	Fix prange error for Windows	2017-09-16 00:25:33 +02:00
Matthew Honnibal	4b2065430e	Merge branch 'feature/parser-history' into develop	2017-09-15 10:42:20 +02:00
Matthew Honnibal	2f08489694	Remove AddHistory layer -- didnt work as planned	2017-09-15 10:41:40 +02:00
Matthew Honnibal	8b481e0465	Remove redundant brackets	2017-09-15 10:38:08 +02:00
Matthew Honnibal	d84607f6bb	Vectorize update in AddHistory	2017-09-14 20:34:40 +02:00
Ines Montani	bd3da3d6fb	Port over change from #1323 and tidy up	2017-09-14 19:23:13 +02:00
Matthew Honnibal	18347ab69c	Implement AddHistory layer wrapper	2017-09-14 19:07:35 +02:00
Matthew Honnibal	d4ca6cef9e	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-09-14 17:00:07 +02:00
Matthew Honnibal	8c503487af	Fix lookup of missing NER actions	2017-09-14 16:59:45 +02:00
Matthew Honnibal	664c5af745	Revert padding in parser	2017-09-14 16:59:25 +02:00
Matthew Honnibal	8496d76224	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-09-14 09:21:20 -05:00
Matthew Honnibal	d1518027a9	Increment version	2017-09-14 16:18:46 +02:00
Matthew Honnibal	70da88a3a7	Update comment on Language.begin_training	2017-09-14 16:18:30 +02:00
Matthew Honnibal	c6395b057a	Improve parser feature extraction, for missing values	2017-09-14 16:18:02 +02:00
Matthew Honnibal	daf869ab3b	Fix add_action for NER, so labelled 'O' actions aren't added	2017-09-14 16:16:41 +02:00
Matthew Honnibal	9cb2aef587	Remove print statement	2017-09-14 13:38:28 +02:00
Matthew Honnibal	ba23d63c35	Fix minibatch function, for fixed batch size	2017-09-14 13:37:41 +02:00
Jim O'Regan	7de709483b	missed adding here	2017-09-11 10:51:21 +01:00
Jim O'Regan	b1b6123867	add ga_tokenizer	2017-09-11 10:31:41 +01:00
Jim O'Regan	9dfd301962	rearrange	2017-09-11 10:14:18 +01:00
Jim O'Regan	187be6d372	copy/paste error	2017-09-11 09:33:17 +01:00
Jim O'Regan	c283e9edfe	first stab at test	2017-09-11 08:57:48 +01:00
Jim O'Regan	1ee75ae337	Merge remote-tracking branch 'origin/develop' into develop-irish	2017-09-11 08:40:11 +01:00
Matthew Honnibal	456bb8a74c	Unxfail and close #1305	2017-09-06 19:14:17 +02:00
Matthew Honnibal	99e44fbdbb	Update regression test	2017-09-06 19:13:51 +02:00
Matthew Honnibal	5c3ff06924	Fix lemmatizer rules	2017-09-06 19:13:24 +02:00
Matthew Honnibal	dd9cab0faf	Fix type-check for int/long	2017-09-06 19:03:05 +02:00
Matthew Honnibal	497a9308a8	Xfail new lemmatizer test	2017-09-06 18:41:22 +02:00
Matthew Honnibal	dcbf866970	Merge parser changes	2017-09-06 18:41:05 +02:00
Matthew Honnibal	5384fff5ce	Add test for 1305: Incorrect lemmatization of VBZ for English	2017-09-06 18:40:18 +02:00
Matthew Honnibal	24ff6b0ad9	Fix parsing and tok2vec models	2017-09-06 05:50:58 -05:00
Matthew Honnibal	1b65115bc2	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-09-04 20:02:53 -05:00
Matthew Honnibal	33fa91feb7	Restore correctness of parser model	2017-09-04 21:19:30 +02:00
Matthew Honnibal	e88a42e460	Increment version	2017-09-04 21:14:39 +02:00
Matthew Honnibal	9d65d67985	Preserve model compatibility in parser, for now	2017-09-04 16:46:22 +02:00
Matthew Honnibal	d5fbf27335	Fix test	2017-09-04 16:45:11 +02:00
Matthew Honnibal	7fdafcc4c4	Fix config loading in tagger	2017-09-04 16:38:49 +02:00
Matthew Honnibal	058372d120	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-09-04 16:27:53 +02:00
Matthew Honnibal	16e25ce3b5	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-09-04 09:26:53 -05:00
Matthew Honnibal	9f512e657a	Fix drop_layer calculation	2017-09-04 09:26:38 -05:00
Matthew Honnibal	cb4839033c	Fix loader for EN tests	2017-09-04 15:19:18 +02:00
Matthew Honnibal	382ce566eb	Fix deserialization bug	2017-09-04 15:19:01 +02:00
Matthew Honnibal	bfddf50081	Fix #1296: Incorrect lemmatization of base form verbs	2017-09-04 15:18:41 +02:00
Matthew Honnibal	b29e6bff46	Improve lemmatization rule for am|VBP	2017-09-04 15:18:10 +02:00
Matthew Honnibal	644d6c9e1a	Improve lemmatization tests, re #1296	2017-09-04 15:17:44 +02:00
Matthew Honnibal	3cf3fa1704	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-09-02 12:46:11 -05:00
Matthew Honnibal	e920885676	Fix pickle during train	2017-09-02 12:46:01 -05:00
Matthew Honnibal	c0eaba8b28	Fix low-data textcat	2017-09-02 15:17:32 +02:00
Matthew Honnibal	9e378bdac5	Fix textcat serialization	2017-09-02 15:17:20 +02:00
Matthew Honnibal	e3ea6ee02b	Increment version	2017-09-02 15:17:01 +02:00
Matthew Honnibal	a3b69bcb3d	Add low_data mode in textcat	2017-09-02 14:56:30 +02:00
Matthew Honnibal	ead78c7b9b	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-09-02 12:55:25 +02:00
Matthew Honnibal	5e6a9e7dcc	Add rule-based SBD	2017-09-02 12:53:38 +02:00
Matthew Honnibal	a824cf8f9a	Adjust text classification model	2017-09-02 11:41:00 +02:00
Matthew Honnibal	ac040b99bb	Add support for pre-trained vectors in text classifier	2017-09-01 16:39:55 +02:00
Matthew Honnibal	7742a6d559	Add GloVe vectors reader	2017-09-01 16:39:22 +02:00
Matthew Honnibal	789e1a3980	Use 13 parser features, not 8	2017-08-31 14:13:00 -05:00
Matthew Honnibal	30e35d9666	Fix syntax error	2017-08-30 17:35:39 -05:00
Matthew Honnibal	4ceebde523	Fix gradient bug in parser	2017-08-30 17:32:56 -05:00
ines	173089a45a	Add more validation for model meta	2017-08-29 11:21:46 +02:00
Matthew Honnibal	2e28982e28	Merge pull request #1288 from geovedi/indonesian Indonesian language support	2017-08-26 21:31:13 +02:00
ines	7e04b7f89c	Fix info text on pipeline in package cli	2017-08-26 18:30:59 +02:00
ines	40afa13a8a	Increment version	2017-08-26 18:30:49 +02:00
Matthew Honnibal	876f38c548	Merge pull request #1279 from oroszgy/model_cli_v2 Added vector loading to model cli	2017-08-26 15:57:50 +02:00
Matthew Honnibal	cfc055734e	Split % in units, for compatibility with corpus	2017-08-25 20:03:37 -05:00
Matthew Honnibal	4bb6bc3f9e	Add support for sent_start to GoldParse	2017-08-25 20:03:14 -05:00
Matthew Honnibal	44589fb38c	Fix Break oracle	2017-08-25 19:50:55 -05:00
Matthew Honnibal	6d4e8e14ca	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-08-25 12:37:16 -05:00
Matthew Honnibal	4ce5531389	Use layer norm instead of batch norm	2017-08-25 12:37:10 -05:00
Matthew Honnibal	20dd66ddc2	Constrain sentence boundaries to IS_PUNCT and IS_SPACE tokens	2017-08-25 19:35:47 +02:00
Jim Geovedi	58d8078971	Merge remote-tracking branch 'upstream/develop' into indonesian	2017-08-25 09:21:49 +08:00
Matthew Honnibal	6ceb0f0518	Allow Lexeme.rank to be set	2017-08-24 21:43:00 +02:00
Matthew Honnibal	44a1fa80d3	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-08-23 13:02:16 +02:00
ines	bb1abbeba5	Only link model if download was successfull	2017-08-23 12:36:31 +02:00
Matthew Honnibal	bb2541ffd3	Fix PROB attr for OOV words	2017-08-23 12:11:52 +02:00
Matthew Honnibal	1c5c256e58	Fix fine_tune when optimizer is None	2017-08-23 10:51:33 +02:00
Matthew Honnibal	9c580ad28a	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-08-22 17:02:04 -05:00
Matthew Honnibal	a4633fff6f	Restore use of batch norm in model	2017-08-22 17:01:58 -05:00
Matthew Honnibal	03b5b9727a	Fix Doc.vector for empty doc objects	2017-08-22 19:52:19 +02:00
Matthew Honnibal	0551b7b03a	Fix doc.vector	2017-08-22 19:46:52 +02:00
Matthew Honnibal	83f8e98450	Fix retrieval of OOV vectors	2017-08-22 19:46:35 +02:00
Matthew Honnibal	df2745eb08	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-08-22 19:00:43 +02:00

4434 Commits