spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-10-04 02:46:40 +03:00

Author	SHA1	Message	Date
Bri-Will	d77361d76c	Update lex_attrs.py. Fix like_url from matching on e-mail	2017-12-11 14:13:28 -08:00
Søren Lind Kristiansen	5a9d377580	Remove abbreviation for positional plac argument	2017-12-11 11:08:29 +01:00
Isaac Sijaranamual	38021fbb00	Switch from python 3 only TemporaryDirectory to pytest's tmpdir	2017-12-11 00:16:04 +01:00
Isaac Sijaranamual	20ae0c459a	Fixes "Error saving model" #1622	2017-12-10 23:07:13 +01:00
Isaac Sijaranamual	568130ce7c	Adds regression test_issue1622	2017-12-10 23:00:48 +01:00
Isaac Sijaranamual	e188b61960	Make cli/train.py not eat exception	2017-12-10 22:53:08 +01:00
ines	020a7e5d52	Allow 'fine_grained' option in displaCy (see #1703). Shows token.tag_ instead of token.pos_. Disabled by default, to not cause rendering issues for models with long fine-grained tags (e.g. merged morphological features).	2017-12-09 15:11:12 +01:00
Matthew Honnibal	3b17eb7c49	Merge branch 'master' of https://github.com/explosion/spaCy	2017-12-07 10:39:32 +01:00
Matthew Honnibal	a6b43729c6	Set version to v2.0.5	2017-12-07 10:39:14 +01:00
ines	5eaa61c2b8	Fix formatting	2017-12-07 10:23:09 +01:00
ines	24e80c51b8	Document init-model command	2017-12-07 10:14:37 +01:00
Matthew Honnibal	c91f451b0f	Fix imports and CLI in init-model	2017-12-07 10:03:07 +01:00
ines	82e80ff928	Rename model command to init_model and fix formatting	2017-12-07 09:59:23 +01:00
Ines Montani	2feeb428d6	Merge pull request #1646 from GreenRiverRUS/master Added model command to create models from raw data	2017-12-07 08:54:26 +00:00
Matthew Honnibal	6373d2580d	Increment version to v2.0.5.dev0	2017-12-07 09:53:59 +01:00
Matthew Honnibal	36b47e3fa6	Fix (and test) vector pickling	2017-12-07 09:53:30 +01:00
Matthew Honnibal	05f41ff587	Set version to 2.0.4	2017-12-06 13:24:02 +01:00
Matthew Honnibal	04c38f7e87	Merge branch 'master' of https://github.com/explosion/spaCy	2017-12-06 12:15:52 +01:00
Matthew Honnibal	361944e512	If no rules are set, lemmatize by lookup	2017-12-06 12:12:11 +01:00
Matthew Honnibal	2ab0f2d186	Merge pull request #1664 from jimregan/italian-lemmatizer BOM in Italian lemmatiser	2017-12-06 11:09:04 +01:00
Matthew Honnibal	3f247119d3	Merge pull request #1668 from sorenlind/da_morph Add more Danish morph rules and clean up existing ones	2017-12-06 11:08:09 +01:00
Matthew Honnibal	b712de774e	Fix vectors pickling	2017-12-05 12:45:24 +01:00
Matthew Honnibal	04650e38c7	Set version to 2.0.4.dev0	2017-12-05 10:52:31 +01:00
Matthew Honnibal	07acb43a85	Merge branch 'master' of https://github.com/explosion/spaCy	2017-12-04 14:42:52 +01:00
Thomas Werkmeister	94eac75b7c	fix setup.py spacy req string for packaging. Requirement should be `spacy>=2.0.2` instead of `spacy2.0.2`	2017-12-03 04:16:28 -06:00
ines	f2ea6d4713	Add Dutch example sentences (see #1107)	2017-12-01 23:36:05 +01:00
Canbey Bilgili	abe098b255	Adds Turkish Lemmatization	2017-12-01 17:04:32 +03:00
Søren Lind Kristiansen	d86b537a38	Enable morph rules for Danish	2017-11-30 15:58:02 +01:00
Søren Lind Kristiansen	13a988adc3	Remove 'Number[psor]'	2017-11-30 15:55:04 +01:00
Søren Lind Kristiansen	dd6fde18a9	Add more Danish morph rules and clean up existing ones	2017-11-30 11:17:19 +01:00
Vadim Mazaev	495eacf470	Merge branch 'model_command'	2017-11-30 12:30:26 +03:00
Vadim Mazaev	4ba7ddf651	Bugfixes	2017-11-30 12:29:38 +03:00
Jim O'Regan	a4ecdeadd4	aha	2017-11-29 23:43:25 +00:00
Jim O'Regan	2c7a9215d7	Merge branch 'master' into animacy	2017-11-29 23:31:12 +00:00
Jim O'Regan	c3e6cee17a	use inan in polimorf tagset conversion	2017-11-29 23:15:47 +00:00
Jim O'Regan	b32575e78c	imports	2017-11-29 23:03:41 +00:00
Jim O'Regan	3696ce6a7b	add UD mapping	2017-11-29 22:59:19 +00:00
Jim O'Regan	f8e7082fe4	typo in "inan", add "nhum"	2017-11-29 22:40:47 +00:00
Matthew Honnibal	6bc0f4d29f	Merge pull request #1611 from fsonntag/master Solving #1494	2017-11-29 23:11:23 +01:00
Matthew Honnibal	f9ed9ea529	Merge pull request #1624 from GreenRiverRUS/russian Add support for Russian	2017-11-29 23:10:01 +01:00
Jim O'Regan	076a6fc60a	symbols	2017-11-29 20:11:20 +00:00
Jim O'Regan	834ba3c69a	(semi generated) Polimorf mapping	2017-11-29 20:08:24 +00:00
Jim O'Regan	ba6a23fd11	BOM in Italian lemmatiser	2017-11-29 17:40:07 +00:00
ines	a31506e060	Fix off-by-one error in nlp.add_pipe(after=name) (fixes #1654)	2017-11-28 20:37:55 +01:00
ines	b62739fbfe	Add regression test for #1654	2017-11-28 20:27:54 +01:00
ines	2e50dbb9d7	Simplify test	2017-11-28 20:27:27 +01:00
Felix Sonntag	724ae7dc55	Fixed issue of infix capturing prefixes	2017-11-28 17:17:12 +01:00
Ines Montani	9052643e2c	Merge pull request #1653 from sorenlind/da_example_typo Fix typo	2017-11-27 14:47:42 +00:00
Søren Lind Kristiansen	5fe58b885b	Fix typo	2017-11-27 15:36:18 +01:00
Ines Montani	d52b1ab245	Add unicode_literals (hopefully fixes test failure on Python 2)	2017-11-27 15:16:54 +01:00
Søren Lind Kristiansen	0ffd27b0f6	Add several Danish alternative spellings	2017-11-27 13:35:41 +01:00
Ines Montani	6362024cf8	Merge pull request #1645 from GreenRiverRUS/fix_default_meta Fixed spaCy version string in default meta	2017-11-27 11:58:02 +00:00
Vadim Mazaev	c332ffdde1	Added model command to create model from raw data: words counts, brown clusters and vectors	2017-11-27 01:21:47 +03:00
Vadim Mazaev	59f03ab1d7	Fixed spacy version string in default meta	2017-11-26 23:02:07 +03:00
Vadim Mazaev	53e7c38637	Fixed tests depends on pymorphy2	2017-11-26 21:04:44 +03:00
Vadim Mazaev	cacd859dcd	Added tag map, fixed tests fails, added more exceptions	2017-11-26 20:54:48 +03:00
Ines Montani	a7bb8f1b42	Merge pull request #1637 from sorenlind/da_tokenization Improve Danish tokenization	2017-11-26 15:41:38 +00:00
ines	c699aec089	Add offsets_from_biluo_tags helper and tests (see #1626)	2017-11-26 16:38:01 +01:00
Søren Lind Kristiansen	ef03e9ea53	Remove unused import.	2017-11-25 13:04:02 +01:00
Søren Lind Kristiansen	6aa241bcec	Add day of month tokenizer exceptions for Danish.	2017-11-24 15:03:24 +01:00
Søren Lind Kristiansen	0c276ed020	Add weekday abbreviations and remove ambiguous month abbreviations for Danish.	2017-11-24 14:43:29 +01:00
Søren Lind Kristiansen	056547e989	Add multiple tokenizer exceptions for Danish.	2017-11-24 11:51:26 +01:00
Søren Lind Kristiansen	8dc265ac0c	Add test for tokenization of 'i.' for Danish.	2017-11-24 11:29:37 +01:00
Søren Lind Kristiansen	ac8116510d	Fix tokenization of 'i.' for Danish.	2017-11-24 11:16:53 +01:00
Matthew Honnibal	79f11d4f85	Pickle vectors with vocab	2017-11-23 17:19:50 +01:00
Matthew Honnibal	f29c3925ee	Fix more efficient nonproj	2017-11-23 12:48:00 +00:00
Matthew Honnibal	e10e9ad2c5	Improve efficiency of Doc.to_array	2017-11-23 12:33:27 +00:00
Matthew Honnibal	2acc907d55	Improve profiling	2017-11-23 12:33:03 +00:00
Matthew Honnibal	fa62427300	Remove lookup-based lemmatization	2017-11-23 12:32:22 +00:00
Matthew Honnibal	fb26b2cb12	Use lookup lemmatizer if lemma unset	2017-11-23 12:31:58 +00:00
Matthew Honnibal	db5c714ad2	Improve efficiency of deprojectivization	2017-11-23 12:31:34 +00:00
Matthew Honnibal	8fec7268eb	Move string cleanup under a setting flag	2017-11-23 12:19:18 +00:00
Matthew Honnibal	5949777b12	Fix misleading multi-threading docstring	2017-11-23 12:18:59 +00:00
Matthew Honnibal	542e6fd4ea	Don't remove entries from specials	2017-11-23 12:17:42 +00:00
Matthew Honnibal	30ba81f881	Merge pull request #1576 from ligser/master Actually reset caches in pipe [wip]	2017-11-23 12:54:48 +01:00
ines	c90fe92e15	Fix displaCy test	2017-11-22 05:04:39 +01:00
ines	a6f33ac27d	Fix displaCy test	2017-11-22 04:19:28 +01:00
ines	93b0be611a	Merge branch 'master' of https://github.com/explosion/spaCy	2017-11-22 00:28:55 +01:00
ines	60b4915569	Use .pos_ instead of .tags_ in displaCy by default (see #1006)	2017-11-22 00:28:52 +01:00
Vadim Mazaev	81314f8659	Fixed tokenizer: added char classes; added first lemmatizer and tokenizer tests	2017-11-21 22:23:59 +03:00
Vadim Mazaev	52ee1f9bf9	Updated Russian Language, added lemmatizer, norm exceptions and lex attrs	2017-11-21 11:44:46 +03:00
Burton DeWilde	a5c6869b2d	Fix bug where span.orth_ != span.text (see #1612)	2017-11-20 12:05:43 -06:00
Burton DeWilde	635792997c	Add regression test for #1612	2017-11-20 12:05:35 -06:00
ines	9a63e32f21	Add noqa to Python 2 compat variables of built-ins (see #1617)	2017-11-20 14:03:42 +01:00
ines	d70a64d78b	Fix syntax error and formatting in test (see #1617)	2017-11-20 14:01:25 +01:00
ines	17849dee4b	Fix French test (see #1617)	2017-11-20 13:59:59 +01:00
Felix Sonntag	33b0f86de3	Changed tokenizer to add infix when infix_start is offset	2017-11-19 16:32:10 +01:00
Felix Sonntag	8be3392302	Added regression test for 1494	2017-11-19 16:30:35 +01:00
Motoki Wu	a52e195a0a	Fixes Issue #1207 where `noun_chunks` of `Span` gives an error. Make sure to reference `self.doc` when getting the noun chunks. Same fix as `9750a0128c`	2017-11-17 17:16:20 -08:00
Motoki Wu	b818afaa0e	Added failing test for Issue #1207. The noun chunk iterator should work for `Doc` but not for `Span`.	2017-11-17 17:04:27 -08:00
Vadim Mazaev	a0739a06d4	Returned russian support from v1.10 branch	2017-11-17 17:06:15 +03:00
yuukos	7401152289	updated Russian tokenizer moved the trying to import pymorph into __init__	2017-11-17 17:04:50 +03:00
yuukos	3aad66cf00	added russian language support	2017-11-17 17:04:22 +03:00
ines	a3d4dd1a5d	Test adding of lots of pipeline components (see #1585). Just to make sure that there's no error now or in the future with adding a large number of pipeline components.	2017-11-15 17:28:06 +01:00
Roman Domrachev	61d28d03e4	Try again to do selective remove cache	2017-11-15 19:11:12 +03:00
Roman Domrachev	b3311100c7	Merge branch 'master' of github.com:explosion/spaCy	2017-11-15 18:30:04 +03:00
Matthew Honnibal	b60d92aca8	Increment version	2017-11-15 16:14:46 +01:00
Roman Domrachev	505c6a2f2f	Completely clean up tokenizer cache. Tokenizer cache can have different keys than string. That modification can slow down tokenizer and needs to be measured	2017-11-15 17:55:48 +03:00
Matthew Honnibal	cf0be62096	Increment version	2017-11-15 15:00:18 +01:00
ines	97a4f9362b	Merge branch 'master' of https://github.com/explosion/spaCy	2017-11-15 14:24:00 +01:00
ines	8e65247886	Fix lex.id if vectors is None	2017-11-15 14:23:58 +01:00
Matthew Honnibal	437ad1a852	Merge pull request #1570 from explosion/feature/fix-beam-leak Fix memory leak in beam parser	2017-11-15 14:15:05 +01:00
Matthew Honnibal	2f169fdb0a	Set lex ID correctly for new tokens in Vocab	2017-11-15 13:58:03 +01:00
Matthew Honnibal	fe3c42a06b	Fix caching in tokenizer	2017-11-15 13:55:46 +01:00
Matthew Honnibal	8d692771f6	Improve profiling	2017-11-15 13:51:25 +01:00
Matthew Honnibal	b797dca977	Merge branch 'master' of https://github.com/explosion/spaCy	2017-11-15 13:11:43 +01:00
ines	c9d72de0fb	Add dummy serialization methods for Japanese and missing lang getter (resolves #1557)	2017-11-15 12:44:02 +01:00
Matthew Honnibal	d274d3a3b9	Let beam forward use minibatches	2017-11-15 00:51:42 +01:00
Matthew Honnibal	855872f872	Remove state hashing	2017-11-14 23:36:46 +01:00
Roman Domrachev	3e21680814	Use safer method to get string without hit	2017-11-14 22:58:46 +03:00
Roman Domrachev	a33d5a068d	Try to hold origin data instead of restore it	2017-11-14 22:40:03 +03:00
Roman Domrachev	91e2fa6561	Clean all caches	2017-11-14 21:15:04 +03:00
Roman Domrachev	4e378dc4a4	Remove all obsolete code and test only initial problem	2017-11-14 20:45:04 +03:00
Roman	47ce2347b0	Create test that fails when actual cleanup caused	2017-11-14 20:28:13 +03:00
Roman	caae77f72d	Update strings.pyx	2017-11-14 19:44:40 +03:00
Roman Domrachev	3d247d2bb8	Get back previous testcase	2017-11-14 18:01:37 +03:00
Roman Domrachev	870defa815	Swap keys in proper place. Remove unnecessary clear of the hits	2017-11-14 17:56:30 +03:00
Roman Domrachev	86ca434c93	Merge github.com:explosion/spaCy	2017-11-14 17:46:22 +03:00
Roman Domrachev	a2745b0e84	StringStore now actually cleaned. Do not lose docs in ref tracking	2017-11-14 17:45:50 +03:00
Matthew Honnibal	2512ea9eeb	Fix memory leak in beam parser	2017-11-14 02:11:40 +01:00
Matthew Honnibal	86ddf692a1	Fix bug in limit calculation on dev data	2017-11-14 01:37:10 +01:00
Ines Montani	ea6c85c67a	Merge pull request #1566 from MathiasDesch/master (resolves #1248) Add exceptions to tokenizer and norm	2017-11-13 19:05:22 +01:00
Matthew Honnibal	1b348389bb	Merge branch 'master' of https://github.com/explosion/spaCy	2017-11-13 18:18:48 +01:00
Matthew Honnibal	ca73d0d8fe	Cleanup states after beam parsing, explicitly	2017-11-13 18:18:26 +01:00
Matthew Honnibal	63ef9a2e73	Remove __dealloc__ from ParserBeam	2017-11-13 18:18:08 +01:00
Mathias Deschamps	c0691b2ab4	Add tokenizer exceptions for ing verbs Extend list of tokenizing exceptions introduced in `123810b`	2017-11-13 17:46:05 +01:00
Mathias Deschamps	288298ead9	Add norm exception for ing verbs Some ing verbs are sometimes written in or in'. Make the NORM form correct	2017-11-13 17:46:05 +01:00
Abhinav Sharma	59f5740ede	improved upon the list of included stop_words	2017-11-13 17:13:49 +05:30
Matthew Honnibal	6e641f46d4	Create a preprocess function that gets bigrams	2017-11-12 00:43:41 +01:00
Matthew Honnibal	c9251d79e3	Edit comment	2017-11-11 18:38:32 +01:00
Matthew Honnibal	dd1678eab3	Edit comment	2017-11-11 18:37:08 +01:00
Roman Domrachev	ee60a52ee7	Fix test imports and last batch cleanup	2017-11-11 11:32:16 +03:00
Roman Domrachev	4a6b094e09	Remove unused import	2017-11-11 03:13:05 +03:00
Roman Domrachev	3c600adf23	Try to fix StringStore clean up (see #1506)	2017-11-11 03:11:27 +03:00
ines	ee97fd3cb4	Add regression test for #1547	2017-11-11 00:14:03 +01:00
ines	2df27db671	Add unicode declaration	2017-11-11 00:13:56 +01:00
ines	35653bef3a	Add missing import (fixes #1546 )	2017-11-10 19:05:18 +01:00
ines	4c5d2c80d5	Re-add python -m to commands, too brittle :( (see #1536)	2017-11-10 02:30:55 +01:00
ines	123810b6de	Add "lovin'" to tokenizer exceptions (see #1248)	2017-11-09 17:09:30 +01:00
ines	1c218397f6	Ensure path in Doc.to_disk/from_disk (resolves #1521). Also add Doc serialization tests with both Path and string path options	2017-11-09 02:29:03 +01:00
Matthew Honnibal	49fd5a646f	Set version for 2.0.2 release	2017-11-08 22:39:39 +01:00
Matthew Honnibal	fba2dbddf7	Increment version	2017-11-08 22:19:08 +01:00
Matthew Honnibal	a5ea0fdf5a	Fix #1518: vocab.vectors.resize() didn't work	2017-11-08 22:18:37 +01:00
Matthew Honnibal	de45702bbe	Strip dev suffixes from version for compatibility check	2017-11-08 18:40:21 +01:00
Matthew Honnibal	51639214a1	Merge branch 'master' of https://github.com/explosion/spaCy	2017-11-08 18:04:33 +01:00
Matthew Honnibal	a2f980de4e	Exclude .devN versioning from compatibility check	2017-11-08 18:03:52 +01:00
Daniel Hershcovich	d7ae54ff44	Fix typo in message	2017-11-08 16:06:28 +02:00
Matthew Honnibal	4194bc5744	Xfail flakey serialization test	2017-11-08 13:55:13 +01:00
Matthew Honnibal	d5537e5516	Work on Windows test failure	2017-11-08 13:25:18 +01:00
Matthew Honnibal	c27c82d5f9	Fix serialization	2017-11-08 13:08:48 +01:00
Matthew Honnibal	1d5599cd28	Fix dtype	2017-11-08 12:18:32 +01:00
Matthew Honnibal	fa7fdd0d9b	Merge branch 'master' of https://github.com/explosion/spaCy	2017-11-08 12:11:31 +01:00
Matthew Honnibal	072ff38a01	Try to fix python3.5 serialization	2017-11-08 12:10:49 +01:00
Ines Montani	3a0f34d567	Merge pull request #1509 from abhi18av/patch-1 Create examples.py for Hindi language	2017-11-08 11:37:19 +01:00
Ines Montani	42b241ccd0	Update language code in usage example in comment	2017-11-08 11:36:38 +01:00
Matthew Honnibal	e262e8d942	Increment version to v2.0.2.dev0	2017-11-08 11:25:47 +01:00
Matthew Honnibal	a8b592783b	Make a dtype more specific, to fix a windows build	2017-11-08 11:24:35 +01:00
Abhinav Sharma	84edade82d	Create examples.py Populated the file with the translations of English example sentences	2017-11-08 13:23:08 +05:30
Matthew Honnibal	d725aee4e2	Increment version to 2.0.1	2017-11-08 02:14:47 +01:00
Matthew Honnibal	8d6f68f1df	Increment version	2017-11-08 01:12:34 +01:00
ines	bcf42b8846	Fix typo	2017-11-08 01:06:37 +01:00
Matthew Honnibal	bbd2a3dee1	Fix title in about.py	2017-11-07 14:02:58 +01:00
Matthew Honnibal	4efaf9306c	Set version to spacy-nightly rc2	2017-11-07 13:27:26 +01:00
Matthew Honnibal	bf1ec2965f	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-11-07 13:20:29 +01:00
Matthew Honnibal	726f689da4	Fix missing import	2017-11-07 13:20:12 +01:00
ines	834f9c1aab	Update about.py	2017-11-07 13:11:33 +01:00
ines	a4662a31a9	Move model package templates to cli.package and update docs	2017-11-07 12:15:35 +01:00
ines	a09c096d3c	Get docs ready for v2.0.0	2017-11-07 12:00:43 +01:00
Matthew Honnibal	9a88e66103	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-11-07 02:00:06 +01:00
Matthew Honnibal	174abe4677	Increment to 2.0.0rc1	2017-11-07 01:59:46 +01:00
ines	42a0fbf291	Fix textcat simple train example	2017-11-07 01:25:54 +01:00
ines	8fb48b9b91	Update and document new util functions	2017-11-07 00:22:43 +01:00
Matthew Honnibal	1cab703bba	Move minibatch function to util	2017-11-06 23:45:36 +01:00
ines	5f43953536	Move test	2017-11-06 23:14:10 +01:00
Matthew Honnibal	dd90fe09f5	Remove extraneous label from textcat class	2017-11-06 22:09:02 +01:00
Matthew Honnibal	45e0617e61	Allow Language.update to take unicode text and dict objects	2017-11-06 22:07:38 +01:00
Matthew Honnibal	1831dbd065	Add test of simple textcat workflow	2017-11-06 22:04:29 +01:00
Matthew Honnibal	ffb9101f3f	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-11-06 19:20:41 +01:00
Matthew Honnibal	8fea512ac8	Don't set tensor in textcat	2017-11-06 19:20:14 +01:00
ines	acb9bdb852	Fix PRON_LEMMA imports	2017-11-06 17:41:53 +01:00
Matthew Honnibal	7d46793dd7	Add PRON_LEMMA to spacy.symbols	2017-11-06 17:38:25 +01:00
Matthew Honnibal	2f7e9f390d	Make test less flakey	2017-11-06 17:34:50 +01:00
Matthew Honnibal	407b08017e	Make test less flakey	2017-11-06 17:31:40 +01:00
Matthew Honnibal	102f797933	Fix lemma ordering in test	2017-11-06 17:02:17 +01:00
Matthew Honnibal	75e1618ec3	Fix lemma clobbering	2017-11-06 16:56:19 +01:00
Matthew Honnibal	6fdffd7246	Merge pull request #1497 from explosion/feature/improve-optimizer-handling 💫 Improve optimizer handling	2017-11-06 16:41:15 +01:00
Matthew Honnibal	8e6795437b	Set release=True	2017-11-06 16:39:32 +01:00
Matthew Honnibal	5c85bf3791	Fix missing import	2017-11-06 15:06:27 +01:00
Matthew Honnibal	25859dbb48	Return optimizer from begin_training, creating if necessary	2017-11-06 14:26:49 +01:00
Matthew Honnibal	465adfee94	Remove unused resume_training method, and pass optimizer through	2017-11-06 14:26:00 +01:00
Matthew Honnibal	13336a6197	Fix Adam import	2017-11-06 14:25:37 +01:00
Matthew Honnibal	2eb11d60f2	Add function create_default_optimizer to spacy._ml	2017-11-06 14:11:59 +01:00
Matthew Honnibal	31babe3c3f	Fix non-clobbering lemmatization	2017-11-06 12:36:05 +01:00
Matthew Honnibal	63c6ae4191	Fix lemmatizer test	2017-11-06 11:57:06 +01:00
Matthew Honnibal	a86a0181b5	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-11-05 22:19:10 +01:00
Matthew Honnibal	134d3b8143	Fix morphology	2017-11-05 22:18:22 +01:00
ines	08d1cf850a	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-11-05 21:41:58 +01:00
ines	baa231745c	Fix Dutch tag map	2017-11-05 21:41:50 +01:00
Matthew Honnibal	46e62ad747	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-11-05 19:40:00 +01:00
Matthew Honnibal	bb25cb0f76	Avoid clobbering preset lemmas	2017-11-05 19:39:38 +01:00
ines	507ecb67af	Fix Spanish tag map	2017-11-05 19:23:34 +01:00
Matthew Honnibal	320008352b	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-11-05 18:46:15 +01:00
Matthew Honnibal	38109a0e4a	Register SentenceSegmenter in Language.factories	2017-11-05 18:45:57 +01:00
ines	975e1042ff	Fix Italian tag map	2017-11-05 18:34:09 +01:00
ines	6b2d6e4937	Fix Portuguese tag map	2017-11-05 18:31:00 +01:00
ines	fa2687fded	Fix Dutch tag map	2017-11-05 17:57:59 +01:00
ines	fb8990d916	Fix Spanish tag map	2017-11-05 17:48:46 +01:00
ines	9d13288f73	Fix French tag map	2017-11-05 17:47:59 +01:00
ines	54579805c5	Fix French tag map	2017-11-05 17:44:05 +01:00
Matthew Honnibal	2b35bb76ad	Fix tensorizer on GPU	2017-11-05 15:34:40 +01:00
Matthew Honnibal	6e5181bbaa	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-11-05 15:33:56 +01:00
Matthew Honnibal	6f438b17c1	Increment version to v2.0.0a19	2017-11-05 14:43:36 +01:00
Matthew Honnibal	225cc249c9	Pass string path to numpy, to fix #1479	2017-11-05 14:42:46 +01:00
Matthew Honnibal	00435d8f0c	Add extra beam parsing test	2017-11-05 14:39:57 +01:00
Matthew Honnibal	e777ea25bb	Merge pull request #1492 from uwol/develop TextCategorizer return parameter fix	2017-11-05 14:13:04 +01:00
Matthew Honnibal	0d4bd6414e	Fix Italian tag map	2017-11-05 14:11:03 +01:00
ines	ef597622a6	Add Portuguese tag map	2017-11-05 13:58:34 +01:00
ines	793c62dfda	Add Dutch tag map	2017-11-05 13:48:07 +01:00
ines	f7485a09c8	Fix Italian tag map	2017-11-05 13:12:58 +01:00
uwol	a2162b8908	tensorizer return parameter fix	2017-11-05 12:25:10 +01:00
ines	0a27afbf86	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-11-04 23:32:52 +01:00
ines	3cef901834	Add tag map for French and Italian	2017-11-04 23:32:51 +01:00
Matthew Honnibal	cfb83c231c	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-11-04 23:08:19 +01:00
Matthew Honnibal	d185927998	Undo harmful pickling hacks on Language class	2017-11-04 23:07:03 +01:00
ines	6c15aafebd	Fix formatting	2017-11-04 23:07:02 +01:00
Matthew Honnibal	3ca16ddbd4	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-11-04 00:25:02 +01:00
Matthew Honnibal	e4ec4be948	Fix parser test	2017-11-04 00:23:45 +01:00
Matthew Honnibal	98c29b7912	Add padding vector in parser, to make gradient more correct	2017-11-04 00:23:23 +01:00
ines	5e7d98f72a	Remove test for #1491	2017-11-03 22:10:57 +01:00
ines	718f1c50fb	Add regression test for #1491	2017-11-03 21:11:20 +01:00
Matthew Honnibal	144a93c2a5	Back-off to tensor for similarity if no vectors	2017-11-03 20:56:33 +01:00
Matthew Honnibal	1e9634691a	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-11-03 20:21:15 +01:00
Matthew Honnibal	13c8881d2f	Expose parser's tok2vec model component	2017-11-03 20:20:59 +01:00
Matthew Honnibal	17c63906f9	Update tensorizer component	2017-11-03 20:20:26 +01:00
Matthew Honnibal	2bf21cbe29	Update model after optimising it instead of waiting	2017-11-03 20:20:01 +01:00
Matthew Honnibal	d6e831bf89	Fix lemmatizer tests	2017-11-03 19:46:34 +01:00
ines	eef930c73e	Assert instead of print	2017-11-03 18:50:57 +01:00
ines	f0986df94b	Add test for #1488 (passes on v2.0.0a18?)	2017-11-03 14:44:36 +01:00
Matthew Honnibal	711278b667	Make test less flakey	2017-11-03 14:36:08 +01:00
Matthew Honnibal	7fea845374	Remove print statement	2017-11-03 14:04:51 +01:00
Matthew Honnibal	0a534ae96a	Fix test for backprop d_pad	2017-11-03 14:04:16 +01:00
Matthew Honnibal	33bd2428db	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-11-03 13:29:56 +01:00
Matthew Honnibal	6681058abd	Fix tensor extending in tagger	2017-11-03 13:29:36 +01:00
Matthew Honnibal	bd2cbdfa85	Make Morphology not fail on unknown tags	2017-11-03 13:29:09 +01:00
Matthew Honnibal	c9b118a7e9	Set softmax attr in tagger model	2017-11-03 11:22:01 +01:00
Matthew Honnibal	a5b05f85f0	Set Doc.tensor attribute in parser	2017-11-03 11:21:00 +01:00
Matthew Honnibal	62ed58935a	Add Doc.extend_tensor() method	2017-11-03 11:20:31 +01:00
Matthew Honnibal	d6fc39c8a6	Set Doc.tensor from Tagger	2017-11-03 11:20:05 +01:00
Matthew Honnibal	b3264aa5f0	Expose the softmax layer in the tagger model, to allow setting tensors	2017-11-03 11:19:51 +01:00
Matthew Honnibal	c2bbf076a4	Add document length cap for training	2017-11-03 01:54:54 +01:00

4799 Commits