spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-07-10 00:02:19 +03:00

Author	SHA1	Message	Date
Matthew Honnibal	4f19fe0f3a	Add Makefile	2018-06-02 17:10:15 +02:00
Matthew Honnibal	5d281cf302	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2018-05-22 20:50:59 +02:00
Matthew Honnibal	ce458c2428	Fix spacy requirement constraint in package template	2018-05-22 20:50:46 +02:00
Ines Montani	862da5e793	Support pipeline factories via entry points (#2348)	2018-05-22 18:29:45 +02:00
Matthew Honnibal	94ad2d66b6	Require thinc 6.11.2	2018-05-21 19:26:28 +02:00
Matthew Honnibal	d5af38f80c	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2018-05-21 17:42:55 +02:00
Matthew Honnibal	ee33de8652	Fix unpickling of NER parser	2018-05-21 17:42:40 +02:00
ines	f9dbcac8e4	Merge branch 'master' into develop	2018-05-21 02:29:29 +02:00
cclauss	f7dcaa1f6b	Simplify is_config() and normalize_string_keys() (#2305) * Simplify is_config() and normalize_string_keys() * Use __in__ to avoid the nested _ands_ and _ors_. * Dict comprehension directly tracks with the doc string * Keep more basic loop in normalize_string_keys * Whitespace	2018-05-21 01:54:35 +02:00
Ines Montani	cae4457c38	💫 Add .similarity warnings for no vectors and option to exclude warnings (#2197) * Add logic to filter out warning IDs via environment variable Usage: SPACY_WARNING_EXCLUDE=W001,W007 * Add warnings for empty vectors * Add warning if no word vectors are used in .similarity methods For example, if only tensors are available in small models – should hopefully clear up some confusion around this * Capture warnings in tests * Rename SPACY_WARNING_EXCLUDE to SPACY_WARNING_IGNORE	2018-05-21 01:22:38 +02:00
ines	ff1082d8e4	Add version tag in CLI docs [ci skip]	2018-05-21 01:17:49 +02:00
Matthew Honnibal	b096b22c20	Merge pull request #2247 from skrcode/1480 1480 - Implement Fast-Text vectors with subword features	2018-05-21 01:16:21 +02:00
Matthew Honnibal	f3b4f6a4ec	Merge setup.py	2018-05-20 23:21:00 +02:00
Ines Montani	d4cc736b7c	💫 Improve model downloads: check for existing install, customise pip and use requests library again (#2346) * Go back to using requests instead of urllib (closes #2320) Fewer dependencies are good, but this one was simply causing too many other problems around SSL verification and Python 2/3 compatibility. requests is a popular enough package that it's okay for spaCy to depend on it – and this will hopefully make model downloads less flakey. * Only download model if not installed (see #1456) Use #egg=model==version to allow pip to check for existing installations. The download is only started if no installation matching the package/version is found. Fixes a long-standing inconvenience. * Pass additional options to pip when installing model (resolves #1456) Treat all additional arguments passed to the download command as pip options to allow user to customise the command. For example: python -m spacy download en --user * Add CLI option to enable installing model package dependencies * Revert "Add CLI option to enable installing model package dependencies" This reverts commit `9336ffe695`. * Update documentation	2018-05-20 20:26:56 +02:00
Matthew Honnibal	3eb446e0a5	Require thinc 6.11.1 and prepare for release to spacy-nightly	2018-05-20 19:00:34 +02:00
Matthew Honnibal	bdc23dd8c1	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2018-05-20 18:59:24 +02:00
ines	5401c55c75	Merge branch 'master' into develop	2018-05-20 16:49:40 +02:00
ines	b59e3b157f	Don't require attrs argument in Doc.retokenize and allow both ints and unicode (resolves #2304)	2018-05-20 15:15:37 +02:00
ines	5768df4f09	Add SimpleFrozenDict util to use as default function argument	2018-05-20 15:13:37 +02:00
Matthew Honnibal	7431e9c87f	Fix parser for GPU	2018-05-19 17:24:34 +00:00
Matthew Honnibal	401213fb1f	Only warn about unnamed vectors if non-zero sized.	2018-05-19 18:51:55 +02:00
Matthew Honnibal	260707a4c3	Make thinc look in /usr/local/cuda for cuda by default	2018-05-19 18:12:23 +02:00
Matthew Honnibal	b9e415a5f8	Use increasing beam_update_prob in ud-train	2018-05-16 23:21:53 +02:00
Matthew Honnibal	a7aa49c419	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2018-05-16 23:20:51 +02:00
Matthew Honnibal	a0b8a26655	Fix missing regex requirement	2018-05-16 23:19:01 +02:00
Matthew Honnibal	74d5c625b3	Use rising beam update prob	2018-05-16 20:11:59 +02:00
Matthew Honnibal	544ae7f1db	Merge branch 'develop' into feature/refactor-parser	2018-05-16 02:06:49 +02:00
Matthew Honnibal	d1b27fe5aa	Revert "Improve dynamic oracle when values are missing in parse" This reverts commit `f56bd4736b`.	2018-05-16 00:31:52 +02:00
Matthew Honnibal	8661218fe8	Refactor parser (#2308) * Work on refactoring greedy parser * Compile updated parser * Fix refactored parser * Update test * Fix refactored parser * Fix refactored parser * Readd beam search after refactor * Fix beam search after refactor * Fix parser * Fix beam parsing * Support oracle segmentation in ud-train CLI command * Avoid relying on final gold check in beam search * Add a keyword argument sink to GoldParse * Bug fixes to beam search after refactor * Avoid importing fused token symbol in ud-run-test, untl that's added * Avoid importing fused token symbol in ud-run-test, untl that's added * Don't modify Token in global scope * Fix error in beam gradient calculation * Default to beam_update_prob 1 * Set a more aggressive threshold on the max violn update * Disable some tests to figure out why CI fails * Disable some tests to figure out why CI fails * Add some diagnostics to travis.yml to try to figure out why build fails * Tell Thinc to link against system blas on Travis * Point thinc to libblas on Travis * Try running sudo=true for travis * Unhack travis.sh * Restore beam_density argument for parser beam * Require thinc 6.11.1.dev16 * Revert hacks to tests * Revert hacks to travis.yml * Update thinc requirement * Fix parser model loading * Fix size limits in training data * Add missing name attribute for parser * Fix appveyor for Windows	2018-05-15 22:17:29 +02:00
Matthew Honnibal	f3790bdeec	Fix appveyor for Windows	2018-05-15 21:16:39 +02:00
Matthew Honnibal	83acaa0358	Add missing name attribute for parser	2018-05-15 19:01:53 +02:00
Matthew Honnibal	f328c195ca	Fix size limits in training data	2018-05-15 19:01:41 +02:00
Matthew Honnibal	8446b35ce0	Fix parser model loading	2018-05-15 18:43:46 +02:00
Matthew Honnibal	dc1a479fbd	Merge branch 'develop' into feature/refactor-parser	2018-05-15 18:39:21 +02:00
Matthew Honnibal	13faf4e1ea	Update thinc requirement	2018-05-15 18:35:11 +02:00
Matthew Honnibal	546dd99cdf	Merge master into develop -- mostly Arabic and website	2018-05-15 18:14:28 +02:00
Matthew Honnibal	e3fdfba164	Revert hacks to travis.yml	2018-05-15 18:00:24 +02:00
Matthew Honnibal	5664ab7e6c	Revert hacks to tests	2018-05-15 18:00:09 +02:00
Matthew Honnibal	4dd1fb3c7b	Require thinc 6.11.1.dev16	2018-05-15 17:56:07 +02:00
Matthew Honnibal	7b9195657b	Restore beam_density argument for parser beam	2018-05-15 17:55:11 +02:00
Matthew Honnibal	581d318971	Fix conftest	2018-05-15 00:54:45 +02:00
Tahar Zanouda	00417794d3	Add Arabic language (#2314) * added support for Arabic lang * added Arabic language support * updated conftest	2018-05-15 00:27:19 +02:00
Jani Monoses	0e08e49e87	Lemmatizer ro (#2319) * Add Romanian lemmatizer lookup table. Adapted from http://www.lexiconista.com/datasets/lemmatization/ by replacing cedillas with commas (ș and ț). The original dataset is licensed under the Open Database License. * Fix one blatant issue in the Romanian lemmatizer * Romanian examples file * Add ro_tokenizer in conftest * Add Romanian lemmatizer test	2018-05-12 15:20:04 +02:00
vishnumenon	ae3719ece5	Fix the code for FACILITIY entities (#2324) * Fix the code for FACILITIY entities As far as I can tell, the default models all use "FAC" rather than "FACILITY" * Added my Contributor Agreement * Rename vishnumenon to vishnumenon.md	2018-05-12 15:19:17 +02:00
Matthew Honnibal	625ee6c464	Unhack travis.sh	2018-05-10 18:16:11 +02:00
Matthew Honnibal	299621b747	Try running sudo=true for travis	2018-05-10 18:11:11 +02:00
Matthew Honnibal	603907926f	Point thinc to libblas on Travis	2018-05-10 18:06:37 +02:00
Matthew Honnibal	1b294f4798	Tell Thinc to link against system blas on Travis	2018-05-10 18:03:44 +02:00
Matthew Honnibal	c261b5b996	Add some diagnostics to travis.yml to try to figure out why build fails	2018-05-10 17:10:44 +02:00
Matthew Honnibal	887631ca25	Disable some tests to figure out why CI fails	2018-05-10 16:42:01 +02:00

8753 Commits