spaCy

mirror of https://github.com/explosion/spaCy.git synced 2024-12-26 18:06:29 +03:00

Author	SHA1	Message	Date
Paul O'Leary McCann	7dd21b66d5	Extras require mecab (#3024) * Add note that Unidic is required for Japanese This addresses #3001. -POLM * Add extras_require for mecab with old version Related to issue #3018. * mecab → ja Co-Authored-By: polm <polm@dampfkraft.com>	2018-12-08 06:34:49 +01:00
Justin DuJardin	33fca8672f	fix issue compiling the latest spacy on MacOS 10.3.6 (#2998)	2018-12-02 05:51:11 +01:00
Matthew Honnibal	05b2336ffa	Try again to fix OSX build	2018-12-01 03:12:21 +01:00
Matthew Honnibal	4895b2e830	Merge branch 'master' of https://github.com/explosion/spaCy	2018-12-01 02:37:21 +01:00
Matthew Honnibal	3f16af123e	Try to fix OSX build error	2018-12-01 02:36:56 +01:00
Matthew Honnibal	61abb1ef70	Remove msgpack dependency, to try to fix #2995	2018-12-01 02:36:41 +01:00
Matthew Honnibal	9e2ff2f583	Fix regex pin to harmonize with conda (#2964)	2018-11-26 19:28:54 +01:00
Matthew Honnibal	e2ae25d6f5	Try setting older regex version, to align with conda	2018-10-29 13:39:00 +01:00
Matthew Honnibal	a2745d310e	Revert "Update regex version" This reverts commit `62358dd867`.	2018-10-28 16:38:56 +01:00
Matthew Honnibal	62358dd867	Update regex version	2018-10-28 16:27:50 +01:00
Ines Montani	fd750ec3bf	Fix msgpack-numpy version pin	2018-10-15 14:18:38 +02:00
Ines Montani	051a6b73eb	Update Thinc version pin	2018-10-15 01:40:28 +02:00
Matthew Honnibal	7202abdfa9	Fix specifiers for GPU	2018-10-15 00:08:44 +02:00
Matthew Honnibal	b305b24c24	Require thinc 6.10.6	2018-10-14 23:28:41 +02:00
Matthew Honnibal	6e6f6be3f5	Update requirements and setup.py	2018-10-14 23:06:46 +02:00
Ines Montani	9ebe607f82	Add wheel to setup_requires	2018-10-14 16:38:48 +02:00
Ines Montani	2e675d9523	Update murmurhash pin	2018-10-14 16:37:38 +02:00
Matthew Honnibal	f784e42ffe	Try older version of regex	2018-10-03 00:23:40 +02:00
Matthew Honnibal	e4fd2ccd07	Try previous version of regex	2018-10-02 23:37:17 +02:00
Matthew Honnibal	9937ff93e5	Update regex version dependency	2018-10-02 19:43:59 +02:00
Matthew Honnibal	05b6103a0c	Try to fix version pin for msgpack-numpy	2018-09-28 14:07:00 +02:00
Matthew Honnibal	276aa83d1a	Require older msgpack-numpy	2018-09-27 15:34:24 +02:00
Matthew Honnibal	7be9118be3	Require numpy>=1.15.0 to avoid the RuntimeWarning	2018-08-10 00:14:13 +02:00
Matthew Honnibal	cabce07ba6	Fix thinc version requirement	2018-07-21 15:56:33 +02:00
Matthew Honnibal	a723fafea3	Require thinc 6.10.3.dev1	2018-07-21 12:49:09 +02:00
ines	95641f4026	Only install pathlib backport on Python < 3.4	2018-07-20 21:08:29 +02:00
Matthew Honnibal	adde3826e2	Build against thinc 6.10.3.dev0	2018-07-20 13:34:54 +02:00
Ines Montani	d4cc736b7c	💫 Improve model downloads: check for existing install, customise pip and use requests library again (#2346) * Go back to using requests instead of urllib (closes #2320) Fewer dependencies are good, but this one was simply causing too many other problems around SSL verification and Python 2/3 compatibility. requests is a popular enough package that it's okay for spaCy to depend on it – and this will hopefully make model downloads less flakey. * Only download model if not installed (see #1456) Use `#egg=model==version` to allow pip to check for existing installations. The download is only started if no installation matching the package/version is found. Fixes a long-standing inconvenience. * Pass additional options to pip when installing model (resolves #1456) Treat all additional arguments passed to the download command as pip options to allow user to customise the command. For example: `python -m spacy download en --user` * Add CLI option to enable installing model package dependencies * Revert "Add CLI option to enable installing model package dependencies" This reverts commit `9336ffe695`. * Update documentation	2018-05-20 20:26:56 +02:00
Matthew Honnibal	abf8b16d71	Add doc.retokenize() context manager (#2172) This patch takes a step towards #1487 by introducing the doc.retokenize() context manager, to handle merging spans, and soon splitting tokens. The idea is to do merging and splitting like this: with doc.retokenize() as retokenizer: for start, end, label in matches: retokenizer.merge(doc[start : end], attrs={'ent_type': label}) The retokenizer accumulates the merge requests, and applies them together at the end of the block. This will allow retokenization to be more efficient, and much less error prone. A retokenizer.split() function will then be added, to handle splitting a single token into multiple tokens. These methods take `Span` and `Token` objects; if the user wants to go directly from offsets, they can append to the .merges and .splits lists on the retokenizer. The doc.merge() method's behaviour remains unchanged, so this patch should be 100% backwards compatible (modulo bugs). Internally, doc.merge() fixes up the arguments (to handle the various deprecated styles), opens the retokenizer, and makes the single merge. We can later start making deprecation warnings on direct calls to doc.merge(), to migrate people to use of the retokenize context manager.	2018-04-03 14:10:35 +02:00
Matthew Honnibal	8308bbc617	Get msgpack and msgpack_numpy via Thinc, to avoid potential version conflicts	2018-03-29 00:14:55 +02:00
ines	366c98a94b	Remove requests dependency	2018-03-28 12:46:18 +02:00
ines	ce6071ca89	Remove ftfy dependency and update docs	2018-03-28 12:09:42 +02:00
ines	6d2c85f428	Drop six and related hacks as a dependency	2018-03-28 10:45:25 +02:00
ines	f5f4de98d1	Version-lock msgpack-python (see #2015)	2018-02-22 16:02:32 +01:00
ines	002ee80ddf	Add html5lib to setup.py to fix six error (see #1924)	2018-02-02 20:32:08 +01:00
Matthew Honnibal	2e449c1fbf	Fix compiler flags, addressing #1591	2018-01-14 14:34:36 +01:00
Matthew Honnibal	04a92bd75e	Pin msgpack-numpy requirement	2017-12-06 03:24:24 +01:00
Hugo	aa898ab4e4	Drop support for EOL Python 2.6 and 3.3	2017-11-26 19:46:24 +02:00
Matthew Honnibal	716ccbb71e	Require thinc 6.10.1	2017-11-15 14:59:34 +01:00
Matthew Honnibal	314f5b9cdb	Require thinc 6.10.0	2017-10-28 18:20:10 +00:00
Matthew Honnibal	64e4ff7c4b	Merge 'tidy-up' changes into branch. Resolve conflicts	2017-10-28 13:16:06 +02:00
ines	7946464742	Remove spacy.tagger (now in pipeline)	2017-10-27 19:45:04 +02:00
Matthew Honnibal	531142a933	Merge remote-tracking branch 'origin/develop' into feature/better-parser	2017-10-27 12:34:48 +00:00
Matthew Honnibal	642eb28c16	Don't compile with OpenMP by default	2017-10-27 10:16:58 +00:00
Matthew Honnibal	90d1d9b230	Remove obsolete parser code	2017-10-26 13:22:45 +02:00
Matthew Honnibal	79fcf8576a	Compile with march=native	2017-10-18 21:46:34 +02:00
Matthew Honnibal	2eb0fe4957	Fix setup.py	2017-10-03 21:40:04 +02:00
Matthew Honnibal	b49cc8153a	Require correct thinc	2017-09-26 10:00:18 -05:00
ines	68f66aebf8	Use pkg_resources instead of pip for is_package (resolves #1293)	2017-09-16 20:27:59 +02:00
Matthew Honnibal	07cdbd1219	Require thinc 6.8.1, for Windows	2017-09-15 22:47:53 +02:00

388 Commits
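
Commit abf8b16d71 describes a retokenizer that accumulates merge requests and applies them together at the end of the `with` block. The pattern can be sketched in plain Python as below. This is a toy illustration only, not spaCy's implementation: `ToyRetokenizer`, the `retokenize` function, and the list-of-dicts token representation are all hypothetical stand-ins, and merges are given as offsets rather than `Span` objects.

```python
from contextlib import contextmanager


class ToyRetokenizer:
    """Hypothetical stand-in that only records merge requests."""

    def __init__(self):
        self.merges = []  # list of (start, end, attrs) tuples

    def merge(self, start, end, attrs=None):
        # Nothing is modified here; the request is queued for later.
        self.merges.append((start, end, dict(attrs or {})))


@contextmanager
def retokenize(tokens):
    """Queue merges inside the block, then apply them all on exit."""
    retokenizer = ToyRetokenizer()
    yield retokenizer
    # Reached only if the block finished without raising.
    # Apply right to left so earlier offsets stay valid after each merge.
    for start, end, attrs in sorted(
        retokenizer.merges, key=lambda m: m[0], reverse=True
    ):
        text = " ".join(tok["text"] for tok in tokens[start:end])
        tokens[start:end] = [{"text": text, **attrs}]


# Usage, mirroring the loop in the commit message (offsets are made up).
tokens = [{"text": w} for w in "I like New York and San Francisco".split()]
matches = [(2, 4, "GPE"), (5, 7, "GPE")]
with retokenize(tokens) as retokenizer:
    for start, end, label in matches:
        retokenizer.merge(start, end, attrs={"ent_type": label})
texts = [tok["text"] for tok in tokens]
```

Deferring the work to the end of the block means the offsets in `matches` all refer to the original tokenization, which is the property the commit message credits with making retokenization less error prone.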