spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-09-08 05:15:04 +03:00

Author	SHA1	Message	Date
Ines Montani	f02bb08f39	Update prefer_gpu and require_gpu docs [ci skip]	2018-10-14 23:30:44 +02:00
Matthew Honnibal	b305b24c24	Require thinc 6.10.6	2018-10-14 23:28:41 +02:00
Matthew Honnibal	8ccfa52d19	Unhack prefer_gpu	2018-10-14 23:27:09 +02:00
Matthew Honnibal	2ad3a4ea32	Update push-tag script	2018-10-14 23:16:08 +02:00
Matthew Honnibal	41adf3572b	Set version to v2.0.14	2018-10-14 23:15:34 +02:00
Matthew Honnibal	38aa835ada	Workaround bug in thinc require_gpu	2018-10-14 23:15:08 +02:00
Matthew Honnibal	6e6f6be3f5	Update requirements and setup.py	2018-10-14 23:06:46 +02:00
Matthew Honnibal	91593b7378	Add tests for prefer_gpu() and require_gpu()	2018-10-14 23:05:22 +02:00
Matthew Honnibal	62c70b3163	Import prefer_gpu and require_gpu functions from Thinc	2018-10-14 23:03:06 +02:00
Ines Montani	9ebe607f82	Add wheel to setup_requires	2018-10-14 16:38:48 +02:00
Ines Montani	5a4c5b78a8	Update GPU docs for v2.0.14	2018-10-14 16:38:12 +02:00
Ines Montani	295da0f11b	Increment version to 2.0.14.dev0	2018-10-14 16:37:46 +02:00
Ines Montani	2e675d9523	Update murmurhash pin	2018-10-14 16:37:38 +02:00
Matthew Honnibal	7de0dcb91f	Merge branch 'master' of https://github.com/explosion/spaCy	2018-10-14 16:12:23 +02:00
Ines Montani	76c43380e4	Update README.rst [ci skip]	2018-10-14 01:00:55 +02:00
Ines Montani	3decf44dd3	Update badge [ci skip]	2018-10-14 00:54:19 +02:00
Ines Montani	8f393b1dcf	Add wheels badge	2018-10-14 00:48:04 +02:00
Keshan	cb075c8e72	Adding "This is a sentence" example to Sinhala (#2846)	2018-10-14 00:06:40 +02:00
Ines Montani	ac4cadd31d	Add info on wheels [ci skip]	2018-10-14 00:04:37 +02:00
Ines Montani	30aa7f8b20	Increment version [ci skip]	2018-10-13 23:55:50 +02:00
Ines Montani	23d5b4ff5b	Update docs for new version [ci skip]	2018-10-13 23:53:33 +02:00
Ines Montani	f0e7da6478	Fix formatting and consistency	2018-10-13 23:53:26 +02:00
Matthew Honnibal	9cfab5933a	Set version to 2.0.13	2018-10-13 19:42:16 +02:00
Matthew Honnibal	6a6ae5b0af	Merge branch 'master' of https://github.com/explosion/spaCy	2018-10-13 19:41:00 +02:00
mauryaland	36514b5762	Rule-based French Lemmatizer (#2818) ## Description Add a rule-based French Lemmatizer following the English one and the excellent PR for [Greek language optimizations](https://github.com/explosion/spaCy/pull/2558) to adapt the Lemmatizer class. ### Types of change - Lemma dictionary used can be found [here](http://infolingu.univ-mlv.fr/DonneesLinguistiques/Dictionnaires/telechargement.html); I used the XML version. - Add several files containing exhaustive lists of words for each part of speech - Add some lemma rules - Add POS that are not checked in the standard Lemmatizer, i.e. PRON, DET, ADV and AUX - Modify the Lemmatizer class to check in lookup table as a last resort if POS not mentioned - Modify the lemmatize function to check in lookup table as a last resort - Init files are updated so the model can support all the functionalities mentioned above - Add words to tokenizer_exceptions_list.py with respect to the regex used in tokenizer_exceptions.py ## Checklist - [X] I have submitted the spaCy Contributor Agreement. - [X] I ran the tests, and all new and existing tests passed. - [X] My changes don't require a change to the documentation, or if they do, I've added all required information.	2018-10-13 16:38:21 +02:00
Matthew Honnibal	de46286107	Merge branch 'master' of https://github.com/explosion/spaCy	2018-10-13 16:11:16 +02:00
Ines Montani	fa23be0f3c	Remove in favour of https://github.com/explosion/spaCy/graphs/contributors	2018-10-13 15:46:57 +02:00
Ines Montani	cb57b35bb8	Also include lowercase norm exceptions	2018-10-13 15:37:30 +02:00
JKhakpour	74a30d883c	Add Persian (Farsi) language support (#2797)	2018-10-13 15:31:49 +02:00
Matthew Honnibal	c3ddf98b1e	Set version to 2.0.13.dev4	2018-10-13 15:20:59 +02:00
Marina Lysyuk	b76fe08308	Correcting lang/ru/examples.py (#2845) * Correct some grammatical inaccuracies in lang\ru\examples.py; filled Contributor Agreement * Correct some grammatical inaccuracies in lang\ru\examples.py * Move contributor agreement to separate file	2018-10-13 15:19:43 +02:00
Jacopo Farina	42c42376a3	Visual C++ link updated (#2842) (closes #2841) [ci skip] * New landing page * Add contribution agreement	2018-10-12 14:59:45 +02:00
Ines Montani	4cd9ec0f00	💫 Update training examples and use minibatching (#2830) ## Description Update the training examples in `/examples/training` to show usage of spaCy's `minibatch` and `compounding` helpers ([see here](https://spacy.io/usage/training#tips-batch-size) for details). The lack of batching in the examples has caused some confusion in the past, especially for beginners who would copy-paste the examples, update them with large training sets and experience slow and unsatisfying results. ### Types of change enhancements ## Checklist - [x] I have submitted the spaCy Contributor Agreement. - [x] I ran the tests, and all new and existing tests passed. - [x] My changes don't require a change to the documentation, or if they do, I've added all required information.	2018-10-10 01:40:29 +02:00
Matthew Honnibal	f784e42ffe	Try older version of regex	2018-10-03 00:23:40 +02:00
Matthew Honnibal	67ddce68d8	Unskip test	2018-10-02 23:47:55 +02:00
Matthew Honnibal	4cf5ce2cc2	Revert "Remove problematic test" This reverts commit `bdebbef455`.	2018-10-02 23:47:24 +02:00
Matthew Honnibal	e4fd2ccd07	Try previous version of regex	2018-10-02 23:37:17 +02:00
Matthew Honnibal	bdebbef455	Remove problematic test	2018-10-02 23:16:29 +02:00
Matthew Honnibal	6afc6ffe56	Skip seemingly problematic test	2018-10-02 22:33:40 +02:00
Matthew Honnibal	9e4079ddb2	Merge branch 'master' of https://github.com/explosion/spaCy	2018-10-02 19:44:43 +02:00
Matthew Honnibal	40f228c2f2	Set version to 2.0.13.dev3	2018-10-02 19:44:25 +02:00
Matthew Honnibal	9937ff93e5	Update regex version dependency	2018-10-02 19:43:59 +02:00
Ines Montani	7806deceb4	Fix typo (closes #2815) [ci skip]	2018-10-01 10:49:29 +02:00
Ines Montani	ea20b72c08	💫 Make like_num work for prefixed numbers (#2808) * Only split + prefix if not numbers * Make like_num work for prefixed numbers * Add test for like_num	2018-10-01 10:49:14 +02:00
John Stewart	9faea3ff10	Update Keras Example for (Parikh et al, 2016) implementation (#2803) * bug fixes in keras example * created contributor agreement * baseline for Parikh model * initial version of parikh 2016 implemented * tested asymmetric models * fixed grievous error in normalization * use standard SNLI test file * begin to rework parikh example * initial version of running example * start to document the new version * start to document the new version * Update Decompositional Attention.ipynb * fixed calls to similarity * updated the README * import sys package duh * simplified indexing on mapping word to IDs * stupid python indent error * added code from https://github.com/tensorflow/tensorflow/issues/3388 for tf bug workaround	2018-10-01 10:28:45 +02:00
Ioannis Daras	405a826436	Correct error in spacy universe docs concerning spacy-lookup (#2814)	2018-10-01 10:24:50 +02:00
Filipe Caixeta	6c498f9ff4	Update Portuguese Language (#2790) * Add words to Portuguese language _num_words * Add words to Portuguese language _num_words * Portuguese - Add/remove stopwords, fix tokenizer, add currency symbols * Extended punctuation and norm_exceptions in the Portuguese language	2018-09-29 09:51:45 +02:00
Matthew Honnibal	b39810d692	Fix copy_reg compatibility on _serialize module	2018-09-28 15:23:14 +02:00
Matthew Honnibal	f82f8ba5dd	Fix serialization when empty parser model. Closes #2482	2018-09-28 15:18:52 +02:00
Matthew Honnibal	d5a6c63b62	Add regression test for #2482	2018-09-28 15:18:30 +02:00


9552 Commits