spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-01-10 17:26:42 +03:00

Author	SHA1	Message	Date
wxv	06820ef6e7	Fix is_ascii documentation and create contributor file (#2988 ) Proposed in #2933	2018-11-30 15:57:58 +01:00
Ben Batorsky	658f7e0dc8	OntoNotes url fix (#2981 ) The website for OntoNotes 5 is: https://catalog.ldc.upenn.edu/LDC2013T19, currently the named entity section has it as https://catalog.ldc.upenn.edu/ldc2013T19.	2018-11-29 19:34:30 +01:00
Ines Montani	d33953037e	💫 Port master changes over to develop (#2979 ) * Create aryaprabhudesai.md (#2681) * Update _install.jade (#2688) Typo fix: "models" -> "model" * Add FAC to spacy.explain (resolves #2706) * Remove docstrings for deprecated arguments (see #2703) * When calling getoption() in conftest.py, pass a default option (#2709) * When calling getoption() in conftest.py, pass a default option This is necessary to allow testing an installed spacy by running: pytest --pyargs spacy * Add contributor agreement * update bengali token rules for hyphen and digits (#2731) * Less norm computations in token similarity (#2730) * Less norm computations in token similarity * Contributor agreement * Remove ')' for clarity (#2737) Sorry, don't mean to be nitpicky, I just noticed this when going through the CLI and thought it was a quick fix. That said, if this was intention than please let me know. * added contributor agreement for mbkupfer (#2738) * Basic support for Telugu language (#2751) * Lex _attrs for polish language (#2750) * Signed spaCy contributor agreement * Added polish version of english lex_attrs * Introduces a bulk merge function, in order to solve issue #653 (#2696) * Fix comment * Introduce bulk merge to increase performance on many span merges * Sign contributor agreement * Implement pull request suggestions * Describe converters more explicitly (see #2643) * Add multi-threading note to Language.pipe (resolves #2582) [ci skip] * Fix formatting * Fix dependency scheme docs (closes #2705) [ci skip] * Don't set stop word in example (closes #2657) [ci skip] * Add words to portuguese language _num_words (#2759) * Add words to portuguese language _num_words * Add words to portuguese language _num_words * Update Indonesian model (#2752) * adding e-KTP in tokenizer exceptions list * add exception token * removing lines with containing space as it won't matter since we use .split() method in the end, added new tokens in exception * add tokenizer exceptions 
list * combining base_norms with norm_exceptions * adding norm_exception * fix double key in lemmatizer * remove unused import on punctuation.py * reformat stop_words to reduce number of lines, improve readibility * updating tokenizer exception * implement is_currency for lang/id * adding orth_first_upper in tokenizer_exceptions * update the norm_exception list * remove bunch of abbreviations * adding contributors file * Fixed spaCy+Keras example (#2763) * bug fixes in keras example * created contributor agreement * Adding French hyphenated first name (#2786) * Fix typo (closes #2784) * Fix typo (#2795) [ci skip] Fixed typo on line 6 "regcognizer --> recognizer" * Adding basic support for Sinhala language. (#2788) * adding Sinhala language package, stop words, examples and lex_attrs. * Adding contributor agreement * Updating contributor agreement * Also include lowercase norm exceptions * Fix error (#2802) * Fix error ValueError: cannot resize an array that references or is referenced by another array in this way. Use the resize function * added spaCy Contributor Agreement * Add charlax's contributor agreement (#2805) * agreement of contributor, may I introduce a tiny pl languge contribution (#2799) * Contributors agreement * Contributors agreement * Contributors agreement * Add jupyter=True to displacy.render in documentation (#2806) * Revert "Also include lowercase norm exceptions" This reverts commit `70f4e8adf3`. * Remove deprecated encoding argument to msgpack * Set up dependency tree pattern matching skeleton (#2732) * Fix bug when too many entity types. 
Fixes #2800 * Fix Python 2 test failure * Require older msgpack-numpy * Restore encoding arg on msgpack-numpy * Try to fix version pin for msgpack-numpy * Update Portuguese Language (#2790) * Add words to portuguese language _num_words * Add words to portuguese language _num_words * Portuguese - Add/remove stopwords, fix tokenizer, add currency symbols * Extended punctuation and norm_exceptions in the Portuguese language * Correct error in spacy universe docs concerning spacy-lookup (#2814) * Update Keras Example for (Parikh et al, 2016) implementation (#2803) * bug fixes in keras example * created contributor agreement * baseline for Parikh model * initial version of parikh 2016 implemented * tested asymmetric models * fixed grevious error in normalization * use standard SNLI test file * begin to rework parikh example * initial version of running example * start to document the new version * start to document the new version * Update Decompositional Attention.ipynb * fixed calls to similarity * updated the README * import sys package duh * simplified indexing on mapping word to IDs * stupid python indent error * added code from https://github.com/tensorflow/tensorflow/issues/3388 for tf bug workaround * Fix typo (closes #2815) [ci skip] * Update regex version dependency * Set version to 2.0.13.dev3 * Skip seemingly problematic test * Remove problematic test * Try previous version of regex * Revert "Remove problematic test" This reverts commit `bdebbef455`. * Unskip test * Try older version of regex * 💫 Update training examples and use minibatching (#2830) <!--- Provide a general summary of your changes in the title. --> ## Description Update the training examples in `/examples/training` to show usage of spaCy's `minibatch` and `compounding` helpers ([see here](https://spacy.io/usage/training#tips-batch-size) for details). 
The lack of batching in the examples has caused some confusion in the past, especially for beginners who would copy-paste the examples, update them with large training sets and experienced slow and unsatisfying results. ### Types of change enhancements ## Checklist <!--- Before you submit the PR, go over this checklist and make sure you can tick off all the boxes. [] -> [x] --> - [x] I have submitted the spaCy Contributor Agreement. - [x] I ran the tests, and all new and existing tests passed. - [x] My changes don't require a change to the documentation, or if they do, I've added all required information. * Visual C++ link updated (#2842) (closes #2841) [ci skip] * New landing page * Add contribution agreement * Correcting lang/ru/examples.py (#2845) * Correct some grammatical inaccuracies in lang\ru\examples.py; filled Contributor Agreement * Correct some grammatical inaccuracies in lang\ru\examples.py * Move contributor agreement to separate file * Set version to 2.0.13.dev4 * Add Persian(Farsi) language support (#2797) * Also include lowercase norm exceptions * Remove in favour of https://github.com/explosion/spaCy/graphs/contributors * Rule-based French Lemmatizer (#2818) <!--- Provide a general summary of your changes in the title. --> ## Description <!--- Use this section to describe your changes. If your changes required testing, include information about the testing environment and the tests you ran. If your test fixes a bug reported in an issue, don't forget to include the issue number. If your PR is still a work in progress, that's totally fine – just include a note to let us know. --> Add a rule-based French Lemmatizer following the english one and the excellent PR for [greek language optimizations](https://github.com/explosion/spaCy/pull/2558) to adapt the Lemmatizer class. ### Types of change <!-- What type of change does your PR cover? Is it a bug fix, an enhancement or new feature, or a change to the documentation? 
--> - Lemma dictionary used can be found [here](http://infolingu.univ-mlv.fr/DonneesLinguistiques/Dictionnaires/telechargement.html), I used the XML version. - Add several files containing exhaustive list of words for each part of speech - Add some lemma rules - Add POS that are not checked in the standard Lemmatizer, i.e PRON, DET, ADV and AUX - Modify the Lemmatizer class to check in lookup table as a last resort if POS not mentionned - Modify the lemmatize function to check in lookup table as a last resort - Init files are updated so the model can support all the functionalities mentioned above - Add words to tokenizer_exceptions_list.py in respect to regex used in tokenizer_exceptions.py ## Checklist <!--- Before you submit the PR, go over this checklist and make sure you can tick off all the boxes. [] -> [x] --> - [X] I have submitted the spaCy Contributor Agreement. - [X] I ran the tests, and all new and existing tests passed. - [X] My changes don't require a change to the documentation, or if they do, I've added all required information. 
* Set version to 2.0.13 * Fix formatting and consistency * Update docs for new version [ci skip] * Increment version [ci skip] * Add info on wheels [ci skip] * Adding "This is a sentence" example to Sinhala (#2846) * Add wheels badge * Update badge [ci skip] * Update README.rst [ci skip] * Update murmurhash pin * Increment version to 2.0.14.dev0 * Update GPU docs for v2.0.14 * Add wheel to setup_requires * Import prefer_gpu and require_gpu functions from Thinc * Add tests for prefer_gpu() and require_gpu() * Update requirements and setup.py * Workaround bug in thinc require_gpu * Set version to v2.0.14 * Update push-tag script * Unhack prefer_gpu * Require thinc 6.10.6 * Update prefer_gpu and require_gpu docs [ci skip] * Fix specifiers for GPU * Set version to 2.0.14.dev1 * Set version to 2.0.14 * Update Thinc version pin * Increment version * Fix msgpack-numpy version pin * Increment version * Update version to 2.0.16 * Update version [ci skip] * Redundant ')' in the Stop words' example (#2856) 
* Documentation improvement regarding joblib and SO (#2867) Some documentation improvements ## Description 1. Fixed the dead URL to joblib 2. Fixed Stack Overflow brand name (with space) ### Types of change Documentation ## Checklist <!--- Before you submit the PR, go over this checklist and make sure you can tick off all the boxes. [] -> [x] --> - [x] I have submitted the spaCy Contributor Agreement. - [x] I ran the tests, and all new and existing tests passed. - [x] My changes don't require a change to the documentation, or if they do, I've added all required information. * raise error when setting overlapping entities as doc.ents (#2880) * Fix out-of-bounds access in NER training The helper method state.B(1) gets the index of the first token of the buffer, or -1 if no such token exists. Normally this is safe because we pass this to functions like state.safe_get(), which returns an empty token. Here we used it directly as an array index, which is not okay! This error may have been the cause of out-of-bounds access errors during training. Similar errors may still be around, so much be hunted down. Hunting this one down took a long time...I printed out values across training runs and diffed, looking for points of divergence between runs, when no randomness should be allowed. * Change PyThaiNLP Url (#2876) * Fix missing comma * Add example showing a fix-up rule for space entities * Set version to 2.0.17.dev0 * Update regex version * Revert "Update regex version" This reverts commit `62358dd867`. 
* Try setting older regex version, to align with conda * Set version to 2.0.17 * Add spacy-js to universe [ci-skip] * Add spacy-raspberry to universe (closes #2889) * Add script to validate universe json [ci skip] * Removed space in docs + added contributor indo (#2909) * - removed unneeded space in documentation * - added contributor info * Allow input text of length up to max_length, inclusive (#2922) * Include universe spec for spacy-wordnet component (#2919) * feat: include universe spec for spacy-wordnet component * chore: include spaCy contributor agreement * Minor formatting changes [ci skip] * Fix image [ci skip] Twitter URL doesn't work on live site * Check if the word is in one of the regular lists specific to each POS (#2886) * 💫 Create random IDs for SVGs to prevent ID clashes (#2927) Resolves #2924. ## Description Fixes problem where multiple visualizations in Jupyter notebooks would have clashing arc IDs, resulting in weirdly positioned arc labels. Generating a random ID prefix so even identical parses won't receive the same IDs for consistency (even if effect of ID clash isn't noticable here.) ### Types of change bug fix ## Checklist <!--- Before you submit the PR, go over this checklist and make sure you can tick off all the boxes. [] -> [x] --> - [x] I have submitted the spaCy Contributor Agreement. - [x] I ran the tests, and all new and existing tests passed. - [x] My changes don't require a change to the documentation, or if they do, I've added all required information. 
* Fix typo [ci skip] * fixes symbolic link on py3 and windows (#2949) * fixes symbolic link on py3 and windows during setup of spacy using command python -m spacy link en_core_web_sm en closes #2948 * Update spacy/compat.py Co-Authored-By: cicorias <cicorias@users.noreply.github.com> * Fix formatting * Update universe [ci skip] * Catalan Language Support (#2940) * Catalan language Support * Ddding Catalan to documentation * Sort languages alphabetically [ci skip] * Update tests for pytest 4.x (#2965) <!--- Provide a general summary of your changes in the title. --> ## Description - [x] Replace marks in params for pytest 4.0 compat ([see here](https://docs.pytest.org/en/latest/deprecations.html#marks-in-pytest-mark-parametrize)) - [x] Un-xfail passing tests (some fixes in a recent update resolved a bunch of issues, but tests were apparently never updated here) ### Types of change <!-- What type of change does your PR cover? Is it a bug fix, an enhancement or new feature, or a change to the documentation? --> ## Checklist <!--- Before you submit the PR, go over this checklist and make sure you can tick off all the boxes. [] -> [x] --> - [x] I have submitted the spaCy Contributor Agreement. - [x] I ran the tests, and all new and existing tests passed. - [x] My changes don't require a change to the documentation, or if they do, I've added all required information. * Fix regex pin to harmonize with conda (#2964) * Update README.rst * Fix bug where Vocab.prune_vector did not use 'batch_size' (#2977) Fixes #2976 * Fix typo * Fix typo * Remove duplicate file * Require thinc 7.0.0.dev2 Fixes bug in gpu_ops that would use cupy instead of numpy on CPU * Add missing import * Fix error IDs * Fix tests	2018-11-29 16:30:29 +01:00
Ines Montani	c80c20e1ec	Sort languages alphabetically [ci skip]	2018-11-26 15:37:53 +01:00
Marc Puig	98fe1ab259	Catalan Language Support (#2940 ) * Catalan language Support * Adding Catalan to documentation	2018-11-26 15:25:47 +01:00
Ines Montani	1844bc238a	Update universe [ci skip]	2018-11-26 14:16:22 +01:00
Ines Montani	696acb0f92	Fix typo [ci skip]	2018-11-24 15:20:57 +01:00
Ines Montani	dfcc8f02af	Fix image [ci skip] Twitter URL doesn't work on live site	2018-11-14 01:01:33 +01:00
Ines Montani	1aa91e926f	Minor formatting changes [ci skip]	2018-11-13 23:59:59 +01:00
Francisco Aranda	be99f1cac5	Include universe spec for spacy-wordnet component (#2919 ) * feat: include universe spec for spacy-wordnet component * chore: include spaCy contributor agreement	2018-11-13 23:54:46 +01:00
mikelibg	75e7d503b7	Removed space in docs + added contributor info (#2909 ) * - removed unneeded space in documentation * - added contributor info	2018-11-08 14:18:25 +01:00
Ines Montani	11db4d2f27	Add script to validate universe json [ci skip]	2018-11-06 12:50:41 +01:00
Ines Montani	a9fda638a9	Add spacy-raspberry to universe (closes #2889 )	2018-11-06 12:45:50 +01:00
Ines Montani	c235ddf44f	Add spacy-js to universe [ci-skip]	2018-11-06 12:45:03 +01:00
Bram Vanroy	071789467e	Documentation improvement regarding joblib and SO (#2867 ) Some documentation improvements ## Description 1. Fixed the dead URL to joblib 2. Fixed Stack Overflow brand name (with space) ### Types of change Documentation ## Checklist - [x] I have submitted the spaCy Contributor Agreement. - [x] I ran the tests, and all new and existing tests passed. - [x] My changes don't require a change to the documentation, or if they do, I've added all required information.	2018-10-24 15:19:17 +02:00
Roman	5766d09a5b	Redundant ')' in the Stop words' example (#2856 )	2018-10-18 10:21:16 +02:00
Ines Montani	c6a320cad4	Update version [ci skip]	2018-10-15 16:42:35 +02:00
Ines Montani	f02bb08f39	Update prefer_gpu and require_gpu docs [ci skip]	2018-10-14 23:30:44 +02:00
Ines Montani	5a4c5b78a8	Update GPU docs for v2.0.14	2018-10-14 16:38:12 +02:00
Ines Montani	ac4cadd31d	Add info on wheels [ci skip]	2018-10-14 00:04:37 +02:00
Ines Montani	30aa7f8b20	Increment version [ci skip]	2018-10-13 23:55:50 +02:00
Ines Montani	23d5b4ff5b	Update docs for new version [ci skip]	2018-10-13 23:53:33 +02:00
Ines Montani	f0e7da6478	Fix formatting and consistency	2018-10-13 23:53:26 +02:00
Jacopo Farina	42c42376a3	Visual C++ link updated (#2842 ) (closes #2841 ) [ci skip] * New landing page * Add contribution agreement	2018-10-12 14:59:45 +02:00
Ines Montani	7806deceb4	Fix typo (closes #2815 ) [ci skip]	2018-10-01 10:49:29 +02:00
Ioannis Daras	405a826436	Correct error in spacy universe docs concerning spacy-lookup (#2814 )	2018-10-01 10:24:50 +02:00
Charles-Axel Dein	014dd47c70	Add jupyter=True to displacy.render in documentation (#2806 )	2018-09-27 12:28:04 +02:00
Pranshu Jethmalani	9fd27d777e	Fix typo (#2795 ) [ci skip] Fixed typo on line 6 "regcognizer --> recognizer"	2018-09-25 12:12:40 +02:00
Ines Montani	3c4e3ade30	Fix typo (closes #2784 )	2018-09-21 10:45:11 +02:00
Ines Montani	5001d31be6	Don't set stop word in example (closes #2657 ) [ci skip]	2018-09-12 15:36:51 +02:00
Ines Montani	4e89cfaae1	Fix dependency scheme docs (closes #2705 ) [ci skip]	2018-09-12 15:32:26 +02:00
Ines Montani	0729d1edca	Fix formatting	2018-09-12 15:32:08 +02:00
Ines Montani	907df53904	Add multi-threading note to Language.pipe (resolves #2582 ) [ci skip]	2018-09-12 15:03:30 +02:00
Ines Montani	885691a7ab	Describe converters more explicitly (see #2643 )	2018-09-12 14:53:03 +02:00
Steve Sharp	ca747f58a4	Update _install.jade (#2688 ) Typo fix: "models" -> "model"	2018-08-22 13:16:04 +02:00
Ines Montani	aeb49eb625	Update version [ci skip]	2018-08-16 16:56:02 +02:00
Ines Montani	a0eacd3293	Merge branch 'master' into develop	2018-08-16 16:55:05 +02:00
Ines Montani	c0fa9903f4	Update model directory JS [ci skip] Prevent the default release URL from being overwritten and add license type	2018-08-16 16:54:50 +02:00
Ines Montani	03f661fefb	Add Greek to models directory [ci skip]	2018-08-16 16:51:56 +02:00
Ines Montani	fd9d175a53	Update live code [ci skip]	2018-08-15 15:28:48 +02:00
Matthew Honnibal	4336397ecb	Update develop from master	2018-08-14 03:04:28 +02:00
Wojciech Łukasiewicz	3953e967a0	Use correct variable name in the examples (#2664 ) * correct naming * add contributor agreement	2018-08-13 22:21:24 +02:00
Ines Montani	71723cece1	Add note on visualizing long texts and sentences (see #2636 ) [ci skip]	2018-08-08 15:28:21 +02:00
Ines Montani	6147bd3eb4	Fix link target (closes #2645 ) [ci skip]	2018-08-08 15:03:52 +02:00
Ines Montani	8c47da1f19	Update Language serialization docs (see #2628 ) [ci skip] Add note on using from_disk and from_bytes via subclasses and add example	2018-08-07 14:17:57 +02:00
Matthew Honnibal	664cfc29bc	Merge branch 'master' of https://github.com/explosion/spaCy	2018-08-07 10:49:39 +02:00
Matthew Honnibal	2278c9734e	Fix spelling error #2640	2018-08-07 10:49:21 +02:00
Xiaoquan Kong	f0c9652ed1	New Feature: display more detail when Error E067 (#2639 ) * Fix off-by-one error * Add verbose option * Update verbose option * Update documents for verbose option	2018-08-07 10:45:29 +02:00
Ines Montani	6a4360e425	Update universe [ci skip]	2018-08-02 17:33:08 +02:00
Sami	dbc993f5b3	Updating description and code snippet spacy-lefff (#2623 ) * updating description and code snippet spacy-lefff * contributors agreement	2018-08-02 17:25:27 +02:00
Vikas Kumar Yadav	d3e21aad64	Update _benchmarks.jade (#2618 )	2018-08-02 00:28:28 +02:00
Brian Phillips	8227de0099	Update language.jade (#2616 )	2018-07-31 12:34:42 +02:00
Ioannis Daras	055cc0de44	Bug fix to pseudocode for tokenizer customization (#2604 )	2018-07-27 11:04:12 +02:00
Andriy Mulyar	e9ef51137d	Fixed typo (#2596 ) Changed 'The index of the first character after the span.' to 'The index of the last character after the span' in description of doc.char_span	2018-07-25 22:17:15 +02:00
Ines Montani	75f3234404	💫 Refactor test suite (#2568 ) ## Description Related issues: #2379 (should be fixed by separating model tests) * total execution time down from > 300 seconds to under 60 seconds 🎉 * removed all model-specific tests that could only really be run manually anyway – those will now live in a separate test suite in the [`spacy-models`](https://github.com/explosion/spacy-models) repository and are already integrated into our new model training infrastructure * changed all relative imports to absolute imports to prepare for moving the test suite from `/spacy/tests` to `/tests` (it'll now always test against the installed version) * merged old regression tests into collections, e.g. `test_issue1001-1500.py` (about 90% of the regression tests are very short anyways) * tidied up and rewrote existing tests wherever possible ### Todo - [ ] move tests to `/tests` and adjust CI commands accordingly - [x] move model test suite from internal repo to `spacy-models` - [x] ~~investigate why `pipeline/test_textcat.py` is flakey~~ - [x] review old regression tests (leftover files) and see if they can be merged, simplified or deleted - [ ] update documentation on how to run tests ### Types of change enhancement, tests ## Checklist - [x] I have submitted the spaCy Contributor Agreement. - [x] I ran the tests, and all new and existing tests passed. - [ ] My changes don't require a change to the documentation, or if they do, I've added all required information.	2018-07-24 23:38:44 +02:00
kororo	b1ec827ee0	Fix typo (#2579 ) Update slogan, desc and code snippet to latest version	2018-07-24 22:47:33 +02:00
ines	cd687091fb	Remove nl examples from widget for now [ci skip] Restore for next spaCy version when path to example sentences is fixed	2018-07-24 22:41:20 +02:00
ines	2d8ffb8bcd	Fix formatting	2018-07-24 22:40:49 +02:00
ines	1b3da8d2ae	Update website for v2.0.12 [ci skip]	2018-07-24 21:04:22 +02:00
ines	ae5ed2d698	Update docs for v2.0.12 [ci skip]	2018-07-21 15:51:44 +02:00
ines	d517dd4297	Document remove_extension methods	2018-07-21 15:51:28 +02:00
ines	153f41a5cc	Use better examples for Doc extension methods	2018-07-21 15:51:11 +02:00
ines	3c30d1763c	Merge branch 'master' into develop	2018-07-21 15:34:18 +02:00
kororo	2784babef9	Add ExcelCy into Universe list (#2572 ) Hi guys, This is my first spaCy extension. I am excited to be able to do this. Please do let me know if there are any suggestions or modifications I need to make. Feel free to use/contribute to the repo that I made. ## Description ExcelCy is a spaCy toolkit to help improve the data training experience. It provides easy annotation using the Excel file format. It has a helper to pre-train entity annotation with phrase and regex matcher pipes. ### Types of change Update to Universe list in website. ## Checklist - [x] I have submitted the spaCy Contributor Agreement. - [x] I ran the tests, and all new and existing tests passed. - [x] My changes don't require a change to the documentation, or if they do, I've added all required information.	2018-07-19 19:28:33 +02:00
ines	80e7485630	Merge branch 'master' into develop	2018-07-18 17:28:47 +02:00
Xiang Ji	19a5ef1c58	Fix venv command examples (#2560 ) [ci skip] * Fix venv command examples The documentation refers to `venv`, which is native to Python3. However, the command examples are as if they were still `virtualenv`, which is a package independent of `venv`: - It doesn't need to be installed via `pip`. In fact `pip install venv` would return an error. - The correct way to invoke `venv` is `python3 -m venv`, not `venv`, which would return command not found. See https://docs.python.org/3/library/venv.html I suspect the documentation simply replaced all occurrences of `virtualenv` with `venv`. However they are different modules and are used differently. * Update comment [ci skip]	2018-07-18 10:31:24 +02:00
ines	50c367ee96	Update meta [ci skip]	2018-07-10 13:51:45 +02:00
ines	3a321e79ac	Merge branch 'master' into develop	2018-07-10 13:49:08 +02:00
ines	71bfc92913	Exclude models for non-stable versions [ci skip]	2018-07-10 13:44:55 +02:00
ines	b5200962c0	Adjust formatting [ci skip]	2018-07-09 18:35:46 +02:00
Alex Villarreal	bd35bf7f09	Guidance to handle binary files in git in Windows (#2526 ) Adds guidance on what to do if users encounter the error described in [1634](https://github.com/explosion/spaCy/issues/1634), which probably only happens in Windows environments.	2018-07-09 18:31:37 +02:00
ines	f575b01595	Update language and license meta [ci skip]	2018-07-04 15:09:36 +02:00
ines	63666af328	Merge branch 'master' into develop	2018-07-04 14:52:25 +02:00
Matthew Honnibal	a85620a731	Note CoreNLP tokenizer correction on website	2018-07-02 11:35:31 +02:00
ines	06c6dc6fbc	Update Juniper [ci skip]	2018-06-28 11:48:17 +02:00
Nipun Sadvilkar	741ba80bd5	Train model command n_iteration 20 -> 30 (#2454 ) In source code `train.py` default Number of iterations is 30	2018-06-18 11:57:08 +02:00
ines	53a2bc8c8d	Only scroll sidebar item into view if needed [ci skip]	2018-06-12 10:58:50 +02:00
ines	65713a6593	Increment versions [ci skip]	2018-06-12 10:49:50 +02:00
Ines Montani	968f6f0bda	💫 Document Cython API (#2433 ) ## Description This PR adds the most relevant documentation of spaCy's Cython API. (Todo for when we publish this: rewrite `/api/#section-cython` and `/api/#cython` to `/api/cython#conventions`.) ### Types of change docs ## Checklist - [x] I have submitted the spaCy Contributor Agreement. - [x] I ran the tests, and all new and existing tests passed. - [x] My changes don't require a change to the documentation, or if they do, I've added all required information.	2018-06-11 17:47:46 +02:00
GolanLevy	72d7e80f94	adding a missing apostrophe (#2436 )	2018-06-11 17:47:24 +02:00
ines	778e5f4da3	Merge branch 'master' into develop	2018-06-11 00:38:04 +02:00
himkt	57311d5d47	replace janome with mecab in the documentation and the test (#2415 ) * Add links to Reddit data (see #2401) * replace janome with mecab in the documentation and the test * add the assignment	2018-06-11 00:33:13 +02:00
ines	effb55d591	Adjust formatting [ci skip]	2018-06-11 00:29:13 +02:00
Nathan Breit	ba6d2cf393	Add EpiTator to Universe (#2429 )	2018-06-11 00:24:13 +02:00
himkt	1a568f2e08	fix wrong documentations (#2423 )	2018-06-11 00:21:06 +02:00
Bohdan Moskalevskyi	d66292f767	fix UD data file extensions (#2425 ) * fix UD data files extension * add contributor agreement for msklvsk	2018-06-08 14:26:11 +02:00
ines	a0017e4909	Merge branch 'master' into develop	2018-05-30 14:10:47 +02:00
ines	0baaf836cf	Update formatting [ci skip]	2018-05-30 13:32:49 +02:00
ines	3913e18201	Add self-attentive-parser to universe (see #59 )	2018-05-30 13:31:28 +02:00
ines	4a62486340	Merge branch 'master' into develop	2018-05-30 13:01:01 +02:00
ines	605c663a4c	Fix HTML merger examples (see #2390 )	2018-05-30 12:22:32 +02:00
ines	d0b16aa014	Update list of languages	2018-05-26 18:56:26 +02:00
Samuel Pouyt	5f988b8e9c	Update _custom.jade (#2372 ) It seems based on the doc and trying out that the `en` or `[lang]` is missing from the `spacy model-init`	2018-05-26 18:17:12 +02:00
ines	d84a830d79	Merge branch 'master' of https://github.com/explosion/spaCy	2018-05-26 17:57:05 +02:00
ines	fb923b31ea	Fix bad HTML example (see #2376 ) and turn it into section on matcher + components Avoid problems caused by merging while matching (e.g. index errors). Creating a Matcher component also better reflects the recommended best practices.	2018-05-26 17:57:02 +02:00
Shantam Raj	592834183a	corrected spelling (#2359 ) changed interpretted to interpreted	2018-05-24 13:29:52 +02:00
ines	8adb967e0c	Fix from source quickstart instructions for Windows See: https://stackoverflow.com/a/50478036/6400719	2018-05-24 12:42:16 +02:00
Shantam Raj	1a4682dd0b	Update _training.jade (#2340 ) * Update _training.jade Correcting grammar. Replacing "The" with "To". * Create armsp.md * Update armsp.md	2018-05-21 11:09:33 +02:00
ines	ff1082d8e4	Add version tag in CLI docs [ci skip]	2018-05-21 01:17:49 +02:00
Ines Montani	d4cc736b7c	💫 Improve model downloads: check for existing install, customise pip and use requests library again (#2346) * Go back to using requests instead of urllib (closes #2320) Fewer dependencies are good, but this one was simply causing too many other problems around SSL verification and Python 2/3 compatibility. requests is a popular enough package that it's okay for spaCy to depend on it – and this will hopefully make model downloads less flakey. * Only download model if not installed (see #1456) Use #egg=model==version to allow pip to check for existing installations. The download is only started if no installation matching the package/version is found. Fixes a long-standing inconvenience. * Pass additional options to pip when installing model (resolves #1456) Treat all additional arguments passed to the download command as pip options to allow user to customise the command. For example: python -m spacy download en --user * Add CLI option to enable installing model package dependencies * Revert "Add CLI option to enable installing model package dependencies" This reverts commit `9336ffe695`. * Update documentation	2018-05-20 20:26:56 +02:00
vishnumenon	ae3719ece5	Fix the code for FACILITY entities (#2324) * Fix the code for FACILITY entities As far as I can tell, the default models all use "FAC" rather than "FACILITY" * Added my Contributor Agreement * Rename vishnumenon to vishnumenon.md	2018-05-12 15:19:17 +02:00
ines	ac25bc4016	Add docs section on sentence segmentation [ci skip]	2018-05-07 21:25:20 +02:00
ines	14148cd147	Fix formatting and wording	2018-05-07 21:24:35 +02:00
ines	f803da609f	Add scattertext [ci skip]	2018-05-07 19:10:23 +02:00
ines	c9547b7b8b	Update Juniper (see #2293)	2018-05-03 15:36:02 +02:00
Alex Villarreal	647f2544c5	Fix code sample for span.set_extension (#2286)	2018-05-03 00:39:22 +02:00
Alex Villarreal	13d562e1a4	Fix code sample for Doc.set_extension (#2282) * Fix code sample for `set_extension` The previous sample code for `set_extension` fails the assertion at the end, because `city_getter` checked if the whole document text matches any of the city names. Now it checks if any of the city names is contained in the document text. * Contributor agreement	2018-05-02 10:16:05 +02:00
Shirish Kadam	d98a90440f	Added Adam project to spaCy Universe (#2275) * Added 5hirish to contributors * Added Adam Qas Project to spaCy Universe * Remove $ from code example	2018-04-30 22:25:01 +02:00
ines	56e7faf16b	Fix spacing	2018-04-30 22:24:40 +02:00
ines	6efb4cdf88	Use Juniper and tidy up	2018-04-30 18:48:35 +02:00
ines	45bb8d75a5	Fix overflow issues on small screens [ci skip]	2018-04-29 03:17:36 +02:00
Ines Montani	49cee4af92	💫 Interactive code examples, spaCy Universe and various docs improvements (#2274) * Integrate Python kernel via Binder * Add live model test for languages with examples * Update docs and code examples * Adjust margin (if not bootstrapped) * Add binder version to global config * Update terminal and executable code mixins * Pass attributes through infobox and section * Hide v-cloak * Fix example * Take out model comparison for now * Add meta text for compat * Remove chart.js dependency * Tidy up and simplify JS and port big components over to Vue * Remove chartjs example * Add Twitter icon * Add purple stylesheet option * Add utility for hand cursor (special cases only) * Add transition classes * Add small option for section * Add thumb object for small round thumbnail images * Allow unset code block language via "none" value (workaround to still allow unset language to default to DEFAULT_SYNTAX) * Pass through attributes * Add syntax highlighting definitions for Julia, R and Docker * Add website icon * Remove user survey from navigation * Don't hide GitHub icon on small screens * Make top navigation scrollable on small screens * Remove old resources page and references to it * Add Universe * Add helper functions for better page URL and title * Update site description * Increment versions * Update preview images * Update mentions of resources * Fix image * Fix social images * Fix problem with cover sizing and floats * Add divider and move badges into heading * Add docstrings * Reference converting section * Add section on converting word vectors * Move converting section to custom section and fix formatting * Remove old fastText example * Move extensions content to own section Keep weird ID to not break permalinks for now (we don't want to rewrite URLs if not absolutely necessary) * Use better component example and add factories section * Add note on larger model * Use better example for non-vector * Remove similarity in context section Only works via small models with tensors so has always been kind of confusing * Add note on init-model command * Fix lightning tour examples and make executable if possible * Add spacy train CLI section to train * Fix formatting and add video * Fix formatting * Fix textcat example description (resolves #2246) * Add dummy file to try resolve conflict * Delete dummy file * Tidy up [ci skip] * Ensure sufficient height of loading container * Add loading animation to universe * Update Thebelab build and use better startup message * Fix asset versioning * Fix typo [ci skip] * Add note on project idea label	2018-04-29 02:06:46 +02:00
ines	a512fa60ef	Remove upcoming option from docs for now	2018-04-28 23:32:18 +02:00
ines	6fb6371670	Add collapse_phrases option to displacy (closes #2266)	2018-04-28 23:06:50 +02:00
Matt Upson	87cc6b3599	Add missing comma to NN example in docs (#2255) Also add a completed contributor agreement.	2018-04-28 14:56:00 +02:00
ines	4a3bea00c7	Update resources [ci skip]	2018-04-26 22:10:34 +02:00
Pradeep Kumar Tippa	df389e5b74	spacy-101 vocab doc giving valid variable names (#2236)	2018-04-18 14:54:26 -07:00
ines	ce63f8997b	Update init-model docs	2018-04-10 21:42:54 +02:00
ines	0e847d7fe5	Fix typo	2018-04-09 14:51:14 +02:00
ines	de137fba84	Add TensorBoard examples to examples overview [ci skip]	2018-04-03 16:01:52 +02:00
ines	6d87b28f15	Add Vietnamese to language overview [ci skip]	2018-04-03 16:01:36 +02:00
ines	9615ed5ed7	Update emoji/hashtag matcher example (resolves #2156) [ci skip]	2018-03-28 18:41:28 +02:00
ines	ce6071ca89	Remove ftfy dependency and update docs	2018-03-28 12:09:42 +02:00
ines	5ecc60cf3b	Add book to resources [ci skip]	2018-03-24 17:12:56 +01:00
ines	53680642af	Port over docs changes [ci skip]	2018-03-24 17:12:48 +01:00
Matthew Honnibal	f9f46e5a07	Revert matcher fixes from GregDubbin	2018-02-18 10:59:28 +01:00
ines	612c79a4f5	Update first matcher example and match_id (resolves #1989)	2018-02-17 11:57:38 +01:00
ines	ca56fb53d1	Add user survey to navigation [ci skip]	2018-02-15 12:14:30 +01:00
ines	cab5b775e7	Document ENT_TYPE matcher attribute [ci skip]	2018-02-15 12:14:19 +01:00
Pradeep Kumar Tippa	416cd021ce	Added TAG from spacy symbols which used below	2018-02-09 19:16:59 +05:30
Pradeep Kumar Tippa	01cc9cd9c0	assert statement syntax fix in doc	2018-02-09 19:16:25 +05:30
Pradeep Kumar Tippa	a78062e466	Merge remote-tracking branch 'upstream/master' into web-doc-patches	2018-02-09 19:13:19 +05:30
ines	ab33e274f5	Add more details on symlink error & Windows solution (resolves #1941) [ci skip]	2018-02-09 10:43:33 +01:00
ines	8eaa934382	Merge branch 'master' of https://github.com/explosion/spaCy	2018-02-09 10:23:36 +01:00
ines	e9f67be04d	Fix regex flag matcher example (resolves #1950)	2018-02-09 10:23:33 +01:00
ines	fc4ae04c55	Document LENGTH attribute in matcher	2018-02-09 10:23:03 +01:00
Pradeep Kumar Tippa	8a7467b26e	Merge remote-tracking branch 'upstream/master' into web-doc-patches	2018-02-09 13:54:26 +05:30
Orion Montoya	24af6375db	update link to Honnibal and Johnson 2015 aclweb.org is throwing a gateway timeout on the link as `https`+`aclweb.org`, but is fine with `https`+`www.aclweb.org` (also with `http`+`aclweb.org`, but let's keep it in `https`, shall we?)	2018-02-08 10:49:09 -08:00
Pradeep Kumar Tippa	03113d6779	Fixing navigating parse tree doc under dependency parse	2018-02-08 19:34:15 +05:30
ines	a3b965b29d	Remove UPPER from Matcher attributes docs (resolves #1949)	2018-02-08 11:29:27 +01:00
ines	696ae87b47	Fix whitespace	2018-02-08 11:28:54 +01:00
ines	26bc75134d	Fix typo	2018-02-08 11:28:44 +01:00
Pradeep Kumar Tippa	da9d687e75	Fixing typo from taining to training	2018-02-07 16:49:25 +05:30
Pradeep Kumar Tippa	ed7d268e93	Fixing vocab doc Replacing "like" with "love", coffee suffix should be "fee" but not "ffe"	2018-02-07 14:55:12 +05:30
ines	f377c483e4	Add note on manual entity order in displaCy [ci skip]	2018-02-07 01:08:42 +01:00
ines	58eb178667	Update Doc.char_span docs [ci skip]	2018-02-07 01:08:30 +01:00
sayf eddine hammemi	86e7727855	Fix typo in the word build.	2018-02-04 20:48:45 +01:00
ines	901bc0e85f	Add Persian to list of languages [ci skip]	2018-02-01 04:47:34 +01:00
Hassan Shamim	a0b912c528	fix broken link to test suite models	2018-01-30 15:01:01 -08:00
greg	daefed0a34	Correct documentation of '+' and '*' ops	2018-01-22 15:55:44 -05:00
ines	67ba73351d	Fix typo and use better serialization example (resolves #1851) [ci skip]	2018-01-16 18:42:03 +01:00
ines	7943a8e90c	Add spacy-lookup by @mpuig [ci skip]	2018-01-16 00:28:46 +01:00
ines	5684206154	Add LanguageCrunch by @artpar [ci skip]	2018-01-15 16:14:26 +01:00
Mateusz Tatusko	dda0e58c11	Update _pos-tags.jade really small changes to English tags description, but might help some people while working on projects 1) -PRB- should be -RRB- instead 2) space gets tagged as _SP, and not SP	2018-01-15 12:01:51 +09:00
ines	0536e91564	Add note on Tagger.tag_names vs. Tagger.labels (see #1666) [ci skip]	2018-01-14 14:37:19 +01:00
ines	bbee48080d	Clarify hyperparameters and alias usage in spacy train (resolves #1838) [ci skip]	2018-01-14 14:32:50 +01:00
ines	4daba3abda	Add regex section to rule-based matching docs (see #1567, #1833) [ci skip]	2018-01-14 14:22:13 +01:00
Ines Montani	36f426fe0a	Merge pull request #1808 from fucking-signup/master Fix issue #1769	2018-01-12 21:12:02 +00:00
ines	cfac5b955f	Fix alignment issues with newsletter signup form	2018-01-12 22:06:44 +01:00
ines	65babd9e2e	Fix typo, formatting and operator descriptions (resolves #1820)	2018-01-12 22:06:27 +01:00
Matthew Honnibal	a2a06dce24	Merge pull request #1792 from explosion/feature-improve-model-download 💫 Improve model downloading and linking	2018-01-11 20:02:08 +01:00
Ines Montani	11676b47f2	Merge pull request #1828 from wrathagom/patch-1 Small Grammar Fix to _basics.jade	2018-01-11 17:27:23 +00:00
pbnsilva	4cfd848bc3	Fixes typo in PhraseMatcher API docs	2018-01-11 17:35:59 +01:00
Caleb M. Keller	e68f6bf890	Small Grammar Fix to _basics.jade Fixed an incorrect word order.	2018-01-11 09:26:47 -05:00
Matthew Honnibal	7ca49c2061	Merge branch 'master' into feature-improve-model-download	2018-01-10 18:21:55 +01:00
Kit	db6e4ba72e	Update code example according to new changes	2018-01-08 03:45:56 +01:00
ines	ef210c73dd	Update cli.download and cli.validate docs	2018-01-03 21:34:03 +01:00
ines	cc9df10e69	Document util.set_lang_class (see #1737)	2018-01-03 20:13:25 +01:00
Ines Montani	874f174ab1	Merge pull request #1790 from nirdesh37/patch-1 Update goldparse.jade	2018-01-03 18:37:07 +00:00
ines	1fa6ba8130	Fix Doc.from_array example to make it work (see #1527)	2018-01-03 16:59:38 +01:00
ines	49635350f0	Add .from_disk() to pipeline component init example (resolves #1728)	2018-01-03 16:50:24 +01:00
ines	95063ba26b	Update tests documentation (resolves #1781)	2018-01-03 16:42:26 +01:00
nirdesh37	67fdceed6a	Update goldparse.jade	2018-01-03 17:25:21 +05:30
Martin Andrews	e4355dade2	Documentation example fix : token.head needs '==' rather than 'is' (similar change to #1689, it seems).	2017-12-18 18:12:10 +08:00
Kristofer Berggren	1cb8c997fb	Fix typo Span -> Token on Token API page Change Span.vector_norm to Token.vector_norm.	2017-12-17 20:32:19 +08:00
Ines Montani	4befd8bd44	Merge pull request #1724 from mpuels/patch-7 doc: Fix minor mistakes	2017-12-17 12:09:17 +00:00
ines	21482b391b	Fix head	2017-12-16 13:48:19 +01:00
mpuels	b3df2a2ffd	doc: Fix minor mistakes	2017-12-14 20:55:59 +01:00
mpuels	3f7bedadee	doc: Fix minor mistakes	2017-12-13 11:37:24 +01:00
ines	24e80c51b8	Document init-model command	2017-12-07 10:14:37 +01:00
mpuels	e3af19a076	doc: Replace 'is not' with '!=' in code example

The function `dependency_labels_to_root(token)` defined in section Get syntactic dependencies does not terminate. Here is a complete example:

    import spacy
    nlp = spacy.load('en')
    doc = nlp("Apple and banana are similar. Pasta and hippo aren't.")

    def dependency_labels_to_root(token):
        """Walk up the syntactic tree, collecting the arc labels."""
        dep_labels = []
        while token.head is not token:
            dep_labels.append(token.dep)
            token = token.head
        return dep_labels

    dep_labels = dependency_labels_to_root(doc[1])
    dep_labels

Replacing `is not` with `!=` solves the issue:

    import spacy
    nlp = spacy.load('en')
    doc = nlp("Apple and banana are similar. Pasta and hippo aren't.")

    def dependency_labels_to_root(token):
        """Walk up the syntactic tree, collecting the arc labels."""
        dep_labels = []
        while token.head != token:
            dep_labels.append(token.dep)
            token = token.head
        return dep_labels

    dep_labels = dependency_labels_to_root(doc[1])
    dep_labels

The output is ['cc', 'nsubj']	2017-12-06 20:08:42 +01:00
mpuels	82e575ebfb	doc: Fix assert statement in Lightning Tour Python 3 throws an error message on the original assert statement. Also, according to the Python documentation regarding the assert statement (https://docs.python.org/3/reference/simple_stmts.html#the-assert-statement), `assert` takes at least one argument and at most two. In the two-argument form the second argument is meant as an error message to be displayed when the assertion fails. I don't think this is intended in this case.	2017-12-06 16:40:51 +01:00
mpuels	662601f01c	doc: Add missing *-operator to nlp.disable_pipes() I'm using spaCy version 2.0.3. If I don't use the *-operator in the example, Python throws an error message. With the operator it works fine. Also according to the documentation of the function `nlp.disable_pipes()`, it expects one or more strings as arguments and not one argument being a list of strings.	2017-12-06 15:26:43 +01:00
ines	b078e276e6	Document offsets_from_biluo_tags	2017-12-06 13:40:51 +01:00
ines	fb663f9b7d	Add Russian to list of languages	2017-12-06 13:40:32 +01:00
ines	58a19518cf	Merge branch 'master' of https://github.com/explosion/spaCy	2017-12-05 13:17:58 +01:00
ines	7ade336ab7	Add "Unknown locale" issue to troubleshooting guide (see #1684, #1641, #1517)	2017-12-05 13:17:55 +01:00
Mark Dodwell	9d4c185860	Fix link to CLEAR Style dependency labels PDF	2017-12-04 23:28:06 -08:00
ines	40638b7cdf	Update resources	2017-12-02 04:16:03 +01:00
ines	9ea8a7cf0c	Add spacy_cld to extensions	2017-12-01 23:21:33 +01:00
ines	8d3f29322f	Add spacy_hunspell to resources (see #315)	2017-11-29 09:33:22 +01:00
atomobianco	f6a82da907	Corrected char index instead of token index Changed the index used to add the label because `displacy.render` apparently uses char index	2017-11-26 23:55:25 +01:00
ines	bda6e2a816	Add training example to lightning tour	2017-11-26 18:04:18 +01:00
ines	89f8b1fba0	Update example documents	2017-11-26 18:04:04 +01:00
ines	65d66b81f1	Fix typo	2017-11-26 18:03:44 +01:00
ines	e4ee666be5	Fix biluo_tags_from_offsets example and docs	2017-11-26 16:37:32 +01:00
ines	434030e0d0	Fix requirements.txt example (see #1638)	2017-11-26 15:53:19 +01:00
Matthew Honnibal	6bc9917a0e	Another small fix to component docs	2017-11-23 11:47:20 +01:00
markulrich	c9b63c0dfc	Use correct local parameter in example MyComponent (and added markulrich.md contributor file)	2017-11-22 15:59:08 -08:00
ines	4f7e64e371	Update resources	2017-11-18 02:53:00 +01:00
ines	c3051e95f7	Add note on attribute extension defaults (resolves #1587)	2017-11-17 19:14:29 +01:00
ines	954f8cc6d1	Update syntax theme (should move the modifications out to an extension sometime)	2017-11-17 19:13:53 +01:00
Raphaël Bournhonesque	a0793fd4cc	Fix typo	2017-11-17 17:57:55 +01:00
Martino Mensio	ce1aade41e	small typo on docs	2017-11-17 16:20:22 +01:00
pavillet	ad2935f0c3	Update _spacy.jade Doc example gives 'object is not subscriptable' error. Correcting as an attribute	2017-11-17 00:02:20 +01:00
ines	40c4e8fc09	Remove "optional" from dev_data arg and add more info (see #1578)	2017-11-14 20:26:05 +01:00
KMLDS	d5b20ac3b6	Update span.jade	2017-11-13 19:27:20 -05:00
ines	bc79274706	Fix typo	2017-11-13 17:00:03 +01:00
ines	7a7b01feb1	Update links	2017-11-13 08:30:06 +01:00
ines	b3e502a076	Add videos section to resources	2017-11-13 08:29:57 +01:00
ines	f2b6b98b75	Fix typo in code example (resolves #1556)	2017-11-13 08:29:16 +01:00
ines	ceb2c596f1	Update conda details	2017-11-11 13:07:00 +01:00
ines	4a97def06a	Update features	2017-11-10 19:05:10 +01:00
ines	dea5636d6c	Fix broken links	2017-11-10 13:06:38 +01:00
Wahib Faizi	0da56f8ef8	Fix typo. Add missing '='.	2017-11-10 14:51:24 +03:00
ines	4c5d2c80d5	Re-add python -m to commands, too brittle :( (see #1536)	2017-11-10 02:30:55 +01:00
ines	ee5697a1cd	Fix training tips	2017-11-10 00:19:42 +01:00
ines	6ae0ebfa3a	Update training tips	2017-11-10 00:17:10 +01:00
ines	b20779bac4	Update resources	2017-11-09 23:05:37 +01:00
ines	ed84688935	Remove old link	2017-11-09 15:34:12 +01:00
Ines Montani	e5b9ccdb5c	Merge pull request #1526 from mcsalgado/fix-typos fix typos	2017-11-09 15:33:55 +01:00
Victor Salgado	fe1d969d5f	fix typos	2017-11-09 10:55:13 -02:00
Mathias Deschamps	25b26f0d64	Fix similarity visual Doc was showing similarity when dissimilar	2017-11-09 11:08:26 +01:00
ines	98767122a7	Fix typos	2017-11-09 04:13:03 +01:00
ines	e87eb11beb	Update package.json	2017-11-09 04:12:57 +01:00
ines	33b84f4c39	Change clear_vectors to reset_vectors (resolves #1516)	2017-11-08 18:11:23 +01:00
ines	97a5892347	Document Vectors.resize() and update v2 incompatibilities (resolves #1514)	2017-11-08 17:11:11 +01:00
ines	c0a7a32bf8	Add en.stop_words change to v2 docs (resolves #1512)	2017-11-08 16:30:46 +01:00
ines	9b09b6b0cd	Fix formatting	2017-11-08 16:30:23 +01:00
ines	f0bdfb4471	Fix vector listing for core sm models in list overview (see #1513)	2017-11-08 16:24:27 +01:00
ines	94cd3d51db	Update v2 docs and model info Take out speed tables until we fix our benchmark tests on CPU and GPU	2017-11-08 11:43:00 +01:00
ines	14f97cfd20	Add note on stream processing to migration guide (see #1508)	2017-11-08 01:53:36 +01:00
ines	5d1162cf21	Improve nlp.update / training loop overview (see #1507)	2017-11-08 01:17:42 +01:00
ines	2229aba71c	Update website	2017-11-08 01:06:30 +01:00
ines	1768703e1c	Update website for v2.0	2017-11-07 14:48:17 +01:00
ines	e4a05385d6	Update docs	2017-11-07 12:33:43 +01:00
ines	a4662a31a9	Move model package templates to cli.package and update docs	2017-11-07 12:15:35 +01:00
ines	a09c096d3c	Get docs ready for v2.0.0	2017-11-07 12:00:43 +01:00
ines	173b1551af	Update examples	2017-11-07 01:22:30 +01:00
ines	c37837cad1	Update training docs	2017-11-07 01:06:31 +01:00
ines	c7bda87b17	Update model docs and add tips section	2017-11-07 01:05:37 +01:00
ines	a1261e8632	Fix formatting	2017-11-07 01:05:30 +01:00
ines	912c1b1821	Document "simple training style"	2017-11-07 00:23:19 +01:00
ines	ad6438ccdf	Update aside labels and under construction mixin	2017-11-07 00:23:00 +01:00
ines	8fb48b9b91	Update and document new util functions	2017-11-07 00:22:43 +01:00
ines	6447b8e396	Update v2 details	2017-11-06 21:15:36 +01:00
ines	008d7408cf	Make vectors vs. tensors more explicit in 101 (see #1498)	2017-11-06 20:16:38 +01:00
ines	71852d3f25	Fix code mixins	2017-11-06 20:16:19 +01:00
ines	3b0699c9fe	Update benchmarks and data table style	2017-11-06 19:36:02 +01:00
ines	ddff7dc474	Update GPU install docs	2017-11-06 19:35:36 +01:00
ines	64d0f97c67	Update benchmarks and models	2017-11-06 18:19:00 +01:00
Matthew Honnibal	6fdffd7246	Merge pull request #1497 from explosion/feature/improve-optimizer-handling 💫 Improve optimizer handling	2017-11-06 16:41:15 +01:00
ines	972298e0c9	Update Pipe component docs and training API	2017-11-06 14:42:24 +01:00
ines	f48e1973ed	Fix accuracy table descriptions	2017-11-06 14:12:11 +01:00
ines	2d85ee6b5d	Fix broken link	2017-11-06 13:27:30 +01:00
ines	efb0a7e934	Fix broken links	2017-11-06 13:20:36 +01:00
ines	42a99eae02	Update troubleshooting guide	2017-11-06 13:17:09 +01:00
ines	2dca9e71a1	Add notes on catastrophic forgetting (see #1496)	2017-11-06 13:17:02 +01:00
ines	e68d31bffa	Update models quickstart usage example	2017-11-06 13:06:26 +01:00
ines	2fe2c4942f	Update models directory and listing	2017-11-06 13:04:29 +01:00
ines	df1bdc7173	Add Dutch model	2017-11-06 02:44:59 +01:00
ines	333bef482f	Update pattern for Prism.js Python	2017-11-06 02:44:24 +01:00
ines	6b08aefd0c	Update formatting and styleguide	2017-11-05 23:31:31 +01:00
ines	e61a067c4b	Update v2 docs	2017-11-05 21:41:56 +01:00
ines	86d6bd7503	Fix wording	2017-11-05 19:23:50 +01:00
ines	6742657c4d	Fix website asset versioning	2017-11-05 19:23:45 +01:00
ines	2ca82d1f6e	Take out pt_core_news_sm for now	2017-11-05 18:57:04 +01:00
ines	a6ffa942bb	Update UD schemes	2017-11-05 18:46:24 +01:00
ines	3fa8900a6b	Don't include tag and label schemes in usage guide	2017-11-05 18:21:49 +01:00
ines	4810be4b44	Update POS scheme docs and add links for other schemes	2017-11-05 18:16:34 +01:00
ines	e7d0641125	Update POS row mixins	2017-11-05 18:16:16 +01:00
ines	15de2bb01d	Update and simplify other annotation scheme data	2017-11-05 16:09:48 +01:00
ines	2d59dd374b	Use collapsible sections for pos/dep scheme and update Will ensure better overview as we add more schemes for more languages	2017-11-05 16:09:30 +01:00
ines	a9c77e01b4	Add accordion component (collapsible section)	2017-11-05 16:08:13 +01:00
ines	3d4dff1845	Remove comment	2017-11-05 16:07:14 +01:00
ines	b53c2010db	Add global focus style for links	2017-11-05 16:07:00 +01:00
ines	f092506578	Use hidden attribute instead of style.display	2017-11-05 16:06:50 +01:00
ines	0e8157674a	Add Portuguese and French	2017-11-04 23:07:21 +01:00
ines	d9fa3c6054	Update adding languages example	2017-11-04 15:12:39 +01:00
ines	c83fe54f0c	Update venv docs in installation instructions	2017-11-04 14:27:55 +01:00
ines	2940938bd8	Use more distinct style for checkboxes in quickstart	2017-11-04 14:24:30 +01:00
ines	4793d56a3e	Update commands for building from source	2017-11-04 14:24:14 +01:00
ines	177bf4ee39	Update GitHub topic links	2017-11-04 14:02:28 +01:00
ines	2639ecd5f8	Add docs note on custom tokenizer rules (see #1491)	2017-11-03 23:33:18 +01:00
ines	380f2441b4	Fix script includes	2017-11-03 18:51:03 +01:00
Abhinav Sharma	c740277f9f	Minor typo [ nad => and ]	2017-11-03 16:30:44 +05:30
ines	1e16374687	Update models list to reflect spaCy v2.0.0a18	2017-11-03 11:29:34 +01:00
ines	a62b0727d8	Tidy up and always use bundle in built site for now Just to be safe	2017-11-03 11:29:21 +01:00
ines	d0f88af5b6	Hide error earlier	2017-11-03 11:29:04 +01:00
ines	43512c68b2	Fix vector details in model overview	2017-11-02 20:04:13 +01:00
ines	9baab241b4	Add skeleton language data for Turkish	2017-11-02 16:32:24 +01:00
ines	31e349a62c	Update model families	2017-11-02 16:13:38 +01:00
ines	15cbc61a6e	Adjust rendering of large numbers: 1234 -> 1.2k, 12345 -> 12.3k, 123456 -> 123k, 1234567 -> 1.2m	2017-11-02 16:13:18 +01:00
ines	391fce09d9	Update licenses	2017-11-01 23:04:40 +01:00
ines	c6fea3e5f6	Add Romanian and Croatian skeletons (experimental) Add language data templates to make it easier for others to contribute to the language support	2017-11-01 23:04:28 +01:00
ines	408f450ce0	Tidy up	2017-11-01 23:01:12 +01:00
ines	2fa53b39d5	Add dev dependency	2017-11-01 23:01:06 +01:00
ines	1976fb157f	Update licenses	2017-11-01 21:49:57 +01:00
ines	2ba4e4fc88	Fix broken links and add check_links shortcut script	2017-11-01 21:11:10 +01:00
ines	e5a4c31bb4	Adjust code line height	2017-11-01 19:49:42 +01:00

1390 Commits