Commit Graph

9006 Commits

Author SHA1 Message Date
Matthew Honnibal
8cd06cc763 Try to fix root-outside-sentence bug 2018-05-02 14:39:48 +00:00
Matthew Honnibal
acebd01033 Set cildren from heads in finalize doc 2018-05-02 14:19:22 +00:00
Alex Villarreal
13d562e1a4 Fix code sample for Doc.set_extension (#2282)
* Fix code sample for `set_extension`

The previous sample code for `set_extension` fails the assertion at the end, because `city_getter` it checked if the whole document text matches any of the city names. Now it checks if any of the city names is contained in the document text.

* Contributor agreement
2018-05-02 10:16:05 +02:00
Matthew Honnibal
569440a6db Dont normalize gradient by batch size 2018-05-02 08:42:10 +02:00
Matthew Honnibal
281e29cbcd Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2018-05-02 01:36:23 +00:00
Matthew Honnibal
2338e8c7fc Update develop from master 2018-05-02 01:36:12 +00:00
Matthew Honnibal
9d147e12c4 Merge remote-tracking branch 'origin/master' into develop 2018-05-01 18:18:51 +02:00
Matthew Honnibal
8562faeb39 Fix conll2017 fab command 2018-05-01 18:04:58 +02:00
Matthew Honnibal
116ae46802 Improve experiment management 2018-05-01 17:51:22 +02:00
Matthew Honnibal
6d0fe67b72 Constrain subtok label to adjacent tokens 2018-05-01 17:34:27 +02:00
Matthew Honnibal
8f21953fc5 Constrain subtok to adjacent words 2018-05-01 17:29:00 +02:00
Matthew Honnibal
b43bfd3524 Fix arc-eager oracle tests 2018-05-01 16:16:14 +02:00
Matthew Honnibal
31ed64e9b0 Fix textcat test 2018-05-01 15:18:39 +02:00
Matthew Honnibal
548bdff943 Update default Adam settings 2018-05-01 15:18:20 +02:00
Matthew Honnibal
adbb1f7533 Add better arc-eager oracle tests 2018-05-01 15:14:55 +02:00
Matthew Honnibal
697bcaa34f Add some methods to ArcEager that make testing easier 2018-05-01 15:13:14 +02:00
Matthew Honnibal
a5f6d69f8a Require new dev build of Thinc 2018-05-01 15:05:00 +02:00
Mr Roboto
6f5ccda19c Addresses Issue #2228 - Deserialization fails when using tensor=False or sentiment=False (#2230)
* Fixes issue #2228

* Adds a new contributor
2018-05-01 13:40:22 +02:00
Matthew Honnibal
d44bb45c72 Fix scoring if tokenization changes 2018-05-01 01:33:20 +02:00
Shirish Kadam
d98a90440f Added Adam project to spaCy Universe (#2275)
* Added 5hirish to contributors

* Added Adam Qas Project to spaCy Universe

* Remove $ from code example
2018-04-30 22:25:01 +02:00
ines
56e7faf16b Fix spacing 2018-04-30 22:24:40 +02:00
ines
6efb4cdf88 Use Juniper and tidy up 2018-04-30 18:48:35 +02:00
Matthew Honnibal
2b26c007cd Revert "Disable batch size compounding in ud-train"
This reverts commit 8a120fb455.
2018-04-29 14:09:02 +00:00
Matthew Honnibal
723b328062 Add script to run UD test 2018-04-29 15:50:25 +02:00
Matthew Honnibal
17af6aa3a4 Update ud_train script 2018-04-29 15:49:32 +02:00
Matthew Honnibal
5de8a36537 Fix arc_eager is_nonproj_tree 2018-04-29 15:49:11 +02:00
Matthew Honnibal
5260268f70 Fix textcat after merge 2018-04-29 15:48:53 +02:00
Matthew Honnibal
ad3d56c3ba Fix compile error in matcher 2018-04-29 15:48:34 +02:00
Matthew Honnibal
a8bc947fd4 Fix Token.set_extension 2018-04-29 15:48:19 +02:00
Matthew Honnibal
535833a38d Fix syntax error in setup.py 2018-04-29 15:47:54 +02:00
Matthew Honnibal
2c4a6d66fa Merge master into develop. Big merge, many conflicts -- need to review 2018-04-29 14:49:26 +02:00
ines
45bb8d75a5 Fix overflow issues on small screens [ci skip] 2018-04-29 03:17:36 +02:00
Ines Montani
49cee4af92
💫 Interactive code examples, spaCy Universe and various docs improvements (#2274)
* Integrate Python kernel via Binder

* Add live model test for languages with examples

* Update docs and code examples

* Adjust margin (if not bootstrapped)

* Add binder version to global config

* Update terminal and executable code mixins

* Pass attributes through infobox and section

* Hide v-cloak

* Fix example

* Take out model comparison for now

* Add meta text for compat

* Remove chart.js dependency

* Tidy up and simplify JS and port big components over to Vue

* Remove chartjs example

* Add Twitter icon

* Add purple stylesheet option

* Add utility for hand cursor (special cases only)

* Add transition classes

* Add small option for section

* Add thumb object for small round thumbnail images

* Allow unset code block language via "none" value

(workaround to still allow unset language to default to DEFAULT_SYNTAX)

* Pass through attributes

* Add syntax highlighting definitions for Julia, R and Docker

* Add website icon

* Remove user survey from navigation

* Don't hide GitHub icon on small screens

* Make top navigation scrollable on small screens

* Remove old resources page and references to it

* Add Universe

* Add helper functions for better page URL and title

* Update site description

* Increment versions

* Update preview images

* Update mentions of resources

* Fix image

* Fix social images

* Fix problem with cover sizing and floats

* Add divider and move badges into heading

* Add docstrings

* Reference converting section

* Add section on converting word vectors

* Move converting section to custom section and fix formatting

* Remove old fastText example

* Move extensions content to own section

Keep weird ID to not break permalinks for now (we don't want to rewrite URLs if not absolutely necessary)

* Use better component example and add factories section

* Add note on larger model

* Use better example for non-vector

* Remove similarity in context section

Only works via small models with tensors so has always been kind of confusing

* Add note on init-model command

* Fix lightning tour examples and make excutable if possible

* Add spacy train CLI section to train

* Fix formatting and add video

* Fix formatting

* Fix textcat example description (resolves #2246)

* Add dummy file to try resolve conflict

* Delete dummy file

* Tidy up [ci skip]

* Ensure sufficient height of loading container

* Add loading animation to universe

* Update Thebelab build and use better startup message

* Fix asset versioning

* Fix typo [ci skip]

* Add note on project idea label
2018-04-29 02:06:46 +02:00
ines
3c80f69ff5 Return data in cli.info and add silent option (resolves #2196) 2018-04-29 01:59:44 +02:00
ines
1c6d77610c Add remove_extension method on Doc, Token and Span (closes #2242) 2018-04-28 23:33:09 +02:00
ines
a512fa60ef Remove upcoming option from docs for now 2018-04-28 23:32:18 +02:00
ines
abdb853ebf Simplify underscore tests 2018-04-28 23:30:33 +02:00
ines
6fb6371670 Add collapse_phrases option to displacy (closes #2266) 2018-04-28 23:06:50 +02:00
Matt Upson
87cc6b3599 Add missing comma to NN example in docs (#2255)
Also add a completed contributor agreement.
2018-04-28 14:56:00 +02:00
Robin Linderborg
1f9904ef12 fixes #2238 (#2241)
* Remove erroneous lemma lookup år > åra in Swedish

* Add contributors agreement

* Add contrib agreement to correct directory

* Revert change to CONTRIBUTOR_AGREEMENT
2018-04-28 14:55:22 +02:00
Robin Linderborg
d01f503b54 Remove incorrect lemma lookup gäng->gänga (#2252)
* Remove incorrect lemma lookup gäng->gänga
In modern Swedish, "gäng" is mostly associated with "gang" or "group of people". The removed lemma lookup lemmatized it to the verb "thread".

* Add contrib agreement to correct directory

* Revert change to CONTRIBUTOR_AGREEMENT
2018-04-28 14:54:41 +02:00
ines
4a3bea00c7 Update resources [ci skip] 2018-04-26 22:10:34 +02:00
Suraj Krishnan Rajan
69d041148f Implement Fast-Text vectors with subword features 2018-04-21 01:34:14 +05:30
ines
686225eadd Fix Spanish noun_chunks (resolves #2210)
Make sure 'NP' label is added to StringStore and move noun_bounds helper into a closure to allow reusing label sets
2018-04-18 18:44:01 -04:00
ines
9632595fb4 Use correct, non-deprecated merge syntax (resolves #2226) 2018-04-18 18:28:28 -04:00
Suraj Rajan
5957f15227 Fixed typos for #2222,#2223 (#2233) (closes #2222, closes #2223) 2018-04-18 14:55:26 -07:00
Pradeep Kumar Tippa
df389e5b74 spacy-101 vocab doc giving valid variable names (#2236) 2018-04-18 14:54:26 -07:00
Ines Montani
b4d35c7dfb
Update README.rst 2018-04-10 23:05:49 +02:00
Matthew Honnibal
97851d2c4e Increment version to v2.0.12.dev0 2018-04-10 22:20:16 +02:00
Matthew Honnibal
ed39c75a92 Merge branch 'master' of https://github.com/explosion/spaCy 2018-04-10 22:19:40 +02:00