Commit Graph

9856 Commits

Author SHA1 Message Date
Roshni Biswas
e09f1347fa updates for Bengali language (#3286)
* Update morph_rules.py

* contributor agreement for roshni-b

* created example sentences
2019-02-18 10:02:28 +01:00
Ines Montani
d47bbbe2a1 Update layout.sass 2019-02-17 22:27:45 +01:00
Ines Montani
212ff359ef Fix links [ci skip] 2019-02-17 22:25:50 +01:00
Ines Montani
04b4df0ec9 Remove n_threads 2019-02-17 22:25:42 +01:00
Ines Montani
4c7ab7620a Update README.md 2019-02-17 22:16:17 +01:00
Ines Montani
8a8523d8c1 Update README.md 2019-02-17 21:59:52 +01:00
Ines Montani
e597110d31
💫 Update website (#3285)
<!--- Provide a general summary of your changes in the title. -->

## Description

The new website is implemented using [Gatsby](https://www.gatsbyjs.org) with [Remark](https://github.com/remarkjs/remark) and [MDX](https://mdxjs.com/). This allows authoring content in **straightforward Markdown** without the usual limitations. Standard elements can be overwritten with powerful [React](http://reactjs.org/) components and wherever Markdown syntax isn't enough, JSX components can be used. Hopefully, this update will also make it much easier to contribute to the docs. Once this PR is merged, I'll implement auto-deployment via [Netlify](https://netlify.com) on a specific branch (to avoid building the website on every PR). There's a bunch of other cool stuff that the new setup will allow us to do – including writing front-end tests, service workers, offline support, implementing a search and so on.

This PR also includes various new docs pages and content.
Resolves #3270. Resolves #3222. Resolves #2947. Resolves #2837.


### Types of change
enhancement

## Checklist
<!--- Before you submit the PR, go over this checklist and make sure you can
tick off all the boxes. [] -> [x] -->
- [x] I have submitted the spaCy Contributor Agreement.
- [x] I ran the tests, and all new and existing tests passed.
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
2019-02-17 19:31:19 +01:00
Ines Montani
043e8186f3 Merge branch 'master' into develop 2019-02-17 17:51:17 +01:00
Marc Puig
51268e9f21 Typo error fixed (#3284) 2019-02-17 17:51:02 +01:00
Ines Montani
3af0b2dd1c Add xfailing test for #1971 [ci skip] 2019-02-17 13:04:47 +01:00
Ines Montani
19a002bfd3 Merge branch 'master' into develop 2019-02-17 12:22:54 +01:00
Ines Montani
1e252b129c Auto-format 2019-02-17 12:22:07 +01:00
Roshni Biswas
e26d923726 Update morph_rules.py (#3283) 2019-02-17 12:21:47 +01:00
Matthew Honnibal
7d4a52a4d0 Set version to v2.1.0a7 2019-02-16 17:48:34 +01:00
Matthew Honnibal
07617b6b7f Set version to v2.1.0a7.dev12 2019-02-16 17:30:29 +01:00
Matthew Honnibal
808ae7521b Require thinc 7.0.1 2019-02-16 17:29:57 +01:00
Matthew Honnibal
1dc314bada Set version to v2.1.0a7.dev11 2019-02-16 17:02:49 +01:00
Matthew Honnibal
eea3001b98 Depend on thinc 7.0.1.dev2 2019-02-16 17:02:30 +01:00
Matthew Honnibal
2ef227c313 Set version to v2.1.0a7.dev1 2019-02-16 16:22:46 +01:00
Matthew Honnibal
f456b673d4 Require thinc 7.0.1.dev1 2019-02-16 16:22:26 +01:00
Matthew Honnibal
22923b9cb1 Set version to v2.1.0a7.dev9 2019-02-16 15:47:19 +01:00
Matthew Honnibal
11e826ac3b Require thinc v7.0.1.dev0 2019-02-16 15:47:02 +01:00
Matthew Honnibal
e0c91a4c8d Set version to 2.1.0a7 2019-02-16 14:43:38 +01:00
Matthew Honnibal
92b6bd2977
Refinements to retokenize.split() function (#3282)
* Change retokenize.split() API for heads

* Pass lists as values for attrs in split

* Fix test_doc_split filename

* Add error for mismatched tokens after split

* Raise error if new tokens don't match text

* Fix doc test

* Fix error

* Move deps under attrs

* Fix split tests

* Fix retokenize.split
2019-02-15 17:32:31 +01:00
Matthew Honnibal
2dbc61bc26 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2019-02-15 14:03:54 +01:00
Ines Montani
1aa57690dc Add xfailing test for orth mismatch in retokenizer.split 2019-02-15 13:55:04 +01:00
Ines Montani
819768483f Add xfailing test for out-of-bounds heads 2019-02-15 13:09:07 +01:00
Ines Montani
d8051e89ca Tidy up tests 2019-02-15 12:56:51 +01:00
Matthew Honnibal
58aac58631 Set version to v2.1.0a7.dev8 2019-02-15 12:39:26 +01:00
Matthew Honnibal
4c49f5f7b0 Update Thinc dependency 2019-02-15 12:39:08 +01:00
Matthew Honnibal
5f1abe2cc7 Set version to v2.1.0a7.dev7 2019-02-15 10:30:53 +01:00
Matthew Honnibal
a66e8e0c8a Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2019-02-15 10:30:22 +01:00
Ines Montani
c31a9dabd5 💫 Add en/em dash to prefixes and suffixes (#3281)
* Auto-format

* Add en/em dash to prefixes and suffixes
2019-02-15 10:29:59 +01:00
Ines Montani
5651a0d052 💫 Replace {Doc,Span}.merge with Doc.retokenize (#3280)
* Add deprecation warning to Doc.merge and Span.merge

* Replace {Doc,Span}.merge with Doc.retokenize
2019-02-15 10:29:44 +01:00
Matthew Honnibal
dcf79c5ef3 Set version to v2.1.0a7.dev6 2019-02-14 20:12:02 +01:00
Matthew Honnibal
0371ac23e7 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2019-02-14 20:09:10 +01:00
Ines Montani
f146121092 💫 Make handling of [Pipe].labels consistent (#3273)
* Make handling of [Pipe].labels consistent

* Un-xfail passing test

* Update spacy/pipeline/pipes.pyx

Co-Authored-By: ines <ines@ines.io>

* Update spacy/pipeline/pipes.pyx

Co-Authored-By: ines <ines@ines.io>

* Update spacy/tests/pipeline/test_pipe_methods.py

Co-Authored-By: ines <ines@ines.io>

* Update spacy/pipeline/pipes.pyx

Co-Authored-By: ines <ines@ines.io>

* Move error message to spacy.errors

* Fix textcat labels and test

* Make EntityRuler.labels return tuple as well
2019-02-15 06:03:19 +11:00
Ines Montani
3d577b77c6 Auto-formatting 2019-02-14 19:56:38 +01:00
Ines Montani
2569339a98 Formatting and whitespace [ci skip] 2019-02-14 18:05:07 +01:00
Matthew Honnibal
aebf71bc72 Set version to v2.1.0a7.dev5 2019-02-14 15:51:42 +01:00
Matthew Honnibal
6ccd67c682 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2019-02-14 15:51:12 +01:00
Ines Montani
e104e47c21 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2019-02-14 15:35:34 +01:00
Ines Montani
0cd01a8c5e Merge branch 'master' into develop 2019-02-14 15:35:20 +01:00
Ines Montani
2e31921d0a 💫 Add base Language classes for more languages (#3276)
* Add base classes for more languages

* Add test for language class initialization

Make sure language can be initialize – otherwise, it's difficult to catch serious errors in the test suite, because languages are lazy-loaded
2019-02-15 01:31:19 +11:00
Grivaz
39815513e2 Add split one token into several (resolves #2838) (#3253)
* Add split one token into several (resolves #2838)

* Improve error message for token splitting

* Make retokenizer.split() tests use a Token object

Change retokenizer.split() to use a Token object, instead of an index.

* Pass Token into retokenize.split()

Tweak retokenize.split() API so that we pass the `Token` object, not the index.

* Fix token.idx in retokenize.split()

* Test that token.idx is correct after split

* Fix token.idx for split tokens

* Fix retokenize.split()

* Fix retokenize.split

* Fix retokenize.split() test
2019-02-15 01:27:13 +11:00
Ines Montani
743ecf728c Tidy up conftest 2019-02-14 13:27:13 +01:00
Ines Montani
106d95b01a Fix typo 2019-02-14 12:26:56 +01:00
Ines Montani
11d6b874db
Update stop_words.py 2019-02-14 12:25:19 +01:00
Ines Montani
60c2a3bb65 Also raise original error message in util.get_lang_class
Otherwise, the true error that happens within a Language subclass is swallowed, because if it's imported lazily like that, it'll always be an ImportError
2019-02-13 16:52:25 +01:00
Ines Montani
4d2438f985 Tidy up and auto-format 2019-02-13 15:29:08 +01:00