9697 Commits

Author SHA1 Message Date
Ines Montani
3af0b2dd1c Add xfailing test for #1971 [ci skip] 2019-02-17 13:04:47 +01:00
Ines Montani
19a002bfd3 Merge branch 'master' into develop 2019-02-17 12:22:54 +01:00
Ines Montani
1e252b129c Auto-format 2019-02-17 12:22:07 +01:00
Roshni Biswas
e26d923726 Update morph_rules.py (#3283) 2019-02-17 12:21:47 +01:00
Matthew Honnibal
7d4a52a4d0 Set version to v2.1.0a7 2019-02-16 17:48:34 +01:00
Matthew Honnibal
07617b6b7f Set version to v2.1.0a7.dev12 2019-02-16 17:30:29 +01:00
Matthew Honnibal
808ae7521b Require thinc 7.0.1 2019-02-16 17:29:57 +01:00
Matthew Honnibal
1dc314bada Set version to v2.1.0a7.dev11 2019-02-16 17:02:49 +01:00
Matthew Honnibal
eea3001b98 Depend on thinc 7.0.1.dev2 2019-02-16 17:02:30 +01:00
Matthew Honnibal
2ef227c313 Set version to v2.1.0a7.dev1 2019-02-16 16:22:46 +01:00
Matthew Honnibal
f456b673d4 Require thinc 7.0.1.dev1 2019-02-16 16:22:26 +01:00
Matthew Honnibal
22923b9cb1 Set version to v2.1.0a7.dev9 2019-02-16 15:47:19 +01:00
Matthew Honnibal
11e826ac3b Require thinc v7.0.1.dev0 2019-02-16 15:47:02 +01:00
Matthew Honnibal
e0c91a4c8d Set version to 2.1.0a7 2019-02-16 14:43:38 +01:00
Matthew Honnibal
92b6bd2977 Refinements to retokenize.split() function (#3282)
* Change retokenize.split() API for heads

* Pass lists as values for attrs in split

* Fix test_doc_split filename

* Add error for mismatched tokens after split

* Raise error if new tokens don't match text

* Fix doc test

* Fix error

* Move deps under attrs

* Fix split tests

* Fix retokenize.split
2019-02-15 17:32:31 +01:00
Matthew Honnibal
2dbc61bc26 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2019-02-15 14:03:54 +01:00
Ines Montani
1aa57690dc Add xfailing test for orth mismatch in retokenizer.split 2019-02-15 13:55:04 +01:00
Ines Montani
819768483f Add xfailing test for out-of-bounds heads 2019-02-15 13:09:07 +01:00
Ines Montani
d8051e89ca Tidy up tests 2019-02-15 12:56:51 +01:00
Matthew Honnibal
58aac58631 Set version to v2.1.0a7.dev8 2019-02-15 12:39:26 +01:00
Matthew Honnibal
4c49f5f7b0 Update Thinc dependency 2019-02-15 12:39:08 +01:00
Matthew Honnibal
5f1abe2cc7 Set version to v2.1.0a7.dev7 2019-02-15 10:30:53 +01:00
Matthew Honnibal
a66e8e0c8a Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2019-02-15 10:30:22 +01:00
Ines Montani
c31a9dabd5 💫 Add en/em dash to prefixes and suffixes (#3281)
* Auto-format

* Add en/em dash to prefixes and suffixes
2019-02-15 10:29:59 +01:00
Ines Montani
5651a0d052 💫 Replace {Doc,Span}.merge with Doc.retokenize (#3280)
* Add deprecation warning to Doc.merge and Span.merge

* Replace {Doc,Span}.merge with Doc.retokenize
2019-02-15 10:29:44 +01:00
Matthew Honnibal
dcf79c5ef3 Set version to v2.1.0a7.dev6 2019-02-14 20:12:02 +01:00
Matthew Honnibal
0371ac23e7 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2019-02-14 20:09:10 +01:00
Ines Montani
f146121092 💫 Make handling of [Pipe].labels consistent (#3273)
* Make handling of [Pipe].labels consistent

* Un-xfail passing test

* Update spacy/pipeline/pipes.pyx

Co-Authored-By: ines <ines@ines.io>

* Update spacy/pipeline/pipes.pyx

Co-Authored-By: ines <ines@ines.io>

* Update spacy/tests/pipeline/test_pipe_methods.py

Co-Authored-By: ines <ines@ines.io>

* Update spacy/pipeline/pipes.pyx

Co-Authored-By: ines <ines@ines.io>

* Move error message to spacy.errors

* Fix textcat labels and test

* Make EntityRuler.labels return tuple as well
2019-02-15 06:03:19 +11:00
Ines Montani
3d577b77c6 Auto-formatting 2019-02-14 19:56:38 +01:00
Ines Montani
2569339a98 Formatting and whitespace [ci skip] 2019-02-14 18:05:07 +01:00
Matthew Honnibal
aebf71bc72 Set version to v2.1.0a7.dev5 2019-02-14 15:51:42 +01:00
Matthew Honnibal
6ccd67c682 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2019-02-14 15:51:12 +01:00
Ines Montani
e104e47c21 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2019-02-14 15:35:34 +01:00
Ines Montani
0cd01a8c5e Merge branch 'master' into develop 2019-02-14 15:35:20 +01:00
Ines Montani
2e31921d0a 💫 Add base Language classes for more languages (#3276)
* Add base classes for more languages

* Add test for language class initialization

Make sure language can be initialized – otherwise, it's difficult to catch serious errors in the test suite, because languages are lazy-loaded
2019-02-15 01:31:19 +11:00
Grivaz
39815513e2 Add split one token into several (resolves #2838) (#3253)
* Add split one token into several (resolves #2838)

* Improve error message for token splitting

* Make retokenizer.split() tests use a Token object

Change retokenizer.split() to use a Token object, instead of an index.

* Pass Token into retokenize.split()

Tweak retokenize.split() API so that we pass the `Token` object, not the index.

* Fix token.idx in retokenize.split()

* Test that token.idx is correct after split

* Fix token.idx for split tokens

* Fix retokenize.split()

* Fix retokenize.split

* Fix retokenize.split() test
2019-02-15 01:27:13 +11:00
Ines Montani
743ecf728c Tidy up conftest 2019-02-14 13:27:13 +01:00
Ines Montani
106d95b01a Fix typo 2019-02-14 12:26:56 +01:00
Ines Montani
11d6b874db Update stop_words.py 2019-02-14 12:25:19 +01:00
Ines Montani
60c2a3bb65 Also raise original error message in util.get_lang_class
Otherwise, the true error that happens within a Language subclass is swallowed, because if it's imported lazily like that, it'll always be an ImportError
2019-02-13 16:52:25 +01:00
Ines Montani
4d2438f985 Tidy up and auto-format 2019-02-13 15:29:08 +01:00
Ines Montani
fbf9f1edf1 Also raise error in Span.__reduce__ 2019-02-13 13:22:05 +01:00
Matthew Honnibal
1831e1423d Set version to v2.1.0a7.dev4 2019-02-13 23:08:40 +11:00
Matthew Honnibal
bed956c698 Drop regex dependency 2019-02-13 23:08:22 +11:00
Matthew Honnibal
63dc4234a3 Set version to v2.1.0a7.dev3 2019-02-13 22:53:10 +11:00
Matthew Honnibal
b7ea39564f Set version to v2.1.0a7.dev2 2019-02-13 22:52:43 +11:00
Ines Montani
2d0c3c73f4 Raise better error if token is pickled (resolves #2833) (#3267) 2019-02-13 11:27:04 +01:00
Ines Montani
2f45bd94c0 Auto-formatting 2019-02-12 18:30:11 +01:00
Ines Montani
0184a95340 Merge branch 'master' into develop 2019-02-12 18:29:24 +01:00
Akhilesh
a78db10941 add kannada support (#3264)
* add kannada support

* add few more stop words

* add support for Kannada Language
2019-02-12 18:28:39 +01:00