8153 Commits

Author SHA1 Message Date
Matthew Honnibal
c4a81a47a4 Fix deserialization 2017-07-23 14:11:07 +02:00
Matthew Honnibal
2df563ad24 Remove optimization for textcat that caused loading problem 2017-07-23 14:10:51 +02:00
Matthew Honnibal
4fe77bced2 Add cfg attr to pipeline components 2017-07-23 00:52:47 +02:00
Matthew Honnibal
d8aa721664 Compute Language.meta with a property 2017-07-23 00:50:18 +02:00
Matthew Honnibal
54a539a113 Finish text classifier example 2017-07-23 00:34:12 +02:00
Matthew Honnibal
a88a7deffe Fix save/load of textcat config 2017-07-23 00:33:43 +02:00
Matthew Honnibal
c27fdaef6f Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-07-22 20:15:55 +02:00
Matthew Honnibal
2bc7d87c70 Add example for training text classifier 2017-07-22 20:15:32 +02:00
Matthew Honnibal
9bae0ddc50 Fix minibatching 2017-07-22 20:14:49 +02:00
Matthew Honnibal
ded0df5e2f Expose hyper-param as keyword arg 2017-07-22 20:14:37 +02:00
Matthew Honnibal
f5de8deeec Increment version 2017-07-22 20:04:53 +02:00
Matthew Honnibal
b55714d5d1 Make gold_tuples arg optional in begin_training 2017-07-22 20:04:43 +02:00
Matthew Honnibal
ed6c85fa3c Fix loading of text categories in GoldParse 2017-07-22 20:04:03 +02:00
Matthew Honnibal
6ffec9dfea Update _ml, for textcat model 2017-07-22 20:03:40 +02:00
ines
864cefd3b2 Update README.rst 2017-07-22 18:29:55 +02:00
ines
e349271506 Increment version 2017-07-22 18:29:30 +02:00
ines
ab8ffbaab7 Add text classification to v2 overview 2017-07-22 17:56:51 +02:00
ines
f085b88f9d Add TextCategorizer API docs stub 2017-07-22 17:56:33 +02:00
ines
ab1a4e8b3c Add Tensorizer API docs stub 2017-07-22 17:56:25 +02:00
ines
0fb89dd204 Add text classification usage guide template 2017-07-22 17:56:07 +02:00
ines
d05ab1b3a0 Add text classification to 101 overview and change order 2017-07-22 17:55:53 +02:00
ines
d2a7e5b8e5 Add GoldParse.cats attribute 2017-07-22 17:55:35 +02:00
ines
23d976ed00 Add Doc.cats attribute and missing v2 tag 2017-07-22 17:55:14 +02:00
Ines Montani
570964e67f Update README.rst 2017-07-22 16:20:19 +02:00
Matthew Honnibal
5494605689 Fiddle with regex pin 2017-07-22 16:09:50 +02:00
Matthew Honnibal
78fcf56dd5 Update version pin for regex library 2017-07-22 15:57:58 +02:00
Matthew Honnibal
d51d55bba6 Increment version 2017-07-22 15:43:16 +02:00
Matthew Honnibal
8ccf154413 Merge branch 'master' of https://github.com/explosion/spaCy 2017-07-22 15:42:44 +02:00
Matthew Honnibal
796b2f4c1b Remove print statements in tests 2017-07-22 15:42:38 +02:00
ines
7c4bf9994d Add note on requirements and preventing model re-downloads (closes #1143) 2017-07-22 15:40:12 +02:00
ines
de25bad036 Use lower min version for requests dependency (fixes #1137)
Ensure compatibility with docker-compose and other packages
2017-07-22 15:29:10 +02:00
ines
d7560047c5 Fix version 2017-07-22 15:24:33 +02:00
Matthew Honnibal
af945ea8e2 Merge branch 'master' of https://github.com/explosion/spaCy 2017-07-22 15:09:59 +02:00
Matthew Honnibal
4b2e5e59ed Add flush_cache method to tokenizer, to fix #1061
The tokenizer caches output for common chunks, for efficiency. This
cache must be invalidated when the tokenizer rules change, e.g. when a new
special-case rule is introduced. That's what was causing #1061.

When the cache is flushed, we free the intermediate token chunks.
I *think* this is safe --- but if we start getting segfaults, this patch
is to blame. The resolution would be to simply not free those bits of
memory. They'll be freed when the tokenizer exits anyway.
2017-07-22 15:06:50 +02:00
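
The caching problem described in the commit message above can be illustrated with a minimal sketch. This is a toy whitespace tokenizer, not spaCy's implementation: the class `CachingTokenizer`, its `add_special_case` method, and the example strings are hypothetical, and only the name `flush_cache` comes from the commit message. The sketch assumes the cache maps each whitespace-delimited chunk to its tokens, and that adding a special-case rule flushes it so stale entries are not reused.

```python
class CachingTokenizer:
    """Toy tokenizer with a per-chunk cache (hypothetical, not spaCy's code)."""

    def __init__(self):
        self._special_cases = {}  # chunk -> list of tokens
        self._cache = {}          # chunk -> previously computed tokens

    def add_special_case(self, chunk, tokens):
        # Changing the rules makes cached output stale, so flush the cache.
        self._special_cases[chunk] = list(tokens)
        self.flush_cache()

    def flush_cache(self):
        # Drop all cached chunks; they are recomputed on the next call.
        self._cache.clear()

    def _tokenize_chunk(self, chunk):
        if chunk in self._special_cases:
            return list(self._special_cases[chunk])
        return [chunk]

    def __call__(self, text):
        tokens = []
        for chunk in text.split():
            if chunk not in self._cache:
                self._cache[chunk] = self._tokenize_chunk(chunk)
            tokens.extend(self._cache[chunk])
        return tokens


tokenizer = CachingTokenizer()
before = tokenizer("I dont know")                 # caches "dont" as one token
tokenizer.add_special_case("dont", ["do", "nt"])  # rule change flushes the cache
after = tokenizer("I dont know")                  # "dont" is re-tokenized
```

Without the `flush_cache()` call in `add_special_case`, the second call would return the cached single token for "dont" and the new rule would be ignored, which is the kind of stale-cache behaviour the commit message attributes to #1061.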
Ines Montani
96df9c7154 Update CONTRIBUTORS.md 2017-07-22 15:05:46 +02:00
ines
b22b18a019 Add notes on spacy.explain() to annotation docs 2017-07-22 15:02:15 +02:00
Ines Montani
1ddbeddca2 Fix typo 2017-07-22 15:00:58 +02:00
ines
e3f23f9d91 Use latest available version in examples 2017-07-22 14:57:51 +02:00
Matthew Honnibal
23a55b40ca Default to English noun chunks iterator if no lang set 2017-07-22 14:15:25 +02:00
Matthew Honnibal
9750a0128c Fix Span.noun_chunks. Closes #1207 2017-07-22 14:14:57 +02:00
Matthew Honnibal
d9b85675d7 Rename regression test 2017-07-22 14:14:35 +02:00
Matthew Honnibal
dfbc7e49de Add test for Issue #1207 2017-07-22 14:14:01 +02:00
Matthew Honnibal
0ae3807d7d Fix gaps in Lexeme API. Closes #1031 2017-07-22 13:53:48 +02:00
Matthew Honnibal
83e1b5f1e3 Merge branch 'master' of https://github.com/explosion/spaCy 2017-07-22 13:45:35 +02:00
Matthew Honnibal
45f6961ae0 Add __version__ symbol in __init__.py 2017-07-22 13:45:21 +02:00
Matthew Honnibal
8b9c4c5e1c Add missing SP symbol to tag map, re #1052 2017-07-22 13:44:17 +02:00
Ines Montani
69396dcfd3 Update CONTRIBUTORS.md 2017-07-22 13:43:15 +02:00
Ines Montani
9af04ea11f Merge pull request #1161 from AlexisEidelman/patch-1
French NUM_WORDS and ORDINAL_WORDS
2017-07-22 13:40:46 +02:00
Matthew Honnibal
8b581fdac5 Remove unused example 2017-07-22 13:36:54 +02:00
Matthew Honnibal
44dd247e73 Merge branch 'master' of https://github.com/explosion/spaCy 2017-07-22 13:35:30 +02:00