Commit Graph

2900 Commits

Author SHA1 Message Date
Matthew Honnibal
6f82065761 * Fix infixed commas in tokenizer, re Issue #326. Need to benchmark on empirical data, to make sure this doesn't break other cases. 2016-04-14 11:36:03 +02:00
Matthew Honnibal
0f957dd586 Merge branch 'master' of ssh://github.com/honnibal/spaCy 2016-04-14 10:37:56 +02:00
Matthew Honnibal
108aca0e50 * Make Matcher use attrs from the attrs.pyx file, rather than having an incomplete function doing the mapping. 2016-04-14 10:37:39 +02:00
Matthew Honnibal
61d20de35d * Fix language.py docstring 2016-04-14 10:36:57 +02:00
Matthew Honnibal
04d0209be9 * Recognise multiple infixes in a token. 2016-04-13 18:38:26 +10:00
Henning Peters
a473d6e937 fix tests (use english model) 2016-04-12 16:41:57 +02:00
Henning Peters
f2d011c034 avoid polluting spacy namespace with lang classes 2016-04-12 16:31:16 +02:00
Henning Peters
ff690f76ba fix loading non-german models 2016-04-12 16:00:56 +02:00
Henning Peters
6215272786 remove ujson as default non-dev dependency (still works as fallback if installed), because ujson doesn't ship wheels 2016-04-12 11:28:07 +02:00
Henning Peters
5f699883dd make openmp on windows optional 2016-04-12 10:12:57 +02:00
Matthew Honnibal
6df3858dbc * Fix Issue #323: Incorrect semantics of Token.__str__ built-in. Add flag to allow users to switch the old semantics back on, to ease transition. 2016-04-12 13:17:59 +10:00
Henning Peters
13a6899fc6 Merge pull request #329 from sjjpo2002/patch-1
Enable OpenMP compiler option for MSVC
2016-04-10 09:45:08 +02:00
SJ
91b3f1c12f Enable OpenMP compiler option for MSVC
Enable OpenMP compiler option for MSVC to support Multi-Threading for nlp.pipe()
2016-04-09 15:22:17 -07:00
Wolfgang Seeker
80bea62842 bugfix in unit test 2016-04-08 16:46:44 +02:00
Henning Peters
72e0de7330 Merge branch 'master' of github.com:spacy-io/spaCy 2016-04-08 14:52:38 +02:00
Henning Peters
29ad621825 add de 2016-04-08 14:52:29 +02:00
Henning Peters
7c4dde3a1b Update package.json 2016-04-08 14:48:47 +02:00
Wolfgang Seeker
be4903a1b2 update version numbers 2016-04-08 13:54:05 +02:00
Wolfgang Seeker
f9150ccf2a rename vectors.tgz to vectors.bz2 because it's not compressed with gzip but bzip 2016-04-08 13:38:07 +02:00
Wolfgang Seeker
1fe911cdb0 bigfix 2016-04-07 18:19:51 +02:00
Matthew Honnibal
872695759d Merge pull request #306 from wbwseeker/german_noun_chunks
add German noun chunk functionality
2016-04-08 00:54:24 +10:00
Matthew Honnibal
c628908479 * Pin Cython to <0.24, until we fix for new version 2016-04-07 11:51:53 +10:00
Matthew Honnibal
85485f5c2b Fix inconsistencies in generate_specials.py
Re Issue #321, fix inconsistencies in the script that generates specials.json. The result still isn't so satisfying --- we need to revise this as we move to parse more morphologically rich languages.
2016-04-07 11:21:52 +10:00
Henning Peters
357e2aaece Merge branch 'master' of github.com:spacy-io/spaCy 2016-04-05 11:26:22 +02:00
Henning Peters
470cdf5bf9 remove deprecated LOCAL_DATA_DIR 2016-04-05 11:25:54 +02:00
Henning Peters
67a2bd2197 Update README.rst 2016-04-01 20:20:22 +02:00
Ines Montani
54ae410bed Update GitHub links 2016-04-01 02:23:52 +11:00
Ines Montani
5f2654c30d Fix main container flex properties 2016-04-01 02:23:42 +11:00
Ines Montani
53a45995c4 Update readme 2016-04-01 01:30:19 +11:00
Ines Montani
1f8309a862 Replace website with new version 2016-04-01 01:24:48 +11:00
Ines Montani
f321272bee Update gitignore for website 2016-04-01 00:36:56 +11:00
Wolfgang Seeker
a8f4e49900 update init_model.py to previous (better) state 2016-03-29 16:12:13 +02:00
Matthew Honnibal
26622f0ffc Merge branch 'master' of ssh://github.com/honnibal/spaCy 2016-03-29 14:31:52 +11:00
Matthew Honnibal
b1fe41b45d * Extend infix test, commenting on limitation of tokenizer w.r.t. infixes at the moment. 2016-03-29 14:31:05 +11:00
Matthew Honnibal
9c73983bdd * Add test for hyphenation problem in Issue #302 2016-03-29 14:27:13 +11:00
Matthew Honnibal
d249e2f7f3 * Improve error message in bin/parser/train.py 2016-03-29 13:04:33 +11:00
Matthew Honnibal
910a6c805f * Add infix rule for double hyphens, re Issue #302 2016-03-29 13:03:44 +11:00
Matthew Honnibal
ad119c074f * Fix incorrect whitespacing in Doc.text. This change is potentially breaking, to anyone who was relying on the previous incorrect semantics. 2016-03-29 13:02:42 +11:00
Matthew Honnibal
8c7a1908ee Merge pull request #307 from scoder/faster_string_store
remove internal redundancy and overhead from StringStore
2016-03-29 12:59:52 +11:00
Matthew Honnibal
8c77a994c6 Merge pull request #305 from henningpeters/master
multiple langs in download script
2016-03-26 21:54:59 +11:00
Henning Peters
c90d4a6f17 relative imports in __init__.py 2016-03-26 11:44:53 +01:00
Henning Peters
db095a162c fix 2016-03-25 18:59:47 +01:00
Henning Peters
b8f63071eb add lang registration facility 2016-03-25 18:54:45 +01:00
Matthew Honnibal
9cd21ad5b5 Merge pull request #284 from olegzd/olegzd/example/inventoryCount
Added reloadable English() example for inventory counting
2016-03-25 09:48:47 +11:00
Matthew Honnibal
4a37fdcee1 Merge pull request #287 from wbwseeker/deproj_sentbnd_bug
add function to Token for setting head and dep (and dep_)
2016-03-25 09:47:45 +11:00
Stefan Behnel
f18805ee1c make StringStore.__contains__() return True for the empty string (which is also contained in iteration) 2016-03-24 15:42:12 +01:00
Stefan Behnel
f2cfbfc412 remove internal redundancy and overhead from StringStore 2016-03-24 15:25:27 +01:00
Wolfgang Seeker
d65ef41d08 make error messages language independent 2016-03-24 11:47:09 +01:00
Henning Peters
963570aa49 Merge branch 'master' of github.com:spacy-io/spaCy 2016-03-24 11:19:47 +01:00
Henning Peters
a7d7ea3afa first idea for supporting multiple langs in download script 2016-03-24 11:19:43 +01:00