2893 Commits

Author SHA1 Message Date
Matthew Honnibal
108aca0e50 * Make Matcher use attrs from the attrs.pyx file, rather than having an incomplete function doing the mapping. 2016-04-14 10:37:39 +02:00
Matthew Honnibal
61d20de35d * Fix language.py docstring 2016-04-14 10:36:57 +02:00
Matthew Honnibal
04d0209be9 * Recognise multiple infixes in a token. 2016-04-13 18:38:26 +10:00
Matthew Honnibal
6df3858dbc * Fix Issue #323: Incorrect semantics of Token.__str__ built-in. Add flag to allow users to switch the old semantics back on, to ease transition. 2016-04-12 13:17:59 +10:00
Henning Peters
13a6899fc6 Merge pull request #329 from sjjpo2002/patch-1
Enable OpenMP compiler option for MSVC
2016-04-10 09:45:08 +02:00
SJ
91b3f1c12f Enable OpenMP compiler option for MSVC
Enable OpenMP compiler option for MSVC to support Multi-Threading for nlp.pipe()
2016-04-09 15:22:17 -07:00
Wolfgang Seeker
80bea62842 bugfix in unit test 2016-04-08 16:46:44 +02:00
Henning Peters
72e0de7330 Merge branch 'master' of github.com:spacy-io/spaCy 2016-04-08 14:52:38 +02:00
Henning Peters
29ad621825 add de 2016-04-08 14:52:29 +02:00
Henning Peters
7c4dde3a1b Update package.json 2016-04-08 14:48:47 +02:00
Wolfgang Seeker
be4903a1b2 update version numbers 2016-04-08 13:54:05 +02:00
Wolfgang Seeker
f9150ccf2a rename vectors.tgz to vectors.bz2 because it's not compressed with gzip but bzip 2016-04-08 13:38:07 +02:00
Wolfgang Seeker
1fe911cdb0 bigfix 2016-04-07 18:19:51 +02:00
Matthew Honnibal
872695759d Merge pull request #306 from wbwseeker/german_noun_chunks
add German noun chunk functionality
2016-04-08 00:54:24 +10:00
Matthew Honnibal
c628908479 * Pin Cython to <0.24, until we fix for new version 2016-04-07 11:51:53 +10:00
Matthew Honnibal
85485f5c2b Fix inconsistencies in generate_specials.py
Re Issue #321, fix inconsistencies in the script that generates specials.json. The result still isn't so satisfying --- we need to revise this as we move to parse more morphologically rich languages.
2016-04-07 11:21:52 +10:00
Henning Peters
357e2aaece Merge branch 'master' of github.com:spacy-io/spaCy 2016-04-05 11:26:22 +02:00
Henning Peters
470cdf5bf9 remove deprecated LOCAL_DATA_DIR 2016-04-05 11:25:54 +02:00
Henning Peters
67a2bd2197 Update README.rst 2016-04-01 20:20:22 +02:00
Ines Montani
54ae410bed Update GitHub links 2016-04-01 02:23:52 +11:00
Ines Montani
5f2654c30d Fix main container flex properties 2016-04-01 02:23:42 +11:00
Ines Montani
53a45995c4 Update readme 2016-04-01 01:30:19 +11:00
Ines Montani
1f8309a862 Replace website with new version 2016-04-01 01:24:48 +11:00
Ines Montani
f321272bee Update gitignore for website 2016-04-01 00:36:56 +11:00
Wolfgang Seeker
a8f4e49900 update init_model.py to previous (better) state 2016-03-29 16:12:13 +02:00
Matthew Honnibal
26622f0ffc Merge branch 'master' of ssh://github.com/honnibal/spaCy 2016-03-29 14:31:52 +11:00
Matthew Honnibal
b1fe41b45d * Extend infix test, commenting on limitation of tokenizer w.r.t. infixes at the moment. 2016-03-29 14:31:05 +11:00
Matthew Honnibal
9c73983bdd * Add test for hyphenation problem in Issue #302 2016-03-29 14:27:13 +11:00
Matthew Honnibal
d249e2f7f3 * Improve error message in bin/parser/train.py 2016-03-29 13:04:33 +11:00
Matthew Honnibal
910a6c805f * Add infix rule for double hyphens, re Issue #302 2016-03-29 13:03:44 +11:00
Matthew Honnibal
ad119c074f * Fix incorrect whitespacing in Doc.text. This change is potentially breaking, to anyone who was relying on the previous incorrect semantics. 2016-03-29 13:02:42 +11:00
Matthew Honnibal
8c7a1908ee Merge pull request #307 from scoder/faster_string_store
remove internal redundancy and overhead from StringStore
2016-03-29 12:59:52 +11:00
Matthew Honnibal
8c77a994c6 Merge pull request #305 from henningpeters/master
multiple langs in download script
2016-03-26 21:54:59 +11:00
Henning Peters
c90d4a6f17 relative imports in __init__.py 2016-03-26 11:44:53 +01:00
Henning Peters
db095a162c fix 2016-03-25 18:59:47 +01:00
Henning Peters
b8f63071eb add lang registration facility 2016-03-25 18:54:45 +01:00
Matthew Honnibal
9cd21ad5b5 Merge pull request #284 from olegzd/olegzd/example/inventoryCount
Added reloadable English() example for inventory counting
2016-03-25 09:48:47 +11:00
Matthew Honnibal
4a37fdcee1 Merge pull request #287 from wbwseeker/deproj_sentbnd_bug
add function to Token for setting head and dep (and dep_)
2016-03-25 09:47:45 +11:00
Stefan Behnel
f18805ee1c make StringStore.__contains__() return True for the empty string (which is also contained in iteration) 2016-03-24 15:42:12 +01:00
Stefan Behnel
f2cfbfc412 remove internal redundancy and overhead from StringStore 2016-03-24 15:25:27 +01:00
Wolfgang Seeker
d65ef41d08 make error messages language independent 2016-03-24 11:47:09 +01:00
Henning Peters
963570aa49 Merge branch 'master' of github.com:spacy-io/spaCy 2016-03-24 11:19:47 +01:00
Henning Peters
a7d7ea3afa first idea for supporting multiple langs in download script 2016-03-24 11:19:43 +01:00
Wolfgang Seeker
5080077097 revert init_model.py back to pre-german state (because it makes more sense)
simplify token.n_rights and token.n_lefts
2016-03-21 16:10:25 +01:00
Matthew Honnibal
a862edc0e6 Merge pull request #296 from elyase/patch-2
make use of log_smooth_count
2016-03-19 06:50:30 +11:00
Yaser Martinez Palenzuela
3c210f45fa make use of log_smooth_count 2016-03-17 12:19:52 +01:00
Wolfgang Seeker
5e2e8e951a add baseclass DocIterator for iterators over documents
add classes for English and German noun chunks

the respective iterators are set for the document when created by the parser
as they depend on the annotation scheme of the parsing model
2016-03-16 15:53:35 +01:00
Matthew Honnibal
80134eb12d Merge branch 'master' of https://github.com/spacy-io/spaCy 2016-03-15 19:14:50 +00:00
Matthew Honnibal
eaccbcda0f Fix bug in pos_tag.py script 2016-03-16 06:04:14 +11:00
Wolfgang Seeker
2ae253ef5b changed head.__set__ to make it simpler 2016-03-14 13:43:48 +01:00