Wolfgang Seeker
3448cb40a4
integrated pseudo-projective parsing into parser
...
- nonproj.pyx holds a class PseudoProjectivity which currently holds
all functionality to implement Nivre & Nilsson 2005's pseudo-projective
parsing using the HEAD decoration scheme
- changed lefts/rights in Token to account for possible non-projective
structures
2016-03-01 10:09:08 +01:00
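Note on the commit above: the HEAD decoration scheme it mentions can be sketched in plain Python as follows. This is a minimal illustration, not the code in nonproj.pyx. The function names, the `|` delimiter, and the convention that the root's head is `None` are all assumptions made for this sketch. Each non-projective arc is lifted to the head's own head, shortest arc first, and the lifted dependent's label is decorated with the label of its original head.

```python
DELIMITER = "|"  # assumed separator; Nivre & Nilsson write the decorated label as d↑h


def is_descendant(token, ancestor, heads):
    """Return True if `ancestor` dominates `token` (root has head None)."""
    current = heads[token]
    steps = 0
    while current is not None and steps < len(heads):
        if current == ancestor:
            return True
        current = heads[current]
        steps += 1
    return False


def smallest_nonproj_arc(heads):
    """Return the dependent of the shortest non-projective arc, or None."""
    best = None
    for dep, head in enumerate(heads):
        if head is None:
            continue
        lo, hi = sorted((head, dep))
        # An arc is non-projective if some token it spans is not dominated by its head.
        if any(not is_descendant(k, head, heads) for k in range(lo + 1, hi)):
            if best is None or hi - lo < best[0]:
                best = (hi - lo, dep)
    return None if best is None else best[1]


def projectivize(heads, labels):
    """Lift non-projective arcs until the tree is projective (HEAD scheme)."""
    orig_heads, orig_labels = list(heads), list(labels)
    heads, labels = list(heads), list(labels)
    while True:
        dep = smallest_nonproj_arc(heads)
        if dep is None:
            break
        heads[dep] = heads[heads[dep]]  # reattach to the head's head
        # Decorate with the label of the original (syntactic) head.
        labels[dep] = orig_labels[dep] + DELIMITER + orig_labels[orig_heads[dep]]
    return heads, labels


new_heads, new_labels = projectivize(
    [2, None, 1, 1], ["obj", "root", "xcomp", "punct"]
)
print(new_heads, new_labels)
```

In this toy tree, token 0 attaches to token 2 across the root (token 1), so the arc is lifted to the root and the label `obj` becomes `obj|xcomp`.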
Wolfgang Seeker
56b7210e82
moved nonproj.py to syntax/nonproj.pyx
2016-02-25 15:08:49 +01:00
Wolfgang Seeker
4b2297d5d4
add class PseudoProjective for pseudo-projective parsing
...
PseudoProjective() implements the algorithm from Nivre & Nilsson 2005
using their HEAD decoration scheme.
2016-02-24 11:26:25 +01:00
Wolfgang Seeker
8d531c958b
replace tests for non-projectivity
...
- add functions to find non-projective edges
- add test file for non-projectivity functions
2016-02-22 14:40:40 +01:00
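Note on the commit above: one common way to find non-projective edges is to check, for each arc, whether every token it spans is dominated by the arc's head. The sketch below illustrates that idea only; it is not the code added in this commit. The helper names and the convention that the root's head is `None` are assumptions.

```python
def ancestors(token, heads):
    """Yield the ancestors of `token`, nearest first, stopping at the root."""
    current = heads[token]
    steps = 0
    while current is not None and steps < len(heads):  # bound guards against cycles
        yield current
        current = heads[current]
        steps += 1


def is_nonproj_arc(dep, heads):
    """True if the arc from heads[dep] to dep spans a token its head does not dominate."""
    head = heads[dep]
    if head is None:
        return False
    lo, hi = sorted((head, dep))
    return any(head not in ancestors(k, heads) for k in range(lo + 1, hi))


def nonproj_arcs(heads):
    """Return all non-projective arcs as (head, dependent) pairs."""
    return [(heads[d], d) for d in range(len(heads)) if is_nonproj_arc(d, heads)]


print(nonproj_arcs([2, None, 1, 1]))
```

Here token 0 attaches to token 2 across the root (token 1), which token 2 does not dominate, so that arc is reported.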
Wolfgang Seeker
eae35e9b27
add tokenizer files for German, add/change code to train German pos tagger
...
- add files to specify rules for German tokenization
- change generate_specials.py to generate from an external file (abbrev.de.tab)
- copy gazetteer.json from lang_data/en/
- init_model.py
  - change doc freq threshold to 0
- add train_german_tagger.py
  - expects conll09-formatted input
2016-02-18 13:24:20 +01:00
Henning Peters
9d8966a2c0
Update test_tokenizer.py
2016-02-10 19:24:37 +01:00
Henning Peters
82c57f21a4
Update requirements.txt
2016-02-10 18:56:21 +01:00
Matthew Honnibal
cc66a63e0a
Merge pull request #255 from henningpeters/master
...
py26 compatibility
2016-02-11 03:24:52 +11:00
Henning Peters
3b5f1e753b
py26 compatibility
2016-02-10 14:32:54 +01:00
Henning Peters
5c60847341
remove appveyor
2016-02-10 11:29:29 +01:00
Henning Peters
7e0d1dd8d3
remove appveyor
2016-02-10 11:28:55 +01:00
Henning Peters
73251eddac
Update README.md
2016-02-10 11:05:08 +01:00
Henning Peters
66765d4d8f
Update .appveyor.yml
2016-02-10 08:04:11 +01:00
Henning Peters
ee1f1ac300
mark test_sentence_space() as model test
2016-02-10 07:49:11 +01:00
Henning Peters
2072120d7d
Update package.json
2016-02-09 19:54:52 +01:00
Henning Peters
1c0c2f565b
Update .travis.yml
2016-02-09 19:34:24 +01:00
Henning Peters
62a6adf33a
Update .travis.yml
2016-02-09 19:29:23 +01:00
Henning Peters
c00dd43fe0
add sun data
2016-02-09 16:42:55 +01:00
Henning Peters
116ec3b849
switch to buildbot.json
2016-02-09 16:11:08 +01:00
Henning Peters
8d3957c5e6
switch to buildbot.json
2016-02-09 15:31:55 +01:00
Matthew Honnibal
bc9a31df3e
Merge branch 'master' of ssh://github.com/honnibal/spaCy
2016-02-09 14:43:30 +01:00
Matthew Honnibal
9fe814225a
* Add sense2vec-reddit draft
2016-02-09 14:43:05 +01:00
Henning Peters
7c0aa60ca7
Update README.md
2016-02-08 18:48:16 +01:00
Henning Peters
ab59d6ca91
Update README.md
2016-02-08 18:47:37 +01:00
Henning Peters
4ac755b0fb
Update README.md
2016-02-08 18:42:38 +01:00
Matthew Honnibal
ad5fa3f335
Merge branch 'master' of ssh://github.com/honnibal/spaCy
2016-02-08 14:00:30 +01:00
Matthew Honnibal
9bbc41c1e3
* Fix conda install instructions
2016-02-08 13:51:13 +01:00
Matthew Honnibal
9c24d0e7bb
* Fix author bio
2016-02-08 13:49:54 +01:00
Matthew Honnibal
5d96b3ef4f
* Increment version
2016-02-07 13:48:58 +01:00
Matthew Honnibal
dc61056183
* Fix parallel_parse script
2016-02-07 02:56:16 +01:00
Matthew Honnibal
18eaa44835
* Add parallel_parse example
2016-02-07 02:53:44 +01:00
Matthew Honnibal
9b303e158e
* Add example file to show answer to Issue #252
2016-02-07 01:13:40 +01:00
Matthew Honnibal
1b83cb9dfa
* Fix Issue #251 : Incorrect right edge calculation on left-clobber low in the tree
2016-02-07 00:00:42 +01:00
Matthew Honnibal
860fd11e98
* Don't import include files --- use the repository
2016-02-06 23:59:47 +01:00
Matthew Honnibal
c6623889c1
* Add test for Issue #251 : Incorrect right edges, caused by bad update to r_edge in del_arc, triggered from non-monotonic left-arc
2016-02-06 23:47:51 +01:00
Matthew Honnibal
84e0aa7118
* Add header files to repo, to prevent cross-compilation problems
2016-02-06 22:57:11 +01:00
Matthew Honnibal
a95974ad3f
* Fix oov probability
2016-02-06 15:13:55 +01:00
Matthew Honnibal
af8514cb0c
* Refine the way the is_parsed attribute is set by from_array
2016-02-06 14:44:35 +01:00
Matthew Honnibal
161b01d4c0
* Tweak usage example for multi-processing
2016-02-06 14:44:11 +01:00
Matthew Honnibal
963ccc2aee
* Add usage note for multi-threading
2016-02-06 14:43:39 +01:00
Matthew Honnibal
7f24229f10
* Don't try to pickle the tokenizer
2016-02-06 14:09:05 +01:00
Matthew Honnibal
dcb401f3e1
* Remove broken Vocab pickling
2016-02-06 14:08:47 +01:00
Matthew Honnibal
e66d45bf66
* Restore previous patch to Span.root, as it seems it wasn't the cause of the problem.
2016-02-06 13:37:41 +01:00
Matthew Honnibal
54e210d633
* Work on docs for new .pipe() method
2016-02-06 13:34:57 +01:00
Matthew Honnibal
4412a70dc5
* Initialize StateC._empty_token to 0, to avoid undefined behaviour.
2016-02-06 13:34:38 +01:00
Matthew Honnibal
1b41f868d2
* Check for errors in parser, and parallelise the left-over batch
2016-02-06 10:06:30 +01:00
Matthew Honnibal
031b00cb91
* Fix Span.root calculation
2016-02-05 20:12:09 +01:00
Matthew Honnibal
165ca28b80
* Set is_parsed flag in Parser.pipe
2016-02-05 19:51:44 +01:00
Matthew Honnibal
bdd579db0a
* Set is_parsed flag in Parser.pipe
2016-02-05 19:50:11 +01:00
Matthew Honnibal
7119e77fb6
* Fix Matcher.pipe
2016-02-05 19:46:02 +01:00