2653 Commits

Author SHA1 Message Date
ines
ad8bf1829f Import and combine Portuguese tokenizer exceptions (see #943) 2017-04-01 10:37:42 +02:00
Ines Montani
f8b2d9c3b7 Merge pull request #943 from mamoit/master
Portuguese improvements
2017-04-01 10:32:00 +02:00
ines
3b667a24d4 Remove whitespace 2017-04-01 10:21:08 +02:00
ines
e71a1f4bd0 Fix download commands in error messages (see #946) 2017-04-01 10:20:57 +02:00
Matthew Honnibal
51882ee2b8 Fix check for setting ent_id in merge 2017-03-31 19:32:01 +02:00
Miguel Almeida
4fde64c4ea Portuguese contractions and some abbreviations 2017-03-31 15:52:55 +01:00
Miguel Almeida
465b240bcb Review Portuguese stop words
Mainly to review typos and add missing masculines/feminines
2017-03-31 13:00:47 +01:00
Matthew Honnibal
fc3900e5b2 Allow ent_id to be set in Token 2017-03-31 14:00:14 +02:00
Matthew Honnibal
9720103428 Improve attribute handling in doc.merge(). Still unsatisfying 2017-03-31 13:59:58 +02:00
Matthew Honnibal
cfff4e0f61 Improve test 2017-03-31 13:59:32 +02:00
Matthew Honnibal
1bb7b4ca71 Add comment 2017-03-31 13:59:19 +02:00
Matthew Honnibal
725249c59a Add merge_phrase callback in matcher.pyx 2017-03-31 13:58:59 +02:00
Matthew Honnibal
e854f28304 Add test for Issue #758
Issue #758 occurs when no actions are available for a single-token
doc after merging.
2017-03-31 13:26:25 +02:00
Miguel Almeida
c1d020b0a6 Remove "ista" from Portuguese stop words 2017-03-31 12:26:13 +01:00
Miguel Almeida
17a1e7a119 Add Portuguese numbers and ordinals 2017-03-31 12:21:01 +01:00
Matthew Honnibal
47a3ef06a6 Unhack deprojectivization, moving it into pipeline
Previously the deprojectivize() call was attached to the transition
system, and only called for German. Instead it should be a separate
process, called after the parser. This makes it available for any
language. Closes #898.
2017-03-31 12:31:50 +02:00
Joshua Reeter
564daf6dec Issue #934: symlink should not convert paths as_posix under Windows. 2017-03-30 23:47:45 -05:00
Bruno P. Kinoshita
c2d48974bc Fix typos in Portuguese stop words 2017-03-30 21:59:18 +13:00
Matthew Honnibal
0fefdfcbda Merge pull request #935 from ericzhao28/master
Add option to use label=ent_type in doc.merge arguments (Bug fix for issue #862)
2017-03-30 02:51:24 +02:00
ines
7e4befec88 Add Hebrew to init and setup.py 2017-03-29 10:34:57 +02:00
Grégory Howard
9c2996b27f correction of package.py (encoding on open instead of write) 2017-03-29 09:11:02 +02:00
Eric Zhao
aafdf6ffb8 Add option to use label kwarg to determine ent_type in doc.merge 2017-03-28 23:35:03 -07:00
Matthew Honnibal
83ba6c247c Fix init of Language without model 2017-03-26 16:46:00 +02:00
Matthew Honnibal
fa107f95f6 Remove unused train_config command 2017-03-26 09:28:59 -05:00
Matthew Honnibal
df83921f0a Increment version 2017-03-26 09:27:32 -05:00
Matthew Honnibal
92ac3af21d Merge branch 'master' of https://github.com/explosion/spaCy 2017-03-26 09:26:59 -05:00
Matthew Honnibal
a9b1f23c7d Enable regression loss for parser 2017-03-26 09:26:30 -05:00
ines
c00d997924 Merge branch 'develop' 2017-03-26 15:57:00 +02:00
Matthew Honnibal
2efdbc08ff Make training work with directories 2017-03-26 08:46:44 -05:00
ines
007a2492bd Remove train_config command for now 2017-03-26 15:40:50 +02:00
ines
b297fab062 Update error message for missing commands 2017-03-26 15:40:02 +02:00
ines
7f95023fc0 Fix formatting 2017-03-26 15:37:37 +02:00
ines
5901c8f7f0 Update spacy train CLI documentation 2017-03-26 15:33:48 +02:00
Matthew Honnibal
9dcb58aaaf Merge CLI changes 2017-03-26 07:30:45 -05:00
Matthew Honnibal
6b7f7a2060 Connect parser L1 option to train CLI 2017-03-26 07:24:07 -05:00
Matthew Honnibal
ed2b106f4d Fix circular import in lemmatizer 2017-03-26 07:17:07 -05:00
Matthew Honnibal
dec5571bf3 Update train CLI 2017-03-26 07:16:52 -05:00
ines
53cf2f1c0e Make dev data optional 2017-03-26 11:48:17 +02:00
Matthew Honnibal
5eac089fbe Merge branch 'master' into develop 2017-03-26 04:45:43 -05:00
ines
0fc56e2544 Update flag and defaults 2017-03-26 11:42:11 +02:00
Matthew Honnibal
2f63806ddb Update config when adding label. Re #910 2017-03-25 22:35:44 +01:00
Matthew Honnibal
b94286de30 Fix regression test 2017-03-25 22:35:07 +01:00
Matthew Honnibal
c748907a66 Fix errors in previous commit 2017-03-25 22:25:01 +01:00
Matthew Honnibal
4f400fa486 Prevent lemmatization of base nouns
Update lemmatizer's base-form check, for change in morphology class.
Closes #903.
2017-03-25 21:51:12 +01:00
Matthew Honnibal
850d35dcb3 Make morphology use int attributes internally
The morphology class was calling the lemmatizer inconsistently,
with some string-valued attributes. This caused Issue #903.
2017-03-25 21:49:10 +01:00
Matthew Honnibal
4454c1b23f Block lemmatization of base-form adjectives
Fixes check that an adjective is a base form (as opposed to a
comparative or superlative), so that it's not lemmatized.
e.g. inner -!> inn. Closes #912.
2017-03-25 21:29:57 +01:00
ines
97814f8da6 Update Windows Python 2 link workaround to use helper functions 2017-03-25 14:04:27 +01:00
ines
fdec758113 Add is_windows and is_python2 utility functions 2017-03-25 14:04:02 +01:00
Ines Montani
09837158e4 Merge pull request #921 from solresol/master
Possible solution to #909
2017-03-25 13:51:55 +01:00
Greg Baker
b7f714b498 Possible solution to #909 2017-03-25 21:36:38 +11:00