Matthew Honnibal
4ef68c413f
Approximate cost in Break transition, to speed things up a bit.
2017-03-15 16:40:27 -05:00
Matthew Honnibal
8543db8a5b
Use ftrl optimizer in parser
2017-03-15 11:56:37 -05:00
ines
4cfc8ffbd2
Reformat pickle tests
2017-03-15 17:39:54 +01:00
ines
2a0fcf1354
Add tests for new download module
2017-03-15 17:39:43 +01:00
ines
4c53eed35a
Remove sputnik from dependencies and docs
2017-03-15 17:39:25 +01:00
ines
b62322d602
Add requests to requirements
2017-03-15 17:39:08 +01:00
ines
71956c94db
Handle deprecated language-specific model downloading
2017-03-15 17:37:55 +01:00
ines
58b884b6d4
Refactor download script and about.py to use new download method
2017-03-15 17:37:18 +01:00
ines
f5d1a39a5b
Add util functions for printing and wrapping messages
2017-03-15 17:35:57 +01:00
ines
d7898d586f
Fix formatting and remove unused code
2017-03-15 17:35:41 +01:00
ines
b672e95045
Fix formatting
2017-03-15 17:35:04 +01:00
ines
0474e706a0
Remove unused deprecated functions for sputnik
2017-03-15 17:34:54 +01:00
ines
b13e7f79b4
Fix formatting and remove unused imports
2017-03-15 17:33:57 +01:00
ines
1101fd3855
Fix formatting and remove unused imports
2017-03-15 17:33:39 +01:00
ines
842782c128
Move fix_deprecated_glove_vectors_loading to deprecated.py
2017-03-15 17:33:29 +01:00
Matthew Honnibal
4cab8ac136
Update morph exceptions test
2017-03-15 09:31:34 -05:00
Matthew Honnibal
d719f8e77e
Use nogil in parser, and set L1 to 0.0 by default
2017-03-15 09:31:01 -05:00
Matthew Honnibal
c61c501406
Update beam-parser to allow parser to maintain nogil
2017-03-15 09:30:22 -05:00
Matthew Honnibal
3d4e389d23
Whitespace
2017-03-15 09:29:42 -05:00
Matthew Honnibal
7769bc31e3
Add beam-search classes
2017-03-15 09:27:41 -05:00
Matthew Honnibal
c79b3129e3
Fix setting of empty lexeme in initial parse state
2017-03-15 09:26:53 -05:00
Matthew Honnibal
d864708072
Add more morphology names in attrs.pyx
2017-03-15 09:26:16 -05:00
Matthew Honnibal
b382dc902c
Add morph rules in Language
2017-03-15 09:24:40 -05:00
Matthew Honnibal
8dbff4f5f4
Wire up English lemma and morph rules.
2017-03-15 09:23:22 -05:00
Matthew Honnibal
f70be44746
Use lemmatizer in code, not from downloaded model.
2017-03-15 04:52:50 -05:00
ines
42ba740dde
Revert "Merge branch 'debug'"
...
This reverts commit 89b79d1178
, reversing
changes made to 02bdf490a1
.
2017-03-13 20:11:52 +01:00
ines
89b79d1178
Merge branch 'debug'
2017-03-13 18:19:58 +01:00
ines
4c5f51e49e
Update regression test
2017-03-13 15:16:11 +01:00
ines
02bdf490a1
Remove regression test to see if it caused pytest Travis error
2017-03-13 13:00:22 +01:00
ines
17018750ac
Add regression test for #717
2017-03-13 12:58:22 +01:00
ines
2883ebfca2
Remove print statement
2017-03-13 12:30:42 +01:00
ines
98c13d8aa9
Add regression test for #401
2017-03-13 12:28:41 +01:00
ines
444d665f9d
Add regression test for #686
2017-03-13 12:23:35 +01:00
ines
46b17e5b51
Add regression test for #719
2017-03-13 12:17:35 +01:00
ines
c8ae682ff9
Add regression test for #636
2017-03-13 12:08:31 +01:00
ines
337f9601f2
Add missing unicode declaration
2017-03-13 12:08:19 +01:00
ines
d70386ec6e
Update docstring in #886 regression test
2017-03-13 12:00:38 +01:00
ines
51ba3ef0a8
Add regression test for #886
2017-03-13 11:44:58 +01:00
Ines Montani
13b1e6b3e0
Update README.rst
2017-03-12 20:03:15 +01:00
Ines Montani
fa0be1f2fe
Update README.rst
2017-03-12 20:00:41 +01:00
Ines Montani
39dc0d98d2
Update changelog to use table
2017-03-12 19:32:17 +01:00
ines
eec3f21c50
Add WordNet license
2017-03-12 13:58:24 +01:00
ines
0fcdde6c4d
Remove old WordNet data files
2017-03-12 13:58:24 +01:00
ines
f9e603903b
Rename stop_words.py to word_sets.py and include more sets
...
NUM_WORDS and ORDINAL_WORDS are currently not used, but the hard-coded
list should be removed from orth.pyx and replaced to use
language-specific functions. This will later allow other languages to
use their own functions to set those flags. (In English, this is easier
because it only needs to be checked against a set – in German for
example, this requires a more complex function, as most number words
are one word.)
2017-03-12 13:58:22 +01:00
ines
f24f9b4b7b
Remove unused code
2017-03-12 13:58:22 +01:00
ines
1da29a7146
Use new Lemmatizer data and remove file import
...
Since there's currently only an English lemmatizer, the global
Lemmatizer imports from spacy.en. This is unideal and still needs to be
fixed.
2017-03-12 13:58:22 +01:00
ines
0957737ee8
Add Python-formatted lemmatizer data and rules
2017-03-12 13:58:22 +01:00
ines
c89e30d1a3
Add test for English time exceptions ("1a.m." etc.)
2017-03-12 13:58:22 +01:00
ines
ce9568af84
Move English time exceptions ("1a.m." etc.) and refactor
2017-03-12 13:58:22 +01:00
ines
6b30541774
Fix formatting
2017-03-12 13:58:22 +01:00