Matthew Honnibal
|
4f765eee79
|
Merge branch 'gaz' of https://github.com/honnibal/spaCy into gaz
|
2015-09-06 14:07:43 +02:00 |
|
Matthew Honnibal
|
7e4fea67d3
|
* Fix bug in token subtree, introduced by duplication of L/R code in Stateclass. Need to consolidate the two methods.
|
2015-09-06 10:48:36 +02:00 |
|
Matthew Honnibal
|
571b6eda88
|
* Upd tests
|
2015-09-06 05:40:10 +02:00 |
|
Matthew Honnibal
|
5edac11225
|
* Wrap self.parse in nogil, and break if an invalid move is predicted. The invalid break is a work-around that papers over likely bugs, but we can't easily break in the nogil block, and otherwise we'll get an infinite loop. Need to set this as an error flag.
|
2015-09-06 04:15:00 +02:00 |
|
Matthew Honnibal
|
fd1eeb3102
|
* Add POS attribute support in get_attr
|
2015-09-06 04:13:03 +02:00 |
|
Matthew Honnibal
|
534e3dda3c
|
* More work on language independent parsing
|
2015-08-28 03:44:54 +02:00 |
|
Matthew Honnibal
|
c2307fa9ee
|
* More work on language-generic parsing
|
2015-08-28 02:02:33 +02:00 |
|
Matthew Honnibal
|
86c4a8e3e2
|
* Work on new morphology organization
|
2015-08-27 23:11:51 +02:00 |
|
Matthew Honnibal
|
5b89e2454c
|
* Improve error-reporting in tagger
|
2015-08-27 10:26:36 +02:00 |
|
Matthew Honnibal
|
f0a7c99554
|
* Relax rule-requirement in lemmatizer
|
2015-08-27 10:26:19 +02:00 |
|
Matthew Honnibal
|
b6b1e1aa12
|
* Add link for Finnish model
|
2015-08-27 10:26:02 +02:00 |
|
Matthew Honnibal
|
0af139e183
|
* Tagger training now working. Still need to test load/save of model. Morphology still broken.
|
2015-08-27 09:16:11 +02:00 |
|
Matthew Honnibal
|
320ced276a
|
* Add tagger training script
|
2015-08-27 09:15:41 +02:00 |
|
Matthew Honnibal
|
56c4e07a59
|
Update gazetteer.json
|
2015-08-27 08:53:48 +10:00 |
|
Matthew Honnibal
|
c07eea8563
|
* Comment out old doc tests for now
|
2015-08-26 19:23:04 +02:00 |
|
Matthew Honnibal
|
884251801e
|
* Mark space tests as requiring model
|
2015-08-26 19:22:50 +02:00 |
|
Matthew Honnibal
|
ff9db9f3ae
|
* Fix serializer tests for new attr scheme
|
2015-08-26 19:22:26 +02:00 |
|
Matthew Honnibal
|
658c4a3930
|
* Mark test_inital as requiring models
|
2015-08-26 19:22:06 +02:00 |
|
Matthew Honnibal
|
1302d35dff
|
* Rework interfaces in vocab
|
2015-08-26 19:21:46 +02:00 |
|
Matthew Honnibal
|
2d521768a3
|
* Store Morphology class in Vocab
|
2015-08-26 19:21:03 +02:00 |
|
Matthew Honnibal
|
d30029979e
|
* Avoid import of morphology in spans
|
2015-08-26 19:20:46 +02:00 |
|
Matthew Honnibal
|
119c0f8c3f
|
* Hack out morphology stuff from tokenizer, while morphology is being reimplemented.
|
2015-08-26 19:20:11 +02:00 |
|
Matthew Honnibal
|
b4faf551f5
|
* Refactor language-independent tagger class
|
2015-08-26 19:19:21 +02:00 |
|
Matthew Honnibal
|
a3d5e6c0dd
|
* Reform constructor and save/load workflow in parser model
|
2015-08-26 19:19:01 +02:00 |
|
Matthew Honnibal
|
1d7f2d3abc
|
* Hack on morphology structs
|
2015-08-26 19:18:36 +02:00 |
|
Matthew Honnibal
|
f8f2f4e545
|
* Temporarily add PUNC name to parts_of_speech dictionary, until better solution
|
2015-08-26 19:18:19 +02:00 |
|
Matthew Honnibal
|
008b02b035
|
* Comment out enums in Morphology for now
|
2015-08-26 19:17:35 +02:00 |
|
Matthew Honnibal
|
378729f81a
|
* Hack Morphology class towards usability
|
2015-08-26 19:17:21 +02:00 |
|
Matthew Honnibal
|
430affc347
|
* Fix missing n_patterns property in Matcher class. Fix from_dir method
|
2015-08-26 19:17:02 +02:00 |
|
Matthew Honnibal
|
3acf60df06
|
* Add missing properties in Lexeme class
|
2015-08-26 19:16:28 +02:00 |
|
Matthew Honnibal
|
76996f4145
|
* Hack on generic Language class. Still needs work for morphology, defaults, etc
|
2015-08-26 19:16:09 +02:00 |
|
Matthew Honnibal
|
e2ef78b29c
|
* Gut pos.pyx module, since functionality moved to spacy/tagger.pyx
|
2015-08-26 19:15:42 +02:00 |
|
Matthew Honnibal
|
c4d8754385
|
* Specify LOCAL_DATA_DIR global in spacy.en.__init__.py
|
2015-08-26 19:15:07 +02:00 |
|
Matthew Honnibal
|
c2d8edd0bd
|
* Add PROB attribute in attrs.pxd
|
2015-08-26 19:14:19 +02:00 |
|
Matthew Honnibal
|
dc13edd7cb
|
* Refactor init_model to accommodate other languages
|
2015-08-26 19:14:05 +02:00 |
|
Matthew Honnibal
|
494da25872
|
* Refactor for more universal spacy
|
2015-08-26 19:13:50 +02:00 |
|
Matthew Honnibal
|
c5a27d1821
|
* Move lemmatizer to spacy
|
2015-08-25 15:47:08 +02:00 |
|
Matthew Honnibal
|
82217c6ec6
|
* Generalize lemmatizer
|
2015-08-25 15:46:19 +02:00 |
|
Matthew Honnibal
|
8083a07c3e
|
* Use language base class
|
2015-08-25 15:37:30 +02:00 |
|
Matthew Honnibal
|
f2f699ac18
|
* Add language base class
|
2015-08-25 15:37:17 +02:00 |
|
jxs8172
|
85f01c5e16
|
Add contributor agreement. Add exception to 'it' so that 'its' and 'Its' aren't generated (its =/= it's)
|
2015-08-24 18:20:06 -04:00 |
|
Matthew Honnibal
|
25f29232ca
|
Merge pull request #86 from vsolovyov/fix-c-ext-in-setuppy
Correctly pass link_args in c_ext() in setup.py
|
2015-08-24 20:18:49 +10:00 |
|
Vsevolod Solovyov
|
bbdb973398
|
Add contributor agreement for vsolovyov
|
2015-08-24 13:09:23 +03:00 |
|
Vsevolod Solovyov
|
39cfe28f33
|
Correctly pass link_args in c_ext() in setup.py
|
2015-08-24 12:52:05 +03:00 |
|
Matthew Honnibal
|
5dd76be446
|
* Split EnPosTagger up into base class and subclass
|
2015-08-24 05:25:55 +02:00 |
|
Matthew Honnibal
|
bbf07ac253
|
* Cut down init_model to work on more languages
|
2015-08-24 01:05:20 +02:00 |
|
Matthew Honnibal
|
5d5922dbfa
|
* Begin laying out morphological features
|
2015-08-24 01:04:30 +02:00 |
|
Matthew Honnibal
|
6f1743692a
|
* Work on language-independent refactoring
|
2015-08-23 20:49:18 +02:00 |
|
Matthew Honnibal
|
3879d28457
|
* Fix https for url detection
|
2015-08-23 02:40:35 +02:00 |
|
Matthew Honnibal
|
aa12b374c0
|
* Remove old doc tests
|
2015-08-22 22:12:55 +02:00 |
|