Gregory Howard
|
0e8c41ea4f
|
Adding method lemmatizer for every class
|
2017-05-03 12:14:42 +02:00 |
|
Gregory Howard
|
f2ab7d77b4
|
Lazy imports language
|
2017-05-03 11:01:42 +02:00 |
|
ghoward
|
55c6910f90
|
Lookup table for languages in spaCy.
Need to find another name for lemmatizerlookup. I was not inspired.
Trying to use new files in the fr language.
|
2017-04-24 16:39:00 +02:00 |
|
ines
|
71956c94db
|
Handle deprecated language-specific model downloading
|
2017-03-15 17:37:55 +01:00 |
|
ines
|
66c1f194f9
|
Use consistent unicode declarations
|
2017-03-12 13:07:28 +01:00 |
|
Roman Inflianskas
|
66e1109b53
|
Add support for Universal Dependencies v2.0
|
2017-03-03 13:17:34 +01:00 |
|
Ines Montani
|
0dec90e9f7
|
Use global abbreviation data languages and remove duplicates
|
2017-01-08 20:36:00 +01:00 |
|
Ines Montani
|
702d1eed93
|
Update tokenizer exceptions for German
|
2016-12-21 18:06:27 +01:00 |
|
Ines Montani
|
2b2ea8ca11
|
Reorganise language data
|
2016-12-18 16:54:19 +01:00 |
|
Ines Montani
|
32b36c3882
|
Break language data components into their own files
|
2016-12-18 15:40:22 +01:00 |
|
Ines Montani
|
0fc4e45cb3
|
Fix tag map for German
|
2016-12-18 13:30:03 +01:00 |
|
Ines Montani
|
69baf1c9a8
|
Fix tag map
|
2016-12-17 22:44:22 +01:00 |
|
Ines Montani
|
fc4ad17136
|
Fix typo
|
2016-12-17 14:00:47 +01:00 |
|
Ines Montani
|
e0a7b5c612
|
Fix formatting
|
2016-12-17 12:33:09 +01:00 |
|
Ines Montani
|
08162dce67
|
Move shared functions and constants to global language data
|
2016-12-17 12:32:48 +01:00 |
|
Ines Montani
|
6a60a61086
|
Move update_exc to global language data utils
|
2016-12-17 12:29:02 +01:00 |
|
Ines Montani
|
487ce1e20a
|
Add encoding declaration
|
2016-12-17 12:25:44 +01:00 |
|
Ines Montani
|
0a6d529104
|
Remove unused data
|
2016-12-08 20:36:56 +01:00 |
|
Ines Montani
|
0c39654786
|
Remove unused import
|
2016-12-08 19:46:53 +01:00 |
|
Ines Montani
|
e47ee94761
|
Split punctuation into its own file
|
2016-12-08 19:46:43 +01:00 |
|
Ines Montani
|
70b51ed7c8
|
Remove time from German language data
|
2016-12-08 19:45:50 +01:00 |
|
Ines Montani
|
311b30ab35
|
Reorganize exceptions for English and German
|
2016-12-08 13:58:32 +01:00 |
|
Ines Montani
|
1256232fad
|
Fix formatting
|
2016-12-08 13:56:40 +01:00 |
|
Ines Montani
|
0176b99004
|
Fix formatting
|
2016-12-08 12:48:02 +01:00 |
|
Ines Montani
|
bfaa42636c
|
Update language data for German
|
2016-12-08 12:01:09 +01:00 |
|
Ines Montani
|
e0712d1b32
|
Reformat language data
|
2016-12-07 20:33:28 +01:00 |
|
Mark Amery
|
1988fce389
|
Merge remote-tracking branch 'origin/master' into specify-data-path
|
2016-11-20 16:07:14 +00:00 |
|
Mark Amery
|
3871007c72
|
Let --data-path be specified when running download.py scripts
Resolves https://github.com/explosion/spaCy/issues/637
|
2016-11-20 15:48:04 +00:00 |
|
Ines Montani
|
3082e49326
|
Update and reformat German stopwords
|
2016-11-20 16:45:26 +01:00 |
|
Sourav Singh
|
6745eac309
|
Update language_data.py
|
2016-11-20 19:52:02 +05:30 |
|
Sourav Singh
|
4d9aae7d6a
|
Add German Stopwords
|
2016-11-19 22:47:53 +05:30 |
|
Matthew Honnibal
|
8c8f5c62c6
|
Add LANG attribute to English and German
|
2016-10-18 18:52:48 +02:00 |
|
Matthew Honnibal
|
e56653f848
|
Add language data for German
|
2016-09-25 15:44:45 +02:00 |
|
Matthew Honnibal
|
7db956133e
|
Move tokenizer data for German into spacy.de.language_data
|
2016-09-25 15:37:33 +02:00 |
|
Matthew Honnibal
|
95aaea0d3f
|
Refactor so that the tokenizer data is read from Python data, rather than from disk
|
2016-09-25 14:49:53 +02:00 |
|
Matthew Honnibal
|
fd65cf6cbb
|
Finish refactoring data loading
|
2016-09-24 20:26:17 +02:00 |
|
Wolfgang Seeker
|
92bfbebeec
|
remove unnecessary imports
|
2016-05-02 17:33:22 +02:00 |
|
Wolfgang Seeker
|
857454ffa0
|
fix indentation -.-
|
2016-05-02 17:10:41 +02:00 |
|
Wolfgang Seeker
|
dae6bc05eb
|
define German dummy lemmatizer until morphology is done
|
2016-05-02 16:04:53 +02:00 |
|
Henning Peters
|
a7d7ea3afa
|
first idea for supporting multiple langs in download script
|
2016-03-24 11:19:43 +01:00 |
|
Wolfgang Seeker
|
690c5acabf
|
adjust train.py to train both english and german models
|
2016-03-03 15:21:00 +01:00 |
|
Henning Peters
|
9027cef3bc
|
access model via sputnik
|
2015-12-07 06:01:28 +01:00 |
|
Matthew Honnibal
|
528e26a506
|
* Add rule to ensure ordinals are preserved as single tokens
|
2015-09-22 12:26:05 +10:00 |
|
Matthew Honnibal
|
dbb48ce49e
|
* Delete extra wordnets
|
2015-09-13 10:31:37 +10:00 |
|
Matthew Honnibal
|
2154a54f6b
|
* Add spacy.de
|
2015-09-06 21:56:47 +02:00 |
|