Jim O'Regan
c1e4e0f3bf
just now discovered that you can do multiwords
2017-06-26 22:19:39 +01:00
Jim O'Regan
5e5f94c1c0
fix dup
2017-06-26 21:57:00 +01:00
Jim O'Regan
a8dff9133e
add POS
2017-06-26 21:53:41 +01:00
Jim O'Regan
e9213f54de
missed one
2017-06-26 21:29:21 +01:00
Jim O'Regan
1eb7cc3017
attempt a port from #1147
2017-06-26 21:24:55 +01:00
Matthew Honnibal
8815507f8e
Move SpanishDefaults out of Language class, for pickle
2017-05-18 04:28:51 -05:00
ines
1a05078c79
Add language-specific syntax iterators to en and de
2017-05-17 12:04:03 +02:00
Matthew Honnibal
4b9d69f428
Merge branch 'v2' into develop
* Move v2 parser into nn_parser.pyx
* New TokenVectorEncoder class in pipeline.pyx
* New spacy/_ml.py module
Currently the two parsers live side-by-side, until we figure out how to
organize them.
2017-05-14 01:10:23 +02:00
ines
a4a37a783e
Remove import from non-existing module
2017-05-13 16:00:09 +02:00
ines
c13b3fa052
Add LEX_ATTRS
2017-05-12 15:37:45 +02:00
ines
bca2ea9c72
Update Portuguese lexical attributes
2017-05-12 15:37:39 +02:00
ines
2f870123bf
Fix formatting
2017-05-12 15:37:20 +02:00
ines
ca65993d59
Add basic Polish Language class
2017-05-12 09:25:37 +02:00
ines
48177c4f92
Add missing tokenizer exceptions
2017-05-12 09:25:24 +02:00
ines
bb8be3d194
Add Danish language data
2017-05-10 21:15:12 +02:00
ines
a0b00624bb
Make sure like_email returns bool
2017-05-09 11:37:29 +02:00
ines
ea60932e1b
Fix formatting
2017-05-09 11:08:14 +02:00
ines
02d0ac5cab
Remove redundant function and fix formatting
2017-05-09 11:06:04 +02:00
ines
b5ca50607e
Reorganise entity rules
2017-05-09 01:37:10 +02:00
ines
12c3d5fbba
Fix formatting
2017-05-09 01:15:28 +02:00
ines
2829a024ef
Re-add basic like_num check to global lex_attrs
2017-05-09 01:15:23 +02:00
ines
88adeee548
Add English lex_attrs overrides
2017-05-09 01:09:52 +02:00
ines
8f3fbbb147
Fix typos
2017-05-09 01:09:37 +02:00
ines
2216e5f326
Reorganise lex_attrs and add dict
2017-05-09 00:57:54 +02:00
ines
e666f14d20
Add global lex_attrs
2017-05-09 00:41:53 +02:00
ines
41972c43fe
Use consistent regex imports
2017-05-09 00:34:31 +02:00
ines
9f0fd5963f
Reorganise Hungarian punctuation rules
2017-05-09 00:01:59 +02:00
ines
fc0d793360
Reorganise Bengali punctuation rules
2017-05-09 00:01:52 +02:00
ines
e895d1afd7
Reorganise French punctuation rules
2017-05-09 00:00:54 +02:00
ines
014bda0ae3
Reorganise global punctuation rules
2017-05-09 00:00:46 +02:00
ines
a91278cb32
Rename _URL_PATTERN to URL_PATTERN
2017-05-09 00:00:00 +02:00
ines
604f299cf6
Add char classes to global language data
2017-05-08 23:59:33 +02:00
ines
f6f5d78cb9
Fix formatting
2017-05-08 23:59:17 +02:00
ines
3c0f85de8e
Remove imports in /lang/__init__.py
2017-05-08 23:58:07 +02:00
ines
614aa09582
Tidy up Bengali tokenizer exceptions
2017-05-08 22:29:49 +02:00
ines
73b577cb01
Fix relative imports
2017-05-08 22:29:04 +02:00
ines
ae99990f63
Fix formatting
2017-05-08 22:23:48 +02:00
ines
f46ffe3e89
Move language data to /lang module
2017-05-08 20:00:40 +02:00