Commit Graph

26 Commits

Author SHA1 Message Date
Ines Montani
ea20b72c08 💫 Make like_num work for prefixed numbers (#2808)
* Only split + prefix if not numbers

* Make like_num work for prefixed numbers

* Add test for like_num
2018-10-01 10:49:14 +02:00
ines
f2e3e039b7 Update French stop words (resolves #2540) 2018-07-24 23:41:51 +02:00
mauryaland
5368ba028a Update stop_words.py for French language (#2310)
* Add contraction forms of some common stopwords

All the stopwords added contain the apostrophe" ' "or " ’ ".

* Adds contributor agreement mauryaland

* Update mauryaland.md
2018-05-09 12:04:38 +02:00
Kit
ed0db95183
Find lowercased forms of ordinal words, where possible 2018-01-08 03:28:50 +01:00
Kit
9bc524982e
Find lowercased forms of numeric words 2018-01-08 03:25:08 +01:00
ines
acb9bdb852 Fix PRON_LEMMA imports 2017-11-06 17:41:53 +01:00
ines
9d13288f73 Fix French tag map 2017-11-05 17:47:59 +01:00
ines
54579805c5 Fix French tag map 2017-11-05 17:44:05 +01:00
ines
3cef901834 Add tag map for French and Italian 2017-11-04 23:32:51 +01:00
ines
819e30a26e Tidy up tokenizer exceptions 2017-11-01 23:02:45 +01:00
ines
7e424a1804 Don't copy exception dicts if not necessary and tidy up 2017-10-31 21:05:29 +01:00
ines
8ce6f96180 Don't make copies of language data components 2017-10-11 15:34:55 +02:00
ines
417d45f5d0 Add lemmatizer data as variable on language data
Don't create lookup lemmatizer within Language class and just pass in
the data so it can be set on Token creation
2017-10-11 02:24:58 +02:00
ines
0c2343d73a Tidy up language data 2017-10-11 02:22:49 +02:00
ines
bb5c631402 Implement like_num getter for French (via #1161) 2017-09-26 16:47:45 +02:00
ines
1fe5e1a4d1 Add language example sentences (see #1107)
da, de, en, es, fr, he, it, nb, pl, pt, sv
2017-08-19 12:22:29 +02:00
Matthew Honnibal
91e52543ef Merge pull request #1118 from Gregory-Howard/patch-2
Update _tokenizer_exceptions_list (adding cities)
2017-06-20 11:16:07 +02:00
Tpt
7745b3ae04 Adds noun chunks to French syntax iterators 2017-06-12 15:29:58 +02:00
Grégory Howard
cd974b32b7 Update _tokenizer_exceptions_list (adding cities) 2017-06-09 17:58:18 +02:00
ines
4c643d74c5 Add norm exceptions to other Language classes 2017-06-03 22:29:21 +02:00
ines
924e8506de Move Defaults subclass to module scope (necessary for pickling) 2017-05-20 19:02:27 +02:00
ines
e895d1afd7 Reorganise French punctuation rules 2017-05-09 00:00:54 +02:00
ines
a91278cb32 Rename _URL_PATTERN to URL_PATTERN 2017-05-09 00:00:00 +02:00
ines
73b577cb01 Fix relative imports 2017-05-08 22:29:04 +02:00
ines
ae99990f63 Fix formatting 2017-05-08 22:23:48 +02:00
ines
f46ffe3e89 Move language data to /lang module 2017-05-08 20:00:40 +02:00