mirror of https://github.com/explosion/spaCy.git, synced 2024-11-14 13:47:13 +03:00
Add support for Greek language (#2535)
* Add contributor agreement
* Support for Greek language
* Fix missing el_tokenizer
parent 71bfc92913
commit 6042723535
106
.github/contributors/Eleni170.md
vendored
Normal file
@@ -0,0 +1,106 @@
# spaCy contributor agreement

This spaCy Contributor Agreement (**"SCA"**) is based on the
[Oracle Contributor Agreement](http://www.oracle.com/technetwork/oca-405177.pdf).
The SCA applies to any contribution that you make to any product or project
managed by us (the **"project"**), and sets out the intellectual property rights
you grant to us in the contributed materials. The term **"us"** shall mean
[ExplosionAI UG (haftungsbeschränkt)](https://explosion.ai/legal). The term
**"you"** shall mean the person or entity identified below.

If you agree to be bound by these terms, fill in the information requested
below and include the filled-in version with your first pull request, under the
folder [`.github/contributors/`](/.github/contributors/). The name of the file
should be your GitHub username, with the extension `.md`. For example, the user
example_user would create the file `.github/contributors/example_user.md`.

Read this agreement carefully before signing. These terms and conditions
constitute a binding legal agreement.

## Contributor Agreement

1. The term "contribution" or "contributed materials" means any source code,
object code, patch, tool, sample, graphic, specification, manual,
documentation, or any other material posted or submitted by you to the project.

2. With respect to any worldwide copyrights, or copyright applications and
registrations, in your contribution:

    * you hereby assign to us joint ownership, and to the extent that such
    assignment is or becomes invalid, ineffective or unenforceable, you hereby
    grant to us a perpetual, irrevocable, non-exclusive, worldwide, no-charge,
    royalty-free, unrestricted license to exercise all rights under those
    copyrights. This includes, at our option, the right to sublicense these same
    rights to third parties through multiple levels of sublicensees or other
    licensing arrangements;

    * you agree that each of us can do all things in relation to your
    contribution as if each of us were the sole owners, and if one of us makes
    a derivative work of your contribution, the one who makes the derivative
    work (or has it made) will be the sole owner of that derivative work;

    * you agree that you will not assert any moral rights in your contribution
    against us, our licensees or transferees;

    * you agree that we may register a copyright in your contribution and
    exercise all ownership rights associated with it; and

    * you agree that neither of us has any duty to consult with, obtain the
    consent of, pay or render an accounting to the other for any use or
    distribution of your contribution.

3. With respect to any patents you own, or that you can license without payment
to any third party, you hereby grant to us a perpetual, irrevocable,
non-exclusive, worldwide, no-charge, royalty-free license to:

    * make, have made, use, sell, offer to sell, import, and otherwise transfer
    your contribution in whole or in part, alone or in combination with or
    included in any product, work or materials arising out of the project to
    which your contribution was submitted, and

    * at our option, to sublicense these same rights to third parties through
    multiple levels of sublicensees or other licensing arrangements.

4. Except as set out above, you keep all right, title, and interest in your
contribution. The rights that you grant to us under these terms are effective
on the date you first submitted a contribution to us, even if your submission
took place before the date you sign these terms.

5. You covenant, represent, warrant and agree that:

    * Each contribution that you submit is and shall be an original work of
    authorship and you can legally grant the rights set out in this SCA;

    * to the best of your knowledge, each contribution will not violate any
    third party's copyrights, trademarks, patents, or other intellectual
    property rights; and

    * each contribution shall be in compliance with U.S. export control laws and
    other applicable export and import laws. You agree to notify us if you
    become aware of any circumstance which would make any of the foregoing
    representations inaccurate in any respect. We may publicly disclose your
    participation in the project, including the fact that you have signed the SCA.

6. This SCA is governed by the laws of the State of California and applicable
U.S. Federal law. Any choice of law rules will not apply.

7. Please place an “x” on one of the applicable statements below. Please do NOT
mark both statements:

    * [ ] I am signing on behalf of myself as an individual and no other person
    or entity, including my employer, has or will have rights with respect to my
    contributions.

    * [x] I am signing on behalf of my employer or a legal entity and I have the
    actual authority to contractually bind that entity.

## Contributor Details

| Field                          | Entry                |
|------------------------------- | -------------------- |
| Name                           | Eleni Partalidou     |
| Company name (if applicable)   | DataScouting         |
| Title or role (if applicable)  | Software Engineer    |
| Date                           | 06.07.2018           |
| GitHub username                | Eleni170             |
| Website (optional)             |                      |
40
spacy/lang/el/__init__.py
Normal file
@@ -0,0 +1,40 @@
# -*- coding: utf-8 -*-

from __future__ import unicode_literals

from .tokenizer_exceptions import TOKENIZER_EXCEPTIONS
from .tag_map import TAG_MAP
from .stop_words import STOP_WORDS
from .lex_attrs import LEX_ATTRS
from .lemmatizer import LOOKUP
from .punctuation import TOKENIZER_PREFIXES, TOKENIZER_SUFFIXES, TOKENIZER_INFIXES
from ..tokenizer_exceptions import BASE_EXCEPTIONS
from ..norm_exceptions import BASE_NORMS
from ...language import Language
from ...attrs import LANG, NORM
from ...util import update_exc, add_lookups


class GreekDefaults(Language.Defaults):
    lex_attr_getters = dict(Language.Defaults.lex_attr_getters)
    lex_attr_getters.update(LEX_ATTRS)
    lex_attr_getters[LANG] = lambda text: 'el'  # ISO code
    lex_attr_getters[NORM] = add_lookups(Language.Defaults.lex_attr_getters[NORM], BASE_NORMS)
    tokenizer_exceptions = update_exc(BASE_EXCEPTIONS, TOKENIZER_EXCEPTIONS)
    stop_words = STOP_WORDS
    lemma_lookup = LOOKUP
    tag_map = TAG_MAP
    prefixes = TOKENIZER_PREFIXES
    suffixes = TOKENIZER_SUFFIXES
    infixes = TOKENIZER_INFIXES


class Greek(Language):

    lang = 'el'  # ISO code
    Defaults = GreekDefaults  # set Defaults to custom language defaults


# set default export – this allows the language class to be lazy-loaded
__all__ = ['Greek']
19
spacy/lang/el/examples.py
Normal file
@@ -0,0 +1,19 @@
# -*- coding: utf-8 -*-

from __future__ import unicode_literals

"""
Example sentences to test spaCy and its language models.
>>> from spacy.lang.el.examples import sentences
>>> docs = nlp.pipe(sentences)
"""

sentences = [
    "Η άνιση κατανομή του πλούτου και του εισοδήματος, η οποία έχει λάβει τρομερές διαστάσεις, δεν δείχνει τάσεις βελτίωσης.",
    "Ο στόχος της σύντομης αυτής έκθεσης είναι να συνοψίσει τα κυριότερα συμπεράσματα των επισκοπήσεων κάθε μιας χώρας.",
    "Μέχρι αργά χθες το βράδυ ο πλοιοκτήτης παρέμενε έξω από το γραφείο του γενικού γραμματέα του υπουργείου, ενώ είχε μόνον τηλεφωνική επικοινωνία με τον υπουργό.",
    "Σύμφωνα με καλά ενημερωμένη πηγή, από την επεξεργασία του προέκυψε ότι οι δράστες της επίθεσης ήταν δύο, καθώς και ότι προσέγγισαν και αποχώρησαν από το σημείο με μοτοσικλέτα.",
    "Η υποδομή καταλυμάτων στην Ελλάδα είναι πλήρης και ανανεώνεται συνεχώς.",
    "Το επείγον ταχυδρομείο (ήτοι το παραδοτέο εντός 48 ωρών το πολύ) μπορεί να μεταφέρεται αεροπορικώς μόνον εφόσον εφαρμόζονται οι κανόνες ασφαλείας.",
    "Στις ορεινές περιοχές του νησιού οι χιονοπτώσεις και οι παγετοί είναι περιορισμένοι ενώ στις παραθαλάσσιες περιοχές σημειώνονται σπανίως."
]
100057
spacy/lang/el/lemmatizer.py
Normal file
File diff suppressed because it is too large
38
spacy/lang/el/lex_attrs.py
Normal file
@@ -0,0 +1,38 @@
# -*- coding: utf-8 -*-

from __future__ import unicode_literals

from ...attrs import LIKE_NUM

_num_words = ['μηδέν', 'ένας', 'δυο', 'δυό', 'τρεις', 'τέσσερις', 'πέντε', 'έξι', 'εφτά', 'επτά', 'οκτώ', 'οχτώ',
              'εννιά', 'εννέα', 'δέκα', 'έντεκα', 'ένδεκα', 'δώδεκα', 'δεκατρείς', 'δεκατέσσερις', 'δεκαπέντε',
              'δεκαέξι', 'δεκαεπτά', 'δεκαοχτώ', 'δεκαεννέα', 'δεκαεννεα', 'είκοσι', 'τριάντα', 'σαράντα', 'πενήντα',
              'εξήντα', 'εβδομήντα', 'ογδόντα', 'ενενήντα', 'εκατό', 'διακόσιοι', 'διακόσοι', 'τριακόσιοι', 'τριακόσοι',
              'τετρακόσιοι', 'τετρακόσοι', 'πεντακόσιοι', 'πεντακόσοι', 'εξακόσιοι', 'εξακόσοι', 'εφτακόσιοι',
              'εφτακόσοι', 'επτακόσιοι', 'επτακόσοι', 'οχτακόσιοι', 'οχτακόσοι', 'οκτακόσιοι', 'οκτακόσοι',
              'εννιακόσιοι', 'χίλιοι', 'χιλιάδα', 'εκατομμύριο', 'δισεκατομμύριο', 'τρισεκατομμύριο', 'τετράκις',
              'πεντάκις', 'εξάκις', 'επτάκις', 'οκτάκις', 'εννεάκις']


def like_num(text):
    text = text.replace(',', '').replace('.', '')
    if text.isdigit():
        return True
    if text.count('/') == 1:
        num, denom = text.split('/')
        if num.isdigit() and denom.isdigit():
            return True
    if text.count('^') == 1:
        num, denom = text.split('^')
        if num.isdigit() and denom.isdigit():
            return True
    if text.lower() in _num_words or text.lower().split(' ')[0] in _num_words:
        return True
    if text in _num_words:
        return True
    return False


LEX_ATTRS = {
    LIKE_NUM: like_num
}
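As a rough, standalone illustration of the checks `like_num` performs, the sketch below reimplements a simplified version against a small excerpt of `_num_words`. The names `NUM_WORDS_EXCERPT` and `like_num_sketch` are hypothetical and exist only for this example, and the sketch leaves out the first-word lookup that the original does with `split(' ')[0]`.

```python
# -*- coding: utf-8 -*-
# Standalone sketch; NUM_WORDS_EXCERPT is a hypothetical excerpt of _num_words.
NUM_WORDS_EXCERPT = {'μηδέν', 'ένας', 'δυο', 'τρεις', 'δέκα', 'εκατό', 'χίλιοι'}


def like_num_sketch(text):
    # Drop thousands and decimal separators before testing for digits.
    text = text.replace(',', '').replace('.', '')
    if text.isdigit():
        return True
    # Accept simple fractions (3/4) and powers (2^10) with digit-only parts.
    for sep in ('/', '^'):
        if text.count(sep) == 1:
            left, right = text.split(sep)
            if left.isdigit() and right.isdigit():
                return True
    # Case-insensitive lookup of spelled-out numbers.
    return text.lower() in NUM_WORDS_EXCERPT


examples = ['1.000', '3/4', '2^10', 'Δέκα', 'σπίτι']
results = {word: like_num_sketch(word) for word in examples}
print(results)
```

In spaCy itself the function is not called directly like this; it is registered under `LIKE_NUM` in `LEX_ATTRS` and evaluated per token.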
66
spacy/lang/el/punctuation.py
Normal file
@@ -0,0 +1,66 @@
# -*- coding: utf-8 -*-

from __future__ import unicode_literals

from ..char_classes import LIST_PUNCT, LIST_ELLIPSES, LIST_QUOTES, LIST_CURRENCY
from ..char_classes import LIST_ICONS, ALPHA_LOWER, ALPHA_UPPER, ALPHA, HYPHENS
from ..char_classes import QUOTES, CURRENCY

_units = ('km km² km³ m m² m³ dm dm² dm³ cm cm² cm³ mm mm² mm³ ha µm nm yd in ft '
          'kg g mg µg t lb oz m/s km/h kmh mph hPa Pa mbar mb MB kb KB gb GB tb '
          'TB T G M K км км² км³ м м² м³ дм дм² дм³ см см² см³ мм мм² мм³ нм '
          'кг г мг м/с км/ч кПа Па мбар Кб КБ кб Мб МБ мб Гб ГБ гб Тб ТБ тб')
merge_chars = lambda char: char.strip().replace(' ', '|')
UNITS = merge_chars(_units)

_prefixes = (['\'\'', '§', '%', '=', r'\+[0-9]+%',  # +90%
              r'\'([0-9]){2}([\-]\'([0-9]){2})*',  # '12-'13
              r'\-([0-9]){1,9}\.([0-9]){1,9}',  # -12.13
              r'\'([Α-Ωα-ωίϊΐόάέύϋΰήώ]+)\'',  # 'αβγ'
              r'([Α-Ωα-ωίϊΐόάέύϋΰήώ]){1,3}\'',  # αβγ'
              r'http://www.[A-Za-z]+\-[A-Za-z]+(\.[A-Za-z]+)+(\/[A-Za-z]+)*(\.[A-Za-z]+)*',
              r'[ΈΆΊΑ-Ωα-ωίϊΐόάέύϋΰήώ]+\*',  # όνομα*
              r'\$([0-9])+([\,\.]([0-9])+){0,1}',
              ] + LIST_PUNCT + LIST_ELLIPSES + LIST_QUOTES +
             LIST_CURRENCY + LIST_ICONS)

_suffixes = (LIST_PUNCT + LIST_ELLIPSES + LIST_QUOTES + LIST_ICONS +
             [r'(?<=[0-9])\+',  # 12+
              r'([0-9])+\'',  # 12'
              r'([A-Za-z])?\'',  # a'
              r'^([0-9]){1,2}\.',  # 12.
              r' ([0-9]){1,2}\.',  # 12.
              r'([0-9]){1}\) ',  # 1)
              r'^([0-9]){1}\)$',  # 1)
              r'(?<=°[FfCcKk])\.',
              r'([0-9])+\&',  # 12&
              r'(?<=[0-9])(?:{})'.format(CURRENCY),
              r'(?<=[0-9])(?:{})'.format(UNITS),
              r'(?<=[0-9{}{}(?:{})])\.'.format(ALPHA_LOWER, r'²\-\)\]\+', QUOTES),
              r'(?<=[{a}][{a}])\.'.format(a=ALPHA_UPPER),
              r'(?<=[Α-Ωα-ωίϊΐόάέύϋΰήώ])\-',  # όνομα-
              r'(?<=[Α-Ωα-ωίϊΐόάέύϋΰήώ])\.',
              r'^[Α-Ω]{1}\.',
              r'\ [Α-Ω]{1}\.',
              r'[ΈΆΊΑΌ-Ωα-ωίϊΐόάέύϋΰήώ]+([\-]([ΈΆΊΑΌ-Ωα-ωίϊΐόάέύϋΰήώ]+))+',  # πρώτος-δεύτερος , πρώτος-δεύτερος-τρίτος
              r'([0-9]+)mg',  # 13mg
              r'([0-9]+)\.([0-9]+)m'  # 1.2m
              ])

_infixes = (LIST_ELLIPSES + LIST_ICONS +
            [r'(?<=[0-9])[+\/\-\*^](?=[0-9])',  # 1/2 , 1-2 , 1*2
             r'([a-zA-Z]+)\/([a-zA-Z]+)\/([a-zA-Z]+)',  # name1/name2/name3
             r'([0-9])+(\.([0-9]+))*([\-]([0-9])+)+',  # 10-9 , 10.9-6 , 10.9.9-6
             r'([0-9])+[,]([0-9])+[\-]([0-9])+[,]([0-9])+',  # 10,11-12,13
             r'([0-9])+[ης]+([\-]([0-9])+)+',  # 1ης-2
             r'([0-9]){1,4}[\/]([0-9]){1,2}([\/]([0-9]){0,4}){0,1}',  # 15/2 , 15/2/17 , 2017/2/15
             r'[A-Za-z]+\@[A-Za-z]+(\-[A-Za-z]+)*\.[A-Za-z]+',  # abc@cde-fgh.a
             r'([a-zA-Z]+)(\-([a-zA-Z]+))+',  # abc-abc
             r'(?<=[{}])\.(?=[{}])'.format(ALPHA_LOWER, ALPHA_UPPER),
             r'(?<=[{a}]),(?=[{a}])'.format(a=ALPHA),
             r'(?<=[{a}])[?";:=,.]*(?:{h})(?=[{a}])'.format(a=ALPHA, h=HYPHENS),
             r'(?<=[{a}"])[:<>=/](?=[{a}])'.format(a=ALPHA)])

TOKENIZER_PREFIXES = _prefixes
TOKENIZER_SUFFIXES = _suffixes
TOKENIZER_INFIXES = _infixes
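To make the `UNITS` suffix rule more concrete, the sketch below shows how `merge_chars` turns a space-separated unit string into a regex alternation and how the look-behind `(?<=[0-9])` restricts a unit to positions right after a digit. It uses only the standard `re` module rather than spaCy's tokenizer. The names `units_excerpt` and `split_unit` are hypothetical, and the trailing `$` is an assumption added here to mimic matching at the end of a token.

```python
import re

# Hypothetical excerpt of the _units string from punctuation.py.
units_excerpt = 'km kg mg m/s'


def merge_chars(chars):
    # 'km kg' -> 'km|kg', ready to drop into a regex alternation.
    return chars.strip().replace(' ', '|')


UNITS = merge_chars(units_excerpt)
# Same shape as the suffix rule in the diff: a unit only counts as a suffix
# when the character right before it is a digit. The '$' is this sketch's
# own addition so the match is pinned to the end of the token.
unit_suffix = re.compile(r'(?<=[0-9])(?:{})$'.format(UNITS))


def split_unit(token):
    # Return (number, unit) if the token ends in a known unit, else (token, None).
    match = unit_suffix.search(token)
    if match is None:
        return token, None
    return token[:match.start()], match.group()


print(UNITS)
print(split_unit('13mg'))
print(split_unit('5km'))
print(split_unit('km'))
```

A bare `km` is left alone because no digit precedes it, which is the point of the look-behind: unit strings are only split off when they are attached to a number.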
98
spacy/lang/el/stop_words.py
Normal file
@@ -0,0 +1,98 @@
# -*- coding: utf-8 -*-

from __future__ import unicode_literals

# Stop words

# Link to greek stop words: https://www.translatum.gr/forum/index.php?topic=3550.0


STOP_WORDS = set("""
αδιάκοπα αι ακόμα ακόμη ακριβώς αλήθεια αληθινά αλλά αλλαχού άλλες άλλη άλλην
άλλης αλλιώς αλλιώτικα άλλο άλλοι αλλοιώς αλλοιώτικα άλλον άλλος άλλοτε αλλού
άλλους άλλων άμα άμεσα αμέσως αν ανά ανάμεσα αναμεταξύ άνευ αντί αντίπερα αντίς
άνω ανωτέρω άξαφνα απ απέναντι από απόψε άρα άραγε αργά αργότερο αριστερά αρκετά
αρχικά ας αύριο αυτά αυτές αυτή αυτήν αυτής αυτό αυτοί αυτόν αυτός αυτού αυτούς
αυτών αφότου αφού

βέβαια βεβαιότατα

γι για γρήγορα γύρω

δα δε δείνα δεν δεξιά δήθεν δηλαδή δι δια διαρκώς δικά δικό δικοί δικός δικού
δικούς διόλου δίπλα δίχως

εάν εαυτό εαυτόν εαυτού εαυτούς εαυτών έγκαιρα εγκαίρως εγώ εδώ ειδεμή είθε είμαι
είμαστε είναι εις είσαι είσαστε είστε είτε είχα είχαμε είχαν είχατε είχε είχες έκαστα
έκαστες έκαστη έκαστην έκαστης έκαστο έκαστοι έκαστον έκαστος εκάστου εκάστους εκάστων
εκεί εκείνα εκείνες εκείνη εκείνην εκείνης εκείνο εκείνοι εκείνον εκείνος εκείνου
εκείνους εκείνων εκτός εμάς εμείς εμένα εμπρός εν ένα έναν ένας ενός εντελώς εντός
εντωμεταξύ ενώ εξ έξαφνα εξής εξίσου έξω επάνω επειδή έπειτα επί επίσης επομένως εσάς
εσείς εσένα έστω εσύ ετέρα ετέραι ετέρας έτερες έτερη έτερης έτερο έτεροι έτερον έτερος
ετέρου έτερους ετέρων ετούτα ετούτες ετούτη ετούτην ετούτης ετούτο ετούτοι ετούτον
ετούτος ετούτου ετούτους ετούτων έτσι εύγε ευθύς ευτυχώς εφεξής έχει έχεις έχετε
εχθές έχομε έχουμε έχουν εχτές έχω έως

η ήδη ήμασταν ήμαστε ήμουν ήσασταν ήσαστε ήσουν ήταν ήτανε ήτοι ήττον

θα

ι ίδια ίδιαν ιδίας ίδιες ίδιο ίδιοι ίδιον ίδιος ιδίου ίδιους ίδιων ιδίως ιι ιιι
ίσαμε ίσια ίσως

κάθε καθεμία καθεμίας καθένα καθένας καθενός καθετί καθόλου καθώς και κακά κακώς καλά
καλώς καμία καμίαν καμίας κάμποσα κάμποσες κάμποση κάμποσην κάμποσης κάμποσο κάμποσοι
κάμποσον κάμποσος κάμποσου κάμποσους κάμποσων κανείς κάνεν κανένα κανέναν κανένας
κανενός κάποια κάποιαν κάποιας κάποιες κάποιο κάποιοι κάποιον κάποιος κάποιου κάποιους
κάποιων κάποτε κάπου κάπως κατ κατά κάτι κατιτί κατόπιν κάτω κιόλας κλπ κοντά κτλ κυρίως

λιγάκι λίγο λιγότερο λόγω λοιπά λοιπόν

μα μαζί μακάρι μακρυά μάλιστα μάλλον μας με μεθαύριο μείον μέλει μέλλεται μεμιάς μεν
μερικά μερικές μερικοί μερικούς μερικών μέσα μετ μετά μεταξύ μέχρι μη μήδε μην μήπως
μήτε μια μιαν μιας μόλις μολονότι μονάχα μόνες μόνη μόνην μόνης μόνο μόνοι μονομιάς
μόνος μόνου μόνους μόνων μου μπορεί μπορούν μπράβο μπρος

να ναι νωρίς

ξανά ξαφνικά

ο οι όλα όλες όλη όλην όλης όλο ολόγυρα όλοι όλον ολονέν όλος ολότελα όλου όλους όλων
όλως ολωσδιόλου όμως όποια οποιαδήποτε οποίαν οποιανδήποτε οποίας οποιασδήποτε οποιδήποτε
όποιες οποιεσδήποτε όποιο οποιοδήποτε όποιοι όποιον οποιονδήποτε όποιος οποιοσδήποτε
οποίου οποιουδήποτε οποίους οποιουσδήποτε οποίων οποιωνδήποτε όποτε οποτεδήποτε όπου
οπουδήποτε όπως ορισμένα ορισμένες ορισμένων ορισμένως όσα οσαδήποτε όσες οσεσδήποτε
όση οσηδήποτε όσην οσηνδήποτε όσης οσησδήποτε όσο οσοδήποτε όσοι οσοιδήποτε όσον οσονδήποτε
όσος οσοσδήποτε όσου οσουδήποτε όσους οσουσδήποτε όσων οσωνδήποτε όταν ότι οτιδήποτε
ότου ου ουδέ ούτε όχι

πάλι πάντοτε παντού πάντως πάρα πέρα πέρι περίπου περισσότερο πέρσι πέρυσι πια πιθανόν
πιο πίσω πλάι πλέον πλην ποιά ποιάν ποιάς ποιές ποιό ποιοί ποιόν ποιός ποιού ποιούς
ποιών πολύ πόσες πόση πόσην πόσης πόσοι πόσος πόσους πότε πού πούθε πουθενά πρέπει
πριν προ προκειμένου πρόκειται πρόπερσι προς προτού προχθές προχτές πρωτύτερα πώς

σαν σας σε σεις σήμερα σιγά σου στα στη στην στης στις στο στον στου στους στων συγχρόνως
συν συνάμα συνεπώς συνήθως συχνά συχνάς συχνές συχνή συχνήν συχνής συχνό συχνοί συχνόν
συχνός συχνού συχνούς συχνών συχνώς σχεδόν σωστά

τα τάδε ταύτα ταύτες ταύτη ταύτην ταύτης ταύτο ταύτον ταύτος ταύτου ταύτων τάχα τάχατε
τελικά τελικώς τες τέτοια τέτοιαν τέτοιας τέτοιες τέτοιο τέτοιοι τέτοιον τέτοιος τέτοιου
τέτοιους τέτοιων τη την της τι τίποτα τίποτε τις το τοι τον τοσ τόσα τόσες τόση τόσην
τόσης τόσο τόσοι τόσον τόσος τόσου τόσους τόσων τότε του τουλάχιστο τουλάχιστον τους τούτα
τούτες τούτη τούτην τούτης τούτο τούτοι τούτοις τούτον τούτος τούτου τούτους τούτων τυχόν
των τώρα

υπ υπέρ υπό υπόψη υπόψιν ύστερα

φέτος

χαμηλά χθες χτες χωρίς χωριστά

ψηλά

ω ωραία ως ωσάν ωσότου ώσπου ώστε ωστόσο ωχ
""".split())
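The `set("""...""".split())` idiom used for `STOP_WORDS` can be tried in isolation. The sketch below builds a small set the same way and filters a sentence with it. `STOP_WORDS_EXCERPT` and `remove_stop_words` are hypothetical names for this example, and the whitespace tokenization is a deliberate simplification of what spaCy's tokenizer does.

```python
# -*- coding: utf-8 -*-
# Hypothetical excerpt; every word here also appears in the STOP_WORDS block.
STOP_WORDS_EXCERPT = set("""
και να το η ο οι
δεν είναι από για
""".split())


def remove_stop_words(text):
    # Naive whitespace tokenization, only for illustration.
    return [tok for tok in text.split() if tok.lower() not in STOP_WORDS_EXCERPT]


sentence = 'Η υποδομή δεν είναι πλήρης'
kept = remove_stop_words(sentence)
print(kept)
```

Because `str.split()` with no arguments splits on any run of whitespace, the line breaks and blank lines that group the words alphabetically do not end up in the set.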
599
spacy/lang/el/tag_map.py
Normal file
@@ -0,0 +1,599 @@
# -*- coding: utf-8 -*-

from __future__ import unicode_literals
from ...symbols import POS, PUNCT, SYM, ADJ, CCONJ, SCONJ, NUM, DET, ADV, ADP, X, VERB
from ...symbols import NOUN, PROPN, PART, INTJ, SPACE, PRON

TAG_MAP = {
    "ABBR": {POS: NOUN, "Abbr":"Yes"},
    "AdXxBa": {POS: ADV, "Degree": ""},
    "AdXxCp": {POS: ADV, "Degree": "Cmp"},
    "AdXxSu": {POS: ADV, "Degree": "Sup"},
    "AjBaFePlAc": {POS: ADJ, "Degree": "", "Gender": "Fem", "Number": "Plur", "Case": "Acc"},
    "AjBaFePlDa": {POS: ADJ, "Degree": "", "Gender": "Fem", "Number": "Plur", "Case": "Dat"},
    "AjBaFePlGe": {POS: ADJ, "Degree": "", "Gender": "Fem", "Number": "Plur", "Case": "Gen"},
    "AjBaFePlNm": {POS: ADJ, "Degree": "", "Gender": "Fem", "Number": "Plur", "Case": "Nom"},
    "AjBaFePlVo": {POS: ADJ, "Degree": "", "Gender": "Fem", "Number": "Plur", "Case": "Voc"},
    "AjBaFeSgAc": {POS: ADJ, "Degree": "", "Gender": "Fem", "Number": "Sing", "Case": "Acc"},
    "AjBaFeSgDa": {POS: ADJ, "Degree": "", "Gender": "Fem", "Number": "Sing", "Case": "Dat"},
    "AjBaFeSgGe": {POS: ADJ, "Degree": "", "Gender": "Fem", "Number": "Sing", "Case": "Gen"},
    "AjBaFeSgNm": {POS: ADJ, "Degree": "", "Gender": "Fem", "Number": "Sing", "Case": "Nom"},
    "AjBaFeSgVo": {POS: ADJ, "Degree": "", "Gender": "Fem", "Number": "Sing", "Case": "Voc"},
    "AjBaMaPlAc": {POS: ADJ, "Degree": "", "Gender": "Masc", "Number": "Plur", "Case": "Acc"},
    "AjBaMaPlDa": {POS: ADJ, "Degree": "", "Gender": "Masc", "Number": "Plur", "Case": "Dat"},
    "AjBaMaPlGe": {POS: ADJ, "Degree": "", "Gender": "Masc", "Number": "Plur", "Case": "Gen"},
    "AjBaMaPlNm": {POS: ADJ, "Degree": "", "Gender": "Masc", "Number": "Plur", "Case": "Nom"},
    "AjBaMaPlVo": {POS: ADJ, "Degree": "", "Gender": "Masc", "Number": "Plur", "Case": "Voc"},
    "AjBaMaSgAc": {POS: ADJ, "Degree": "", "Gender": "Masc", "Number": "Sing", "Case": "Acc"},
    "AjBaMaSgDa": {POS: ADJ, "Degree": "", "Gender": "Masc", "Number": "Sing", "Case": "Dat"},
    "AjBaMaSgGe": {POS: ADJ, "Degree": "", "Gender": "Masc", "Number": "Sing", "Case": "Gen"},
    "AjBaMaSgNm": {POS: ADJ, "Degree": "", "Gender": "Masc", "Number": "Sing", "Case": "Nom"},
    "AjBaMaSgVo": {POS: ADJ, "Degree": "", "Gender": "Masc", "Number": "Sing", "Case": "Voc"},
    "AjBaNePlAc": {POS: ADJ, "Degree": "", "Gender": "Neut", "Number": "Plur", "Case": "Acc"},
    "AjBaNePlDa": {POS: ADJ, "Degree": "", "Gender": "Neut", "Number": "Plur", "Case": "Dat"},
    "AjBaNePlGe": {POS: ADJ, "Degree": "", "Gender": "Neut", "Number": "Plur", "Case": "Gen"},
    "AjBaNePlNm": {POS: ADJ, "Degree": "", "Gender": "Neut", "Number": "Plur", "Case": "Nom"},
    "AjBaNePlVo": {POS: ADJ, "Degree": "", "Gender": "Neut", "Number": "Plur", "Case": "Voc"},
    "AjBaNeSgAc": {POS: ADJ, "Degree": "", "Gender": "Neut", "Number": "Sing", "Case": "Acc"},
    "AjBaNeSgDa": {POS: ADJ, "Degree": "", "Gender": "Neut", "Number": "Sing", "Case": "Dat"},
    "AjBaNeSgGe": {POS: ADJ, "Degree": "", "Gender": "Neut", "Number": "Sing", "Case": "Gen"},
    "AjBaNeSgNm": {POS: ADJ, "Degree": "", "Gender": "Neut", "Number": "Sing", "Case": "Nom"},
    "AjBaNeSgVo": {POS: ADJ, "Degree": "", "Gender": "Neut", "Number": "Sing", "Case": "Voc"},
    "AjCpFePlAc": {POS: ADJ, "Degree": "Cmp", "Gender": "Fem", "Number": "Plur", "Case": "Acc"},
    "AjCpFePlDa": {POS: ADJ, "Degree": "Cmp", "Gender": "Fem", "Number": "Plur", "Case": "Dat"},
    "AjCpFePlGe": {POS: ADJ, "Degree": "Cmp", "Gender": "Fem", "Number": "Plur", "Case": "Gen"},
    "AjCpFePlNm": {POS: ADJ, "Degree": "Cmp", "Gender": "Fem", "Number": "Plur", "Case": "Nom"},
    "AjCpFePlVo": {POS: ADJ, "Degree": "Cmp", "Gender": "Fem", "Number": "Plur", "Case": "Voc"},
    "AjCpFeSgAc": {POS: ADJ, "Degree": "Cmp", "Gender": "Fem", "Number": "Sing", "Case": "Acc"},
    "AjCpFeSgDa": {POS: ADJ, "Degree": "Cmp", "Gender": "Fem", "Number": "Sing", "Case": "Dat"},
    "AjCpFeSgGe": {POS: ADJ, "Degree": "Cmp", "Gender": "Fem", "Number": "Sing", "Case": "Gen"},
    "AjCpFeSgNm": {POS: ADJ, "Degree": "Cmp", "Gender": "Fem", "Number": "Sing", "Case": "Nom"},
    "AjCpFeSgVo": {POS: ADJ, "Degree": "Cmp", "Gender": "Fem", "Number": "Sing", "Case": "Voc"},
    "AjCpMaPlAc": {POS: ADJ, "Degree": "Cmp", "Gender": "Masc", "Number": "Plur", "Case": "Acc"},
    "AjCpMaPlDa": {POS: ADJ, "Degree": "Cmp", "Gender": "Masc", "Number": "Plur", "Case": "Dat"},
    "AjCpMaPlGe": {POS: ADJ, "Degree": "Cmp", "Gender": "Masc", "Number": "Plur", "Case": "Gen"},
    "AjCpMaPlNm": {POS: ADJ, "Degree": "Cmp", "Gender": "Masc", "Number": "Plur", "Case": "Nom"},
    "AjCpMaPlVo": {POS: ADJ, "Degree": "Cmp", "Gender": "Masc", "Number": "Plur", "Case": "Voc"},
    "AjCpMaSgAc": {POS: ADJ, "Degree": "Cmp", "Gender": "Masc", "Number": "Sing", "Case": "Acc"},
    "AjCpMaSgDa": {POS: ADJ, "Degree": "Cmp", "Gender": "Masc", "Number": "Sing", "Case": "Dat"},
    "AjCpMaSgGe": {POS: ADJ, "Degree": "Cmp", "Gender": "Masc", "Number": "Sing", "Case": "Gen"},
    "AjCpMaSgNm": {POS: ADJ, "Degree": "Cmp", "Gender": "Masc", "Number": "Sing", "Case": "Nom"},
    "AjCpMaSgVo": {POS: ADJ, "Degree": "Cmp", "Gender": "Masc", "Number": "Sing", "Case": "Voc"},
    "AjCpNePlAc": {POS: ADJ, "Degree": "Cmp", "Gender": "Neut", "Number": "Plur", "Case": "Acc"},
    "AjCpNePlDa": {POS: ADJ, "Degree": "Cmp", "Gender": "Neut", "Number": "Plur", "Case": "Dat"},
    "AjCpNePlGe": {POS: ADJ, "Degree": "Cmp", "Gender": "Neut", "Number": "Plur", "Case": "Gen"},
    "AjCpNePlNm": {POS: ADJ, "Degree": "Cmp", "Gender": "Neut", "Number": "Plur", "Case": "Nom"},
    "AjCpNePlVo": {POS: ADJ, "Degree": "Cmp", "Gender": "Neut", "Number": "Plur", "Case": "Voc"},
    "AjCpNeSgAc": {POS: ADJ, "Degree": "Cmp", "Gender": "Neut", "Number": "Sing", "Case": "Acc"},
    "AjCpNeSgDa": {POS: ADJ, "Degree": "Cmp", "Gender": "Neut", "Number": "Sing", "Case": "Dat"},
    "AjCpNeSgGe": {POS: ADJ, "Degree": "Cmp", "Gender": "Neut", "Number": "Sing", "Case": "Gen"},
    "AjCpNeSgNm": {POS: ADJ, "Degree": "Cmp", "Gender": "Neut", "Number": "Sing", "Case": "Nom"},
    "AjCpNeSgVo": {POS: ADJ, "Degree": "Cmp", "Gender": "Neut", "Number": "Sing", "Case": "Voc"},
    "AjSuFePlAc": {POS: ADJ, "Degree": "Sup", "Gender": "Fem", "Number": "Plur", "Case": "Acc"},
    "AjSuFePlDa": {POS: ADJ, "Degree": "Sup", "Gender": "Fem", "Number": "Plur", "Case": "Dat"},
    "AjSuFePlGe": {POS: ADJ, "Degree": "Sup", "Gender": "Fem", "Number": "Plur", "Case": "Gen"},
    "AjSuFePlNm": {POS: ADJ, "Degree": "Sup", "Gender": "Fem", "Number": "Plur", "Case": "Nom"},
    "AjSuFePlVo": {POS: ADJ, "Degree": "Sup", "Gender": "Fem", "Number": "Plur", "Case": "Voc"},
    "AjSuFeSgAc": {POS: ADJ, "Degree": "Sup", "Gender": "Fem", "Number": "Sing", "Case": "Acc"},
    "AjSuFeSgDa": {POS: ADJ, "Degree": "Sup", "Gender": "Fem", "Number": "Sing", "Case": "Dat"},
    "AjSuFeSgGe": {POS: ADJ, "Degree": "Sup", "Gender": "Fem", "Number": "Sing", "Case": "Gen"},
    "AjSuFeSgNm": {POS: ADJ, "Degree": "Sup", "Gender": "Fem", "Number": "Sing", "Case": "Nom"},
    "AjSuFeSgVo": {POS: ADJ, "Degree": "Sup", "Gender": "Fem", "Number": "Sing", "Case": "Voc"},
    "AjSuMaPlAc": {POS: ADJ, "Degree": "Sup", "Gender": "Masc", "Number": "Plur", "Case": "Acc"},
    "AjSuMaPlDa": {POS: ADJ, "Degree": "Sup", "Gender": "Masc", "Number": "Plur", "Case": "Dat"},
    "AjSuMaPlGe": {POS: ADJ, "Degree": "Sup", "Gender": "Masc", "Number": "Plur", "Case": "Gen"},
    "AjSuMaPlNm": {POS: ADJ, "Degree": "Sup", "Gender": "Masc", "Number": "Plur", "Case": "Nom"},
    "AjSuMaPlVo": {POS: ADJ, "Degree": "Sup", "Gender": "Masc", "Number": "Plur", "Case": "Voc"},
    "AjSuMaSgAc": {POS: ADJ, "Degree": "Sup", "Gender": "Masc", "Number": "Sing", "Case": "Acc"},
    "AjSuMaSgDa": {POS: ADJ, "Degree": "Sup", "Gender": "Masc", "Number": "Sing", "Case": "Dat"},
    "AjSuMaSgGe": {POS: ADJ, "Degree": "Sup", "Gender": "Masc", "Number": "Sing", "Case": "Gen"},
    "AjSuMaSgNm": {POS: ADJ, "Degree": "Sup", "Gender": "Masc", "Number": "Sing", "Case": "Nom"},
    "AjSuMaSgVo": {POS: ADJ, "Degree": "Sup", "Gender": "Masc", "Number": "Sing", "Case": "Voc"},
    "AjSuNePlAc": {POS: ADJ, "Degree": "Sup", "Gender": "Neut", "Number": "Plur", "Case": "Acc"},
    "AjSuNePlDa": {POS: ADJ, "Degree": "Sup", "Gender": "Neut", "Number": "Plur", "Case": "Dat"},
    "AjSuNePlGe": {POS: ADJ, "Degree": "Sup", "Gender": "Neut", "Number": "Plur", "Case": "Gen"},
    "AjSuNePlNm": {POS: ADJ, "Degree": "Sup", "Gender": "Neut", "Number": "Plur", "Case": "Nom"},
    "AjSuNePlVo": {POS: ADJ, "Degree": "Sup", "Gender": "Neut", "Number": "Plur", "Case": "Voc"},
    "AjSuNeSgAc": {POS: ADJ, "Degree": "Sup", "Gender": "Neut", "Number": "Sing", "Case": "Acc"},
    "AjSuNeSgDa": {POS: ADJ, "Degree": "Sup", "Gender": "Neut", "Number": "Sing", "Case": "Dat"},
    "AjSuNeSgGe": {POS: ADJ, "Degree": "Sup", "Gender": "Neut", "Number": "Sing", "Case": "Gen"},
    "AjSuNeSgNm": {POS: ADJ, "Degree": "Sup", "Gender": "Neut", "Number": "Sing", "Case": "Nom"},
    "AjSuNeSgVo": {POS: ADJ, "Degree": "Sup", "Gender": "Neut", "Number": "Sing", "Case": "Voc"},
    "AsPpPaFePlAc": {POS: ADP, "Gender": "Fem", "Number": "Plur", "Case": "Acc"},
    "AsPpPaFePlGe": {POS: ADP, "Gender": "Fem", "Number": "Plur", "Case": "Gen"},
    "AsPpPaFeSgAc": {POS: ADP, "Gender": "Fem", "Number": "Sing", "Case": "Acc"},
    "AsPpPaFeSgGe": {POS: ADP, "Gender": "Fem", "Number": "Sing", "Case": "Gen"},
    "AsPpPaMaPlAc": {POS: ADP, "Gender": "Masc", "Number": "Plur", "Case": "Acc"},
    "AsPpPaMaPlGe": {POS: ADP, "Gender": "Masc", "Number": "Plur", "Case": "Gen"},
    "AsPpPaMaSgAc": {POS: ADP, "Gender": "Masc", "Number": "Sing", "Case": "Acc"},
    "AsPpPaMaSgGe": {POS: ADP, "Gender": "Masc", "Number": "Sing", "Case": "Gen"},
    "AsPpPaNePlAc": {POS: ADP, "Gender": "Neut", "Number": "Plur", "Case": "Acc"},
    "AsPpPaNePlGe": {POS: ADP, "Gender": "Neut", "Number": "Plur", "Case": "Gen"},
    "AsPpPaNeSgAc": {POS: ADP, "Gender": "Neut", "Number": "Sing", "Case": "Acc"},
    "AsPpPaNeSgGe": {POS: ADP, "Gender": "Neut", "Number": "Sing", "Case": "Gen"},
    "AsPpSp": {POS: ADP},
    "AtDfFePlAc": {POS: DET, "PronType": "Art", "Gender": "Fem", "Number": "Plur", "Case": "Acc", "Other":{"Definite": "Def"}},
    "AtDfFePlGe": {POS: DET, "PronType": "Art", "Gender": "Fem", "Number": "Plur", "Case": "Gen", "Other":{"Definite": "Def"}},
    "AtDfFePlNm": {POS: DET, "PronType": "Art", "Gender": "Fem", "Number": "Plur", "Case": "Nom", "Other":{"Definite": "Def"}},
    "AtDfFeSgAc": {POS: DET, "PronType": "Art", "Gender": "Fem", "Number": "Sing", "Case": "Acc", "Other":{"Definite": "Def"}},
    "AtDfFeSgDa": {POS: DET, "PronType": "Art", "Gender": "Fem", "Number": "Sing", "Case": "Dat", "Other":{"Definite": "Def"}},
    "AtDfFeSgGe": {POS: DET, "PronType": "Art", "Gender": "Fem", "Number": "Sing", "Case": "Gen", "Other":{"Definite": "Def"}},
    "AtDfFeSgNm": {POS: DET, "PronType": "Art", "Gender": "Fem", "Number": "Sing", "Case": "Nom", "Other":{"Definite": "Def"}},
    "AtDfMaPlAc": {POS: DET, "PronType": "Art", "Gender": "Masc", "Number": "Plur", "Case": "Acc", "Other":{"Definite": "Def"}},
    "AtDfMaPlGe": {POS: DET, "PronType": "Art", "Gender": "Masc", "Number": "Plur", "Case": "Gen", "Other":{"Definite": "Def"}},
    "AtDfMaPlNm": {POS: DET, "PronType": "Art", "Gender": "Masc", "Number": "Plur", "Case": "Nom", "Other":{"Definite": "Def"}},
    "AtDfMaSgAc": {POS: DET, "PronType": "Art", "Gender": "Masc", "Number": "Sing", "Case": "Acc", "Other":{"Definite": "Def"}},
    "AtDfMaSgDa": {POS: DET, "PronType": "Art", "Gender": "Masc", "Number": "Sing", "Case": "Dat", "Other":{"Definite": "Def"}},
    "AtDfMaSgGe": {POS: DET, "PronType": "Art", "Gender": "Masc", "Number": "Sing", "Case": "Gen", "Other":{"Definite": "Def"}},
    "AtDfMaSgNm": {POS: DET, "PronType": "Art", "Gender": "Masc", "Number": "Sing", "Case": "Nom", "Other":{"Definite": "Def"}},
    "AtDfNePlAc": {POS: DET, "PronType": "Art", "Gender": "Neut", "Number": "Plur", "Case": "Acc", "Other":{"Definite": "Def"}},
    "AtDfNePlDa": {POS: DET, "PronType": "Art", "Gender": "Neut", "Number": "Plur", "Case": "Dat", "Other":{"Definite": "Def"}},
|
||||||
|
"AtDfNePlGe": {POS: DET, "PronType": "Art", "Gender": "Neut", "Number": "Plur", "Case": "Gen", "Other":{"Definite": "Def"}},
|
||||||
|
"AtDfNePlNm": {POS: DET, "PronType": "Art", "Gender": "Neut", "Number": "Plur", "Case": "Nom", "Other":{"Definite": "Def"}},
|
||||||
|
"AtDfNeSgAc": {POS: DET, "PronType": "Art", "Gender": "Neut", "Number": "Sing", "Case": "Acc", "Other":{"Definite": "Def"}},
|
||||||
|
"AtDfNeSgDa": {POS: DET, "PronType": "Art", "Gender": "Neut", "Number": "Sing", "Case": "Dat", "Other":{"Definite": "Def"}},
|
||||||
|
"AtDfNeSgGe": {POS: DET, "PronType": "Art", "Gender": "Neut", "Number": "Sing", "Case": "Gen", "Other":{"Definite": "Def"}},
|
||||||
|
"AtDfNeSgNm": {POS: DET, "PronType": "Art", "Gender": "Neut", "Number": "Sing", "Case": "Nom", "Other":{"Definite": "Def"}},
|
||||||
|
"AtIdFeSgAc": {POS: DET, "PronType": "Art", "Gender": "Fem", "Number": "Sing", "Case": "Acc", "Other":{"Definite": "Ind"}},
|
||||||
|
"AtIdFeSgDa": {POS: DET, "PronType": "Art", "Gender": "Fem", "Number": "Sing", "Case": "Dat", "Other":{"Definite": "Ind"}},
|
||||||
|
"AtIdFeSgGe": {POS: DET, "PronType": "Art", "Gender": "Fem", "Number": "Sing", "Case": "Gen", "Other":{"Definite": "Ind"}},
|
||||||
|
"AtIdFeSgNm": {POS: DET, "PronType": "Art", "Gender": "Fem", "Number": "Sing", "Case": "Nom", "Other":{"Definite": "Ind"}},
|
||||||
|
"AtIdMaSgAc": {POS: DET, "PronType": "Art", "Gender": "Masc", "Number": "Sing", "Case": "Acc", "Other":{"Definite": "Ind"}},
|
||||||
|
"AtIdMaSgGe": {POS: DET, "PronType": "Art", "Gender": "Masc", "Number": "Sing", "Case": "Gen", "Other":{"Definite": "Ind"}},
|
||||||
|
"AtIdMaSgNm": {POS: DET, "PronType": "Art", "Gender": "Masc", "Number": "Sing", "Case": "Nom", "Other":{"Definite": "Ind"}},
|
||||||
|
"AtIdNeSgAc": {POS: DET, "PronType": "Art", "Gender": "Neut", "Number": "Sing", "Case": "Acc", "Other":{"Definite": "Ind"}},
|
||||||
|
"AtIdNeSgGe": {POS: DET, "PronType": "Art", "Gender": "Neut", "Number": "Sing", "Case": "Gen", "Other":{"Definite": "Ind"}},
|
||||||
|
"AtIdNeSgNm": {POS: DET, "PronType": "Art", "Gender": "Neut", "Number": "Sing", "Case": "Nom", "Other":{"Definite": "Ind"}},
|
||||||
|
"CjCo": {POS: CCONJ},
|
||||||
|
"CjSb": {POS: SCONJ},
|
||||||
|
"CPUNCT": {POS: PUNCT},
|
||||||
|
"DATE": {POS: NUM},
|
||||||
|
"DIG": {POS: NUM},
|
||||||
|
"ENUM": {POS: NUM},
|
||||||
|
"Ij": {POS: INTJ},
|
||||||
|
"INIT": {POS: SYM},
|
||||||
|
"NBABBR": {POS: NOUN, "Abbr":"Yes"},
|
||||||
|
"NmAnFePlAcAj": {POS: NUM, "NumType": "Mult", "Gender": "Fem", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"NmAnFePlGeAj": {POS: NUM, "NumType": "Mult", "Gender": "Fem", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"NmAnFePlNmAj": {POS: NUM, "NumType": "Mult", "Gender": "Fem", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"NmAnFePlVoAj": {POS: NUM, "NumType": "Mult", "Gender": "Fem", "Number": "Plur", "Case": "Voc"},
|
||||||
|
"NmAnFeSgAcAj": {POS: NUM, "NumType": "Mult", "Gender": "Fem", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"NmAnFeSgGeAj": {POS: NUM, "NumType": "Mult", "Gender": "Fem", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"NmAnFeSgNmAj": {POS: NUM, "NumType": "Mult", "Gender": "Fem", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"NmAnFeSgVoAj": {POS: NUM, "NumType": "Mult", "Gender": "Fem", "Number": "Sing", "Case": "Voc"},
|
||||||
|
"NmAnMaPlAcAj": {POS: NUM, "NumType": "Mult", "Gender": "Masc", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"NmAnMaPlGeAj": {POS: NUM, "NumType": "Mult", "Gender": "Masc", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"NmAnMaPlNmAj": {POS: NUM, "NumType": "Mult", "Gender": "Masc", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"NmAnMaPlVoAj": {POS: NUM, "NumType": "Mult", "Gender": "Masc", "Number": "Plur", "Case": "Voc"},
|
||||||
|
"NmAnMaSgAcAj": {POS: NUM, "NumType": "Mult", "Gender": "Masc", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"NmAnMaSgGeAj": {POS: NUM, "NumType": "Mult", "Gender": "Masc", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"NmAnMaSgNmAj": {POS: NUM, "NumType": "Mult", "Gender": "Masc", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"NmAnMaSgVoAj": {POS: NUM, "NumType": "Mult", "Gender": "Masc", "Number": "Sing", "Case": "Voc"},
|
||||||
|
"NmAnNePlAcAj": {POS: NUM, "NumType": "Mult", "Gender": "Neut", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"NmAnNePlGeAj": {POS: NUM, "NumType": "Mult", "Gender": "Neut", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"NmAnNePlNmAj": {POS: NUM, "NumType": "Mult", "Gender": "Neut", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"NmAnNePlVoAj": {POS: NUM, "NumType": "Mult", "Gender": "Neut", "Number": "Plur", "Case": "Voc"},
|
||||||
|
"NmAnNeSgAcAj": {POS: NUM, "NumType": "Mult", "Gender": "Neut", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"NmAnNeSgGeAj": {POS: NUM, "NumType": "Mult", "Gender": "Neut", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"NmAnNeSgNmAj": {POS: NUM, "NumType": "Mult", "Gender": "Neut", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"NmAnNeSgVoAj": {POS: NUM, "NumType": "Mult", "Gender": "Neut", "Number": "Sing", "Case": "Voc"},
|
||||||
|
"NmAnXxXxXxAd": {POS: NUM, "NumType": "Mult", "Gender": "Masc|Fem|Neut", "Number": "Sing|Plur", "Case": "Acc|Gen|Nom|Voc"},
|
||||||
|
"NmCdFePlAcAj": {POS: NUM, "NumType": "Card", "Gender": "Fem", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"NmCdFePlGeAj": {POS: NUM, "NumType": "Card", "Gender": "Fem", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"NmCdFePlNmAj": {POS: NUM, "NumType": "Card", "Gender": "Fem", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"NmCdFePlVoAj": {POS: NUM, "NumType": "Card", "Gender": "Fem", "Number": "Plur", "Case": "Voc"},
|
||||||
|
"NmCdFeSgAcAj": {POS: NUM, "NumType": "Card", "Gender": "Fem", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"NmCdFeSgDaAj": {POS: NUM, "NumType": "Card", "Gender": "Fem", "Number": "Sing", "Case": "Dat"},
|
||||||
|
"NmCdFeSgGeAj": {POS: NUM, "NumType": "Card", "Gender": "Fem", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"NmCdFeSgNmAj": {POS: NUM, "NumType": "Card", "Gender": "Fem", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"NmCdMaPlAcAj": {POS: NUM, "NumType": "Card", "Gender": "Masc", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"NmCdMaPlGeAj": {POS: NUM, "NumType": "Card", "Gender": "Masc", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"NmCdMaPlNmAj": {POS: NUM, "NumType": "Card", "Gender": "Masc", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"NmCdMaPlVoAj": {POS: NUM, "NumType": "Card", "Gender": "Masc", "Number": "Plur", "Case": "Voc"},
|
||||||
|
"NmCdMaSgAcAj": {POS: NUM, "NumType": "Card", "Gender": "Masc", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"NmCdMaSgGeAj": {POS: NUM, "NumType": "Card", "Gender": "Masc", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"NmCdMaSgNmAj": {POS: NUM, "NumType": "Card", "Gender": "Masc", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"NmCdNePlAcAj": {POS: NUM, "NumType": "Card", "Gender": "Neut", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"NmCdNePlDaAj": {POS: NUM, "NumType": "Card", "Gender": "Neut", "Number": "Plur", "Case": "Dat"},
|
||||||
|
"NmCdNePlGeAj": {POS: NUM, "NumType": "Card", "Gender": "Neut", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"NmCdNePlNmAj": {POS: NUM, "NumType": "Card", "Gender": "Neut", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"NmCdNePlVoAj": {POS: NUM, "NumType": "Card", "Gender": "Neut", "Number": "Plur", "Case": "Voc"},
|
||||||
|
"NmCdNeSgAcAj": {POS: NUM, "NumType": "Card", "Gender": "Neut", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"NmCdNeSgGeAj": {POS: NUM, "NumType": "Card", "Gender": "Neut", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"NmCdNeSgNmAj": {POS: NUM, "NumType": "Card", "Gender": "Neut", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"NmCtFePlAcNo": {POS: NUM, "NumType": "Sets", "Gender": "Fem", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"NmCtFePlGeNo": {POS: NUM, "NumType": "Sets", "Gender": "Fem", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"NmCtFePlNmNo": {POS: NUM, "NumType": "Sets", "Gender": "Fem", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"NmCtFePlVoNo": {POS: NUM, "NumType": "Sets", "Gender": "Fem", "Number": "Plur", "Case": "Voc"},
|
||||||
|
"NmCtFeSgAcNo": {POS: NUM, "NumType": "Sets", "Gender": "Fem", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"NmCtFeSgGeNo": {POS: NUM, "NumType": "Sets", "Gender": "Fem", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"NmCtFeSgNmNo": {POS: NUM, "NumType": "Sets", "Gender": "Fem", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"NmCtFeSgVoNo": {POS: NUM, "NumType": "Sets", "Gender": "Fem", "Number": "Sing", "Case": "Voc"},
|
||||||
|
"NmMlFePlAcAj": {POS: NUM, "NumType": "Mult", "Gender": "Fem", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"NmMlFePlGeAj": {POS: NUM, "NumType": "Mult", "Gender": "Fem", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"NmMlFePlNmAj": {POS: NUM, "NumType": "Mult", "Gender": "Fem", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"NmMlFePlVoAj": {POS: NUM, "NumType": "Mult", "Gender": "Fem", "Number": "Plur", "Case": "Voc"},
|
||||||
|
"NmMlFeSgAcAj": {POS: NUM, "NumType": "Mult", "Gender": "Fem", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"NmMlFeSgGeAj": {POS: NUM, "NumType": "Mult", "Gender": "Fem", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"NmMlFeSgNmAj": {POS: NUM, "NumType": "Mult", "Gender": "Fem", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"NmMlFeSgVoAj": {POS: NUM, "NumType": "Mult", "Gender": "Fem", "Number": "Sing", "Case": "Voc"},
|
||||||
|
"NmMlMaPlAcAj": {POS: NUM, "NumType": "Mult", "Gender": "Masc", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"NmMlMaPlGeAj": {POS: NUM, "NumType": "Mult", "Gender": "Masc", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"NmMlMaPlNmAj": {POS: NUM, "NumType": "Mult", "Gender": "Masc", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"NmMlMaPlVoAj": {POS: NUM, "NumType": "Mult", "Gender": "Masc", "Number": "Plur", "Case": "Voc"},
|
||||||
|
"NmMlMaSgAcAj": {POS: NUM, "NumType": "Mult", "Gender": "Masc", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"NmMlMaSgGeAj": {POS: NUM, "NumType": "Mult", "Gender": "Masc", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"NmMlMaSgNmAj": {POS: NUM, "NumType": "Mult", "Gender": "Masc", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"NmMlMaSgVoAj": {POS: NUM, "NumType": "Mult", "Gender": "Masc", "Number": "Sing", "Case": "Voc"},
|
||||||
|
"NmMlNePlAcAj": {POS: NUM, "NumType": "Mult", "Gender": "Neut", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"NmMlNePlGeAj": {POS: NUM, "NumType": "Mult", "Gender": "Neut", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"NmMlNePlNmAj": {POS: NUM, "NumType": "Mult", "Gender": "Neut", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"NmMlNePlVoAj": {POS: NUM, "NumType": "Mult", "Gender": "Neut", "Number": "Plur", "Case": "Voc"},
|
||||||
|
"NmMlNeSgAcAj": {POS: NUM, "NumType": "Mult", "Gender": "Neut", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"NmMlNeSgGeAj": {POS: NUM, "NumType": "Mult", "Gender": "Neut", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"NmMlNeSgNmAj": {POS: NUM, "NumType": "Mult", "Gender": "Neut", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"NmMlNeSgVoAj": {POS: NUM, "NumType": "Mult", "Gender": "Neut", "Number": "Sing", "Case": "Voc"},
|
||||||
|
"NmMlXxXxXxAd": {POS: NUM, "NumType": "Mult", "Gender": "Masc|Fem|Neut", "Number": "Sing|Plur", "Case": "Acc|Gen|Nom|Voc"},
|
||||||
|
"NmOdFePlAcAj": {POS: NUM, "NumType": "Ord", "Gender": "Fem", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"NmOdFePlGeAj": {POS: NUM, "NumType": "Ord", "Gender": "Fem", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"NmOdFePlNmAj": {POS: NUM, "NumType": "Ord", "Gender": "Fem", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"NmOdFePlVoAj": {POS: NUM, "NumType": "Ord", "Gender": "Fem", "Number": "Plur", "Case": "Voc"},
|
||||||
|
"NmOdFeSgAcAj": {POS: NUM, "NumType": "Ord", "Gender": "Fem", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"NmOdFeSgGeAj": {POS: NUM, "NumType": "Ord", "Gender": "Fem", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"NmOdFeSgNmAj": {POS: NUM, "NumType": "Ord", "Gender": "Fem", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"NmOdFeSgVoAj": {POS: NUM, "NumType": "Ord", "Gender": "Fem", "Number": "Sing", "Case": "Voc"},
|
||||||
|
"NmOdMaPlAcAj": {POS: NUM, "NumType": "Ord", "Gender": "Masc", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"NmOdMaPlGeAj": {POS: NUM, "NumType": "Ord", "Gender": "Masc", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"NmOdMaPlNmAj": {POS: NUM, "NumType": "Ord", "Gender": "Masc", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"NmOdMaPlVoAj": {POS: NUM, "NumType": "Ord", "Gender": "Masc", "Number": "Plur", "Case": "Voc"},
|
||||||
|
"NmOdMaSgAcAj": {POS: NUM, "NumType": "Ord", "Gender": "Masc", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"NmOdMaSgGeAj": {POS: NUM, "NumType": "Ord", "Gender": "Masc", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"NmOdMaSgNmAj": {POS: NUM, "NumType": "Ord", "Gender": "Masc", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"NmOdMaSgVoAj": {POS: NUM, "NumType": "Ord", "Gender": "Masc", "Number": "Sing", "Case": "Voc"},
|
||||||
|
"NmOdNePlAcAj": {POS: NUM, "NumType": "Ord", "Gender": "Neut", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"NmOdNePlGeAj": {POS: NUM, "NumType": "Ord", "Gender": "Neut", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"NmOdNePlNmAj": {POS: NUM, "NumType": "Ord", "Gender": "Neut", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"NmOdNePlVoAj": {POS: NUM, "NumType": "Ord", "Gender": "Neut", "Number": "Plur", "Case": "Voc"},
|
||||||
|
"NmOdNeSgAcAj": {POS: NUM, "NumType": "Ord", "Gender": "Neut", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"NmOdNeSgGeAj": {POS: NUM, "NumType": "Ord", "Gender": "Neut", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"NmOdNeSgNmAj": {POS: NUM, "NumType": "Ord", "Gender": "Neut", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"NmOdNeSgVoAj": {POS: NUM, "NumType": "Ord", "Gender": "Neut", "Number": "Sing", "Case": "Voc"},
|
||||||
|
"NoCmFePlAc": {POS: NOUN, "Gender": "Fem", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"NoCmFePlDa": {POS: NOUN, "Gender": "Fem", "Number": "Plur", "Case": "Dat"},
|
||||||
|
"NoCmFePlGe": {POS: NOUN, "Gender": "Fem", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"NoCmFePlNm": {POS: NOUN, "Gender": "Fem", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"NoCmFePlVo": {POS: NOUN, "Gender": "Fem", "Number": "Plur", "Case": "Voc"},
|
||||||
|
"NoCmFeSgAc": {POS: NOUN, "Gender": "Fem", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"NoCmFeSgDa": {POS: NOUN, "Gender": "Fem", "Number": "Sing", "Case": "Dat"},
|
||||||
|
"NoCmFeSgGe": {POS: NOUN, "Gender": "Fem", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"NoCmFeSgNm": {POS: NOUN, "Gender": "Fem", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"NoCmFeSgVo": {POS: NOUN, "Gender": "Fem", "Number": "Sing", "Case": "Voc"},
|
||||||
|
"NoCmMaPlAc": {POS: NOUN, "Gender": "Masc", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"NoCmMaPlDa": {POS: NOUN, "Gender": "Masc", "Number": "Plur", "Case": "Dat"},
|
||||||
|
"NoCmMaPlGe": {POS: NOUN, "Gender": "Masc", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"NoCmMaPlNm": {POS: NOUN, "Gender": "Masc", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"NoCmMaPlVo": {POS: NOUN, "Gender": "Masc", "Number": "Plur", "Case": "Voc"},
|
||||||
|
"NoCmMaSgAc": {POS: NOUN, "Gender": "Masc", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"NoCmMaSgDa": {POS: NOUN, "Gender": "Masc", "Number": "Sing", "Case": "Dat"},
|
||||||
|
"NoCmMaSgGe": {POS: NOUN, "Gender": "Masc", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"NoCmMaSgNm": {POS: NOUN, "Gender": "Masc", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"NoCmMaSgVo": {POS: NOUN, "Gender": "Masc", "Number": "Sing", "Case": "Voc"},
|
||||||
|
"NoCmNePlAc": {POS: NOUN, "Gender": "Neut", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"NoCmNePlDa": {POS: NOUN, "Gender": "Neut", "Number": "Plur", "Case": "Dat"},
|
||||||
|
"NoCmNePlGe": {POS: NOUN, "Gender": "Neut", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"NoCmNePlNm": {POS: NOUN, "Gender": "Neut", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"NoCmNePlVo": {POS: NOUN, "Gender": "Neut", "Number": "Plur", "Case": "Voc"},
|
||||||
|
"NoCmNeSgAc": {POS: NOUN, "Gender": "Neut", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"NoCmNeSgDa": {POS: NOUN, "Gender": "Neut", "Number": "Sing", "Case": "Dat"},
|
||||||
|
"NoCmNeSgGe": {POS: NOUN, "Gender": "Neut", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"NoCmNeSgNm": {POS: NOUN, "Gender": "Neut", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"NoCmNeSgVo": {POS: NOUN, "Gender": "Neut", "Number": "Sing", "Case": "Voc"},
|
||||||
|
"NoPrFePlAc": {POS: PROPN, "Gender": "Fem", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"NoPrFePlDa": {POS: PROPN, "Gender": "Fem", "Number": "Plur", "Case": "Dat"},
|
||||||
|
"NoPrFePlGe": {POS: PROPN, "Gender": "Fem", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"NoPrFePlNm": {POS: PROPN, "Gender": "Fem", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"NoPrFePlVo": {POS: PROPN, "Gender": "Fem", "Number": "Plur", "Case": "Voc"},
|
||||||
|
"NoPrFeSgAc": {POS: PROPN, "Gender": "Fem", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"NoPrFeSgDa": {POS: PROPN, "Gender": "Fem", "Number": "Sing", "Case": "Dat"},
|
||||||
|
"NoPrFeSgGe": {POS: PROPN, "Gender": "Fem", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"NoPrFeSgNm": {POS: PROPN, "Gender": "Fem", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"NoPrFeSgVo": {POS: PROPN, "Gender": "Fem", "Number": "Sing", "Case": "Voc"},
|
||||||
|
"NoPrMaPlAc": {POS: PROPN, "Gender": "Masc", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"NoPrMaPlGe": {POS: PROPN, "Gender": "Masc", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"NoPrMaPlNm": {POS: PROPN, "Gender": "Masc", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"NoPrMaPlVo": {POS: PROPN, "Gender": "Masc", "Number": "Plur", "Case": "Voc"},
|
||||||
|
"NoPrMaSgAc": {POS: PROPN, "Gender": "Masc", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"NoPrMaSgDa": {POS: PROPN, "Gender": "Masc", "Number": "Sing", "Case": "Dat"},
|
||||||
|
"NoPrMaSgGe": {POS: PROPN, "Gender": "Masc", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"NoPrMaSgNm": {POS: PROPN, "Gender": "Masc", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"NoPrMaSgVo": {POS: PROPN, "Gender": "Masc", "Number": "Sing", "Case": "Voc"},
|
||||||
|
"NoPrNePlAc": {POS: PROPN, "Gender": "Neut", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"NoPrNePlGe": {POS: PROPN, "Gender": "Neut", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"NoPrNePlNm": {POS: PROPN, "Gender": "Neut", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"NoPrNeSgAc": {POS: PROPN, "Gender": "Neut", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"NoPrNeSgGe": {POS: PROPN, "Gender": "Neut", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"NoPrNeSgNm": {POS: PROPN, "Gender": "Neut", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"OPUNCT": {POS: PUNCT},
|
||||||
|
"PnDfFe03PlAcXx": {POS: PRON, "PronType": "", "Gender": "Fem", "Person": "3", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnDfFe03SgAcXx": {POS: PRON, "PronType": "", "Gender": "Fem", "Person": "3", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnDfMa03PlGeXx": {POS: PRON, "PronType": "", "Gender": "Masc", "Person": "3", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnDmFe03PlAcXx": {POS: PRON, "PronType": "Dem", "Gender": "Fem", "Person": "3", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnDmFe03PlGeXx": {POS: PRON, "PronType": "Dem", "Gender": "Fem", "Person": "3", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnDmFe03PlNmXx": {POS: PRON, "PronType": "Dem", "Gender": "Fem", "Person": "3", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"PnDmFe03SgAcXx": {POS: PRON, "PronType": "Dem", "Gender": "Fem", "Person": "3", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnDmFe03SgDaXx": {POS: PRON, "PronType": "Dem", "Gender": "Fem", "Person": "3", "Number": "Sing", "Case": "Dat"},
|
||||||
|
"PnDmFe03SgGeXx": {POS: PRON, "PronType": "Dem", "Gender": "Fem", "Person": "3", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnDmFe03SgNmXx": {POS: PRON, "PronType": "Dem", "Gender": "Fem", "Person": "3", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"PnDmMa03PlAcXx": {POS: PRON, "PronType": "Dem", "Gender": "Masc", "Person": "3", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnDmMa03PlDaXx": {POS: PRON, "PronType": "Dem", "Gender": "Masc", "Person": "3", "Number": "Plur", "Case": "Dat"},
|
||||||
|
"PnDmMa03PlGeXx": {POS: PRON, "PronType": "Dem", "Gender": "Masc", "Person": "3", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnDmMa03PlNmXx": {POS: PRON, "PronType": "Dem", "Gender": "Masc", "Person": "3", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"PnDmMa03SgAcXx": {POS: PRON, "PronType": "Dem", "Gender": "Masc", "Person": "3", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnDmMa03SgGeXx": {POS: PRON, "PronType": "Dem", "Gender": "Masc", "Person": "3", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnDmMa03SgNmXx": {POS: PRON, "PronType": "Dem", "Gender": "Masc", "Person": "3", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"PnDmNe03PlAcXx": {POS: PRON, "PronType": "Dem", "Gender": "Neut", "Person": "3", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnDmNe03PlDaXx": {POS: PRON, "PronType": "Dem", "Gender": "Neut", "Person": "3", "Number": "Plur", "Case": "Dat"},
|
||||||
|
"PnDmNe03PlGeXx": {POS: PRON, "PronType": "Dem", "Gender": "Neut", "Person": "3", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnDmNe03PlNmXx": {POS: PRON, "PronType": "Dem", "Gender": "Neut", "Person": "3", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"PnDmNe03SgAcXx": {POS: PRON, "PronType": "Dem", "Gender": "Neut", "Person": "3", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnDmNe03SgDaXx": {POS: PRON, "PronType": "Dem", "Gender": "Neut", "Person": "3", "Number": "Sing", "Case": "Dat"},
|
||||||
|
"PnDmNe03SgGeXx": {POS: PRON, "PronType": "Dem", "Gender": "Neut", "Person": "3", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnDmNe03SgNmXx": {POS: PRON, "PronType": "Dem", "Gender": "Neut", "Person": "3", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"PnIdFe03PlAcXx": {POS: PRON, "PronType": "Ind", "Gender": "Fem", "Person": "3", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnIdFe03PlGeXx": {POS: PRON, "PronType": "Ind", "Gender": "Fem", "Person": "3", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnIdFe03PlNmXx": {POS: PRON, "PronType": "Ind", "Gender": "Fem", "Person": "3", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"PnIdFe03SgAcXx": {POS: PRON, "PronType": "Ind", "Gender": "Fem", "Person": "3", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnIdFe03SgGeXx": {POS: PRON, "PronType": "Ind", "Gender": "Fem", "Person": "3", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnIdFe03SgNmXx": {POS: PRON, "PronType": "Ind", "Gender": "Fem", "Person": "3", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"PnIdMa03PlAcXx": {POS: PRON, "PronType": "Ind", "Gender": "Masc", "Person": "3", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnIdMa03PlGeXx": {POS: PRON, "PronType": "Ind", "Gender": "Masc", "Person": "3", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnIdMa03PlNmXx": {POS: PRON, "PronType": "Ind", "Gender": "Masc", "Person": "3", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"PnIdMa03SgAcXx": {POS: PRON, "PronType": "Ind", "Gender": "Masc", "Person": "3", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnIdMa03SgGeXx": {POS: PRON, "PronType": "Ind", "Gender": "Masc", "Person": "3", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnIdMa03SgNmXx": {POS: PRON, "PronType": "Ind", "Gender": "Masc", "Person": "3", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"PnIdNe03PlAcXx": {POS: PRON, "PronType": "Ind", "Gender": "Neut", "Person": "3", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnIdNe03PlGeXx": {POS: PRON, "PronType": "Ind", "Gender": "Neut", "Person": "3", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnIdNe03PlNmXx": {POS: PRON, "PronType": "Ind", "Gender": "Neut", "Person": "3", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"PnIdNe03SgAcXx": {POS: PRON, "PronType": "Ind", "Gender": "Neut", "Person": "3", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnIdNe03SgDaXx": {POS: PRON, "PronType": "Ind", "Gender": "Neut", "Person": "3", "Number": "Sing", "Case": "Dat"},
|
||||||
|
"PnIdNe03SgGeXx": {POS: PRON, "PronType": "Ind", "Gender": "Neut", "Person": "3", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnIdNe03SgNmXx": {POS: PRON, "PronType": "Ind", "Gender": "Neut", "Person": "3", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"PnIrFe03PlAcXx": {POS: PRON, "PronType": "Int", "Gender": "Fem", "Person": "3", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnIrFe03PlGeXx": {POS: PRON, "PronType": "Int", "Gender": "Fem", "Person": "3", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnIrFe03PlNmXx": {POS: PRON, "PronType": "Int", "Gender": "Fem", "Person": "3", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"PnIrFe03SgAcXx": {POS: PRON, "PronType": "Int", "Gender": "Fem", "Person": "3", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnIrFe03SgGeXx": {POS: PRON, "PronType": "Int", "Gender": "Fem", "Person": "3", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnIrFe03SgNmXx": {POS: PRON, "PronType": "Int", "Gender": "Fem", "Person": "3", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"PnIrMa03PlAcXx": {POS: PRON, "PronType": "Int", "Gender": "Masc", "Person": "3", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnIrMa03PlGeXx": {POS: PRON, "PronType": "Int", "Gender": "Masc", "Person": "3", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnIrMa03PlNmXx": {POS: PRON, "PronType": "Int", "Gender": "Masc", "Person": "3", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"PnIrMa03SgAcXx": {POS: PRON, "PronType": "Int", "Gender": "Masc", "Person": "3", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnIrMa03SgGeXx": {POS: PRON, "PronType": "Int", "Gender": "Masc", "Person": "3", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnIrMa03SgNmXx": {POS: PRON, "PronType": "Int", "Gender": "Masc", "Person": "3", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"PnIrNe03PlAcXx": {POS: PRON, "PronType": "Int", "Gender": "Neut", "Person": "3", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnIrNe03PlGeXx": {POS: PRON, "PronType": "Int", "Gender": "Neut", "Person": "3", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnIrNe03PlNmXx": {POS: PRON, "PronType": "Int", "Gender": "Neut", "Person": "3", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"PnIrNe03SgAcXx": {POS: PRON, "PronType": "Int", "Gender": "Neut", "Person": "3", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnIrNe03SgGeXx": {POS: PRON, "PronType": "Int", "Gender": "Neut", "Person": "3", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnIrNe03SgNmXx": {POS: PRON, "PronType": "Int", "Gender": "Neut", "Person": "3", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"PnPeFe01PlAcSt": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "1", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnPeFe01PlAcWe": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "1", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnPeFe01PlGeWe": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "1", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnPeFe01PlNmSt": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "1", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"PnPeFe01SgAcSt": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "1", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnPeFe01SgAcWe": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "1", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnPeFe01SgGeSt": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "1", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnPeFe01SgGeWe": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "1", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnPeFe01SgNmSt": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "1", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"PnPeFe02PlAcSt": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "2", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnPeFe02PlAcWe": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "2", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnPeFe02PlGeSt": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "2", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnPeFe02PlGeWe": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "2", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnPeFe02PlNmSt": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "2", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"PnPeFe02SgAcSt": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "2", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnPeFe02SgAcWe": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "2", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnPeFe02SgGeWe": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "2", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnPeFe02SgNmSt": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "2", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"PnPeFe03PlAcSt": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "3", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnPeFe03PlAcWe": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "3", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnPeFe03PlGeSt": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "3", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnPeFe03PlGeWe": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "3", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnPeFe03PlNmSt": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "3", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"PnPeFe03SgAcSt": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "3", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnPeFe03SgAcWe": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "3", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnPeFe03SgGeSt": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "3", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnPeFe03SgGeWe": {POS: PRON, "PronType": "Prs", "Gender": "Fem", "Person": "3", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnPeMa01PlAcSt": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "1", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnPeMa01PlAcWe": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "1", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnPeMa01PlDaSt": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "1", "Number": "Plur", "Case": "Dat"},
|
||||||
|
"PnPeMa01PlGeSt": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "1", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnPeMa01PlGeWe": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "1", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnPeMa01PlNmSt": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "1", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"PnPeMa01SgAcSt": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "1", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnPeMa01SgAcWe": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "1", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnPeMa01SgGeSt": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "1", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnPeMa01SgGeWe": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "1", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnPeMa01SgNmSt": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "1", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"PnPeMa02PlAcSt": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "2", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnPeMa02PlAcWe": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "2", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnPeMa02PlGeWe": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "2", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnPeMa02PlNmSt": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "2", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"PnPeMa02PlVoSt": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "2", "Number": "Plur", "Case": "Voc"},
|
||||||
|
"PnPeMa02SgAcSt": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "2", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnPeMa02SgAcWe": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "2", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnPeMa02SgGeWe": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "2", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnPeMa02SgNmSt": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "2", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"PnPeMa03PlAcWe": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "3", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnPeMa03PlGeSt": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "3", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnPeMa03PlGeWe": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "3", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnPeMa03PlNmSt": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "3", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"PnPeMa03SgAcSt": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "3", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnPeMa03SgAcWe": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "3", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnPeMa03SgGeSt": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "3", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnPeMa03SgGeWe": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "3", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnPeMa03SgNmWe": {POS: PRON, "PronType": "Prs", "Gender": "Masc", "Person": "3", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"PnPeNe03PlAcWe": {POS: PRON, "PronType": "Prs", "Gender": "Neut", "Person": "3", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnPeNe03PlGeSt": {POS: PRON, "PronType": "Prs", "Gender": "Neut", "Person": "3", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnPeNe03PlGeWe": {POS: PRON, "PronType": "Prs", "Gender": "Neut", "Person": "3", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnPeNe03SgAcSt": {POS: PRON, "PronType": "Prs", "Gender": "Neut", "Person": "3", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnPeNe03SgAcWe": {POS: PRON, "PronType": "Prs", "Gender": "Neut", "Person": "3", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnPeNe03SgGeSt": {POS: PRON, "PronType": "Prs", "Gender": "Neut", "Person": "3", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnPeNe03SgGeWe": {POS: PRON, "PronType": "Prs", "Gender": "Neut", "Person": "3", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnPoFe01PlGeXx": {POS: PRON, "Poss": "Yes", "Gender": "Fem", "Person": "1", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnPoFe01SgGeXx": {POS: PRON, "Poss": "Yes", "Gender": "Fem", "Person": "1", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnPoFe02PlGeXx": {POS: PRON, "Poss": "Yes", "Gender": "Fem", "Person": "2", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnPoFe02SgGeXx": {POS: PRON, "Poss": "Yes", "Gender": "Fem", "Person": "2", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnPoFe03PlGeXx": {POS: PRON, "Poss": "Yes", "Gender": "Fem", "Person": "3", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnPoFe03SgGeXx": {POS: PRON, "Poss": "Yes", "Gender": "Fem", "Person": "3", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnPoMa01PlGeXx": {POS: PRON, "Poss": "Yes", "Gender": "Masc", "Person": "1", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnPoMa01SgGeXx": {POS: PRON, "Poss": "Yes", "Gender": "Masc", "Person": "1", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnPoMa02PlGeXx": {POS: PRON, "Poss": "Yes", "Gender": "Masc", "Person": "2", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnPoMa02SgGeXx": {POS: PRON, "Poss": "Yes", "Gender": "Masc", "Person": "2", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnPoMa03PlGeXx": {POS: PRON, "Poss": "Yes", "Gender": "Masc", "Person": "3", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnPoMa03SgGeXx": {POS: PRON, "Poss": "Yes", "Gender": "Masc", "Person": "3", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnPoNe03PlGeXx": {POS: PRON, "Poss": "Yes", "Gender": "Neut", "Person": "3", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnPoNe03SgGeXx": {POS: PRON, "Poss": "Yes", "Gender": "Neut", "Person": "3", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnReFe03PlAcXx": {POS: PRON, "PronType": "Rel", "Gender": "Fem", "Person": "3", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnReFe03PlGeXx": {POS: PRON, "PronType": "Rel", "Gender": "Fem", "Person": "3", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnReFe03PlNmXx": {POS: PRON, "PronType": "Rel", "Gender": "Fem", "Person": "3", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"PnReFe03SgAcXx": {POS: PRON, "PronType": "Rel", "Gender": "Fem", "Person": "3", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnReFe03SgGeXx": {POS: PRON, "PronType": "Rel", "Gender": "Fem", "Person": "3", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnReFe03SgNmXx": {POS: PRON, "PronType": "Rel", "Gender": "Fem", "Person": "3", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"PnReMa03PlAcXx": {POS: PRON, "PronType": "Rel", "Gender": "Masc", "Person": "3", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnReMa03PlGeXx": {POS: PRON, "PronType": "Rel", "Gender": "Masc", "Person": "3", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnReMa03PlNmXx": {POS: PRON, "PronType": "Rel", "Gender": "Masc", "Person": "3", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"PnReMa03SgAcXx": {POS: PRON, "PronType": "Rel", "Gender": "Masc", "Person": "3", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnReMa03SgGeXx": {POS: PRON, "PronType": "Rel", "Gender": "Masc", "Person": "3", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnReMa03SgNmXx": {POS: PRON, "PronType": "Rel", "Gender": "Masc", "Person": "3", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"PnReNe03PlAcXx": {POS: PRON, "PronType": "Rel", "Gender": "Neut", "Person": "3", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnReNe03PlGeXx": {POS: PRON, "PronType": "Rel", "Gender": "Neut", "Person": "3", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnReNe03PlNmXx": {POS: PRON, "PronType": "Rel", "Gender": "Neut", "Person": "3", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"PnReNe03SgAcXx": {POS: PRON, "PronType": "Rel", "Gender": "Neut", "Person": "3", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnReNe03SgGeXx": {POS: PRON, "PronType": "Rel", "Gender": "Neut", "Person": "3", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnReNe03SgNmXx": {POS: PRON, "PronType": "Rel", "Gender": "Neut", "Person": "3", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"PnRiFe03PlAcXx": {POS: PRON, "PronType": "Ind,Rel", "Gender": "Fem", "Person": "3", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnRiFe03PlGeXx": {POS: PRON, "PronType": "Ind,Rel", "Gender": "Fem", "Person": "3", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnRiFe03PlNmXx": {POS: PRON, "PronType": "Ind,Rel", "Gender": "Fem", "Person": "3", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"PnRiFe03SgAcXx": {POS: PRON, "PronType": "Ind,Rel", "Gender": "Fem", "Person": "3", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnRiFe03SgGeXx": {POS: PRON, "PronType": "Ind,Rel", "Gender": "Fem", "Person": "3", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnRiFe03SgNmXx": {POS: PRON, "PronType": "Ind,Rel", "Gender": "Fem", "Person": "3", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"PnRiMa03PlAcXx": {POS: PRON, "PronType": "Ind,Rel", "Gender": "Masc", "Person": "3", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnRiMa03PlGeXx": {POS: PRON, "PronType": "Ind,Rel", "Gender": "Masc", "Person": "3", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnRiMa03PlNmXx": {POS: PRON, "PronType": "Ind,Rel", "Gender": "Masc", "Person": "3", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"PnRiMa03SgAcXx": {POS: PRON, "PronType": "Ind,Rel", "Gender": "Masc", "Person": "3", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnRiMa03SgGeXx": {POS: PRON, "PronType": "Ind,Rel", "Gender": "Masc", "Person": "3", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnRiMa03SgNmXx": {POS: PRON, "PronType": "Ind,Rel", "Gender": "Masc", "Person": "3", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"PnRiNe03PlAcXx": {POS: PRON, "PronType": "Ind,Rel", "Gender": "Neut", "Person": "3", "Number": "Plur", "Case": "Acc"},
|
||||||
|
"PnRiNe03PlGeXx": {POS: PRON, "PronType": "Ind,Rel", "Gender": "Neut", "Person": "3", "Number": "Plur", "Case": "Gen"},
|
||||||
|
"PnRiNe03PlNmXx": {POS: PRON, "PronType": "Ind,Rel", "Gender": "Neut", "Person": "3", "Number": "Plur", "Case": "Nom"},
|
||||||
|
"PnRiNe03SgAcXx": {POS: PRON, "PronType": "Ind,Rel", "Gender": "Neut", "Person": "3", "Number": "Sing", "Case": "Acc"},
|
||||||
|
"PnRiNe03SgGeXx": {POS: PRON, "PronType": "Ind,Rel", "Gender": "Neut", "Person": "3", "Number": "Sing", "Case": "Gen"},
|
||||||
|
"PnRiNe03SgNmXx": {POS: PRON, "PronType": "Ind,Rel", "Gender": "Neut", "Person": "3", "Number": "Sing", "Case": "Nom"},
|
||||||
|
"PTERM_P": {POS: PUNCT},
|
||||||
|
"PtFu": {POS: PART},
|
||||||
|
"PtNg": {POS: PART},
|
||||||
|
"PtOt": {POS: PART},
|
||||||
|
"PtSj": {POS: PART},
|
||||||
|
"Pu": {POS: SYM},
|
||||||
|
"PUNCT": {POS: PUNCT},
|
||||||
|
"RgAbXx": {POS: X},
|
||||||
|
"RgAnXx": {POS: X},
|
||||||
|
"RgFwOr": {POS: X, "Foreign": "Yes"},
|
||||||
|
"RgFwTr": {POS: X, "Foreign": "Yes"},
|
||||||
|
"RgSyXx": {POS: SYM},
|
||||||
|
"VbIsIdPa03SgXxIpAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "3", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Imp", "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbIsIdPa03SgXxIpPvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "3", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Imp", "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbIsIdPa03SgXxPeAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "3", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Perf", "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbIsIdPa03SgXxPePvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "3", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Perf", "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbIsIdPr03SgXxIpAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres", "Person": "3", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Imp", "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbIsIdPr03SgXxIpPvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres", "Person": "3", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Imp", "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbIsIdXx03SgXxPeAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres|Past", "Person": "3", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Perf", "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbIsIdXx03SgXxPePvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres|Past", "Person": "3", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Perf", "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbIsNfXxXxXxXxPeAvXx": {POS: VERB, "VerbForm": "Inf", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Sing|Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Perf", "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPa01PlXxIpAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "1", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Imp", "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPa01PlXxIpPvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "1", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Imp", "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPa01PlXxPeAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "1", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Perf", "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPa01PlXxPePvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "1", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Perf", "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPa01SgXxIpAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "1", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Imp", "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPa01SgXxIpPvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "1", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Imp", "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPa01SgXxPeAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "1|2|3", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Perf", "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPa01SgXxPePvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "1", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Perf", "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPa02PlXxIpAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "2", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Imp", "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPa02PlXxIpPvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "2", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Imp", "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPa02PlXxPeAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "2", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Perf", "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPa02PlXxPePvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "2", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Perf", "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPa02SgXxIpAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "2", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Imp", "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPa02SgXxIpPvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "2", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Imp", "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPa02SgXxPeAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "2", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Perf", "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPa02SgXxPePvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "2", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Perf", "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPa03PlXxIpAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "3", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Imp", "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPa03PlXxIpPvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "3", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Imp", "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPa03PlXxPeAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "3", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Perf", "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPa03PlXxPePvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "3", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Perf" , "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPa03SgXxIpAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "3", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Imp" , "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPa03SgXxIpPvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "3", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Imp" , "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPa03SgXxPeAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "3", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Perf" , "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPa03SgXxPePvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Past", "Person": "3", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Perf" , "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPr01PlXxIpAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres", "Person": "1", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Imp" , "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPr01PlXxIpPvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres", "Person": "1", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Imp" , "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPr01SgXxIpAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres", "Person": "1", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Imp" , "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPr01SgXxIpPvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres", "Person": "1", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Imp" , "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPr02PlXxIpAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres", "Person": "2", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Imp" , "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPr02PlXxIpPvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres", "Person": "2", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Imp" , "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPr02SgXxIpAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres", "Person": "2", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Imp" , "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPr02SgXxIpPvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres", "Person": "2", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Imp" , "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPr03PlXxIpAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres", "Person": "3", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Imp" , "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPr03PlXxIpPvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres", "Person": "3", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Imp" , "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPr03SgXxIpAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres", "Person": "3", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Imp" , "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdPr03SgXxIpPvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres", "Person": "3", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Imp" , "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdXx01PlXxPeAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres|Past", "Person": "1", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Perf" , "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdXx01PlXxPePvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres|Past", "Person": "1", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Perf" , "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdXx01SgXxPeAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres|Past", "Person": "1", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Perf" , "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdXx01SgXxPePvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres|Past", "Person": "1", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Perf" , "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdXx02PlXxPeAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres|Past", "Person": "2", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Perf" , "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdXx02PlXxPePvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres|Past", "Person": "2", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Perf" , "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdXx02SgXxPeAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres|Past", "Person": "2", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Perf" , "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdXx02SgXxPePvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres|Past", "Person": "2", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Perf" , "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdXx03PlXxPeAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres|Past", "Person": "3", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Perf" , "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdXx03PlXxPePvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres|Past", "Person": "3", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Perf" , "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdXx03SgXxPeAvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres|Past", "Person": "3", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Perf" , "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnIdXx03SgXxPePvXx": {POS: VERB, "VerbForm": "Fin", "Mood": "Ind", "Tense": "Pres|Past", "Person": "3", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Perf" , "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnMpXx02PlXxIpAvXx": {POS: VERB, "VerbForm": "", "Mood": "Imp", "Tense": "Pres|Past", "Person": "2", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Imp" , "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnMpXx02PlXxIpPvXx": {POS: VERB, "VerbForm": "", "Mood": "Imp", "Tense": "Pres|Past", "Person": "2", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Imp" , "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnMpXx02PlXxPeAvXx": {POS: VERB, "VerbForm": "", "Mood": "Imp", "Tense": "Pres|Past", "Person": "2", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Perf" , "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnMpXx02PlXxPePvXx": {POS: VERB, "VerbForm": "", "Mood": "Imp", "Tense": "Pres|Past", "Person": "2", "Number": "Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Perf" , "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnMpXx02SgXxIpAvXx": {POS: VERB, "VerbForm": "", "Mood": "Imp", "Tense": "Pres|Past", "Person": "2", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Imp" , "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnMpXx02SgXxIpPvXx": {POS: VERB, "VerbForm": "", "Mood": "Imp", "Tense": "Pres|Past", "Person": "2", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Imp" , "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnMpXx02SgXxPeAvXx": {POS: VERB, "VerbForm": "", "Mood": "Imp", "Tense": "Pres|Past", "Person": "2", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Perf" , "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnMpXx02SgXxPePvXx": {POS: VERB, "VerbForm": "", "Mood": "Imp", "Tense": "Pres|Past", "Person": "2", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Perf" , "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnMpXx03SgXxIpPvXx": {POS: VERB, "VerbForm": "", "Mood": "Imp", "Tense": "Pres|Past", "Person": "3", "Number": "Sing", "Gender": "Masc|Fem|Neut", "Aspect": "Imp" , "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnNfXxXxXxXxPeAvXx": {POS: VERB, "VerbForm": "Inf", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Sing|Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Perf" , "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnNfXxXxXxXxPePvXx": {POS: VERB, "VerbForm": "Inf", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Sing|Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Perf" , "Voice": "Pass", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnPpPrXxXxXxIpAvXx": {POS: VERB, "VerbForm": "Conv", "Mood": "", "Tense": "Pres", "Person": "1|2|3", "Number": "Sing|Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Imp" , "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"},
|
||||||
|
"VbMnPpXxXxPlFePePvAc": {POS: VERB, "VerbForm": "Part", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Plur", "Gender": "Fem", "Aspect": "Perf" , "Voice": "Pass", "Case": "Acc"},
|
||||||
|
"VbMnPpXxXxPlFePePvGe": {POS: VERB, "VerbForm": "Part", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Plur", "Gender": "Fem", "Aspect": "Perf" , "Voice": "Pass", "Case": "Gen"},
|
||||||
|
"VbMnPpXxXxPlFePePvNm": {POS: VERB, "VerbForm": "Part", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Plur", "Gender": "Fem", "Aspect": "Perf" , "Voice": "Pass", "Case": "Nom"},
|
||||||
|
"VbMnPpXxXxPlFePePvVo": {POS: VERB, "VerbForm": "Part", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Plur", "Gender": "Fem", "Aspect": "Perf" , "Voice": "Pass", "Case": "Voc"},
|
||||||
|
"VbMnPpXxXxPlMaPePvAc": {POS: VERB, "VerbForm": "Part", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Plur", "Gender": "Masc", "Aspect": "Perf" , "Voice": "Pass", "Case": "Acc"},
|
||||||
|
"VbMnPpXxXxPlMaPePvGe": {POS: VERB, "VerbForm": "Part", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Plur", "Gender": "Masc", "Aspect": "Perf" , "Voice": "Pass", "Case": "Gen"},
|
||||||
|
"VbMnPpXxXxPlMaPePvNm": {POS: VERB, "VerbForm": "Part", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Plur", "Gender": "Masc", "Aspect": "Perf" , "Voice": "Pass", "Case": "Nom"},
|
||||||
|
"VbMnPpXxXxPlMaPePvVo": {POS: VERB, "VerbForm": "Part", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Plur", "Gender": "Masc", "Aspect": "Perf" , "Voice": "Pass", "Case": "Voc"},
|
||||||
|
"VbMnPpXxXxPlNePePvAc": {POS: VERB, "VerbForm": "Part", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Plur", "Gender": "Neut", "Aspect": "Perf" , "Voice": "Pass", "Case": "Acc"},
|
||||||
|
"VbMnPpXxXxPlNePePvGe": {POS: VERB, "VerbForm": "Part", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Plur", "Gender": "Neut", "Aspect": "Perf" , "Voice": "Pass", "Case": "Gen"},
|
||||||
|
"VbMnPpXxXxPlNePePvNm": {POS: VERB, "VerbForm": "Part", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Plur", "Gender": "Neut", "Aspect": "Perf" , "Voice": "Pass", "Case": "Nom"},
|
||||||
|
"VbMnPpXxXxPlNePePvVo": {POS: VERB, "VerbForm": "Part", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Plur", "Gender": "Neut", "Aspect": "Perf" , "Voice": "Pass", "Case": "Voc"},
|
||||||
|
"VbMnPpXxXxSgFePePvAc": {POS: VERB, "VerbForm": "Part", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Sing", "Gender": "Fem", "Aspect": "Perf" , "Voice": "Pass", "Case": "Acc"},
|
||||||
|
"VbMnPpXxXxSgFePePvGe": {POS: VERB, "VerbForm": "Part", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Sing", "Gender": "Fem", "Aspect": "Perf" , "Voice": "Pass", "Case": "Gen"},
|
||||||
|
"VbMnPpXxXxSgFePePvNm": {POS: VERB, "VerbForm": "Part", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Sing", "Gender": "Fem", "Aspect": "Perf" , "Voice": "Pass", "Case": "Nom"},
|
||||||
|
"VbMnPpXxXxSgFePePvVo": {POS: VERB, "VerbForm": "Part", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Sing", "Gender": "Fem", "Aspect": "Perf" , "Voice": "Pass", "Case": "Voc"},
|
||||||
|
"VbMnPpXxXxSgMaPePvAc": {POS: VERB, "VerbForm": "Part", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Sing", "Gender": "Masc", "Aspect": "Perf" , "Voice": "Pass", "Case": "Acc"},
|
||||||
|
"VbMnPpXxXxSgMaPePvGe": {POS: VERB, "VerbForm": "Part", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Sing", "Gender": "Masc", "Aspect": "Perf" , "Voice": "Pass", "Case": "Gen"},
|
||||||
|
"VbMnPpXxXxSgMaPePvNm": {POS: VERB, "VerbForm": "Part", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Sing", "Gender": "Masc", "Aspect": "Perf" , "Voice": "Pass", "Case": "Nom"},
|
||||||
|
"VbMnPpXxXxSgMaPePvVo": {POS: VERB, "VerbForm": "Part", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Sing", "Gender": "Masc", "Aspect": "Perf" , "Voice": "Pass", "Case": "Voc"},
|
||||||
|
"VbMnPpXxXxSgNePePvAc": {POS: VERB, "VerbForm": "Part", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Sing", "Gender": "Neut", "Aspect": "Perf" , "Voice": "Pass", "Case": "Acc"},
|
||||||
|
"VbMnPpXxXxSgNePePvGe": {POS: VERB, "VerbForm": "Part", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Sing", "Gender": "Neut", "Aspect": "Perf" , "Voice": "Pass", "Case": "Gen"},
|
||||||
|
"VbMnPpXxXxSgNePePvNm": {POS: VERB, "VerbForm": "Part", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Sing", "Gender": "Neut", "Aspect": "Perf" , "Voice": "Pass", "Case": "Nom"},
|
||||||
|
"VbMnPpXxXxSgNePePvVo": {POS: VERB, "VerbForm": "Part", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Sing", "Gender": "Neut", "Aspect": "Perf" , "Voice": "Pass", "Case": "Voc"},
"VbMnPpXxXxXxXxIpAvXx": {POS: VERB, "VerbForm": "Conv", "Mood": "", "Tense": "Pres|Past", "Person": "1|2|3", "Number": "Sing|Plur", "Gender": "Masc|Fem|Neut", "Aspect": "Imp" , "Voice": "Act", "Case": "Nom|Gen|Dat|Acc|Voc"}
}
24
spacy/lang/el/tag_map_general.py
Normal file
@ -0,0 +1,24 @@
# -*- coding: utf-8 -*-

from __future__ import unicode_literals
from ...symbols import POS, PUNCT, SYM, ADJ, CCONJ, SCONJ, NUM, DET, ADV, ADP, X, VERB
from ...symbols import NOUN, PROPN, PART, INTJ, SPACE, PRON

TAG_MAP = {
    "Adjective": {POS: ADJ},
    "Adposition": {POS: ADP},
    "Adverb": {POS: ADV},
    "Conjuction_Coordinating": {POS: CCONJ},
    "Conjuction_Subordinating": {POS: SCONJ},
    "Determiner": {POS: DET},
    "Interjection": {POS: INTJ},
    "Noun_Common": {POS: NOUN},
    "Noun_Proper": {POS: PROPN},
    "Numeral": {POS: NUM},
    "Other": {POS: X},
    "Particle": {POS: PART},
    "Pronoun": {POS: PRON},
    "Punctuation": {POS: PUNCT},
    "Symbol": {POS: SYM},
    "Verb": {POS: VERB}
}
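As a rough illustration of how a coarse tag map of this shape can be consumed, the sketch below looks up a tag name and falls back to `X` for unknown tags. The string constants are stand-ins for the `spacy.symbols` IDs, and `coarse_pos` is a hypothetical helper written for this note, not a spaCy function.

```python
# Stand-ins for the spacy.symbols constants (assumption: plain strings
# instead of the real integer IDs, so the sketch runs without spaCy).
POS = "pos"
NOUN, PROPN, VERB, X = "NOUN", "PROPN", "VERB", "X"

# A three-entry excerpt in the same shape as TAG_MAP above.
TAG_MAP = {
    "Noun_Common": {POS: NOUN},
    "Noun_Proper": {POS: PROPN},
    "Verb": {POS: VERB},
}


def coarse_pos(tag, tag_map=TAG_MAP):
    """Return the coarse part of speech for a tag, or X if the tag is unknown."""
    return tag_map.get(tag, {POS: X})[POS]


print(coarse_pos("Noun_Proper"))  # PROPN
print(coarse_pos("Unknown_Tag"))  # X
```

Keeping the fallback in one place means an unexpected tag degrades to `X` rather than raising a `KeyError`.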
382
spacy/lang/el/tokenizer_exceptions.py
Normal file
@ -0,0 +1,382 @@
# -*- coding: utf-8 -*-

from __future__ import unicode_literals

from ...symbols import ORTH, LEMMA, TAG, NORM, ADP, DET

_exc = {}

for token in ["Απ'", "ΑΠ'", "αφ'", "Αφ'"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "από", NORM: "από"}
    ]

for token in ["Αλλ'", "αλλ'"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "αλλά", NORM: "αλλά"}
    ]

for token in ["παρ'", "Παρ'", "ΠΑΡ'"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "παρά", NORM: "παρά"}
    ]

for token in ["καθ'", "Καθ'"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "κάθε", NORM: "κάθε"}
    ]

for token in ["κατ'", "Κατ'"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "κατά", NORM: "κατά"}
    ]

for token in ["'ΣΟΥΝ", "'ναι", "'ταν", "'τανε", "'μαστε", "'μουνα", "'μουν"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "είμαι", NORM: "είμαι"}
    ]

for token in ["Επ'", "επ'", "εφ'", "Εφ'"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "επί", NORM: "επί"}
    ]

for token in ["Δι'", "δι'"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "δια", NORM: "δια"}
    ]

for token in ["'χουν", "'χουμε", "'χαμε", "'χα", "'χε", "'χεις", "'χει"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "έχω", NORM: "έχω"}
    ]

for token in ["υπ'", "Υπ'"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "υπό", NORM: "υπό"}
    ]

for token in ["Μετ'", "ΜΕΤ'", "'μετ"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "μετά", NORM: "μετά"}
    ]

for token in ["Μ'", "μ'"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "με", NORM: "με"}
    ]

for token in ["Γι'", "ΓΙ'", "γι'"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "για", NORM: "για"}
    ]

for token in ["Σ'", "σ'"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "σε", NORM: "σε"}
    ]

for token in ["Θ'", "θ'"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "θα", NORM: "θα"}
    ]

for token in ["Ν'", "ν'"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "να", NORM: "να"}
    ]

# NOTE: these two entries are overwritten by the later "Τ'"/"τ'" loop,
# which maps the same keys to "το".
for token in ["Τ'", "τ'"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "να", NORM: "να"}
    ]

for token in ["'γω", "'σένα", "'μεις"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "εγώ", NORM: "εγώ"}
    ]

for token in ["Τ'", "τ'"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "το", NORM: "το"}
    ]

for token in ["Φέρ'", "Φερ'", "φέρ'", "φερ'"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "φέρνω", NORM: "φέρνω"}
    ]

for token in ["'ρθούνε", "'ρθουν", "'ρθει", "'ρθεί", "'ρθε", "'ρχεται"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "έρχομαι", NORM: "έρχομαι"}
    ]

for token in ["'πανε", "'λεγε", "'λεγαν", "'πε", "'λεγα"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "λέγω", NORM: "λέγω"}
    ]

for token in ["Πάρ'", "πάρ'"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "παίρνω", NORM: "παίρνω"}
    ]

for token in ["μέσ'", "Μέσ'", "μεσ'"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "μέσα", NORM: "μέσα"}
    ]

for token in ["Δέσ'", "Δεσ'", "δεσ'"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "δένω", NORM: "δένω"}
    ]

for token in ["'κανε", "Κάν'"]:
    _exc[token] = [
        {ORTH: token, LEMMA: "κάνω", NORM: "κάνω"}
    ]

_other_exc = {

    "κι": [
        {ORTH: "κι", LEMMA: "και", NORM: "και"},
    ],

    "Παίξ'": [
        {ORTH: "Παίξ'", LEMMA: "παίζω", NORM: "παίζω"},
    ],

    "Αντ'": [
        {ORTH: "Αντ'", LEMMA: "αντί", NORM: "αντί"},
    ],

    "ολ'": [
        {ORTH: "ολ'", LEMMA: "όλος", NORM: "όλος"},
    ],

    "ύστερ'": [
        {ORTH: "ύστερ'", LEMMA: "ύστερα", NORM: "ύστερα"},
    ],

    "'πρεπε": [
        {ORTH: "'πρεπε", LEMMA: "πρέπει", NORM: "πρέπει"},
    ],

    "Δύσκολ'": [
        {ORTH: "Δύσκολ'", LEMMA: "δύσκολος", NORM: "δύσκολος"},
    ],

    "'θελα": [
        {ORTH: "'θελα", LEMMA: "θέλω", NORM: "θέλω"},
    ],

    "'γραφα": [
        {ORTH: "'γραφα", LEMMA: "γράφω", NORM: "γράφω"},
    ],

    "'παιρνα": [
        {ORTH: "'παιρνα", LEMMA: "παίρνω", NORM: "παίρνω"},
    ],

    "'δειξε": [
        {ORTH: "'δειξε", LEMMA: "δείχνω", NORM: "δείχνω"},
    ],

    "όμουρφ'": [
        {ORTH: "όμουρφ'", LEMMA: "όμορφος", NORM: "όμορφος"},
    ],

    "κ'τσή": [
        {ORTH: "κ'τσή", LEMMA: "κουτσός", NORM: "κουτσός"},
    ],

    "μηδ'": [
        {ORTH: "μηδ'", LEMMA: "μήδε", NORM: "μήδε"},
    ],

    "'ξομολογήθηκε": [
        {ORTH: "'ξομολογήθηκε", LEMMA: "εξομολογούμαι", NORM: "εξομολογούμαι"},
    ],

    "'μας": [
        {ORTH: "'μας", LEMMA: "εμάς", NORM: "εμάς"},
    ],

    "'ξερες": [
        {ORTH: "'ξερες", LEMMA: "ξέρω", NORM: "ξέρω"},
    ],

    "έφθασ'": [
        {ORTH: "έφθασ'", LEMMA: "φθάνω", NORM: "φθάνω"},
    ],

    "εξ'": [
        {ORTH: "εξ'", LEMMA: "εκ", NORM: "εκ"},
    ],

    "δώσ'": [
        {ORTH: "δώσ'", LEMMA: "δίνω", NORM: "δίνω"},
    ],

    "τίποτ'": [
        {ORTH: "τίποτ'", LEMMA: "τίποτα", NORM: "τίποτα"},
    ],

    "Λήξ'": [
        {ORTH: "Λήξ'", LEMMA: "λήγω", NORM: "λήγω"},
    ],

    "άσ'": [
        {ORTH: "άσ'", LEMMA: "αφήνω", NORM: "αφήνω"},
    ],

    "Στ'": [
        {ORTH: "Στ'", LEMMA: "στο", NORM: "στο"},
    ],

    "Δωσ'": [
        {ORTH: "Δωσ'", LEMMA: "δίνω", NORM: "δίνω"},
    ],

    "Βάψ'": [
        {ORTH: "Βάψ'", LEMMA: "βάφω", NORM: "βάφω"},
    ],

    "Αλλ'": [
        {ORTH: "Αλλ'", LEMMA: "αλλά", NORM: "αλλά"},
    ],

    "Αμ'": [
        {ORTH: "Αμ'", LEMMA: "άμα", NORM: "άμα"},
    ],

    "Αγόρασ'": [
        {ORTH: "Αγόρασ'", LEMMA: "αγοράζω", NORM: "αγοράζω"},
    ],

    "'φύγε": [
        {ORTH: "'φύγε", LEMMA: "φεύγω", NORM: "φεύγω"},
    ],

    "'φερε": [
        {ORTH: "'φερε", LEMMA: "φέρνω", NORM: "φέρνω"},
    ],

    "'φαγε": [
        {ORTH: "'φαγε", LEMMA: "τρώω", NORM: "τρώω"},
    ],

    "'σπαγαν": [
        {ORTH: "'σπαγαν", LEMMA: "σπάω", NORM: "σπάω"},
    ],

    "'σκασε": [
        {ORTH: "'σκασε", LEMMA: "σκάω", NORM: "σκάω"},
    ],

    "'σβηνε": [
        {ORTH: "'σβηνε", LEMMA: "σβήνω", NORM: "σβήνω"},
    ],

    "'ριξε": [
        {ORTH: "'ριξε", LEMMA: "ρίχνω", NORM: "ρίχνω"},
    ],

    "'κλεβε": [
        {ORTH: "'κλεβε", LEMMA: "κλέβω", NORM: "κλέβω"},
    ],

    "'κει": [
        {ORTH: "'κει", LEMMA: "εκεί", NORM: "εκεί"},
    ],

    "'βλεπε": [
        {ORTH: "'βλεπε", LEMMA: "βλέπω", NORM: "βλέπω"},
    ],

    "'βγαινε": [
        {ORTH: "'βγαινε", LEMMA: "βγαίνω", NORM: "βγαίνω"},
    ]
}

_exc.update(_other_exc)

for h in range(1, 12 + 1):

    for period in ["π.μ.", "πμ"]:
        _exc["%d%s" % (h, period)] = [
            {ORTH: "%d" % h},
            {ORTH: period, LEMMA: "π.μ.", NORM: "π.μ."}]

    for period in ["μ.μ.", "μμ"]:
        _exc["%d%s" % (h, period)] = [
            {ORTH: "%d" % h},
            {ORTH: period, LEMMA: "μ.μ.", NORM: "μ.μ."}]

for exc_data in [
    {ORTH: "ΑΓΡ.", LEMMA: "Αγροτικός", NORM: "Αγροτικός"},
    {ORTH: "Αγ. Γρ.", LEMMA: "Αγία Γραφή", NORM: "Αγία Γραφή"},
    {ORTH: "Αθ.", LEMMA: "Αθανάσιος", NORM: "Αθανάσιος"},
    {ORTH: "Αλεξ.", LEMMA: "Αλέξανδρος", NORM: "Αλέξανδρος"},
    {ORTH: "Απρ.", LEMMA: "Απρίλιος", NORM: "Απρίλιος"},
    {ORTH: "Αύγ.", LEMMA: "Αύγουστος", NORM: "Αύγουστος"},
    {ORTH: "Δεκ.", LEMMA: "Δεκέμβριος", NORM: "Δεκέμβριος"},
    {ORTH: "Δημ.", LEMMA: "Δήμος", NORM: "Δήμος"},
    {ORTH: "Ιαν.", LEMMA: "Ιανουάριος", NORM: "Ιανουάριος"},
    {ORTH: "Ιούλ.", LEMMA: "Ιούλιος", NORM: "Ιούλιος"},
    {ORTH: "Ιούν.", LEMMA: "Ιούνιος", NORM: "Ιούνιος"},
    {ORTH: "Ιωαν.", LEMMA: "Ιωάννης", NORM: "Ιωάννης"},
    {ORTH: "Μ. Ασία", LEMMA: "Μικρά Ασία", NORM: "Μικρά Ασία"},
    {ORTH: "Μάρτ.", LEMMA: "Μάρτιος", NORM: "Μάρτιος"},
    {ORTH: "Μάρτ'", LEMMA: "Μάρτιος", NORM: "Μάρτιος"},
    {ORTH: "Νοέμβρ.", LEMMA: "Νοέμβριος", NORM: "Νοέμβριος"},
    {ORTH: "Οκτ.", LEMMA: "Οκτώβριος", NORM: "Οκτώβριος"},
    {ORTH: "Σεπτ.", LEMMA: "Σεπτέμβριος", NORM: "Σεπτέμβριος"},
    {ORTH: "Φεβρ.", LEMMA: "Φεβρουάριος", NORM: "Φεβρουάριος"},
]:
    _exc[exc_data[ORTH]] = [exc_data]

for orth in [
    "$ΗΠΑ",
    "Α'", "Α.Ε.", "Α.Ε.Β.Ε.", "Α.Ε.Ι.", "Α.Ε.Π.", "Α.Μ.Α.", "Α.Π.Θ.", "Α.Τ.", "Α.Χ.", "ΑΝ.", "Αγ.", "Αλ.", "Αν.",
    "Αντ.", "Απ.",
    "Β'", "Β)", "Β.Ζ.", "Β.Ι.Ο.", "Β.Κ.", "Β.Μ.Α.", "Βασ.",
    "Γ'", "Γ)", "Γ.Γ.", "Γ.Δ.", "Γκ.",
    "Δ.Ε.Η.", "Δ.Ε.Σ.Ε.", "Δ.Ν.", "Δ.Ο.Υ.", "Δ.Σ.", "Δ.Υ.", "ΔΙ.ΚΑ.Τ.Σ.Α.", "Δηλ.", "Διον.",
    "Ε.Α.", "Ε.Α.Κ.", "Ε.Α.Π.", "Ε.Ε.", "Ε.Κ.", "Ε.ΚΕ.ΠΙΣ.", "Ε.Λ.Α.", "Ε.Λ.Ι.Α.", "Ε.Π.Σ.", "Ε.Π.Τ.Α.", "Ε.Σ.Ε.Ε.Κ.",
    "Ε.Υ.Κ.", "ΕΕ.", "ΕΚ.", "ΕΛ.", "ΕΛ.ΑΣ.", "Εθν.", "Ελ.", "Εμ.", "Επ.", "Ευ.",
    "Η'", "Η.Π.Α.",
    "ΘΕ.", "Θεμ.", "Θεοδ.", "Θρ.",
    "Ι.Ε.Κ.", "Ι.Κ.Α.", "Ι.Κ.Υ.", "Ι.Σ.Θ.", "Ι.Χ.", "ΙΖ'", "ΙΧ.",
    "Κ.Α.Α.", "Κ.Α.Ε.", "Κ.Β.Σ.", "Κ.Δ.", "Κ.Ε.", "Κ.Ε.Κ.", "Κ.Ι.", "Κ.Κ.", "Κ.Ι.Θ.", "Κ.Ι.Θ.", "Κ.ΚΕΚ.", "Κ.Ο.",
    "Κ.Π.Ρ.", "ΚΑΤ.", "ΚΚ.", "Καν.", "Καρ.", "Κατ.", "Κυρ.", "Κων.",
    "Λ.Α.", "Λ.χ.", "Λ.Χ.", "Λεωφ.", "Λι.",
    "Μ.Δ.Ε.", "Μ.Ε.Ο.", "Μ.Ζ.", "Μ.Μ.Ε.", "Μ.Ο.", "Μεγ.", "Μιλτ.", "Μιχ.",
    "Ν.Δ.", "Ν.Ε.Α.", "Ν.Κ.", "Ν.Ο.", "Ν.Ο.Θ.", "Ν.Π.Δ.Δ.", "Ν.Υ.", "ΝΔ.", "Νικ.", "Ντ'", "Ντ.",
    "Ο'", "Ο.Α.", "Ο.Α.Ε.Δ.", "Ο.Δ.", "Ο.Ε.Ε.", "Ο.Ε.Ε.Κ.", "Ο.Η.Ε.", "Ο.Κ.",
    "Π.Δ.", "Π.Ε.Κ.Δ.Υ.", "Π.Ε.Π.", "Π.Μ.Σ.", "ΠΟΛ.", "Π.Χ.", "Παρ.", "Πλ.", "Πρ.",
    "Σ.Δ.Ο.Ε.", "Σ.Ε.", "Σ.Ε.Κ.", "Σ.Π.Δ.Ω.Β.", "Σ.Τ.", "Σαβ.", "Στ.", "ΣτΕ.", "Στρ.",
    "Τ.Α.", "Τ.Ε.Ε.", "Τ.Ε.Ι.", "ΤΡ.", "Τζ.", "Τηλ.",
    "Υ.Γ.", "ΥΓ.", "ΥΠ.Ε.Π.Θ.",
    "Φ.Α.Β.Ε.", "Φ.Κ.", "Φ.Σ.", "Φ.Χ.", "Φ.Π.Α.", "Φιλ.",
    "Χ.Α.Α.", "ΧΡ.", "Χ.Χ.", "Χαρ.", "Χιλ.", "Χρ.",
    "άγ.", "άρθρ.", "αι.", "αν.", "απ.", "αρ.", "αριθ.", "αριθμ.",
    "β'", "βλ.",
    "γ.γ.", "γεν.", "γραμμ.",
    "δ.δ.", "δ.σ.", "δηλ.", "δισ.", "δολ.", "δρχ.",
    "εκ.", "εκατ.", "ελ.",
    "θιν'",
    "κ.", "κ.ά.", "κ.α.", "κ.κ.", "κ.λπ.", "κ.ο.κ.", "κ.τ.λ.", "κλπ.", "κτλ.", "κυβ.",
    "λ.χ.",
    "μ.", "μ.Χ.", "μ.μ.", "μιλ.",
    "ντ'",
    "π.Χ.", "π.β.", "π.δ.", "π.μ.", "π.χ.",
    "σ.", "σ.α.λ.", "σ.σ.", "σελ.", "στρ.",
    "τ'ς", "τ.μ.", "τετ.", "τετρ.", "τηλ.", "τρισ.", "τόν.",
    "υπ.",
    "χ.μ.", "χγρ.", "χιλ.", "χλμ."
]:
    _exc[orth] = [{ORTH: orth}]

TOKENIZER_EXCEPTIONS = _exc
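To make the generated clock-time entries easier to picture, here is a minimal sketch of the same loop run outside spaCy. `ORTH`, `LEMMA` and `NORM` are plain-string stand-ins for the `spacy.symbols` IDs (an assumption made so the snippet runs without spaCy), and the two `period` loops are merged into one list of pairs for brevity.

```python
# Stand-ins for spacy.symbols attribute IDs (assumption: plain strings).
ORTH, LEMMA, NORM = "orth", "lemma", "norm"

exc = {}
# (surface form, normalised form) pairs for the a.m./p.m. markers.
periods = [("π.μ.", "π.μ."), ("πμ", "π.μ."), ("μ.μ.", "μ.μ."), ("μμ", "μ.μ.")]

for h in range(1, 12 + 1):
    for period, full in periods:
        # Key is the unsplit string, e.g. "11μμ"; value is the list of sub-tokens.
        exc["%d%s" % (h, period)] = [
            {ORTH: "%d" % h},
            {ORTH: period, LEMMA: full, NORM: full},
        ]

print(len(exc))                        # 48
print([t[ORTH] for t in exc["11μμ"]])  # ['11', 'μμ']
```

In every entry the `ORTH` values of the sub-tokens concatenate back to the key, so splitting a string this way never changes the underlying text.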
@ -14,7 +14,7 @@ from .. import util
 # These languages are used for generic tokenizer tests – only add a language
 # here if it's using spaCy's tokenizer (not a different library)
 # TODO: re-implement generic tokenizer tests
-_languages = ['bn', 'da', 'de', 'en', 'es', 'fi', 'fr', 'ga', 'he', 'hu', 'id',
+_languages = ['bn', 'da', 'de', 'el', 'en', 'es', 'fi', 'fr', 'ga', 'he', 'hu', 'id',
               'it', 'nb', 'nl', 'pl', 'pt', 'ro', 'ru', 'sv', 'tr', 'ar', 'ut', 'tt',
               'xx']
 
@ -158,6 +158,10 @@ def tr_tokenizer():
 def tt_tokenizer():
     return util.get_lang_class('tt').Defaults.create_tokenizer()
 
+@pytest.fixture
+def el_tokenizer():
+    return util.get_lang_class('el').Defaults.create_tokenizer()
+
 @pytest.fixture
 def ar_tokenizer():
     return util.get_lang_class('ar').Defaults.create_tokenizer()
0
spacy/tests/lang/el/__init__.py
Normal file
18
spacy/tests/lang/el/test_exception.py
Normal file
@ -0,0 +1,18 @@
# -*- coding: utf-8 -*-

from __future__ import unicode_literals

import pytest


@pytest.mark.parametrize('text', ["αριθ.", "τρισ.", "δισ.", "σελ."])
def test_tokenizer_handles_abbr(el_tokenizer, text):
    tokens = el_tokenizer(text)
    assert len(tokens) == 1


def test_tokenizer_handles_exc_in_text(el_tokenizer):
    text = "Στα 14 τρισ. δολάρια το κόστος από την άνοδο της στάθμης της θάλασσας."
    tokens = el_tokenizer(text)
    assert len(tokens) == 14
    assert tokens[2].text == "τρισ."
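The behaviour these tests check can be pictured with a toy tokenizer. This is only a sketch under simplifying assumptions, not spaCy's algorithm: `toy_tokenize`, `EXCEPTIONS` and `PUNCT` are hypothetical names, the exception set is a four-item excerpt, and only trailing punctuation is split off.

```python
# Hypothetical excerpt of the exception table: forms kept as a single token.
EXCEPTIONS = {"τρισ.", "δισ.", "αριθ.", "σελ."}
PUNCT = ".,;!?"


def toy_tokenize(text):
    """Split on whitespace, then peel trailing punctuation unless the chunk is an exception."""
    tokens = []
    for chunk in text.split():
        trailing = []
        while chunk and chunk not in EXCEPTIONS and chunk[-1] in PUNCT:
            trailing.append(chunk[-1])
            chunk = chunk[:-1]
        if chunk:
            tokens.append(chunk)
        # Punctuation was collected right-to-left, so restore its order.
        tokens.extend(reversed(trailing))
    return tokens


tokens = toy_tokenize("Στα 14 τρισ. δολάρια το κόστος από την άνοδο της στάθμης της θάλασσας.")
print(len(tokens), tokens[2])  # 14 τρισ.
```

The abbreviation keeps its period because it is found in the exception set before any punctuation is stripped, while the sentence-final period is split off as its own token.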
25
spacy/tests/lang/el/test_text.py
Normal file
@ -0,0 +1,25 @@
# -*- coding: utf-8 -*-

from __future__ import unicode_literals

import pytest


def test_tokenizer_handles_long_text(el_tokenizer):
    text = """Η Ελλάδα (παλαιότερα Ελλάς), επίσημα γνωστή ως Ελληνική Δημοκρατία,\
    είναι χώρα της νοτιοανατολικής Ευρώπης στο νοτιότερο άκρο της Βαλκανικής χερσονήσου.\
    Συνορεύει στα βορειοδυτικά με την Αλβανία, στα βόρεια με την πρώην\
    Γιουγκοσλαβική Δημοκρατία της Μακεδονίας και τη Βουλγαρία και στα βορειοανατολικά με την Τουρκία."""
    tokens = el_tokenizer(text)
    assert len(tokens) == 54


@pytest.mark.parametrize('text,length', [
    ("Διοικητικά η Ελλάδα διαιρείται σε 13 Περιφέρειες.", 8),
    ("Η εκπαίδευση στην Ελλάδα χωρίζεται κυρίως σε τρία επίπεδα.", 10),
    ("Η Ελλάδα είναι μία από τις χώρες της Ευρωπαϊκής Ένωσης (ΕΕ) που διαθέτει σηµαντικό ορυκτό πλούτο.", 19),
    ("Η ναυτιλία αποτέλεσε ένα σημαντικό στοιχείο της Ελληνικής οικονομικής δραστηριότητας από τα αρχαία χρόνια.", 15),
    ("Η Ελλάδα είναι μέλος σε αρκετούς διεθνείς οργανισμούς.", 9)])
def test_tokenizer_handles_cnts(el_tokenizer, text, length):
    tokens = el_tokenizer(text)
    assert len(tokens) == length
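One detail of the long-text test is easy to miss: a backslash at the end of a line inside a triple-quoted string removes the line break but keeps whatever indentation starts the next line. Assuming the continuation lines are indented in the source file (this rendering may not show leading whitespace), the joined text contains runs of spaces, which a tokenizer may count as separate tokens. A small sketch of the string behaviour alone:

```python
# Backslash-newline inside a triple-quoted string drops the line break,
# but the next line's leading spaces stay in the string.
text = """Δημοκρατία,\
    είναι"""

print(repr(text))  # 'Δημοκρατία,    είναι'
```

This is plain Python string-literal behaviour and does not depend on spaCy.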