spaCy/spacy/lang/fa/tokenizer_exceptions.py

748 lines
63 KiB
Python
Raw Normal View History

from ...symbols import ORTH, NORM
💫 Port master changes over to develop (#2979) * Create aryaprabhudesai.md (#2681) * Update _install.jade (#2688) Typo fix: "models" -> "model" * Add FAC to spacy.explain (resolves #2706) * Remove docstrings for deprecated arguments (see #2703) * When calling getoption() in conftest.py, pass a default option (#2709) * When calling getoption() in conftest.py, pass a default option This is necessary to allow testing an installed spacy by running: pytest --pyargs spacy * Add contributor agreement * update bengali token rules for hyphen and digits (#2731) * Less norm computations in token similarity (#2730) * Less norm computations in token similarity * Contributor agreement * Remove ')' for clarity (#2737) Sorry, don't mean to be nitpicky, I just noticed this when going through the CLI and thought it was a quick fix. That said, if this was intention than please let me know. * added contributor agreement for mbkupfer (#2738) * Basic support for Telugu language (#2751) * Lex _attrs for polish language (#2750) * Signed spaCy contributor agreement * Added polish version of english lex_attrs * Introduces a bulk merge function, in order to solve issue #653 (#2696) * Fix comment * Introduce bulk merge to increase performance on many span merges * Sign contributor agreement * Implement pull request suggestions * Describe converters more explicitly (see #2643) * Add multi-threading note to Language.pipe (resolves #2582) [ci skip] * Fix formatting * Fix dependency scheme docs (closes #2705) [ci skip] * Don't set stop word in example (closes #2657) [ci skip] * Add words to portuguese language _num_words (#2759) * Add words to portuguese language _num_words * Add words to portuguese language _num_words * Update Indonesian model (#2752) * adding e-KTP in tokenizer exceptions list * add exception token * removing lines with containing space as it won't matter since we use .split() method in the end, added new tokens in exception * add tokenizer exceptions list * combining base_norms with norm_exceptions * adding norm_exception * fix double key in lemmatizer * remove unused import on punctuation.py * reformat stop_words to reduce number of lines, improve readibility * updating tokenizer exception * implement is_currency for lang/id * adding orth_first_upper in tokenizer_exceptions * update the norm_exception list * remove bunch of abbreviations * adding contributors file * Fixed spaCy+Keras example (#2763) * bug fixes in keras example * created contributor agreement * Adding French hyphenated first name (#2786) * Fix typo (closes #2784) * Fix typo (#2795) [ci skip] Fixed typo on line 6 "regcognizer --> recognizer" * Adding basic support for Sinhala language. (#2788) * adding Sinhala language package, stop words, examples and lex_attrs. * Adding contributor agreement * Updating contributor agreement * Also include lowercase norm exceptions * Fix error (#2802) * Fix error ValueError: cannot resize an array that references or is referenced by another array in this way. Use the resize function * added spaCy Contributor Agreement * Add charlax's contributor agreement (#2805) * agreement of contributor, may I introduce a tiny pl languge contribution (#2799) * Contributors agreement * Contributors agreement * Contributors agreement * Add jupyter=True to displacy.render in documentation (#2806) * Revert "Also include lowercase norm exceptions" This reverts commit 70f4e8adf37cfcfab60be2b97d6deae949b30e9e. * Remove deprecated encoding argument to msgpack * Set up dependency tree pattern matching skeleton (#2732) * Fix bug when too many entity types. Fixes #2800 * Fix Python 2 test failure * Require older msgpack-numpy * Restore encoding arg on msgpack-numpy * Try to fix version pin for msgpack-numpy * Update Portuguese Language (#2790) * Add words to portuguese language _num_words * Add words to portuguese language _num_words * Portuguese - Add/remove stopwords, fix tokenizer, add currency symbols * Extended punctuation and norm_exceptions in the Portuguese language * Correct error in spacy universe docs concerning spacy-lookup (#2814) * Update Keras Example for (Parikh et al, 2016) implementation (#2803) * bug fixes in keras example * created contributor agreement * baseline for Parikh model * initial version of parikh 2016 implemented * tested asymmetric models * fixed grevious error in normalization * use standard SNLI test file * begin to rework parikh example * initial version of running example * start to document the new version * start to document the new version * Update Decompositional Attention.ipynb * fixed calls to similarity * updated the README * import sys package duh * simplified indexing on mapping word to IDs * stupid python indent error * added code from https://github.com/tensorflow/tensorflow/issues/3388 for tf bug workaround * Fix typo (closes #2815) [ci skip] * Update regex version dependency * Set version to 2.0.13.dev3 * Skip seemingly problematic test * Remove problematic test * Try previous version of regex * Revert "Remove problematic test" This reverts commit bdebbef45552d698d390aa430b527ee27830f11b. * Unskip test * Try older version of regex * 💫 Update training examples and use minibatching (#2830) <!--- Provide a general summary of your changes in the title. --> ## Description Update the training examples in `/examples/training` to show usage of spaCy's `minibatch` and `compounding` helpers ([see here](https://spacy.io/usage/training#tips-batch-size) for details). The lack of batching in the examples has caused some confusion in the past, especially for beginners who would copy-paste the examples, update them with large training sets and experienced slow and unsatisfying results. ### Types of change enhancements ## Checklist <!--- Before you submit the PR, go over this checklist and make sure you can tick off all the boxes. [] -> [x] --> - [x] I have submitted the spaCy Contributor Agreement. - [x] I ran the tests, and all new and existing tests passed. - [x] My changes don't require a change to the documentation, or if they do, I've added all required information. * Visual C++ link updated (#2842) (closes #2841) [ci skip] * New landing page * Add contribution agreement * Correcting lang/ru/examples.py (#2845) * Correct some grammatical inaccuracies in lang\ru\examples.py; filled Contributor Agreement * Correct some grammatical inaccuracies in lang\ru\examples.py * Move contributor agreement to separate file * Set version to 2.0.13.dev4 * Add Persian(Farsi) language support (#2797) * Also include lowercase norm exceptions * Remove in favour of https://github.com/explosion/spaCy/graphs/contributors * Rule-based French Lemmatizer (#2818) <!--- Provide a general summary of your changes in the title. --> ## Description <!--- Use this section to describe your changes. If your changes required testing, include information about the testing environment and the tests you ran. If your test fixes a bug reported in an issue, don't forget to include the issue number. If your PR is still a work in progress, that's totally fine – just include a note to let us know. --> Add a rule-based French Lemmatizer following the english one and the excellent PR for [greek language optimizations](https://github.com/explosion/spaCy/pull/2558) to adapt the Lemmatizer class. ### Types of change <!-- What type of change does your PR cover? Is it a bug fix, an enhancement or new feature, or a change to the documentation? --> - Lemma dictionary used can be found [here](http://infolingu.univ-mlv.fr/DonneesLinguistiques/Dictionnaires/telechargement.html), I used the XML version. - Add several files containing exhaustive list of words for each part of speech - Add some lemma rules - Add POS that are not checked in the standard Lemmatizer, i.e PRON, DET, ADV and AUX - Modify the Lemmatizer class to check in lookup table as a last resort if POS not mentionned - Modify the lemmatize function to check in lookup table as a last resort - Init files are updated so the model can support all the functionalities mentioned above - Add words to tokenizer_exceptions_list.py in respect to regex used in tokenizer_exceptions.py ## Checklist <!--- Before you submit the PR, go over this checklist and make sure you can tick off all the boxes. [] -> [x] --> - [X] I have submitted the spaCy Contributor Agreement. - [X] I ran the tests, and all new and existing tests passed. - [X] My changes don't require a change to the documentation, or if they do, I've added all required information. * Set version to 2.0.13 * Fix formatting and consistency * Update docs for new version [ci skip] * Increment version [ci skip] * Add info on wheels [ci skip] * Adding "This is a sentence" example to Sinhala (#2846) * Add wheels badge * Update badge [ci skip] * Update README.rst [ci skip] * Update murmurhash pin * Increment version to 2.0.14.dev0 * Update GPU docs for v2.0.14 * Add wheel to setup_requires * Import prefer_gpu and require_gpu functions from Thinc * Add tests for prefer_gpu() and require_gpu() * Update requirements and setup.py * Workaround bug in thinc require_gpu * Set version to v2.0.14 * Update push-tag script * Unhack prefer_gpu * Require thinc 6.10.6 * Update prefer_gpu and require_gpu docs [ci skip] * Fix specifiers for GPU * Set version to 2.0.14.dev1 * Set version to 2.0.14 * Update Thinc version pin * Increment version * Fix msgpack-numpy version pin * Increment version * Update version to 2.0.16 * Update version [ci skip] * Redundant ')' in the Stop words' example (#2856) <!--- Provide a general summary of your changes in the title. --> ## Description <!--- Use this section to describe your changes. If your changes required testing, include information about the testing environment and the tests you ran. If your test fixes a bug reported in an issue, don't forget to include the issue number. If your PR is still a work in progress, that's totally fine – just include a note to let us know. --> ### Types of change <!-- What type of change does your PR cover? Is it a bug fix, an enhancement or new feature, or a change to the documentation? --> ## Checklist <!--- Before you submit the PR, go over this checklist and make sure you can tick off all the boxes. [] -> [x] --> - [ ] I have submitted the spaCy Contributor Agreement. - [ ] I ran the tests, and all new and existing tests passed. - [ ] My changes don't require a change to the documentation, or if they do, I've added all required information. * Documentation improvement regarding joblib and SO (#2867) Some documentation improvements ## Description 1. Fixed the dead URL to joblib 2. Fixed Stack Overflow brand name (with space) ### Types of change Documentation ## Checklist <!--- Before you submit the PR, go over this checklist and make sure you can tick off all the boxes. [] -> [x] --> - [x] I have submitted the spaCy Contributor Agreement. - [x] I ran the tests, and all new and existing tests passed. - [x] My changes don't require a change to the documentation, or if they do, I've added all required information. * raise error when setting overlapping entities as doc.ents (#2880) * Fix out-of-bounds access in NER training The helper method state.B(1) gets the index of the first token of the buffer, or -1 if no such token exists. Normally this is safe because we pass this to functions like state.safe_get(), which returns an empty token. Here we used it directly as an array index, which is not okay! This error may have been the cause of out-of-bounds access errors during training. Similar errors may still be around, so much be hunted down. Hunting this one down took a long time...I printed out values across training runs and diffed, looking for points of divergence between runs, when no randomness should be allowed. * Change PyThaiNLP Url (#2876) * Fix missing comma * Add example showing a fix-up rule for space entities * Set version to 2.0.17.dev0 * Update regex version * Revert "Update regex version" This reverts commit 62358dd867d15bc6a475942dff34effba69dd70a. * Try setting older regex version, to align with conda * Set version to 2.0.17 * Add spacy-js to universe [ci-skip] * Add spacy-raspberry to universe (closes #2889) * Add script to validate universe json [ci skip] * Removed space in docs + added contributor indo (#2909) * - removed unneeded space in documentation * - added contributor info * Allow input text of length up to max_length, inclusive (#2922) * Include universe spec for spacy-wordnet component (#2919) * feat: include universe spec for spacy-wordnet component * chore: include spaCy contributor agreement * Minor formatting changes [ci skip] * Fix image [ci skip] Twitter URL doesn't work on live site * Check if the word is in one of the regular lists specific to each POS (#2886) * 💫 Create random IDs for SVGs to prevent ID clashes (#2927) Resolves #2924. ## Description Fixes problem where multiple visualizations in Jupyter notebooks would have clashing arc IDs, resulting in weirdly positioned arc labels. Generating a random ID prefix so even identical parses won't receive the same IDs for consistency (even if effect of ID clash isn't noticable here.) ### Types of change bug fix ## Checklist <!--- Before you submit the PR, go over this checklist and make sure you can tick off all the boxes. [] -> [x] --> - [x] I have submitted the spaCy Contributor Agreement. - [x] I ran the tests, and all new and existing tests passed. - [x] My changes don't require a change to the documentation, or if they do, I've added all required information. * Fix typo [ci skip] * fixes symbolic link on py3 and windows (#2949) * fixes symbolic link on py3 and windows during setup of spacy using command python -m spacy link en_core_web_sm en closes #2948 * Update spacy/compat.py Co-Authored-By: cicorias <cicorias@users.noreply.github.com> * Fix formatting * Update universe [ci skip] * Catalan Language Support (#2940) * Catalan language Support * Ddding Catalan to documentation * Sort languages alphabetically [ci skip] * Update tests for pytest 4.x (#2965) <!--- Provide a general summary of your changes in the title. --> ## Description - [x] Replace marks in params for pytest 4.0 compat ([see here](https://docs.pytest.org/en/latest/deprecations.html#marks-in-pytest-mark-parametrize)) - [x] Un-xfail passing tests (some fixes in a recent update resolved a bunch of issues, but tests were apparently never updated here) ### Types of change <!-- What type of change does your PR cover? Is it a bug fix, an enhancement or new feature, or a change to the documentation? --> ## Checklist <!--- Before you submit the PR, go over this checklist and make sure you can tick off all the boxes. [] -> [x] --> - [x] I have submitted the spaCy Contributor Agreement. - [x] I ran the tests, and all new and existing tests passed. - [x] My changes don't require a change to the documentation, or if they do, I've added all required information. * Fix regex pin to harmonize with conda (#2964) * Update README.rst * Fix bug where Vocab.prune_vector did not use 'batch_size' (#2977) Fixes #2976 * Fix typo * Fix typo * Remove duplicate file * Require thinc 7.0.0.dev2 Fixes bug in gpu_ops that would use cupy instead of numpy on CPU * Add missing import * Fix error IDs * Fix tests
2018-11-29 18:30:29 +03:00
TOKENIZER_EXCEPTIONS = {
"": [{ORTH: ""}],
"": [{ORTH: ""}],
".هـ": [{ORTH: ".هـ"}],
"ب.م": [{ORTH: "ب.م"}],
"ق.م": [{ORTH: "ق.م"}],
"آبرویت": [{ORTH: "آبروی", NORM: "آبروی"}, {ORTH: "ت", NORM: "ت"}],
"آب‌نباتش": [{ORTH: "آب‌نبات", NORM: "آب‌نبات"}, {ORTH: "ش", NORM: "ش"}],
"آثارش": [{ORTH: "آثار", NORM: "آثار"}, {ORTH: "ش", NORM: "ش"}],
"آخرش": [{ORTH: "آخر", NORM: "آخر"}, {ORTH: "ش", NORM: "ش"}],
"آدمهاست": [{ORTH: "آدمها", NORM: "آدمها"}, {ORTH: "ست", NORM: "ست"}],
"آرزومندیم": [{ORTH: "آرزومند", NORM: "آرزومند"}, {ORTH: "یم", NORM: "یم"}],
"آزادند": [{ORTH: "آزاد", NORM: "آزاد"}, {ORTH: "ند", NORM: "ند"}],
"آسیب‌پذیرند": [{ORTH: "آسیب‌پذیر", NORM: "آسیب‌پذیر"}, {ORTH: "ند", NORM: "ند"}],
"آفریده‌اند": [{ORTH: "آفریده‌", NORM: "آفریده‌"}, {ORTH: "اند", NORM: "اند"}],
"آمدنش": [{ORTH: "آمدن", NORM: "آمدن"}, {ORTH: "ش", NORM: "ش"}],
"آمریکاست": [{ORTH: "آمریکا", NORM: "آمریکا"}, {ORTH: "ست", NORM: "ست"}],
"آنجاست": [{ORTH: "آنجا", NORM: "آنجا"}, {ORTH: "ست", NORM: "ست"}],
"آنست": [{ORTH: "آن", NORM: "آن"}, {ORTH: "ست", NORM: "ست"}],
"آنند": [{ORTH: "آن", NORM: "آن"}, {ORTH: "ند", NORM: "ند"}],
"آن‌هاست": [{ORTH: "آن‌ها", NORM: "آن‌ها"}, {ORTH: "ست", NORM: "ست"}],
"آپاداناست": [{ORTH: "آپادانا", NORM: "آپادانا"}, {ORTH: "ست", NORM: "ست"}],
"اجتماعی‌مان": [{ORTH: "اجتماعی‌", NORM: "اجتماعی‌"}, {ORTH: "مان", NORM: "مان"}],
"اجدادت": [{ORTH: "اجداد", NORM: "اجداد"}, {ORTH: "ت", NORM: "ت"}],
"اجدادش": [{ORTH: "اجداد", NORM: "اجداد"}, {ORTH: "ش", NORM: "ش"}],
"اجدادی‌شان": [{ORTH: "اجدادی‌", NORM: "اجدادی‌"}, {ORTH: "شان", NORM: "شان"}],
"اجراست": [{ORTH: "اجرا", NORM: "اجرا"}, {ORTH: "ست", NORM: "ست"}],
"اختیارش": [{ORTH: "اختیار", NORM: "اختیار"}, {ORTH: "ش", NORM: "ش"}],
"اخلاقشان": [{ORTH: "اخلاق", NORM: "اخلاق"}, {ORTH: "شان", NORM: "شان"}],
"ادعایمان": [{ORTH: "ادعای", NORM: "ادعای"}, {ORTH: "مان", NORM: "مان"}],
"اذیتش": [{ORTH: "اذیت", NORM: "اذیت"}, {ORTH: "ش", NORM: "ش"}],
"اراده‌اش": [{ORTH: "اراده‌", NORM: "اراده‌"}, {ORTH: "اش", NORM: "اش"}],
"ارتباطش": [{ORTH: "ارتباط", NORM: "ارتباط"}, {ORTH: "ش", NORM: "ش"}],
"ارتباطمان": [{ORTH: "ارتباط", NORM: "ارتباط"}, {ORTH: "مان", NORM: "مان"}],
"ارزشهاست": [{ORTH: "ارزشها", NORM: "ارزشها"}, {ORTH: "ست", NORM: "ست"}],
"ارزی‌اش": [{ORTH: "ارزی‌", NORM: "ارزی‌"}, {ORTH: "اش", NORM: "اش"}],
"اره‌اش": [{ORTH: "اره‌", NORM: "اره‌"}, {ORTH: "اش", NORM: "اش"}],
"ازش": [{ORTH: "از", NORM: "از"}, {ORTH: "ش", NORM: "ش"}],
"ازین": [{ORTH: "از", NORM: "از"}, {ORTH: "ین", NORM: "ین"}],
"ازین‌هاست": [
{ORTH: "از", NORM: "از"},
{ORTH: "ین‌ها", NORM: "ین‌ها"},
{ORTH: "ست", NORM: "ست"},
],
"استخوانند": [{ORTH: "استخوان", NORM: "استخوان"}, {ORTH: "ند", NORM: "ند"}],
"اسلامند": [{ORTH: "اسلام", NORM: "اسلام"}, {ORTH: "ند", NORM: "ند"}],
"اسلامی‌اند": [{ORTH: "اسلامی‌", NORM: "اسلامی‌"}, {ORTH: "اند", NORM: "اند"}],
"اسلحه‌هایشان": [
{ORTH: "اسلحه‌های", NORM: "اسلحه‌های"},
{ORTH: "شان", NORM: "شان"},
],
"اسمت": [{ORTH: "اسم", NORM: "اسم"}, {ORTH: "ت", NORM: "ت"}],
"اسمش": [{ORTH: "اسم", NORM: "اسم"}, {ORTH: "ش", NORM: "ش"}],
"اشتباهند": [{ORTH: "اشتباه", NORM: "اشتباه"}, {ORTH: "ند", NORM: "ند"}],
"اصلش": [{ORTH: "اصل", NORM: "اصل"}, {ORTH: "ش", NORM: "ش"}],
"اطاقش": [{ORTH: "اطاق", NORM: "اطاق"}, {ORTH: "ش", NORM: "ش"}],
"اعتقادند": [{ORTH: "اعتقاد", NORM: "اعتقاد"}, {ORTH: "ند", NORM: "ند"}],
"اعلایش": [{ORTH: "اعلای", NORM: "اعلای"}, {ORTH: "ش", NORM: "ش"}],
"افتراست": [{ORTH: "افترا", NORM: "افترا"}, {ORTH: "ست", NORM: "ست"}],
"افطارت": [{ORTH: "افطار", NORM: "افطار"}, {ORTH: "ت", NORM: "ت"}],
"اقوامش": [{ORTH: "اقوام", NORM: "اقوام"}, {ORTH: "ش", NORM: "ش"}],
"امروزیش": [{ORTH: "امروزی", NORM: "امروزی"}, {ORTH: "ش", NORM: "ش"}],
"اموالش": [{ORTH: "اموال", NORM: "اموال"}, {ORTH: "ش", NORM: "ش"}],
"امیدوارند": [{ORTH: "امیدوار", NORM: "امیدوار"}, {ORTH: "ند", NORM: "ند"}],
"امیدواریم": [{ORTH: "امیدوار", NORM: "امیدوار"}, {ORTH: "یم", NORM: "یم"}],
"انتخابهایم": [{ORTH: "انتخابها", NORM: "انتخابها"}, {ORTH: "یم", NORM: "یم"}],
"انتظارم": [{ORTH: "انتظار", NORM: "انتظار"}, {ORTH: "م", NORM: "م"}],
"انجمنم": [{ORTH: "انجمن", NORM: "انجمن"}, {ORTH: "م", NORM: "م"}],
"اندرش": [{ORTH: "اندر", NORM: "اندر"}, {ORTH: "ش", NORM: "ش"}],
"انشایش": [{ORTH: "انشای", NORM: "انشای"}, {ORTH: "ش", NORM: "ش"}],
"انگشتشان": [{ORTH: "انگشت", NORM: "انگشت"}, {ORTH: "شان", NORM: "شان"}],
"انگشتهایش": [{ORTH: "انگشتهای", NORM: "انگشتهای"}, {ORTH: "ش", NORM: "ش"}],
"اهمیتشان": [{ORTH: "اهمیت", NORM: "اهمیت"}, {ORTH: "شان", NORM: "شان"}],
"اهمیتند": [{ORTH: "اهمیت", NORM: "اهمیت"}, {ORTH: "ند", NORM: "ند"}],
"اوایلش": [{ORTH: "اوایل", NORM: "اوایل"}, {ORTH: "ش", NORM: "ش"}],
"اوست": [{ORTH: "او", NORM: "او"}, {ORTH: "ست", NORM: "ست"}],
"اولش": [{ORTH: "اول", NORM: "اول"}, {ORTH: "ش", NORM: "ش"}],
"اولشان": [{ORTH: "اول", NORM: "اول"}, {ORTH: "شان", NORM: "شان"}],
"اولم": [{ORTH: "اول", NORM: "اول"}, {ORTH: "م", NORM: "م"}],
"اکثرشان": [{ORTH: "اکثر", NORM: "اکثر"}, {ORTH: "شان", NORM: "شان"}],
"ایتالیاست": [{ORTH: "ایتالیا", NORM: "ایتالیا"}, {ORTH: "ست", NORM: "ست"}],
"ایرانی‌اش": [{ORTH: "ایرانی‌", NORM: "ایرانی‌"}, {ORTH: "اش", NORM: "اش"}],
"اینجاست": [{ORTH: "اینجا", NORM: "اینجا"}, {ORTH: "ست", NORM: "ست"}],
"این‌هاست": [{ORTH: "این‌ها", NORM: "این‌ها"}, {ORTH: "ست", NORM: "ست"}],
"بابات": [{ORTH: "بابا", NORM: "بابا"}, {ORTH: "ت", NORM: "ت"}],
"بارش": [{ORTH: "بار", NORM: "بار"}, {ORTH: "ش", NORM: "ش"}],
"بازیگرانش": [{ORTH: "بازیگران", NORM: "بازیگران"}, {ORTH: "ش", NORM: "ش"}],
"بازیگرمان": [{ORTH: "بازیگر", NORM: "بازیگر"}, {ORTH: "مان", NORM: "مان"}],
"بازیگرهایم": [{ORTH: "بازیگرها", NORM: "بازیگرها"}, {ORTH: "یم", NORM: "یم"}],
"بازی‌اش": [{ORTH: "بازی‌", NORM: "بازی‌"}, {ORTH: "اش", NORM: "اش"}],
"بالاست": [{ORTH: "بالا", NORM: "بالا"}, {ORTH: "ست", NORM: "ست"}],
"باورند": [{ORTH: "باور", NORM: "باور"}, {ORTH: "ند", NORM: "ند"}],
"بجاست": [{ORTH: "بجا", NORM: "بجا"}, {ORTH: "ست", NORM: "ست"}],
"بدان": [{ORTH: "ب", NORM: "ب"}, {ORTH: "دان", NORM: "دان"}],
"بدش": [{ORTH: "بد", NORM: "بد"}, {ORTH: "ش", NORM: "ش"}],
"بدشان": [{ORTH: "بد", NORM: "بد"}, {ORTH: "شان", NORM: "شان"}],
"بدنم": [{ORTH: "بدن", NORM: "بدن"}, {ORTH: "م", NORM: "م"}],
"بدهی‌ات": [{ORTH: "بدهی‌", NORM: "بدهی‌"}, {ORTH: "ات", NORM: "ات"}],
"بدین": [{ORTH: "ب", NORM: "ب"}, {ORTH: "دین", NORM: "دین"}],
"برابرش": [{ORTH: "برابر", NORM: "برابر"}, {ORTH: "ش", NORM: "ش"}],
"برادرت": [{ORTH: "برادر", NORM: "برادر"}, {ORTH: "ت", NORM: "ت"}],
"برادرش": [{ORTH: "برادر", NORM: "برادر"}, {ORTH: "ش", NORM: "ش"}],
"برایت": [{ORTH: "برای", NORM: "برای"}, {ORTH: "ت", NORM: "ت"}],
"برایتان": [{ORTH: "برای", NORM: "برای"}, {ORTH: "تان", NORM: "تان"}],
"برایش": [{ORTH: "برای", NORM: "برای"}, {ORTH: "ش", NORM: "ش"}],
"برایشان": [{ORTH: "برای", NORM: "برای"}, {ORTH: "شان", NORM: "شان"}],
"برایم": [{ORTH: "برای", NORM: "برای"}, {ORTH: "م", NORM: "م"}],
"برایمان": [{ORTH: "برای", NORM: "برای"}, {ORTH: "مان", NORM: "مان"}],
"برخوردارند": [{ORTH: "برخوردار", NORM: "برخوردار"}, {ORTH: "ند", NORM: "ند"}],
"برنامه‌سازهاست": [
{ORTH: "برنامه‌سازها", NORM: "برنامه‌سازها"},
{ORTH: "ست", NORM: "ست"},
],
"برهمش": [{ORTH: "برهم", NORM: "برهم"}, {ORTH: "ش", NORM: "ش"}],
"برهنه‌اش": [{ORTH: "برهنه‌", NORM: "برهنه‌"}, {ORTH: "اش", NORM: "اش"}],
"برگهایش": [{ORTH: "برگها", NORM: "برگها"}, {ORTH: "یش", NORM: "یش"}],
"برین": [{ORTH: "بر", NORM: "بر"}, {ORTH: "ین", NORM: "ین"}],
"بزرگش": [{ORTH: "بزرگ", NORM: "بزرگ"}, {ORTH: "ش", NORM: "ش"}],
"بزرگ‌تری": [{ORTH: "بزرگ‌تر", NORM: "بزرگ‌تر"}, {ORTH: "ی", NORM: "ی"}],
"بساطش": [{ORTH: "بساط", NORM: "بساط"}, {ORTH: "ش", NORM: "ش"}],
"بعدش": [{ORTH: "بعد", NORM: "بعد"}, {ORTH: "ش", NORM: "ش"}],
"بعضیهایشان": [{ORTH: "بعضیهای", NORM: "بعضیهای"}, {ORTH: "شان", NORM: "شان"}],
"بعضی‌شان": [{ORTH: "بعضی", NORM: "بعضی"}, {ORTH: "‌شان", NORM: "شان"}],
"بقیه‌اش": [{ORTH: "بقیه‌", NORM: "بقیه‌"}, {ORTH: "اش", NORM: "اش"}],
"بلندش": [{ORTH: "بلند", NORM: "بلند"}, {ORTH: "ش", NORM: "ش"}],
"بناگوشش": [{ORTH: "بناگوش", NORM: "بناگوش"}, {ORTH: "ش", NORM: "ش"}],
"بنظرم": [
{ORTH: "ب", NORM: "ب"},
{ORTH: "نظر", NORM: "نظر"},
{ORTH: "م", NORM: "م"},
],
"بهت": [{ORTH: "به", NORM: "به"}, {ORTH: "ت", NORM: "ت"}],
"بهترش": [{ORTH: "بهتر", NORM: "بهتر"}, {ORTH: "ش", NORM: "ش"}],
"بهترم": [{ORTH: "بهتر", NORM: "بهتر"}, {ORTH: "م", NORM: "م"}],
"بهتری": [{ORTH: "بهتر", NORM: "بهتر"}, {ORTH: "ی", NORM: "ی"}],
"بهش": [{ORTH: "به", NORM: "به"}, {ORTH: "ش", NORM: "ش"}],
"به‌شان": [{ORTH: "به‌", NORM: "به‌"}, {ORTH: "شان", NORM: "شان"}],
"بودمش": [{ORTH: "بودم", NORM: "بودم"}, {ORTH: "ش", NORM: "ش"}],
"بودنش": [{ORTH: "بودن", NORM: "بودن"}, {ORTH: "ش", NORM: "ش"}],
"بودن‌شان": [{ORTH: "بودن‌", NORM: "بودن‌"}, {ORTH: "شان", NORM: "شان"}],
"بوستانش": [{ORTH: "بوستان", NORM: "بوستان"}, {ORTH: "ش", NORM: "ش"}],
"بویش": [{ORTH: "بو", NORM: "بو"}, {ORTH: "یش", NORM: "یش"}],
"بچه‌اش": [{ORTH: "بچه‌", NORM: "بچه‌"}, {ORTH: "اش", NORM: "اش"}],
"بچه‌م": [{ORTH: "بچه‌", NORM: "بچه‌"}, {ORTH: "م", NORM: "م"}],
"بچه‌هایش": [{ORTH: "بچه‌های", NORM: "بچه‌های"}, {ORTH: "ش", NORM: "ش"}],
"بیانیه‌شان": [{ORTH: "بیانیه‌", NORM: "بیانیه‌"}, {ORTH: "شان", NORM: "شان"}],
"بیدارم": [{ORTH: "بیدار", NORM: "بیدار"}, {ORTH: "م", NORM: "م"}],
"بیناتری": [{ORTH: "بیناتر", NORM: "بیناتر"}, {ORTH: "ی", NORM: "ی"}],
"بی‌اطلاعند": [{ORTH: "بی‌اطلاع", NORM: "بی‌اطلاع"}, {ORTH: "ند", NORM: "ند"}],
"بی‌اطلاعید": [{ORTH: "بی‌اطلاع", NORM: "بی‌اطلاع"}, {ORTH: "ید", NORM: "ید"}],
"بی‌بهره‌اند": [{ORTH: "بی‌بهره‌", NORM: "بی‌بهره‌"}, {ORTH: "اند", NORM: "اند"}],
"بی‌تفاوتند": [{ORTH: "بی‌تفاوت", NORM: "بی‌تفاوت"}, {ORTH: "ند", NORM: "ند"}],
"بی‌حسابش": [{ORTH: "بی‌حساب", NORM: "بی‌حساب"}, {ORTH: "ش", NORM: "ش"}],
"بی‌نیش": [{ORTH: "بی‌نی", NORM: "بی‌نی"}, {ORTH: "ش", NORM: "ش"}],
"تجربه‌هایم": [{ORTH: "تجربه‌ها", NORM: "تجربه‌ها"}, {ORTH: "یم", NORM: "یم"}],
"تحریم‌هاست": [{ORTH: "تحریم‌ها", NORM: "تحریم‌ها"}, {ORTH: "ست", NORM: "ست"}],
"تحولند": [{ORTH: "تحول", NORM: "تحول"}, {ORTH: "ند", NORM: "ند"}],
"تخیلی‌اش": [{ORTH: "تخیلی‌", NORM: "تخیلی‌"}, {ORTH: "اش", NORM: "اش"}],
"ترا": [{ORTH: "ت", NORM: "ت"}, {ORTH: "را", NORM: "را"}],
"ترسشان": [{ORTH: "ترس", NORM: "ترس"}, {ORTH: "شان", NORM: "شان"}],
"ترکش": [{ORTH: "ترک", NORM: "ترک"}, {ORTH: "ش", NORM: "ش"}],
"تشنه‌ت": [{ORTH: "تشنه‌", NORM: "تشنه‌"}, {ORTH: "ت", NORM: "ت"}],
"تشکیلاتی‌اش": [{ORTH: "تشکیلاتی‌", NORM: "تشکیلاتی‌"}, {ORTH: "اش", NORM: "اش"}],
"تعلقش": [{ORTH: "تعلق", NORM: "تعلق"}, {ORTH: "ش", NORM: "ش"}],
"تلاششان": [{ORTH: "تلاش", NORM: "تلاش"}, {ORTH: "شان", NORM: "شان"}],
"تلاشمان": [{ORTH: "تلاش", NORM: "تلاش"}, {ORTH: "مان", NORM: "مان"}],
"تماشاگرش": [{ORTH: "تماشاگر", NORM: "تماشاگر"}, {ORTH: "ش", NORM: "ش"}],
"تمامشان": [{ORTH: "تمام", NORM: "تمام"}, {ORTH: "شان", NORM: "شان"}],
"تنش": [{ORTH: "تن", NORM: "تن"}, {ORTH: "ش", NORM: "ش"}],
"تنمان": [{ORTH: "تن", NORM: "تن"}, {ORTH: "مان", NORM: "مان"}],
"تنهایی‌اش": [{ORTH: "تنهایی‌", NORM: "تنهایی‌"}, {ORTH: "اش", NORM: "اش"}],
"توانایی‌اش": [{ORTH: "توانایی‌", NORM: "توانایی‌"}, {ORTH: "اش", NORM: "اش"}],
"توجهش": [{ORTH: "توجه", NORM: "توجه"}, {ORTH: "ش", NORM: "ش"}],
"توست": [{ORTH: "تو", NORM: "تو"}, {ORTH: "ست", NORM: "ست"}],
"توصیه‌اش": [{ORTH: "توصیه‌", NORM: "توصیه‌"}, {ORTH: "اش", NORM: "اش"}],
"تیغه‌اش": [{ORTH: "تیغه‌", NORM: "تیغه‌"}, {ORTH: "اش", NORM: "اش"}],
"جاست": [{ORTH: "جا", NORM: "جا"}, {ORTH: "ست", NORM: "ست"}],
"جامعه‌اند": [{ORTH: "جامعه‌", NORM: "جامعه‌"}, {ORTH: "اند", NORM: "اند"}],
"جانم": [{ORTH: "جان", NORM: "جان"}, {ORTH: "م", NORM: "م"}],
"جایش": [{ORTH: "جای", NORM: "جای"}, {ORTH: "ش", NORM: "ش"}],
"جایشان": [{ORTH: "جای", NORM: "جای"}, {ORTH: "شان", NORM: "شان"}],
"جدیدش": [{ORTH: "جدید", NORM: "جدید"}, {ORTH: "ش", NORM: "ش"}],
"جرمزاست": [{ORTH: "جرمزا", NORM: "جرمزا"}, {ORTH: "ست", NORM: "ست"}],
"جلوست": [{ORTH: "جلو", NORM: "جلو"}, {ORTH: "ست", NORM: "ست"}],
"جلویش": [{ORTH: "جلوی", NORM: "جلوی"}, {ORTH: "ش", NORM: "ش"}],
"جمهوریست": [{ORTH: "جمهوری", NORM: "جمهوری"}, {ORTH: "ست", NORM: "ست"}],
"جنسش": [{ORTH: "جنس", NORM: "جنس"}, {ORTH: "ش", NORM: "ش"}],
"جنس‌اند": [{ORTH: "جنس‌", NORM: "جنس‌"}, {ORTH: "اند", NORM: "اند"}],
"جوانانش": [{ORTH: "جوانان", NORM: "جوانان"}, {ORTH: "ش", NORM: "ش"}],
"جویش": [{ORTH: "جوی", NORM: "جوی"}, {ORTH: "ش", NORM: "ش"}],
"جگرش": [{ORTH: "جگر", NORM: "جگر"}, {ORTH: "ش", NORM: "ش"}],
"حاضرم": [{ORTH: "حاضر", NORM: "حاضر"}, {ORTH: "م", NORM: "م"}],
"حالتهایشان": [{ORTH: "حالتهای", NORM: "حالتهای"}, {ORTH: "شان", NORM: "شان"}],
"حالیست": [{ORTH: "حالی", NORM: "حالی"}, {ORTH: "ست", NORM: "ست"}],
"حالی‌مان": [{ORTH: "حالی‌", NORM: "حالی‌"}, {ORTH: "مان", NORM: "مان"}],
"حاکیست": [{ORTH: "حاکی", NORM: "حاکی"}, {ORTH: "ست", NORM: "ست"}],
"حرامزادگی‌اش": [
{ORTH: "حرامزادگی‌", NORM: "حرامزادگی‌"},
{ORTH: "اش", NORM: "اش"},
],
"حرفتان": [{ORTH: "حرف", NORM: "حرف"}, {ORTH: "تان", NORM: "تان"}],
"حرفش": [{ORTH: "حرف", NORM: "حرف"}, {ORTH: "ش", NORM: "ش"}],
"حرفشان": [{ORTH: "حرف", NORM: "حرف"}, {ORTH: "شان", NORM: "شان"}],
"حرفم": [{ORTH: "حرف", NORM: "حرف"}, {ORTH: "م", NORM: "م"}],
"حرف‌های‌شان": [{ORTH: "حرف‌های‌", NORM: "حرف‌های‌"}, {ORTH: "شان", NORM: "شان"}],
"حرکتمان": [{ORTH: "حرکت", NORM: "حرکت"}, {ORTH: "مان", NORM: "مان"}],
"حریفانشان": [{ORTH: "حریفان", NORM: "حریفان"}, {ORTH: "شان", NORM: "شان"}],
"حضورشان": [{ORTH: "حضور", NORM: "حضور"}, {ORTH: "شان", NORM: "شان"}],
"حمایتش": [{ORTH: "حمایت", NORM: "حمایت"}, {ORTH: "ش", NORM: "ش"}],
"حواسش": [{ORTH: "حواس", NORM: "حواس"}, {ORTH: "ش", NORM: "ش"}],
"حواسشان": [{ORTH: "حواس", NORM: "حواس"}, {ORTH: "شان", NORM: "شان"}],
"حوصله‌مان": [{ORTH: "حوصله‌", NORM: "حوصله‌"}, {ORTH: "مان", NORM: "مان"}],
"حکومتش": [{ORTH: "حکومت", NORM: "حکومت"}, {ORTH: "ش", NORM: "ش"}],
"حکومتشان": [{ORTH: "حکومت", NORM: "حکومت"}, {ORTH: "شان", NORM: "شان"}],
"حیفم": [{ORTH: "حیف", NORM: "حیف"}, {ORTH: "م", NORM: "م"}],
"خاندانش": [{ORTH: "خاندان", NORM: "خاندان"}, {ORTH: "ش", NORM: "ش"}],
"خانه‌اش": [{ORTH: "خانه‌", NORM: "خانه‌"}, {ORTH: "اش", NORM: "اش"}],
"خانه‌شان": [{ORTH: "خانه‌", NORM: "خانه‌"}, {ORTH: "شان", NORM: "شان"}],
"خانه‌مان": [{ORTH: "خانه‌", NORM: "خانه‌"}, {ORTH: "مان", NORM: "مان"}],
"خانه‌هایشان": [{ORTH: "خانه‌های", NORM: "خانه‌های"}, {ORTH: "شان", NORM: "شان"}],
"خانواده‌ات": [{ORTH: "خانواده", NORM: "خانواده"}, {ORTH: "‌ات", NORM: "ات"}],
"خانواده‌اش": [{ORTH: "خانواده‌", NORM: "خانواده‌"}, {ORTH: "اش", NORM: "اش"}],
"خانواده‌ام": [{ORTH: "خانواده‌", NORM: "خانواده‌"}, {ORTH: "ام", NORM: "ام"}],
"خانواده‌شان": [{ORTH: "خانواده‌", NORM: "خانواده‌"}, {ORTH: "شان", NORM: "شان"}],
"خداست": [{ORTH: "خدا", NORM: "خدا"}, {ORTH: "ست", NORM: "ست"}],
"خدایش": [{ORTH: "خدا", NORM: "خدا"}, {ORTH: "یش", NORM: "یش"}],
"خدایشان": [{ORTH: "خدای", NORM: "خدای"}, {ORTH: "شان", NORM: "شان"}],
"خردسالش": [{ORTH: "خردسال", NORM: "خردسال"}, {ORTH: "ش", NORM: "ش"}],
"خروپفشان": [{ORTH: "خروپف", NORM: "خروپف"}, {ORTH: "شان", NORM: "شان"}],
"خسته‌ای": [{ORTH: "خسته‌", NORM: "خسته‌"}, {ORTH: "ای", NORM: "ای"}],
"خطت": [{ORTH: "خط", NORM: "خط"}, {ORTH: "ت", NORM: "ت"}],
"خوابمان": [{ORTH: "خواب", NORM: "خواب"}, {ORTH: "مان", NORM: "مان"}],
"خواندنش": [{ORTH: "خواندن", NORM: "خواندن"}, {ORTH: "ش", NORM: "ش"}],
"خواهرش": [{ORTH: "خواهر", NORM: "خواهر"}, {ORTH: "ش", NORM: "ش"}],
"خوبش": [{ORTH: "خوب", NORM: "خوب"}, {ORTH: "ش", NORM: "ش"}],
"خودت": [{ORTH: "خود", NORM: "خود"}, {ORTH: "ت", NORM: "ت"}],
"خودتان": [{ORTH: "خود", NORM: "خود"}, {ORTH: "تان", NORM: "تان"}],
"خودش": [{ORTH: "خود", NORM: "خود"}, {ORTH: "ش", NORM: "ش"}],
"خودشان": [{ORTH: "خود", NORM: "خود"}, {ORTH: "شان", NORM: "شان"}],
"خودمان": [{ORTH: "خود", NORM: "خود"}, {ORTH: "مان", NORM: "مان"}],
"خوردمان": [{ORTH: "خورد", NORM: "خورد"}, {ORTH: "مان", NORM: "مان"}],
"خوردنشان": [{ORTH: "خوردن", NORM: "خوردن"}, {ORTH: "شان", NORM: "شان"}],
"خوشش": [{ORTH: "خوش", NORM: "خوش"}, {ORTH: "ش", NORM: "ش"}],
"خوشوقتم": [{ORTH: "خوشوقت", NORM: "خوشوقت"}, {ORTH: "م", NORM: "م"}],
"خونشان": [{ORTH: "خون", NORM: "خون"}, {ORTH: "شان", NORM: "شان"}],
"خویش": [{ORTH: "خوی", NORM: "خوی"}, {ORTH: "ش", NORM: "ش"}],
"خویشتنم": [{ORTH: "خویشتن", NORM: "خویشتن"}, {ORTH: "م", NORM: "م"}],
"خیالش": [{ORTH: "خیال", NORM: "خیال"}, {ORTH: "ش", NORM: "ش"}],
"خیسش": [{ORTH: "خیس", NORM: "خیس"}, {ORTH: "ش", NORM: "ش"}],
"داراست": [{ORTH: "دارا", NORM: "دارا"}, {ORTH: "ست", NORM: "ست"}],
"داستانهایش": [{ORTH: "داستانهای", NORM: "داستانهای"}, {ORTH: "ش", NORM: "ش"}],
"دخترمان": [{ORTH: "دختر", NORM: "دختر"}, {ORTH: "مان", NORM: "مان"}],
"دخیلند": [{ORTH: "دخیل", NORM: "دخیل"}, {ORTH: "ند", NORM: "ند"}],
"درباره‌ات": [{ORTH: "درباره", NORM: "درباره"}, {ORTH: "‌ات", NORM: "ات"}],
"درباره‌اش": [{ORTH: "درباره‌", NORM: "درباره‌"}, {ORTH: "اش", NORM: "اش"}],
"دردش": [{ORTH: "درد", NORM: "درد"}, {ORTH: "ش", NORM: "ش"}],
"دردشان": [{ORTH: "درد", NORM: "درد"}, {ORTH: "شان", NORM: "شان"}],
"درسته": [{ORTH: "درست", NORM: "درست"}, {ORTH: "ه", NORM: "ه"}],
"درش": [{ORTH: "در", NORM: "در"}, {ORTH: "ش", NORM: "ش"}],
"درون‌شان": [{ORTH: "درون‌", NORM: "درون‌"}, {ORTH: "شان", NORM: "شان"}],
"درین": [{ORTH: "در", NORM: "در"}, {ORTH: "ین", NORM: "ین"}],
"دریچه‌هایش": [{ORTH: "دریچه‌های", NORM: "دریچه‌های"}, {ORTH: "ش", NORM: "ش"}],
"دزدانش": [{ORTH: "دزدان", NORM: "دزدان"}, {ORTH: "ش", NORM: "ش"}],
"دستت": [{ORTH: "دست", NORM: "دست"}, {ORTH: "ت", NORM: "ت"}],
"دستش": [{ORTH: "دست", NORM: "دست"}, {ORTH: "ش", NORM: "ش"}],
"دستمان": [{ORTH: "دست", NORM: "دست"}, {ORTH: "مان", NORM: "مان"}],
"دستهایشان": [{ORTH: "دستهای", NORM: "دستهای"}, {ORTH: "شان", NORM: "شان"}],
"دست‌یافتنی‌ست": [
{ORTH: "دست‌یافتنی‌", NORM: "دست‌یافتنی‌"},
{ORTH: "ست", NORM: "ست"},
],
"دشمنند": [{ORTH: "دشمن", NORM: "دشمن"}, {ORTH: "ند", NORM: "ند"}],
"دشمنیشان": [{ORTH: "دشمنی", NORM: "دشمنی"}, {ORTH: "شان", NORM: "شان"}],
"دشمنیم": [{ORTH: "دشمن", NORM: "دشمن"}, {ORTH: "یم", NORM: "یم"}],
"دفترش": [{ORTH: "دفتر", NORM: "دفتر"}, {ORTH: "ش", NORM: "ش"}],
"دفنشان": [{ORTH: "دفن", NORM: "دفن"}, {ORTH: "شان", NORM: "شان"}],
"دلت": [{ORTH: "دل", NORM: "دل"}, {ORTH: "ت", NORM: "ت"}],
"دلش": [{ORTH: "دل", NORM: "دل"}, {ORTH: "ش", NORM: "ش"}],
"دلشان": [{ORTH: "دل", NORM: "دل"}, {ORTH: "شان", NORM: "شان"}],
"دلم": [{ORTH: "دل", NORM: "دل"}, {ORTH: "م", NORM: "م"}],
"دلیلش": [{ORTH: "دلیل", NORM: "دلیل"}, {ORTH: "ش", NORM: "ش"}],
"دنبالش": [{ORTH: "دنبال", NORM: "دنبال"}, {ORTH: "ش", NORM: "ش"}],
"دنباله‌اش": [{ORTH: "دنباله‌", NORM: "دنباله‌"}, {ORTH: "اش", NORM: "اش"}],
"دهاتی‌هایش": [{ORTH: "دهاتی‌های", NORM: "دهاتی‌های"}, {ORTH: "ش", NORM: "ش"}],
"دهانت": [{ORTH: "دهان", NORM: "دهان"}, {ORTH: "ت", NORM: "ت"}],
"دهنش": [{ORTH: "دهن", NORM: "دهن"}, {ORTH: "ش", NORM: "ش"}],
"دورش": [{ORTH: "دور", NORM: "دور"}, {ORTH: "ش", NORM: "ش"}],
"دوروبریهاشان": [
{ORTH: "دوروبریها", NORM: "دوروبریها"},
{ORTH: "شان", NORM: "شان"},
],
"دوستانش": [{ORTH: "دوستان", NORM: "دوستان"}, {ORTH: "ش", NORM: "ش"}],
"دوستانشان": [{ORTH: "دوستان", NORM: "دوستان"}, {ORTH: "شان", NORM: "شان"}],
"دوستت": [{ORTH: "دوست", NORM: "دوست"}, {ORTH: "ت", NORM: "ت"}],
"دوستش": [{ORTH: "دوست", NORM: "دوست"}, {ORTH: "ش", NORM: "ش"}],
"دومش": [{ORTH: "دوم", NORM: "دوم"}, {ORTH: "ش", NORM: "ش"}],
"دویدنش": [{ORTH: "دویدن", NORM: "دویدن"}, {ORTH: "ش", NORM: "ش"}],
"دکورهایمان": [{ORTH: "دکورهای", NORM: "دکورهای"}, {ORTH: "مان", NORM: "مان"}],
"دیدگاهش": [{ORTH: "دیدگاه", NORM: "دیدگاه"}, {ORTH: "ش", NORM: "ش"}],
"دیرت": [{ORTH: "دیر", NORM: "دیر"}, {ORTH: "ت", NORM: "ت"}],
"دیرم": [{ORTH: "دیر", NORM: "دیر"}, {ORTH: "م", NORM: "م"}],
"دینت": [{ORTH: "دین", NORM: "دین"}, {ORTH: "ت", NORM: "ت"}],
"دینش": [{ORTH: "دین", NORM: "دین"}, {ORTH: "ش", NORM: "ش"}],
"دین‌شان": [{ORTH: "دین‌", NORM: "دین‌"}, {ORTH: "شان", NORM: "شان"}],
"دیواره‌هایش": [{ORTH: "دیواره‌های", NORM: "دیواره‌های"}, {ORTH: "ش", NORM: "ش"}],
"دیوانه‌ای": [{ORTH: "دیوانه‌", NORM: "دیوانه‌"}, {ORTH: "ای", NORM: "ای"}],
"دیوی": [{ORTH: "دیو", NORM: "دیو"}, {ORTH: "ی", NORM: "ی"}],
"دیگرم": [{ORTH: "دیگر", NORM: "دیگر"}, {ORTH: "م", NORM: "م"}],
"دیگرمان": [{ORTH: "دیگر", NORM: "دیگر"}, {ORTH: "مان", NORM: "مان"}],
"ذهنش": [{ORTH: "ذهن", NORM: "ذهن"}, {ORTH: "ش", NORM: "ش"}],
"ذهنشان": [{ORTH: "ذهن", NORM: "ذهن"}, {ORTH: "شان", NORM: "شان"}],
"ذهنم": [{ORTH: "ذهن", NORM: "ذهن"}, {ORTH: "م", NORM: "م"}],
"رئوسش": [{ORTH: "رئوس", NORM: "رئوس"}, {ORTH: "ش", NORM: "ش"}],
"راهشان": [{ORTH: "راه", NORM: "راه"}, {ORTH: "شان", NORM: "شان"}],
"راهگشاست": [{ORTH: "راهگشا", NORM: "راهگشا"}, {ORTH: "ست", NORM: "ست"}],
"رایانه‌هایشان": [
{ORTH: "رایانه‌های", NORM: "رایانه‌های"},
{ORTH: "شان", NORM: "شان"},
],
"رعایتشان": [{ORTH: "رعایت", NORM: "رعایت"}, {ORTH: "شان", NORM: "شان"}],
"رفتارش": [{ORTH: "رفتار", NORM: "رفتار"}, {ORTH: "ش", NORM: "ش"}],
"رفتارشان": [{ORTH: "رفتار", NORM: "رفتار"}, {ORTH: "شان", NORM: "شان"}],
"رفتارمان": [{ORTH: "رفتار", NORM: "رفتار"}, {ORTH: "مان", NORM: "مان"}],
"رفتارهاست": [{ORTH: "رفتارها", NORM: "رفتارها"}, {ORTH: "ست", NORM: "ست"}],
"رفتارهایشان": [{ORTH: "رفتارهای", NORM: "رفتارهای"}, {ORTH: "شان", NORM: "شان"}],
"رفقایم": [{ORTH: "رفقا", NORM: "رفقا"}, {ORTH: "یم", NORM: "یم"}],
"رقیق‌ترش": [{ORTH: "رقیق‌تر", NORM: "رقیق‌تر"}, {ORTH: "ش", NORM: "ش"}],
"رنجند": [{ORTH: "رنج", NORM: "رنج"}, {ORTH: "ند", NORM: "ند"}],
"رهگشاست": [{ORTH: "رهگشا", NORM: "رهگشا"}, {ORTH: "ست", NORM: "ست"}],
"رواست": [{ORTH: "روا", NORM: "روا"}, {ORTH: "ست", NORM: "ست"}],
"روبروست": [{ORTH: "روبرو", NORM: "روبرو"}, {ORTH: "ست", NORM: "ست"}],
"روحی‌اش": [{ORTH: "روحی‌", NORM: "روحی‌"}, {ORTH: "اش", NORM: "اش"}],
"روزنامه‌اش": [{ORTH: "روزنامه‌", NORM: "روزنامه‌"}, {ORTH: "اش", NORM: "اش"}],
"روزه‌ست": [{ORTH: "روزه‌", NORM: "روزه‌"}, {ORTH: "ست", NORM: "ست"}],
"روسری‌اش": [{ORTH: "روسری‌", NORM: "روسری‌"}, {ORTH: "اش", NORM: "اش"}],
"روشتان": [{ORTH: "روش", NORM: "روش"}, {ORTH: "تان", NORM: "تان"}],
"رویش": [{ORTH: "روی", NORM: "روی"}, {ORTH: "ش", NORM: "ش"}],
"زبانش": [{ORTH: "زبان", NORM: "زبان"}, {ORTH: "ش", NORM: "ش"}],
"زحماتشان": [{ORTH: "زحمات", NORM: "زحمات"}, {ORTH: "شان", NORM: "شان"}],
"زدنهایشان": [{ORTH: "زدنهای", NORM: "زدنهای"}, {ORTH: "شان", NORM: "شان"}],
"زرنگشان": [{ORTH: "زرنگ", NORM: "زرنگ"}, {ORTH: "شان", NORM: "شان"}],
"زشتش": [{ORTH: "زشت", NORM: "زشت"}, {ORTH: "ش", NORM: "ش"}],
"زشتکارانند": [{ORTH: "زشتکاران", NORM: "زشتکاران"}, {ORTH: "ند", NORM: "ند"}],
"زلفش": [{ORTH: "زلف", NORM: "زلف"}, {ORTH: "ش", NORM: "ش"}],
"زمن": [{ORTH: "ز", NORM: "ز"}, {ORTH: "من", NORM: "من"}],
"زنبوری‌اش": [{ORTH: "زنبوری‌", NORM: "زنبوری‌"}, {ORTH: "اش", NORM: "اش"}],
"زندانم": [{ORTH: "زندان", NORM: "زندان"}, {ORTH: "م", NORM: "م"}],
"زنده‌ام": [{ORTH: "زنده‌", NORM: "زنده‌"}, {ORTH: "ام", NORM: "ام"}],
"زندگانی‌اش": [{ORTH: "زندگانی‌", NORM: "زندگانی‌"}, {ORTH: "اش", NORM: "اش"}],
"زندگی‌اش": [{ORTH: "زندگی‌", NORM: "زندگی‌"}, {ORTH: "اش", NORM: "اش"}],
"زندگی‌ام": [{ORTH: "زندگی‌", NORM: "زندگی‌"}, {ORTH: "ام", NORM: "ام"}],
"زندگی‌شان": [{ORTH: "زندگی‌", NORM: "زندگی‌"}, {ORTH: "شان", NORM: "شان"}],
"زنش": [{ORTH: "زن", NORM: "زن"}, {ORTH: "ش", NORM: "ش"}],
"زنند": [{ORTH: "زن", NORM: "زن"}, {ORTH: "ند", NORM: "ند"}],
"زو": [{ORTH: "ز", NORM: "ز"}, {ORTH: "و", NORM: "و"}],
"زیاده": [{ORTH: "زیاد", NORM: "زیاد"}, {ORTH: "ه", NORM: "ه"}],
"زیباست": [{ORTH: "زیبا", NORM: "زیبا"}, {ORTH: "ست", NORM: "ست"}],
"زیبایش": [{ORTH: "زیبای", NORM: "زیبای"}, {ORTH: "ش", NORM: "ش"}],
"زیبایی": [{ORTH: "زیبای", NORM: "زیبای"}, {ORTH: "ی", NORM: "ی"}],
"زیربناست": [{ORTH: "زیربنا", NORM: "زیربنا"}, {ORTH: "ست", NORM: "ست"}],
"زیرک‌اند": [{ORTH: "زیرک‌", NORM: "زیرک‌"}, {ORTH: "اند", NORM: "اند"}],
"سؤالتان": [{ORTH: "سؤال", NORM: "سؤال"}, {ORTH: "تان", NORM: "تان"}],
"سؤالم": [{ORTH: "سؤال", NORM: "سؤال"}, {ORTH: "م", NORM: "م"}],
"سابقه‌اش": [{ORTH: "سابقه‌", NORM: "سابقه‌"}, {ORTH: "اش", NORM: "اش"}],
"ساختنم": [{ORTH: "ساختن", NORM: "ساختن"}, {ORTH: "م", NORM: "م"}],
"ساده‌اش": [{ORTH: "ساده‌", NORM: "ساده‌"}, {ORTH: "اش", NORM: "اش"}],
"ساده‌اند": [{ORTH: "ساده‌", NORM: "ساده‌"}, {ORTH: "اند", NORM: "اند"}],
"سازمانش": [{ORTH: "سازمان", NORM: "سازمان"}, {ORTH: "ش", NORM: "ش"}],
"ساعتم": [{ORTH: "ساعت", NORM: "ساعت"}, {ORTH: "م", NORM: "م"}],
"سالته": [
{ORTH: "سال", NORM: "سال"},
{ORTH: "ت", NORM: "ت"},
{ORTH: "ه", NORM: "ه"},
],
"سالش": [{ORTH: "سال", NORM: "سال"}, {ORTH: "ش", NORM: "ش"}],
"سالهاست": [{ORTH: "سالها", NORM: "سالها"}, {ORTH: "ست", NORM: "ست"}],
"ساله‌اش": [{ORTH: "ساله‌", NORM: "ساله‌"}, {ORTH: "اش", NORM: "اش"}],
"ساکتند": [{ORTH: "ساکت", NORM: "ساکت"}, {ORTH: "ند", NORM: "ند"}],
"ساکنند": [{ORTH: "ساکن", NORM: "ساکن"}, {ORTH: "ند", NORM: "ند"}],
"سبزشان": [{ORTH: "سبز", NORM: "سبز"}, {ORTH: "شان", NORM: "شان"}],
"سبیل‌مان": [{ORTH: "سبیل‌", NORM: "سبیل‌"}, {ORTH: "مان", NORM: "مان"}],
"ستم‌هایش": [{ORTH: "ستم‌های", NORM: "ستم‌های"}, {ORTH: "ش", NORM: "ش"}],
"سخنانش": [{ORTH: "سخنان", NORM: "سخنان"}, {ORTH: "ش", NORM: "ش"}],
"سخنانشان": [{ORTH: "سخنان", NORM: "سخنان"}, {ORTH: "شان", NORM: "شان"}],
"سخنتان": [{ORTH: "سخن", NORM: "سخن"}, {ORTH: "تان", NORM: "تان"}],
"سخنش": [{ORTH: "سخن", NORM: "سخن"}, {ORTH: "ش", NORM: "ش"}],
"سخنم": [{ORTH: "سخن", NORM: "سخن"}, {ORTH: "م", NORM: "م"}],
"سردش": [{ORTH: "سرد", NORM: "سرد"}, {ORTH: "ش", NORM: "ش"}],
"سرزمینشان": [{ORTH: "سرزمین", NORM: "سرزمین"}, {ORTH: "شان", NORM: "شان"}],
"سرش": [{ORTH: "سر", NORM: "سر"}, {ORTH: "ش", NORM: "ش"}],
"سرمایه‌دارهاست": [
{ORTH: "سرمایه‌دارها", NORM: "سرمایه‌دارها"},
{ORTH: "ست", NORM: "ست"},
],
"سرنوشتش": [{ORTH: "سرنوشت", NORM: "سرنوشت"}, {ORTH: "ش", NORM: "ش"}],
"سرنوشتشان": [{ORTH: "سرنوشت", NORM: "سرنوشت"}, {ORTH: "شان", NORM: "شان"}],
"سروتهش": [{ORTH: "سروته", NORM: "سروته"}, {ORTH: "ش", NORM: "ش"}],
"سرچشمه‌اش": [{ORTH: "سرچشمه‌", NORM: "سرچشمه‌"}, {ORTH: "اش", NORM: "اش"}],
"سقمش": [{ORTH: "سقم", NORM: "سقم"}, {ORTH: "ش", NORM: "ش"}],
"سنش": [{ORTH: "سن", NORM: "سن"}, {ORTH: "ش", NORM: "ش"}],
"سپاهش": [{ORTH: "سپاه", NORM: "سپاه"}, {ORTH: "ش", NORM: "ش"}],
"سیاسیشان": [{ORTH: "سیاسی", NORM: "سیاسی"}, {ORTH: "شان", NORM: "شان"}],
"سیاه‌چاله‌هاست": [
{ORTH: "سیاه‌چاله‌ها", NORM: "سیاه‌چاله‌ها"},
{ORTH: "ست", NORM: "ست"},
],
"شاخه‌هایشان": [{ORTH: "شاخه‌های", NORM: "شاخه‌های"}, {ORTH: "شان", NORM: "شان"}],
"شالوده‌اش": [{ORTH: "شالوده‌", NORM: "شالوده‌"}, {ORTH: "اش", NORM: "اش"}],
"شانه‌هایش": [{ORTH: "شانه‌های", NORM: "شانه‌های"}, {ORTH: "ش", NORM: "ش"}],
"شاهدیم": [{ORTH: "شاهد", NORM: "شاهد"}, {ORTH: "یم", NORM: "یم"}],
"شاهکارهایش": [{ORTH: "شاهکارهای", NORM: "شاهکارهای"}, {ORTH: "ش", NORM: "ش"}],
"شخصیتش": [{ORTH: "شخصیت", NORM: "شخصیت"}, {ORTH: "ش", NORM: "ش"}],
"شدنشان": [{ORTH: "شدن", NORM: "شدن"}, {ORTH: "شان", NORM: "شان"}],
"شرکتیست": [{ORTH: "شرکتی", NORM: "شرکتی"}, {ORTH: "ست", NORM: "ست"}],
"شعارهاشان": [{ORTH: "شعارها", NORM: "شعارها"}, {ORTH: "شان", NORM: "شان"}],
"شعورش": [{ORTH: "شعور", NORM: "شعور"}, {ORTH: "ش", NORM: "ش"}],
"شغلش": [{ORTH: "شغل", NORM: "شغل"}, {ORTH: "ش", NORM: "ش"}],
"شماست": [{ORTH: "شما", NORM: "شما"}, {ORTH: "ست", NORM: "ست"}],
"شمشیرش": [{ORTH: "شمشیر", NORM: "شمشیر"}, {ORTH: "ش", NORM: "ش"}],
"شنیدنش": [{ORTH: "شنیدن", NORM: "شنیدن"}, {ORTH: "ش", NORM: "ش"}],
"شوراست": [{ORTH: "شورا", NORM: "شورا"}, {ORTH: "ست", NORM: "ست"}],
"شومت": [{ORTH: "شوم", NORM: "شوم"}, {ORTH: "ت", NORM: "ت"}],
"شیرینترش": [{ORTH: "شیرینتر", NORM: "شیرینتر"}, {ORTH: "ش", NORM: "ش"}],
"شیطان‌اند": [{ORTH: "شیطان‌", NORM: "شیطان‌"}, {ORTH: "اند", NORM: "اند"}],
"شیوه‌هاست": [{ORTH: "شیوه‌ها", NORM: "شیوه‌ها"}, {ORTH: "ست", NORM: "ست"}],
"صاحبش": [{ORTH: "صاحب", NORM: "صاحب"}, {ORTH: "ش", NORM: "ش"}],
"صحنه‌اش": [{ORTH: "صحنه‌", NORM: "صحنه‌"}, {ORTH: "اش", NORM: "اش"}],
"صدایش": [{ORTH: "صدای", NORM: "صدای"}, {ORTH: "ش", NORM: "ش"}],
"صددند": [{ORTH: "صدد", NORM: "صدد"}, {ORTH: "ند", NORM: "ند"}],
"صندوق‌هاست": [{ORTH: "صندوق‌ها", NORM: "صندوق‌ها"}, {ORTH: "ست", NORM: "ست"}],
"صندوق‌هایش": [{ORTH: "صندوق‌های", NORM: "صندوق‌های"}, {ORTH: "ش", NORM: "ش"}],
"صورتش": [{ORTH: "صورت", NORM: "صورت"}, {ORTH: "ش", NORM: "ش"}],
"ضروری‌اند": [{ORTH: "ضروری‌", NORM: "ضروری‌"}, {ORTH: "اند", NORM: "اند"}],
"ضمیرش": [{ORTH: "ضمیر", NORM: "ضمیر"}, {ORTH: "ش", NORM: "ش"}],
"طرفش": [{ORTH: "طرف", NORM: "طرف"}, {ORTH: "ش", NORM: "ش"}],
"طلسمش": [{ORTH: "طلسم", NORM: "طلسم"}, {ORTH: "ش", NORM: "ش"}],
"طوره": [{ORTH: "طور", NORM: "طور"}, {ORTH: "ه", NORM: "ه"}],
"عاشوراست": [{ORTH: "عاشورا", NORM: "عاشورا"}, {ORTH: "ست", NORM: "ست"}],
"عبارتند": [{ORTH: "عبارت", NORM: "عبارت"}, {ORTH: "ند", NORM: "ند"}],
"عزیزانتان": [{ORTH: "عزیزان", NORM: "عزیزان"}, {ORTH: "تان", NORM: "تان"}],
"عزیزانش": [{ORTH: "عزیزان", NORM: "عزیزان"}, {ORTH: "ش", NORM: "ش"}],
"عزیزش": [{ORTH: "عزیز", NORM: "عزیز"}, {ORTH: "ش", NORM: "ش"}],
"عشرت‌طلبی‌اش": [
{ORTH: "عشرت‌طلبی‌", NORM: "عشرت‌طلبی‌"},
{ORTH: "اش", NORM: "اش"},
],
"عقبیم": [{ORTH: "عقب", NORM: "عقب"}, {ORTH: "یم", NORM: "یم"}],
"علاقه‌اش": [{ORTH: "علاقه‌", NORM: "علاقه‌"}, {ORTH: "اش", NORM: "اش"}],
"علمیمان": [{ORTH: "علمی", NORM: "علمی"}, {ORTH: "مان", NORM: "مان"}],
"عمرش": [{ORTH: "عمر", NORM: "عمر"}, {ORTH: "ش", NORM: "ش"}],
"عمرشان": [{ORTH: "عمر", NORM: "عمر"}, {ORTH: "شان", NORM: "شان"}],
"عملش": [{ORTH: "عمل", NORM: "عمل"}, {ORTH: "ش", NORM: "ش"}],
"عملی‌اند": [{ORTH: "عملی‌", NORM: "عملی‌"}, {ORTH: "اند", NORM: "اند"}],
"عمویت": [{ORTH: "عموی", NORM: "عموی"}, {ORTH: "ت", NORM: "ت"}],
"عمویش": [{ORTH: "عموی", NORM: "عموی"}, {ORTH: "ش", NORM: "ش"}],
"عمیقش": [{ORTH: "عمیق", NORM: "عمیق"}, {ORTH: "ش", NORM: "ش"}],
"عواملش": [{ORTH: "عوامل", NORM: "عوامل"}, {ORTH: "ش", NORM: "ش"}],
"عوضشان": [{ORTH: "عوض", NORM: "عوض"}, {ORTH: "شان", NORM: "شان"}],
"غذایی‌شان": [{ORTH: "غذایی‌", NORM: "غذایی‌"}, {ORTH: "شان", NORM: "شان"}],
"غریبه‌اند": [{ORTH: "غریبه‌", NORM: "غریبه‌"}, {ORTH: "اند", NORM: "اند"}],
"غلامانش": [{ORTH: "غلامان", NORM: "غلامان"}, {ORTH: "ش", NORM: "ش"}],
"غلطهاست": [{ORTH: "غلطها", NORM: "غلطها"}, {ORTH: "ست", NORM: "ست"}],
"فراموشتان": [{ORTH: "فراموش", NORM: "فراموش"}, {ORTH: "تان", NORM: "تان"}],
"فردی‌اند": [{ORTH: "فردی‌", NORM: "فردی‌"}, {ORTH: "اند", NORM: "اند"}],
"فرزندانش": [{ORTH: "فرزندان", NORM: "فرزندان"}, {ORTH: "ش", NORM: "ش"}],
"فرزندش": [{ORTH: "فرزند", NORM: "فرزند"}, {ORTH: "ش", NORM: "ش"}],
"فرم‌هایش": [{ORTH: "فرم‌های", NORM: "فرم‌های"}, {ORTH: "ش", NORM: "ش"}],
"فرهنگی‌مان": [{ORTH: "فرهنگی‌", NORM: "فرهنگی‌"}, {ORTH: "مان", NORM: "مان"}],
"فریادشان": [{ORTH: "فریاد", NORM: "فریاد"}, {ORTH: "شان", NORM: "شان"}],
"فضایی‌شان": [{ORTH: "فضایی‌", NORM: "فضایی‌"}, {ORTH: "شان", NORM: "شان"}],
"فقیرشان": [{ORTH: "فقیر", NORM: "فقیر"}, {ORTH: "شان", NORM: "شان"}],
"فوری‌شان": [{ORTH: "فوری‌", NORM: "فوری‌"}, {ORTH: "شان", NORM: "شان"}],
"قائلند": [{ORTH: "قائل", NORM: "قائل"}, {ORTH: "ند", NORM: "ند"}],
"قائلیم": [{ORTH: "قائل", NORM: "قائل"}, {ORTH: "یم", NORM: "یم"}],
"قادرند": [{ORTH: "قادر", NORM: "قادر"}, {ORTH: "ند", NORM: "ند"}],
"قانونمندش": [{ORTH: "قانونمند", NORM: "قانونمند"}, {ORTH: "ش", NORM: "ش"}],
"قبلند": [{ORTH: "قبل", NORM: "قبل"}, {ORTH: "ند", NORM: "ند"}],
"قبلی‌اش": [{ORTH: "قبلی‌", NORM: "قبلی‌"}, {ORTH: "اش", NORM: "اش"}],
"قبلی‌مان": [{ORTH: "قبلی‌", NORM: "قبلی‌"}, {ORTH: "مان", NORM: "مان"}],
"قدریست": [{ORTH: "قدری", NORM: "قدری"}, {ORTH: "ست", NORM: "ست"}],
"قدمش": [{ORTH: "قدم", NORM: "قدم"}, {ORTH: "ش", NORM: "ش"}],
"قسمتش": [{ORTH: "قسمت", NORM: "قسمت"}, {ORTH: "ش", NORM: "ش"}],
"قضایاست": [{ORTH: "قضایا", NORM: "قضایا"}, {ORTH: "ست", NORM: "ست"}],
"قضیه‌شان": [{ORTH: "قضیه‌", NORM: "قضیه‌"}, {ORTH: "شان", NORM: "شان"}],
"قهرمانهایشان": [
{ORTH: "قهرمانهای", NORM: "قهرمانهای"},
{ORTH: "شان", NORM: "شان"},
],
"قهرمانیش": [{ORTH: "قهرمانی", NORM: "قهرمانی"}, {ORTH: "ش", NORM: "ش"}],
"قومت": [{ORTH: "قوم", NORM: "قوم"}, {ORTH: "ت", NORM: "ت"}],
"لازمه‌اش": [{ORTH: "لازمه‌", NORM: "لازمه‌"}, {ORTH: "اش", NORM: "اش"}],
"مأموریتش": [{ORTH: "مأموریت", NORM: "مأموریت"}, {ORTH: "ش", NORM: "ش"}],
"مأموریتم": [{ORTH: "مأموریت", NORM: "مأموریت"}, {ORTH: "م", NORM: "م"}],
"مأموریت‌اند": [{ORTH: "مأموریت‌", NORM: "مأموریت‌"}, {ORTH: "اند", NORM: "اند"}],
"مادرانشان": [{ORTH: "مادران", NORM: "مادران"}, {ORTH: "شان", NORM: "شان"}],
"مادرت": [{ORTH: "مادر", NORM: "مادر"}, {ORTH: "ت", NORM: "ت"}],
"مادرش": [{ORTH: "مادر", NORM: "مادر"}, {ORTH: "ش", NORM: "ش"}],
"مادرم": [{ORTH: "مادر", NORM: "مادر"}, {ORTH: "م", NORM: "م"}],
"ماست": [{ORTH: "ما", NORM: "ما"}, {ORTH: "ست", NORM: "ست"}],
"مالی‌اش": [{ORTH: "مالی‌", NORM: "مالی‌"}, {ORTH: "اش", NORM: "اش"}],
"ماهیتش": [{ORTH: "ماهیت", NORM: "ماهیت"}, {ORTH: "ش", NORM: "ش"}],
"مایی": [{ORTH: "ما", NORM: "ما"}, {ORTH: "یی", NORM: "یی"}],
"مجازاتش": [{ORTH: "مجازات", NORM: "مجازات"}, {ORTH: "ش", NORM: "ش"}],
"مجبورند": [{ORTH: "مجبور", NORM: "مجبور"}, {ORTH: "ند", NORM: "ند"}],
"محتاجند": [{ORTH: "محتاج", NORM: "محتاج"}, {ORTH: "ند", NORM: "ند"}],
"محرمم": [{ORTH: "محرم", NORM: "محرم"}, {ORTH: "م", NORM: "م"}],
"محلش": [{ORTH: "محل", NORM: "محل"}, {ORTH: "ش", NORM: "ش"}],
"مخالفند": [{ORTH: "مخالف", NORM: "مخالف"}, {ORTH: "ند", NORM: "ند"}],
"مخدرش": [{ORTH: "مخدر", NORM: "مخدر"}, {ORTH: "ش", NORM: "ش"}],
"مدتهاست": [{ORTH: "مدتها", NORM: "مدتها"}, {ORTH: "ست", NORM: "ست"}],
"مدرسه‌ات": [{ORTH: "مدرسه", NORM: "مدرسه"}, {ORTH: "‌ات", NORM: "ات"}],
"مدرکم": [{ORTH: "مدرک", NORM: "مدرک"}, {ORTH: "م", NORM: "م"}],
"مدیرانش": [{ORTH: "مدیران", NORM: "مدیران"}, {ORTH: "ش", NORM: "ش"}],
"مدیونم": [{ORTH: "مدیون", NORM: "مدیون"}, {ORTH: "م", NORM: "م"}],
"مذهبی‌اند": [{ORTH: "مذهبی‌", NORM: "مذهبی‌"}, {ORTH: "اند", NORM: "اند"}],
"مرا": [{ORTH: "م", NORM: "م"}, {ORTH: "را", NORM: "را"}],
"مرادت": [{ORTH: "مراد", NORM: "مراد"}, {ORTH: "ت", NORM: "ت"}],
"مردمشان": [{ORTH: "مردم", NORM: "مردم"}, {ORTH: "شان", NORM: "شان"}],
"مردمند": [{ORTH: "مردم", NORM: "مردم"}, {ORTH: "ند", NORM: "ند"}],
"مردم‌اند": [{ORTH: "مردم‌", NORM: "مردم‌"}, {ORTH: "اند", NORM: "اند"}],
"مرزشان": [{ORTH: "مرز", NORM: "مرز"}, {ORTH: "شان", NORM: "شان"}],
"مرزهاشان": [{ORTH: "مرزها", NORM: "مرزها"}, {ORTH: "شان", NORM: "شان"}],
"مزدورش": [{ORTH: "مزدور", NORM: "مزدور"}, {ORTH: "ش", NORM: "ش"}],
"مسئولیتش": [{ORTH: "مسئولیت", NORM: "مسئولیت"}, {ORTH: "ش", NORM: "ش"}],
"مسائلش": [{ORTH: "مسائل", NORM: "مسائل"}, {ORTH: "ش", NORM: "ش"}],
"مستحضرید": [{ORTH: "مستحضر", NORM: "مستحضر"}, {ORTH: "ید", NORM: "ید"}],
"مسلمانم": [{ORTH: "مسلمان", NORM: "مسلمان"}, {ORTH: "م", NORM: "م"}],
"مسلمانند": [{ORTH: "مسلمان", NORM: "مسلمان"}, {ORTH: "ند", NORM: "ند"}],
"مشتریانش": [{ORTH: "مشتریان", NORM: "مشتریان"}, {ORTH: "ش", NORM: "ش"}],
"مشتهایمان": [{ORTH: "مشتهای", NORM: "مشتهای"}, {ORTH: "مان", NORM: "مان"}],
"مشخصند": [{ORTH: "مشخص", NORM: "مشخص"}, {ORTH: "ند", NORM: "ند"}],
"مشغولند": [{ORTH: "مشغول", NORM: "مشغول"}, {ORTH: "ند", NORM: "ند"}],
"مشغولیم": [{ORTH: "مشغول", NORM: "مشغول"}, {ORTH: "یم", NORM: "یم"}],
"مشهورش": [{ORTH: "مشهور", NORM: "مشهور"}, {ORTH: "ش", NORM: "ش"}],
"مشکلاتشان": [{ORTH: "مشکلات", NORM: "مشکلات"}, {ORTH: "شان", NORM: "شان"}],
"مشکلم": [{ORTH: "مشکل", NORM: "مشکل"}, {ORTH: "م", NORM: "م"}],
"مطمئنم": [{ORTH: "مطمئن", NORM: "مطمئن"}, {ORTH: "م", NORM: "م"}],
"معامله‌مان": [{ORTH: "معامله‌", NORM: "معامله‌"}, {ORTH: "مان", NORM: "مان"}],
"معتقدم": [{ORTH: "معتقد", NORM: "معتقد"}, {ORTH: "م", NORM: "م"}],
"معتقدند": [{ORTH: "معتقد", NORM: "معتقد"}, {ORTH: "ند", NORM: "ند"}],
"معتقدیم": [{ORTH: "معتقد", NORM: "معتقد"}, {ORTH: "یم", NORM: "یم"}],
"معرفی‌اش": [{ORTH: "معرفی‌", NORM: "معرفی‌"}, {ORTH: "اش", NORM: "اش"}],
"معروفش": [{ORTH: "معروف", NORM: "معروف"}, {ORTH: "ش", NORM: "ش"}],
"معضلاتمان": [{ORTH: "معضلات", NORM: "معضلات"}, {ORTH: "مان", NORM: "مان"}],
"معلمش": [{ORTH: "معلم", NORM: "معلم"}, {ORTH: "ش", NORM: "ش"}],
"معنایش": [{ORTH: "معنای", NORM: "معنای"}, {ORTH: "ش", NORM: "ش"}],
"مغزشان": [{ORTH: "مغز", NORM: "مغز"}, {ORTH: "شان", NORM: "شان"}],
"مفیدند": [{ORTH: "مفید", NORM: "مفید"}, {ORTH: "ند", NORM: "ند"}],
"مقابلش": [{ORTH: "مقابل", NORM: "مقابل"}, {ORTH: "ش", NORM: "ش"}],
"مقاله‌اش": [{ORTH: "مقاله‌", NORM: "مقاله‌"}, {ORTH: "اش", NORM: "اش"}],
"مقدمش": [{ORTH: "مقدم", NORM: "مقدم"}, {ORTH: "ش", NORM: "ش"}],
"مقرش": [{ORTH: "مقر", NORM: "مقر"}, {ORTH: "ش", NORM: "ش"}],
"مقصدشان": [{ORTH: "مقصد", NORM: "مقصد"}, {ORTH: "شان", NORM: "شان"}],
"مقصرند": [{ORTH: "مقصر", NORM: "مقصر"}, {ORTH: "ند", NORM: "ند"}],
"مقصودتان": [{ORTH: "مقصود", NORM: "مقصود"}, {ORTH: "تان", NORM: "تان"}],
"ملاقاتهایش": [{ORTH: "ملاقاتهای", NORM: "ملاقاتهای"}, {ORTH: "ش", NORM: "ش"}],
"ممکنشان": [{ORTH: "ممکن", NORM: "ممکن"}, {ORTH: "شان", NORM: "شان"}],
"ممیزیهاست": [{ORTH: "ممیزیها", NORM: "ممیزیها"}, {ORTH: "ست", NORM: "ست"}],
"منظورم": [{ORTH: "منظور", NORM: "منظور"}, {ORTH: "م", NORM: "م"}],
"منی": [{ORTH: "من", NORM: "من"}, {ORTH: "ی", NORM: "ی"}],
"منید": [{ORTH: "من", NORM: "من"}, {ORTH: "ید", NORM: "ید"}],
"مهربانش": [{ORTH: "مهربان", NORM: "مهربان"}, {ORTH: "ش", NORM: "ش"}],
"مهم‌اند": [{ORTH: "مهم‌", NORM: "مهم‌"}, {ORTH: "اند", NORM: "اند"}],
"مواجهند": [{ORTH: "مواجه", NORM: "مواجه"}, {ORTH: "ند", NORM: "ند"}],
"مواجه‌اند": [{ORTH: "مواجه‌", NORM: "مواجه‌"}, {ORTH: "اند", NORM: "اند"}],
"مواخذه‌ات": [{ORTH: "مواخذه", NORM: "مواخذه"}, {ORTH: "‌ات", NORM: "ات"}],
"مواضعشان": [{ORTH: "مواضع", NORM: "مواضع"}, {ORTH: "شان", NORM: "شان"}],
"مواضعمان": [{ORTH: "مواضع", NORM: "مواضع"}, {ORTH: "مان", NORM: "مان"}],
"موافقند": [{ORTH: "موافق", NORM: "موافق"}, {ORTH: "ند", NORM: "ند"}],
"موجوداتش": [{ORTH: "موجودات", NORM: "موجودات"}, {ORTH: "ش", NORM: "ش"}],
"موجودند": [{ORTH: "موجود", NORM: "موجود"}, {ORTH: "ند", NORM: "ند"}],
"موردش": [{ORTH: "مورد", NORM: "مورد"}, {ORTH: "ش", NORM: "ش"}],
"موضعشان": [{ORTH: "موضع", NORM: "موضع"}, {ORTH: "شان", NORM: "شان"}],
"موظفند": [{ORTH: "موظف", NORM: "موظف"}, {ORTH: "ند", NORM: "ند"}],
"موهایش": [{ORTH: "موهای", NORM: "موهای"}, {ORTH: "ش", NORM: "ش"}],
"موهایمان": [{ORTH: "موهای", NORM: "موهای"}, {ORTH: "مان", NORM: "مان"}],
"مویم": [{ORTH: "مو", NORM: "مو"}, {ORTH: "یم", NORM: "یم"}],
"ناخرسندند": [{ORTH: "ناخرسند", NORM: "ناخرسند"}, {ORTH: "ند", NORM: "ند"}],
"ناراحتیش": [{ORTH: "ناراحتی", NORM: "ناراحتی"}, {ORTH: "ش", NORM: "ش"}],
"ناراضی‌اند": [{ORTH: "ناراضی‌", NORM: "ناراضی‌"}, {ORTH: "اند", NORM: "اند"}],
"نارواست": [{ORTH: "ناروا", NORM: "ناروا"}, {ORTH: "ست", NORM: "ست"}],
"نازش": [{ORTH: "ناز", NORM: "ناز"}, {ORTH: "ش", NORM: "ش"}],
"نامش": [{ORTH: "نام", NORM: "نام"}, {ORTH: "ش", NORM: "ش"}],
"نامشان": [{ORTH: "نام", NORM: "نام"}, {ORTH: "شان", NORM: "شان"}],
"نامم": [{ORTH: "نام", NORM: "نام"}, {ORTH: "م", NORM: "م"}],
"نامه‌ات": [{ORTH: "نامه", NORM: "نامه"}, {ORTH: "‌ات", NORM: "ات"}],
"نامه‌ام": [{ORTH: "نامه‌", NORM: "نامه‌"}, {ORTH: "ام", NORM: "ام"}],
"ناچارم": [{ORTH: "ناچار", NORM: "ناچار"}, {ORTH: "م", NORM: "م"}],
"نخست‌وزیری‌اش": [
{ORTH: "نخست‌وزیری‌", NORM: "نخست‌وزیری‌"},
{ORTH: "اش", NORM: "اش"},
],
"نزدش": [{ORTH: "نزد", NORM: "نزد"}, {ORTH: "ش", NORM: "ش"}],
"نشانم": [{ORTH: "نشان", NORM: "نشان"}, {ORTH: "م", NORM: "م"}],
"نظرات‌شان": [{ORTH: "نظرات‌", NORM: "نظرات‌"}, {ORTH: "شان", NORM: "شان"}],
"نظرتان": [{ORTH: "نظر", NORM: "نظر"}, {ORTH: "تان", NORM: "تان"}],
"نظرش": [{ORTH: "نظر", NORM: "نظر"}, {ORTH: "ش", NORM: "ش"}],
"نظرشان": [{ORTH: "نظر", NORM: "نظر"}, {ORTH: "شان", NORM: "شان"}],
"نظرم": [{ORTH: "نظر", NORM: "نظر"}, {ORTH: "م", NORM: "م"}],
"نظرهایشان": [{ORTH: "نظرهای", NORM: "نظرهای"}, {ORTH: "شان", NORM: "شان"}],
"نفاقش": [{ORTH: "نفاق", NORM: "نفاق"}, {ORTH: "ش", NORM: "ش"}],
"نفرند": [{ORTH: "نفر", NORM: "نفر"}, {ORTH: "ند", NORM: "ند"}],
"نفوذیند": [{ORTH: "نفوذی", NORM: "نفوذی"}, {ORTH: "ند", NORM: "ند"}],
"نقطه‌نظراتتان": [
{ORTH: "نقطه‌نظرات", NORM: "نقطه‌نظرات"},
{ORTH: "تان", NORM: "تان"},
],
"نمایشی‌مان": [{ORTH: "نمایشی‌", NORM: "نمایشی‌"}, {ORTH: "مان", NORM: "مان"}],
"نمایندگی‌شان": [
{ORTH: "نمایندگی‌", NORM: "نمایندگی‌"},
{ORTH: "شان", NORM: "شان"},
],
"نمونه‌اش": [{ORTH: "نمونه‌", NORM: "نمونه‌"}, {ORTH: "اش", NORM: "اش"}],
"نمی‌پذیرندش": [{ORTH: "نمی‌پذیرند", NORM: "نمی‌پذیرند"}, {ORTH: "ش", NORM: "ش"}],
"نوآوری‌اش": [{ORTH: "نوآوری‌", NORM: "نوآوری‌"}, {ORTH: "اش", NORM: "اش"}],
"نوشته‌هایشان": [
{ORTH: "نوشته‌های", NORM: "نوشته‌های"},
{ORTH: "شان", NORM: "شان"},
],
"نوشته‌هایم": [{ORTH: "نوشته‌ها", NORM: "نوشته‌ها"}, {ORTH: "یم", NORM: "یم"}],
"نکردنشان": [{ORTH: "نکردن", NORM: "نکردن"}, {ORTH: "شان", NORM: "شان"}],
"نگاهداری‌شان": [
{ORTH: "نگاهداری‌", NORM: "نگاهداری‌"},
{ORTH: "شان", NORM: "شان"},
],
"نگاهش": [{ORTH: "نگاه", NORM: "نگاه"}, {ORTH: "ش", NORM: "ش"}],
"نگرانم": [{ORTH: "نگران", NORM: "نگران"}, {ORTH: "م", NORM: "م"}],
"نگرشهایشان": [{ORTH: "نگرشهای", NORM: "نگرشهای"}, {ORTH: "شان", NORM: "شان"}],
"نیازمندند": [{ORTH: "نیازمند", NORM: "نیازمند"}, {ORTH: "ند", NORM: "ند"}],
"هدفش": [{ORTH: "هدف", NORM: "هدف"}, {ORTH: "ش", NORM: "ش"}],
"همانست": [{ORTH: "همان", NORM: "همان"}, {ORTH: "ست", NORM: "ست"}],
"همراهش": [{ORTH: "همراه", NORM: "همراه"}, {ORTH: "ش", NORM: "ش"}],
"همسرتان": [{ORTH: "همسر", NORM: "همسر"}, {ORTH: "تان", NORM: "تان"}],
"همسرش": [{ORTH: "همسر", NORM: "همسر"}, {ORTH: "ش", NORM: "ش"}],
"همسرم": [{ORTH: "همسر", NORM: "همسر"}, {ORTH: "م", NORM: "م"}],
"همفکرانش": [{ORTH: "همفکران", NORM: "همفکران"}, {ORTH: "ش", NORM: "ش"}],
"همه‌اش": [{ORTH: "همه‌", NORM: "همه‌"}, {ORTH: "اش", NORM: "اش"}],
"همه‌شان": [{ORTH: "همه‌", NORM: "همه‌"}, {ORTH: "شان", NORM: "شان"}],
"همکارانش": [{ORTH: "همکاران", NORM: "همکاران"}, {ORTH: "ش", NORM: "ش"}],
"هم‌نظریم": [{ORTH: "هم‌نظر", NORM: "هم‌نظر"}, {ORTH: "یم", NORM: "یم"}],
"هنرش": [{ORTH: "هنر", NORM: "هنر"}, {ORTH: "ش", NORM: "ش"}],
"هواست": [{ORTH: "هوا", NORM: "هوا"}, {ORTH: "ست", NORM: "ست"}],
"هویتش": [{ORTH: "هویت", NORM: "هویت"}, {ORTH: "ش", NORM: "ش"}],
"وابسته‌اند": [{ORTH: "وابسته‌", NORM: "وابسته‌"}, {ORTH: "اند", NORM: "اند"}],
"واقفند": [{ORTH: "واقف", NORM: "واقف"}, {ORTH: "ند", NORM: "ند"}],
"والدینشان": [{ORTH: "والدین", NORM: "والدین"}, {ORTH: "شان", NORM: "شان"}],
"وجدان‌تان": [{ORTH: "وجدان‌", NORM: "وجدان‌"}, {ORTH: "تان", NORM: "تان"}],
"وجودشان": [{ORTH: "وجود", NORM: "وجود"}, {ORTH: "شان", NORM: "شان"}],
"وطنم": [{ORTH: "وطن", NORM: "وطن"}, {ORTH: "م", NORM: "م"}],
"وعده‌اش": [{ORTH: "وعده‌", NORM: "وعده‌"}, {ORTH: "اش", NORM: "اش"}],
"وقتمان": [{ORTH: "وقت", NORM: "وقت"}, {ORTH: "مان", NORM: "مان"}],
"ولادتش": [{ORTH: "ولادت", NORM: "ولادت"}, {ORTH: "ش", NORM: "ش"}],
"پایانش": [{ORTH: "پایان", NORM: "پایان"}, {ORTH: "ش", NORM: "ش"}],
"پایش": [{ORTH: "پای", NORM: "پای"}, {ORTH: "ش", NORM: "ش"}],
"پایین‌ترند": [{ORTH: "پایین‌تر", NORM: "پایین‌تر"}, {ORTH: "ند", NORM: "ند"}],
"پدرت": [{ORTH: "پدر", NORM: "پدر"}, {ORTH: "ت", NORM: "ت"}],
"پدرش": [{ORTH: "پدر", NORM: "پدر"}, {ORTH: "ش", NORM: "ش"}],
"پدرشان": [{ORTH: "پدر", NORM: "پدر"}, {ORTH: "شان", NORM: "شان"}],
"پدرم": [{ORTH: "پدر", NORM: "پدر"}, {ORTH: "م", NORM: "م"}],
"پربارش": [{ORTH: "پربار", NORM: "پربار"}, {ORTH: "ش", NORM: "ش"}],
"پروردگارت": [{ORTH: "پروردگار", NORM: "پروردگار"}, {ORTH: "ت", NORM: "ت"}],
"پسرتان": [{ORTH: "پسر", NORM: "پسر"}, {ORTH: "تان", NORM: "تان"}],
"پسرش": [{ORTH: "پسر", NORM: "پسر"}, {ORTH: "ش", NORM: "ش"}],
"پسرعمویش": [{ORTH: "پسرعموی", NORM: "پسرعموی"}, {ORTH: "ش", NORM: "ش"}],
"پسر‌عمویت": [{ORTH: "پسر‌عموی", NORM: "پسر‌عموی"}, {ORTH: "ت", NORM: "ت"}],
"پشتش": [{ORTH: "پشت", NORM: "پشت"}, {ORTH: "ش", NORM: "ش"}],
"پشیمونی": [{ORTH: "پشیمون", NORM: "پشیمون"}, {ORTH: "ی", NORM: "ی"}],
"پولش": [{ORTH: "پول", NORM: "پول"}, {ORTH: "ش", NORM: "ش"}],
"پژوهش‌هایش": [{ORTH: "پژوهش‌های", NORM: "پژوهش‌های"}, {ORTH: "ش", NORM: "ش"}],
"پیامبرش": [{ORTH: "پیامبر", NORM: "پیامبر"}, {ORTH: "ش", NORM: "ش"}],
"پیامبری": [{ORTH: "پیامبر", NORM: "پیامبر"}, {ORTH: "ی", NORM: "ی"}],
"پیامش": [{ORTH: "پیام", NORM: "پیام"}, {ORTH: "ش", NORM: "ش"}],
"پیداست": [{ORTH: "پیدا", NORM: "پیدا"}, {ORTH: "ست", NORM: "ست"}],
"پیراهنش": [{ORTH: "پیراهن", NORM: "پیراهن"}, {ORTH: "ش", NORM: "ش"}],
"پیروانش": [{ORTH: "پیروان", NORM: "پیروان"}, {ORTH: "ش", NORM: "ش"}],
"پیشانی‌اش": [{ORTH: "پیشانی‌", NORM: "پیشانی‌"}, {ORTH: "اش", NORM: "اش"}],
"پیمانت": [{ORTH: "پیمان", NORM: "پیمان"}, {ORTH: "ت", NORM: "ت"}],
"پیوندشان": [{ORTH: "پیوند", NORM: "پیوند"}, {ORTH: "شان", NORM: "شان"}],
"چاپش": [{ORTH: "چاپ", NORM: "چاپ"}, {ORTH: "ش", NORM: "ش"}],
"چت": [{ORTH: "چ", NORM: "چ"}, {ORTH: "ت", NORM: "ت"}],
"چته": [{ORTH: "چ", NORM: "چ"}, {ORTH: "ت", NORM: "ت"}, {ORTH: "ه", NORM: "ه"}],
"چرخ‌هایش": [{ORTH: "چرخ‌های", NORM: "چرخ‌های"}, {ORTH: "ش", NORM: "ش"}],
"چشمم": [{ORTH: "چشم", NORM: "چشم"}, {ORTH: "م", NORM: "م"}],
"چشمهایش": [{ORTH: "چشمهای", NORM: "چشمهای"}, {ORTH: "ش", NORM: "ش"}],
"چشمهایشان": [{ORTH: "چشمهای", NORM: "چشمهای"}, {ORTH: "شان", NORM: "شان"}],
"چمنم": [{ORTH: "چمن", NORM: "چمن"}, {ORTH: "م", NORM: "م"}],
"چهره‌اش": [{ORTH: "چهره‌", NORM: "چهره‌"}, {ORTH: "اش", NORM: "اش"}],
"چکاره‌اند": [{ORTH: "چکاره‌", NORM: "چکاره‌"}, {ORTH: "اند", NORM: "اند"}],
"چیزهاست": [{ORTH: "چیزها", NORM: "چیزها"}, {ORTH: "ست", NORM: "ست"}],
"چیزهایش": [{ORTH: "چیزهای", NORM: "چیزهای"}, {ORTH: "ش", NORM: "ش"}],
"چیزیست": [{ORTH: "چیزی", NORM: "چیزی"}, {ORTH: "ست", NORM: "ست"}],
"چیست": [{ORTH: "چی", NORM: "چی"}, {ORTH: "ست", NORM: "ست"}],
"کارش": [{ORTH: "کار", NORM: "کار"}, {ORTH: "ش", NORM: "ش"}],
"کارشان": [{ORTH: "کار", NORM: "کار"}, {ORTH: "شان", NORM: "شان"}],
"کارم": [{ORTH: "کار", NORM: "کار"}, {ORTH: "م", NORM: "م"}],
"کارند": [{ORTH: "کار", NORM: "کار"}, {ORTH: "ند", NORM: "ند"}],
"کارهایم": [{ORTH: "کارها", NORM: "کارها"}, {ORTH: "یم", NORM: "یم"}],
"کافیست": [{ORTH: "کافی", NORM: "کافی"}, {ORTH: "ست", NORM: "ست"}],
"کتابخانه‌اش": [{ORTH: "کتابخانه‌", NORM: "کتابخانه‌"}, {ORTH: "اش", NORM: "اش"}],
"کتابش": [{ORTH: "کتاب", NORM: "کتاب"}, {ORTH: "ش", NORM: "ش"}],
"کتابهاشان": [{ORTH: "کتابها", NORM: "کتابها"}, {ORTH: "شان", NORM: "شان"}],
"کجاست": [{ORTH: "کجا", NORM: "کجا"}, {ORTH: "ست", NORM: "ست"}],
"کدورتهایشان": [{ORTH: "کدورتهای", NORM: "کدورتهای"}, {ORTH: "شان", NORM: "شان"}],
"کردنش": [{ORTH: "کردن", NORM: "کردن"}, {ORTH: "ش", NORM: "ش"}],
"کرم‌خورده‌اش": [
{ORTH: "کرم‌خورده‌", NORM: "کرم‌خورده‌"},
{ORTH: "اش", NORM: "اش"},
],
"کشش": [{ORTH: "کش", NORM: "کش"}, {ORTH: "ش", NORM: "ش"}],
"کشورش": [{ORTH: "کشور", NORM: "کشور"}, {ORTH: "ش", NORM: "ش"}],
"کشورشان": [{ORTH: "کشور", NORM: "کشور"}, {ORTH: "شان", NORM: "شان"}],
"کشورمان": [{ORTH: "کشور", NORM: "کشور"}, {ORTH: "مان", NORM: "مان"}],
"کشورهاست": [{ORTH: "کشورها", NORM: "کشورها"}, {ORTH: "ست", NORM: "ست"}],
"کلیشه‌هاست": [{ORTH: "کلیشه‌ها", NORM: "کلیشه‌ها"}, {ORTH: "ست", NORM: "ست"}],
"کمبودهاست": [{ORTH: "کمبودها", NORM: "کمبودها"}, {ORTH: "ست", NORM: "ست"}],
"کمتره": [{ORTH: "کمتر", NORM: "کمتر"}, {ORTH: "ه", NORM: "ه"}],
"کمکم": [{ORTH: "کمک", NORM: "کمک"}, {ORTH: "م", NORM: "م"}],
"کنارش": [{ORTH: "کنار", NORM: "کنار"}, {ORTH: "ش", NORM: "ش"}],
"کودکانشان": [{ORTH: "کودکان", NORM: "کودکان"}, {ORTH: "شان", NORM: "شان"}],
"کوچکش": [{ORTH: "کوچک", NORM: "کوچک"}, {ORTH: "ش", NORM: "ش"}],
"کیست": [{ORTH: "کی", NORM: "کی"}, {ORTH: "ست", NORM: "ست"}],
"کیفش": [{ORTH: "کیف", NORM: "کیف"}, {ORTH: "ش", NORM: "ش"}],
"گذشته‌اند": [{ORTH: "گذشته‌", NORM: "گذشته‌"}, {ORTH: "اند", NORM: "اند"}],
"گرانقدرش": [{ORTH: "گرانقدر", NORM: "گرانقدر"}, {ORTH: "ش", NORM: "ش"}],
"گرانقدرشان": [{ORTH: "گرانقدر", NORM: "گرانقدر"}, {ORTH: "شان", NORM: "شان"}],
"گردنتان": [{ORTH: "گردن", NORM: "گردن"}, {ORTH: "تان", NORM: "تان"}],
"گردنش": [{ORTH: "گردن", NORM: "گردن"}, {ORTH: "ش", NORM: "ش"}],
"گرفتارند": [{ORTH: "گرفتار", NORM: "گرفتار"}, {ORTH: "ند", NORM: "ند"}],
"گرفتنت": [{ORTH: "گرفتن", NORM: "گرفتن"}, {ORTH: "ت", NORM: "ت"}],
"گروهند": [{ORTH: "گروه", NORM: "گروه"}, {ORTH: "ند", NORM: "ند"}],
"گروگانهایش": [{ORTH: "گروگانهای", NORM: "گروگانهای"}, {ORTH: "ش", NORM: "ش"}],
"گریمش": [{ORTH: "گریم", NORM: "گریم"}, {ORTH: "ش", NORM: "ش"}],
"گفتارمان": [{ORTH: "گفتار", NORM: "گفتار"}, {ORTH: "مان", NORM: "مان"}],
"گلهایش": [{ORTH: "گلهای", NORM: "گلهای"}, {ORTH: "ش", NORM: "ش"}],
"گلویش": [{ORTH: "گلوی", NORM: "گلوی"}, {ORTH: "ش", NORM: "ش"}],
"گناهت": [{ORTH: "گناه", NORM: "گناه"}, {ORTH: "ت", NORM: "ت"}],
"گوشش": [{ORTH: "گوش", NORM: "گوش"}, {ORTH: "ش", NORM: "ش"}],
"گوشم": [{ORTH: "گوش", NORM: "گوش"}, {ORTH: "م", NORM: "م"}],
"گولش": [{ORTH: "گول", NORM: "گول"}, {ORTH: "ش", NORM: "ش"}],
"یادتان": [{ORTH: "یاد", NORM: "یاد"}, {ORTH: "تان", NORM: "تان"}],
"یادم": [{ORTH: "یاد", NORM: "یاد"}, {ORTH: "م", NORM: "م"}],
"یادمان": [{ORTH: "یاد", NORM: "یاد"}, {ORTH: "مان", NORM: "مان"}],
"یارانش": [{ORTH: "یاران", NORM: "یاران"}, {ORTH: "ش", NORM: "ش"}],
💫 Port master changes over to develop (#2979) * Create aryaprabhudesai.md (#2681) * Update _install.jade (#2688) Typo fix: "models" -> "model" * Add FAC to spacy.explain (resolves #2706) * Remove docstrings for deprecated arguments (see #2703) * When calling getoption() in conftest.py, pass a default option (#2709) * When calling getoption() in conftest.py, pass a default option This is necessary to allow testing an installed spacy by running: pytest --pyargs spacy * Add contributor agreement * update bengali token rules for hyphen and digits (#2731) * Less norm computations in token similarity (#2730) * Less norm computations in token similarity * Contributor agreement * Remove ')' for clarity (#2737) Sorry, don't mean to be nitpicky, I just noticed this when going through the CLI and thought it was a quick fix. That said, if this was intention than please let me know. * added contributor agreement for mbkupfer (#2738) * Basic support for Telugu language (#2751) * Lex _attrs for polish language (#2750) * Signed spaCy contributor agreement * Added polish version of english lex_attrs * Introduces a bulk merge function, in order to solve issue #653 (#2696) * Fix comment * Introduce bulk merge to increase performance on many span merges * Sign contributor agreement * Implement pull request suggestions * Describe converters more explicitly (see #2643) * Add multi-threading note to Language.pipe (resolves #2582) [ci skip] * Fix formatting * Fix dependency scheme docs (closes #2705) [ci skip] * Don't set stop word in example (closes #2657) [ci skip] * Add words to portuguese language _num_words (#2759) * Add words to portuguese language _num_words * Add words to portuguese language _num_words * Update Indonesian model (#2752) * adding e-KTP in tokenizer exceptions list * add exception token * removing lines with containing space as it won't matter since we use .split() method in the end, added new tokens in exception * add tokenizer exceptions list * combining base_norms with norm_exceptions * adding norm_exception * fix double key in lemmatizer * remove unused import on punctuation.py * reformat stop_words to reduce number of lines, improve readibility * updating tokenizer exception * implement is_currency for lang/id * adding orth_first_upper in tokenizer_exceptions * update the norm_exception list * remove bunch of abbreviations * adding contributors file * Fixed spaCy+Keras example (#2763) * bug fixes in keras example * created contributor agreement * Adding French hyphenated first name (#2786) * Fix typo (closes #2784) * Fix typo (#2795) [ci skip] Fixed typo on line 6 "regcognizer --> recognizer" * Adding basic support for Sinhala language. (#2788) * adding Sinhala language package, stop words, examples and lex_attrs. * Adding contributor agreement * Updating contributor agreement * Also include lowercase norm exceptions * Fix error (#2802) * Fix error ValueError: cannot resize an array that references or is referenced by another array in this way. Use the resize function * added spaCy Contributor Agreement * Add charlax's contributor agreement (#2805) * agreement of contributor, may I introduce a tiny pl languge contribution (#2799) * Contributors agreement * Contributors agreement * Contributors agreement * Add jupyter=True to displacy.render in documentation (#2806) * Revert "Also include lowercase norm exceptions" This reverts commit 70f4e8adf37cfcfab60be2b97d6deae949b30e9e. * Remove deprecated encoding argument to msgpack * Set up dependency tree pattern matching skeleton (#2732) * Fix bug when too many entity types. Fixes #2800 * Fix Python 2 test failure * Require older msgpack-numpy * Restore encoding arg on msgpack-numpy * Try to fix version pin for msgpack-numpy * Update Portuguese Language (#2790) * Add words to portuguese language _num_words * Add words to portuguese language _num_words * Portuguese - Add/remove stopwords, fix tokenizer, add currency symbols * Extended punctuation and norm_exceptions in the Portuguese language * Correct error in spacy universe docs concerning spacy-lookup (#2814) * Update Keras Example for (Parikh et al, 2016) implementation (#2803) * bug fixes in keras example * created contributor agreement * baseline for Parikh model * initial version of parikh 2016 implemented * tested asymmetric models * fixed grevious error in normalization * use standard SNLI test file * begin to rework parikh example * initial version of running example * start to document the new version * start to document the new version * Update Decompositional Attention.ipynb * fixed calls to similarity * updated the README * import sys package duh * simplified indexing on mapping word to IDs * stupid python indent error * added code from https://github.com/tensorflow/tensorflow/issues/3388 for tf bug workaround * Fix typo (closes #2815) [ci skip] * Update regex version dependency * Set version to 2.0.13.dev3 * Skip seemingly problematic test * Remove problematic test * Try previous version of regex * Revert "Remove problematic test" This reverts commit bdebbef45552d698d390aa430b527ee27830f11b. * Unskip test * Try older version of regex * 💫 Update training examples and use minibatching (#2830) <!--- Provide a general summary of your changes in the title. --> ## Description Update the training examples in `/examples/training` to show usage of spaCy's `minibatch` and `compounding` helpers ([see here](https://spacy.io/usage/training#tips-batch-size) for details). The lack of batching in the examples has caused some confusion in the past, especially for beginners who would copy-paste the examples, update them with large training sets and experienced slow and unsatisfying results. ### Types of change enhancements ## Checklist <!--- Before you submit the PR, go over this checklist and make sure you can tick off all the boxes. [] -> [x] --> - [x] I have submitted the spaCy Contributor Agreement. - [x] I ran the tests, and all new and existing tests passed. - [x] My changes don't require a change to the documentation, or if they do, I've added all required information. * Visual C++ link updated (#2842) (closes #2841) [ci skip] * New landing page * Add contribution agreement * Correcting lang/ru/examples.py (#2845) * Correct some grammatical inaccuracies in lang\ru\examples.py; filled Contributor Agreement * Correct some grammatical inaccuracies in lang\ru\examples.py * Move contributor agreement to separate file * Set version to 2.0.13.dev4 * Add Persian(Farsi) language support (#2797) * Also include lowercase norm exceptions * Remove in favour of https://github.com/explosion/spaCy/graphs/contributors * Rule-based French Lemmatizer (#2818) <!--- Provide a general summary of your changes in the title. --> ## Description <!--- Use this section to describe your changes. If your changes required testing, include information about the testing environment and the tests you ran. If your test fixes a bug reported in an issue, don't forget to include the issue number. If your PR is still a work in progress, that's totally fine – just include a note to let us know. --> Add a rule-based French Lemmatizer following the english one and the excellent PR for [greek language optimizations](https://github.com/explosion/spaCy/pull/2558) to adapt the Lemmatizer class. ### Types of change <!-- What type of change does your PR cover? Is it a bug fix, an enhancement or new feature, or a change to the documentation? --> - Lemma dictionary used can be found [here](http://infolingu.univ-mlv.fr/DonneesLinguistiques/Dictionnaires/telechargement.html), I used the XML version. - Add several files containing exhaustive list of words for each part of speech - Add some lemma rules - Add POS that are not checked in the standard Lemmatizer, i.e PRON, DET, ADV and AUX - Modify the Lemmatizer class to check in lookup table as a last resort if POS not mentionned - Modify the lemmatize function to check in lookup table as a last resort - Init files are updated so the model can support all the functionalities mentioned above - Add words to tokenizer_exceptions_list.py in respect to regex used in tokenizer_exceptions.py ## Checklist <!--- Before you submit the PR, go over this checklist and make sure you can tick off all the boxes. [] -> [x] --> - [X] I have submitted the spaCy Contributor Agreement. - [X] I ran the tests, and all new and existing tests passed. - [X] My changes don't require a change to the documentation, or if they do, I've added all required information. * Set version to 2.0.13 * Fix formatting and consistency * Update docs for new version [ci skip] * Increment version [ci skip] * Add info on wheels [ci skip] * Adding "This is a sentence" example to Sinhala (#2846) * Add wheels badge * Update badge [ci skip] * Update README.rst [ci skip] * Update murmurhash pin * Increment version to 2.0.14.dev0 * Update GPU docs for v2.0.14 * Add wheel to setup_requires * Import prefer_gpu and require_gpu functions from Thinc * Add tests for prefer_gpu() and require_gpu() * Update requirements and setup.py * Workaround bug in thinc require_gpu * Set version to v2.0.14 * Update push-tag script * Unhack prefer_gpu * Require thinc 6.10.6 * Update prefer_gpu and require_gpu docs [ci skip] * Fix specifiers for GPU * Set version to 2.0.14.dev1 * Set version to 2.0.14 * Update Thinc version pin * Increment version * Fix msgpack-numpy version pin * Increment version * Update version to 2.0.16 * Update version [ci skip] * Redundant ')' in the Stop words' example (#2856) <!--- Provide a general summary of your changes in the title. --> ## Description <!--- Use this section to describe your changes. If your changes required testing, include information about the testing environment and the tests you ran. If your test fixes a bug reported in an issue, don't forget to include the issue number. If your PR is still a work in progress, that's totally fine – just include a note to let us know. --> ### Types of change <!-- What type of change does your PR cover? Is it a bug fix, an enhancement or new feature, or a change to the documentation? --> ## Checklist <!--- Before you submit the PR, go over this checklist and make sure you can tick off all the boxes. [] -> [x] --> - [ ] I have submitted the spaCy Contributor Agreement. - [ ] I ran the tests, and all new and existing tests passed. - [ ] My changes don't require a change to the documentation, or if they do, I've added all required information. * Documentation improvement regarding joblib and SO (#2867) Some documentation improvements ## Description 1. Fixed the dead URL to joblib 2. Fixed Stack Overflow brand name (with space) ### Types of change Documentation ## Checklist <!--- Before you submit the PR, go over this checklist and make sure you can tick off all the boxes. [] -> [x] --> - [x] I have submitted the spaCy Contributor Agreement. - [x] I ran the tests, and all new and existing tests passed. - [x] My changes don't require a change to the documentation, or if they do, I've added all required information. * raise error when setting overlapping entities as doc.ents (#2880) * Fix out-of-bounds access in NER training The helper method state.B(1) gets the index of the first token of the buffer, or -1 if no such token exists. Normally this is safe because we pass this to functions like state.safe_get(), which returns an empty token. Here we used it directly as an array index, which is not okay! This error may have been the cause of out-of-bounds access errors during training. Similar errors may still be around, so much be hunted down. Hunting this one down took a long time...I printed out values across training runs and diffed, looking for points of divergence between runs, when no randomness should be allowed. * Change PyThaiNLP Url (#2876) * Fix missing comma * Add example showing a fix-up rule for space entities * Set version to 2.0.17.dev0 * Update regex version * Revert "Update regex version" This reverts commit 62358dd867d15bc6a475942dff34effba69dd70a. * Try setting older regex version, to align with conda * Set version to 2.0.17 * Add spacy-js to universe [ci-skip] * Add spacy-raspberry to universe (closes #2889) * Add script to validate universe json [ci skip] * Removed space in docs + added contributor indo (#2909) * - removed unneeded space in documentation * - added contributor info * Allow input text of length up to max_length, inclusive (#2922) * Include universe spec for spacy-wordnet component (#2919) * feat: include universe spec for spacy-wordnet component * chore: include spaCy contributor agreement * Minor formatting changes [ci skip] * Fix image [ci skip] Twitter URL doesn't work on live site * Check if the word is in one of the regular lists specific to each POS (#2886) * 💫 Create random IDs for SVGs to prevent ID clashes (#2927) Resolves #2924. ## Description Fixes problem where multiple visualizations in Jupyter notebooks would have clashing arc IDs, resulting in weirdly positioned arc labels. Generating a random ID prefix so even identical parses won't receive the same IDs for consistency (even if effect of ID clash isn't noticable here.) ### Types of change bug fix ## Checklist <!--- Before you submit the PR, go over this checklist and make sure you can tick off all the boxes. [] -> [x] --> - [x] I have submitted the spaCy Contributor Agreement. - [x] I ran the tests, and all new and existing tests passed. - [x] My changes don't require a change to the documentation, or if they do, I've added all required information. * Fix typo [ci skip] * fixes symbolic link on py3 and windows (#2949) * fixes symbolic link on py3 and windows during setup of spacy using command python -m spacy link en_core_web_sm en closes #2948 * Update spacy/compat.py Co-Authored-By: cicorias <cicorias@users.noreply.github.com> * Fix formatting * Update universe [ci skip] * Catalan Language Support (#2940) * Catalan language Support * Ddding Catalan to documentation * Sort languages alphabetically [ci skip] * Update tests for pytest 4.x (#2965) <!--- Provide a general summary of your changes in the title. --> ## Description - [x] Replace marks in params for pytest 4.0 compat ([see here](https://docs.pytest.org/en/latest/deprecations.html#marks-in-pytest-mark-parametrize)) - [x] Un-xfail passing tests (some fixes in a recent update resolved a bunch of issues, but tests were apparently never updated here) ### Types of change <!-- What type of change does your PR cover? Is it a bug fix, an enhancement or new feature, or a change to the documentation? --> ## Checklist <!--- Before you submit the PR, go over this checklist and make sure you can tick off all the boxes. [] -> [x] --> - [x] I have submitted the spaCy Contributor Agreement. - [x] I ran the tests, and all new and existing tests passed. - [x] My changes don't require a change to the documentation, or if they do, I've added all required information. * Fix regex pin to harmonize with conda (#2964) * Update README.rst * Fix bug where Vocab.prune_vector did not use 'batch_size' (#2977) Fixes #2976 * Fix typo * Fix typo * Remove duplicate file * Require thinc 7.0.0.dev2 Fixes bug in gpu_ops that would use cupy instead of numpy on CPU * Add missing import * Fix error IDs * Fix tests
2018-11-29 18:30:29 +03:00
}