* Test spaCy commit to git, Fri 04052019 10:54
* Change data format from my format to master format
* Fix typo: ทัทั้งนี้ ---> ทั้งนี้
* Delete stop words translated from English
* Adjust formatting and readability
* Add Thai norm_exception
* Add Dobita21 SCA
* Edit รึ : หรือ
* Update Dobita21.md
* Auto-format
* Integrate norms into language defaults
* Add acronyms and some norm exception words
Co-authored-by: Ines Montani <ines@ines.io>
* Add tag_map for Indonesian
* Change tag map from .py to .txt to see if tests pass
* Add symbols import
* Add UTF-8 encoding flag
* Add missing SCONJ symbol
* Auto-format
* Remove unused imports
* Make tag map available in Indonesian defaults
I wrote a small script to read the UD English training data and check
that our tag map and morph rules produce the best POS map.
This hadn't been done for some time, and the UD schema has changed in
various ways since the last check. After these changes we should
see much better agreement between our POS assignments and the UD POS
tags.
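
The kind of check described above can be sketched roughly as follows. This is not the script from this commit: the function names (`read_conllu_tags`, `best_pos_by_tag`, `find_disagreements`), the toy data, and the toy tag map are all hypothetical. It assumes CoNLL-U input, where the fourth column is the UD POS (UPOS) and the fifth is the fine-grained tag (XPOS), and it ignores morph rules entirely.

```python
from collections import Counter, defaultdict


def read_conllu_tags(text):
    """Yield (fine_tag, ud_pos) pairs from CoNLL-U formatted text."""
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        # Skip malformed rows, multiword token ranges (1-2) and empty nodes (1.1).
        if len(cols) < 5 or "-" in cols[0] or "." in cols[0]:
            continue
        yield cols[4], cols[3]  # XPOS, UPOS


def best_pos_by_tag(tagged_tokens):
    """Map each fine-grained tag to the UD POS it co-occurs with most often."""
    counts = defaultdict(Counter)
    for tag, pos in tagged_tokens:
        counts[tag][pos] += 1
    return {tag: c.most_common(1)[0][0] for tag, c in counts.items()}


def find_disagreements(tag_map, tagged_tokens):
    """Return {tag: (mapped_pos, most_frequent_pos)} where the two differ."""
    best = best_pos_by_tag(tagged_tokens)
    return {
        tag: (tag_map.get(tag), pos)
        for tag, pos in best.items()
        if tag_map.get(tag) != pos
    }


# Toy CoNLL-U rows (hypothetical data, only the first five columns are filled).
rows = [
    ("1", "dogs", "dog", "NOUN", "NNS"),
    ("2", "in", "in", "ADP", "IN"),
    ("3", "parks", "park", "NOUN", "NNS"),
    ("4", "on", "on", "ADP", "IN"),
    ("5", "that", "that", "SCONJ", "IN"),
]
sample_text = "# sent_id = toy\n" + "\n".join("\t".join(r) for r in rows)

# Hypothetical tag map with one entry that disagrees with the data.
toy_tag_map = {"NNS": "NOUN", "IN": "SCONJ"}

pairs = list(read_conllu_tags(sample_text))
disagreements = find_disagreements(toy_tag_map, pairs)
print(disagreements)
```

Taking the most frequent UD POS per tag is only one possible definition of the "best" map; a real check would also need to account for the cases the morph rules override.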
* Add xfail test for vocab.writing_system
* Add vocab.writing_system property
* Set Language.Defaults.writing_system
* Set default writing system
* Remove xfail on test_vocab_writing_system
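
As a rough illustration of the pattern in the commits above (language defaults carry a writing system description, and the vocab exposes it as a read-only property), here is a minimal standalone sketch. It does not use spaCy and is not spaCy's implementation: the classes `ToyDefaults`, `ToyRtlDefaults`, `ToyVocab`, the helper `create_vocab`, and the dictionary keys and values are assumptions made for illustration.

```python
class ToyDefaults:
    # Assumed default: left-to-right script with letters and case.
    writing_system = {"direction": "ltr", "has_case": True, "has_letters": True}


class ToyRtlDefaults(ToyDefaults):
    # Hypothetical override for a right-to-left script without case.
    writing_system = {"direction": "rtl", "has_case": False, "has_letters": True}


class ToyVocab:
    def __init__(self, writing_system=None):
        # Copy so later changes to the defaults do not leak into the vocab.
        self._writing_system = dict(writing_system or {})

    @property
    def writing_system(self):
        """Read-only view of the writing system settings."""
        return self._writing_system


def create_vocab(defaults):
    """Build a vocab that inherits the writing system from the defaults."""
    return ToyVocab(writing_system=defaults.writing_system)


ltr_vocab = create_vocab(ToyDefaults)
rtl_vocab = create_vocab(ToyRtlDefaults)
print(ltr_vocab.writing_system["direction"], rtl_vocab.writing_system["direction"])
```

A test written before the property exists would be expected to fail at first, which matches the order of the commits: the xfail test is added, the property and defaults follow, and the xfail marker is removed last.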