spaCy/spacy
Paul O'Leary McCann 0f01f46e02
Update Cython string types (#9143)
* Replace all basestring references with unicode

`basestring` was a compatibility type introduced by Cython to make
dealing with utf-8 strings in Python2 easier. In Python3 it is
equivalent to the unicode (or str) type.

I replaced all references to basestring with unicode, since that was
used elsewhere, but we could also just replace them with str, which
should also be equivalent.

All tests pass locally.

* Replace all references to unicode type with str

Since we only support python3 this is simpler.

* Remove all references to unicode type

This removes all references to the unicode type across the codebase and
replaces them with `str`, which makes it more drastic than the prior
commits. In order to make this work importing `unicode_literals` had to
be removed, and one explicit unicode literal also had to be removed (it
is unclear why this is necessary in Cython with language level 3, but
without doing it there were errors about implicit conversion).

Where `unicode` was used as a type in comments, it was also changed to
`str`.

Additionally `coding: utf8` headers were removed from a few files.
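
As a rough illustration of why these replacements should be safe, the sketch below checks the equivalences in plain Python 3. It does not exercise Cython's typed-argument handling, so it says nothing about the implicit-conversion errors mentioned above. The helper `is_text` and the sample string are hypothetical and not part of the codebase.

```python
import builtins


def is_text(value):
    """Hypothetical helper: True if value is Python 3 text (str)."""
    return isinstance(value, str)


# In Python 3 the u"" prefix is accepted but redundant: both literals are str.
assert type(u"abc") is str
assert u"abc" == "abc"

# The Python 2 names do not exist as builtins in Python 3.
has_unicode = hasattr(builtins, "unicode")
has_basestring = hasattr(builtins, "basestring")

# Text encodes to UTF-8 bytes and decodes back unchanged.
text = "na\u00efve caf\u00e9"
encoded = text.encode("utf8")
round_trip = encoded.decode("utf8")
```

Bytes are a separate type from `str`, so `is_text(b"abc")` is false; any code that previously relied on `basestring` accepting both would need an explicit decode.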
2021-09-13 17:02:17 +02:00
cli Auto-format code with black (#8895) 2021-08-06 13:38:06 +02:00
displacy Tidy up code 2021-06-28 12:08:15 +02:00
lang Update Cython string types (#9143) 2021-09-13 17:02:17 +02:00
matcher Update Cython string types (#9143) 2021-09-13 17:02:17 +02:00
ml Use 0-vector for OOV lexemes (#8639) 2021-07-13 14:48:12 +10:00
pipeline Update Cython string types (#9143) 2021-09-13 17:02:17 +02:00
tests Update Cython string types (#9143) 2021-09-13 17:02:17 +02:00
tokens Update Cython string types (#9143) 2021-09-13 17:02:17 +02:00
training Add new parameter for saving every n epoch in pretraining (#8912) 2021-08-12 11:14:48 +02:00
__init__.pxd * Seems to be working after refactor. Need to wire up more POS tag features, and wire up save/load of POS tags. 2014-10-24 02:23:42 +11:00
__init__.py Tidy up and auto-format 2021-07-18 15:44:56 +10:00
__main__.py Tidy up 2020-06-22 00:45:40 +02:00
about.py bump to 3.1.1 2021-07-19 14:48:27 +02:00
attrs.pxd Merge branch 'develop' into master-tmp 2020-05-21 18:39:06 +02:00
attrs.pyx Update Cython string types (#9143) 2021-09-13 17:02:17 +02:00
compat.py Use Literal type for nr_feature_tokens 2020-09-23 16:00:03 +02:00
default_config_pretraining.cfg Add new parameter for saving every n epoch in pretraining (#8912) 2021-08-12 11:14:48 +02:00
default_config.cfg Add training option to set annotations on update (#7767) 2021-04-26 16:53:53 +02:00
errors.py Fix check for RIGHT_ATTRS in dep matcher (#8807) 2021-08-04 09:20:41 +02:00
glossary.py Add Chinese PTB tags to glossary (#7993) 2021-05-06 18:43:03 +10:00
kb.pxd Replace cpdef variables with cdef (#7834) 2021-04-26 16:54:02 +02:00
kb.pyx Update Cython string types (#9143) 2021-09-13 17:02:17 +02:00
language.py Pass excludes when serializing vocab (#8824) 2021-08-03 14:42:44 +02:00
lexeme.pxd Fix Lexeme.from_ptr 2020-08-10 16:43:37 +02:00
lexeme.pyi Add stub files for main cython classes (#8427) 2021-08-07 12:30:03 +02:00
lexeme.pyx Update Cython string types (#9143) 2021-09-13 17:02:17 +02:00
lookups.py Tidy up code 2021-06-28 12:08:15 +02:00
morphology.pxd Clean up Morphology imports and definitions (#7441) 2021-04-26 16:54:23 +02:00
morphology.pyx Clean up Morphology imports and definitions (#7441) 2021-04-26 16:54:23 +02:00
parts_of_speech.pxd Add support for Universal Dependencies v2.0 2017-03-03 13:17:34 +01:00
parts_of_speech.pyx Drop Python 2.7 and 3.5 (#4828) 2019-12-22 01:53:56 +01:00
pipe_analysis.py Tidy up and auto-format 2020-09-29 21:39:28 +02:00
py.typed Add py.typed 2021-03-16 09:48:31 +01:00
schemas.py Add new parameter for saving every n epoch in pretraining (#8912) 2021-08-12 11:14:48 +02:00
scorer.py Refactor scoring methods to use registered functions (#8766) 2021-08-10 15:13:39 +02:00
strings.pxd Update Cython string types (#9143) 2021-09-13 17:02:17 +02:00
strings.pyi Add stub files for main cython classes (#8427) 2021-08-07 12:30:03 +02:00
strings.pyx Update Cython string types (#9143) 2021-09-13 17:02:17 +02:00
structs.pxd Add SpanGroup and Graph container types to represent arbitrary annotations (#6696) 2021-01-14 17:30:41 +11:00
symbols.pxd introduce token.has_head and refer to MISSING_DEP_ (WIP) 2021-01-12 17:17:06 +01:00
symbols.pyx introduce token.has_head and refer to MISSING_DEP_ (WIP) 2021-01-12 17:17:06 +01:00
tokenizer.pxd Update Cython string types (#9143) 2021-09-13 17:02:17 +02:00
tokenizer.pyx Update Cython string types (#9143) 2021-09-13 17:02:17 +02:00
typedefs.pxd Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master 2020-11-25 11:49:34 +01:00
typedefs.pyx Tidy up rest 2017-10-27 21:07:59 +02:00
util.py Refactor scoring methods to use registered functions (#8766) 2021-08-10 15:13:39 +02:00
vectors.pyx Fix vectors data on GPU (#7626) 2021-04-19 18:30:03 +10:00
vocab.pxd Update Cython string types (#9143) 2021-09-13 17:02:17 +02:00
vocab.pyi Add stub files for main cython classes (#8427) 2021-08-07 12:30:03 +02:00
vocab.pyx Update Cython string types (#9143) 2021-09-13 17:02:17 +02:00