spaCy/spacy/pipeline
Paul O'Leary McCann 0f01f46e02
Update Cython string types (#9143)
* Replace all basestring references with unicode

`basestring` was a compatibility type introduced by Cython to make
dealing with utf-8 strings in Python 2 easier. In Python 3 it is
equivalent to the unicode (or str) type.

I replaced all references to basestring with unicode, since that was
used elsewhere, but we could also just replace them with str, which
should also be equivalent.

All tests pass locally.
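
As a minimal illustration of the equivalence described above (plain
Python 3, not the Cython sources touched here; the variable names are
made up for the example), `str` is the only built-in text type and the
Python 2 names are no longer builtins:

```python
import builtins

# In Python 3, text literals are `str`, which holds Unicode code points.
text = "naïve café"
assert isinstance(text, str)

# The Python 2 names `basestring` and `unicode` are not Python 3 builtins,
# so only Cython (not the interpreter) gives them a meaning.
has_basestring = hasattr(builtins, "basestring")
has_unicode = hasattr(builtins, "unicode")
print(has_basestring, has_unicode)  # False False
```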

* Replace all references to unicode type with str

Since we only support Python 3, this is simpler.

* Remove all references to unicode type

This removes all references to the unicode type across the codebase and
replaces them with `str`, which makes it more drastic than the prior
commits. To make this work, the `unicode_literals` import had to be
removed, and one explicit unicode literal also had to be removed (it
is unclear why this is necessary in Cython with language level 3, but
without doing it there were errors about implicit conversion).

Where `unicode` was used as a type in comments, it was also changed to
`str`.

Additionally, `coding: utf8` headers were removed from a few files.
2021-09-13 17:02:17 +02:00
..
_parser_internals Update Cython string types (#9143) 2021-09-13 17:02:17 +02:00
__init__.py Add SpanCategorizer component (#6747) 2021-06-24 12:35:27 +02:00
attributeruler.py Document scorers in registry and components from #8766 (#8929) 2021-08-12 12:50:03 +02:00
dep_parser.pyx Document scorers in registry and components from #8766 (#8929) 2021-08-12 12:50:03 +02:00
entity_linker.py Document scorers in registry and components from #8766 (#8929) 2021-08-12 12:50:03 +02:00
entityruler.py Document scorers in registry and components from #8766 (#8929) 2021-08-12 12:50:03 +02:00
functions.py Tidy up and auto-format 2021-02-13 12:55:56 +11:00
lemmatizer.py Document scorers in registry and components from #8766 (#8929) 2021-08-12 12:50:03 +02:00
morphologizer.pyx Document scorers in registry and components from #8766 (#8929) 2021-08-12 12:50:03 +02:00
multitask.pyx Replace negative rows with 0 in StaticVectors (#7674) 2021-04-22 18:04:15 +10:00
ner.pyx Document scorers in registry and components from #8766 (#8929) 2021-08-12 12:50:03 +02:00
pipe.pxd TrainablePipe (#6213) 2020-10-08 21:33:49 +02:00
pipe.pyx Refactor scoring methods to use registered functions (#8766) 2021-08-10 15:13:39 +02:00
sentencizer.pyx Document scorers in registry and components from #8766 (#8929) 2021-08-12 12:50:03 +02:00
senter.pyx Document scorers in registry and components from #8766 (#8929) 2021-08-12 12:50:03 +02:00
spancat.py Document scorers in registry and components from #8766 (#8929) 2021-08-12 12:50:03 +02:00
tagger.pyx Document scorers in registry and components from #8766 (#8929) 2021-08-12 12:50:03 +02:00
textcat_multilabel.py Refactor scoring methods to use registered functions (#8766) 2021-08-10 15:13:39 +02:00
textcat.py Document scorers in registry and components from #8766 (#8929) 2021-08-12 12:50:03 +02:00
tok2vec.py Ensemble textcat with listener (#8012) 2021-05-31 18:21:06 +10:00
trainable_pipe.pxd Refactor scoring methods to use registered functions (#8766) 2021-08-10 15:13:39 +02:00
trainable_pipe.pyx Pass excludes when serializing vocab (#8824) 2021-08-03 14:42:44 +02:00
transition_parser.pxd TrainablePipe (#6213) 2020-10-08 21:33:49 +02:00
transition_parser.pyx Document scorers in registry and components from #8766 (#8929) 2021-08-12 12:50:03 +02:00