mirror of https://github.com/explosion/spaCy.git
synced 2024-12-26 01:46:28 +03:00

Add stub files for main Cython classes (#8427)

* Add stub files for main API classes
* Add contributor agreement for ezorita
* Update types for ndarray and hash()
* Fix __getitem__ and __iter__
* Add attributes of Doc and Token classes
* Overload type hints for Span.__getitem__
* Fix type hint overload for Span.__getitem__

Co-authored-by: Luca Dorigo <dorigoluca@gmail.com>

This commit is contained in:
parent 56d4d87aeb
commit 439f30faad
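Because these classes are compiled Cython extensions, static type checkers cannot introspect them; the `.pyi` stubs added below declare their public signatures so tools like mypy can check user code. A minimal sketch of what the stubs enable (illustrative only, not part of this commit; assumes a spaCy v3.x install with these stubs on the import path):

```python
# Illustrative sketch, not part of this commit: with the .pyi stubs installed,
# mypy can infer precise types for the Cython-backed classes.
import spacy

nlp = spacy.blank("en")            # a minimal English pipeline
doc = nlp("stub files for mypy")   # Doc, declared in spacy/tokens/doc.pyi
token = doc[0]    # mypy infers Token via the __getitem__(int) overload
span = doc[1:3]   # mypy infers Span via the __getitem__(slice) overload
print(type(token).__name__, type(span).__name__)  # Token Span
```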
.github/contributors/ezorita.md (new file, vendored, 106 lines)
@@ -0,0 +1,106 @@
# spaCy contributor agreement

This spaCy Contributor Agreement (**"SCA"**) is based on the
[Oracle Contributor Agreement](http://www.oracle.com/technetwork/oca-405177.pdf).
The SCA applies to any contribution that you make to any product or project
managed by us (the **"project"**), and sets out the intellectual property rights
you grant to us in the contributed materials. The term **"us"** shall mean
[ExplosionAI GmbH](https://explosion.ai/legal). The term
**"you"** shall mean the person or entity identified below.

If you agree to be bound by these terms, fill in the information requested
below and include the filled-in version with your first pull request, under the
folder [`.github/contributors/`](/.github/contributors/). The name of the file
should be your GitHub username, with the extension `.md`. For example, the user
example_user would create the file `.github/contributors/example_user.md`.

Read this agreement carefully before signing. These terms and conditions
constitute a binding legal agreement.

## Contributor Agreement

1. The term "contribution" or "contributed materials" means any source code,
   object code, patch, tool, sample, graphic, specification, manual,
   documentation, or any other material posted or submitted by you to the project.

2. With respect to any worldwide copyrights, or copyright applications and
   registrations, in your contribution:

   * you hereby assign to us joint ownership, and to the extent that such
     assignment is or becomes invalid, ineffective or unenforceable, you hereby
     grant to us a perpetual, irrevocable, non-exclusive, worldwide, no-charge,
     royalty-free, unrestricted license to exercise all rights under those
     copyrights. This includes, at our option, the right to sublicense these same
     rights to third parties through multiple levels of sublicensees or other
     licensing arrangements;

   * you agree that each of us can do all things in relation to your
     contribution as if each of us were the sole owners, and if one of us makes
     a derivative work of your contribution, the one who makes the derivative
     work (or has it made) will be the sole owner of that derivative work;

   * you agree that you will not assert any moral rights in your contribution
     against us, our licensees or transferees;

   * you agree that we may register a copyright in your contribution and
     exercise all ownership rights associated with it; and

   * you agree that neither of us has any duty to consult with, obtain the
     consent of, pay or render an accounting to the other for any use or
     distribution of your contribution.

3. With respect to any patents you own, or that you can license without payment
   to any third party, you hereby grant to us a perpetual, irrevocable,
   non-exclusive, worldwide, no-charge, royalty-free license to:

   * make, have made, use, sell, offer to sell, import, and otherwise transfer
     your contribution in whole or in part, alone or in combination with or
     included in any product, work or materials arising out of the project to
     which your contribution was submitted, and

   * at our option, to sublicense these same rights to third parties through
     multiple levels of sublicensees or other licensing arrangements.

4. Except as set out above, you keep all right, title, and interest in your
   contribution. The rights that you grant to us under these terms are effective
   on the date you first submitted a contribution to us, even if your submission
   took place before the date you sign these terms.

5. You covenant, represent, warrant and agree that:

   * each contribution that you submit is and shall be an original work of
     authorship and you can legally grant the rights set out in this SCA;

   * to the best of your knowledge, each contribution will not violate any
     third party's copyrights, trademarks, patents, or other intellectual
     property rights; and

   * each contribution shall be in compliance with U.S. export control laws and
     other applicable export and import laws. You agree to notify us if you
     become aware of any circumstance which would make any of the foregoing
     representations inaccurate in any respect. We may publicly disclose your
     participation in the project, including the fact that you have signed the SCA.

6. This SCA is governed by the laws of the State of California and applicable
   U.S. Federal law. Any choice of law rules will not apply.

7. Please place an “x” on one of the applicable statements below. Please do NOT
   mark both statements:

   * [x] I am signing on behalf of myself as an individual and no other person
     or entity, including my employer, has or will have rights with respect to my
     contributions.

   * [ ] I am signing on behalf of my employer or a legal entity and I have the
     actual authority to contractually bind that entity.

## Contributor Details

| Field                          | Entry                |
| ------------------------------ | -------------------- |
| Name                           | Eduard Zorita        |
| Company name (if applicable)   |                      |
| Title or role (if applicable)  |                      |
| Date                           | 06/17/2021           |
| GitHub username                | ezorita              |
| Website (optional)             |                      |
spacy/lexeme.pyi (new file, 61 lines)
@@ -0,0 +1,61 @@
from typing import (
    Union,
    Any,
)
from thinc.types import Floats1d
from .tokens import Doc, Span, Token
from .vocab import Vocab

class Lexeme:
    def __init__(self, vocab: Vocab, orth: int) -> None: ...
    def __richcmp__(self, other: Lexeme, op: int) -> bool: ...
    def __hash__(self) -> int: ...
    def set_attrs(self, **attrs: Any) -> None: ...
    def set_flag(self, flag_id: int, value: bool) -> None: ...
    def check_flag(self, flag_id: int) -> bool: ...
    def similarity(self, other: Union[Doc, Span, Token, Lexeme]) -> float: ...
    @property
    def has_vector(self) -> bool: ...
    @property
    def vector_norm(self) -> float: ...
    vector: Floats1d
    rank: int
    sentiment: float
    @property
    def orth_(self) -> str: ...
    @property
    def text(self) -> str: ...
    lower: int
    norm: int
    shape: int
    prefix: int
    suffix: int
    cluster: int
    lang: int
    prob: float
    lower_: str
    norm_: str
    shape_: str
    prefix_: str
    suffix_: str
    lang_: str
    flags: int
    @property
    def is_oov(self) -> bool: ...
    is_stop: bool
    is_alpha: bool
    is_ascii: bool
    is_digit: bool
    is_lower: bool
    is_upper: bool
    is_title: bool
    is_punct: bool
    is_space: bool
    is_bracket: bool
    is_quote: bool
    is_left_punct: bool
    is_right_punct: bool
    is_currency: bool
    like_url: bool
    like_num: bool
    like_email: bool
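A hedged usage sketch for the `Lexeme` API declared above (not part of the commit; assumes a blank English pipeline). A `Lexeme` carries the context-independent attributes of a vocabulary entry:

```python
import spacy

nlp = spacy.blank("en")
lex = nlp.vocab["apple"]                 # Vocab.__getitem__ returns a Lexeme
print(lex.text, lex.lower_, lex.shape_)  # apple apple xxxx
print(lex.is_alpha, lex.is_digit, lex.like_num)  # True False False
print(lex.has_vector)                    # False: a blank pipeline has no vectors
```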
spacy/matcher/matcher.pyi (new file, 41 lines)
@@ -0,0 +1,41 @@
from typing import Any, List, Dict, Tuple, Optional, Callable, Union, Iterator, Iterable
from ..vocab import Vocab
from ..tokens import Doc, Span

class Matcher:
    def __init__(self, vocab: Vocab, validate: bool = ...) -> None: ...
    def __reduce__(self) -> Any: ...
    def __len__(self) -> int: ...
    def __contains__(self, key: str) -> bool: ...
    def add(
        self,
        key: str,
        patterns: List[List[Dict[str, Any]]],
        *,
        on_match: Optional[
            Callable[[Matcher, Doc, int, List[Tuple[Any, ...]]], Any]
        ] = ...,
        greedy: Optional[str] = ...
    ) -> None: ...
    def remove(self, key: str) -> None: ...
    def has_key(self, key: Union[str, int]) -> bool: ...
    def get(
        self, key: Union[str, int], default: Optional[Any] = ...
    ) -> Tuple[Optional[Callable[[Any], Any]], List[List[Dict[Any, Any]]]]: ...
    def pipe(
        self,
        docs: Iterable[Tuple[Doc, Any]],
        batch_size: int = ...,
        return_matches: bool = ...,
        as_tuples: bool = ...,
    ) -> Union[
        Iterator[Tuple[Tuple[Doc, Any], Any]], Iterator[Tuple[Doc, Any]], Iterator[Doc]
    ]: ...
    def __call__(
        self,
        doclike: Union[Doc, Span],
        *,
        as_spans: bool = ...,
        allow_missing: bool = ...,
        with_alignments: bool = ...
    ) -> Union[List[Tuple[int, int, int]], List[Span]]: ...
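A hedged usage sketch for the `Matcher` API declared above (not part of the commit). The return type of `__call__` depends on `as_spans`, which is why the stub types it as a `Union`:

```python
import spacy
from spacy.matcher import Matcher

nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)
# add() takes a list of patterns; each pattern is a list of token dicts.
matcher.add("HELLO_WORLD", [[{"LOWER": "hello"}, {"LOWER": "world"}]])

doc = nlp("Hello world! hello WORLD")
for match_id, start, end in matcher(doc):  # default: (key, start, end) tuples
    print(nlp.vocab.strings[match_id], doc[start:end].text)
print([s.text for s in matcher(doc, as_spans=True)])  # the List[Span] branch
```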
spacy/strings.pyi (new file, 22 lines)
@@ -0,0 +1,22 @@
from typing import Optional, Iterable, Iterator, Union, Any
from pathlib import Path

def get_string_id(key: str) -> int: ...

class StringStore:
    def __init__(
        self, strings: Optional[Iterable[str]] = ..., freeze: bool = ...
    ) -> None: ...
    def __getitem__(self, string_or_id: Union[bytes, str, int]) -> Union[str, int]: ...
    def as_int(self, key: Union[bytes, str, int]) -> int: ...
    def as_string(self, key: Union[bytes, str, int]) -> str: ...
    def add(self, string: str) -> int: ...
    def __len__(self) -> int: ...
    def __contains__(self, string: str) -> bool: ...
    def __iter__(self) -> Iterator[str]: ...
    def __reduce__(self) -> Any: ...
    def to_disk(self, path: Union[str, Path]) -> None: ...
    def from_disk(self, path: Union[str, Path]) -> StringStore: ...
    def to_bytes(self, **kwargs: Any) -> bytes: ...
    def from_bytes(self, bytes_data: bytes, **kwargs: Any) -> StringStore: ...
    def _reset_and_load(self, strings: Iterable[str]) -> None: ...
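A hedged usage sketch for the `StringStore` API declared above (not part of the commit). The store maps strings to 64-bit hash IDs and back, which is why `__getitem__` is typed `Union[bytes, str, int] -> Union[str, int]`:

```python
import spacy

nlp = spacy.blank("en")
strings = nlp.vocab.strings       # the pipeline's StringStore
h = strings.add("coffee")         # add() returns the string's hash ID
print(strings[h])                 # int -> str: coffee
print(strings["coffee"] == h)     # str -> int: True
print("coffee" in strings)        # __contains__: True
```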
spacy/tokens/_retokenize.pyi (new file, 17 lines)
@@ -0,0 +1,17 @@
from typing import Dict, Any, Union, List, Tuple
from .doc import Doc
from .span import Span
from .token import Token

class Retokenizer:
    def __init__(self, doc: Doc) -> None: ...
    def merge(self, span: Span, attrs: Dict[Union[str, int], Any] = ...) -> None: ...
    def split(
        self,
        token: Token,
        orths: List[str],
        heads: List[Union[Token, Tuple[Token, int]]],
        attrs: Dict[Union[str, int], List[Any]] = ...,
    ) -> None: ...
    def __enter__(self) -> Retokenizer: ...
    def __exit__(self, *args: Any) -> None: ...
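A hedged usage sketch for the `Retokenizer` declared above (not part of the commit). It is used as a context manager obtained from `Doc.retokenize()`; edits are buffered and applied on `__exit__`:

```python
import spacy

nlp = spacy.blank("en")
doc = nlp("New York is in autumn")
with doc.retokenize() as retokenizer:
    # merge() takes a Span and an optional attrs dict, per the stub
    retokenizer.merge(doc[0:2], attrs={"LEMMA": "New York"})
print([t.text for t in doc])  # ['New York', 'is', 'in', 'autumn']
```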
spacy/tokens/doc.pyi (new file, 180 lines)
@@ -0,0 +1,180 @@
from typing import (
    Callable,
    Protocol,
    Iterable,
    Iterator,
    Optional,
    Union,
    Tuple,
    List,
    Dict,
    Any,
    overload,
)
from cymem.cymem import Pool
from thinc.types import Floats1d, Floats2d, Ints2d
from .span import Span
from .token import Token
from ._dict_proxies import SpanGroups
from ._retokenize import Retokenizer
from ..lexeme import Lexeme
from ..vocab import Vocab
from .underscore import Underscore
from pathlib import Path
import numpy

class DocMethod(Protocol):
    def __call__(self: Doc, *args: Any, **kwargs: Any) -> Any: ...

class Doc:
    vocab: Vocab
    mem: Pool
    spans: SpanGroups
    max_length: int
    length: int
    sentiment: float
    cats: Dict[str, float]
    user_hooks: Dict[str, Callable[..., Any]]
    user_token_hooks: Dict[str, Callable[..., Any]]
    user_span_hooks: Dict[str, Callable[..., Any]]
    tensor: numpy.ndarray
    user_data: Dict[str, Any]
    has_unknown_spaces: bool
    @classmethod
    def set_extension(
        cls,
        name: str,
        default: Optional[Any] = ...,
        getter: Optional[Callable[[Doc], Any]] = ...,
        setter: Optional[Callable[[Doc, Any], None]] = ...,
        method: Optional[DocMethod] = ...,
        force: bool = ...,
    ) -> None: ...
    @classmethod
    def get_extension(
        cls, name: str
    ) -> Tuple[
        Optional[Any],
        Optional[DocMethod],
        Optional[Callable[[Doc], Any]],
        Optional[Callable[[Doc, Any], None]],
    ]: ...
    @classmethod
    def has_extension(cls, name: str) -> bool: ...
    @classmethod
    def remove_extension(
        cls, name: str
    ) -> Tuple[
        Optional[Any],
        Optional[DocMethod],
        Optional[Callable[[Doc], Any]],
        Optional[Callable[[Doc, Any], None]],
    ]: ...
    def __init__(
        self,
        vocab: Vocab,
        words: Optional[List[str]] = ...,
        spaces: Optional[List[bool]] = ...,
        user_data: Optional[Dict[Any, Any]] = ...,
        tags: Optional[List[str]] = ...,
        pos: Optional[List[str]] = ...,
        morphs: Optional[List[str]] = ...,
        lemmas: Optional[List[str]] = ...,
        heads: Optional[List[int]] = ...,
        deps: Optional[List[str]] = ...,
        sent_starts: Optional[List[Union[bool, None]]] = ...,
        ents: Optional[List[str]] = ...,
    ) -> None: ...
    @property
    def _(self) -> Underscore: ...
    @property
    def is_tagged(self) -> bool: ...
    @property
    def is_parsed(self) -> bool: ...
    @property
    def is_nered(self) -> bool: ...
    @property
    def is_sentenced(self) -> bool: ...
    def has_annotation(
        self, attr: Union[int, str], *, require_complete: bool = ...
    ) -> bool: ...
    @overload
    def __getitem__(self, i: int) -> Token: ...
    @overload
    def __getitem__(self, i: slice) -> Span: ...
    def __iter__(self) -> Iterator[Token]: ...
    def __len__(self) -> int: ...
    def __unicode__(self) -> str: ...
    def __bytes__(self) -> bytes: ...
    def __str__(self) -> str: ...
    def __repr__(self) -> str: ...
    @property
    def doc(self) -> Doc: ...
    def char_span(
        self,
        start_idx: int,
        end_idx: int,
        label: Union[int, str] = ...,
        kb_id: Union[int, str] = ...,
        vector: Optional[Floats1d] = ...,
        alignment_mode: str = ...,
    ) -> Span: ...
    def similarity(self, other: Union[Doc, Span, Token, Lexeme]) -> float: ...
    @property
    def has_vector(self) -> bool: ...
    vector: Floats1d
    vector_norm: float
    @property
    def text(self) -> str: ...
    @property
    def text_with_ws(self) -> str: ...
    ents: Tuple[Span, ...]
    def set_ents(
        self,
        entities: List[Span],
        *,
        blocked: Optional[List[Span]] = ...,
        missing: Optional[List[Span]] = ...,
        outside: Optional[List[Span]] = ...,
        default: str = ...
    ) -> None: ...
    @property
    def noun_chunks(self) -> Iterator[Span]: ...
    @property
    def sents(self) -> Iterator[Span]: ...
    @property
    def lang(self) -> int: ...
    @property
    def lang_(self) -> str: ...
    def count_by(
        self, attr_id: int, exclude: Optional[Any] = ..., counts: Optional[Any] = ...
    ) -> Dict[Any, int]: ...
    def from_array(self, attrs: List[int], array: Ints2d) -> Doc: ...
    @staticmethod
    def from_docs(
        docs: List[Doc],
        ensure_whitespace: bool = ...,
        attrs: Optional[Union[Tuple[Union[str, int], ...], List[Union[int, str]]]] = ...,
    ) -> Doc: ...
    def get_lca_matrix(self) -> Ints2d: ...
    def copy(self) -> Doc: ...
    def to_disk(
        self, path: Union[str, Path], *, exclude: Iterable[str] = ...
    ) -> None: ...
    def from_disk(
        self, path: Union[str, Path], *, exclude: Union[List[str], Tuple[str, ...]] = ...
    ) -> Doc: ...
    def to_bytes(self, *, exclude: Union[List[str], Tuple[str, ...]] = ...) -> bytes: ...
    def from_bytes(
        self, bytes_data: bytes, *, exclude: Union[List[str], Tuple[str, ...]] = ...
    ) -> Doc: ...
    def to_dict(self, *, exclude: Union[List[str], Tuple[str, ...]] = ...) -> bytes: ...
    def from_dict(
        self, msg: bytes, *, exclude: Union[List[str], Tuple[str, ...]] = ...
    ) -> Doc: ...
    def extend_tensor(self, tensor: Floats2d) -> None: ...
    def retokenize(self) -> Retokenizer: ...
    def to_json(self, underscore: Optional[List[str]] = ...) -> Dict[str, Any]: ...
    def to_utf8_array(self, nr_char: int = ...) -> Ints2d: ...
    @staticmethod
    def _get_array_attrs() -> Tuple[Any, ...]: ...
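A hedged usage sketch for the `Doc` API declared above (not part of the commit). The `@overload` pair on `__getitem__` is what lets a checker distinguish `doc[i]` (a `Token`) from `doc[i:j]` (a `Span`):

```python
import spacy
from spacy.tokens import Doc

nlp = spacy.blank("en")
# Doc can be built directly from words and spaces, per the stubbed __init__.
doc = Doc(nlp.vocab, words=["Hello", "world", "!"], spaces=[True, False, False])
print(doc.text)                     # Hello world!
print(doc[0].text, doc[0:2].text)   # Token "Hello", Span "Hello world"
print(len(doc), doc.has_annotation("TAG"))  # 3 False
```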
spacy/tokens/morphanalysis.pyi (new file, 20 lines)
@@ -0,0 +1,20 @@
from typing import Any, Dict, Iterator, List, Union
from ..vocab import Vocab

class MorphAnalysis:
    def __init__(
        self, vocab: Vocab, features: Union[Dict[str, str], str] = ...
    ) -> None: ...
    @classmethod
    def from_id(cls, vocab: Vocab, key: Any) -> MorphAnalysis: ...
    def __contains__(self, feature: str) -> bool: ...
    def __iter__(self) -> Iterator[str]: ...
    def __len__(self) -> int: ...
    def __hash__(self) -> int: ...
    def __eq__(self, other: MorphAnalysis) -> bool: ...
    def __ne__(self, other: MorphAnalysis) -> bool: ...
    def get(self, field: Any) -> List[str]: ...
    def to_json(self) -> str: ...
    def to_dict(self) -> Dict[str, str]: ...
    def __str__(self) -> str: ...
    def __repr__(self) -> str: ...
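A hedged usage sketch for the `MorphAnalysis` API declared above (not part of the commit). Per the stubbed `__init__`, an analysis can be built from a feature dict or a UD feature string:

```python
import spacy
from spacy.tokens import MorphAnalysis

nlp = spacy.blank("en")
morph = MorphAnalysis(nlp.vocab, {"Tense": "Past", "VerbForm": "Fin"})
print(str(morph))              # Tense=Past|VerbForm=Fin
print("Tense=Past" in morph)   # __contains__ checks one Field=Value feature
print(morph.get("Tense"))      # ['Past']
print(morph.to_dict())         # {'Tense': 'Past', 'VerbForm': 'Fin'}
```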
spacy/tokens/span.pyi (new file, 124 lines)
@@ -0,0 +1,124 @@
from typing import Callable, Protocol, Iterator, Optional, Union, Tuple, Any, overload
from thinc.types import Floats1d, Ints2d, FloatsXd
from .doc import Doc
from .token import Token
from .underscore import Underscore
from ..lexeme import Lexeme
from ..vocab import Vocab

class SpanMethod(Protocol):
    def __call__(self: Span, *args: Any, **kwargs: Any) -> Any: ...

class Span:
    @classmethod
    def set_extension(
        cls,
        name: str,
        default: Optional[Any] = ...,
        getter: Optional[Callable[[Span], Any]] = ...,
        setter: Optional[Callable[[Span, Any], None]] = ...,
        method: Optional[SpanMethod] = ...,
        force: bool = ...,
    ) -> None: ...
    @classmethod
    def get_extension(
        cls, name: str
    ) -> Tuple[
        Optional[Any],
        Optional[SpanMethod],
        Optional[Callable[[Span], Any]],
        Optional[Callable[[Span, Any], None]],
    ]: ...
    @classmethod
    def has_extension(cls, name: str) -> bool: ...
    @classmethod
    def remove_extension(
        cls, name: str
    ) -> Tuple[
        Optional[Any],
        Optional[SpanMethod],
        Optional[Callable[[Span], Any]],
        Optional[Callable[[Span, Any], None]],
    ]: ...
    def __init__(
        self,
        doc: Doc,
        start: int,
        end: int,
        label: int = ...,
        vector: Optional[Floats1d] = ...,
        vector_norm: Optional[float] = ...,
        kb_id: Optional[int] = ...,
    ) -> None: ...
    def __richcmp__(self, other: Span, op: int) -> bool: ...
    def __hash__(self) -> int: ...
    def __len__(self) -> int: ...
    def __repr__(self) -> str: ...
    @overload
    def __getitem__(self, i: int) -> Token: ...
    @overload
    def __getitem__(self, i: slice) -> Span: ...
    def __iter__(self) -> Iterator[Token]: ...
    @property
    def _(self) -> Underscore: ...
    def as_doc(self, *, copy_user_data: bool = ...) -> Doc: ...
    def get_lca_matrix(self) -> Ints2d: ...
    def similarity(self, other: Union[Doc, Span, Token, Lexeme]) -> float: ...
    @property
    def vocab(self) -> Vocab: ...
    @property
    def sent(self) -> Span: ...
    @property
    def ents(self) -> Tuple[Span, ...]: ...
    @property
    def has_vector(self) -> bool: ...
    @property
    def vector(self) -> Floats1d: ...
    @property
    def vector_norm(self) -> float: ...
    @property
    def tensor(self) -> FloatsXd: ...
    @property
    def sentiment(self) -> float: ...
    @property
    def text(self) -> str: ...
    @property
    def text_with_ws(self) -> str: ...
    @property
    def noun_chunks(self) -> Iterator[Span]: ...
    @property
    def root(self) -> Token: ...
    def char_span(
        self,
        start_idx: int,
        end_idx: int,
        label: int = ...,
        kb_id: int = ...,
        vector: Optional[Floats1d] = ...,
    ) -> Span: ...
    @property
    def conjuncts(self) -> Tuple[Token, ...]: ...
    @property
    def lefts(self) -> Iterator[Token]: ...
    @property
    def rights(self) -> Iterator[Token]: ...
    @property
    def n_lefts(self) -> int: ...
    @property
    def n_rights(self) -> int: ...
    @property
    def subtree(self) -> Iterator[Token]: ...
    start: int
    end: int
    start_char: int
    end_char: int
    label: int
    kb_id: int
    ent_id: int
    ent_id_: str
    @property
    def orth_(self) -> str: ...
    @property
    def lemma_(self) -> str: ...
    label_: str
    kb_id_: str
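A hedged usage sketch for the `Span` API declared above (not part of the commit). At runtime the constructor also accepts a string label, even though this stub declares `label: int`:

```python
import spacy
from spacy.tokens import Span

nlp = spacy.blank("en")
doc = nlp("San Francisco is in California")
span = Span(doc, 0, 2, label="GPE")   # runtime also accepts a str label
print(span.text, span.label_, span.start, span.end)  # San Francisco GPE 0 2
print(span[0].text, span[0:2].text)   # Token vs. Span, via the overloads
doc.set_ents([span])                  # register the span as a Doc entity
print([(e.text, e.label_) for e in doc.ents])  # [('San Francisco', 'GPE')]
```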
spacy/tokens/span_group.pyi (new file, 24 lines)
@@ -0,0 +1,24 @@
from typing import Any, Dict, Iterable
from .doc import Doc
from .span import Span

class SpanGroup:
    def __init__(
        self,
        doc: Doc,
        *,
        name: str = ...,
        attrs: Dict[str, Any] = ...,
        spans: Iterable[Span] = ...
    ) -> None: ...
    def __repr__(self) -> str: ...
    @property
    def doc(self) -> Doc: ...
    @property
    def has_overlap(self) -> bool: ...
    def __len__(self) -> int: ...
    def append(self, span: Span) -> None: ...
    def extend(self, spans: Iterable[Span]) -> None: ...
    def __getitem__(self, i: int) -> Span: ...
    def to_bytes(self) -> bytes: ...
    def from_bytes(self, bytes_data: bytes) -> SpanGroup: ...
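A hedged usage sketch for the `SpanGroup` API declared above (not part of the commit). Span groups live in the stubbed `Doc.spans` mapping and, unlike `Doc.ents`, may overlap:

```python
import spacy
from spacy.tokens import SpanGroup

nlp = spacy.blank("en")
doc = nlp("Their goal is to read this whole sentence")
group = SpanGroup(doc, name="errors", spans=[doc[0:1], doc[2:4]])
doc.spans["errors"] = group            # stored on the Doc
print(len(group), group[0].text)       # 2 Their
group.append(doc[2:6])                 # overlaps doc[2:4]
print(group.has_overlap)               # True
```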
spacy/tokens/token.pyi (new file, 208 lines)
@@ -0,0 +1,208 @@
from typing import (
    Callable,
    Protocol,
    Iterator,
    Optional,
    Union,
    Tuple,
    Any,
)
from thinc.types import Floats1d, FloatsXd
from .doc import Doc
from .span import Span
from .morphanalysis import MorphAnalysis
from ..lexeme import Lexeme
from ..vocab import Vocab
from .underscore import Underscore

class TokenMethod(Protocol):
    def __call__(self: Token, *args: Any, **kwargs: Any) -> Any: ...

class Token:
    i: int
    doc: Doc
    vocab: Vocab
    @classmethod
    def set_extension(
        cls,
        name: str,
        default: Optional[Any] = ...,
        getter: Optional[Callable[[Token], Any]] = ...,
        setter: Optional[Callable[[Token, Any], None]] = ...,
        method: Optional[TokenMethod] = ...,
        force: bool = ...,
    ) -> None: ...
    @classmethod
    def get_extension(
        cls, name: str
    ) -> Tuple[
        Optional[Any],
        Optional[TokenMethod],
        Optional[Callable[[Token], Any]],
        Optional[Callable[[Token, Any], None]],
    ]: ...
    @classmethod
    def has_extension(cls, name: str) -> bool: ...
    @classmethod
    def remove_extension(
        cls, name: str
    ) -> Tuple[
        Optional[Any],
        Optional[TokenMethod],
        Optional[Callable[[Token], Any]],
        Optional[Callable[[Token, Any], None]],
    ]: ...
    def __init__(self, vocab: Vocab, doc: Doc, offset: int) -> None: ...
    def __hash__(self) -> int: ...
    def __len__(self) -> int: ...
    def __unicode__(self) -> str: ...
    def __bytes__(self) -> bytes: ...
    def __str__(self) -> str: ...
    def __repr__(self) -> str: ...
    def __richcmp__(self, other: Token, op: int) -> bool: ...
    @property
    def _(self) -> Underscore: ...
    def nbor(self, i: int = ...) -> Token: ...
    def similarity(self, other: Union[Doc, Span, Token, Lexeme]) -> float: ...
    def has_morph(self) -> bool: ...
    morph: MorphAnalysis
    @property
    def lex(self) -> Lexeme: ...
    @property
    def lex_id(self) -> int: ...
    @property
    def rank(self) -> int: ...
    @property
    def text(self) -> str: ...
    @property
    def text_with_ws(self) -> str: ...
    @property
    def prob(self) -> float: ...
    @property
    def sentiment(self) -> float: ...
    @property
    def lang(self) -> int: ...
    @property
    def idx(self) -> int: ...
    @property
    def cluster(self) -> int: ...
    @property
    def orth(self) -> int: ...
    @property
    def lower(self) -> int: ...
    @property
    def norm(self) -> int: ...
    @property
    def shape(self) -> int: ...
    @property
    def prefix(self) -> int: ...
    @property
    def suffix(self) -> int: ...
    lemma: int
    pos: int
    tag: int
    dep: int
    @property
    def has_vector(self) -> bool: ...
    @property
    def vector(self) -> Floats1d: ...
    @property
    def vector_norm(self) -> float: ...
    @property
    def tensor(self) -> Optional[FloatsXd]: ...
    @property
    def n_lefts(self) -> int: ...
    @property
    def n_rights(self) -> int: ...
    @property
    def sent(self) -> Span: ...
    sent_start: bool
    is_sent_start: Optional[bool]
    is_sent_end: Optional[bool]
    @property
    def lefts(self) -> Iterator[Token]: ...
    @property
    def rights(self) -> Iterator[Token]: ...
    @property
    def children(self) -> Iterator[Token]: ...
    @property
    def subtree(self) -> Iterator[Token]: ...
    @property
    def left_edge(self) -> Token: ...
    @property
    def right_edge(self) -> Token: ...
    @property
    def ancestors(self) -> Iterator[Token]: ...
    def is_ancestor(self, descendant: Token) -> bool: ...
    def has_head(self) -> bool: ...
    head: Token
    @property
    def conjuncts(self) -> Tuple[Token, ...]: ...
    ent_type: int
    ent_type_: str
    @property
    def ent_iob(self) -> int: ...
    @classmethod
    def iob_strings(cls) -> Tuple[str, ...]: ...
    @property
    def ent_iob_(self) -> str: ...
    ent_id: int
    ent_id_: str
    ent_kb_id: int
    ent_kb_id_: str
    @property
    def whitespace_(self) -> str: ...
    @property
    def orth_(self) -> str: ...
    @property
    def lower_(self) -> str: ...
    norm_: str
    @property
    def shape_(self) -> str: ...
    @property
    def prefix_(self) -> str: ...
    @property
    def suffix_(self) -> str: ...
    @property
    def lang_(self) -> str: ...
    lemma_: str
    pos_: str
    tag_: str
    def has_dep(self) -> bool: ...
    dep_: str
    @property
    def is_oov(self) -> bool: ...
    @property
    def is_stop(self) -> bool: ...
    @property
    def is_alpha(self) -> bool: ...
    @property
    def is_ascii(self) -> bool: ...
    @property
    def is_digit(self) -> bool: ...
    @property
    def is_lower(self) -> bool: ...
    @property
    def is_upper(self) -> bool: ...
    @property
    def is_title(self) -> bool: ...
    @property
    def is_punct(self) -> bool: ...
    @property
    def is_space(self) -> bool: ...
    @property
    def is_bracket(self) -> bool: ...
    @property
    def is_quote(self) -> bool: ...
    @property
    def is_left_punct(self) -> bool: ...
    @property
    def is_right_punct(self) -> bool: ...
    @property
    def is_currency(self) -> bool: ...
    @property
    def like_url(self) -> bool: ...
    @property
    def like_num(self) -> bool: ...
    @property
    def like_email(self) -> bool: ...
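A hedged usage sketch for the `Token` API declared above (not part of the commit). Most lexical attributes mirror `Lexeme`, and `lex` exposes the underlying vocabulary entry:

```python
import spacy

nlp = spacy.blank("en")
doc = nlp("Apple is looking at buying U.K. startups")
token = doc[0]
print(token.text, token.i, token.idx)   # Apple 0 0
print(token.is_alpha, token.is_title)   # True True
print(token.nbor().text)                # the next token: is
print(token.lex.text == token.text)     # underlying Lexeme: True
```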
spacy/vocab.pyi (new file, 78 lines)
@@ -0,0 +1,78 @@
from typing import (
    Callable,
    Iterator,
    Optional,
    Union,
    Tuple,
    List,
    Dict,
    Any,
)
from thinc.types import Floats1d, FloatsXd
from . import Language
from .strings import StringStore
from .lexeme import Lexeme
from .lookups import Lookups
from .tokens import Doc, Span
from pathlib import Path

def create_vocab(
    lang: Language, defaults: Any, vectors_name: Optional[str] = ...
) -> Vocab: ...

class Vocab:
    def __init__(
        self,
        lex_attr_getters: Optional[Dict[str, Callable[[str], Any]]] = ...,
        strings: Optional[Union[List[str], StringStore]] = ...,
        lookups: Optional[Lookups] = ...,
        oov_prob: float = ...,
        vectors_name: Optional[str] = ...,
        writing_system: Dict[str, Any] = ...,
        get_noun_chunks: Optional[Callable[[Union[Doc, Span]], Iterator[Span]]] = ...,
    ) -> None: ...
    @property
    def lang(self) -> Language: ...
    def __len__(self) -> int: ...
    def add_flag(
        self, flag_getter: Callable[[str], bool], flag_id: int = ...
    ) -> int: ...
    def __contains__(self, key: str) -> bool: ...
    def __iter__(self) -> Iterator[Lexeme]: ...
    def __getitem__(self, id_or_string: Union[str, int]) -> Lexeme: ...
    @property
    def vectors_length(self) -> int: ...
    def reset_vectors(
        self, *, width: Optional[int] = ..., shape: Optional[int] = ...
    ) -> None: ...
    def prune_vectors(self, nr_row: int, batch_size: int = ...) -> Dict[str, float]: ...
    def get_vector(
        self,
        orth: Union[int, str],
        minn: Optional[int] = ...,
        maxn: Optional[int] = ...,
    ) -> FloatsXd: ...
    def set_vector(self, orth: Union[int, str], vector: Floats1d) -> None: ...
    def has_vector(self, orth: Union[int, str]) -> bool: ...
    lookups: Lookups
    def to_disk(
        self, path: Union[str, Path], *, exclude: Union[List[str], Tuple[str, ...]] = ...
    ) -> None: ...
    def from_disk(
        self, path: Union[str, Path], *, exclude: Union[List[str], Tuple[str, ...]] = ...
    ) -> Vocab: ...
    def to_bytes(self, *, exclude: Union[List[str], Tuple[str, ...]] = ...) -> bytes: ...
    def from_bytes(
        self, bytes_data: bytes, *, exclude: Union[List[str], Tuple[str, ...]] = ...
    ) -> Vocab: ...

def pickle_vocab(vocab: Vocab) -> Any: ...
def unpickle_vocab(
    sstore: StringStore,
    vectors: Any,
    morphology: Any,
    data_dir: Any,
    lex_attr_getters: Any,
    lookups: Any,
    get_noun_chunks: Any,
) -> Vocab: ...
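A hedged usage sketch for the `Vocab` API declared above (not part of the commit). The vocab owns the `StringStore` and creates `Lexeme`s on demand:

```python
import spacy

nlp = spacy.blank("en")
vocab = nlp.vocab
lex = vocab["coffee"]                  # __getitem__ by string or hash ID
print(lex.text, "coffee" in vocab)     # coffee True
print(len(vocab) > 0)                  # lexemes are cached as they are created
print(vocab.has_vector("coffee"))      # False: a blank pipeline has no vectors
```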