| Get a custom spaCy pipeline, tailor-made for your NLP problem by spaCy's core developers. Streamlined, production-ready, predictable and maintainable. Start by completing our 5-minute questionnaire to tell us what you need and we'll be in touch! **[Learn more →](https://explosion.ai/spacy-tailored-pipelines)** |
-| | Bespoke advice for problem solving, strategy and analysis for applied NLP projects. Services include data strategy, code reviews, pipeline design and annotation coaching. Curious? Fill in our 5-minute questionnaire to tell us what you need and we'll be in touch! **[Learn more →](https://explosion.ai/spacy-tailored-analysis)** |
+| Documentation | |
+| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| ⭐️ **[spaCy 101]** | New to spaCy? Here's everything you need to know! |
+| 📚 **[Usage Guides]** | How to use spaCy and its features. |
+| 🚀 **[New in v3.0]** | New features, backwards incompatibilities and migration guide. |
+| 🪐 **[Project Templates]** | End-to-end workflows you can clone, modify and run. |
+| 🎛 **[API Reference]** | The detailed reference for spaCy's API. |
+| 📦 **[Models]** | Download trained pipelines for spaCy. |
+| 🌌 **[Universe]** | Plugins, extensions, demos and books from the spaCy ecosystem. |
+| ⚙️ **[spaCy VS Code Extension]** | Additional tooling and features for working with spaCy's config files. |
+| 👩🏫 **[Online Course]** | Learn spaCy in this free and interactive online course. |
+| 📺 **[Videos]** | Our YouTube channel with video tutorials, talks and more. |
+| 🛠 **[Changelog]** | Changes and version history. |
+| 💝 **[Contribute]** | How to contribute to the spaCy project and code base. |
+| | Get a custom spaCy pipeline, tailor-made for your NLP problem by spaCy's core developers. Streamlined, production-ready, predictable and maintainable. Start by completing our 5-minute questionnaire to tell us what you need and we'll be in touch! **[Learn more →](https://explosion.ai/spacy-tailored-pipelines)** |
+| | Bespoke advice for problem solving, strategy and analysis for applied NLP projects. Services include data strategy, code reviews, pipeline design and annotation coaching. Curious? Fill in our 5-minute questionnaire to tell us what you need and we'll be in touch! **[Learn more →](https://explosion.ai/spacy-tailored-analysis)** |
[spacy 101]: https://spacy.io/usage/spacy-101
[new in v3.0]: https://spacy.io/usage/v3
@@ -58,7 +55,7 @@ open-source software, released under the [MIT license](https://github.com/explos
[api reference]: https://spacy.io/api/
[models]: https://spacy.io/models
[universe]: https://spacy.io/universe
-[spaCy VS Code Extension]: https://github.com/explosion/spacy-vscode
+[spacy vs code extension]: https://github.com/explosion/spacy-vscode
[videos]: https://www.youtube.com/c/ExplosionAI
[online course]: https://course.spacy.io
[project templates]: https://github.com/explosion/projects
@@ -92,7 +89,9 @@ more people can benefit from it.
- State-of-the-art speed
- Production-ready **training system**
- Linguistically-motivated **tokenization**
-- Components for named **entity recognition**, part-of-speech-tagging, dependency parsing, sentence segmentation, **text classification**, lemmatization, morphological analysis, entity linking and more
+- Components for named **entity recognition**, part-of-speech-tagging,
+ dependency parsing, sentence segmentation, **text classification**,
+ lemmatization, morphological analysis, entity linking and more
- Easily extensible with **custom components** and attributes
- Support for custom models in **PyTorch**, **TensorFlow** and other frameworks
- Built in **visualizers** for syntax and NER
@@ -118,8 +117,8 @@ For detailed installation instructions, see the
### pip
Using pip, spaCy releases are available as source packages and binary wheels.
-Before you install spaCy and its dependencies, make sure that
-your `pip`, `setuptools` and `wheel` are up to date.
+Before you install spaCy and its dependencies, make sure that your `pip`,
+`setuptools` and `wheel` are up to date.
```bash
pip install -U pip setuptools wheel
@@ -174,9 +173,9 @@ with the new version.
## 📦 Download model packages
-Trained pipelines for spaCy can be installed as **Python packages**. This
-means that they're a component of your application, just like any other module.
-Models can be installed using spaCy's [`download`](https://spacy.io/api/cli#download)
+Trained pipelines for spaCy can be installed as **Python packages**. This means
+that they're a component of your application, just like any other module. Models
+can be installed using spaCy's [`download`](https://spacy.io/api/cli#download)
command, or manually by pointing pip to a path or URL.
| Documentation | |
@@ -242,8 +241,7 @@ do that depends on your system.
| **Mac** | Install a recent version of [XCode](https://developer.apple.com/xcode/), including the so-called "Command Line Tools". macOS and OS X ship with Python and git preinstalled. |
| **Windows** | Install a version of the [Visual C++ Build Tools](https://visualstudio.microsoft.com/visual-cpp-build-tools/) or [Visual Studio Express](https://visualstudio.microsoft.com/vs/express/) that matches the version that was used to compile your Python interpreter. |
-For more details
-and instructions, see the documentation on
+For more details and instructions, see the documentation on
[compiling spaCy from source](https://spacy.io/usage#source) and the
[quickstart widget](https://spacy.io/usage#section-quickstart) to get the right
commands for your platform and Python version.
diff --git a/requirements.txt b/requirements.txt
index f711d0012..48d188ec9 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -18,7 +18,7 @@ numpy>=1.15.0; python_version < "3.9"
numpy>=1.19.0; python_version >= "3.9"
requests>=2.13.0,<3.0.0
tqdm>=4.38.0,<5.0.0
-pydantic>=1.7.4,!=1.8,!=1.8.1,<1.11.0
+pydantic>=1.7.4,!=1.8,!=1.8.1,<3.0.0
jinja2
langcodes>=3.2.0,<4.0.0
# Official Python utilities
diff --git a/setup.cfg b/setup.cfg
index a6b60ba59..852ff4049 100644
--- a/setup.cfg
+++ b/setup.cfg
@@ -62,7 +62,7 @@ install_requires =
numpy>=1.15.0; python_version < "3.9"
numpy>=1.19.0; python_version >= "3.9"
requests>=2.13.0,<3.0.0
- pydantic>=1.7.4,!=1.8,!=1.8.1,<1.11.0
+ pydantic>=1.7.4,!=1.8,!=1.8.1,<3.0.0
jinja2
# Official Python utilities
setuptools
@@ -113,6 +113,8 @@ cuda117 =
cupy-cuda117>=5.0.0b4,<13.0.0
cuda11x =
cupy-cuda11x>=11.0.0,<13.0.0
+cuda12x =
+ cupy-cuda12x>=11.5.0,<13.0.0
cuda-autodetect =
cupy-wheel>=11.0.0,<13.0.0
apple =
diff --git a/setup.py b/setup.py
index 243554c7a..3b6fae37b 100755
--- a/setup.py
+++ b/setup.py
@@ -1,10 +1,9 @@
#!/usr/bin/env python
from setuptools import Extension, setup, find_packages
import sys
-import platform
import numpy
-from distutils.command.build_ext import build_ext
-from distutils.sysconfig import get_python_inc
+from setuptools.command.build_ext import build_ext
+from sysconfig import get_path
from pathlib import Path
import shutil
from Cython.Build import cythonize
@@ -88,30 +87,6 @@ COPY_FILES = {
}
-def is_new_osx():
- """Check whether we're on OSX >= 10.7"""
- if sys.platform != "darwin":
- return False
- mac_ver = platform.mac_ver()[0]
- if mac_ver.startswith("10"):
- minor_version = int(mac_ver.split(".")[1])
- if minor_version >= 7:
- return True
- else:
- return False
- return False
-
-
-if is_new_osx():
- # On Mac, use libc++ because Apple deprecated use of
- # libstdc
- COMPILE_OPTIONS["other"].append("-stdlib=libc++")
- LINK_OPTIONS["other"].append("-lc++")
- # g++ (used by unix compiler on mac) links to libstdc++ as a default lib.
- # See: https://stackoverflow.com/questions/1653047/avoid-linking-to-libstdc
- LINK_OPTIONS["other"].append("-nodefaultlibs")
-
-
# By subclassing build_extensions we have the actual compiler that will be used which is really known only after finalize_options
# http://stackoverflow.com/questions/724664/python-distutils-how-to-get-a-compiler-that-is-going-to-be-used
class build_ext_options:
@@ -204,7 +179,7 @@ def setup_package():
include_dirs = [
numpy.get_include(),
- get_python_inc(plat_specific=True),
+ get_path("include"),
]
ext_modules = []
ext_modules.append(
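
Review note on the `setup.py` hunks above: `distutils.sysconfig.get_python_inc(plat_specific=True)` is replaced by `sysconfig.get_path("include")`. A minimal sketch for checking what the standard-library call resolves to on a given interpreter. The helper name `python_include_dirs` is hypothetical, and comparing against `"platinclude"` is an assumption about what is worth checking, not something the patch does.

```python
import sysconfig
from pathlib import Path


def python_include_dirs():
    """Return the generic and platform-specific C header directories."""
    return {
        "include": sysconfig.get_path("include"),
        "platinclude": sysconfig.get_path("platinclude"),
    }


dirs = python_include_dirs()
for name, path in dirs.items():
    # The directory may be absent if the Python development headers are not installed.
    exists = Path(path).is_dir() if path else False
    print(f"{name}: {path} (exists: {exists})")
```

If the two entries differ on a target platform, that would be worth raising, since the old call asked for the platform-specific directory.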
diff --git a/spacy/__init__.py b/spacy/__init__.py
index 1a18ad0d5..8aa2eccd7 100644
--- a/spacy/__init__.py
+++ b/spacy/__init__.py
@@ -13,7 +13,6 @@ from thinc.api import Config, prefer_gpu, require_cpu, require_gpu # noqa: F401
from . import pipeline # noqa: F401
from . import util
from .about import __version__ # noqa: F401
-from .cli.info import info # noqa: F401
from .errors import Errors
from .glossary import explain # noqa: F401
from .language import Language
@@ -77,3 +76,9 @@ def blank(
# We should accept both dot notation and nested dict here for consistency
config = util.dot_to_dict(config)
return LangClass.from_config(config, vocab=vocab, meta=meta)
+
+
+def info(*args, **kwargs):
+ from .cli.info import info as cli_info
+
+ return cli_info(*args, **kwargs)
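
Review note on the `spacy/__init__.py` hunks above: the top-level `from .cli.info import info` is replaced by a thin wrapper that imports on first call, presumably so that importing the package no longer pulls in the CLI module. The same pattern in isolation, as a sketch with the standard-library `statistics` module standing in for the deferred dependency (the stand-in is an assumption chosen only to keep the example runnable):

```python
def mean(*args, **kwargs):
    """Public wrapper: the backing module is imported on first call, not at import time."""
    from statistics import mean as _mean  # deferred import

    return _mean(*args, **kwargs)


# The name is available immediately; the import cost is paid only here.
result = mean([1, 2, 3])
print(result)
```

One trade-off of this pattern is that the wrapper's signature becomes `(*args, **kwargs)`, so introspection and editor hints no longer show the wrapped function's parameters.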
diff --git a/spacy/cli/__init__.py b/spacy/cli/__init__.py
index 4fc076f9a..f3c6dbfed 100644
--- a/spacy/cli/__init__.py
+++ b/spacy/cli/__init__.py
@@ -14,6 +14,7 @@ from .debug_diff import debug_diff # noqa: F401
from .debug_model import debug_model # noqa: F401
from .download import download # noqa: F401
from .evaluate import evaluate # noqa: F401
+from .find_function import find_function # noqa: F401
from .find_threshold import find_threshold # noqa: F401
from .info import info # noqa: F401
from .init_config import fill_config, init_config # noqa: F401
diff --git a/spacy/cli/assemble.py b/spacy/cli/assemble.py
index ee2500b27..f74bbacb5 100644
--- a/spacy/cli/assemble.py
+++ b/spacy/cli/assemble.py
@@ -40,7 +40,8 @@ def assemble_cli(
DOCS: https://spacy.io/api/cli#assemble
"""
- util.logger.setLevel(logging.DEBUG if verbose else logging.INFO)
+ if verbose:
+ util.logger.setLevel(logging.DEBUG)
# Make sure all files and paths exists if they are needed
if not config_path or (str(config_path) != "-" and not config_path.exists()):
msg.fail("Config file not found", config_path, exits=1)
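
Review note on the `setLevel` change above (repeated in several CLI modules in this diff): the old line always set a level, so a non-verbose run reset the logger to `INFO` even if the caller had configured something else. The new form only touches the level when `--verbose` is given. A small sketch of the difference using the standard `logging` module; the logger name `demo_cli` and both helper functions are hypothetical stand-ins for `util.logger` and the CLI entry points:

```python
import logging

logger = logging.getLogger("demo_cli")  # hypothetical stand-in for util.logger


def setup_old(verbose: bool) -> None:
    # Old behaviour: always overwrites whatever level was configured.
    logger.setLevel(logging.DEBUG if verbose else logging.INFO)


def setup_new(verbose: bool) -> None:
    # New behaviour: only raise verbosity when explicitly requested.
    if verbose:
        logger.setLevel(logging.DEBUG)


logger.setLevel(logging.WARNING)  # level chosen by the calling application
setup_old(verbose=False)
old_level = logger.level  # the caller's WARNING has been replaced by INFO

logger.setLevel(logging.WARNING)
setup_new(verbose=False)
new_level = logger.level  # the caller's WARNING is preserved

print(logging.getLevelName(old_level), logging.getLevelName(new_level))
```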
diff --git a/spacy/cli/evaluate.py b/spacy/cli/evaluate.py
index 6235b658d..2276ca6b0 100644
--- a/spacy/cli/evaluate.py
+++ b/spacy/cli/evaluate.py
@@ -28,6 +28,7 @@ def evaluate_cli(
displacy_path: Optional[Path] = Opt(None, "--displacy-path", "-dp", help="Directory to output rendered parses as HTML", exists=True, file_okay=False),
displacy_limit: int = Opt(25, "--displacy-limit", "-dl", help="Limit of parses to render as HTML"),
per_component: bool = Opt(False, "--per-component", "-P", help="Return scores per component, only applicable when an output JSON file is specified."),
+ spans_key: str = Opt("sc", "--spans-key", "-sk", help="Spans key to use when evaluating Doc.spans"),
# fmt: on
):
"""
@@ -53,6 +54,7 @@ def evaluate_cli(
displacy_limit=displacy_limit,
per_component=per_component,
silent=False,
+ spans_key=spans_key,
)
diff --git a/spacy/cli/find_function.py b/spacy/cli/find_function.py
new file mode 100644
index 000000000..f99ce2adc
--- /dev/null
+++ b/spacy/cli/find_function.py
@@ -0,0 +1,69 @@
+from typing import Optional, Tuple
+
+from catalogue import RegistryError
+from wasabi import msg
+
+from ..util import registry
+from ._util import Arg, Opt, app
+
+
+@app.command("find-function")
+def find_function_cli(
+ # fmt: off
+ func_name: str = Arg(..., help="Name of the registered function."),
+ registry_name: Optional[str] = Opt(None, "--registry", "-r", help="Name of the catalogue registry."),
+ # fmt: on
+):
+ """
+ Find the module, path and line number to the file the registered
+ function is defined in, if available.
+
+ func_name (str): Name of the registered function.
+ registry_name (Optional[str]): Name of the catalogue registry.
+
+ DOCS: https://spacy.io/api/cli#find-function
+ """
+ if not registry_name:
+ registry_names = registry.get_registry_names()
+ for name in registry_names:
+ if registry.has(name, func_name):
+ registry_name = name
+ break
+
+ if not registry_name:
+ msg.fail(
+ f"Couldn't find registered function: '{func_name}'",
+ exits=1,
+ )
+
+ assert registry_name is not None
+ find_function(func_name, registry_name)
+
+
+def find_function(func_name: str, registry_name: str) -> Tuple[str, int]:
+ registry_desc = None
+ try:
+ registry_desc = registry.find(registry_name, func_name)
+ except RegistryError as e:
+ msg.fail(
+ f"Couldn't find registered function: '{func_name}' in registry '{registry_name}'",
+ )
+ msg.fail(f"{e}", exits=1)
+ assert registry_desc is not None
+
+ registry_path = None
+ line_no = None
+ if registry_desc["file"]:
+ registry_path = registry_desc["file"]
+ line_no = registry_desc["line_no"]
+
+ if not registry_path or not line_no:
+ msg.fail(
+ f"Couldn't find path to registered function: '{func_name}' in registry '{registry_name}'",
+ exits=1,
+ )
+ assert registry_path is not None
+ assert line_no is not None
+
+ msg.good(f"Found registered function '{func_name}' at {registry_path}:{line_no}")
+ return str(registry_path), int(line_no)
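
Review note on the new `find-function` command above: it resolves a registered function name to a file path and line number, searching all registries when `--registry` is omitted. The idea can be sketched without spaCy or `catalogue`. Everything below is a toy stand-in: `REGISTRY`, `register`, `find_registered` and the name `demo.lowercase.v1` are hypothetical, and the location is read from the function's code object rather than from `registry.find`.

```python
from typing import Callable, Dict, Tuple

# Toy registry: maps a registered name to the function object.
REGISTRY: Dict[str, Callable] = {}


def register(name: str):
    """Decorator that stores the function under the given name."""

    def decorator(func: Callable) -> Callable:
        REGISTRY[name] = func
        return func

    return decorator


@register("demo.lowercase.v1")
def lowercase(text: str) -> str:
    return text.lower()


def find_registered(name: str) -> Tuple[str, int]:
    """Return (file path, line number) for a registered function."""
    if name not in REGISTRY:
        raise KeyError(f"Couldn't find registered function: '{name}'")
    code = REGISTRY[name].__code__
    # The reported line may point at the decorator or the `def`,
    # depending on the Python version.
    return str(code.co_filename), int(code.co_firstlineno)


path, line_no = find_registered("demo.lowercase.v1")
print(f"Found registered function 'demo.lowercase.v1' at {path}:{line_no}")
```

Unlike the CLI helper in the diff, this sketch raises an exception instead of calling `msg.fail(..., exits=1)`, which is easier to handle when the lookup is used as a library function.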
diff --git a/spacy/cli/find_threshold.py b/spacy/cli/find_threshold.py
index 7aa32c0c6..48077fa51 100644
--- a/spacy/cli/find_threshold.py
+++ b/spacy/cli/find_threshold.py
@@ -52,8 +52,8 @@ def find_threshold_cli(
DOCS: https://spacy.io/api/cli#find-threshold
"""
-
- util.logger.setLevel(logging.DEBUG if verbose else logging.INFO)
+ if verbose:
+ util.logger.setLevel(logging.DEBUG)
import_code(code_path)
find_threshold(
model=model,
diff --git a/spacy/cli/init_pipeline.py b/spacy/cli/init_pipeline.py
index 13202cb60..21eea8edf 100644
--- a/spacy/cli/init_pipeline.py
+++ b/spacy/cli/init_pipeline.py
@@ -39,7 +39,8 @@ def init_vectors_cli(
you can use in the [initialize] block of your config to initialize
a model with vectors.
"""
- util.logger.setLevel(logging.DEBUG if verbose else logging.INFO)
+ if verbose:
+ util.logger.setLevel(logging.DEBUG)
msg.info(f"Creating blank nlp object for language '{lang}'")
nlp = util.get_lang_class(lang)()
if jsonl_loc is not None:
@@ -87,7 +88,8 @@ def init_pipeline_cli(
use_gpu: int = Opt(-1, "--gpu-id", "-g", help="GPU ID or -1 for CPU")
# fmt: on
):
- util.logger.setLevel(logging.DEBUG if verbose else logging.INFO)
+ if verbose:
+ util.logger.setLevel(logging.DEBUG)
overrides = parse_config_overrides(ctx.args)
import_code(code_path)
setup_gpu(use_gpu)
@@ -116,7 +118,8 @@ def init_labels_cli(
"""Generate JSON files for the labels in the data. This helps speed up the
training process, since spaCy won't have to preprocess the data to
extract the labels."""
- util.logger.setLevel(logging.DEBUG if verbose else logging.INFO)
+ if verbose:
+ util.logger.setLevel(logging.DEBUG)
if not output_path.exists():
output_path.mkdir(parents=True)
overrides = parse_config_overrides(ctx.args)
diff --git a/spacy/cli/package.py b/spacy/cli/package.py
index 4545578e6..12f195be1 100644
--- a/spacy/cli/package.py
+++ b/spacy/cli/package.py
@@ -403,7 +403,7 @@ def _format_sources(data: Any) -> str:
if author:
result += " ({})".format(author)
sources.append(result)
- return "
-
-
-
- Get a custom spaCy pipeline, tailor-made for your NLP problem by
- spaCy's core developers.
-
-
-
- spaCy v3.0 features all new transformer-based pipelines{' '}
- that bring spaCy's accuracy right up to the current{' '}
- state-of-the-art. You can use any pretrained transformer to
- train your own pipelines, and even share one transformer between multiple
- components with multi-task learning. Training is now fully
- configurable and extensible, and you can define your own custom models using{' '}
- PyTorch, TensorFlow and other frameworks.
+
+
+
+ Get a custom spaCy pipeline, tailor-made for your NLP problem by
+ spaCy's core developers.
+
+
+
-
+