mirror of
https://github.com/explosion/spaCy.git
synced 2025-01-26 17:24:41 +03:00
Update CONTRIBUTING.md [ci skip]
parent
1f1fbdba14
commit
6a7ffffeb3
|
@ -3,7 +3,9 @@
|
||||||
# Contribute to spaCy
|
# Contribute to spaCy
|
||||||
|
|
||||||
Thanks for your interest in contributing to spaCy 🎉 The project is maintained
|
Thanks for your interest in contributing to spaCy 🎉 The project is maintained
|
||||||
by [@honnibal](https://github.com/honnibal) and [@ines](https://github.com/ines),
|
by **[@honnibal](https://github.com/honnibal)**,
|
||||||
|
**[@ines](https://github.com/ines)**, **[@svlandeg](https://github.com/svlandeg)** and
|
||||||
|
**[@adrianeboyd](https://github.com/adrianeboyd)**,
|
||||||
and we'll do our best to help you get started. This page will give you a quick
|
and we'll do our best to help you get started. This page will give you a quick
|
||||||
overview of how things are organized and most importantly, how to get involved.
|
overview of how things are organized and most importantly, how to get involved.
|
||||||
|
|
||||||
|
@ -50,8 +52,7 @@ issue body. A few more tips:
|
||||||
parts and don't just dump your entire script. This will make it easier for us to
|
parts and don't just dump your entire script. This will make it easier for us to
|
||||||
reproduce the error.
|
reproduce the error.
|
||||||
|
|
||||||
- **Getting info about your spaCy installation and environment:** If you're
|
- **Getting info about your spaCy installation and environment:** You can use the command line interface to print details and
|
||||||
using spaCy v1.7+, you can use the command line interface to print details and
|
|
||||||
even format them as Markdown to copy-paste into GitHub issues:
|
even format them as Markdown to copy-paste into GitHub issues:
|
||||||
`python -m spacy info --markdown`.
|
`python -m spacy info --markdown`.
|
||||||
|
|
||||||
|
@ -60,7 +61,7 @@ issue body. A few more tips:
|
||||||
model is incompatible with your spaCy installation. In spaCy v2.0+, you can check
|
model is incompatible with your spaCy installation. In spaCy v2.0+, you can check
|
||||||
this on the command line by running `python -m spacy validate`.
|
this on the command line by running `python -m spacy validate`.
|
||||||
|
|
||||||
- **Sharing a model's output, like dependencies and entities:** spaCy v2.0+
|
- **Sharing a model's output, like dependencies and entities:** spaCy
|
||||||
comes with [built-in visualizers](https://spacy.io/usage/visualizers) that
|
comes with [built-in visualizers](https://spacy.io/usage/visualizers) that
|
||||||
you can run from within your script or a Jupyter notebook. For some issues, it's
|
you can run from within your script or a Jupyter notebook. For some issues, it's
|
||||||
helpful to **include a screenshot** of the visualization. You can simply drag and
|
helpful to **include a screenshot** of the visualization. You can simply drag and
|
||||||
|
@ -99,7 +100,7 @@ questions:
|
||||||
changes to spaCy's built-in methods. In contrast, a library of word
|
changes to spaCy's built-in methods. In contrast, a library of word
|
||||||
alignment functions could easily live as a separate package that depended on
|
alignment functions could easily live as a separate package that depended on
|
||||||
spaCy — there's little difference between writing `import word_aligner` and
|
spaCy — there's little difference between writing `import word_aligner` and
|
||||||
`import spacy.word_aligner`. spaCy v2.0+ makes it easy to implement
|
`import spacy.word_aligner`. spaCy makes it easy to implement
|
||||||
[custom pipeline components](https://spacy.io/usage/processing-pipelines#custom-components),
|
[custom pipeline components](https://spacy.io/usage/processing-pipelines#custom-components),
|
||||||
and add your own attributes, properties and methods to the `Doc`, `Token` and
|
and add your own attributes, properties and methods to the `Doc`, `Token` and
|
||||||
`Span`. If you're looking to implement a new spaCy feature, starting with a
|
`Span`. If you're looking to implement a new spaCy feature, starting with a
|
||||||
|
@ -109,8 +110,8 @@ questions:
|
||||||
library later.
|
library later.
|
||||||
|
|
||||||
- **Would the feature be easier to implement if it relied on "heavy" dependencies spaCy doesn't currently require?**
|
- **Would the feature be easier to implement if it relied on "heavy" dependencies spaCy doesn't currently require?**
|
||||||
Python has a very rich ecosystem. Libraries like scikit-learn, SciPy, Gensim or
|
Python has a very rich ecosystem. Libraries like PyTorch, TensorFlow, scikit-learn, SciPy or Gensim
|
||||||
TensorFlow/Keras do lots of useful things — but we don't want to have them as
|
do lots of useful things — but we don't want to have them as default
|
||||||
dependencies. If the feature requires functionality in one of these libraries,
|
dependencies. If the feature requires functionality in one of these libraries,
|
||||||
it's probably better to break it out into a different package.
|
it's probably better to break it out into a different package.
|
||||||
|
|
||||||
|
@ -137,19 +138,7 @@ files, a compiler, [pip](https://pip.pypa.io/en/latest/installing/),
|
||||||
[virtualenv](https://virtualenv.pypa.io/en/stable/) and
|
[virtualenv](https://virtualenv.pypa.io/en/stable/) and
|
||||||
[git](https://git-scm.com) installed. The compiler is usually the trickiest part.
|
[git](https://git-scm.com) installed. The compiler is usually the trickiest part.
|
||||||
|
|
||||||
```
|
If you've made changes to `.pyx` files, you need to **recompile spaCy** before you
|
||||||
python -m pip install -U pip
|
|
||||||
git clone https://github.com/explosion/spaCy
|
|
||||||
cd spaCy
|
|
||||||
|
|
||||||
python -m venv .env
|
|
||||||
source .env/bin/activate
|
|
||||||
export PYTHONPATH=`pwd`
|
|
||||||
pip install -r requirements.txt
|
|
||||||
python setup.py build_ext --inplace
|
|
||||||
```
|
|
||||||
|
|
||||||
If you've made changes to `.pyx` files, you need to recompile spaCy before you
|
|
||||||
can test your changes by re-running `python setup.py build_ext --inplace`.
|
can test your changes by re-running `python setup.py build_ext --inplace`.
|
||||||
Changes to `.py` files will be effective immediately.
|
Changes to `.py` files will be effective immediately.
|
||||||
|
|
||||||
|
@ -184,7 +173,7 @@ sure your test passes and reference the issue in your commit message.
|
||||||
## Code conventions
|
## Code conventions
|
||||||
|
|
||||||
Code should loosely follow [pep8](https://www.python.org/dev/peps/pep-0008/).
|
Code should loosely follow [pep8](https://www.python.org/dev/peps/pep-0008/).
|
||||||
As of `v2.1.0`, spaCy uses [`black`](https://github.com/ambv/black) for code
|
spaCy uses [`black`](https://github.com/ambv/black) for code
|
||||||
formatting and [`flake8`](http://flake8.pycqa.org/en/latest/) for linting its
|
formatting and [`flake8`](http://flake8.pycqa.org/en/latest/) for linting its
|
||||||
Python modules. If you've built spaCy from source, you'll already have both
|
Python modules. If you've built spaCy from source, you'll already have both
|
||||||
tools installed.
|
tools installed.
|
||||||
|
@ -216,8 +205,7 @@ list of available editor integrations.
|
||||||
#### Disabling formatting
|
#### Disabling formatting
|
||||||
|
|
||||||
There are a few cases where auto-formatting doesn't improve readability – for
|
There are a few cases where auto-formatting doesn't improve readability – for
|
||||||
example, in some of the language data files like the `tag_map.py`, or in
|
example, in some of the language data files or in the tests that construct `Doc` objects from lists of words and other labels.
|
||||||
the tests that construct `Doc` objects from lists of words and other labels.
|
|
||||||
Wrapping a block in `# fmt: off` and `# fmt: on` lets you disable formatting
|
Wrapping a block in `# fmt: off` and `# fmt: on` lets you disable formatting
|
||||||
for that particular code. Here's an example:
|
for that particular code. Here's an example:
|
||||||
|
|
||||||
|
@ -281,6 +269,9 @@ except: # noqa: E722
|
||||||
### Python conventions
|
### Python conventions
|
||||||
|
|
||||||
All Python code must be written **compatible with Python 3.6+**.
|
All Python code must be written **compatible with Python 3.6+**.
|
||||||
|
|
||||||
|
#### I/O and handling paths
|
||||||
|
|
||||||
Code that interacts with the file-system should accept objects that follow the
|
Code that interacts with the file-system should accept objects that follow the
|
||||||
`pathlib.Path` API, without assuming that the object inherits from `pathlib.Path`.
|
`pathlib.Path` API, without assuming that the object inherits from `pathlib.Path`.
|
||||||
If the function is user-facing and takes a path as an argument, it should check
|
If the function is user-facing and takes a path as an argument, it should check
|
||||||
|
@ -290,14 +281,18 @@ accept **file-like objects**, as it makes the library IO-agnostic. Working on
|
||||||
buffers makes the code more general, easier to test, and compatible with Python
|
buffers makes the code more general, easier to test, and compatible with Python
|
||||||
3's asynchronous IO.
|
3's asynchronous IO.
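
As a minimal sketch of the I/O guidance above: the core function works on any file-like object, and a thin user-facing wrapper deals with paths. The names `read_labels` and `read_labels_from_path` are hypothetical and not part of spaCy. Converting strings to `pathlib.Path` in the wrapper is an assumption about what the path check should do.

```python
import io
from pathlib import Path


def read_labels(file_):
    """Read one label per line from any file-like object (hypothetical helper)."""
    return [line.strip() for line in file_ if line.strip()]


def read_labels_from_path(path):
    """User-facing wrapper (hypothetical): accepts a string or a Path-like object."""
    # Assumption: only plain strings are converted. Any other object is used
    # as-is, as long as it provides a pathlib-style .open() method.
    if isinstance(path, str):
        path = Path(path)
    with path.open("r", encoding="utf8") as file_:
        return read_labels(file_)


# Because the core function works on buffers, it can be exercised without
# touching the file system.
buffer = io.StringIO("PERSON\nORG\n\nGPE\n")
labels = read_labels(buffer)
```

Keeping the parsing logic in the buffer-based function means tests can pass an `io.StringIO`, while only the thin wrapper depends on the file system.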
|
||||||
|
|
||||||
|
#### Composition vs. inheritance
|
||||||
|
|
||||||
Although spaCy uses a lot of classes, **inheritance is viewed with some suspicion**
|
Although spaCy uses a lot of classes, **inheritance is viewed with some suspicion**
|
||||||
— it's seen as a mechanism of last resort. You should discuss plans to extend
|
— it's seen as a mechanism of last resort. You should discuss plans to extend
|
||||||
the class hierarchy before implementing.
|
the class hierarchy before implementing.
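
To illustrate the preference for composition, here is a small sketch in plain Python. Both classes are hypothetical and unrelated to spaCy's actual tokenizer. The second class wraps an instance of the first instead of subclassing it.

```python
class WhitespaceTokenizer:
    """Hypothetical tokenizer that splits text on whitespace."""

    def __call__(self, text):
        return text.split()


class UppercaseTokenizer:
    """Hypothetical wrapper: holds another tokenizer instead of inheriting from it."""

    def __init__(self, tokenizer):
        # Any callable that maps a string to a list of strings will do.
        self.tokenizer = tokenizer

    def __call__(self, text):
        return [word.upper() for word in self.tokenizer(text)]


tokenize = UppercaseTokenizer(WhitespaceTokenizer())
words = tokenize("hello spaCy")
```

The wrapper only relies on the wrapped object being callable, so the inner tokenizer can be swapped without changing any class hierarchy.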
|
||||||
|
|
||||||
|
#### Naming conventions
|
||||||
|
|
||||||
We have a number of conventions around variable naming that are still being
|
We have a number of conventions around variable naming that are still being
|
||||||
documented, and aren't 100% strict. A general policy is that instances of the
|
documented, and aren't 100% strict. A general policy is that instances of the
|
||||||
class `Doc` should by default be called `doc`, `Token` `token`, `Lexeme` `lex`,
|
class `Doc` should by default be called `doc`, `Token` → `token`, `Lexeme` → `lex`,
|
||||||
`Vocab` `vocab` and `Language` `nlp`. You should avoid naming variables that are
|
`Vocab` → `vocab` and `Language` → `nlp`. You should avoid naming variables that are
|
||||||
of other types these names. For instance, don't name a text string `doc` — you
|
of other types these names. For instance, don't name a text string `doc` — you
|
||||||
should usually call this `text`. Two general code style preferences further help
|
should usually call this `text`. Two general code style preferences further help
|
||||||
with naming. First, **lean away from introducing temporary variables**, as these
|
with naming. First, **lean away from introducing temporary variables**, as these
|
||||||
|
|