mirror of
				https://github.com/explosion/spaCy.git
				synced 2025-10-31 07:57:35 +03:00 
			
		
		
		
	
		
			
				
	
	
		
			131 lines
		
	
	
		
			5.4 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
			
		
		
	
	
			131 lines
		
	
	
		
			5.4 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
| <a href="https://explosion.ai"><img src="https://explosion.ai/assets/img/logo.svg" width="125" height="125" align="right" /></a>
 | ||
| 
 | ||
| # spaCy examples
 | ||
| 
 | ||
| For spaCy v3 we've converted many of the [v2 example
 | ||
| scripts](https://github.com/explosion/spaCy/tree/v2.3.x/examples/) into
 | ||
| end-to-end [spacy projects](https://spacy.io/usage/projects) workflows. The
 | ||
| workflows include all the steps to go from data to packaged spaCy models.
 | ||
| 
 | ||
| ## 🪐 Pipeline component demos
 | ||
| 
 | ||
| The simplest demos for training a single pipeline component are in the
 | ||
| [`pipelines`](https://github.com/explosion/projects/blob/v3/pipelines) category
 | ||
| including:
 | ||
| 
 | ||
| - [`pipelines/ner_demo`](https://github.com/explosion/projects/blob/v3/pipelines/ner_demo):
 | ||
|   Train a named entity recognizer
 | ||
| - [`pipelines/textcat_demo`](https://github.com/explosion/projects/blob/v3/pipelines/textcat_demo):
 | ||
|   Train a text classifier
 | ||
| - [`pipelines/parser_intent_demo`](https://github.com/explosion/projects/blob/v3/pipelines/parser_intent_demo):
 | ||
|   Train a dependency parser for custom semantics
 | ||
| 
 | ||
| ## 🪐 Tutorials
 | ||
| 
 | ||
| The [`tutorials`](https://github.com/explosion/projects/blob/v3/tutorials)
 | ||
| category includes examples that work through specific NLP use cases end-to-end:
 | ||
| 
 | ||
| - [`tutorials/textcat_goemotions`](https://github.com/explosion/projects/blob/v3/tutorials/textcat_goemotions):
 | ||
|   Train a text classifier to categorize emotions in Reddit posts
 | ||
| - [`tutorials/nel_emerson`](https://github.com/explosion/projects/blob/v3/tutorials/nel_emerson):
 | ||
|   Use an entity linker to disambiguate mentions of the same name
 | ||
| 
 | ||
| Check out the [projects documentation](https://spacy.io/usage/projects) and
 | ||
| browse through the [available
 | ||
| projects](https://github.com/explosion/projects/)!
 | ||
| 
 | ||
| ## 🚀 Get started with a demo project
 | ||
| 
 | ||
| The
 | ||
| [`pipelines/ner_demo`](https://github.com/explosion/projects/blob/v3/pipelines/ner_demo)
 | ||
| project converts the spaCy v2
 | ||
| [`train_ner.py`](https://github.com/explosion/spaCy/blob/v2.3.x/examples/training/train_ner.py)
 | ||
| demo script into a spaCy v3 project.
 | ||
| 
 | ||
| 1. Clone the project:
 | ||
| 
 | ||
|    ```bash
 | ||
|    python -m spacy project clone pipelines/ner_demo
 | ||
|    ```
 | ||
| 
 | ||
| 2. Install requirements and download any data assets:
 | ||
| 
 | ||
|    ```bash
 | ||
|    cd ner_demo
 | ||
|    python -m pip install -r requirements.txt
 | ||
|    python -m spacy project assets
 | ||
|    ```
 | ||
| 
 | ||
| 3. Run the default workflow to convert, train and evaluate:
 | ||
| 
 | ||
|    ```bash
 | ||
|    python -m spacy project run all
 | ||
|    ```
 | ||
| 
 | ||
|    Sample output:
 | ||
| 
 | ||
|    ```none
 | ||
|    ℹ Running workflow 'all'
 | ||
|    
 | ||
|    ================================== convert ==================================
 | ||
|    Running command: /home/user/venv/bin/python scripts/convert.py en assets/train.json corpus/train.spacy
 | ||
|    Running command: /home/user/venv/bin/python scripts/convert.py en assets/dev.json corpus/dev.spacy
 | ||
|    
 | ||
|    =============================== create-config ===============================
 | ||
|    Running command: /home/user/venv/bin/python -m spacy init config --lang en --pipeline ner configs/config.cfg --force
 | ||
|    ℹ Generated config template specific for your use case
 | ||
|    - Language: en
 | ||
|    - Pipeline: ner
 | ||
|    - Optimize for: efficiency
 | ||
|    - Hardware: CPU
 | ||
|    - Transformer: None
 | ||
|    ✔ Auto-filled config with all values
 | ||
|    ✔ Saved config
 | ||
|    configs/config.cfg
 | ||
|    You can now add your data and train your pipeline:
 | ||
|    python -m spacy train config.cfg --paths.train ./train.spacy --paths.dev ./dev.spacy
 | ||
|    
 | ||
|    =================================== train ===================================
 | ||
|    Running command: /home/user/venv/bin/python -m spacy train configs/config.cfg --output training/ --paths.train corpus/train.spacy --paths.dev corpus/dev.spacy --training.eval_frequency 10 --training.max_steps 100 --gpu-id -1
 | ||
|    ℹ Using CPU
 | ||
|    
 | ||
|    =========================== Initializing pipeline ===========================
 | ||
|    [2021-03-11 19:34:59,101] [INFO] Set up nlp object from config
 | ||
|    [2021-03-11 19:34:59,109] [INFO] Pipeline: ['tok2vec', 'ner']
 | ||
|    [2021-03-11 19:34:59,113] [INFO] Created vocabulary
 | ||
|    [2021-03-11 19:34:59,113] [INFO] Finished initializing nlp object
 | ||
|    [2021-03-11 19:34:59,265] [INFO] Initialized pipeline components: ['tok2vec', 'ner']
 | ||
|    ✔ Initialized pipeline
 | ||
|    
 | ||
|    ============================= Training pipeline =============================
 | ||
|    ℹ Pipeline: ['tok2vec', 'ner']
 | ||
|    ℹ Initial learn rate: 0.001
 | ||
|    E    #       LOSS TOK2VEC  LOSS NER  ENTS_F  ENTS_P  ENTS_R  SCORE 
 | ||
|    ---  ------  ------------  --------  ------  ------  ------  ------
 | ||
|      0       0          0.00      7.90    0.00    0.00    0.00    0.00
 | ||
|     10      10          0.11     71.07    0.00    0.00    0.00    0.00
 | ||
|     20      20          0.65     22.44   50.00   50.00   50.00    0.50
 | ||
|     30      30          0.22      6.38   80.00   66.67  100.00    0.80
 | ||
|     40      40          0.00      0.00   80.00   66.67  100.00    0.80
 | ||
|     50      50          0.00      0.00   80.00   66.67  100.00    0.80
 | ||
|     60      60          0.00      0.00  100.00  100.00  100.00    1.00
 | ||
|     70      70          0.00      0.00  100.00  100.00  100.00    1.00
 | ||
|     80      80          0.00      0.00  100.00  100.00  100.00    1.00
 | ||
|     90      90          0.00      0.00  100.00  100.00  100.00    1.00
 | ||
|    100     100          0.00      0.00  100.00  100.00  100.00    1.00
 | ||
|    ✔ Saved pipeline to output directory
 | ||
|    training/model-last
 | ||
|    ```
 | ||
| 
 | ||
| 4. Package the model:
 | ||
| 
 | ||
|    ```bash
 | ||
|    python -m spacy project run package
 | ||
|    ```
 | ||
| 
 | ||
| 5. Visualize the model's output with [Streamlit](https://streamlit.io):
 | ||
| 
 | ||
|    ```bash
 | ||
|    python -m spacy project run visualize-model
 | ||
|    ```
 |