Update emoji library in rule-based matcher example (#13014)

Madeesh Kannan 2023-09-25 18:20:30 +02:00 committed by GitHub
parent 935a5455b6
commit b4501db6f8

@@ -850,14 +850,14 @@ negative pattern. To keep it simple, we'll either add or subtract `0.1` points
 this way, the score will also reflect combinations of emoji, even positive _and_
 negative ones.
 
-With a library like [Emojipedia](https://github.com/bcongdon/python-emojipedia),
-we can also retrieve a short description for each emoji for example, 😍's
-official title is "Smiling Face With Heart-Eyes". Assigning it to a
+With a library like [emoji](https://github.com/carpedm20/emoji), we can also
+retrieve a short description for each emoji for example, 😍's official title
+is "Smiling Face With Heart-Eyes". Assigning it to a
 [custom attribute](/usage/processing-pipelines#custom-components-attributes) on
 the emoji span will make it available as `span._.emoji_desc`.
 
 ```python
-from emojipedia import Emojipedia # Installation: pip install emojipedia
+import emoji # Installation: pip install emoji
 from spacy.tokens import Span # Get the global Span object
 
 Span.set_extension("emoji_desc", default=None) # Register the custom attribute
@@ -869,9 +869,9 @@ def label_sentiment(matcher, doc, i, matches):
     elif doc.vocab.strings[match_id] == "SAD":
         doc.sentiment -= 0.1 # Subtract 0.1 for negative sentiment
     span = doc[start:end]
-    emoji = Emojipedia.search(span[0].text) # Get data for emoji
-    span._.emoji_desc = emoji.title # Assign emoji description
-
+    # Verify if it is an emoji and set the extension attribute correctly.
+    if emoji.is_emoji(span[0].text):
+        span._.emoji_desc = emoji.demojize(span[0].text, delimiters=("", ""), language=doc.lang_).replace("_", " ")
 ```
 
 To label the hashtags, we can use a
@@ -1097,7 +1097,7 @@ come directly from
 [Semgrex](https://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/semgraph/semgrex/SemgrexPattern.html):
 
 | Symbol | Description |
-| --------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------- |
+| --------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------ |
 | `A < B` | `A` is the immediate dependent of `B`. |
 | `A > B` | `A` is the immediate head of `B`. |
 | `A << B` | `A` is the dependent in a chain to `B` following dep &rarr; head paths. |
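
The replacement lookup in the `label_sentiment` hunk can be tried outside spaCy. The following is a minimal sketch and not part of the commit: `describe_emoji` is a hypothetical helper name, and the `unicodedata` fallback is an assumption added only so the snippet still runs where the `emoji` package is not installed. The fallback returns Unicode character names, which may differ from the names `emoji` provides, and its symbol check is only a rough approximation of `emoji.is_emoji`.

```python
import unicodedata

try:
    import emoji  # Third-party package; installation: pip install emoji
except ImportError:
    emoji = None  # Assumption: fall back to the standard library below


def describe_emoji(char, lang="en"):
    """Return a lowercase description for a single emoji, or None.

    Hypothetical helper that mirrors the lookup used in the commit.
    """
    if emoji is not None:
        # Same calls as in the commit: check first, then strip the
        # ":" delimiters and turn underscores into spaces.
        if not emoji.is_emoji(char):
            return None
        name = emoji.demojize(char, delimiters=("", ""), language=lang)
        return name.replace("_", " ")

    # Fallback (assumption, not the commit's behavior): treat single
    # characters in the "Symbol, other" category as emoji and use
    # their Unicode character name.
    if len(char) != 1 or unicodedata.category(char) != "So":
        return None
    name = unicodedata.name(char, None)
    return name.lower() if name else None


print(describe_emoji("😍"))
print(describe_emoji("a"))
```

Inside `label_sentiment`, the equivalent call would pass `span[0].text` and `doc.lang_`, as the commit does. Checking `is_emoji` first leaves `span._.emoji_desc` at its registered default of `None` when the matched token is not an emoji.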