Update emoji library in rule-based matcher example (#13014)

2025-11-07 19:37:38 +03:00 · 2023-09-25 18:20:30 +02:00 · 2023-09-25 18:20:30 +02:00 · b4501db6f8
commit b4501db6f8
parent 935a5455b6
1 changed file with 29 additions and 29 deletions
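
The new assignment in `label_sentiment` derives the description by post-processing a shortcode string. Below is a minimal sketch of that step in isolation. It assumes `emoji.demojize` yields a shortcode of the form `:smiling_face_with_heart-eyes:` for 😍. The shortcode is hard-coded so the snippet runs without the `emoji` package, and `shortcode_to_desc` is a hypothetical helper that is not part of this commit.

```python
def shortcode_to_desc(shortcode: str) -> str:
    """Turn an emoji shortcode into a space-separated description."""
    # The diff passes delimiters=("", "") to demojize to drop the colons;
    # stripping them here is assumed to be equivalent for a single emoji.
    name = shortcode.strip(":")
    # Same final step as in the diff: underscores become spaces.
    return name.replace("_", " ")


# Assumed demojize output for the heart-eyes emoji, hard-coded for illustration.
desc = shortcode_to_desc(":smiling_face_with_heart-eyes:")
print(desc)
```

Under these assumptions the description comes out lowercase, so it may differ in casing from the title-cased "Smiling Face With Heart-Eyes" quoted in the surrounding docs text.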
--- a/website/docs/usage/rule-based-matching.mdx
+++ b/website/docs/usage/rule-based-matching.mdx
@@ -850,14 +850,14 @@ negative pattern. To keep it simple, we'll either add or subtract `0.1` points
 this way, the score will also reflect combinations of emoji, even positive _and_
 negative ones.
-With a library like [Emojipedia](https://github.com/bcongdon/python-emojipedia),
+With a library like [emoji](https://github.com/carpedm20/emoji), we can also
-we can also retrieve a short description for each emoji – for example, 😍's
+retrieve a short description for each emoji – for example, 😍's official title
-official title is "Smiling Face With Heart-Eyes". Assigning it to a
+is "Smiling Face With Heart-Eyes". Assigning it to a
 [custom attribute](/usage/processing-pipelines#custom-components-attributes) on
 the emoji span will make it available as `span._.emoji_desc`.
 ```python
-from emojipedia import Emojipedia  # Installation: pip install emojipedia
+import emoji  # Installation: pip install emoji
 from spacy.tokens import Span  # Get the global Span object
 Span.set_extension("emoji_desc", default=None)  # Register the custom attribute
@@ -869,9 +869,9 @@ def label_sentiment(matcher, doc, i, matches):
    elif doc.vocab.strings[match_id] == "SAD":
        doc.sentiment -= 0.1  # Subtract 0.1 for negative sentiment
    span = doc[start:end]
-    emoji = Emojipedia.search(span[0].text)  # Get data for emoji
+    # Verify if it is an emoji and set the extension attribute correctly.
-    span._.emoji_desc = emoji.title  # Assign emoji description
+    if emoji.is_emoji(span[0].text):
-
+        span._.emoji_desc = emoji.demojize(span[0].text, delimiters=("", ""), language=doc.lang_).replace("_", " ")
 ```
 To label the hashtags, we can use a
@@ -1097,7 +1097,7 @@ come directly from
 [Semgrex](https://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/semgraph/semgrex/SemgrexPattern.html):
 | Symbol                                  | Description                                                                                                                    |
-| --------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------- |
+| --------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------ |
 | `A < B`                                 | `A` is the immediate dependent of `B`.                                                                                         |
 | `A > B`                                 | `A` is the immediate head of `B`.                                                                                              |
 | `A << B`                                | `A` is the dependent in a chain to `B` following dep &rarr; head paths.                                                        |