From 9f02e3c0279af4bd758b8d090f93b99200c686ce Mon Sep 17 00:00:00 2001
From: Ines Montani
Date: Wed, 17 Jul 2019 15:13:50 +0200
Subject: [PATCH] Adjust example

Not actually supported in this alignment interpretation
---
 website/docs/usage/linguistic-features.md | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/website/docs/usage/linguistic-features.md b/website/docs/usage/linguistic-features.md
index cc4bbed6d..d73a7e0db 100644
--- a/website/docs/usage/linguistic-features.md
+++ b/website/docs/usage/linguistic-features.md
@@ -967,9 +967,8 @@ attributes. For details, see the respective usage pages.
 
 spaCy's tokenization is non-destructive and uses language-specific rules
 optimized for compatibility with treebank annotations. Other tools and resources
-can sometimes tokenize things differently – for example, `"I'm"` → `["I", "am"]`
-instead of `["I", "'m"]`, or `"Obama's"` → `["Obama", "'", "s"]` instead of
-`["Obama", "'s"]`.
+can sometimes tokenize things differently – for example, `"I'm"` →
+`["I", "'", "m"]` instead of `["I", "'m"]`.
 
 In cases like that, you often want to align the tokenization so that you can
 merge annotations from different sources together, or take vectors predicted by