mirror of
				https://github.com/explosion/spaCy.git
				synced 2025-11-04 01:48:04 +03:00 
			
		
		
		
	
		
			
				
	
	
		
			95 lines
		
	
	
		
			1.9 KiB
		
	
	
	
		
			ReStructuredText
		
	
	
	
	
	
			
		
		
	
	
			95 lines
		
	
	
		
			1.9 KiB
		
	
	
	
		
			ReStructuredText
		
	
	
	
	
	
spacy.en.EN
 | 
						|
============
 | 
						|
 | 
						|
.. automodule:: spacy.en
 | 
						|
 | 
						|
Tokenizer API
 | 
						|
-------------
 | 
						|
 | 
						|
.. automethod:: spacy.en.EN.tokenize
 | 
						|
    :noindex:
 | 
						|
 | 
						|
.. automethod:: spacy.en.EN.lookup
 | 
						|
    :noindex:
 | 
						|
 | 
						|
Lexeme Features Flag IDs
 | 
						|
------------------------
 | 
						|
 | 
						|
A number of boolean features are computed for English Lexemes. To access a feature,
 | 
						|
pass its ID to the :py:meth:`spacy.word.Lexeme.check_flag` function.
 | 
						|
 | 
						|
Orthographic Features
 | 
						|
---------------------
 | 
						|
 | 
						|
These features describe the `orthographic` (lettering) type of the word. The
 | 
						|
function used to compute the value is listed along with the flag.
 | 
						|
 | 
						|
.. data:: IS_ALPHA
 | 
						|
 | 
						|
    :py:func:`spacy.orth.is_alpha`
 | 
						|
 | 
						|
.. data:: IS_DIGIT
 | 
						|
    
 | 
						|
    :py:func:`spacy.orth.is_digit`
 | 
						|
    
 | 
						|
.. data:: IS_UPPER
 | 
						|
 | 
						|
    :py:func:`spacy.orth.is_upper`
 | 
						|
 | 
						|
.. data:: IS_PUNCT
 | 
						|
 | 
						|
    :py:func:`spacy.orth.is_punct`
 | 
						|
 | 
						|
.. data:: IS_SPACE
 | 
						|
 | 
						|
    :py:func:`spacy.orth.is_space`
 | 
						|
 | 
						|
.. data:: IS_ASCII
 | 
						|
 | 
						|
    :py:func:`spacy.orth.is_ascii`
 | 
						|
 | 
						|
.. data:: IS_TITLE
 | 
						|
 | 
						|
    :py:func:`spacy.orth.is_title`
 | 
						|
 | 
						|
.. data:: IS_LOWER
 | 
						|
 | 
						|
    :py:func:`spacy.orth.is_lower`
 | 
						|
 | 
						|
.. data:: IS_UPPER
 | 
						|
 | 
						|
    :py:func:`spacy.orth.is_upper`
 | 
						|
 | 
						|
Distributional Orthographic Features
 | 
						|
------------------------------------
 | 
						|
 | 
						|
These features describe how often the lower-cased form of the word appears
 | 
						|
in various case-styles in a large sample of English text. See :py:func:`spacy.orth.oft_case`
 | 
						|
 | 
						|
.. data:: OFT_UPPER
 | 
						|
.. data:: OFT_LOWER
 | 
						|
.. data:: OFT_TITLE
 | 
						|
 | 
						|
 | 
						|
Tag Dictionary Features
 | 
						|
-----------------------
 | 
						|
 | 
						|
These features describe whether the word commonly occurs with a given
 | 
						|
part-of-speech, in a large text corpus, using a part-of-speech tagger designed
 | 
						|
to reduce the tag-dictionary bias of its training corpus. See
 | 
						|
:py:func:`spacy.orth.can_tag`.
 | 
						|
 | 
						|
.. data:: CAN_PUNCT
 | 
						|
.. data:: CAN_CONJ
 | 
						|
.. data:: CAN_NUM
 | 
						|
.. data:: CAN_DET
 | 
						|
.. data:: CAN_ADP
 | 
						|
.. data:: CAN_ADJ
 | 
						|
.. data:: CAN_ADV
 | 
						|
.. data:: CAN_VERB
 | 
						|
.. data:: CAN_NOUN
 | 
						|
.. data:: CAN_PDT
 | 
						|
.. data:: CAN_POS
 | 
						|
.. data:: CAN_PRON
 | 
						|
.. data:: CAN_PRT
 |