Fix formatting

This commit is contained in:
ines 2017-11-01 00:43:22 +01:00
parent 5683fd65ed
commit 4b196fdf7f
10 changed files with 22 additions and 28 deletions

View File

@ -1,7 +1,6 @@
#!/usr/bin/env python
# coding: utf8
"""
A simple example of extracting relations between phrases and entities using
"""A simple example of extracting relations between phrases and entities using
spaCy's named entity recognizer and the dependency parse. Here, we extract
money and currency values (entities labelled as MONEY) and then check the
dependency tree to find the noun phrase they are referring to for example:

View File

@ -1,8 +1,7 @@
#!/usr/bin/env python
# coding: utf8
"""
This example shows how to navigate the parse tree including subtrees attached
to a word.
"""This example shows how to navigate the parse tree including subtrees
attached to a word.
Based on issue #252:
"In the documents and tutorials the main thing I haven't found is

View File

@ -1,9 +1,10 @@
#!/usr/bin/env python
# coding: utf8
"""Match a large set of multi-word expressions in O(1) time.
The idea is to associate each word in the vocabulary with a tag, noting whether
they begin, end, or are inside at least one pattern. An additional tag is used
for single-word patterns. Complete patterns are also stored in a hash set.
When we process a document, we look up the words in the vocabulary, to
associate the words with the tags. We then search for tag-sequences that
correspond to valid candidates. Finally, we look up the candidates in the hash

View File

@ -1,5 +1,6 @@
"""
Example of multi-processing with Joblib. Here, we're exporting
#!/usr/bin/env python
# coding: utf8
"""Example of multi-processing with Joblib. Here, we're exporting
part-of-speech-tagged, true-cased, (very roughly) sentence-separated text, with
each "sentence" on a newline, and spaces between tokens. Data is loaded from
the IMDB movie reviews dataset and will be loaded automatically via Thinc's

View File

@ -1,7 +1,6 @@
#!/usr/bin/env python
# coding: utf8
"""
Example of training spaCy's named entity recognizer, starting off with an
"""Example of training spaCy's named entity recognizer, starting off with an
existing model or a blank model.
For more details, see the documentation:

View File

@ -1,7 +1,6 @@
#!/usr/bin/env python
# coding: utf8
"""
Example of training an additional entity type
"""Example of training an additional entity type
This script shows how to add a new entity type to an existing pre-trained NER
model. To keep the example short and simple, only four sentences are provided

View File

@ -1,10 +1,7 @@
#!/usr/bin/env python
# coding: utf8
"""
Example of training spaCy dependency parser, starting off with an existing model
or a blank model.
For more details, see the documentation:
"""Example of training spaCy dependency parser, starting off with an existing
model or a blank model. For more details, see the documentation:
* Training: https://alpha.spacy.io/usage/training
* Dependency Parse: https://alpha.spacy.io/usage/linguistic-features#dependency-parse

View File

@ -3,9 +3,8 @@
"""
A simple example for training a part-of-speech tagger with a custom tag map.
To allow us to update the tag map with our custom one, this example starts off
with a blank Language class and modifies its defaults.
For more details, see the documentation:
with a blank Language class and modifies its defaults. For more details, see
the documentation:
* Training: https://alpha.spacy.io/usage/training
* POS Tagging: https://alpha.spacy.io/usage/linguistic-features#pos-tagging

View File

@ -3,9 +3,8 @@
"""Train a multi-label convolutional neural network text classifier on the
IMDB dataset, using the TextCategorizer component. The dataset will be loaded
automatically via Thinc's built-in dataset loader. The model is added to
spacy.pipeline, and predictions are available via `doc.cats`.
For more details, see the documentation:
spacy.pipeline, and predictions are available via `doc.cats`. For more details,
see the documentation:
* Training: https://alpha.spacy.io/usage/training
* Text classification: https://alpha.spacy.io/usage/text-classification

View File

@ -13,8 +13,7 @@ import from spacy.language import Language
@plac.annotations(
vectors_loc=("Path to vectors", "positional", None, str))
def main(vectors_loc):
nlp = Language()
nlp = Language() # start off with a blank Language class
with open(vectors_loc, 'rb') as file_:
header = file_.readline()
nr_row, nr_dim = header.split()
@ -24,9 +23,11 @@ def main(vectors_loc):
pieces = line.split()
word = pieces[0]
vector = numpy.asarray([float(v) for v in pieces[1:]], dtype='f')
nlp.vocab.set_vector(word, vector)
doc = nlp(u'class colspan')
print(doc[0].similarity(doc[1]))
nlp.vocab.set_vector(word, vector) # add the vectors to the vocab
# test the vectors and similarity
text = 'class colspan'
doc = nlp(text)
print(text, doc[0].similarity(doc[1]))
if __name__ == '__main__':