---
title: StringStore
tag: class
source: spacy/strings.pyx
---

Look up strings by 64-bit hashes. As of v2.0, spaCy uses hash values instead of
integer IDs. This ensures that strings always map to the same ID, even from
different `StringStores`.
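
To illustrate the point above, here is a minimal sketch showing that two independently created stores give the same hash for the same string. The names `store_a` and `store_b` are illustrative only and do not come from the library.

```python
from spacy.strings import StringStore

# Two separate stores, built from different strings in a different order.
store_a = StringStore(["apple", "orange"])
store_b = StringStore(["banana", "apple"])

# The hash depends only on the string, not on the store that produced it.
apple_hash = store_a["apple"]
assert apple_hash == store_b["apple"]

# Either store can resolve the hash back to text, since both contain "apple".
assert store_b[apple_hash] == "apple"
```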

## StringStore.\_\_init\_\_ {#init tag="method"}

Create the `StringStore`.

> #### Example
>
> ```python
> from spacy.strings import StringStore
> stringstore = StringStore(["apple", "orange"])
> ```

| Name        | Type          | Description                                |
| ----------- | ------------- | ------------------------------------------ |
| `strings`   | iterable      | A sequence of strings to add to the store. |
| **RETURNS** | `StringStore` | The newly constructed object.              |

## StringStore.\_\_len\_\_ {#len tag="method"}

Get the number of strings in the store.

> #### Example
>
> ```python
> stringstore = StringStore(["apple", "orange"])
> assert len(stringstore) == 2
> ```

| Name        | Type | Description                         |
| ----------- | ---- | ----------------------------------- |
| **RETURNS** | int  | The number of strings in the store. |

## StringStore.\_\_getitem\_\_ {#getitem tag="method"}

Retrieve a string from a given hash, or vice versa.

> #### Example
>
> ```python
> stringstore = StringStore(["apple", "orange"])
> apple_hash = stringstore["apple"]
> assert apple_hash == 8566208034543834098
> assert stringstore[apple_hash] == "apple"
> ```

| Name           | Type                 | Description                                      |
| -------------- | -------------------- | ------------------------------------------------ |
| `string_or_id` | bytes, str or uint64 | The string to hash, or the hash to look up.      |
| **RETURNS**    | str or int           | The hash for a string, or the string for a hash. |

## StringStore.\_\_contains\_\_ {#contains tag="method"}

Check whether a string is in the store.

> #### Example
>
> ```python
> stringstore = StringStore(["apple", "orange"])
> assert "apple" in stringstore
> assert "cherry" not in stringstore
> ```

| Name        | Type | Description                            |
| ----------- | ---- | -------------------------------------- |
| `string`    | str  | The string to check.                   |
| **RETURNS** | bool | Whether the store contains the string. |

## StringStore.\_\_iter\_\_ {#iter tag="method"}

Iterate over the strings in the store, in order. Note that a newly initialized
store will always include an empty string `''` at position `0`.

> #### Example
>
> ```python
> stringstore = StringStore(["apple", "orange"])
> all_strings = [s for s in stringstore]
> assert all_strings == ["apple", "orange"]
> ```

| Name       | Type | Description            |
| ---------- | ---- | ---------------------- |
| **YIELDS** | str  | A string in the store. |

## StringStore.add {#add tag="method" new="2"}

Add a string to the `StringStore`.

> #### Example
>
> ```python
> stringstore = StringStore(["apple", "orange"])
> banana_hash = stringstore.add("banana")
> assert len(stringstore) == 3
> assert banana_hash == 2525716904149915114
> assert stringstore[banana_hash] == "banana"
> assert stringstore["banana"] == banana_hash
> ```

| Name        | Type   | Description              |
| ----------- | ------ | ------------------------ |
| `string`    | str    | The string to add.       |
| **RETURNS** | uint64 | The string's hash value. |

## StringStore.to_disk {#to_disk tag="method" new="2"}

Save the current state to a directory.

> #### Example
>
> ```python
> stringstore = StringStore(["apple", "orange"])
> stringstore.to_disk("/path/to/strings")
> ```

| Name   | Type         | Description                                                                                                           |
| ------ | ------------ | --------------------------------------------------------------------------------------------------------------------- |
| `path` | str / `Path` | A path to a directory, which will be created if it doesn't exist. Paths may be either strings or `Path`-like objects. |

## StringStore.from_disk {#from_disk tag="method" new="2"}

Load state from a directory. Modifies the object in place and returns it.

> #### Example
>
> ```python
> from spacy.strings import StringStore
> stringstore = StringStore().from_disk("/path/to/strings")
> ```

| Name        | Type          | Description                                                                |
| ----------- | ------------- | -------------------------------------------------------------------------- |
| `path`      | str / `Path`  | A path to a directory. Paths may be either strings or `Path`-like objects. |
| **RETURNS** | `StringStore` | The modified `StringStore` object.                                         |
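
A round trip through `to_disk` and `from_disk` might look like the sketch below. It assumes a temporary directory from the standard library's `tempfile` module and a target path inside it that does not exist yet. The names `tmp_dir`, `target`, `original` and `restored` are illustrative.

```python
import tempfile
from pathlib import Path

from spacy.strings import StringStore

original = StringStore(["apple", "orange"])

with tempfile.TemporaryDirectory() as tmp_dir:
    # Assumption: a fresh path that does not exist before saving.
    target = Path(tmp_dir) / "strings"
    original.to_disk(target)
    # from_disk modifies the new store in place and returns it.
    restored = StringStore().from_disk(target)

# The restored store resolves the same hashes to the same strings.
assert restored[original["apple"]] == "apple"
assert "orange" in restored
```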

## StringStore.to_bytes {#to_bytes tag="method"}

Serialize the current state to a binary string.

> #### Example
>
> ```python
> stringstore = StringStore(["apple", "orange"])
> store_bytes = stringstore.to_bytes()
> ```

| Name        | Type  | Description                                      |
| ----------- | ----- | ------------------------------------------------ |
| **RETURNS** | bytes | The serialized form of the `StringStore` object. |

## StringStore.from_bytes {#from_bytes tag="method"}

Load state from a binary string.

> #### Example
>
> ```python
> from spacy.strings import StringStore
> stringstore = StringStore(["apple", "orange"])
> store_bytes = stringstore.to_bytes()
> new_store = StringStore().from_bytes(store_bytes)
> ```

| Name         | Type          | Description               |
| ------------ | ------------- | ------------------------- |
| `bytes_data` | bytes         | The data to load from.    |
| **RETURNS**  | `StringStore` | The `StringStore` object. |

## Utilities {#util}

### strings.hash_string {#hash_string tag="function"}

Get a 64-bit hash for a given string.

> #### Example
>
> ```python
> from spacy.strings import hash_string
> assert hash_string("apple") == 8566208034543834098
> ```

| Name        | Type   | Description         |
| ----------- | ------ | ------------------- |
| `string`    | str    | The string to hash. |
| **RETURNS** | uint64 | The hash.           |
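
The sketch below suggests how `hash_string` relates to the store: it computes the hash without needing a `StringStore` instance, and for an ordinary string such as `"apple"` the value should match what a store returns. The variable names are illustrative.

```python
from spacy.strings import StringStore, hash_string

# Compute the hash directly, with no store involved.
apple_hash = hash_string("apple")

# A store returns the same value for the same string.
stringstore = StringStore(["apple"])
assert stringstore["apple"] == apple_hash

# Only a store can map the hash back to text.
assert stringstore[apple_hash] == "apple"
```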