
Table Engines
=============

See: [ClickHouse Documentation](https://clickhouse.yandex/reference_en.html#Table+engines)

Each model must have an engine instance, used when creating the table in ClickHouse.

The following engines are supported by the ORM:

- TinyLog
- Log
- Memory
- MergeTree / ReplicatedMergeTree
- CollapsingMergeTree / ReplicatedCollapsingMergeTree
- SummingMergeTree / ReplicatedSummingMergeTree
- ReplacingMergeTree / ReplicatedReplacingMergeTree
- Buffer
- Merge
- Distributed


Simple Engines
--------------

`TinyLog`, `Log` and `Memory` engines do not require any parameters:

    engine = engines.TinyLog()

    engine = engines.Log()
    
    engine = engines.Memory()


Engines in the MergeTree Family
-------------------------------

To define a `MergeTree` engine, supply the date column name and the names (or expressions) for the key columns:

    engine = engines.MergeTree('EventDate', ('CounterID', 'EventDate'))

You may also provide a sampling expression:

    engine = engines.MergeTree('EventDate', ('CounterID', 'EventDate'), sampling_expr='intHash32(UserID)')
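
To make the role of each argument more concrete, here is a minimal sketch of how the date column, the optional sampling expression and the key columns could map onto the legacy `MergeTree(...)` engine clause in ClickHouse. The helper `merge_tree_clause` is hypothetical and is not part of the ORM, and the default index granularity of 8192 is an assumption:

```python
def merge_tree_clause(date_col, key_cols, sampling_expr=None, index_granularity=8192):
    """Render a legacy-syntax MergeTree engine clause.

    Illustrative helper only; this is not the ORM's implementation.
    """
    params = [date_col]
    if sampling_expr:
        # The sampling expression, when given, sits between the date column and the key
        params.append(sampling_expr)
    params.append("(%s)" % ", ".join(key_cols))
    params.append(str(index_granularity))
    return "MergeTree(%s)" % ", ".join(params)


plain = merge_tree_clause('EventDate', ('CounterID', 'EventDate'))
sampled = merge_tree_clause('EventDate', ('CounterID', 'EventDate'),
                            sampling_expr='intHash32(UserID)')
print(plain)
print(sampled)
```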

A `CollapsingMergeTree` engine is defined in a similar manner, but also requires a sign column:

    engine = engines.CollapsingMergeTree('EventDate', ('CounterID', 'EventDate'), 'Sign')

For a `SummingMergeTree` you can optionally specify the summing columns:

    engine = engines.SummingMergeTree('EventDate', ('OrderID', 'EventDate', 'BannerID'),
                                      summing_cols=('Shows', 'Clicks', 'Cost'))
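
As a rough illustration of what the summing columns mean, the sketch below imitates in plain Python what a merge is expected to do: rows that share the same key are combined into one row, and the values in the summing columns are added up. The function `summing_merge` and the sample rows are hypothetical. In ClickHouse itself, merges run in the background at an unspecified time, so queries should still aggregate explicitly.

```python
def summing_merge(rows, key_cols, summing_cols):
    """Combine rows with equal keys, adding up the summing columns (illustration only)."""
    merged = {}
    for row in rows:
        key = tuple(row[col] for col in key_cols)
        if key not in merged:
            merged[key] = dict(row)  # first row seen for this key
        else:
            for col in summing_cols:
                merged[key][col] += row[col]
    return list(merged.values())


rows = [
    {'OrderID': 1, 'EventDate': '2017-04-26', 'BannerID': 7, 'Shows': 10, 'Clicks': 1, 'Cost': 0.5},
    {'OrderID': 1, 'EventDate': '2017-04-26', 'BannerID': 7, 'Shows': 5, 'Clicks': 2, 'Cost': 0.25},
    {'OrderID': 2, 'EventDate': '2017-04-26', 'BannerID': 7, 'Shows': 3, 'Clicks': 0, 'Cost': 0.0},
]
merged_rows = summing_merge(rows, ('OrderID', 'EventDate', 'BannerID'),
                            ('Shows', 'Clicks', 'Cost'))
```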

For a `ReplacingMergeTree` you can optionally specify the version column:

    engine = engines.ReplacingMergeTree('EventDate', ('OrderID', 'EventDate', 'BannerID'), ver_col='Version')

### Custom partitioning

ClickHouse has supported [custom partitioning](https://clickhouse.yandex/docs/en/table_engines/custom_partitioning_key/) expressions since version 1.1.54310.
You can use custom partitioning with any engine in the MergeTree family.
To set up custom partitioning:

* Omit the `date_col` (first) constructor parameter, or pass `None` for it.
* Pass the `order_by` (second) constructor parameter by name.
* Add the `partition_key` parameter. It should be a tuple of expressions by which the partitions are built.

Standard partitioning by a date column can be specified using the `toYYYYMM(date)` function.

Example:
 
    engine = engines.ReplacingMergeTree(order_by=('OrderID', 'EventDate', 'BannerID'), ver_col='Version',
                                        partition_key=('toYYYYMM(EventDate)', 'BannerID'))
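
The following sketch shows, in plain Python, how a partition key such as `('toYYYYMM(EventDate)', 'BannerID')` is expected to group rows: each distinct combination of month and banner ends up in its own partition. The helper `to_yyyymm` imitates the ClickHouse function of the same name, and the sample rows are made up for illustration:

```python
from datetime import date


def to_yyyymm(d):
    """Imitate ClickHouse's toYYYYMM(): 2018-04-12 becomes 201804."""
    return d.year * 100 + d.month


rows = [
    {'OrderID': 1, 'EventDate': date(2018, 4, 12), 'BannerID': 7},
    {'OrderID': 2, 'EventDate': date(2018, 4, 30), 'BannerID': 7},
    {'OrderID': 3, 'EventDate': date(2018, 5, 1), 'BannerID': 7},
    {'OrderID': 4, 'EventDate': date(2018, 4, 12), 'BannerID': 9},
]

# Group order IDs by the value of the partition key expressions
partitions = {}
for row in rows:
    key = (to_yyyymm(row['EventDate']), row['BannerID'])
    partitions.setdefault(key, []).append(row['OrderID'])
```

With these rows, orders 1 and 2 share a partition, while orders 3 and 4 each get their own.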


### Data Replication

Any of the above engines can be converted to a replicated engine (e.g. `ReplicatedMergeTree`) by adding two parameters, `replica_table_path` and `replica_name`:

    engine = engines.MergeTree('EventDate', ('CounterID', 'EventDate'),
                               replica_table_path='/clickhouse/tables/{layer}-{shard}/hits',
                               replica_name='{replica}')
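
The `{layer}`, `{shard}` and `{replica}` placeholders are macros that the ClickHouse server substitutes from the `macros` section of its own configuration, so the same model definition can be used on every replica. The sketch below imitates that substitution in plain Python; the macro values are hypothetical and would normally differ per server:

```python
def expand_macros(template, macros):
    """Substitute {name} placeholders, roughly as the server does with its macros."""
    return template.format(**macros)


# Hypothetical macro values for one server in the cluster
macros = {'layer': '01', 'shard': '02', 'replica': 'replica-1.example.com'}

table_path = expand_macros('/clickhouse/tables/{layer}-{shard}/hits', macros)
replica_name = expand_macros('{replica}', macros)
```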


Buffer Engine
-------------

A `Buffer` engine is only used in conjunction with a `BufferModel`.
The model should be a subclass of both `models.BufferModel` and the main model. 
The main model is also passed to the engine:

    class PersonBuffer(models.BufferModel, Person):

        engine = engines.Buffer(Person)

Additional buffer parameters can optionally be specified:

        engine = engines.Buffer(Person, num_layers=16, min_time=10, 
                                max_time=100, min_rows=10000, max_rows=1000000, 
                                min_bytes=10000000, max_bytes=100000000)
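
According to the ClickHouse documentation for the Buffer engine, data is flushed to the destination table when all of the `min_*` conditions are met, or when at least one of the `max_*` conditions is met. The sketch below expresses that rule in plain Python, using the values from the example above as defaults. The function `should_flush` is hypothetical, and the exact behavior at the boundary values is an assumption:

```python
def should_flush(age_seconds, rows, size_bytes,
                 min_time=10, max_time=100,
                 min_rows=10000, max_rows=1000000,
                 min_bytes=10000000, max_bytes=100000000):
    """Return True if a buffer in this state would be flushed (illustration only)."""
    all_min_met = (age_seconds >= min_time and
                   rows >= min_rows and
                   size_bytes >= min_bytes)
    any_max_met = (age_seconds >= max_time or
                   rows >= max_rows or
                   size_bytes >= max_bytes)
    return all_min_met or any_max_met
```

For example, a small buffer that has been waiting for longer than `max_time` is flushed even though it has not reached `min_rows` or `min_bytes`.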

You can then insert objects into the buffer model, and ClickHouse will take care of flushing them to the main table:

    db.create_table(PersonBuffer)
    suzy = PersonBuffer(first_name='Suzy', last_name='Jones')
    dan = PersonBuffer(first_name='Dan', last_name='Schwartz')
    db.insert([dan, suzy])
    
    
Merge Engine
-------------

[ClickHouse docs](https://clickhouse.yandex/docs/en/single/index.html#merge)  
A `Merge` engine is only used in conjunction with a `MergeModel`.   
A table with this engine does not store data itself; it allows reading from any number of other tables simultaneously, so you cannot insert into it.
The engine parameter is an re2 (similar to PCRE) regular expression; data is selected from the tables whose names match it.

    class MergeTable(models.MergeModel):
        engine = engines.Merge('^table_prefix')
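
To get a feel for which tables a pattern such as `'^table_prefix'` selects, it can be tried against a list of table names. The sketch below uses Python's `re` module as a stand-in for re2; the two engines differ in some advanced features, but simple patterns like this one should behave the same. The table names are made up for illustration:

```python
import re

pattern = re.compile('^table_prefix')

# Hypothetical table names in the database
tables = ['table_prefix_2017', 'table_prefix_2018', 'other_table', 'my_table_prefix']

# Keep the tables whose names match the pattern
selected = [name for name in tables if pattern.search(name)]
```

Here only the first two tables are selected, because the `^` anchor requires the name to start with `table_prefix`.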


---

[<< Field Types](field_types.md) | [Table of Contents](toc.md) | [Schema Migrations >>](schema_migrations.md)