mirror of https://github.com/carrotquest/django-clickhouse.git
synced 2025-10-24 20:51:07 +03:00

Added more docs

parent c0afa7b53a
commit f2dc978634
				|  | @ -1 +1,2 @@ | ||||||
| # django-clickhouse | # django-clickhouse | ||||||
|  | Documentation is [here](docs/index.md) | ||||||
|  | @ -1,9 +1,9 @@ | ||||||
| # Basic information | # Basic information | ||||||
| ## <a name="about">About</a> | ## About | ||||||
| This project's goal is to build [Yandex ClickHouse](https://clickhouse.yandex/) database into [Django](https://www.djangoproject.com/) project.   | This project's goal is to build [Yandex ClickHouse](https://clickhouse.yandex/) database into [Django](https://www.djangoproject.com/) project.   | ||||||
| It is based on [infi.clickhouse-orm](https://github.com/Infinidat/infi.clickhouse_orm) library.   | It is based on [infi.clickhouse-orm](https://github.com/Infinidat/infi.clickhouse_orm) library.   | ||||||
| 
 | 
 | ||||||
| ## <a name="features">Features</a> | ## Features | ||||||
| * Multiple ClickHouse database configuration in [settings.py](https://docs.djangoproject.com/en/2.1/ref/settings/) | * Multiple ClickHouse database configuration in [settings.py](https://docs.djangoproject.com/en/2.1/ref/settings/) | ||||||
| * ORM to create and manage ClickHouse models. | * ORM to create and manage ClickHouse models. | ||||||
| * ClickHouse migration system. | * ClickHouse migration system. | ||||||
|  | @ -11,26 +11,26 @@ It is based on [infi.clickhouse-orm](https://github.com/Infinidat/infi.clickhous | ||||||
| * Effective periodical synchronization of django models to ClickHouse without losing data. | * Effective periodical synchronization of django models to ClickHouse without losing data. | ||||||
| * Synchronization process monitoring. | * Synchronization process monitoring. | ||||||
| 
 | 
 | ||||||
| ## <a name="requirements">Requirements</a> | ## Requirements | ||||||
| * [Python 3](https://www.python.org/downloads/) | * [Python 3](https://www.python.org/downloads/) | ||||||
| * [Django](https://docs.djangoproject.com/) 1.7+ | * [Django](https://docs.djangoproject.com/) 1.7+ | ||||||
| * [Yandex ClickHouse](https://clickhouse.yandex/) | * [Yandex ClickHouse](https://clickhouse.yandex/) | ||||||
| * [infi.clickhouse-orm](https://github.com/Infinidat/infi.clickhouse_orm) | * [infi.clickhouse-orm](https://github.com/Infinidat/infi.clickhouse_orm) | ||||||
| * pytz | * [pytz](https://pypi.org/project/pytz/) | ||||||
| * six | * [six](https://pypi.org/project/six/) | ||||||
| * typing | * [typing](https://pypi.org/project/typing/) | ||||||
| * psycopg2 | * [psycopg2](https://www.psycopg.org/) | ||||||
| * celery | * [celery](http://www.celeryproject.org/) | ||||||
| * statsd | * [statsd](https://pypi.org/project/statsd/) | ||||||
| 
 | 
 | ||||||
| ### Optional libraries | ### Optional libraries | ||||||
| * [redis-py](https://redis-py.readthedocs.io/en/latest/) for [RedisStorage](storages.md#redis_storage) | * [redis-py](https://redis-py.readthedocs.io/en/latest/) for [RedisStorage](storages.md#redisstorage) | ||||||
| * [django-pg-returning](https://github.com/M1hacka/django-pg-returning)  | * [django-pg-returning](https://github.com/M1hacka/django-pg-returning)  | ||||||
|   for optimizing registering updates in [PostgreSQL](https://www.postgresql.org/) |   for optimizing registering updates in [PostgreSQL](https://www.postgresql.org/) | ||||||
| * [django-pg-bulk-update](https://github.com/M1hacka/django-pg-bulk-update) | * [django-pg-bulk-update](https://github.com/M1hacka/django-pg-bulk-update) | ||||||
|   for performing effective bulk update operation in [PostgreSQL](https://www.postgresql.org/) |   for performing effective bulk update and create operations in [PostgreSQL](https://www.postgresql.org/) | ||||||
| 
 | 
 | ||||||
| ## <a name="installation">Installation</a> | ## Installation | ||||||
| Install via pip:   | Install via pip:   | ||||||
| `pip install django-clickhouse` ([not released yet](https://github.com/carrotquest/django-clickhouse/issues/3))     | `pip install django-clickhouse` ([not released yet](https://github.com/carrotquest/django-clickhouse/issues/3))     | ||||||
| or via setup.py:   | or via setup.py:   | ||||||
|  |  | ||||||
|  | @ -3,19 +3,18 @@ | ||||||
| Library configuration is made in settings.py. All parameters start with `CLICKHOUSE_` prefix. | Library configuration is made in settings.py. All parameters start with `CLICKHOUSE_` prefix. | ||||||
| Prefix can be changed using `CLICKHOUSE_SETTINGS_PREFIX` parameter. | Prefix can be changed using `CLICKHOUSE_SETTINGS_PREFIX` parameter. | ||||||
| 
 | 
 | ||||||
| ### <a name="databases">CLICKHOUSE_SETTINGS_PREFIX</a> | ### CLICKHOUSE_SETTINGS_PREFIX | ||||||
| Defaults to: `'CLICKHOUSE_'`   | Defaults to: `'CLICKHOUSE_'`   | ||||||
| You can change `CLICKHOUSE_` prefix in settings using this parameter to anything you like. | You can change `CLICKHOUSE_` prefix in settings using this parameter to anything you like. | ||||||
| 
 | 
 | ||||||
| ### <a name="databases">CLICKHOUSE_DATABASES</a> | ### CLICKHOUSE_DATABASES | ||||||
| Defaults to: `{}`   | Defaults to: `{}`   | ||||||
| A dictionary, defining databases in django-like style.   | A dictionary, defining databases in django-like style.   | ||||||
| <!--- TODO Add link  ---> | <!--- TODO Add link  ---> | ||||||
| Key is an alias to communicate with this database in [connections]() and [using]().   | Key is an alias to communicate with this database in [connections]() and [using]().   | ||||||
| Value is a configuration dict with parameters: | Value is a configuration dict with parameters: | ||||||
| * [infi.clickhouse_orm database parameters](https://github.com/Infinidat/infi.clickhouse_orm/blob/develop/docs/class_reference.md#database) | * [infi.clickhouse_orm database parameters](https://github.com/Infinidat/infi.clickhouse_orm/blob/develop/docs/class_reference.md#database) | ||||||
| <!--- TODO Add link  ---> | * `migrate: bool` - indicates if this database should be migrated. See [migrations](migrations.md).   | ||||||
| * `migrate: bool` - indicates if this database should be migrated. See [migrations]().   |  | ||||||
| 
 | 
 | ||||||
| Example: | Example: | ||||||
| ```python | ```python | ||||||
|  | @ -24,22 +23,28 @@ CLICKHOUSE_DATABASES = { | ||||||
|         'db_name': 'test', |         'db_name': 'test', | ||||||
|         'username': 'default', |         'username': 'default', | ||||||
|         'password': '' |         'password': '' | ||||||
|     } |     }, | ||||||
|  |     'reader': { | ||||||
|  |         'db_name': 'read_only', | ||||||
|  |         'username': 'reader', | ||||||
|  |         'readonly': True, | ||||||
|  |         'password': '' | ||||||
|  |     }    | ||||||
| } | } | ||||||
| ``` | ``` | ||||||
| 
 | 
 | ||||||
| ### <a name="default_db_alias">CLICKHOUSE_DEFAULT_DB_ALIAS</a> | ### CLICKHOUSE_DEFAULT_DB_ALIAS | ||||||
| Defaults to: `'default'`   | Defaults to: `'default'`   | ||||||
| <!--- TODO Add link  ---> | <!--- TODO Add link  ---> | ||||||
| A database alias to use in [QuerySets]() if direct [using]() is not specified. | A database alias to use in [QuerySets]() if direct [using]() is not specified. | ||||||
| 
 | 
 | ||||||
| ### <a name="sync_storage">CLICKHOUSE_SYNC_STORAGE</a> | ### CLICKHOUSE_SYNC_STORAGE | ||||||
| Defaults to: `'django_clickhouse.storages.RedisStorage'`   | Defaults to: `'django_clickhouse.storages.RedisStorage'`   | ||||||
| An intermediate storage class to use. Can be a string or class. [More info about storages](storages.md). | An intermediate storage class to use. Can be a string or class. [More info about storages](storages.md). | ||||||
| 
 | 
 | ||||||
| ### <a name="redis_config">CLICKHOUSE_REDIS_CONFIG</a> | ### CLICKHOUSE_REDIS_CONFIG | ||||||
| Defaults to: `None`   | Defaults to: `None`   | ||||||
| Redis configuration for [RedisStorage](storages.md#redis_storage).   | Redis configuration for [RedisStorage](storages.md#redisstorage).   | ||||||
| If given, should be a dictionary of parameters to pass to [redis-py](https://redis-py.readthedocs.io/en/latest/#redis.Redis).     | If given, should be a dictionary of parameters to pass to [redis-py](https://redis-py.readthedocs.io/en/latest/#redis.Redis).     | ||||||
| 
 | 
 | ||||||
| Example:   | Example:   | ||||||
|  | @ -52,45 +57,42 @@ CLICKHOUSE_REDIS_CONFIG = { | ||||||
| } | } | ||||||
| ``` | ``` | ||||||
| 
 | 
 | ||||||
| ### <a name="sync_batch_size">CLICKHOUSE_SYNC_BATCH_SIZE</a> | ### CLICKHOUSE_SYNC_BATCH_SIZE | ||||||
| Defaults to: `10000`   | Defaults to: `10000`   | ||||||
| Maximum number of operations, fetched by sync process from intermediate storage per sync round. | Maximum number of operations, fetched by sync process from intermediate storage per sync round. | ||||||
| 
 | 
 | ||||||
| ### <a name="sync_delay">CLICKHOUSE_SYNC_DELAY</a> | ### CLICKHOUSE_SYNC_DELAY | ||||||
| Defaults to: `5` | Defaults to: `5` | ||||||
| A delay in seconds between the starts of two consecutive sync rounds. | A delay in seconds between the starts of two consecutive sync rounds. | ||||||
| 
 | 
 | ||||||
| ### <a name="models_module">CLICKHOUSE_MODELS_MODULE</a> | ### CLICKHOUSE_MODELS_MODULE | ||||||
| Defaults to: `'clickhouse_models'`   | Defaults to: `'clickhouse_models'`   | ||||||
| <!--- TODO Add link  ---> | Module name inside [django app](https://docs.djangoproject.com/en/3.0/intro/tutorial01/),  | ||||||
| Module name inside [django app](https://docs.djangoproject.com/en/2.2/intro/tutorial01/),  | where [ClickHouseModel](models.md#clickhousemodel) classes are searched during migrations. | ||||||
| where [ClickHouseModel]() classes are search during migrations. |  | ||||||
| 
 | 
 | ||||||
| ### <a name="database_router">CLICKHOUSE_DATABASE_ROUTER</a> | ### CLICKHOUSE_DATABASE_ROUTER | ||||||
| Defaults to: `'django_clickhouse.routers.DefaultRouter'`   | Defaults to: `'django_clickhouse.routers.DefaultRouter'`   | ||||||
| <!--- TODO Add link  ---> | A dotted path to class, representing [database router](routing.md#router). | ||||||
| A dotted path to class, representing [database router](). |  | ||||||
| 
 | 
 | ||||||
| ### <a name="migrations_package">CLICKHOUSE_MIGRATIONS_PACKAGE</a> | ### CLICKHOUSE_MIGRATIONS_PACKAGE | ||||||
| Defaults to: `'clickhouse_migrations'` | Defaults to: `'clickhouse_migrations'` | ||||||
| A python package name inside [django app](https://docs.djangoproject.com/en/2.2/intro/tutorial01/),  | A python package name inside [django app](https://docs.djangoproject.com/en/3.0/intro/tutorial01/),  | ||||||
| where migration files are searched. | where migration files are searched. | ||||||
| 
 | 
 | ||||||
| ### <a name="migration_history_model">CLICKHOUSE_MIGRATION_HISTORY_MODEL</a> | ### CLICKHOUSE_MIGRATION_HISTORY_MODEL | ||||||
| Defaults to: `'django_clickhouse.migrations.MigrationHistory'`   | Defaults to: `'django_clickhouse.migrations.MigrationHistory'`   | ||||||
| <!--- TODO Add link  ---> | A dotted name of a ClickHouseModel subclass (including module path), | ||||||
| A dotted name of a ClickHouseModel subclass (including module path), representing [MigrationHistory]() model. |  representing [MigrationHistory model](migrations.md#migrationhistory-clickhousemodel). | ||||||
| 
 | 
 | ||||||
| ### <a name="migrate_with_default_db">CLICKHOUSE_MIGRATE_WITH_DEFAULT_DB</a> | ### CLICKHOUSE_MIGRATE_WITH_DEFAULT_DB | ||||||
| Defaults to: `True`   | Defaults to: `True`   | ||||||
| A boolean flag enabling automatic ClickHouse migration,  | A boolean flag enabling automatic ClickHouse migration,  | ||||||
| when you call [`migrate`](https://docs.djangoproject.com/en/2.2/ref/django-admin/#django-admin-migrate) on default database. | when you call [`migrate`](https://docs.djangoproject.com/en/2.2/ref/django-admin/#django-admin-migrate) on `default` database. | ||||||
| 
 | 
 | ||||||
| ### <a name="statd_prefix">CLICKHOUSE_STATSD_PREFIX</a> | ### CLICKHOUSE_STATSD_PREFIX | ||||||
| Defaults to: `clickhouse`   | Defaults to: `clickhouse`   | ||||||
| <!--- TODO Add link  ---> | A prefix in [statsd](https://pythonhosted.org/python-statsd/) added to each library metric. See [monitoring](monitoring.md). | ||||||
| A prefix in [statsd](https://pythonhosted.org/python-statsd/) added to each library metric. See [metrics]() |  | ||||||
| 
 | 
 | ||||||
| ### <a name="celery_queue">CLICKHOUSE_CELERY_QUEUE</a> | ### CLICKHOUSE_CELERY_QUEUE | ||||||
| Defaults to: `'celery'`   | Defaults to: `'celery'`   | ||||||
| A name of a queue, used by celery to plan library sync tasks. | A name of a queue, used by celery to plan library sync tasks. | ||||||
|  |  | ||||||
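The prefix mechanism described in this file can be illustrated with a minimal sketch. It is not the library's own code: `resolve_setting`, `DEFAULTS` and the plain dictionary standing in for Django settings are hypothetical names introduced for illustration, and the default values are copied from the "Defaults to" lines above.

```python
# Hypothetical illustration only: the real library reads django.conf.settings,
# here a plain dict stands in for it.
DEFAULTS = {
    "DEFAULT_DB_ALIAS": "default",
    "SYNC_BATCH_SIZE": 10000,
    "SYNC_DELAY": 5,
    "MODELS_MODULE": "clickhouse_models",
    "MIGRATIONS_PACKAGE": "clickhouse_migrations",
    "CELERY_QUEUE": "celery",
}


def resolve_setting(settings: dict, name: str):
    """Look up `<prefix><name>`, falling back to the documented default."""
    # The prefix itself is configurable through CLICKHOUSE_SETTINGS_PREFIX.
    prefix = settings.get("CLICKHOUSE_SETTINGS_PREFIX", "CLICKHOUSE_")
    return settings.get(prefix + name, DEFAULTS[name])


# A project that renamed the prefix to `CH_` and overrode one parameter.
project_settings = {"CLICKHOUSE_SETTINGS_PREFIX": "CH_", "CH_SYNC_DELAY": 10}
print(resolve_setting(project_settings, "SYNC_DELAY"))       # overridden value
print(resolve_setting(project_settings, "SYNC_BATCH_SIZE"))  # documented default
```

With the prefix changed, only keys starting with the new prefix are consulted, so a leftover `CLICKHOUSE_SYNC_DELAY` would be ignored in this sketch.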
|  | @ -5,7 +5,9 @@ | ||||||
|   * [Features](basic_information.md#features) |   * [Features](basic_information.md#features) | ||||||
|   * [Requirements](basic_information.md#requirements) |   * [Requirements](basic_information.md#requirements) | ||||||
|   * [Installation](basic_information.md#installation) |   * [Installation](basic_information.md#installation) | ||||||
|  |   * [Design motivation](motivation.md) | ||||||
| * Usage   | * Usage   | ||||||
|  |   * [Overview](overview.md) | ||||||
|   * [Models](models.md) |   * [Models](models.md) | ||||||
|      * [DjangoModel](models.md#DjangoModel) |      * [DjangoModel](models.md#DjangoModel) | ||||||
|      * [ClickHouseModel](models.md#ClickHouseModel) |      * [ClickHouseModel](models.md#ClickHouseModel) | ||||||
|  | @ -14,4 +16,6 @@ | ||||||
|   * [Migrations](migrations.md) |   * [Migrations](migrations.md) | ||||||
|   * [Synchronization](synchronization.md) |   * [Synchronization](synchronization.md) | ||||||
|     * [Storages](storages.md) |     * [Storages](storages.md) | ||||||
|         * [RedisStorage](storages.md#redis_storage) |         * [RedisStorage](storages.md#redisstorage) | ||||||
|  |     * [Monitoring](monitoring.md) | ||||||
|  |     * [Performance notes](performance.md) | ||||||
|  |  | ||||||
|  | @ -5,7 +5,7 @@ but makes it a little bit more django-like. | ||||||
| 
 | 
 | ||||||
| ## File structure | ## File structure | ||||||
| Each django app can have optional `clickhouse_migrations` package. | Each django app can have optional `clickhouse_migrations` package. | ||||||
|  This is a default package name, it can be changed with [CLICKHOUSE_MIGRATIONS_PACKAGE](configuration.md#migrations_package) setting. |  This is a default package name, it can be changed with [CLICKHOUSE_MIGRATIONS_PACKAGE](configuration.md#clickhouse_migrations_package) setting. | ||||||
| 
 | 
 | ||||||
| Package contains py files, starting with 4-digit number.  | Package contains py files, starting with 4-digit number.  | ||||||
| A number gives an order in which migrations will be applied. | A number gives an order in which migrations will be applied. | ||||||
|  | @ -17,24 +17,27 @@ my_app | ||||||
| >>>> __init__.py | >>>> __init__.py | ||||||
| >>>> 0001_initial.py | >>>> 0001_initial.py | ||||||
| >>>> 0002_add_new_field_to_my_model.py | >>>> 0002_add_new_field_to_my_model.py | ||||||
|  | >> clickhouse_models.py | ||||||
|  | >> urls.py | ||||||
|  | >> views.py | ||||||
| ``` | ``` | ||||||
| 
 | 
 | ||||||
| ## Migration files | ## Migration files | ||||||
| Each file must contain a `Migration` class, inherited from `django_clickhouse.migrations.Migration`. | Each file must contain a `Migration` class, inherited from `django_clickhouse.migrations.Migration`. | ||||||
| The class should define an `operations` attribute - a list of operations to apply one by one. | The class should define an `operations` attribute - a list of operations to apply one by one. | ||||||
| Operation is one of operations, supported by [infi.clickhouse-orm](https://github.com/Infinidat/infi.clickhouse_orm/blob/develop/docs/schema_migrations.md). | Operation is one of [operations, supported by infi.clickhouse-orm](https://github.com/Infinidat/infi.clickhouse_orm/blob/develop/docs/schema_migrations.md). | ||||||
| 
 | 
 | ||||||
| ```python | ```python | ||||||
| from django_clickhouse import migrations | from django_clickhouse import migrations | ||||||
|  | from my_app.clickhouse_models import ClickHouseUser | ||||||
| 
 | 
 | ||||||
| class Migration(migrations.Migration): | class Migration(migrations.Migration): | ||||||
|     operations = [ |     operations = [ | ||||||
|         migrations.CreateTable(ClickHouseTestModel), |         migrations.CreateTable(ClickHouseUser) | ||||||
|         migrations.CreateTable(ClickHouseCollapseTestModel) |  | ||||||
|     ] |     ] | ||||||
| ``` | ``` | ||||||
| 
 | 
 | ||||||
| ## MigrationHistory ClickHouse model | ## MigrationHistory ClickHouseModel | ||||||
| This model stores information about applied migrations.   | This model stores information about applied migrations.   | ||||||
| By default, library uses `django_clickhouse.migrations.MigrationHistory` model, | By default, library uses `django_clickhouse.migrations.MigrationHistory` model, | ||||||
|  but this can be changed using `CLICKHOUSE_MIGRATION_HISTORY_MODEL` setting. |  but this can be changed using `CLICKHOUSE_MIGRATION_HISTORY_MODEL` setting. | ||||||
|  | @ -45,27 +48,30 @@ MigrationHistory model is stored in default database. | ||||||
| 
 | 
 | ||||||
| ## Automatic migrations | ## Automatic migrations | ||||||
| When library is installed, it tries applying migrations every time, | When library is installed, it tries applying migrations every time, | ||||||
| you call `python manage.py migrate`. If you want to disable this, use [CLICKHOUSE_MIGRATE_WITH_DEFAULT_DB](configuration.md#migrate_with_default_db) settings. | you call [django migrate](https://docs.djangoproject.com/en/3.0/ref/django-admin/#django-admin-migrate). If you want to disable this, use [CLICKHOUSE_MIGRATE_WITH_DEFAULT_DB](configuration.md#clickhouse_migrate_with_default_db) setting. | ||||||
|  |    | ||||||
|  | By default migrations are applied to all [CLICKHOUSE_DATABASES](configuration.md#clickhouse_databases), which have no flags: | ||||||
|  | * `'migrate': False` | ||||||
|  | * `'readonly': True` | ||||||
| 
 | 
 | ||||||
| Note: migrations are only applied, when `default` database is migrated. | Note: migrations are only applied, with django `default` database.   | ||||||
| So if you call `python manage.py migrate --database=secondary` they wouldn't be applied. | So if you call `python manage.py migrate --database=secondary` they wouldn't be applied. | ||||||
| 
 | 
 | ||||||
| ## Migration algorithm | ## Migration algorithm | ||||||
| - Gets a list of databases from `CLICKHOUSE_DATABASES` settings. Migrate them one by one.   | - Get a list of databases from `CLICKHOUSE_DATABASES` setting. Migrate them one by one.   | ||||||
|   - Find all django apps from `INSTALLED_APPS` settings, which have no `readonly=True` setting and have `migrate=True` settings.   |   - Find all django apps from `INSTALLED_APPS` setting, which have no `readonly=True` attribute and have `migrate=True` attribute. Migrate them one by one.   | ||||||
|     Migrate them one by one.   |     * Iterate over `INSTALLED_APPS`, searching for [clickhouse_migrations package](#file-structure)   | ||||||
|     * Iterate over `INSTAALLED_APPS`, searching for `clickhouse_migrations` package   |  | ||||||
|     * If package was not found, skip app.   |     * If package was not found, skip app.   | ||||||
|     * Get a list of migrations applied from `MigrationHistory` model    |     * Get a list of migrations applied from [MigrationHistory model](#migrationhistory-clickhousemodel)    | ||||||
|     * Get a list of unapplied migrations |     * Get a list of unapplied migrations | ||||||
|     * Get `Migration` class from each migration and call its `apply()` method |     * Get [Migration class](#migration-files) from each migration and call its `apply()` method | ||||||
|     * `apply()` iterates operations, checking if it should be applied with [router](router.md) |     * `apply()` iterates operations, checking if it should be applied with [router](routing.md) | ||||||
|     * If migration should be applied, it is applied |     * If migration should be applied, it is applied | ||||||
|     * Mark migration as applied in `MigrationHistory` model |     * Mark migration as applied in [MigrationHistory model](#migrationhistory-clickhousemodel) | ||||||
| 
 | 
 | ||||||
| ## Security notes | ## Security notes | ||||||
| 1) Unlike django relational databases, ClickHouse has no transaction system.  | 1) Unlike django relational databases, ClickHouse has no transaction system.  | ||||||
|   As a result, if a migration fails, it is left partially applied and there is no correct way to roll back. |   As a result, if a migration fails, it is left partially applied and there is no correct way to roll back. | ||||||
|   I recommend making migrations as small as possible, so it is easier to determine and correct the result if something goes wrong. |   I recommend making migrations as small as possible, so it is easier to determine and correct the result if something goes wrong. | ||||||
| 2) Unlike django, this library is unable to unapply migrations.  | 2) Unlike django, this library is unable to unapply migrations.  | ||||||
|   This functionality may be implemented in the future. |   This functionality may be implemented in the future. | ||||||
|  |  | ||||||
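The database selection and ordering rules from the migration algorithm above can be sketched in plain Python. This is an assumption-laden illustration, not the library's implementation: `databases_to_migrate`, `unapplied_migrations` and `MIGRATION_FILE` are hypothetical names, the set of applied migrations stands in for the `MigrationHistory` model, and the file name pattern is inferred from the "4-digit number" rule.

```python
import re

# Migration files start with a 4-digit number, e.g. 0001_initial.py.
MIGRATION_FILE = re.compile(r"^\d{4}_\w+\.py$")


def databases_to_migrate(databases: dict) -> list:
    """Keep aliases that have neither 'migrate': False nor 'readonly': True."""
    return [
        alias
        for alias, conf in databases.items()
        if conf.get("migrate", True) and not conf.get("readonly", False)
    ]


def unapplied_migrations(file_names, applied: set) -> list:
    """Return migration names not yet applied, ordered by numeric prefix."""
    names = sorted(name[:-3] for name in file_names if MIGRATION_FILE.match(name))
    return [name for name in names if name not in applied]


databases = {
    "default": {"db_name": "test"},
    "reader": {"db_name": "read_only", "readonly": True},
    "archive": {"db_name": "archive", "migrate": False},
}
files = ["__init__.py", "0002_add_new_field_to_my_model.py", "0001_initial.py"]

print(databases_to_migrate(databases))
print(unapplied_migrations(files, applied={"0001_initial"}))
```

Sorting by name is enough here because the zero-padded 4-digit prefix makes lexicographic order equal to numeric order.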
|  | @ -1,20 +1,20 @@ | ||||||
| # Models | # Models | ||||||
| Model is a pythonic class representing database table in your code. | Model is a pythonic class representing database table in your code. | ||||||
|  It also defined an interface (methods) to perform operations on this table |  It also defines an interface (methods) to perform operations on this table | ||||||
|  and describes its configuration inside framework. |  and describes its configuration inside framework. | ||||||
|   |   | ||||||
| This library operates 2 kinds of models:   | This library operates 2 kinds of models:   | ||||||
| * Django model, describing tables in source relational model   | * DjangoModel, describing tables in source relational database (PostgreSQL, MySQL, etc.)   | ||||||
| * ClickHouseModel, describing models in [ClickHouse](https://clickhouse.yandex/docs/en) database | * ClickHouseModel, describing models in [ClickHouse](https://clickhouse.yandex/docs/en) database | ||||||
|    |    | ||||||
| In order to distinguish them, I will refer to them as ClickHouseModel and DjangoModel in further documentation. | In order to distinguish them, I will refer to them as ClickHouseModel and DjangoModel in further documentation. | ||||||
| 
 | 
 | ||||||
| ## DjangoModel | ## DjangoModel | ||||||
| Django provides a [model system](https://docs.djangoproject.com/en/2.2/topics/db/models/)  | Django provides a [model system](https://docs.djangoproject.com/en/3.0/topics/db/models/)  | ||||||
|  to interact with relational databases.  |  to interact with relational databases.  | ||||||
|  In order to perform [synchronization](synchronization.md) we need to "catch" all DML operations |  In order to perform [synchronization](synchronization.md) we need to "catch" all [DML operations](https://en.wikipedia.org/wiki/Data_manipulation_language) | ||||||
|  on source django model and save information about it in [storage](storages.md). |  on source django model and save information about them in [storage](storages.md). | ||||||
|  To achieve this library introduces abstract `django_clickhouse.models.ClickHouseSyncModel` class. |  To achieve this, library introduces abstract `django_clickhouse.models.ClickHouseSyncModel` class. | ||||||
|  Each model, inherited from `ClickHouseSyncModel` will automatically save information, needed to sync to storage.   |  Each model, inherited from `ClickHouseSyncModel` will automatically save information, needed to sync to storage.   | ||||||
| Read [synchronization](synchronization.md) section for more info. | Read [synchronization](synchronization.md) section for more info. | ||||||
| 
 | 
 | ||||||
|  | @ -25,7 +25,7 @@ Read [synchronization](synchronization.md) section for more info. | ||||||
| * All queries of [django-pg-returning](https://pypi.org/project/django-pg-returning/) library | * All queries of [django-pg-returning](https://pypi.org/project/django-pg-returning/) library | ||||||
| * All queries of [django-pg-bulk-update](https://pypi.org/project/django-pg-bulk-update/) library | * All queries of [django-pg-bulk-update](https://pypi.org/project/django-pg-bulk-update/) library | ||||||
| 
 | 
 | ||||||
| You can also combine your custom django manager and queryset using mixins from `django_clickhouse.models` package. | You can also combine your custom django manager and queryset using mixins from `django_clickhouse.models` package: | ||||||
|    |    | ||||||
| **Important note**: Operations are saved in [transaction.on_commit()](https://docs.djangoproject.com/en/2.2/topics/db/transactions/#django.db.transaction.on_commit).  | **Important note**: Operations are saved in [transaction.on_commit()](https://docs.djangoproject.com/en/2.2/topics/db/transactions/#django.db.transaction.on_commit).  | ||||||
|  The goal is avoiding syncing operations, not committed to relational database. |  The goal is avoiding syncing operations, not committed to relational database. | ||||||
|  | @ -44,9 +44,12 @@ class User(ClickHouseSyncModel): | ||||||
|     birthday = models.DateField() |     birthday = models.DateField() | ||||||
| 
 | 
 | ||||||
| # All operations will be registered to sync with ClickHouse models: | # All operations will be registered to sync with ClickHouse models: | ||||||
| MyModel.objects.create(first_name='Alice', age=16, , birthday=date(2003, 6, 1)) | User.objects.create(first_name='Alice', age=16, birthday=date(2003, 6, 1)) | ||||||
| MyModel(first_name='Bob', age=17, birthday=date(2002, 1, 1)).save() | User(first_name='Bob', age=17, birthday=date(2002, 1, 1)).save() | ||||||
| MyModel.objects.update(first_name='Candy') | User.objects.update(first_name='Candy') | ||||||
|  | 
 | ||||||
|  | # Custom manager | ||||||
|  | 
 | ||||||
| ``` | ``` | ||||||
| 
 | 
 | ||||||
| ## ClickHouseModel | ## ClickHouseModel | ||||||
|  | @ -56,10 +59,10 @@ This kind of model is based on [infi.clickhouse_orm Model](https://github.com/In | ||||||
| You should define `ClickHouseModel` subclass for each table you want to access and sync in ClickHouse. | You should define `ClickHouseModel` subclass for each table you want to access and sync in ClickHouse. | ||||||
| Each model should be inherited from `django_clickhouse.clickhouse_models.ClickHouseModel`. | Each model should be inherited from `django_clickhouse.clickhouse_models.ClickHouseModel`. | ||||||
| By default, models are searched in `clickhouse_models` module of each django app. | By default, models are searched in `clickhouse_models` module of each django app. | ||||||
| You can change modules name, using stting [CLICKHOUSE_MODELS_MODULE](configuration.md#models_module) | You can change modules name, using setting [CLICKHOUSE_MODELS_MODULE](configuration.md#clickhouse_models_module) | ||||||
|   |   | ||||||
| You can read more about creating models and fields [here](https://github.com/Infinidat/infi.clickhouse_orm/blob/develop/docs/models_and_databases.md#defining-models): | You can read more about creating models and fields [here](https://github.com/Infinidat/infi.clickhouse_orm/blob/develop/docs/models_and_databases.md#defining-models): | ||||||
|  all capabilites are supported. At the same time, django-clickhouse libraries adds: | all capabilities are supported. At the same time, django-clickhouse library adds: | ||||||
| * [routing attributes and methods](routing.md) | * [routing attributes and methods](routing.md) | ||||||
| * [sync attributes and methods](synchronization.md) | * [sync attributes and methods](synchronization.md) | ||||||
| 
 | 
 | ||||||
|  | @ -68,6 +71,8 @@ Example: | ||||||
| from django_clickhouse.clickhouse_models import ClickHouseModel | from django_clickhouse.clickhouse_models import ClickHouseModel | ||||||
| from django_clickhouse.engines import MergeTree | from django_clickhouse.engines import MergeTree | ||||||
| from infi.clickhouse_orm import fields | from infi.clickhouse_orm import fields | ||||||
|  | from my_app.models import User | ||||||
|  | 
 | ||||||
| 
 | 
 | ||||||
| class HeightData(ClickHouseModel): | class HeightData(ClickHouseModel): | ||||||
|     django_model = User |     django_model = User | ||||||
|  | @ -84,7 +89,7 @@ class AgeData(ClickHouseModel): | ||||||
| 
 | 
 | ||||||
|     first_name = fields.StringField() |     first_name = fields.StringField() | ||||||
|     birthday = fields.DateField() |     birthday = fields.DateField() | ||||||
|     age = fields.IntegerField() |     age = fields.UInt32Field() | ||||||
| 
 | 
 | ||||||
|     engine = MergeTree('birthday', ('first_name', 'last_name', 'birthday')) |     engine = MergeTree('birthday', ('first_name', 'last_name', 'birthday')) | ||||||
| ``` | ``` | ||||||
|  | @ -97,6 +102,7 @@ You can read more in [sync](synchronization.md) section. | ||||||
| Example: | Example: | ||||||
| ```python | ```python | ||||||
| from django_clickhouse.clickhouse_models import ClickHouseMultiModel | from django_clickhouse.clickhouse_models import ClickHouseMultiModel | ||||||
|  | from my_app.models import User | ||||||
| 
 | 
 | ||||||
| class MyMultiModel(ClickHouseMultiModel): | class MyMultiModel(ClickHouseMultiModel): | ||||||
|     django_model = User |     django_model = User | ||||||
|  | @ -104,7 +110,13 @@ class MyMultiModel(ClickHouseMultiModel): | ||||||
| ``` | ``` | ||||||
| 
 | 
 | ||||||
| ## Engines | ## Engines | ||||||
| Engine is a way of storing, indexing, replicating and sorting data in [ClickHouse](https://clickhouse.yandex/docs/en/operations/table_engines/).   | Engine is a way of storing, indexing, replicating and sorting data in ClickHouse ([docs](https://clickhouse.yandex/docs/en/operations/table_engines/)).   | ||||||
| Engine system is based on [infi.clickhouse_orm](https://github.com/Infinidat/infi.clickhouse_orm/blob/develop/docs/table_engines.md#table-engines). | Engine system is based on [infi.clickhouse_orm engine system](https://github.com/Infinidat/infi.clickhouse_orm/blob/develop/docs/table_engines.md#table-engines).   | ||||||
| django-clickhouse extends original engine classes, as each engine can have it's own synchronization mechanics.  | This library extends original engine classes as each engine can have its own synchronization mechanics.  | ||||||
| Engines are defined in `django_clickhouse.engines` module. | Engines are defined in `django_clickhouse.engines` module. | ||||||
|  | 
 | ||||||
|  | Currently supported engines (with all infi functionality, [more info](https://github.com/Infinidat/infi.clickhouse_orm/blob/develop/docs/table_engines.md#data-replication)): | ||||||
|  | * `MergeTree` | ||||||
|  | * `ReplacingMergeTree` | ||||||
|  | * `SummingMergeTree` | ||||||
|  | * `CollapsingMergeTree` | ||||||
|  |  | ||||||
							
								
								
									
										56
									
								
								docs/monitoring.md
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
									
								
							|  | @ -0,0 +1,56 @@ | ||||||
|  | # Monitoring | ||||||
|  | In order to monitor [synchronization](synchronization.md) process, [statsd](https://pypi.org/project/statsd/) is used. | ||||||
|  | Data from statsd then can be used by [Prometheus exporter](https://github.com/prometheus/statsd_exporter)  | ||||||
|  |  or [Graphite](https://graphite.readthedocs.io/en/latest/). | ||||||
|  | 
 | ||||||
|  | ## Configuration | ||||||
|  | Library expects statsd to be configured as written in [statsd docs for django](https://statsd.readthedocs.io/en/latest/configure.html#in-django).   | ||||||
|  | You can set a common prefix for all keys in this library using [CLICKHOUSE_STATSD_PREFIX](configuration.md#clickhouse_statsd_prefix) parameter. | ||||||
|  | 
 | ||||||
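A minimal sketch of the relevant `settings.py` entries is shown below. `STATSD_HOST`, `STATSD_PORT` and `STATSD_PREFIX` are the settings described in the statsd docs for Django; all values here, including the `'clickhouse'` prefix, are example assumptions rather than recommendations.

```python
# settings.py (sketch; all values are example assumptions)

# Read by the statsd package's Django integration
STATSD_HOST = 'localhost'
STATSD_PORT = 8125
STATSD_PREFIX = None

# Common prefix for every key emitted by django-clickhouse
CLICKHOUSE_STATSD_PREFIX = 'clickhouse'
```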
|  | ## Exported metrics | ||||||
|  | ## Gauges | ||||||
|  | * `<prefix>.sync.<model_name>.queue`   | ||||||
|  |     Number of elements in [intermediate storage](storages.md) queue waiting for import. | ||||||
|  |     <!--- TODO Add link ---> | ||||||
|  |     The queue should stay small. Its size depends on the configured [sync_delay]() and the time needed to sync a single batch.    | ||||||
|  |     It is a good parameter to watch and alert on. | ||||||
|  | 
 | ||||||
|  | ## Timers | ||||||
|  | All time is sent in milliseconds. | ||||||
|  | 
 | ||||||
|  | * `<prefix>.sync.<model_name>.total`   | ||||||
|  |     Total time of single batch task execution. | ||||||
|  |      | ||||||
|  | * `<prefix>.sync.<model_name>.steps.<step_name>`   | ||||||
|  |     `<step_name>` is one of `pre_sync`, `get_operations`, `get_sync_objects`, `get_insert_batch`, `get_final_versions`, | ||||||
|  |      `insert`, `post_sync`. Read [here](synchronization.md) for more details.   | ||||||
|  |     Time of each sync step. Can be useful to debug reasons of long sync process.   | ||||||
|  |      | ||||||
|  | * `<prefix>.inserted_tuples.<model_name>`   | ||||||
|  |     Time of inserting batch of data into ClickHouse. | ||||||
|  |     It excludes as much Python code as possible, to distinguish real INSERT time from Python data preparation. | ||||||
|  |      | ||||||
|  | * `<prefix>.sync.<model_name>.register_operations`   | ||||||
|  |     Time of inserting sync operations into storage. | ||||||
|  |      | ||||||
|  | ## Counters | ||||||
|  | * `<prefix>.sync.<model_name>.register_operations.<op_name>`    | ||||||
|  |     `<op_name>` is one of `create`, `update`, `delete`.   | ||||||
|  |     Number of DML operations added to the sync queue by DjangoModel method calls. | ||||||
|  | 
 | ||||||
|  | * `<prefix>.sync.<model_name>.operations`    | ||||||
|  |     Number of operations, fetched from [storage](storages.md) for sync in one batch.  | ||||||
|  |      | ||||||
|  | * `<prefix>.sync.<model_name>.import_objects`    | ||||||
|  |     Number of objects, fetched from relational storage (based on operations) in order to sync with ClickHouse models. | ||||||
|  |      | ||||||
|  | * `<prefix>.inserted_tuples.<model_name>`    | ||||||
|  |     Number of rows inserted to ClickHouse. | ||||||
|  | 
 | ||||||
|  | * `<prefix>.sync.<model_name>.lock.timeout`   | ||||||
|  |     Number of locks in [RedisStorage](storages.md#redisstorage), not acquired and skipped by timeout. | ||||||
|  |     This value should be zero. If not, it means your model sync takes longer than the sync task call interval. | ||||||
|  |      | ||||||
|  | * `<prefix>.sync.<model_name>.lock.hard_release`   | ||||||
|  |     Number of locks in [RedisStorage](storages.md#redisstorage) that were released forcibly (because the process which acquired the lock is dead). | ||||||
|  |     This value should be zero. If not, it means your sync tasks are being killed forcibly during the sync process (by the OutOfMemory killer, for instance). | ||||||
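To see which keys a dashboard or alert rule would need, the naming patterns listed above can be expanded for one model. The sketch below only builds key strings; the function name `metric_keys` and the example model name `ClickHouseUser` are illustrative assumptions, not part of the library.

```python
# Step and operation names as listed in the metrics above
SYNC_STEPS = ('pre_sync', 'get_operations', 'get_sync_objects', 'get_insert_batch',
              'get_final_versions', 'insert', 'post_sync')
OPERATIONS = ('create', 'update', 'delete')


def metric_keys(prefix, model_name):
    """Expand the documented key patterns for a single model (hypothetical helper)."""
    base = '%s.sync.%s' % (prefix, model_name)
    inserted = '%s.inserted_tuples.%s' % (prefix, model_name)
    return {
        'gauges': [base + '.queue'],
        'timers': (
            [base + '.total']
            + ['%s.steps.%s' % (base, step) for step in SYNC_STEPS]
            + [inserted, base + '.register_operations']
        ),
        'counters': (
            ['%s.register_operations.%s' % (base, op) for op in OPERATIONS]
            + [base + '.operations', base + '.import_objects', inserted,
               base + '.lock.timeout', base + '.lock.hard_release']
        ),
    }


keys = metric_keys('clickhouse', 'ClickHouseUser')
```

With these keys, `keys['gauges'][0]` is the queue size to alert on, and the two `lock.*` counters are the ones expected to stay at zero.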
							
								
								
									
										35
									
								
								docs/motivation.md
									
									
									
									
									
										Normal file
									
								
							
							
						
						
							|  | @ -0,0 +1,35 @@ | ||||||
|  | # Design motivation | ||||||
|  | ## Separate from django database setting, QuerySet and migration system | ||||||
|  | ClickHouse SQL and DML are close to the standard, but do not follow it exactly ([docs](https://clickhouse.tech/docs/en/introduction/distinctive_features/#sql-support)).   | ||||||
|  | As a result, ClickHouse cannot be easily integrated into the Django query subsystem, which expects databases to support: | ||||||
|  | 1. Transactions. | ||||||
|  | 2. INNER/OUTER JOINS by condition. | ||||||
|  | 3. Full featured updates and deletes. | ||||||
|  | 4. Per database replication (ClickHouse has per table replication) | ||||||
|  | 5. Other features, not supported in ClickHouse. | ||||||
|  | 
 | ||||||
|  | In order to have more functionality, [infi.clickhouse-orm](https://github.com/Infinidat/infi.clickhouse_orm)  | ||||||
|  |   is used as the base library for databases, querysets and migrations. Most of it is compatible and can be used without any changes. | ||||||
|  | 
 | ||||||
|  | ## Sync over intermediate storage | ||||||
|  | This library has several goals which lead to intermediate storage: | ||||||
|  | 1. Failure-resistant import, whatever the cause of the failure: | ||||||
|  |  a ClickHouse failure, a network failure, or the system killing the import process (the OOM killer, for instance). | ||||||
|  | 2. ClickHouse does not like single row inserts: [docs](https://clickhouse.tech/docs/en/introduction/performance/#performance-when-inserting-data). | ||||||
|  |  So it's worth batching data somewhere before inserting it.  | ||||||
|  |  ClickHouse provides BufferEngine for this, but it can lose data if ClickHouse fails, and no one will know about it. | ||||||
|  | 3. Better scalability. Different intermediate storages may be implemented in the future, based on databases, queue systems or even BufferEngine. | ||||||
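The first two goals can be illustrated with a toy queue: operations are registered one by one, inserted as a single batch, and removed from the queue only after the insert succeeds. This is a sketch under simplifying assumptions; `InMemoryQueue` and `sync_once` are hypothetical names and do not reproduce the library's `Storage` API.

```python
from collections import deque
from itertools import islice


class InMemoryQueue:
    """Toy stand-in for an intermediate storage (hypothetical, for illustration)."""

    def __init__(self):
        self._ops = deque()

    def register(self, op, pk):
        self._ops.append((op, pk))

    def get_batch(self, size):
        # Peek at the oldest operations without removing them
        return list(islice(self._ops, size))

    def confirm(self, count):
        # Drop operations only after they were inserted successfully
        for _ in range(count):
            self._ops.popleft()


def sync_once(queue, insert_batch, batch_size=1000):
    """Insert one batch; if insert_batch raises, the operations stay queued."""
    batch = queue.get_batch(batch_size)
    if not batch:
        return 0
    insert_batch(batch)
    queue.confirm(len(batch))
    return len(batch)
```

If `insert_batch` fails (ClickHouse is down, the network drops), the next call to `sync_once` sees the same operations again, so nothing is lost.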
|  |   | ||||||
|  | ## Replication and routing | ||||||
|  | In simple cases there is a single database, or a cluster with the same tables on each replica. | ||||||
|  | But because ClickHouse has per-table replication, a more complicated structure can be built: | ||||||
|  | 1. Model A is stored on servers 1 and 2 | ||||||
|  | 2. Model B is stored on servers 2, 3 and 5 | ||||||
|  | 3. Model C is stored on servers 1, 3 and 4  | ||||||
|  |   | ||||||
|  | Moreover, migration operations in ClickHouse can also be auto-replicated (`ALTER TABLE`, for instance)  or not (`CREATE TABLE`). | ||||||
|  |    | ||||||
|  | In order to make replication scheme scalable: | ||||||
|  | 1. Each model has its own read / write / migrate [routing configuration](routing.md#clickhousemodel-routing-attributes). | ||||||
|  | 2. You can use [router](routing.md#router) like django does to set basic routing rules for all models or model groups.   | ||||||
|  |   | ||||||
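The Model A / B / C layout above could be expressed as a small placement table plus a router. The sketch below is plain Python and does not use the library's router interface: the class name, the server aliases, and the method names (borrowed from Django's database routers) are all assumptions made for illustration. It also assumes that an auto-replicated operation only needs to be run on one replica.

```python
import random

# Hypothetical placement matching the example above
PLACEMENT = {
    'ModelA': ('server1', 'server2'),
    'ModelB': ('server2', 'server3', 'server5'),
    'ModelC': ('server1', 'server3', 'server4'),
}


class PlacementRouter:
    """Illustrative router: picks a server that actually stores the model."""

    def db_for_read(self, model_name):
        return random.choice(PLACEMENT[model_name])

    def db_for_write(self, model_name):
        return random.choice(PLACEMENT[model_name])

    def allow_migrate(self, db_alias, model_name, auto_replicated):
        servers = PLACEMENT[model_name]
        if auto_replicated:
            # e.g. ALTER TABLE: run on one replica, ClickHouse replicates it
            return db_alias == servers[0]
        # e.g. CREATE TABLE: must be run on every server storing the model
        return db_alias in servers
```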
							
								
								
									
										141
									
								
								docs/overview.md
									
									
									
									
									
										Normal file
									
								
							
							
						
						
							|  | @ -0,0 +1,141 @@ | ||||||
|  | # Usage overview | ||||||
|  | ## Requirements | ||||||
|  | To begin with, it is assumed that you already have: | ||||||
|  | 1. [ClickHouse](https://clickhouse.tech/docs/en/) (with [ZooKeeper](https://zookeeper.apache.org/), if you use replication) | ||||||
|  | 2. Relational database used with [Django](https://www.djangoproject.com/). For instance, [PostgreSQL](https://www.postgresql.org/) | ||||||
|  | 3. [Django database set up](https://docs.djangoproject.com/en/3.0/ref/databases/) | ||||||
|  | 4. [Intermediate storage](storages.md) set up. For instance, [Redis](https://redis.io/). | ||||||
|  | 
 | ||||||
|  | ## Configuration | ||||||
|  | Add required parameters to [Django settings.py](https://docs.djangoproject.com/en/3.0/topics/settings/): | ||||||
|  | 1. [CLICKHOUSE_DATABASES](configuration.md#clickhouse_databases) | ||||||
|  | 2. [Intermediate storage](storages.md) configuration. For instance, [RedisStorage](storages.md#redisstorage) | ||||||
|  | 3. It's recommended to change [CLICKHOUSE_CELERY_QUEUE](configuration.md#clickhouse_celery_queue) | ||||||
|  | 4. Add sync task to [celerybeat schedule](http://docs.celeryproject.org/en/v2.3.3/userguide/periodic-tasks.html).   | ||||||
|  |   Note that running the planner every 2 seconds does not mean that sync is executed every 2 seconds. | ||||||
|  |   How often a model is synced depends on the model's `sync_delay` attribute and the [CLICKHOUSE_SYNC_DELAY](configuration.md#clickhouse_sync_delay) configuration parameter. | ||||||
|  |   You can read more in [sync section](synchronization.md). | ||||||
|  | 
 | ||||||
|  | You can also change other [configuration parameters](configuration.md) depending on your project. | ||||||
|  | 
 | ||||||
|  | #### Example | ||||||
|  | ```python | ||||||
|  | # django-clickhouse library setup | ||||||
|  | CLICKHOUSE_DATABASES = { | ||||||
|  |     # Connection name to refer in using(...) method  | ||||||
|  |     'default': { | ||||||
|  |         'db_name': 'test', | ||||||
|  |         'username': 'default', | ||||||
|  |         'password': '' | ||||||
|  |     } | ||||||
|  | } | ||||||
|  | CLICKHOUSE_REDIS_CONFIG = { | ||||||
|  |     'host': '127.0.0.1', | ||||||
|  |     'port': 6379, | ||||||
|  |     'db': 8, | ||||||
|  |     'socket_timeout': 10 | ||||||
|  | } | ||||||
|  | CLICKHOUSE_CELERY_QUEUE = 'clickhouse' | ||||||
|  | 
 | ||||||
|  | # If you have no celerybeat tasks yet, define a new dictionary | ||||||
|  | # More info: http://docs.celeryproject.org/en/v2.3.3/userguide/periodic-tasks.html | ||||||
|  | from datetime import timedelta | ||||||
|  | CELERYBEAT_SCHEDULE = { | ||||||
|  |     'clickhouse_auto_sync': { | ||||||
|  |         'task': 'django_clickhouse.tasks.clickhouse_auto_sync', | ||||||
|  |         'schedule': timedelta(seconds=2),  # Every 2 seconds | ||||||
|  |         'options': {'expires': 1, 'queue': CLICKHOUSE_CELERY_QUEUE} | ||||||
|  |     } | ||||||
|  | } | ||||||
|  | ``` | ||||||
|  | 
 | ||||||
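The difference between the planner interval and the actual sync interval can be shown with a small simulation. This is only a sketch of the idea that a model is synced when at least `sync_delay` seconds have passed since its previous sync; the function `simulate` is hypothetical and the library's real planner may decide differently.

```python
def simulate(planner_interval, sync_delay, duration):
    """Return the moments (in seconds) at which a model would be synced."""
    synced_at = []
    last_sync = None
    t = 0
    while t <= duration:
        # The planner wakes up every planner_interval seconds,
        # but syncs only when sync_delay has elapsed since the last sync
        if last_sync is None or t - last_sync >= sync_delay:
            synced_at.append(t)
            last_sync = t
        t += planner_interval
    return synced_at


moments = simulate(planner_interval=2, sync_delay=5, duration=12)
```

Under these assumptions a planner running every 2 seconds with `sync_delay = 5` syncs the model every 6 seconds, which is why the schedule interval should be read as a lower bound.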
|  | ## Adopting django model | ||||||
|  | Read [ClickHouseSyncModel](models.md#djangomodel) section. | ||||||
|  | Inherit all [django models](https://docs.djangoproject.com/en/3.0/topics/db/models/)  | ||||||
|  |  you want to sync with ClickHouse from `django_clickhouse.models.ClickHouseSyncModel` or sync mixins. | ||||||
|  | 
 | ||||||
|  | ```python | ||||||
|  | from django_clickhouse.models import ClickHouseSyncModel | ||||||
|  | from django.db import models | ||||||
|  | 
 | ||||||
|  | class User(ClickHouseSyncModel): | ||||||
|  |     first_name = models.CharField(max_length=50) | ||||||
|  |     visits = models.IntegerField(default=0) | ||||||
|  |     birthday = models.DateField() | ||||||
|  | ``` | ||||||
|  | 
 | ||||||
|  | ## Create ClickHouseModel | ||||||
|  | 1. Read [ClickHouseModel section](models.md#clickhousemodel) | ||||||
|  | 2. Create `clickhouse_models.py` in your django app. | ||||||
|  | 3. Add `ClickHouseModel` class there: | ||||||
|  | ```python | ||||||
|  | from django_clickhouse.clickhouse_models import ClickHouseModel | ||||||
|  | from django_clickhouse.engines import MergeTree | ||||||
|  | from infi.clickhouse_orm import fields | ||||||
|  | from my_app.models import User | ||||||
|  | 
 | ||||||
|  | class ClickHouseUser(ClickHouseModel): | ||||||
|  |     django_model = User | ||||||
|  |     sync_delay = 5 | ||||||
|  |      | ||||||
|  |     id = fields.UInt32Field() | ||||||
|  |     first_name = fields.StringField() | ||||||
|  |     birthday = fields.DateField() | ||||||
|  |     visits = fields.UInt32Field(default=0) | ||||||
|  | 
 | ||||||
|  |     engine = MergeTree('birthday', ('birthday',)) | ||||||
|  | ``` | ||||||
|  | 
 | ||||||
|  | ## Migration to create table in ClickHouse | ||||||
|  | 1. Read [migrations](migrations.md) section | ||||||
|  | 2. Create `clickhouse_migrations` package in your django app | ||||||
|  | 3. Create `0001_initial.py` file inside the created package. Result structure should be: | ||||||
|  |     ``` | ||||||
|  |     my_app | ||||||
|  |     >> clickhouse_migrations | ||||||
|  |     >>>> __init__.py | ||||||
|  |     >>>> 0001_initial.py | ||||||
|  |     >> clickhouse_models.py | ||||||
|  |     >> models.py | ||||||
|  |     ``` | ||||||
|  | 
 | ||||||
|  | 4. Add content to file `0001_initial.py`: | ||||||
|  |     ```python | ||||||
|  |     from django_clickhouse import migrations | ||||||
|  |     from my_app.clickhouse_models import ClickHouseUser | ||||||
|  |      | ||||||
|  |     class Migration(migrations.Migration): | ||||||
|  |         operations = [ | ||||||
|  |             migrations.CreateTable(ClickHouseUser) | ||||||
|  |         ] | ||||||
|  |     ``` | ||||||
|  | 
 | ||||||
|  | ## Run migrations | ||||||
|  | Call [django migrate](https://docs.djangoproject.com/en/3.0/ref/django-admin/#django-admin-migrate) | ||||||
|  |  to apply created migration and create table in ClickHouse. | ||||||
|  | 
 | ||||||
|  | ## Set up and run celery sync process | ||||||
|  | Set up [celery worker](https://docs.celeryproject.org/en/latest/userguide/workers.html#starting-the-worker) for [CLICKHOUSE_CELERY_QUEUE](configuration.md#clickhouse_celery_queue) and [celerybeat](https://docs.celeryproject.org/en/latest/userguide/periodic-tasks.html#starting-the-scheduler).   | ||||||
|  | 
 | ||||||
|  | ## Test sync and write analytics queries | ||||||
|  | 1. Read [monitoring section](monitoring.md) in order to set up your monitoring system. | ||||||
|  | 2. Read [query section](queries.md) to understand how to query database. | ||||||
|  | 3. Create some data in source table with django. | ||||||
|  | 4. Check that it is synced. | ||||||
|  | 
 | ||||||
|  | #### Example | ||||||
|  | ```python | ||||||
|  | import datetime | ||||||
|  | import time | ||||||
|  | from my_app.models import User | ||||||
|  | from my_app.clickhouse_models import ClickHouseUser | ||||||
|  | 
 | ||||||
|  | u = User.objects.create(first_name='Alice', birthday=datetime.date(1987, 1, 1), visits=1) | ||||||
|  | 
 | ||||||
|  | # Wait until the celery task has been executed at least once | ||||||
|  | time.sleep(6) | ||||||
|  | 
 | ||||||
|  | assert ClickHouseUser.objects.filter(id=u.id).count() == 1, "Sync is not working" | ||||||
|  | ``` | ||||||
|  | 
 | ||||||
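A fixed `time.sleep(6)` can make such a check either slow or flaky. As an alternative, a small polling helper can wait until the row appears or a timeout expires. `wait_for` is a hypothetical helper written for this example, not something provided by the library.

```python
import time


def wait_for(predicate, timeout=10.0, interval=0.5):
    """Call predicate() repeatedly until it returns True or timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Usage with the models from the example above (requires a running setup):
# assert wait_for(lambda: ClickHouseUser.objects.filter(id=u.id).count() == 1), \
#     "Sync is not working"
```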
|  | ## Congratulations | ||||||
|  | Tune your integration to achieve better performance if needed: [docs](performance.md). | ||||||
							
								
								
									
										3
									
								
								docs/performance.md
									
									
									
									
									
										Normal file
									
								
							
							
						
						
							|  | @ -0,0 +1,3 @@ | ||||||
|  | # Sync performance | ||||||
|  | 
 | ||||||
|  | TODO | ||||||
|  | @ -1,4 +1,13 @@ | ||||||
| # Making queries | # Making queries | ||||||
|  | 
 | ||||||
|  | ## Motivation | ||||||
|  | The ClickHouse SQL language is close to the standard, but does not follow it exactly ([docs](https://clickhouse.tech/docs/en/introduction/distinctive_features/#sql-support)).   | ||||||
|  | It cannot be easily integrated into the Django query subsystem, which expects databases to support standard SQL features such as transactions and INNER/OUTER JOINs by condition.   | ||||||
|  | 
 | ||||||
|  | In order to fit it  | ||||||
|  | 
 | ||||||
|  | 
 | ||||||
|  | 
 | ||||||
| Libraries query system extends [infi.clickhouse-orm](https://github.com/Infinidat/infi.clickhouse_orm/blob/develop/docs/querysets.md). | Libraries query system extends [infi.clickhouse-orm](https://github.com/Infinidat/infi.clickhouse_orm/blob/develop/docs/querysets.md). | ||||||
| 
 | 
 | ||||||
| TODO | TODO | ||||||
|  |  | ||||||
|  | @ -15,9 +15,9 @@ Unlike traditional relational databases, [ClickHouse](https://clickhouse.yandex/ | ||||||
|  3) To make system more extendable we need default routing, per model routing and router class for complex cases. |  3) To make system more extendable we need default routing, per model routing and router class for complex cases. | ||||||
|   |   | ||||||
| ## Introduction | ## Introduction | ||||||
| All database connections are defined in [CLICKHOUSE_DATABASES](configuration.md#databases) setting. | All database connections are defined in [CLICKHOUSE_DATABASES](configuration.md#clickhouse_databases) setting. | ||||||
|  Each connection has its alias name to refer with. |  Each connection has its alias name to refer with. | ||||||
|  If no routing is configured, [CLICKHOUSE_DEFAULT_DB_ALIAS](configuration.md#default_db_alias) is used. |  If no routing is configured, [CLICKHOUSE_DEFAULT_DB_ALIAS](configuration.md#clickhouse_default_db_alias) is used. | ||||||
|   |   | ||||||
| ## Router | ## Router | ||||||
| Router is a class, defining 3 methods: | Router is a class, defining 3 methods: | ||||||
|  | @ -29,7 +29,7 @@ Router is a class, defining 3 methods: | ||||||
|   Checks if migration `operation` should be applied in django application `app_label` on database `db_alias`. |   Checks if migration `operation` should be applied in django application `app_label` on database `db_alias`. | ||||||
|   Optional `model` field can be used to determine migrations on concrete model. |   Optional `model` field can be used to determine migrations on concrete model. | ||||||
| 
 | 
 | ||||||
| By default [CLICKHOUSE_DATABASE_ROUTER](configuration.md#database_router) is used. | By default [CLICKHOUSE_DATABASE_ROUTER](configuration.md#clickhouse_database_router) is used. | ||||||
|  It gets routing information from model fields, described below.   |  It gets routing information from model fields, described below.   | ||||||
|   |   | ||||||
| ## ClickHouseModel routing attributes | ## ClickHouseModel routing attributes | ||||||
|  | @ -54,7 +54,8 @@ class MyModel(ClickHouseModel): | ||||||
|  ``` |  ``` | ||||||
| 
 | 
 | ||||||
| ## Settings database in QuerySet | ## Settings database in QuerySet | ||||||
| Database can be set in each [QuerySet](# TODO) explicitly by using one of methods: | <!--- TODO Add link ---> | ||||||
|  | Database can be set in each [QuerySet]() explicitly by using one of methods: | ||||||
| * With [infi approach](https://github.com/Infinidat/infi.clickhouse_orm/blob/develop/docs/querysets.md#querysets): `MyModel.objects_in(db_object).filter(id__in=[1,2,3]).count()` | * With [infi approach](https://github.com/Infinidat/infi.clickhouse_orm/blob/develop/docs/querysets.md#querysets): `MyModel.objects_in(db_object).filter(id__in=[1,2,3]).count()` | ||||||
| * With `using()` method: `MyModel.objects.filter(id__in=[1,2,3]).using(db_alias).count()` | * With `using()` method: `MyModel.objects.filter(id__in=[1,2,3]).using(db_alias).count()` | ||||||
| 
 | 
 | ||||||
|  |  | ||||||
|  | @ -49,18 +49,18 @@ Each method of abstract `Storage` class takes `kwargs` parameters, which can be | ||||||
|    |    | ||||||
| * `post_sync_failed(import_key: str, exception: Exception, **kwargs) -> None:`   | * `post_sync_failed(import_key: str, exception: Exception, **kwargs) -> None:`   | ||||||
|   Called if any exception has occurred during import process. It cleans storage after unsuccessful import. |   Called if any exception has occurred during import process. It cleans storage after unsuccessful import. | ||||||
|   Note that if import process is hardly killed (with OOM, for instance) this method is not called. |   Note that if import process is hardly killed (with OOM killer, for instance) this method is not called. | ||||||
|    |    | ||||||
| * `flush() -> None`   | * `flush() -> None`   | ||||||
|   *Dangerous*. Drops all data, kept by storage. It is used for cleaning up between tests. |   *Dangerous*. Drops all data, kept by storage. It is used for cleaning up between tests. | ||||||
| 
 | 
 | ||||||
| 
 | 
 | ||||||
| ## Predefined storages | ## Predefined storages | ||||||
| ### <a name="redis_storage">RedisStorage</a> | ### RedisStorage | ||||||
| This storage uses [Redis database](https://redis.io/) as intermediate storage.  | This storage uses [Redis database](https://redis.io/) as intermediate storage.  | ||||||
| To communicate with Redis it uses [redis-py](https://redis-py.readthedocs.io/en/latest/) library.  | To communicate with Redis it uses [redis-py](https://redis-py.readthedocs.io/en/latest/) library.  | ||||||
| It is not required, but should be installed to use RedisStorage.  | It is not required, but should be installed to use RedisStorage.  | ||||||
| In order to use RedisStorage you must also fill [CLICKHOUSE_REDIS_CONFIG](configuration.md#redis_config) parameter. | In order to use RedisStorage you must also fill [CLICKHOUSE_REDIS_CONFIG](configuration.md#clickhouse_redis_config) parameter. | ||||||
| 
 | 
 | ||||||
| Stored operation contains: | Stored operation contains: | ||||||
| * Django database alias where original record can be found. | * Django database alias where original record can be found. | ||||||
|  |  | ||||||
|  | @ -1 +1,3 @@ | ||||||
| # Synchronization | # Synchronization | ||||||
|  | 
 | ||||||
|  | TODO | ||||||
|  | @ -188,7 +188,7 @@ class ClickHouseSyncModel(DjangoModel): | ||||||
| 
 | 
 | ||||||
| @receiver(post_save) | @receiver(post_save) | ||||||
| def post_save(sender, instance, **kwargs): | def post_save(sender, instance, **kwargs): | ||||||
|     statsd.incr('clickhouse.sync.post_save'.format('post_save'), 1) |     statsd.incr('%s.sync.post_save' % config.STATSD_PREFIX, 1) | ||||||
|     if issubclass(sender, ClickHouseSyncModel): |     if issubclass(sender, ClickHouseSyncModel): | ||||||
|         instance.post_save(kwargs.get('created', False), using=kwargs.get('using')) |         instance.post_save(kwargs.get('created', False), using=kwargs.get('using')) | ||||||
| 
 | 
 | ||||||
|  |  | ||||||