Usage overview
Requirements
Before you begin, you should already have:
- ClickHouse (with ZooKeeper, if you use replication)
- Relational database used with Django. For instance, PostgreSQL
- Django database set up
- Intermediate storage set up. For instance, Redis
- Celery set up in order to sync data automatically.
Configuration
Add required parameters to Django settings.py:
- Add 'django_clickhouse' to INSTALLED_APPS
- CLICKHOUSE_DATABASES
- Intermediate storage configuration. For instance, RedisStorage
- It's recommended to change CLICKHOUSE_CELERY_QUEUE
- Add sync task to celerybeat schedule.
Note that running the planner every 2 seconds doesn't mean that sync is executed every 2 seconds. Sync time depends on the model's sync_delay attribute and the CLICKHOUSE_SYNC_DELAY configuration parameter. You can read more in the sync section.
You can also change other configuration parameters depending on your project.
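The difference between planner frequency and actual sync frequency can be illustrated with a small standalone sketch. It does not use django-clickhouse itself: is_sync_due and simulate are hypothetical helpers, and the delay check is an assumption about the general idea, not the library's real implementation.

```python
import time


def is_sync_due(last_sync_ts, sync_delay, now=None):
    """Return True if at least sync_delay seconds passed since the last sync.

    Hypothetical helper: it only models the idea that a planner run
    triggers a sync no more often than once per sync_delay.
    """
    now = time.time() if now is None else now
    return last_sync_ts is None or now - last_sync_ts >= sync_delay


def simulate(ticks, planner_interval, sync_delay):
    """Run the planner `ticks` times and return the moments a sync happened."""
    last_sync_ts = None
    synced_at = []
    for i in range(ticks):
        now = i * planner_interval  # simulated clock, in seconds
        if is_sync_due(last_sync_ts, sync_delay, now=now):
            synced_at.append(now)
            last_sync_ts = now
    return synced_at
```

With the planner running every 2 seconds and a delay of 5 seconds, simulate(6, 2, 5) reports syncs only on some of the planner runs; the remaining runs do nothing.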
Example
INSTALLED_APPS = (
    # Your apps may go here
    'django_clickhouse',
    # Your apps may go here
)
# django-clickhouse library setup
CLICKHOUSE_DATABASES = {
    # Connection name to refer to in the using(...) method
    'default': {
        'db_name': 'test',
        'username': 'default',
        'password': ''
    }
}
CLICKHOUSE_REDIS_CONFIG = {
    'host': '127.0.0.1',
    'port': 6379,
    'db': 8,
    'socket_timeout': 10
}
CLICKHOUSE_CELERY_QUEUE = 'clickhouse'
# If you don't have any celerybeat tasks yet, define a new dictionary
# More info: http://docs.celeryproject.org/en/v2.3.3/userguide/periodic-tasks.html
from datetime import timedelta
CELERYBEAT_SCHEDULE = {
    'clickhouse_auto_sync': {
        'task': 'django_clickhouse.tasks.clickhouse_auto_sync',
        'schedule': timedelta(seconds=2),  # Every 2 seconds
        'options': {'expires': 1, 'queue': CLICKHOUSE_CELERY_QUEUE}
    }
}
Adopting django model
Read the ClickHouseSyncModel section.
Inherit all Django models you want to sync with ClickHouse from django_clickhouse.models.ClickHouseSyncModel or sync mixins.
from django_clickhouse.models import ClickHouseSyncModel
from django.db import models

class User(ClickHouseSyncModel):
    first_name = models.CharField(max_length=50)
    visits = models.IntegerField(default=0)
    birthday = models.DateField()
Create ClickHouseModel
- Read the ClickHouseModel section
- Create clickhouse_models.py in your Django app.
- Add a ClickHouseModel class there:
from django_clickhouse.clickhouse_models import ClickHouseModel
from django_clickhouse.engines import MergeTree
from infi.clickhouse_orm import fields
from my_app.models import User

class ClickHouseUser(ClickHouseModel):
    django_model = User
    # Uncomment the line below if you want your models to be synced automatically
    # sync_enabled = True
    id = fields.UInt32Field()
    first_name = fields.StringField()
    birthday = fields.DateField()
    visits = fields.UInt32Field(default=0)
    engine = MergeTree('birthday', ('birthday',))
Important note: the clickhouse_models.py file is not imported by Django initialization code. If your ClickHouse models are not used anywhere outside this file, you should import it somewhere in your code for synchronization to work correctly. For instance, you can customize AppConfig like this:
from django.apps import AppConfig

class MyAppConfig(AppConfig):
    name = 'my_app'

    def ready(self):
        from my_app.clickhouse_models import ClickHouseUser
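Why an unused import can matter is easier to see with a generic Python sketch. It shows the common pattern where a class becomes known to a framework only when the code defining it is executed. This is an illustration of the general idea, not django-clickhouse internals: REGISTRY, Base and define_models are hypothetical names, and the library may discover models differently.

```python
REGISTRY = []  # hypothetical list of model names known to the "framework"


class Base:
    """Hypothetical base class that records every subclass when it is defined."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        REGISTRY.append(cls.__name__)


def define_models():
    """Stands in for importing a module such as clickhouse_models.py."""

    class ClickHouseUser(Base):
        pass

    return ClickHouseUser
```

Until define_models() is called, REGISTRY stays empty. In the same way, a module that is never imported never gets the chance to register its classes.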
Migration to create table in ClickHouse
- Read the migrations section
- Create a clickhouse_migrations package in your Django app
- Create a 0001_initial.py file inside the created package. The resulting structure should be:

    my_app
    | clickhouse_migrations
    |-- __init__.py
    |-- 0001_initial.py
    | clickhouse_models.py
    | models.py

- Add content to the file 0001_initial.py:

    from django_clickhouse import migrations
    from my_app.clickhouse_models import ClickHouseUser

    class Migration(migrations.Migration):
        operations = [
            migrations.CreateTable(ClickHouseUser)
        ]
Run migrations
Run Django's migrate command to apply the created migration and create the table in ClickHouse.
Set up and run celery sync process
Set up celery worker for CLICKHOUSE_CELERY_QUEUE and celerybeat.
Test sync and write analytics queries
- Read monitoring section in order to set up your monitoring system.
- Read query section to understand how to query database.
- Create some data in source table with django.
- Check that it is synced.
Example
import datetime
import time
from my_app.models import User
from my_app.clickhouse_models import ClickHouseUser

u = User.objects.create(first_name='Alice', birthday=datetime.date(1987, 1, 1), visits=1)
# Wait until the celery task has been executed at least once
time.sleep(6)
assert ClickHouseUser.objects.filter(id=u.id).count() == 1, "Sync is not working"
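A fixed sleep can be either too short or needlessly long, depending on your sync delay settings. As an alternative, the check can poll until the row appears or a timeout expires. The sketch below is plain Python; wait_until is a hypothetical helper, not part of django-clickhouse, and the default timeout and interval are arbitrary.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.5):
    """Call predicate() repeatedly until it returns True or timeout expires.

    Returns True if the predicate succeeded, False if the timeout was hit.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Possible usage with the models from the example above:
# synced = wait_until(lambda: ClickHouseUser.objects.filter(id=u.id).count() == 1)
# assert synced, "Sync is not working"
```

The predicate is always checked at least once, so an already synced row returns immediately without sleeping.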
Congratulations
Tune your integration to achieve better performance if needed: docs.