Mirror of https://github.com/carrotquest/django-clickhouse.git, synced 2025-11-04 01:47:46 +03:00
Usage overview
Requirements
Before you begin, you should already have:
- ClickHouse (with ZooKeeper, if you use replication)
 - Relational database used with Django. For instance, PostgreSQL
 - Django database set up
 - Intermediate storage set up. For instance, Redis
 - Celery set up in order to sync data automatically.
 
Configuration
Add required parameters to Django settings.py:
- Add 'django_clickhouse' to INSTALLED_APPS
 - CLICKHOUSE_DATABASES
 - Intermediate storage configuration. For instance, RedisStorage
 - It's recommended to change CLICKHOUSE_CELERY_QUEUE
 - Add sync task to celerybeat schedule.
Note that running the planner every 2 seconds doesn't mean a sync is executed every 2 seconds. The sync interval depends on the model's sync_delay attribute and the CLICKHOUSE_SYNC_DELAY configuration parameter. You can read more in the sync section.
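To make the distinction concrete, here is a minimal sketch of the idea. It is not the library's actual implementation: the function name is_sync_due and the 5-second delay are assumptions chosen for illustration. The planner ticks every 2 seconds, but a sync is triggered only once the delay has elapsed since the previous sync.

```python
from datetime import datetime, timedelta


def is_sync_due(last_sync: datetime, now: datetime, sync_delay: int) -> bool:
    """Return True when at least sync_delay seconds have passed since last_sync."""
    return now - last_sync >= timedelta(seconds=sync_delay)


# Simulate a planner that ticks every 2 seconds for a model with a
# hypothetical sync_delay of 5 seconds.
start = datetime(2024, 1, 1, 0, 0, 0)
last_sync = start
sync_ticks = []
for tick in range(0, 13, 2):  # planner runs at 0, 2, 4, ..., 12 seconds
    now = start + timedelta(seconds=tick)
    if is_sync_due(last_sync, now, sync_delay=5):
        sync_ticks.append(tick)
        last_sync = now
```

In this simulation the planner runs seven times, but a sync is triggered only twice (at seconds 6 and 12).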
You can also change other configuration parameters depending on your project.
Example
INSTALLED_APPS = (
    # Your apps may go here
    'django_clickhouse',
    # Your apps may go here
)
# django-clickhouse library setup
CLICKHOUSE_DATABASES = {
    # Connection name to refer to in the using(...) method
    'default': {
        'db_name': 'test',
        'username': 'default',
        'password': ''
    }
}
CLICKHOUSE_REDIS_CONFIG = {
    'host': '127.0.0.1',
    'port': 6379,
    'db': 8,
    'socket_timeout': 10
}
CLICKHOUSE_CELERY_QUEUE = 'clickhouse'
# If you have no celerybeat tasks yet, define a new dictionary
# More info: http://docs.celeryproject.org/en/v2.3.3/userguide/periodic-tasks.html
from datetime import timedelta
CELERYBEAT_SCHEDULE = {
    'clickhouse_auto_sync': {
        'task': 'django_clickhouse.tasks.clickhouse_auto_sync',
        'schedule': timedelta(seconds=2),  # Every 2 seconds
        'options': {'expires': 1, 'queue': CLICKHOUSE_CELERY_QUEUE}
    }
}
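If your project already defines celerybeat tasks, the sync entry can be added to the existing dictionary instead of replacing it. A minimal sketch, assuming a hypothetical existing task named my_app.tasks.cleanup:

```python
from datetime import timedelta

CLICKHOUSE_CELERY_QUEUE = 'clickhouse'

# Hypothetical schedule that already exists in your settings.py
CELERYBEAT_SCHEDULE = {
    'cleanup': {
        'task': 'my_app.tasks.cleanup',  # hypothetical task, for illustration only
        'schedule': timedelta(hours=1),
    },
}

# Add the sync planner without overwriting existing entries
CELERYBEAT_SCHEDULE['clickhouse_auto_sync'] = {
    'task': 'django_clickhouse.tasks.clickhouse_auto_sync',
    'schedule': timedelta(seconds=2),  # Every 2 seconds
    'options': {'expires': 1, 'queue': CLICKHOUSE_CELERY_QUEUE},
}
```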
Adapting the Django model
Read the ClickHouseSyncModel section.
Inherit all Django models you want to sync with ClickHouse from django_clickhouse.models.ClickHouseSyncModel or the sync mixins.
from django_clickhouse.models import ClickHouseSyncModel
from django.db import models
class User(ClickHouseSyncModel):
    first_name = models.CharField(max_length=50)
    visits = models.IntegerField(default=0)
    birthday = models.DateField()
Create ClickHouseModel
- Read the ClickHouseModel section
 - Create clickhouse_models.py in your Django app.
 - Add a ClickHouseModel class there:
from django_clickhouse.clickhouse_models import ClickHouseModel
from django_clickhouse.engines import MergeTree
from infi.clickhouse_orm import fields
from my_app.models import User
class ClickHouseUser(ClickHouseModel):
    django_model = User
    
    # Uncomment the line below if you want your models to be synced automatically
    # sync_enabled = True
    
    id = fields.UInt32Field()
    first_name = fields.StringField()
    birthday = fields.DateField()
    visits = fields.UInt32Field(default=0)
    engine = MergeTree('birthday', ('birthday',))
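Note that the example maps Django's signed IntegerField (visits) to an unsigned UInt32Field. As a sketch of what that implies, the helper below checks whether a value fits the UInt32 range of 0 to 2**32 - 1. The name fits_uint32 is hypothetical and not part of django-clickhouse, and how the library treats out-of-range values is not covered here.

```python
UINT32_MAX = 2**32 - 1


def fits_uint32(value: int) -> bool:
    """Return True if value is representable as an unsigned 32-bit integer."""
    return 0 <= value <= UINT32_MAX
```

A negative visits value, which the Django field allows, would fail this check.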
Migration to create table in ClickHouse
- Read the migrations section
 - Create a clickhouse_migrations package in your Django app
 - Create a 0001_initial.py file inside the created package. The resulting structure should be:
my_app
| clickhouse_migrations
|-- __init__.py
|-- 0001_initial.py
| clickhouse_models.py
| models.py
 - Add content to the file 0001_initial.py:
from django_clickhouse import migrations
from my_app.clickhouse_models import ClickHouseUser

class Migration(migrations.Migration):
    operations = [
        migrations.CreateTable(ClickHouseUser)
    ]
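The package layout described above can also be created with a short script. This is only a sketch using the standard library: the helper name scaffold_clickhouse_migrations is hypothetical, and the demonstration runs in a temporary directory standing in for your project root.

```python
import tempfile
from pathlib import Path

MIGRATION_SOURCE = '''\
from django_clickhouse import migrations
from my_app.clickhouse_models import ClickHouseUser


class Migration(migrations.Migration):
    operations = [
        migrations.CreateTable(ClickHouseUser)
    ]
'''


def scaffold_clickhouse_migrations(app_dir: Path) -> Path:
    """Create clickhouse_migrations/ with __init__.py and 0001_initial.py."""
    package = app_dir / 'clickhouse_migrations'
    package.mkdir(parents=True, exist_ok=True)
    (package / '__init__.py').touch()
    initial = package / '0001_initial.py'
    if not initial.exists():  # never overwrite an existing migration
        initial.write_text(MIGRATION_SOURCE)
    return package


# Demonstration in a temporary directory standing in for the project root
with tempfile.TemporaryDirectory() as tmp:
    package = scaffold_clickhouse_migrations(Path(tmp) / 'my_app')
    created = sorted(p.name for p in package.iterdir())
```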
Run migrations
Run Django's migrate command to apply the created migration and create the table in ClickHouse.
Set up and run celery sync process
Set up a Celery worker for CLICKHOUSE_CELERY_QUEUE and run celerybeat.
Test sync and write analytics queries
- Read the monitoring section in order to set up your monitoring system.
 - Read the query section to understand how to query the database.
 - Create some data in the source table with Django.
 - Check whether it is synced.
 
Example
import datetime
import time
from my_app.models import User
from my_app.clickhouse_models import ClickHouseUser
u = User.objects.create(first_name='Alice', birthday=datetime.date(1987, 1, 1), visits=1)
# Wait until the celery task has been executed at least once
time.sleep(6)
assert ClickHouseUser.objects.filter(id=u.id).count() == 1, "Sync is not working"
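A fixed sleep can be too short on a busy worker or needlessly long on a fast one. As an alternative, here is a minimal polling sketch. The helper wait_until is a hypothetical name and not part of django-clickhouse, and the timeout and interval defaults are arbitrary.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.5):
    """Poll predicate() until it returns True or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

With the models above, the check could then read: assert wait_until(lambda: ClickHouseUser.objects.filter(id=u.id).count() == 1), "Sync is not working"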
Congratulations
Tune your integration to achieve better performance if needed: docs.