A Python library for working with the ClickHouse database (https://clickhouse.yandex/)

Introduction

This project is a simple ORM for working with the ClickHouse database. It allows you to define model classes whose instances can be written to the database and read from it.

Let's jump right in with a simple example of monitoring CPU usage. First we need to define the model class, connect to the database and create a table for the model:

from infi.clickhouse_orm import Database, Model, DateTimeField, UInt16Field, Float32Field, Memory, F

class CPUStats(Model):

    timestamp = DateTimeField()
    cpu_id = UInt16Field()
    cpu_percent = Float32Field()

    engine = Memory()

db = Database('demo')
db.create_table(CPUStats)

Now we can collect usage statistics per CPU, and write them to the database:

import datetime
import time

import psutil

psutil.cpu_percent(percpu=True) # first sample should be discarded
while True:
    time.sleep(1)
    stats = psutil.cpu_percent(percpu=True)
    timestamp = datetime.datetime.now()
    db.insert([
        CPUStats(timestamp=timestamp, cpu_id=cpu_id, cpu_percent=cpu_percent)
        for cpu_id, cpu_percent in enumerate(stats)
    ])

Querying the table is easy, using either the query builder or raw SQL:

# Calculate what percentage of the time CPU 1 was over 95% busy
queryset = CPUStats.objects_in(db)
total = queryset.filter(CPUStats.cpu_id == 1).count()
busy = queryset.filter(CPUStats.cpu_id == 1, CPUStats.cpu_percent > 95).count()
print('CPU 1 was busy {:.2f}% of the time'.format(busy * 100.0 / total))

# Calculate the average usage per CPU
for row in queryset.aggregate(CPUStats.cpu_id, average=F.avg(CPUStats.cpu_percent)):
    print('CPU {row.cpu_id}: {row.average:.2f}%'.format(row=row))
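
The snippets above use only the query builder. As a rough sketch of the raw SQL route, the per-CPU average could also be computed with `Database.select`. The constant `AVERAGE_USAGE_SQL` and the helper `average_usage_per_cpu` are hypothetical names introduced here for illustration. The sketch assumes the model's default table name (`cpustats`, the lowercased class name) and the `$db` placeholder, which the library substitutes with the database name.

```python
# Hypothetical raw SQL counterpart of the aggregate() query above.
# Assumes the default table name `cpustats` and the `$db` placeholder,
# which is expected to be replaced with the database name.
AVERAGE_USAGE_SQL = (
    "SELECT cpu_id, avg(cpu_percent) AS average "
    "FROM $db.cpustats "
    "GROUP BY cpu_id "
    "ORDER BY cpu_id"
)


def average_usage_per_cpu(db):
    """Return a list of (cpu_id, average) pairs computed with raw SQL.

    `db` is expected to be a Database instance such as the one created
    in the first example. No model class is passed to select(), so each
    row is an ad-hoc object whose attributes follow the column names
    used in the query.
    """
    return [(row.cpu_id, row.average) for row in db.select(AVERAGE_USAGE_SQL)]
```

Calling `average_usage_per_cpu(db)` with the `db` object from the first example should yield one pair per CPU, which can then be printed in the same way as the query builder results.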

This and other examples can be found in the examples folder.

To learn more please visit the documentation.