mirror of https://github.com/Infinidat/infi.clickhouse_orm.git synced 2025-09-25 21:06:34 +03:00

A Python library for working with the ClickHouse database (https://clickhouse.yandex/)

Introduction

This project is a simple ORM for working with the ClickHouse database. It allows you to define model classes whose instances can be written to the database and read from it.

Let's jump right in with a simple example of monitoring CPU usage. First we need to define the model class, connect to the database and create a table for the model:

from infi.clickhouse_orm import Database, Model, DateTimeField, UInt16Field, Float32Field, Memory, F

class CPUStats(Model):

    timestamp = DateTimeField()
    cpu_id = UInt16Field()
    cpu_percent = Float32Field()

    engine = Memory()

db = Database('demo')
db.create_table(CPUStats)
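
To make the mapping from model to table more concrete, here is a rough sketch of the kind of DDL such a model corresponds to. The helper `sketch_create_table_sql`, the column list and the table name `cpustats` are hypothetical and written only for illustration; the exact statement the library sends may differ.

```python
# Hypothetical illustration, not the library's own code:
# each model field paired with the ClickHouse column type it stands for.
CPU_STATS_COLUMNS = [
    ('timestamp', 'DateTime'),
    ('cpu_id', 'UInt16'),
    ('cpu_percent', 'Float32'),
]

def sketch_create_table_sql(db_name, table_name, columns, engine='Memory'):
    """Build a CREATE TABLE statement from (name, type) pairs."""
    column_defs = ', '.join(
        '{} {}'.format(name, db_type) for name, db_type in columns
    )
    return 'CREATE TABLE IF NOT EXISTS {}.{} ({}) ENGINE = {}'.format(
        db_name, table_name, column_defs, engine
    )

# Assumes the table is named after the lowercased model class.
print(sketch_create_table_sql('demo', 'cpustats', CPU_STATS_COLUMNS))
```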

Now we can collect usage statistics per CPU, and write them to the database:

import psutil, time, datetime

psutil.cpu_percent(percpu=True) # first sample should be discarded
while True:
    time.sleep(1)
    stats = psutil.cpu_percent(percpu=True)
    timestamp = datetime.datetime.now()
    db.insert([
        CPUStats(timestamp=timestamp, cpu_id=cpu_id, cpu_percent=cpu_percent)
        for cpu_id, cpu_percent in enumerate(stats)
    ])
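
The loop above runs until interrupted. For experiments it can help to bound the number of samples and to keep row building separate from I/O. The following is a minimal sketch under that assumption; `build_rows` and `collect` are hypothetical helpers, not part of the library, and they work with plain dicts so they can run without a database.

```python
import datetime
import time

def build_rows(stats, timestamp):
    """Turn one per-CPU sample into plain dicts, one per CPU."""
    return [
        {'timestamp': timestamp, 'cpu_id': cpu_id, 'cpu_percent': cpu_percent}
        for cpu_id, cpu_percent in enumerate(stats)
    ]

def collect(sample, write, iterations, interval=1.0):
    """Take `iterations` samples and hand each batch of rows to `write`."""
    for _ in range(iterations):
        time.sleep(interval)
        write(build_rows(sample(), datetime.datetime.now()))
```

With the names used in this README, `sample` could be `lambda: psutil.cpu_percent(percpu=True)` and `write` could be `lambda rows: db.insert([CPUStats(**row) for row in rows])`.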

Querying the table is easy, using either the query builder or raw SQL:

# Calculate what percentage of the time CPU 1 was over 95% busy
queryset = CPUStats.objects_in(db)
total = queryset.filter(CPUStats.cpu_id == 1).count()
busy = queryset.filter(CPUStats.cpu_id == 1, CPUStats.cpu_percent > 95).count()
if total:  # avoid dividing by zero when there are no samples for CPU 1
    print('CPU 1 was busy {:.2f}% of the time'.format(busy * 100.0 / total))

# Calculate the average usage per CPU
for row in queryset.aggregate(CPUStats.cpu_id, average=F.avg(CPUStats.cpu_percent)):
    print('CPU {row.cpu_id}: {row.average:.2f}%'.format(row=row))
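
The examples above use the query builder. As a hedged sketch of the raw SQL route, the same two calculations could be written as ClickHouse queries and passed to `db.select`. This assumes the `$db` placeholder described in the library's documentation and a table named `cpustats` (the lowercased model class name); `busy_percent_sql`, `AVERAGE_SQL` and `print_average_usage` are names made up for this illustration.

```python
def busy_percent_sql(cpu_id, threshold=95):
    """Build a raw query for the share of samples where one CPU exceeded `threshold`."""
    # Values are coerced to numbers before being placed into the SQL text.
    return (
        "SELECT countIf(cpu_percent > {threshold}) * 100.0 / count() AS busy_percent "
        "FROM $db.cpustats WHERE cpu_id = {cpu_id}"
    ).format(threshold=float(threshold), cpu_id=int(cpu_id))

# Average usage per CPU; the table name is an assumption (see above).
AVERAGE_SQL = (
    "SELECT cpu_id, avg(cpu_percent) AS average "
    "FROM $db.cpustats GROUP BY cpu_id ORDER BY cpu_id"
)

def print_average_usage(db):
    """Run the aggregate query through db.select and print one line per CPU."""
    for row in db.select(AVERAGE_SQL):
        print('CPU {}: {:.2f}%'.format(row.cpu_id, row.average))
```

The busy percentage could then be read with `for row in db.select(busy_percent_sql(1)): print(row.busy_percent)`. No model class is passed here because the result columns are aggregates rather than fields of `CPUStats`.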

To learn more, please visit the documentation.