mirror of https://github.com/Infinidat/infi.clickhouse_orm.git synced 2025-09-06 11:54:46 +03:00

A Python library for working with the ClickHouse database (https://clickhouse.yandex/)

README.md

A fork of infi.clickhouse_orm aimed at more frequent maintenance and bugfixes.

Introduction

This project is a simple ORM for working with the ClickHouse database. It allows you to define model classes whose instances can be written to the database and read from it.

Let's jump right in with a simple example of monitoring CPU usage. First we need to define the model class, connect to the database and create a table for the model:

from clickhouse_orm import Database, Model, DateTimeField, UInt16Field, Float32Field, Memory, F

class CPUStats(Model):

    timestamp = DateTimeField()
    cpu_id = UInt16Field()
    cpu_percent = Float32Field()

    engine = Memory()

db = Database('demo')
db.create_table(CPUStats)

Now we can collect usage statistics per CPU, and write them to the database:

import datetime
import time

import psutil

psutil.cpu_percent(percpu=True) # first sample should be discarded
while True:
    time.sleep(1)
    stats = psutil.cpu_percent(percpu=True)
    timestamp = datetime.datetime.now()
    db.insert([
        CPUStats(timestamp=timestamp, cpu_id=cpu_id, cpu_percent=cpu_percent)
        for cpu_id, cpu_percent in enumerate(stats)
    ])
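
The loop above runs until it is interrupted. For a quick trial, a bounded variant may be easier to work with. The following is only a sketch under stated assumptions, not part of the library: `collect`, `sample_fn` and `insert_fn` are hypothetical names, and rows are plain dicts so that the block runs without psutil or a ClickHouse server. In real use, `sample_fn` would presumably wrap `psutil.cpu_percent(percpu=True)`, and `insert_fn` would build `CPUStats` instances and pass them to `db.insert`.

```python
import datetime
import time


def collect(sample_fn, insert_fn, iterations, interval=1.0):
    """Take `iterations` samples and hand each batch of rows to `insert_fn`.

    sample_fn: hypothetical callable returning one usage percentage per CPU.
    insert_fn: hypothetical callable receiving a list of row dicts.
    Returns the total number of rows handed to `insert_fn`.
    """
    total_rows = 0
    for _ in range(iterations):
        time.sleep(interval)
        stats = sample_fn()
        timestamp = datetime.datetime.now()
        # One row per CPU, all sharing the same timestamp
        rows = [
            {"timestamp": timestamp, "cpu_id": cpu_id, "cpu_percent": cpu_percent}
            for cpu_id, cpu_percent in enumerate(stats)
        ]
        insert_fn(rows)
        total_rows += len(rows)
    return total_rows


# Stand-ins for psutil and the database: two fake CPUs, batches kept in a list
batches = []
written = collect(lambda: [12.5, 97.0], batches.append, iterations=3, interval=0)
print(written)
```

Passing the sampler and the sink as callables keeps the loop itself independent of both psutil and the database, which makes it straightforward to exercise with fake data.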

Querying the table is easy, using either the query builder or raw SQL:

# Calculate what percentage of the time CPU 1 was over 95% busy
queryset = CPUStats.objects_in(db)
total = queryset.filter(CPUStats.cpu_id == 1).count()
busy = queryset.filter(CPUStats.cpu_id == 1, CPUStats.cpu_percent > 95).count()
print('CPU 1 was busy {:.2f}% of the time'.format(busy * 100.0 / total))

# Calculate the average usage per CPU
for row in queryset.aggregate(CPUStats.cpu_id, average=F.avg(CPUStats.cpu_percent)):
    print('CPU {row.cpu_id}: {row.average:.2f}%'.format(row=row))
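
To make it clearer what the two queries above compute, here is a plain-Python sketch of the same arithmetic over a small in-memory sample. It does not use the library at all: the `samples` data and the helper names `busy_percentage` and `average_per_cpu` are made up for illustration. Unlike the snippet above, it returns `None` instead of dividing by zero when no rows exist for the requested CPU.

```python
from collections import defaultdict

# Hypothetical sample rows as (cpu_id, cpu_percent) pairs
samples = [(0, 10.0), (1, 96.0), (1, 50.0), (0, 30.0), (1, 99.0), (1, 20.0)]


def busy_percentage(rows, cpu_id, threshold=95.0):
    """Percentage of samples for `cpu_id` whose usage exceeds `threshold`."""
    usage = [pct for cid, pct in rows if cid == cpu_id]
    if not usage:
        return None  # no samples for this CPU, so the ratio is undefined
    busy = sum(1 for pct in usage if pct > threshold)
    return busy * 100.0 / len(usage)


def average_per_cpu(rows):
    """Mean usage per CPU, keyed by cpu_id."""
    grouped = defaultdict(list)
    for cid, pct in rows:
        grouped[cid].append(pct)
    return {cid: sum(values) / len(values) for cid, values in sorted(grouped.items())}


print('CPU 1 was busy {:.2f}% of the time'.format(busy_percentage(samples, 1)))
for cpu_id, average in average_per_cpu(samples).items():
    print('CPU {}: {:.2f}%'.format(cpu_id, average))
```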

This and other examples can be found in the examples folder.

To learn more, please visit the documentation.