jardin documentation¶
jardin (noun, french) – garden, yard, grove.
Jardin is a pandas.DataFrame
-based ORM for Python applications.
Getting started¶
Installation¶
$ pip install jardin
or
$ echo 'jardin' >> requirements.txt
$ pip install -r requirements.txt
Setup¶
In your working directory (the root of your app), create a file named jardin_conf.py
:
# jardin_conf.py
DATABASES = {
'my_master_database': 'https://username:password@master_database.url:port',
'my_replica_database': 'https://username:password@replica_database.url:port'
}
LOG_LEVEL = logging.DEBUG
WATERMARK = 'My Great App'
You can also place this file anywhere you want and point to it with the environment variable JARDIN_CONF
.
If you’d like to balance the load among a few databases - especially among replica databases - you may give multiple database URLs, separated by whitespace:
# jardin_conf.py
DATABASES = {
'my_replicas': 'https://user:pass@replica1.url:port https://user:pass@replica2.url:port'
}
# On first access, jardin randomly picks an URL from the list and maintains connection
# "stickiness" during the lifetime of the process. In a long-running process,
# application may ask jardin to switch to other connections on the list by
# calling 'jardin.reset_session()'.
Then, in your app, say you have a table called users
:
# app.py
import jardin
class User(jardin.Model):
db_names = {'master': 'my_master_database', 'replica': 'my_replica_database'}
In the console:
>>> from app import User
>>> users = User.last(4)
# /* My Great App */ SELECT * FROM users ORDER BY u.created_at DESC LIMIT 4;
>>> users
id name email ...
0 John john@beatl.es ...
1 Paul paul@beatl.es ...
2 George george@beatl.es ...
3 Ringo ringo@beatl.es ...
The resulting object is a pandas dataframe:
>>> import pandas
>>> isinstance(users, pandas.DataFrame)
True
>>> isinstance(users, jardin.Collection)
True
Querying¶
SELECT queries¶
Here is the basic syntax to select records from the database
>>> users = User.select(
select=['id', 'name'],
where={'email': 'paul@beatl.es'},
order='id ASC',
limit=1)
# SELECT u.id, u.name FROM users u WHERE u.email = 'paul@beatl.es' ORDER BY u.id ASC LIMIT 1; /* My Great App */
>>> users
id name
1 Paul
Arguments¶
See API reference.
where argument
Here are the different ways to feed a condition clause to a query.
where = "name = 'John'"
where = {'name': 'John'}
where = {'id': (0, 3)}
– selects whereid
is between 0 and 3where = {'id': [0, 1, 2]}
– selects whereid
is in the arraywhere = [{'id': (0, 10), 'instrument': 'drums'}, ["created_at > %(created_at)s", {'created_at': '1963-03-22'}]]
For other operators than =
, see Comparators.
inner_join, left_join arguments
The simplest way to join another table is as follows
>>> User.select(inner_join=["instruments i ON i.id = u.instrument_id"])
If you have configured your models associations, see Features, you can simply pass the class as argument:
>>> User.select(inner_join=[Instrument])
Individual record selection¶
You can also look-up a single record by id:
>>> User.find(1)
# SELECT * FROM users u WHERE u.id = 1 LIMIT 1;
User(id=1, name='Paul', email='paul@beatl.es', ...)
>>> User.find_by(values={'name': 'Paul'})
# SELECT * FROM users u WHERE u.name = 'Paul' LIMIT 1;
User(id=1, name='Paul', email='paul@beatl.es', ...)
Note that the returned object is a Record
object which allows you to access attributes in those way:
>>> user['name']
Paul
>>> user.name
Paul
INSERT queries¶
>>> user = User.insert(values={'name': 'Pete', 'email': 'pete@beatl.es'})
# INSERT INTO users (name, email) VALUES ('Pete', 'pete@beatl.es') RETURNING id;
# SELECT u.* FROM users WHERE u.id = 4;
>>> user
id name email
4 Pete pete@beatl.es
UPDATE queries¶
>>> users = User.update(values={'hair': 'long'}, where={'name': 'John'})
# UPDATE users u SET (u.hair) = ('long') WHERE u.name = 'John' RETURNING id;
# SELECT * FROM users u WHERE u.name = 'John';
DELETE queries¶
>>> User.delete(where={'id': 1})
# DELETE FROM users u WHERE u.id = 1;
Raw queries¶
>>> jardin.query(sql='SELECT * FROM users WHERE id IN %(ids)s;', params={'ids': [1, 2, 3]})
# SELECT * FROM users WHERE id IN (1, 2, 3);
Query from SQL file¶
>>> jardin.query(filename='path/to/file.sql', params={...})
The path is relative to the working directory (i.e. where your app was launched).
Comparators¶
The syntax where={'id': 123}
works well for =
conditions but breaks down for other operators. For that purpose, jardin offers comparators.
For example
>>> from jardin.comparators import *
>>> User.count(where={'created_at': gt(datetime.utcnow() - timedelta(day=1))})
# SELECT COUNT(*) FROM users WHERE created_at > '2018-04-29 12:00:00';
All comparators¶
Comparator | Operator | Example | Result |
---|---|---|---|
lt |
< | {'n': lt(3)} |
WHERE n < 3 |
leq |
<= | {'n': leq(3)} |
WHERE n <= 3 |
gt |
> | {'n': gt(3)} |
WHERE n > 3 |
geq |
<= | {'n': geq(3)} |
WHERE n >= 3 |
not_null |
{'n': not_null()} |
WHERE n IS NOT NULL |
|
not_in |
{'n': not_in([1, 2])} |
WHERE n IS NOT IN (1, 2) |
Features¶
..Associations ..————
..Belongs-to and has-many relationships can be declared as such:
..And then used as such:
..Or:
Query watermarking¶
By defining a watermark in your jardin_conf.py
file:
WATERMARK = 'MyGreatApp'
Queries will show up as such in your SQL logs:
/* MyGreatApp | path/to/file.py:function_name:line_number */ SELECT * FROM ....;
Scopes¶
Query scopes can be defined inside your model as such:
class User(jardin.Model):
scopes = {
'active': {'active': True},
'recent': ["last_sign_up_at > %(week_ago)s", {'week_ago': datetime.utcnow() - timedelta(weeks=1)}]
}
Then used as such:
User.select(scopes = ['active', 'recent'])
Which will issue this statement
SELECT * FROM users u WHERE u.active IS TRUE AND u.last_sign_up_at > ...;
Soft deletes¶
If you don’t want to actually remove rows from the database when deleting a record, you can activate soft-deletes:
class User(jardin.Model):
soft_delete = True
When the destroy
method is called on a model instance, the deleted_at
database field on the corresponding table will be set to the current UTC time.
Then, when calling select
, count
, delete
or update
, rows with a non-NULL deleted_at value will be ignored. This can be overridden by passing the skip_soft_delete=True
argument.
The find
method is not affected.
To force delete a single record, call destroy(force=True)
.
To customize the database column used to store the deletion timestamp, do:
class User(jardin.Model):
soft_delete = 'my_custom_db_column'
Multiple databases and master/replica split¶
Multiple databases can be declared in jardin_conf.py
:
DATABASES = {
'my_first_db': 'postgres://...',
'my_first_db_replica': 'postgres://...',
'my_second_db': 'postgres://...',
'my_second_db_replical': 'postgres://...'
}
And then in your model declarations:
class Db1Model(jardin.Model):
db_name = {'master': 'my_first_db', 'replica': 'my_first_db_replica'}
class Db2Model(jardin.Model):
db_name = {'master': 'my_second_db', 'replica': 'my_second_db_replica'}
class User(Db1Model): pass
class Project(Db2Model): pass
Replica lag measurement¶
You can measure the current replica lag in seconds using any class inheriting from jardin.Model
:
jardin.Model.replica_lag()
# 0.001
MyModel.replica_lag()
# 0.001
Connection drops recovery¶
The exceptions psycopg2.InterfaceError
and psycopg2.OperationalError
are rescued and a new connection is initiated. Three attempts with exponential decay are made before bubbling up the exception.
API reference¶
jardin.Model¶
-
class
jardin.
Model
(**kwargs)¶ Base class from which your models should inherit.
-
collection_class
¶ alias of
Collection
-
destroy
(force=False)¶ Deletes the record. If the model has
soft_delete
activated, the record will not actually be deleted.Parameters: force (boolean) – forces the record to be actually deleted if soft_delete
is activated.
-
classmethod
find
(id, **kwargs)¶ Finds a record by its id in the model’s table in the replica database. :returns: an instance of the model.
-
classmethod
find_by
(values={}, **kwargs)¶ Returns a single record matching the criteria in
values
found in the model’s table in the replica database.Parameters: values (dict) – Criteria to find the record. Returns: an instance of the model.
-
classmethod
insert
(**kwargs)¶ Performs an INSERT statement on the model’s table in the master database.
Parameters: values (dict) – A dictionary containing the values to be inserted. datetime
,dict
andbool
objects can be passed as is and will be correctly serialized by psycopg2.
-
classmethod
last
(limit=1, **kwargs)¶ Returns the last limit records inserted in the model’s table in the replica database. Rows are sorted by
created_at
.
-
classmethod
query
(sql=None, filename=None, **kwargs)¶ run raw sql from sql or file against.
Parameters: - sql (string) – Raw SQL query to pass directly to the connection.
- filename (string) – Path to a file containing a SQL query. The path should be relative to CWD.
- db (string) – optional Database name from your
jardin_conf.py
, overrides the default database set in the model declaration. - role (string) – optional One of
('master', 'replica')
to override the default.
Returns: jardin.Collection
collection, which is apandas.DataFrame
.
-
classmethod
replica_lag
(**kwargs)¶ Returns the current replication lag in seconds between the master and replica databases.
Returns: float
-
classmethod
table_schema
()¶ Returns the table schema.
Returns: dict
-
jardin.Collection¶
-
class
jardin.
Collection
(data=None, index=None, columns=None, dtype=None, copy=False)¶ Base class for collection of records. Inherits from pandas.DataFrame.
-
index_by
(field)¶ Returns a dict with a key for each value of field and the first record with that value as value. :param field: Name of the field to index by. :type field: string.
-
records
()¶ Returns an iterator to loop over the rows, each being an instance of the model’s record class, i.e. jardin_record by default.
-