tspurway/hustle
A column oriented, embarrassingly distributed relational event database.
{ "createdAt": "2014-02-19T02:13:45Z", "defaultBranch": "master", "description": "A column oriented, embarrassingly distributed relational event database.", "fullName": "tspurway/hustle", "homepage": "", "language": "Python", "name": "hustle", "pushedAt": "2018-04-14T02:03:05Z", "stargazersCount": 238, "topics": [], "updatedAt": "2025-08-26T02:15:19Z", "url": "https://github.com/tspurway/hustle"}

![Hustle](doc/_static/hustle.png)
A column oriented, embarrassingly distributed, relational event database.
Features
- column oriented - super fast queries
- events - write only semantics
- distributed insert - designed for petabyte scale distributed datasets with massive write loads
- compressed - bitmap indexes, lz4, and prefix trie compression
- relational - join gigantic data sets
- partitioned - smart shards
- embarrassingly distributed (based on Disco)
- embarrassingly fast (uses LMDB)
- NoSQL - Python DSL
- bulk append only semantics
- highly available, horizontally scalable
- REPL/CLI query interface
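
Two of the features above, column orientation and bitmap indexes, are easier to picture with a toy example. The following is a minimal sketch in plain Python, not Hustle's implementation: it stores each column as its own list, builds one bitmap (a Python integer) per distinct value, and answers a two-part filter with bitwise operations. The helper names `build_bitmap_index` and `rows_of` and the sample data are hypothetical.

```python
from collections import defaultdict


def build_bitmap_index(values):
    """Map each distinct column value to an int whose set bits mark matching rows."""
    index = defaultdict(int)
    for row, value in enumerate(values):
        index[value] |= 1 << row
    return dict(index)


def rows_of(bitmap):
    """Expand a bitmap back into a sorted list of row numbers."""
    return [row for row in range(bitmap.bit_length()) if (bitmap >> row) & 1]


# Two columns of a tiny, made-up event table, stored column by column.
ad_id = [30010, 30011, 30010, 30012, 30010]
date = ["2014-01-10", "2014-01-10", "2014-01-12", "2014-01-12", "2014-01-14"]

ad_index = build_bitmap_index(ad_id)
date_index = build_bitmap_index(date)

# Predicate 1: ad_id == 30010 is a single index lookup.
ad_match = ad_index.get(30010, 0)

# Predicate 2: date < '2014-01-13' ORs together the bitmaps of qualifying dates.
date_match = 0
for value, bitmap in date_index.items():
    if value < "2014-01-13":
        date_match |= bitmap

# Combining the two predicates is one bitwise AND.
matching_rows = rows_of(ad_match & date_match)
print(matching_rows)  # [0, 2]
```

Because each predicate reduces to OR and AND over integers, only the columns named in the filter are read, which is the general appeal of pairing columnar storage with bitmap indexes.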
Example Query
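
The query in this section filters two tables, joins them on `site_id`, aggregates, and sorts by date. As a rough mental model only, the sketch below runs the same kind of computation over in-memory lists in plain Python. It does not use Hustle: the sample rows, the `run_query` helper, and the assumption that the non-aggregated columns (`ad_id`, `date`) act as the grouping key are illustrative rather than taken from the project.

```python
from collections import defaultdict

# Made-up rows standing in for the two tables named in the query.
impressions = [
    {"ad_id": 30010, "date": "2014-01-10", "site_id": 1},
    {"ad_id": 30010, "date": "2014-01-12", "site_id": 2},
    {"ad_id": 30011, "date": "2014-01-10", "site_id": 1},  # filtered: other ad
    {"ad_id": 30010, "date": "2014-01-14", "site_id": 1},  # filtered: too late
]
pix = [
    {"site_id": 1, "date": "2014-01-09", "amount": 5},
    {"site_id": 1, "date": "2014-01-11", "amount": 7},
    {"site_id": 2, "date": "2014-01-12", "amount": 3},
    {"site_id": 2, "date": "2014-01-15", "amount": 100},  # filtered: too late
]


def run_query(impressions, pix, cutoff="2014-01-13", ad_id=30010):
    # where: one filter per table.
    imps = [r for r in impressions if r["date"] < cutoff and r["ad_id"] == ad_id]
    pixels = [r for r in pix if r["date"] < cutoff]

    # join: bucket the pix rows by site_id so each impression finds its matches.
    pix_by_site = defaultdict(list)
    for p in pixels:
        pix_by_site[p["site_id"]].append(p)

    # aggregate: sum of amount and row count per (ad_id, date) group (assumed grouping).
    totals = defaultdict(lambda: [0, 0])
    for imp in imps:
        for p in pix_by_site.get(imp["site_id"], []):
            key = (imp["ad_id"], imp["date"])
            totals[key][0] += p["amount"]
            totals[key][1] += 1

    # order_by: sort the result rows by date.
    rows = [(ad, d, s, c) for (ad, d), (s, c) in totals.items()]
    return sorted(rows, key=lambda row: row[1])


result = run_query(impressions, pix)
print(result)  # [(30010, '2014-01-10', 12, 2), (30010, '2014-01-12', 3, 1)]
```

In Hustle's DSL the computation is written as a single `select` call: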

    select(impressions.ad_id, impressions.date, h_sum(pix.amount), h_count(),
           where=((impressions.date < '2014-01-13') & (impressions.ad_id == 30010),
                  pix.date < '2014-01-13'),
           join=(impressions.site_id, pix.site_id),
           order_by=impressions.date)

Installation
After cloning this repo, here are some considerations:
- you will need Python 2.7 or higher - note that it probably won’t work on 2.6 (has to do with pickling lambdas…)
- you need to install Disco 0.5 and its dependencies - get that working first
- you need to install Hustle and its ‘deps’ thusly:

      cd hustle
      sudo ./bootstrap.sh

Please refer to the Installation Guide for more details.
Documentation
Credits
Special thanks to the following open-source projects: