data brewery

Bubbles Operations
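For Bubbles v0.1, June 2013

Each entry lists the operation, its arguments, a description, and the signatures (object representations) for which it is implemented.

Metadata operations

field_filter
    Arguments: obj, keep, drop, rename
    Description: Filters fields of an object. keep: keep only the listed fields; drop: keep all fields except those in the drop list; rename: new field names.
    Signatures: rows; sql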
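The keep, drop and rename behaviour described above can be sketched in plain Python. This is an illustration of the semantics only, not the Bubbles implementation: the function below works on a field-name list and an iterable of tuples rather than on a Bubbles object, and passing rename as a dictionary of old to new names is an assumption.

```python
def field_filter(fields, rows, keep=None, drop=None, rename=None):
    """Illustrative stand-in: return (new_fields, new_rows).

    Assumptions (not taken from the reference card):
    - `fields` is a list of field names, `rows` an iterable of tuples
    - `rename` is a dict mapping old field names to new ones
    """
    selected = list(fields)
    if keep:
        # keep only the listed fields, preserving the original order
        selected = [f for f in selected if f in keep]
    if drop:
        # keep everything except the fields in the drop list
        selected = [f for f in selected if f not in drop]

    indexes = [fields.index(f) for f in selected]
    rename = rename or {}
    new_fields = [rename.get(f, f) for f in selected]
    # rows stay lazy: nothing is read until the caller iterates
    new_rows = (tuple(row[i] for i in indexes) for row in rows)
    return new_fields, new_rows


fields = ["id", "name", "amount", "note"]
rows = [(1, "apple", 10, "x"), (2, "pear", 20, "y")]

new_fields, new_rows = field_filter(
    fields, rows, drop=["note"], rename={"amount": "total"}
)
result = list(new_rows)
```

Here `new_fields` no longer contains `note`, and `amount` appears under its new name `total`.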

Row operations

filter_by_value
    Arguments: obj, field, value
    Description: Get rows where field is equal to value.
    Signatures: rows; sql

filter_by_set
    Arguments: obj, field, set
    Description: Get rows where field is one of the values from the set.
    Signatures: rows

filter_by_range
    Arguments: obj, field, from, to
    Description: Get rows where field is within the given range.
    Signatures: (not yet)

filter_by_predicate
    Arguments: obj, fields, predicate
    Description: Get rows selected by the predicate. The predicate receives values for the given fields.
    Signatures: rows; records

distinct
    Arguments: obj[, key]
    Description: Distinct values for key fields.
    Signatures: rows; sql

first_unique
    Arguments: obj[, key][, discard]
    Description: Every first row with a distinct value for key fields.
    Signatures: rows

sample
    Arguments: obj, value[, mode]
    Description: Provide a sample of the object's rows based on mode. The mode might be: first, nth, random.
    Signatures: rows; sql

sort
    Arguments: obj, order
    Description: Returns an object with rows ordered based on order. Order is a list of tuples (field, order).
    Signatures: rows; sql

aggregate
    Arguments: obj, keys, measures, include_count
    Description: Aggregate measures by keys.
    Signatures: rows
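A few of the row operations above can be sketched over plain tuples to show what they compute. These helpers are hypothetical stand-ins, not the Bubbles functions: fields are addressed by tuple index, the order values "asc" and "desc" are assumed, and aggregation is assumed to mean summing a single measure.

```python
def filter_by_value(rows, index, value):
    """Rows whose field at `index` equals `value`."""
    return (row for row in rows if row[index] == value)


def first_unique(rows, key_index):
    """Every first row with a distinct value of the key field."""
    seen = set()
    for row in rows:
        key = row[key_index]
        if key not in seen:
            seen.add(key)
            yield row


def sort_rows(rows, order):
    """Sort by a list of (index, direction) tuples.

    Relies on Python's stable sort: the least significant key is
    applied first, the most significant key last.
    """
    result = list(rows)
    for index, direction in reversed(order):
        result.sort(key=lambda row: row[index],
                    reverse=(direction == "desc"))
    return result


def aggregate(rows, key_index, measure_index, include_count=True):
    """Sum one measure per key (summing is an assumption)."""
    totals = {}
    for row in rows:
        key = row[key_index]
        total, count = totals.get(key, (0, 0))
        totals[key] = (total + row[measure_index], count + 1)
    for key, (total, count) in totals.items():
        yield (key, total, count) if include_count else (key, total)


rows = [("a", 2013, 10), ("b", 2013, 5), ("a", 2012, 7), ("b", 2012, 1)]

only_a = list(filter_by_value(rows, 0, "a"))
firsts = list(first_unique(rows, 0))
ordered = sort_rows(rows, [(1, "asc"), (2, "desc")])
totals = list(aggregate(rows, 0, 2))
```

All helpers except `sort_rows` are generators, so they read rows lazily; sorting has to materialize the rows first.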

Field operations

text_substitute
    Arguments: obj, field, substitutions
    Description: Perform substitutions (pattern, value) on field.
    Signatures: rows

string_strip
    Arguments: obj[, fields[, chars]]
    Description: Strip whitespace (or chars) from fields or all string and text fields.
    Signatures: rows

append_constant_fields
    Arguments: obj, fields, values
    Description: Appends fields to the object with specified constant values.
    Signatures: rows; sql

dates_to_dimension
    Arguments: obj[, fields[, unknown_date]]
    Description: Changes specified fields (or all date fields) to a date dimension key in the form YYYYMMDD. The unknown_date value is used for empty date fields.
    Signatures: rows; sql
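The date key conversion and the substitution list can be illustrated on single values. This is a sketch under assumptions, not the Bubbles code: the helper names are hypothetical, patterns are treated as regular expressions, an empty date is represented by None, and the default of 0 for unknown_date is chosen only for the example.

```python
import re
from datetime import date


def date_to_key(value, unknown_date=0):
    """Convert a date to an integer key in the form YYYYMMDD.

    Empty dates (None here, by assumption) map to `unknown_date`.
    """
    if value is None:
        return unknown_date
    return value.year * 10000 + value.month * 100 + value.day


def text_substitute(text, substitutions):
    """Apply a list of (pattern, value) substitutions in order.

    Treating `pattern` as a regular expression is an assumption.
    """
    for pattern, replacement in substitutions:
        text = re.sub(pattern, replacement, text)
    return text


key = date_to_key(date(2013, 6, 1))
missing = date_to_key(None, unknown_date=99991231)
label = text_substitute("Bubbles-0.1", [(r"-", " v"), (r"\.", "_")])
stripped = "  June 2013 \n".strip()
```

Substitutions are applied in list order, so a later pattern sees the output of the earlier ones.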

Revision 1, June 2013, Bubbles 0.1 prototype

Compositions

append
    Arguments: objects[]
    Description: Append objects with the same fields.
    Signatures: rows; sql

join_details
    Arguments: master, detail, master_key, detail_key
    Description: Composes master and detail objects using left (inner) join by matching master_key field(s) with detail_key field(s).
    Signatures: rows,rows; sql,sql

added_keys
    Arguments: dimension, source, dimension_key, source_key
    Description: Get keys that were added to the source if compared with dimension. Comparison is done on specified keys.
    Signatures: sql,sql

added_rows
    Arguments: dimension, source, dimension_key, source_key
    Description: Get whole rows that were added to the source if compared with dimension. Comparison is done on specified keys.
    Signatures: sql,sql; sql,rows

changed_rows
    Arguments: dimension, source, dimension_key, source_key, fields, version_field
    Description: Get rows that were changed in the source (fields are compared for change). Row matching is done on specified keys.
    Signatures: sql,sql
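The three comparison operations above (added_keys, added_rows, changed_rows) can be sketched in memory to show how a source is compared with a dimension. The reference card lists them with sql signatures, so the version below, which uses lists of dictionaries, is only an illustration of the semantics; the version_field argument is left out because the card does not describe its behaviour.

```python
def added_keys(dimension, source, dimension_key, source_key):
    """Keys present in the source but missing from the dimension."""
    existing = {row[dimension_key] for row in dimension}
    return [row[source_key] for row in source
            if row[source_key] not in existing]


def added_rows(dimension, source, dimension_key, source_key):
    """Whole source rows whose key is missing from the dimension."""
    existing = {row[dimension_key] for row in dimension}
    return [row for row in source if row[source_key] not in existing]


def changed_rows(dimension, source, dimension_key, source_key, fields):
    """Source rows matched by key whose compared fields differ."""
    current = {row[dimension_key]: row for row in dimension}
    changed = []
    for row in source:
        old = current.get(row[source_key])
        if old is not None and any(old[f] != row[f] for f in fields):
            changed.append(row)
    return changed


dimension = [
    {"id": 1, "name": "apple", "price": 10},
    {"id": 2, "name": "pear", "price": 20},
]
source = [
    {"code": 1, "name": "apple", "price": 12},  # price changed
    {"code": 2, "name": "pear", "price": 20},   # unchanged
    {"code": 3, "name": "plum", "price": 5},    # new key
]

new_keys = added_keys(dimension, source, "id", "code")
new_rows = added_rows(dimension, source, "id", "code")
changed = changed_rows(dimension, source, "id", "code", ["name", "price"])
```

Rows with new keys are reported only by the added_* helpers; `changed_rows` considers only rows that already have a match in the dimension.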

Auditing
distinct_count
    Arguments: obj[, fields]
    Description: Count the number of rows for distinct values of fields (or all fields).
    Signatures: sql
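As a rough in-memory illustration of what distinct_count reports, the sketch below counts rows per distinct value combination with `collections.Counter`. The card lists only an sql signature for this operation, so this is not how Bubbles does it; the helper and its index-based field addressing are assumptions.

```python
from collections import Counter


def distinct_count(rows, indexes=None):
    """Count rows per distinct combination of the chosen fields.

    With `indexes=None`, all fields of the row are used.
    """
    if indexes is None:
        return Counter(tuple(row) for row in rows)
    return Counter(tuple(row[i] for i in indexes) for row in rows)


rows = [("a", 1), ("a", 2), ("b", 1), ("a", 1)]

by_first_field = distinct_count(rows, [0])
by_all_fields = distinct_count(rows)
```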

Assertions
assert_unique
    Arguments: obj[, key]
    Description: There should be no row (or key) duplicates in the object.
    Signatures: sql

Conversions
as_dict
    Arguments: obj, key, value
    Description: Converts the object to a Python dictionary.
    Signatures: rows

as_records
    Arguments: obj
    Description: Returns an object with a records representation.
    Signatures: rows; sql

fetch_all
    Arguments: obj
    Description: Fetches (consumes) all rows into a list and returns an object with a rows representation.
    Signatures: rows
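The conversions can be pictured with plain iterators. The helpers below are hypothetical and operate on a generator of tuples rather than on a Bubbles object; representing records as dictionaries keyed by field name is an assumption. The last lines show why fetching "consumes" the rows: an iterator can be read only once.

```python
def as_dict(rows, key_index, value_index):
    """Build a dictionary from one key field and one value field."""
    return {row[key_index]: row[value_index] for row in rows}


def as_records(fields, rows):
    """Yield each row as a dict keyed by field name (assumed form)."""
    return (dict(zip(fields, row)) for row in rows)


def fetch_all(rows):
    """Read every row from the iterator into a list."""
    return list(rows)


def row_source():
    """Hypothetical row iterator; it can be consumed only once."""
    yield (1, "apple")
    yield (2, "pear")


fields = ["id", "name"]

lookup = as_dict(row_source(), 0, 1)
records = list(as_records(fields, row_source()))

rows = row_source()
fetched = fetch_all(rows)   # all rows are now held in a list
leftover = list(rows)       # the iterator itself is exhausted
```

After `fetch_all`, the list can be iterated repeatedly, while the original iterator yields nothing more.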

Output
pretty_print
    Arguments: obj, target
    Description: Produces textual output to target (or stdout) formatted as a table.
    Signatures: rows

Notes
All objects with an sql representation currently also provide a rows representation. The statements are executed (not necessarily fetched) and the objects are handled as iterators, so all rows operations can be used on them.


Assertions raise ProbeAssertionError on failure. They can be used in pipelines to stop the process when a condition is not met. Most keys may be either a single field or a list of fields (composite keys).
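The assertion behaviour and composite keys can be sketched as follows. The exception class here is a local stand-in that only borrows the name ProbeAssertionError, and the `assert_unique` helper is a hypothetical in-memory version working on tuples and field indexes; neither is the Bubbles implementation.

```python
class ProbeAssertionError(Exception):
    """Local stand-in for the error named in the notes above."""


def assert_unique(rows, key_indexes=None):
    """Raise if two rows share the same key.

    `key_indexes` may hold one index or several (composite key);
    with None the whole row is used as the key.
    """
    seen = set()
    for row in rows:
        if key_indexes is None:
            key = tuple(row)
        else:
            key = tuple(row[i] for i in key_indexes)
        if key in seen:
            raise ProbeAssertionError("duplicate key: %r" % (key,))
        seen.add(key)


rows = [("a", 2012, 7), ("a", 2013, 10), ("b", 2013, 5)]

# The composite key (name, year) is unique, so this passes silently.
assert_unique(rows, [0, 1])

# The single-field key (name) is not unique; a surrounding process
# could catch the error and stop at this point.
try:
    assert_unique(rows, [0])
    stopped = False
except ProbeAssertionError as error:
    stopped = True
    message = str(error)
```

The same data passes or fails depending on the key, which is why the choice between a single field and a composite key matters.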
