Ingredient: Using anonymizers

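This example builds a shelf in which the state dimension defines an anonymizer, a function that replaces each value with an obscured one (here, the reversed string).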

[5]:
from examples_base import *

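shelf = Shelf({
    # The anonymizer reverses the string value, e.g. 'Vermont' -> 'tnomreV'.
    'state': Dimension(Census.state, anonymizer=lambda v: v[::-1]),
    # Average age, weighted by population.
    'age': WtdAvgMetric(Census.age, Census.pop2000),
    'gender': Dimension(Census.gender),
    # The formatter rounds to the nearest million and reports the value in millions.
    'population': Metric(func.sum(Census.pop2000), formatters=[
        lambda value: int(round(value, -6) / 1000000)
    ])
})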

recipe = Recipe(shelf=shelf, session=oven.Session(), extension_classes=[Anonymize])\
    .dimensions('state').metrics('population')

# Look at the output.
print(recipe.to_sql())
recipe.dataset.df
SELECT census.gender AS gender,
       sum(census.pop2000) AS population_raw
FROM census
GROUP BY census.gender
[5]:
  gender  population_raw gender_id  population
0      F       143534804         F         144
1      M       137392517         M         137

Formatters are Python functions that run after the row data is retrieved from the database. The SQL query returns only the unformatted value, named ingredient_raw (population_raw in the output above). The ingredient value (population) is then added to each row by calling the formatter on the raw value, so the original value remains available as ingredient_raw.
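The post-query step described above can be sketched in plain Python, without the database or the recipe library. This is only an illustration: `apply_formatters` is a hypothetical helper rather than part of the library, and the sketch assumes an anonymizer is applied to a retrieved value in the same way a formatter is.

```python
# Plain-Python sketch of formatting after the query; not the library's implementation.

def format_population(value):
    """Same logic as the shelf's formatter: nearest million, reported in millions."""
    return int(round(value, -6) / 1000000)


def anonymize_state(value):
    """Same logic as the shelf's anonymizer: reverse the string."""
    return value[::-1]


def apply_formatters(raw_value, formatters):
    """Hypothetical helper: pass the raw value through each formatter in order."""
    value = raw_value
    for formatter in formatters:
        value = formatter(value)
    return value


# Raw rows, copied from the query output shown above.
rows = [
    {"gender": "F", "population_raw": 143534804},
    {"gender": "M", "population_raw": 137392517},
]

# Add the formatted value next to the raw one, leaving the raw value untouched.
for row in rows:
    row["population"] = apply_formatters(row["population_raw"], [format_population])

for row in rows:
    print(row["gender"], row["population_raw"], row["population"])

# Assumption: an anonymizer transforms the retrieved value like a formatter does.
print(apply_formatters("Vermont", [anonymize_state]))
```

Because the formatted value is stored under a separate key, the raw value stays available for further calculation, which mirrors the population_raw and population columns in the output.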