How to Structure Recipe BOMs in PostgreSQL
Multi-unit restaurant operators and culinary managers require deterministic cost mapping to survive margin compression and procurement volatility. The foundation of any food cost analytics pipeline is a rigorously normalized Bill of Materials (BOM) schema. When architecting this layer in PostgreSQL, the primary engineering challenge is balancing relational integrity with the recursive, nested nature of culinary prep. A production-ready BOM must support arbitrary sub-recipe depth, dynamic yield adjustments, location-specific procurement costs, and strict unit normalization without introducing calculation drift. This guide details the exact schema topology, recursive query patterns, and Python automation hooks required to operationalize recipe costing at scale.
Schema Topology & Constraint Enforcement
Start with a strict separation of master culinary data and transactional cost data. The recipes table holds immutable prep definitions, while recipe_bom_lines maps parent-child relationships. As covered in Designing Recipe BOM Databases, ingredient master records must remain decoupled from vendor pricing tables, so the same BOM can resolve against multiple procurement contracts or regional cost centers. Each BOM line must store parent_recipe_id, child_id, child_type, raw_quantity, raw_uom, and yield_factor. Because child_id points at either ingredients or recipes depending on child_type, it cannot carry a single foreign key; that referential check has to be enforced by a trigger or in the application layer. Crucially, never store computed costs directly in the BOM. Compute them on demand or via materialized views to maintain immutable audit trails.
-- gen_random_uuid() is built in from PostgreSQL 13; earlier versions need the pgcrypto extension
CREATE TABLE recipes (
    recipe_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    recipe_name VARCHAR(150) NOT NULL,
    portion_size DECIMAL(10,3) NOT NULL,
    portion_uom VARCHAR(10) NOT NULL,
    is_active BOOLEAN DEFAULT TRUE,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE ingredients (
    ingredient_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    sku VARCHAR(50) UNIQUE,
    description VARCHAR(200),
    base_uom VARCHAR(10) NOT NULL,
    density_g_per_ml DECIMAL(6,3) -- Required for volumetric-to-weight conversions
);

CREATE TABLE recipe_bom_lines (
    bom_line_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    parent_recipe_id UUID NOT NULL REFERENCES recipes(recipe_id) ON DELETE CASCADE,
    child_id UUID NOT NULL, -- ingredient_id or recipe_id, depending on child_type
    -- NOT NULL matters here: a NULL child_type would otherwise pass the CHECK
    child_type VARCHAR(10) NOT NULL CHECK (child_type IN ('INGREDIENT', 'SUBRECIPE')),
    raw_quantity DECIMAL(12,5) NOT NULL,
    raw_uom VARCHAR(10) NOT NULL,
    yield_factor DECIMAL(5,4) DEFAULT 1.0000,
    CONSTRAINT valid_yield CHECK (yield_factor > 0.001 AND yield_factor <= 1.0000),
    UNIQUE (parent_recipe_id, child_id, child_type)
);
The valid_yield constraint enforces culinary reality: yield must be stored as a decimal fraction greater than 0 and no larger than 1 (e.g., 0.85 for 85% usable product), never as a percentage. Without this guardrail, a zero yield raises division-by-zero errors in SQL, or silently produces infinite values in pandas, during cost roll-ups. All financial calculations should use PostgreSQL’s NUMERIC type (DECIMAL is an alias for it) or Python’s decimal module to prevent IEEE-754 floating-point drift, which compounds across multi-level BOMs.
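The floating-point drift mentioned above is easy to reproduce. The following is a minimal sketch, using a made-up line cost of 0.1 currency units accumulated over ten BOM lines; the variable names are illustrative only.

```python
from decimal import Decimal

# Hypothetical line cost of 0.1 currency units, accumulated over ten BOM lines.
float_total = 0.0
decimal_total = Decimal("0")
for _ in range(10):
    float_total += 0.1
    decimal_total += Decimal("0.1")

print(float_total == 1.0)               # False: binary floats cannot represent 0.1 exactly
print(decimal_total == Decimal("1.0"))  # True: Decimal arithmetic is exact here
```

The error in a single addition is tiny, but a roll-up over thousands of lines and several nesting levels repeats it many times, which is why exact decimal types are the safer default for cost data.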
Recursive Resolution & Nested Sub-Recipes
Culinary prep is inherently hierarchical. Rows in recipe_bom_lines with child_type = 'SUBRECIPE' link recipes into what should be a directed acyclic graph (DAG), and that graph must be traversed to resolve true ingredient-level costs. PostgreSQL recursive CTEs provide deterministic traversal without application-layer recursion overhead.
WITH RECURSIVE bom_tree AS (
    -- Anchor: Start with the direct lines of the target recipe
    SELECT
        parent_recipe_id,
        child_id,
        child_type,
        -- Cast to plain NUMERIC so the anchor and recursive terms share one column type
        raw_quantity::NUMERIC AS raw_quantity,
        raw_uom,
        yield_factor,
        1 AS depth,
        -- Seed the path with the root recipe so a cycle back to it is also caught
        ARRAY[parent_recipe_id::text, child_id::text] AS path
    FROM recipe_bom_lines
    WHERE parent_recipe_id = 'TARGET_RECIPE_UUID'::UUID -- placeholder: substitute a real UUID

    UNION ALL

    -- Recursive step: Expand each sub-recipe row into its own BOM lines
    SELECT
        b.parent_recipe_id,
        b.child_id,
        b.child_type,
        b.raw_quantity * bt.yield_factor, -- Apply the parent line's yield at each level
        b.raw_uom,
        b.yield_factor,
        bt.depth + 1,
        bt.path || b.child_id::text
    FROM recipe_bom_lines b
    JOIN bom_tree bt ON b.parent_recipe_id = bt.child_id
    WHERE bt.child_type = 'SUBRECIPE' -- Only sub-recipe rows have children to expand
      AND b.child_id::text <> ALL(bt.path) -- Prevent infinite loops
)
SELECT
    child_id AS ingredient_id,
    SUM(raw_quantity) AS total_raw_qty,
    raw_uom,
    MAX(depth) AS max_nesting_level
FROM bom_tree
WHERE child_type = 'INGREDIENT'
GROUP BY child_id, raw_uom;
This pattern expands each sub-recipe once per path through the tree, applying the parent line's yield factor as each level is expanded. The path array acts as a cycle-detection mechanism: a child that already appears in its own ancestry is skipped, so the query terminates even when culinary teams accidentally create circular references.
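The same traversal and cycle guard can be sketched in memory, which is handy for unit-testing BOM data before it reaches the database. This is an illustrative sketch only: the recipe names, the tuple layout, and the flatten_bom helper are hypothetical, and quantities are summed without scaling because the scaling rule depends on how sub-recipe batch sizes are defined.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical BOM lines: (parent, child, child_type, quantity)
BOM_LINES = [
    ("burger", "patty", "SUBRECIPE", Decimal("1")),
    ("burger", "bun", "INGREDIENT", Decimal("1")),
    ("patty", "beef", "INGREDIENT", Decimal("150")),
    ("patty", "burger", "SUBRECIPE", Decimal("1")),  # accidental circular reference
]

def flatten_bom(root, lines):
    """Return total quantity per leaf ingredient, skipping cyclic references."""
    children = defaultdict(list)
    for parent, child, child_type, qty in lines:
        children[parent].append((child, child_type, qty))

    totals = defaultdict(Decimal)  # Decimal() is zero

    def walk(recipe, path):
        for child, child_type, qty in children[recipe]:
            if child_type == "INGREDIENT":
                totals[child] += qty
            elif child not in path:  # same role as the path array in the CTE
                walk(child, path | {child})

    walk(root, frozenset({root}))
    return dict(totals)

print(flatten_bom("burger", BOM_LINES))
```

The frozenset passed down each branch plays the role of the path column: it records the ancestry of the current node, so the back-reference from patty to burger is ignored instead of recursing forever.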
Deterministic Unit Normalization & Yield Application
Unit conversion is the primary source of cost calculation drift in multi-location operations. Volumetric measures (cup, tbsp, fl_oz) must be normalized to a single base unit (typically grams or kilograms) before cost multiplication. The density_g_per_ml column in the ingredients table enables deterministic conversion:
normalized_weight_g = raw_quantity * ml_per_uom * density_g_per_ml
Here ml_per_uom is the number of millilitres in one unit of raw_uom (for example, 236.588 for a US cup).
Yield factors must be applied to the input quantity, not the output. If a recipe calls for 1000g of raw potatoes with a 0.82 yield, the system must track 1000g as the procurement cost driver, while the edible portion becomes 820g. This distinction is critical for accurate theoretical vs. actual variance reporting. All conversion matrices should be versioned and stored in a lookup table to support regional procurement standards. For precise financial arithmetic, Python’s decimal module or PostgreSQL NUMERIC operations must be used exclusively.
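The conversion and the potato example can be worked through in a few lines. This is a sketch under stated assumptions: the ML_PER_UOM and GRAMS_PER_UOM lookups and the to_grams helper are hypothetical stand-ins for the versioned conversion table described above, and the volumetric factors are US customary values.

```python
from decimal import Decimal

# Assumed lookup tables for this sketch (US customary volumes).
ML_PER_UOM = {"ml": Decimal("1"), "fl_oz": Decimal("29.5735"), "cup": Decimal("236.588")}
GRAMS_PER_UOM = {"g": Decimal("1"), "kg": Decimal("1000")}

def to_grams(quantity, uom, density_g_per_ml=None):
    """Normalize a quantity to grams; volumetric units require a density."""
    if uom in GRAMS_PER_UOM:
        return quantity * GRAMS_PER_UOM[uom]
    if uom in ML_PER_UOM:
        if density_g_per_ml is None:
            raise ValueError(f"density required to convert {uom!r} to grams")
        return quantity * ML_PER_UOM[uom] * density_g_per_ml
    raise ValueError(f"unknown unit of measure: {uom!r}")

# Potato example: 1 kg raw input with a 0.82 yield.
raw_g = to_grams(Decimal("1"), "kg")      # procurement cost driver
edible_g = raw_g * Decimal("0.82")        # usable output after trim loss
print(raw_g, edible_g)                    # 1000 820.00

# Volumetric example: 2 cups of an oil with an assumed density of 0.92 g/ml.
print(to_grams(Decimal("2"), "cup", Decimal("0.92")))
```

Raising on an unknown unit or a missing density, rather than returning the quantity unchanged, keeps unconverted values from flowing silently into a cost roll-up.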
Python/Pandas Automation Hooks for Cost Roll-Up
Application-layer automation should focus on fetching normalized BOMs, applying location-specific pricing, and generating audit-ready cost sheets. The following pattern uses pandas for tabular processing and SQLAlchemy (with the psycopg2 driver) for database access.
import pandas as pd
from decimal import Decimal, ROUND_HALF_UP
from sqlalchemy import create_engine

# 1. Fetch the recipe's direct ingredient lines and pricing via a parameterized query
#    (assumes a pricing table with ingredient_id, location_id, cost_per_unit, and uom)
DB_URI = "postgresql+psycopg2://user:pass@host/dbname"
engine = create_engine(DB_URI)
query = """
    SELECT
        r.recipe_id, r.recipe_name,
        bl.child_id, bl.child_type, bl.raw_quantity, bl.raw_uom, bl.yield_factor,
        i.base_uom, i.density_g_per_ml,
        p.cost_per_unit, p.uom AS pricing_uom
    FROM recipe_bom_lines bl
    JOIN recipes r ON bl.parent_recipe_id = r.recipe_id
    LEFT JOIN ingredients i ON bl.child_id = i.ingredient_id
    LEFT JOIN pricing p ON bl.child_id = p.ingredient_id AND p.location_id = %s
    WHERE r.recipe_id = %s AND bl.child_type = 'INGREDIENT';
"""
# Placeholder parameter values; coerce_float=False keeps NUMERIC columns as Decimal
df = pd.read_sql(query, engine, params=('LOC_01', 'TARGET_RECIPE_UUID'), coerce_float=False)

# 2. Deterministic unit normalization & yield application (Decimal throughout)
ML_PER_UOM = {'ml': Decimal('1'), 'fl_oz': Decimal('29.5735'), 'cup': Decimal('236.588')}

def normalize_to_grams(row):
    qty = Decimal(str(row['raw_quantity']))
    if row['raw_uom'] == 'g':
        return qty
    if row['raw_uom'] in ('kg', 'kilogram'):
        return qty * 1000
    if row['raw_uom'] in ML_PER_UOM and pd.notna(row['density_g_per_ml']):
        # Simplified volumetric conversion for demonstration
        return qty * ML_PER_UOM[row['raw_uom']] * Decimal(str(row['density_g_per_ml']))
    # Fail loudly instead of passing an unconverted quantity into the roll-up
    raise ValueError(f"Cannot normalize {row['raw_uom']!r} for child {row['child_id']}")

df['normalized_weight_g'] = df.apply(normalize_to_grams, axis=1)
# raw_quantity is the as-purchased input, so it drives cost; yield gives the edible output
df['procurement_weight_g'] = df['normalized_weight_g']
df['edible_weight_g'] = df['normalized_weight_g'] * df['yield_factor']

# 3. Row-wise cost calculation using Decimal (assumes cost_per_unit is priced per kilogram)
if df['cost_per_unit'].isna().any():
    raise ValueError("Missing price for one or more ingredients at this location")
df['line_cost'] = df.apply(
    lambda x: x['procurement_weight_g'] * Decimal(str(x['cost_per_unit'])) / Decimal('1000'),
    axis=1
)

# 4. Aggregate & export
recipe_cost = sum(df['line_cost'], Decimal('0')).quantize(Decimal('0.0001'), rounding=ROUND_HALF_UP)
print(f"Total Recipe Cost: ${recipe_cost}")
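This pipeline isolates transformation logic from storage, so cost roll-ups remain reproducible across environments. Because each line is costed independently, the same pattern extends to multi-recipe costing by fetching lines for many recipes in one query and grouping by recipe_id, with work growing linearly in the number of BOM lines. Note that df.apply with axis=1 runs row by row rather than as a true vectorized operation; that is the trade-off for keeping Decimal precision.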
Operational Reliability & Audit Integrity
Production food cost systems must prioritize idempotency and traceability. Never mutate historical BOM lines. Instead, implement soft-deletion (is_active = FALSE) and versioned recipe snapshots. When procurement contracts update, create a new pricing record with an effective date range rather than overwriting existing rates. This preserves historical cost accuracy for menu engineering analysis.
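The effective-date approach to pricing can be sketched as a simple lookup. The PriceRecord shape, the sample prices, and the price_on helper below are hypothetical; the sketch assumes half-open date ranges, where effective_to is the first day the rate no longer applies and None means the rate is still in force.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Optional

@dataclass(frozen=True)
class PriceRecord:
    ingredient_id: str
    cost_per_kg: Decimal
    effective_from: date
    effective_to: Optional[date]  # None means the rate is still in force

# Hypothetical history: the old rate is closed out, never overwritten.
PRICES = [
    PriceRecord("beef", Decimal("9.50"), date(2024, 1, 1), date(2024, 7, 1)),
    PriceRecord("beef", Decimal("10.25"), date(2024, 7, 1), None),
]

def price_on(records, ingredient_id, as_of):
    """Return the rate in force on as_of, or None if no record covers that date."""
    for rec in records:
        if (rec.ingredient_id == ingredient_id
                and rec.effective_from <= as_of
                and (rec.effective_to is None or as_of < rec.effective_to)):
            return rec.cost_per_kg
    return None

print(price_on(PRICES, "beef", date(2024, 3, 15)))  # 9.50
print(price_on(PRICES, "beef", date(2024, 9, 1)))   # 10.25
```

Because old records are kept, a menu engineering report for March can be re-run months later and still resolve to the rate that applied at the time.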
For read-heavy dashboards, materialize the recursive cost roll-up into a PostgreSQL MATERIALIZED VIEW refreshed via scheduled jobs or pg_cron. This eliminates repeated CTE evaluation during peak reporting windows. As documented in broader Core Architecture & Cost Mapping Systems frameworks, separating transactional writes from analytical reads prevents lock contention and ensures deterministic query performance.
Finally, implement strict input validation at the application layer. Reject BOM submissions where yield_factor <= 0 or yield_factor > 1, where raw_quantity < 0, or where a volumetric unit has no density mapping for its ingredient. Automated validation prevents silent data corruption that only surfaces during month-end reconciliation.
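A validation hook along these lines might look like the sketch below. The validate_bom_line function and the VOLUMETRIC_UOMS set are hypothetical names, and the upper bound on yield_factor mirrors the valid_yield CHECK constraint in the schema rather than a rule stated separately.

```python
from decimal import Decimal

# Assumed set of units that need a density to convert to weight.
VOLUMETRIC_UOMS = {"ml", "fl_oz", "cup", "tbsp"}

def validate_bom_line(raw_quantity, raw_uom, yield_factor, density_g_per_ml=None):
    """Return a list of validation errors; an empty list means the line is acceptable."""
    errors = []
    if raw_quantity < 0:
        errors.append("raw_quantity must not be negative")
    if yield_factor <= 0 or yield_factor > 1:
        errors.append("yield_factor must be greater than 0 and at most 1")
    if raw_uom in VOLUMETRIC_UOMS and density_g_per_ml is None:
        errors.append(f"volumetric unit {raw_uom!r} requires a density mapping")
    return errors

# A valid weight-based line, then a line that breaks all three rules.
print(validate_bom_line(Decimal("1000"), "g", Decimal("0.82")))
print(validate_bom_line(Decimal("-1"), "cup", Decimal("0")))
```

Returning every error at once, instead of raising on the first, lets a recipe editor show the culinary team all problems with a submission in a single pass.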