About
Interpreting Data in cosmos_sync
Delivery sales orders
We are a foodservices distributor with both delivery operations and retail stores. Delivery orders are stored in sales_orders (with sales_order_lines). These orders are valid if (and only if) confirmed_code = "Y". For delivery order originating from the main warehouse, the delivery_route is prefixed with A, e.g. A-PT and A-PN, with the exception of K1 and K2 (K stands for Kaohsiung), which are also delivery routes originating from the main warehouse. The delivery routes map to areas as follows:
| Route | Area |
|---|---|
| A-PT | Pingtung City |
| A-PH | Hengchun |
| A-PN | Pingtung North |
| A-PS | Chaozhou |
| A-PK | Donggang |
| K1 | Kaohsiung-1 |
| K2 | Kaohsiung-2 |
| PT | Pingtung City (Store originated) |
| PH | Hengchun (Store originated) |
| PS | Chaozhou (Store originated) |
| PK | Donggang (Store originated) |
Note that sales returns are in the sales_returns table. There, we also need to filter for confirmed_code = "Y". These need to be netted out when calculating net sales revenue from deliveries.
Prior to utilizing sales_orders and sales_returns, we need to remove any “deleted” records from the deletion_log table.
Important: The deletion_log table uses internal ERP table names, not our view names:
| Our View | ERP Table Name |
|---|---|
| sales_order_lines | COPTH |
| sales_return_lines | COPTI |
Note: The record_pk_* columns may contain trailing whitespace and need to be trimmed.
Recommended approach (Python-based filtering for better performance):
# Load deleted records once at start of analysis
deleted_order_lines = con.sql("""
SELECT
TRIM(record_pk_001) as order_type,
TRIM(record_pk_002) as order_number,
TRIM(record_pk_003) as line_number
FROM cosmos_sync.deletion_log
WHERE table_name = 'COPTH'
""").to_polars()
deleted_return_lines = con.sql("""
SELECT
TRIM(record_pk_001) as return_type,
TRIM(record_pk_002) as return_number,
TRIM(record_pk_003) as line_number
FROM cosmos_sync.deletion_log
WHERE table_name = 'COPTI'
""").to_polars()
# Create sets for efficient lookup
deleted_order_keys = set(
zip(deleted_order_lines["order_type"],
deleted_order_lines["order_number"],
deleted_order_lines["line_number"])
)
# Filter after loading data
orders_df = orders_df.filter(
~pl.struct(["order_type", "order_number", "line_number"]).map_elements(
lambda x: (x["order_type"], x["order_number"], x["line_number"]) in deleted_order_keys,
return_dtype=pl.Boolean
)
)Why Python-based filtering: SQL-based NOT EXISTS with TRIM() causes query timeouts due to inability to use indexes. Loading the ~1,700 deleted order records into Python sets provides fast O(1) lookups
POS retail sales
Retail sales data are in retail_sales and retail_sale_lines. These are data from POS terminals at the stores. The store code to store mapping is below.
| Store code | Store |
|---|---|
| 00 | Main Warehouse |
| 01 | Pingtung City |
| 02 | Chaozhou |
| 03 | Hengchung |
| 04 | Donggang |
For POS sales data, returns simply show up as negative amounts.
Analysis Conventions
When we look at business results, we group store-originated deliveries with the stores. For example, sales orders with route PT would be combined with retail sales with store 01 when looking at how the Pingtung City store is doing. This is done since store leaders at each store at responsible for delivery performance originating from their stores.
Route Changes
December 2024: All PH (Hengchun store-originated) deliveries became the responsibility of the main warehouse. Therefore, PH route data is only comparable within 2025 YTD and should not be used for YoY comparisons.
2024-2025 Route Reorganization (A-PK / K2): Approximately 98 customers were reassigned from route A-PK to route K2. For YoY analysis, A-PK and K2 should be analyzed together to avoid misleading conclusions about individual route performance. Looking at A-PK alone shows a -12.5% decline, but when combined with K2, the decline is -4.3% with customer count actually increasing.
Customer Status Fields
closure_date: If this field is populated in the customers table, it means the customer closed their shop on or after that date (we record the date we were informed).
- Use for churn analysis: Helps distinguish between customers who went out of business vs. those who potentially switched to a competitor.
- Format: YYYYMMDD text string
- If NULL: Customer has not formally closed; if they stopped ordering, they may have switched suppliers.
Visualization Standards
- No dual-axis charts: Always use separate charts for metrics with different scales. Dual-axis charts are misleading and difficult to interpret.
- Avoid fixed scale domains: Let the data determine the scale unless there’s a specific reason to fix it (e.g., percentage charts at 0-100%).
Margin calculation
Historical procurement costs can be found in procurement_costs. Additional product information can be found in product_information.
Important: Category 1 (生產製造) products are manufactured internally by Oceanforest (subsidiary). An accounting change on Jan 1, 2025 shifted transfer pricing, so Category 1 margins are NOT comparable YoY. Exclude Category 1 when doing YoY margin trend analysis.
Watch Out: Common Analysis Pitfalls
1. Scope Mismatch (Geographic/Route)
Issue: Calculating metrics for wrong geographic scope (e.g., company-wide vs specific route). - Always verify delivery_route filter matches intended analysis scope - Example: Pingtung City = ["A-PT", "PT"] only, not all routes
2. B2B Penetration Rate Expectations
Issue: Assuming B2C-like penetration rates (50-80%) for B2B foodservice. - Reality: B2B max penetration is typically 10-30% - If penetration >40%: Warning - recheck calculation - If penetration >50%: Almost certainly wrong
3. Time-Matched Costs for Margin Calculations
Issue: Using product_information.latest_cost (snapshot) instead of monthly costs. - Correct: Join with procurement_costs matching order month to inventory_month - Why: Some products have >10% cost variation within a year - Example: Corn (25888) varied from $16.06 (Jan) to $14.05 (Nov) - 12.5% change
4. Excluded Items in Margin Analysis
Issue: Including service fees and samples in margin calculations. - Exclude: Service items (69620 開店設計費, 69617 維修費), placeholder “*” - Exclude: Samples (product codes starting with “S”) - Exclude: Items with null/zero cost (would show 100% margin)
5. Margin Aggregation Method
Issue: Using simple average of margin percentages across products. - Wrong: margin_pct.mean() - gives equal weight to low-revenue items - Correct: Weighted margin = SUM(revenue - cost) / SUM(revenue) * 100
6. Sales Returns Netting
Issue: Forgetting to net out sales returns from sales_returns table. - Delivery revenue should subtract confirmed returns (confirmed_code = 'Y') - POS retail returns show as negative amounts (automatic)
7. Category 1 YoY Comparison
Issue: Comparing Category 1 (生產製造) margins year-over-year. - Jan 1, 2025 accounting change in transfer pricing - Always exclude Category 1 from YoY margin trend analysis
Product Categories
Product Category 1 (商品分類一/product_category_1)
| Code | Name | Notes |
|---|---|---|
| 1 | 生產製造 | INTERNAL - Oceanforest subsidiary |
| 2 | 成品 | Finished goods |
| 3 | 生產分裝 | Production packaging |
| 4 | 冷藏類 | Refrigerated |
| 5 | 冷凍類 | Frozen |
| 6 | 器具類 | Equipment |
| 7 | 免洗餐具 | Disposable tableware |
| 8 | 酒.咖啡豆.書 | Alcohol, coffee beans, books |
| 9 | 調味.添加物 | Seasonings, additives |
| 10 | 其他 | Other |
Product Category 2 (大分類/product_category_2)
| Code | Name | Code | Name |
|---|---|---|---|
| 1 | 調味雞肉 | 18 | 茶品 |
| 2 | 調味豬肉 | 19 | 糖品 |
| 3 | 薯製品 | 20 | 罐頭 |
| 4 | 炸物 | 21 | 麵粉/澱粉 |
| 5 | 餅皮 | 22 | 抹醬/果醬 |
| 6 | 加工食品 | 23 | 濃縮果汁 |
| 7 | 麵包烘焙 | 24 | 麵食 |
| 8 | 中式點心 | 25 | 肉鬆/海苔 |
| 9 | 冷凍麵食 | 26 | 調味料理粉 |
| 10 | 即食料理 | 27 | 調味調理品 |
| 11 | 冷凍蔬菜 | 28 | 配料 |
| 12 | 乳製品-冷藏 | 29 | 香辛料 |
| 13 | 調味調理-冷藏 | 30 | 油品 |
| 14 | 其他食品-冷藏 | 31 | 料理包 |
| 15 | 調製飲品 | 32 | 免洗用品 |
| 16 | 乳製品 | 33 | 生活器具 |
| 17 | 純咖啡粉/咖啡豆 | 34 | 添加物 |
Product Category 3 (中分類/product_category_3)
Detailed subcategory codes (150+) available in the database. Common examples:
- 101-103: 雞肉製品 (Chicken products)
- 201-202: 豬肉製品 (Pork products)
- 301-304: 薯製品 (Potato products)
- 401-410: 炸物 (Fried foods)
- 501-504: 餅皮 (Pastry shells)
- 601-606: 加工食品 (Processed foods)
- 701-707: 麵包烘焙 (Bakery)
- 801-806: 中式點心 (Chinese dim sum)