Empowering true personalization through data simplification and standardization.
I believe it’s important to look at more than whether a customer has bought an item in the past. The individual items matter, but so does the order as a whole: the full combination of products the customer bought together.
In trying to standardize millions of orders, I ran into a challenge. It’s easy to see which items a customer purchases most often. But what about offering a one-button reorder of their most frequent complete order?
I solved this by generating a globally unique identifier (guid) from the items in the order. Using this guid, I could then aggregate by customer_id and by the exact set of ordered items, even across different stores or restaurants.
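The per-customer aggregation described above could be expressed as a terms aggregation on the guid, filtered to one customer. The sketch below only builds the request body. It assumes `customer_id` and `guid` are mapped as keyword fields (with default dynamic mapping you would likely point at the `.keyword` subfields instead), and the function name, aggregation name, and `top_n` parameter are illustrative rather than taken from my production code.

```python
import json


def guid_counts_query(customer_id, customer_field="customer_id",
                      guid_field="guid", top_n=5):
    """Build a search body that counts one customer's orders per item-set guid.

    Field names are assumptions: both are expected to be keyword fields.
    """
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "query": {"term": {customer_field: customer_id}},
        "aggs": {
            "orders_by_guid": {
                # one bucket per distinct item combination, most frequent first
                "terms": {"field": guid_field, "size": top_n}
            }
        },
    }


body = guid_counts_query("CUST98765")
print(json.dumps(body, indent=2))
```

The resulting dictionary can be sent as the body of a `_search` request against the order index; the top bucket is the customer’s most frequent item combination.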
When a customer has ordered the same large pepperoni and large cheese pizzas with a 2-liter Pepsi in 7 of their last 8 orders, that is a strong indicator of what they will want to order again. Use this data to:
- Make it super simple for them to order this again with one click.
- Identify promotions or offers that may fit well with this regular purchase.
- Create a custom promotion to encourage adding items to the standard order.
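The "7 out of their last 8 orders" signal can be checked with a few lines once each order carries a guid. This is a minimal sketch, not the logic I run in production: the function name `usual_order`, the 75% threshold, and the sample guid strings are all made up for illustration.

```python
from collections import Counter


def usual_order(recent_guids, min_share=0.75):
    """Return the guid that dominates the recent orders, or None.

    recent_guids: guids of a customer's most recent orders (any order).
    min_share: assumed threshold for calling something the "usual" order.
    """
    if not recent_guids:
        return None
    guid, count = Counter(recent_guids).most_common(1)[0]
    return guid if count / len(recent_guids) >= min_share else None


# Hypothetical guids: the same item set in 7 of the last 8 orders.
last_eight = ["a1f3"] * 7 + ["9c2e"]
print(usual_order(last_eight))  # 7/8 = 0.875 clears the 0.75 threshold
```

A customer whose recent orders are all different gets `None`, which is the cue to fall back to item-level recommendations instead of a one-click reorder.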
Your order looks something like this:
{
  "order_id": "ORD123456",
  "customer_id": "CUST98765",
  "order_timestamp": "2024-10-16T14:23:00Z",
  "delivery_address": {
    "street": "123 Main St",
    "city": "Lexington",
    "state": "KY",
    "zip_code": "40508"
  },
  "items": [
    {
      "item_id": "ITEM001",
      "item_name": "Wireless Mouse",
      "item_code": "WM123",
      "quantity_sold": 2,
      "price_per_unit": 29.99,
      "total_price": 59.98
    },
    {
      "item_id": "ITEM002",
      "item_name": "Laptop Stand",
      "item_code": "LS456",
      "quantity_sold": 1,
      "price_per_unit": 39.99,
      "total_price": 39.99
    },
    {
      "item_id": "ITEM003",
      "item_name": "USB-C Hub",
      "item_code": "USB789",
      "quantity_sold": 1,
      "price_per_unit": 19.99,
      "total_price": 19.99
    }
  ],
  "order_total": 119.96,
  "payment_method": "Credit Card",
  "payment_status": "Paid",
  "delivery_status": "Processing",
  "workers_on_shift": 4,
  "notes": "Please leave the package at the front door."
}
ChatGPT identifies these as some of the benefits of using a single identifier:
Creating a single identifier for customer orders based on items purchased is important for several reasons:
- Uniqueness and Consistency: A single identifier ensures that each set of items purchased is uniquely and consistently represented. This prevents duplication and ensures that every order can be tracked and referenced accurately, especially in systems where order updates or modifications might occur.
- Easier Data Management: When dealing with complex order data, such as nested structures of items within an order, a unique identifier simplifies data management. It allows for easier retrieval, updates, and indexing of order records, improving the efficiency of database operations like searches, aggregations, and updates.
- Aggregations and Analytics: A single identifier allows businesses to analyze purchasing patterns more effectively. For instance, identifying trends in specific item combinations or frequently purchased bundles can help optimize inventory, tailor marketing efforts, or create targeted promotions.
- Order Tracking and Audit Trails: In industries like logistics and e-commerce, being able to track specific combinations of items in an order through a single identifier can streamline tracking and auditing. This makes it easier to trace issues with specific orders, such as delays or incorrect shipments, improving customer service.
- Data Integrity: Using a single identifier for a group of items ensures that the integrity of the order data is maintained. It prevents issues where items from different orders might be mistakenly grouped together, ensuring that each identifier accurately reflects a distinct set of items purchased in a specific order.
- Simplified Integration with Other Systems: For systems that rely on interoperability (e.g., ERP systems, CRMs, or third-party analytics tools), having a consistent and unique identifier for orders helps streamline the integration process, allowing for seamless data sharing across different platforms.
Overall, a single identifier for orders ensures efficiency, clarity, and accuracy in managing and analyzing order data, which is essential for businesses to maintain smooth operations and deliver better customer experiences.
Here is a simple script that updates your Elasticsearch order index and creates a globally unique identifier for each order. Sorting the array is important for consistency: the same items listed in a different sequence should produce the same guid. I believe keeping the quantity in the key is important, but I recommend against using pricing as part of the key, since prices can fluctuate even as the product remains the same. As written, the script hashes every field nested in each item object, so any changed field, price included, generates a new guid. I’ll share a way to exclude these fields in a follow-up post soon.
Elasticsearch update_by_query to create those guids based on nested item objects:
POST /your_index/_update_by_query
{
  "script": {
    "source": "def nestedObjects = ctx._source['items']; def keyValuePairs = []; for (def obj : nestedObjects) { def objAsString = []; for (def entry : obj.entrySet()) { if (entry.getKey() != 'guid') { objAsString.add(entry.getKey() + '=' + entry.getValue()); } } Collections.sort(objAsString); keyValuePairs.add(String.join(',', objAsString)); } Collections.sort(keyValuePairs); def concatenated = String.join('|', keyValuePairs); long hash = 0L; for (int i = 0; i < concatenated.length(); i++) { hash = 31L * hash + (long)concatenated.charAt(i); } if (hash < 0) { hash = -hash; } String guid = Long.toHexString(hash); ctx._source['guid'] = guid;"
  },
  "query": {
    "match_all": {}
  }
}
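To experiment with the key outside Elasticsearch, the same idea can be sketched in Python: sort each item’s `key=value` pairs, sort the items, join, and hash. This is not a port of the script above. It swaps the rolling 31-based hash for a truncated SHA-256 digest, so it produces different guid values, and it skips the price fields as recommended earlier. The function name `order_items_key` and the `EXCLUDED_FIELDS` set are assumptions for illustration.

```python
import hashlib

# Assumed exclusions: prices fluctuate, and a stored guid must not feed itself.
EXCLUDED_FIELDS = frozenset({"guid", "price_per_unit", "total_price"})


def order_items_key(items, excluded=EXCLUDED_FIELDS):
    """Derive a stable hex key from a list of item dicts.

    Item sequence and field sequence do not affect the result.
    """
    parts = []
    for item in items:
        # Canonical form of one item: sorted "key=value" pairs.
        pairs = sorted(f"{k}={v}" for k, v in item.items() if k not in excluded)
        parts.append(",".join(pairs))
    parts.sort()  # same items in a different sequence give the same key
    canonical = "|".join(parts)
    # SHA-256 truncated to 16 hex characters (a swapped-in hash, see above).
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


monday = [
    {"item_code": "WM123", "quantity_sold": 2, "price_per_unit": 29.99},
    {"item_code": "LS456", "quantity_sold": 1, "price_per_unit": 39.99},
]
# Same items, reversed, with a price change on the mouse.
friday = [
    {"item_code": "LS456", "quantity_sold": 1, "price_per_unit": 39.99},
    {"item_code": "WM123", "quantity_sold": 2, "price_per_unit": 24.99},
]
print(order_items_key(monday) == order_items_key(friday))
```

Because quantity stays in the key, two mice and three mice count as different orders, while a price change alone does not create a new guid.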