fix: Complete Turkish character normalization
- Fixed İ character handling (must normalize BEFORE lower())
- Added all uppercase Turkish characters to normalization
- Added GPT-5 prompt instructions for Turkish characters
- Now all variations work: MARLİN, MARLIN, marlin, Marlin
All return correct price: 53,000 TL
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
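
A minimal sketch of why the first bullet insists on normalizing before `lower()`: in Python, lowercasing `İ` (U+0130) produces `i` followed by a combining dot above (U+0307), so the lowered string no longer equals plain `marlin`. The variable names here are illustrative and do not come from the commit.

```python
# U+0130 is LATIN CAPITAL LETTER I WITH DOT ABOVE (Turkish İ).
dotted_upper = "MARL\u0130N"

# Lowercasing first turns U+0130 into "i" + U+0307 (combining dot above),
# so the result has one extra code point and no longer matches "marlin".
lowered_first = dotted_upper.lower()

# Replacing U+0130 with a plain "i" first, then lowercasing, gives "marlin".
normalized_first = dotted_upper.replace("\u0130", "i").lower()

print(len(lowered_first), lowered_first == "marlin")        # 7 False
print(len(normalized_first), normalized_first == "marlin")  # 6 True
```

Once `lower()` has run, a replacement table keyed on `İ` can no longer find the character, which is presumably why the earlier ordering failed for `MARLİN`.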
__pycache__/smart_warehouse_with_price.cpython-312.pyc
CHANGED
Binary files a/__pycache__/smart_warehouse_with_price.cpython-312.pyc and b/__pycache__/smart_warehouse_with_price.cpython-312.pyc differ
smart_warehouse_with_price.py
CHANGED
@@ -49,15 +49,26 @@ def get_product_price_and_link(product_name, variant=None):
 
     root = ET.fromstring(xml_content)
 
-    #
-
-
+    # Turkish character normalization FIRST (before lower())
+    tr_map = {
+        'İ': 'i', 'I': 'i', 'ı': 'i',  # All I variations to i
+        'Ğ': 'g', 'ğ': 'g',
+        'Ü': 'u', 'ü': 'u',
+        'Ş': 's', 'ş': 's',
+        'Ö': 'o', 'ö': 'o',
+        'Ç': 'c', 'ç': 'c'
+    }
 
-    #
-
+    # Apply normalization to original (before lower)
+    search_name_normalized = product_name
+    search_variant_normalized = variant if variant else ""
     for tr, en in tr_map.items():
-
-
+        search_name_normalized = search_name_normalized.replace(tr, en)
+        search_variant_normalized = search_variant_normalized.replace(tr, en)
+
+    # Now lowercase
+    search_name = search_name_normalized.lower()
+    search_variant = search_variant_normalized.lower()
 
     best_match = None
     best_score = 0
@@ -303,6 +314,11 @@ If user asks about specific size (S, M, L, XL, XXL, SMALL, MEDIUM, LARGE, X-LARG
 If user asks generally (without size), return ALL variants of the product.
 {warehouse_filter}
 
+CRITICAL TURKISH CHARACTER RULES:
+- "MARLIN" and "MARLİN" are the SAME product (Turkish İ vs I)
+- Treat these as equivalent: I/İ/ı, Ö/ö, Ü/ü, Ş/ş, Ğ/ğ, Ç/ç
+- If user writes "Marlin", also match "MARLİN" in the list
+
 IMPORTANT BRAND AND PRODUCT TYPE RULES:
 - GOBIK: Spanish textile brand we import. When user asks about "gobik", return ALL products with "GOBIK" in the name.
 - Product names contain type information: FORMA (jersey/cycling shirt), TAYT (tights), İÇLİK (base layer), YAĞMURLUK (raincoat), etc.
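
The normalization hunk in `smart_warehouse_with_price.py` can also be sketched as a standalone helper. This is an assumption-laden illustration, not the commit's code: `normalize_tr` and `_TR_TABLE` are hypothetical names, and `str.translate` is swapped in for the commit's `replace` loop. The character mapping itself is the one from `tr_map`, written with escapes so each key is a single code point.

```python
# Hypothetical helper: the commit inlines this logic inside
# get_product_price_and_link rather than defining a separate function.
_TR_TABLE = str.maketrans({
    "\u0130": "i", "I": "i", "\u0131": "i",  # İ, I, ı
    "\u011e": "g", "\u011f": "g",            # Ğ, ğ
    "\u00dc": "u", "\u00fc": "u",            # Ü, ü
    "\u015e": "s", "\u015f": "s",            # Ş, ş
    "\u00d6": "o", "\u00f6": "o",            # Ö, ö
    "\u00c7": "c", "\u00e7": "c",            # Ç, ç
})


def normalize_tr(text):
    """Map Turkish letters to ASCII first, then lowercase.

    None or an empty string yields "", mirroring the commit's
    `variant if variant else ""` guard.
    """
    return (text or "").translate(_TR_TABLE).lower()


# The four spellings listed in the commit message.
variants = ["MARL\u0130N", "MARLIN", "marlin", "Marlin"]
normalized = [normalize_tr(v) for v in variants]
print(normalized)  # ['marlin', 'marlin', 'marlin', 'marlin']
```

A translation table makes one pass over the string instead of one `replace` call per mapping entry. For short product names the difference is negligible, so the loop in the commit is just as reasonable.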