LLM ranking and generation models learn faster when examples are unambiguous. Product feed training data is only as good as its identifiers. Use barcode (GTIN) and manufacturer part number (MPN) matching to group identical items across merchants, then layer price, discount, and availability on top of those groups to produce clean labels that teach models to compare, rank, and