WWW 2026

SEER: Set Encoding for Efficient Representation in Large-Scale E-commerce

Yining Lin (yinilin@ebay.com) · Yuming Shen (yumishen@ebay.com) · Yipeng Zhang (yipezhang@ebay.com) · Canran Xu (canxu@ebay.com)

Abstract

Large Language Models (LLMs) perform well across many tasks but degrade when processing large collections of repetitive or highly similar inputs, a common scenario in applications such as near-duplicate search results and large e-commerce catalogs. In these settings, concatenation-based approaches such as long-context prompting and supervised fine-tuning suffer from attention saturation and diminished signal-to-noise ratio, causing models to miss subtle but important distinctions as input size grows. SEER (Set Encoding for Efficient Representation) addresses this problem by compressing a variable-sized item set into a single learned token that can be injected directly into an LLM. Experiments on a large-scale e-commerce dataset show that SEER substantially outperforms in-context and fine-tuned LLM baselines while remaining stable even when processing thousands of highly similar items.

Why SEER Matters

Method Overview

Figure: SEER framework overview. SEER compresses a large set of item embeddings into a learned summary token that is injected into the LLM.
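The idea of compressing a variable-sized set into one token can be illustrated with a minimal sketch. This page does not specify SEER's encoder, so the code below uses generic attention pooling with a single query vector as a stand-in. The names `attention_pool` and `inject`, the embedding size, and the random query are all illustrative assumptions, not the authors' implementation; in a real system the query would be learned.

```python
import math
import random


def attention_pool(items, query):
    """Compress a variable-sized set of d-dim vectors into one d-dim vector.

    Generic attention pooling (an assumption, not SEER's actual encoder):
    each item is scored against a query, scores are softmax-normalized,
    and the output is the weighted sum of the items.
    """
    d = len(query)
    # Scaled dot-product score between the query and each item.
    scores = [sum(q * x for q, x in zip(query, item)) / math.sqrt(d)
              for item in items]
    # Numerically stable softmax over the whole set.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of items: output size is d regardless of set size.
    return [sum(w * item[j] for w, item in zip(weights, items))
            for j in range(d)]


def inject(summary_token, prompt_embeddings):
    """Prepend the summary token to the prompt's embedding sequence."""
    return [summary_token] + prompt_embeddings


random.seed(0)
d = 4  # toy embedding size (hypothetical)
query = [random.gauss(0, 1) for _ in range(d)]  # would be learned in practice

small_set = [[random.gauss(0, 1) for _ in range(d)] for _ in range(10)]
large_set = [[random.gauss(0, 1) for _ in range(d)] for _ in range(1000)]

token_small = attention_pool(small_set, query)
token_large = attention_pool(large_set, query)

# A toy 3-token prompt; the LLM sees one extra position either way.
prompt = [[0.0] * d for _ in range(3)]
sequence = inject(token_large, prompt)
print(len(token_small), len(token_large), len(sequence))
```

Under these assumptions, the set costs one sequence position whether it holds 10 items or 1,000, and the pooled vector does not depend on item order. Concatenating the items into the prompt would instead grow the sequence with the set.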

Main Results

Approach              Acc_attr   Acc_sum   BERTScore   F1_conf
GPT-4.1-mini (ICL)    0.359      0.091     0.477       0.612
GPT-5 (ICL)           0.547      0.168     0.673       0.733
LLaMA-3.1-8B (SFT)    0.847      0.188     0.567       0.916
SEER                  0.889      0.401     0.709       0.976

SEER improves over both in-context prompting and supervised fine-tuning across all three evaluation objectives. The largest gain appears in set summarization, where Acc_sum rises from 0.188 for the fine-tuned LLaMA-3.1-8B baseline to 0.401. The summary token preserves enough structure for the model to enumerate distinct product concepts without prompt bloat.
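The size of each improvement can be checked directly from the reported scores. The short sketch below copies the numbers from the Main Results table and computes SEER's absolute gain over the strongest baseline on each metric. The helper name `gains_over_best_baseline` is a hypothetical name chosen for illustration.

```python
# Scores copied from the Main Results table.
results = {
    "GPT-4.1-mini (ICL)": {"Acc_attr": 0.359, "Acc_sum": 0.091,
                           "BERTScore": 0.477, "F1_conf": 0.612},
    "GPT-5 (ICL)":        {"Acc_attr": 0.547, "Acc_sum": 0.168,
                           "BERTScore": 0.673, "F1_conf": 0.733},
    "LLaMA-3.1-8B (SFT)": {"Acc_attr": 0.847, "Acc_sum": 0.188,
                           "BERTScore": 0.567, "F1_conf": 0.916},
    "SEER":               {"Acc_attr": 0.889, "Acc_sum": 0.401,
                           "BERTScore": 0.709, "F1_conf": 0.976},
}


def gains_over_best_baseline(results, ours="SEER"):
    """Return {metric: (best_baseline_name, absolute_gain)} for `ours`."""
    gains = {}
    for metric, score in results[ours].items():
        # Strongest competing approach on this metric.
        best_name, best_score = max(
            ((name, row[metric]) for name, row in results.items()
             if name != ours),
            key=lambda pair: pair[1],
        )
        gains[metric] = (best_name, round(score - best_score, 3))
    return gains


gains = gains_over_best_baseline(results)
for metric, (baseline, gain) in gains.items():
    print(f"{metric}: +{gain:.3f} over {baseline}")
```

On these numbers the Acc_sum gain is +0.213 over the fine-tuned LLaMA-3.1-8B baseline, compared with +0.042 on Acc_attr and +0.060 on F1_conf. For BERTScore the strongest baseline is GPT-5 (ICL), and the gain is +0.036.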

Robustness to Large Input Sets

Figure: SEER robustness against increasing input set size. SEER remains stable as the input set grows, while the fine-tuned LLM baseline degrades under longer and more repetitive inputs.

The evaluation set contains up to 1,000 items per sample, far beyond the 100-item maximum used during training. Even under this distribution shift, SEER preserves strong attribute grounding, summarization quality, and conflict detection performance, highlighting its suitability for real-world large-scale product collections.

Experimental Setting

Citation

@inproceedings{lin2026seer,
  title     = {SEER: Set Encoding for Efficient Representation in Large-Scale E-commerce},
  author    = {Lin, Yining and Shen, Yuming and Zhang, Yipeng and Xu, Canran},
  booktitle = {Proceedings of the ACM Web Conference 2026 (WWW '26)},
  year      = {2026},
  address   = {Dubai, United Arab Emirates},
  doi       = {10.1145/3774904.3792958}
}