SEER | Yipeng Zhang

Abstract

Large Language Models (LLMs) perform well across many tasks but degrade when processing large collections of repetitive or highly similar inputs, a common scenario in applications that handle near-duplicate search results or large e-commerce catalogs. In these settings, concatenation-based approaches such as long-context prompting and supervised fine-tuning suffer from attention saturation and a diminished signal-to-noise ratio, causing models to miss subtle but important distinctions as input size grows. SEER (Set Encoding for Efficient Representation) addresses this problem by compressing a variable-sized item set into a single learned token that can be injected directly into an LLM. Experiments on a large-scale e-commerce dataset show that SEER substantially outperforms in-context and fine-tuned LLM baselines while remaining stable even when processing thousands of highly similar items.

Why SEER Matters

Built for dense item sets: targets product groups where listings are highly redundant yet differ in fine-grained attributes such as storage, color, or carrier.
Single-token set interface: replaces prompt concatenation with one learned summary token that preserves both shared signals and subtle contrasts.
Task-conditioned encoding: uses learnable queries to extract different set views for attribution grounding, set summarization, conflict detection, and reconstruction.
Industry-scale robustness: maintains strong performance on input sets far larger than those seen in training (up to 1,000 items at evaluation versus at most 100 during training), which is critical for large e-commerce catalogs.

Method Overview

SEER framework overview: SEER compresses a large set of item embeddings into a learned summary token that is injected into the LLM.

Frozen item encoder: each listing is first mapped into a semantic embedding using an in-house BERT-based encoder.
SEER adapter: multi-head attention with learnable task queries aggregates the set into a compact task-conditioned representation.
Neural injection: the learned vector replaces the embedding of a special <summary_token> placeholder in the prompt, letting the LLM consume set-level semantics directly.
Staged alignment: first align the adapter to the frozen LLM, then jointly fine-tune both modules for smooth cross-modal integration.
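
The aggregation and injection steps listed above can be illustrated with a minimal sketch. It is not the SEER implementation: it assumes a single attention head with no key/value projections, a random vector standing in for a learned task query, and toy dimensions. The names `attention_pool` and `inject` are hypothetical and chosen only for this illustration.

```python
import math
import random


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(query, items):
    """Pool a variable-sized set of item embeddings into one vector.

    Simplification (assumption): single-head scaled dot-product attention
    where the item embeddings act as both keys and values.
    """
    dim = len(query)
    scale = math.sqrt(dim)
    weights = softmax([dot(query, item) / scale for item in items])
    return [sum(w * item[d] for w, item in zip(weights, items)) for d in range(dim)]


def inject(prompt_embeddings, summary_position, summary_vector):
    """Return a copy of the prompt embeddings with the placeholder slot replaced."""
    out = [list(row) for row in prompt_embeddings]
    out[summary_position] = list(summary_vector)
    return out


rng = random.Random(0)
DIM = 8  # toy embedding width, not the real model dimension

# Stand-in for a learned task query; in a trained adapter this would be a parameter.
task_query = [rng.gauss(0, 1) for _ in range(DIM)]

# Stand-in for frozen-encoder outputs of 500 listings.
item_set = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(500)]

# The output width is fixed no matter how many items are in the set.
summary = attention_pool(task_query, item_set)

# Reordering the set does not change the pooled vector (up to float rounding).
shuffled_items = item_set[:]
rng.shuffle(shuffled_items)
summary_shuffled = attention_pool(task_query, shuffled_items)
order_invariant = all(
    math.isclose(a, b, abs_tol=1e-9) for a, b in zip(summary, summary_shuffled)
)

# Toy prompt of 5 token embeddings; position 2 plays the role of <summary_token>.
prompt = [[0.0] * DIM for _ in range(5)]
SUMMARY_POS = 2
injected = inject(prompt, SUMMARY_POS, summary)
```

The point of the sketch is the interface: whatever the set size, the LLM sees the same prompt length, with one embedding slot carrying the set-level information. A multi-head, multi-layer adapter with several task queries would follow the same pattern with learned projections added.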

Main Results

Approach	Acc_attr	Acc_sum / BERTScore	F1_conf
GPT-4.1-mini (ICL)	0.359	0.091 / 0.477	0.612
GPT-5 (ICL)	0.547	0.168 / 0.673	0.733
LLaMA-3.1-8B (SFT)	0.847	0.188 / 0.567	0.916
SEER	0.889	0.401 / 0.709	0.976

SEER improves over both in-context prompting and supervised fine-tuning across all three evaluation objectives. The largest gain appears in set summarization, where Acc_sum reaches 0.401 against 0.188 for the strongest baseline: the summary token preserves enough structure for the model to enumerate distinct product concepts without prompt bloat.
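
To make the comparison concrete, the short script below recomputes the absolute gain of SEER over the strongest baseline on each metric. The scores are copied from the table above; the dictionary layout and variable names are only an illustration.

```python
# Scores copied from the results table.
baselines = {
    "GPT-4.1-mini (ICL)": {"Acc_attr": 0.359, "Acc_sum": 0.091, "BERTScore": 0.477, "F1_conf": 0.612},
    "GPT-5 (ICL)": {"Acc_attr": 0.547, "Acc_sum": 0.168, "BERTScore": 0.673, "F1_conf": 0.733},
    "LLaMA-3.1-8B (SFT)": {"Acc_attr": 0.847, "Acc_sum": 0.188, "BERTScore": 0.567, "F1_conf": 0.916},
}
seer = {"Acc_attr": 0.889, "Acc_sum": 0.401, "BERTScore": 0.709, "F1_conf": 0.976}

gains = {}
for metric, seer_score in seer.items():
    # Strongest baseline for this metric (it is not the same model for every metric).
    best_name, best_score = max(
        ((name, scores[metric]) for name, scores in baselines.items()),
        key=lambda pair: pair[1],
    )
    gains[metric] = (best_name, round(seer_score - best_score, 3))

for metric, (name, gain) in gains.items():
    print(f"{metric}: +{gain:.3f} over {name}")

largest = max(gains, key=lambda m: gains[m][1])
```

Note that the strongest baseline differs by metric: the fine-tuned LLaMA model leads on Acc_attr, Acc_sum, and F1_conf, while GPT-5 leads on BERTScore.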

Robustness to Large Input Sets

SEER robustness against increasing input set size: SEER remains stable as the input set grows, while the fine-tuned LLM baseline degrades under longer and more repetitive inputs.

The evaluation set contains up to 1,000 items per sample, far beyond the 100-item maximum used during training. Even under this distribution shift, SEER preserves strong attribute grounding, summarization quality, and conflict detection performance, highlighting its suitability for real-world large-scale product collections.

Experimental Setting

Dataset: 523,422 unique cellphone listings from eBay.com, covering 2,706 product concepts.
Training data: 1.5 million examples across attribution grounding, set summarization, conflict detection, and information reconstruction.
Backbone: LLaMA-3.1-8B-Instruct with a 4-layer multi-head SEER adapter.
Baselines: GPT-5, GPT-4.1-mini, and a supervised fine-tuned LLaMA-3.1-8B baseline based on prompt concatenation.

Citation

@inproceedings{lin2026seer,
  title     = {SEER: Set Encoding for Efficient Representation in Large-Scale E-commerce},
  author    = {Lin, Yining and Shen, Yuming and Zhang, Yipeng and Xu, Canran},
  booktitle = {Proceedings of the ACM Web Conference 2026 (WWW '26)},
  year      = {2026},
  address   = {Dubai, United Arab Emirates},
  doi       = {10.1145/3774904.3792958}
}