LinkedIn is moving its recommendation systems beyond traditional models toward generative sequential architectures. This shift, exemplified by its Generative Recommender (GR), promises a more nuanced understanding of user behavior over time. Scaling these models, however, presents significant engineering hurdles.
The move to GR, which models user activity as token sequences, offers richer long-context personalization than the older Deep Learning Recommendation Models (DLRMs). The upgrade became crucial as user interactions on the platform grew more dynamic and sequence-driven. In LinkedIn Engineering's own production deployments, GR delivered measurable gains, including a 2.10% increase in session time spent.
Traditional DLRMs focus on per-user activity, while GRs treat a user's entire history as an ordered token stream. In practice, GRs draw on a broader time window (360 days versus 90) and use transformer-based architectures, which leads to larger models and more complex data handling.
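To make the idea of an ordered token stream concrete, here is a minimal Python sketch of how a user's events might be turned into a sequence under different lookback windows. Everything in it is an illustrative assumption rather than LinkedIn's actual pipeline: the `Event` fields, the `action:item_id` token format, and the `build_token_sequence` helper are all hypothetical. Only the 90-day and 360-day windows come from the comparison above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    """One user interaction (hypothetical schema, for illustration only)."""
    timestamp: datetime
    action: str
    item_id: str


def build_token_sequence(events, now, window_days):
    """Return the user's in-window events as time-ordered string tokens."""
    cutoff = now - timedelta(days=window_days)
    # Keep only events that fall inside the lookback window.
    in_window = [e for e in events if cutoff <= e.timestamp <= now]
    # Order matters for a sequential model, so sort oldest to newest.
    in_window.sort(key=lambda e: e.timestamp)
    # Assumed token format: "<action>:<item_id>".
    return [f"{e.action}:{e.item_id}" for e in in_window]


now = datetime(2025, 1, 1)
events = [
    Event(now - timedelta(days=200), "view", "job_17"),
    Event(now - timedelta(days=5), "like", "post_42"),
    Event(now - timedelta(days=30), "apply", "job_17"),
]

short_history = build_token_sequence(events, now, window_days=90)
long_history = build_token_sequence(events, now, window_days=360)

print(short_history)  # ['apply:job_17', 'like:post_42']
print(long_history)   # ['view:job_17', 'apply:job_17', 'like:post_42']
```

With the longer window, the 200-day-old view of `job_17` enters the sequence, so a sequential model could see that the later application followed an earlier view. The cost is that sequences get longer for every user, which hints at why the wider window makes data handling more complex.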