As future small cell base stations (SCBSs) are set to be multi-mode capable (i.e., transmitting on both licensed and unlicensed bands), a cost-effective integration of both technologies that copes with peak data demands is crucial. Using tools from reinforcement learning (RL), a distributed cross-system traffic steering framework is proposed, whereby SCBSs autonomously optimize their long-term performance as a function of traffic load and users' heterogeneous requirements. Leveraging the existing Wi-Fi component, SCBSs learn their optimal transmission strategies over both the licensed and unlicensed bands. The proposed traffic steering solution is validated in a Long-Term Evolution (LTE) simulator augmented with Wi-Fi hotspots. The cross-system learning-based approach is shown to outperform several benchmark algorithms and traffic steering policies, with gains of up to 300% when using a traffic-aware scheduler, as compared to the classical proportional fair (PF) scheduler.
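
For reference, the classical PF scheduler used as the baseline above serves, in each scheduling slot, the user with the largest ratio of instantaneous achievable rate to average past throughput. The following is a minimal sketch of that textbook rule only, not the simulator or the traffic-aware scheduler of this work; the function names `pf_select` and `pf_update` and the averaging window `t_c` are assumptions introduced for illustration.

```python
from typing import List


def pf_select(inst_rates: List[float], avg_tput: List[float],
              eps: float = 1e-9) -> int:
    """Return the index of the user with the largest PF metric r_i / R_i.

    inst_rates: instantaneous achievable rate of each user in this slot.
    avg_tput:   exponentially averaged past throughput of each user.
    eps:        floor on the average, to avoid division by zero.
    """
    metrics = [r / max(R, eps) for r, R in zip(inst_rates, avg_tput)]
    return max(range(len(metrics)), key=metrics.__getitem__)


def pf_update(avg_tput: List[float], inst_rates: List[float],
              served: int, t_c: float = 100.0) -> List[float]:
    """Exponential moving-average update of the per-user throughput.

    Only the served user is credited with its instantaneous rate;
    every other user is credited with zero for this slot.
    t_c is the (assumed) averaging window, in slots.
    """
    return [
        (1.0 - 1.0 / t_c) * R + (1.0 / t_c) * (r if i == served else 0.0)
        for i, (R, r) in enumerate(zip(avg_tput, inst_rates))
    ]


if __name__ == "__main__":
    rates = [10.0, 4.0]   # hypothetical instantaneous rates
    avg = [20.0, 2.0]     # hypothetical average throughputs
    user = pf_select(rates, avg)
    avg = pf_update(avg, rates, user, t_c=10.0)
    print(user, avg)
```

In this toy call the second user is served despite its lower instantaneous rate, because its average throughput is much lower. The PF metric depends only on channel rates and throughput history, not on traffic load or per-user requirements, which is the dimension a traffic-aware scheduler takes into account.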