To improve cellular energy efficiency without sacrificing quality-of-service (QoS) at the users, the network topology must be densified to enable higher spatial reuse. We analyze a combination of two densification approaches, namely "massive" multiple-input multiple-output (MIMO) base stations and small-cell access points. If the latter are operator-deployed, a spatial soft-cell approach can be taken in which multiple transmitters serve the users by joint non-coherent multiflow beamforming. We minimize the total power consumption (both dynamic emitted power and static hardware power) while satisfying QoS constraints. We prove that this problem has a hidden convexity that enables efficient solution algorithms. Interestingly, the optimal solution promotes exclusive assignment of users to transmitters. Furthermore, we provide promising simulation results showing that the total power consumption can be greatly reduced by combining massive MIMO and small cells; this is possible with both optimal and low-complexity beamforming.
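
As an illustrative sketch only, the power-minimization problem described above can be written in a generic form. All notation here is assumed for illustration and is not taken from the text: $\mathbf{w}_{k,j}$ is the beamforming vector from transmitter $j$ to user $k$, $\mathbf{h}_{k,j}$ the corresponding channel, $\rho_j \ge 1$ a factor modeling amplifier inefficiency at transmitter $j$, $\sigma_k^2$ the noise power, and $\gamma_k$ the QoS target of user $k$, expressed here as an SINR threshold.

```latex
\begin{equation*}
\begin{aligned}
\underset{\{\mathbf{w}_{k,j}\}}{\text{minimize}} \quad
  & \underbrace{\sum_{j} \rho_j \sum_{k} \|\mathbf{w}_{k,j}\|^2}_{\text{dynamic emitted power}}
    \; + \; P_{\text{static}} \\
\text{subject to} \quad
  & \mathrm{SINR}_k =
    \frac{\sum_{j} |\mathbf{h}_{k,j}^{H} \mathbf{w}_{k,j}|^2}
         {\sum_{i \neq k} \sum_{j} |\mathbf{h}_{k,j}^{H} \mathbf{w}_{i,j}|^2 + \sigma_k^2}
    \;\ge\; \gamma_k \quad \text{for all users } k .
\end{aligned}
\end{equation*}
```

The numerator sums received signal powers over transmitters rather than amplitudes, which is how non-coherent multiflow transmission is usually modeled. In this sketch, a user $k$ is exclusively assigned to transmitter $j$ when $\mathbf{w}_{k,i} = \mathbf{0}$ for every other transmitter $i \neq j$.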