Novel Internet applications often require low latency and high throughput at the same time, which poses challenges for access aggregation networks (AANs). The Low-Latency Low-Loss Scalable-Throughput (L4S) Internet service and related schedulers have been proposed to meet these requirements while allowing Classic and L4S flows to coexist in the same system. AANs generally apply Hierarchical QoS (HQoS) to enforce fairness among their subscribers: HQoS lets each subscriber use its fair share as it wishes and isolates the traffic of different subscribers from one another. However, the traffic management engines of available P4-programmable hardware switches do not support complex HQoS and L4S scheduling. In this demo paper, we show how a recent core-stateless L4S Active Queue Management (AQM) proposal called VDQ-CSAQM can be implemented in P4 and executed in high-speed programmable hardware switches. We also show how a cloud-rendered gaming service benefits from the low latency and HQoS provided by our VDQ-CSAQM implementation.