Hybrid Heterogeneous Clusters Can Lower the Energy Consumption of LLM Inference Workloads (December 29, 2024)
HETEGAN: Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices (December 28, 2024)