Publication:

Evaluating the Network Effects of Orchestration Strategies for AI Workloads in Modern Data Centers

 
dc.contributor.author: Pereira dos Santos, José Pedro
dc.contributor.author: Maniotis, Pavlos
dc.contributor.author: Wang, Chen
dc.contributor.author: Tantawi, Asser
dc.contributor.author: Tardieu, Olivier
dc.contributor.author: Wauters, Tim
dc.contributor.author: De Turck, Filip
dc.date.accessioned: 2026-03-19T16:00:38Z
dc.date.available: 2026-03-19T16:00:38Z
dc.date.createdwos: 2025-10-22
dc.date.issued: 2025
dc.description.abstract: The exponential growth in Artificial Intelligence (AI) adoption presents unique challenges and opportunities for deploying AI workloads in modern Data Center (DC) networks, particularly in terms of performance, scalability, and reliability. AI workloads, such as inference and distributed training, impose different network demands: inference is primarily compute-bound and typically requires low network latency, while distributed training is network-bound and requires high bandwidth, placing significant strain on the network. This paper focuses on the network requirements of widely known AI communication patterns and studies their impact on modern DC architectures by analyzing the effects of different orchestration strategies (specifically packing and spreading) on throughput, response time, and network congestion. The results show that packing strategies generally deliver higher performance for most covered AI collectives. However, spreading strategies can be beneficial in certain scenarios, such as when larger workloads span a higher number of racks, as they can help mitigate network congestion between the switches of leaf-spine network configurations. This paper offers valuable insights into optimizing the orchestration of popular AI collectives in data center networks, presenting informed strategies to improve performance in response to growing AI demands, with findings demonstrating completion time reductions of up to 30%.
dc.description.wosFundingText: Jose Santos is funded by the Research Foundation Flanders (FWO), grant number 1299323N. This work was performed during an internship at IBM Research, Yorktown Heights, NY, USA with financial support from the Research Foundation Flanders (FWO).
dc.identifier.doi: 10.1109/NETSOFT64993.2025.11080575
dc.identifier.isbn: 979-8-3315-4346-4
dc.identifier.issn: 2693-9770
dc.identifier.uri: https://imec-publications.be/handle/20.500.12860/58894
dc.language.iso: eng
dc.provenance.editstepuser: greet.vanhoof@imec.be
dc.publisher: IEEE
dc.source.beginpage: 285
dc.source.conference: IEEE 11th International Conference on Network Softwarization (NetSoft)
dc.source.conferencedate: 2025-06-23
dc.source.conferencelocation: Budapest
dc.source.endpage: 293
dc.source.journal: 2025 IEEE 11TH INTERNATIONAL CONFERENCE ON NETWORK SOFTWARIZATION, NETSOFT
dc.source.numberofpages: 9
dc.title: Evaluating the Network Effects of Orchestration Strategies for AI Workloads in Modern Data Centers

dc.type: Proceedings paper
dspace.entity.type: Publication
dspace.file.type: PDF
imec.internal.crawledAt: 2025-10-22
imec.internal.source: crawler
Files

Original bundle

Name:
8841.pdf
Size:
543.5 KB
Format:
Adobe Portable Document Format
Description:
Published
Name:
8841_acc.pdf
Size:
523.35 KB
Format:
Adobe Portable Document Format
Description:
Accepted
Publication available in collections: