Presentation
Optimizing Post-Copy Live Migration with System-Level Checkpoint Using Fabric-Attached Memory
Event Type
Workshop
W
HPC
Memory
OS and Runtime Systems
Runtime Systems
Time
Monday, 18 November 2019, 11:08am - 11:30am
Location
501
Description
Emerging Non-Volatile Memories combine byte addressability and low latency, close to that of main memory, with the non-volatility of storage devices. Similarly, recently emerging interconnect fabrics, such as Gen-Z, provide high bandwidth together with exceptionally low latency. These concurrently emerging technologies make possible new system architectures in data centers, including systems with Fabric-Attached Memories (FAMs). FAMs can serve to create scalable, high-bandwidth, distributed, shared, byte-addressable, and non-volatile memory pools at rack scale, opening up new usage models and opportunities.
Based on these attractive properties, in this paper we propose a FAM-aware, checkpoint-based, post-copy live migration mechanism to improve the performance of migration. We have implemented our prototype with an open-source Linux checkpoint tool, CRIU (Checkpoint/Restore In Userspace). According to our evaluation results, compared to the existing solution, our FAM-aware post-copy approach improves total migration time by at least 15% and busy time by at least 33%, and lets the migrated application perform at least 12% better during migration.