This tech talk explores the design and evolution of the PPEC Agent and PPEC Proxy, which form the backbone of a virtualization stack leveraging libvirt, KVM, and QEMU to manage virtual machine (VM) creation, disk attachment, and performance tuning of VMs on bare-metal systems. The session will outline key engineering decisions, the challenges faced in optimizing resource management for VMs, and how advanced tuning techniques like NUMA core optimization play a vital role in performance improvements.
The PPEC system handles VM lifecycle management with a focus on performance optimization. The PPEC Agent runs as a long-lived daemon on every bare-metal machine, while the PPEC Proxy listens on a predefined port for a predefined set of task requests from the Orchestrator service (PPEC API) and communicates with the Agent over a Unix Domain Socket (UDS).
This section introduces the Orchestrator Service, which lays the groundwork for VM creation and reliable infrastructure management through:
- IP Allocation: The Orchestrator uses a dedicated IPDB service to assign IPs, taking into account factors such as the VRF to which the requested host(s) belong.
- Bare-Metal Candidate Filtering: The Orchestrator filters eligible bare-metal candidates for provisioning based on a combination of factors:
  - VM Placement Policies: Candidate filtering follows the policies provided by the PPEC API, which account for operational setups and non-functional requirements such as:
    - High Availability (HA): Ensures VMs are provisioned on diverse hardware to prevent single points of failure.
    - Cost Optimization: Considers factors like power utilization and operational expenses, selecting servers with the best balance of cost and performance.
  - Compute Resource Availability: The Orchestrator further refines the selection based on the availability of compute resources that match the VM’s workload:
    - Resource-Specific Workloads: Matches workload-specific needs, such as CPU, memory, and storage requirements.
    - NUMA-Aware Placement: Optimizes performance by aligning VMs with specific NUMA nodes to ensure local memory access, reducing latency for performance-critical applications.
- Data Synchronization for Real-Time Eligibility: The Orchestrator relies on periodic updates from the PPEC Agent for up-to-date bare-metal information, giving it a current view of resources across regions. This data, also accessible to SREs through the PPEC CLI and PPEC UI, is vital for accurate candidate selection, manual oversight, and proactive alerting when unintended VM lifecycle changes are made on a bare-metal host.
- PPEC CLI and UI Access: The CLI provides users with a unified interface for managing VM lifecycle operations, viewing bare-metal details, and facilitating capacity planning.
- Maker-Checker Process: To safeguard major operations (e.g., IP or VM updates and deletions), a two-person approval flow is enforced, with nuances such as resource-type restrictions on approvers, improving the reliability and accuracy of operational actions.
These capabilities streamline VM provisioning and capacity planning, and they ensure high system integrity through PPEC’s coordinated ecosystem.
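The candidate filtering described above can be illustrated with a minimal Go sketch. It is not PPEC's actual logic: the `Host`, `NUMANode`, and `VMRequest` types, the use of a rack as the hardware failure domain, and the rule that a VM must fit on a single NUMA node are all assumptions made for illustration.

```go
package main

import "fmt"

// NUMANode, Host, and VMRequest are hypothetical types for illustration;
// they are not PPEC's actual data model.
type NUMANode struct {
	ID         int
	FreeCores  int
	FreeMemMiB int
}

type Host struct {
	Name  string
	Rack  string // assumed stand-in for a hardware failure domain
	Nodes []NUMANode
}

type VMRequest struct {
	Cores  int
	MemMiB int
}

// fitsOnOneNode reports whether some single NUMA node on the host can hold
// the whole VM, so that its memory accesses can stay local.
func fitsOnOneNode(h Host, req VMRequest) bool {
	for _, n := range h.Nodes {
		if n.FreeCores >= req.Cores && n.FreeMemMiB >= req.MemMiB {
			return true
		}
	}
	return false
}

// filterCandidates keeps hosts that pass the resource check and, as a simple
// HA rule, skips racks that already host a replica of the same VM group.
func filterCandidates(hosts []Host, req VMRequest, usedRacks map[string]bool) []Host {
	var out []Host
	for _, h := range hosts {
		if usedRacks[h.Rack] {
			continue
		}
		if fitsOnOneNode(h, req) {
			out = append(out, h)
		}
	}
	return out
}

func main() {
	hosts := []Host{
		{Name: "bm-1", Rack: "r1", Nodes: []NUMANode{{0, 8, 32768}, {1, 2, 4096}}},
		{Name: "bm-2", Rack: "r2", Nodes: []NUMANode{{0, 2, 8192}, {1, 2, 8192}}},
		{Name: "bm-3", Rack: "r3", Nodes: []NUMANode{{0, 16, 65536}}},
	}
	req := VMRequest{Cores: 4, MemMiB: 16384}
	for _, h := range filterCandidates(hosts, req, map[string]bool{"r3": true}) {
		fmt.Println(h.Name)
	}
}
```

In this example only `bm-1` survives: `bm-2` has enough cores in total but not on any single node, and `bm-3` sits in a rack that is already in use. A cost-optimization policy would typically be applied as a ranking step over the surviving candidates rather than as a filter.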
- Automated VM Creation and Deletion via QEMU, KVM, and libvirt: This session will explore how the PPEC Agent and PPEC Proxy build on the established technologies of libvirt, KVM, and QEMU to automate VM creation, deletion, and storage management. The PPEC Agent is written in Go, and this automation significantly reduces manual intervention and simplifies lifecycle management.
- Concurrency Management with Lockfile: One of the key contributions of PPEC Agent is its use of a lockfile mechanism to handle concurrency during VM creation, ensuring that multiple requests are managed safely and without conflict.
- Challenges in Resource Allocation: In the early stages, resource allocation was handled statically, which made it difficult to optimize VMs for workloads that demanded low latency and efficient resource isolation. This section will detail the evolution of resource management within PPEC.
- NUMA Optimization for Performance: A deep dive into how NUMA (Non-Uniform Memory Access) tuning allows for optimal CPU and memory performance by mapping VM cores to specific NUMA nodes. This tuning minimizes memory latency and ensures higher performance for demanding workloads.
- Dynamic Disk Attachment: Explore how PPEC Agent dynamically attaches disks to VMs, ensuring flexible storage management based on the VM’s needs.
- Resource Monitoring: PPEC Agent fetches critical statistics (CPU, memory and disk utilization, and other important system metrics) from the bare-metal host to help optimize VM placement and performance. This section will cover how these metrics guide VM resource allocation strategies.
All state changes on a bare-metal host are tightly controlled through the PPEC Agent, and any deviation raises an alert. This keeps the view of system state consistent across deployments.
- Improving VM Creation Efficiency: With continuous evolution, the PPEC stack has become more adept at handling performance-sensitive workloads by fine-tuning the interaction between the proxy and agent, improving overall efficiency.
- Simplified Monitoring and Debugging: As the stack matured, monitoring and debugging of VMs became more seamless, allowing for better operational control of the VMs and reducing overhead.
- Challenges with NUMA Tuning and Disk Attachment: The session will discuss some of the technical challenges encountered with NUMA optimization and dynamic disk attachment, along with lessons learned about balancing resource performance with operational complexity.
The PPEC Agent Syncer plays a critical role in ensuring that the PPEC Agent stays up to date with minimal disruption to the running VMs and services. This section will explore the checksum-based periodic verification process used by the syncer to detect any discrepancies between the agent’s current state and the desired state.
Additionally, the talk will cover the self-update cycle of the PPEC Agent Syncer, detailing how it autonomously initiates upgrades to ensure consistency and performance, all while maintaining seamless operation and minimizing downtime during these updates.
This talk is ideal for software engineers interested in the operational aspects of virtualization, performance tuning, and resource management on bare-metal systems.