Energy Management in Virtualized Environments
Gaurav Dhiman, Giacomo Marchetti, Raid Ayoub, Tajana Simunic Rosing (CSE-UCSD)
Inside Xen Hypervisor
Motivations and Goals
Online Learning Algorithm
- Performs dynamic evaluation of a set of DPM and DVFS policies at run time and selects the one best suited to the current workload
- Guarantees convergence and performance close to that of the best available policy in the set
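The selection step described above can be illustrated with a standard multiplicative-weights scheme over experts. This is only a sketch under stated assumptions: the poster does not give the update rule, so the learning rate `beta`, the loss values in [0, 1], and the names `ExpertController` and `mixed_loss` are all hypothetical.

```python
import random


class ExpertController:
    """Keeps one weight per expert (policy) and picks the operational one."""

    def __init__(self, experts, beta=0.75, seed=0):
        # beta in (0, 1) is an assumed learning rate, not a value from the poster.
        self.experts = list(experts)
        self.beta = beta
        self.weights = [1.0] * len(self.experts)
        self.rng = random.Random(seed)

    def probabilities(self):
        """Normalized weights: the chance of each expert being selected."""
        total = sum(self.weights)
        return [w / total for w in self.weights]

    def select(self):
        """Sample the operational expert; the others stay dormant."""
        return self.rng.choices(self.experts, weights=self.weights, k=1)[0]

    def update(self, losses):
        """Scale each weight down by beta ** loss, with loss in [0, 1]."""
        for i, loss in enumerate(losses):
            self.weights[i] *= self.beta ** loss


def mixed_loss(energy_loss, delay_loss, alpha=0.5):
    """Hypothetical loss: alpha trades energy savings against delay."""
    return alpha * energy_loss + (1.0 - alpha) * delay_loss


controller = ExpertController(["timeout", "adaptive", "predictive"])
for _ in range(50):
    # Assumed per-period losses; a real system would measure them.
    controller.update([0.9, 0.5, 0.1])
print(controller.probabilities())
```

With repeated evaluation, the weight mass concentrates on the expert with the lowest loss, which is the behavior the convergence claim refers to.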
[Figure: online learning controller. Labels: Working Set (Expert 1, Expert 2, Expert 3, ..., Expert N), Expert selection, Operational Expert, Dormant Experts, Controller, Manages Power, Device, DVFS, Scheduler]
Virtualization
Virtual Machine Power Oriented Scheduling
Workload migration across physical machines
Lower datacenter energy consumption
Handle non-stationary workloads
Minimize impact on performance
Service
VM
Customization
Online Learning Algorithm
[Figure labels: Apps and OS, repeated for each guest]
Energy Oriented
- Implements a scheduler capable of adapting to workload (guest) characteristics
- Migration: Guest balancing and clustering
- Co-locate guests to free up resources
Workload characterization
- I/O Intensiveness: Maintain metrics for I/O accesses per guest
- CPU Intensiveness: Use CPU performance counters
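As a rough illustration of per-guest characterization and clustering, the sketch below labels each guest from two metrics and groups guests with the same label. Everything here is an assumption for illustration: the field names, the thresholds, and the grouping rule are hypothetical and do not come from the poster or from Xen.

```python
from dataclasses import dataclass


@dataclass
class GuestStats:
    name: str
    io_accesses_per_sec: float  # hypothetical per-guest I/O metric
    cpu_busy_fraction: float    # hypothetical value derived from performance counters


IO_THRESHOLD = 100.0  # assumed cut-off for "I/O intensive"
CPU_THRESHOLD = 0.6   # assumed cut-off for "CPU intensive"


def classify(guest):
    """Label a guest as 'io', 'cpu', or 'light' from its metrics."""
    if guest.io_accesses_per_sec >= IO_THRESHOLD:
        return "io"
    if guest.cpu_busy_fraction >= CPU_THRESHOLD:
        return "cpu"
    return "light"


def cluster(guests):
    """Group guest names by label, as candidates for co-location."""
    groups = {}
    for guest in guests:
        groups.setdefault(classify(guest), []).append(guest.name)
    return groups


guests = [
    GuestStats("guest1", 250.0, 0.2),
    GuestStats("guest2", 5.0, 0.9),
    GuestStats("guest3", 3.0, 0.1),
]
print(cluster(guests))
```

A scheduler could consult such labels when deciding which guests to place on the same physical machine; the actual placement policy is left open here.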
[Figure: Xen hypervisor architecture. Labels: Guest 1, Guest 2, ..., Guest n; Hypervisor: Online Learning Algorithm, Workload Characterization (I/O Intensive? CPU Intensive?), Credit Scheduler, VM Scheduling; Hardware: I/O, N/W]
For qsort
CPU intensive ( -> 1) vs. memory intensive ( -> 0)
Experimental Setup
= measure of CPU intensiveness
Workloads: qsort, djpeg, blowfish, dgzip
CPU: Xscale
OS implementation and Results
Leakage impact ()
Identifies both CPU-intensive and memory-intensive phases correctly
[Figure: Frequency of Selection (axis 0 to 80) at 208MHz, 312MHz, 416MHz. Labels: low, medium, high; CPU intensive 0.75]
[Figure: Energy Saving/Performance Delay Results for CPU. Axis: time. Labels: DPM, DPM & DVFS; 25%, 75%, Avg.; Lower Perf Delay, Higher energy savings]
Device: HDD
Trace Name    tRI     tRI
HP-1 Trace    20.5    29
HP-2 Trace    5.9     8.4
HP-3 Trace    17.2    2
tRI: Average Request Inter-arrival Time (in sec)
Policy    Description
PM-1      Switch CPU to ACPI state C1 (remove clock supply) and move to lowest voltage setting
PM-2      Switch CPU to ACPI state C6 (remove power)
PM-3      Switch CPU to ACPI state C6 and switch the memory to self-refresh mode
Benchmark
mcf
Policy        HP1 Trace           HP2 Trace           HP3 Trace
              %delay   %energy    %delay   %energy    %delay   %energy
Oracle        0        68.17      0        65.9       0        71.2
Timeout       4.2      49.9       4.4      46.9       3.3      55
Ad Timeout    7.7      66.3       8.7      64.7       6        67.7
TISMDP        3.4      44.8       2.26     36.7       1.8      42.3
Predictive    8        66.6       9.2      65.2       6.5      68
Expert                                     Characteristics
Fixed Timeout                              Timeout = 7*Tbe
Adaptive Timeout (Douglis, USENIX95)       Initial timeout = 7*Tbe; Adjustment = +1Tbe/-1Tbe
Exponential Predictive (Hwang, ICCAD97)    $I_{n+1} = a \cdot i_n + (1 - a) \cdot I_n$ with a = 0.5
TISMDP (Simunic, TCAD01)                   Optimized for delay constraint of 3.5% on HP-1 trace
Low delay
High energy savings
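The two simplest experts listed above can be sketched in a few lines. This is an illustration under assumptions: Tbe is read here as the device break-even time, the predictor is seeded with an initial prediction of 0, and the rule "sleep when the predicted idle period exceeds Tbe" is an assumed decision rule, not one stated on the poster. The function names are hypothetical.

```python
def exponential_predict(idle_periods, a=0.5, initial=0.0):
    """Apply I_{n+1} = a * i_n + (1 - a) * I_n over the observed idle periods.

    i_n is the last observed idle period, I_n the previous prediction.
    Returns the prediction for the next idle period.
    """
    prediction = initial
    for observed in idle_periods:
        prediction = a * observed + (1.0 - a) * prediction
    return prediction


def predictive_should_sleep(predicted_idle, t_be):
    """Assumed rule: sleep only if the predicted idle time exceeds Tbe."""
    return predicted_idle > t_be


def fixed_timeout_sleeps(idle_length, t_be, multiple=7):
    """Fixed timeout expert: the device sleeps once idle for multiple * Tbe."""
    return idle_length > multiple * t_be


history = [4.0, 8.0, 2.0]  # made-up idle periods, in seconds
next_idle = exponential_predict(history)
print(next_idle)  # 3.5
print(predictive_should_sleep(next_idle, t_be=1.0))
print(fixed_timeout_sleeps(idle_length=10.0, t_be=1.0))
```

With a = 0.5 the predictor weighs the latest idle period and the running prediction equally, so it reacts quickly to changes while still smoothing single outliers.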
bzip2
HP3 Trace
%delay
Expert
art
DPM: With Online Learning
Preference
I
V
HP-1 Trace
CPU1
CPU2
CPUn
Experimental Setup
Policy
DPM: With Individual Experts
Device
CPU0
AMD quad core CPU
SPEC benchmarks
mem intensive
0.4
520MHz
HDD
CPUs
HP-2 Trace
sixtrack
              HP-1 Trace          HP-2 Trace          HP-3 Trace
              %delay   %energy    %delay   %energy    %delay   %energy
              3.5      45         2.61     37.41      2.55     49.5
              6.13     60.64      5.86     54.2       4.36     61.02
              7.68     65.5       8.59     64.1       5.69     66.28
Freq    %delay    %Energy savings PM-i
                  PM-1    PM-2     PM-3
1.9     29        5.2     0.7      -0.5
1.4     63        8.1     0.1      -2.1
0.8     163       8.1     -6.3     -10.7

1.9     37        4.7     -0.6     -2.1
1.4     86        7.4     -2.4     -5
0.8     223       7.8     -9.0     -14

1.9     32        6       1        -0.1
1.4     76        7.3     -1.7     -4
0.8     202       8       -8       -13

1.9     37        5       -0.5     -2
1.4     86        6       -4.3     -7.2
0.8     227       7       -11      -16.1
Power/Performance Results for HDD HP-1 trace
Comparison with fixed timeout experts
Recent CPUs might perform better with a run-to-sleep policy due to:
- Improved CPU efficiency
- Idle power management support
Summary
- Hypervisor VM scheduler implementation
- Power Management: DPM/DVFS
- Workload characterization aware
- Adaptive Behavior
Supported by NSF-GreenLight project, CNS, Sun Microsystems, UC Micro,