|Views: 3
|Likes: 0

Published by Mohammad Bina

Predictive Dynamic Thermal Management for Multicore Systems (PDTM)

Predictive Dynamic Thermal Management for Multicore Systems (PDTM)

See more

See less

Predictive Dynamic Thermal Management for MulticoreSystems

InchoonYeo,ChihChunLiu,andEunJungKimDepartmentofComputerScienceTexasA&MUniversityCollegeStation,TX77840{ryanyeo,chihchun,ejkim}@cs.tamu.edu

ABSTRACT

Recently, processor power density has been increasing at analarming rate resulting in high on-chip temperature. Highertemperature increases current leakage and causes poor re-liability. In this paper, we propose a Predictive DynamicThermal Management (PDTM) based on Application-basedThermal Model (ABTM) and Core-based Thermal Model(CBTM) in the multicore systems. ABTM predicts futuretemperature based on the application speciﬁc thermal be-havior, while CBTM estimates core temperature pattern bysteady state temperature and workload. The accuracy of ourprediction model is 1.6% error in average compared to themodel in HybDTM [8], which has at most 5% error. Basedon predicted temperature from ABTM and CBTM, the pro-posed PDTM can maintain the system temperature belowa desired level by moving the running application from thepossible overheated core to the future coolest core (migra-tion) and reducing the processor resources (priority schedul-ing) within multicore systems. PDTM enables the explo-ration of the tradeoﬀ between throughput and fairness intemperature-constrained multicore systems. We implementPDTM on Intel’s Quad-Core system with a speciﬁc devicedriver to access Digital Thermal Sensor (DTS). Comparedagainst Linux standard scheduler, PDTM can decrease av-erage temperature about 10%, and peak temperature by5

◦

C with negligible impact of performance under 1%, whilerunning single

SPEC2006

benchmark. Moreover, our PDTMoutperforms HRTM [10] in reducing average temperature byabout 7% and peak temperature by about 3

◦

C with perfor-mance overhead by 0.15% when running single benchmark.

Categories and Subject Descriptors

C.4 [

PERFORMANCE OF SYSTEMS

]: Reliability, Avail-ability, and Serviceability

General Terms

Design, Experimentation, Temperature, Performance

Permission to make digital or hard copies of all or part of this work forpersonal or classroom use is granted without fee provided that copies arenot made or distributed for proﬁt or commercial advantage and that copiesbear this notice and the full citation on the ﬁrst page. To copy otherwise, torepublish, to post on servers or to redistribute to lists, requires prior speciﬁcpermission and/or a fee.

DAC 2008,

June 8–13, 2008, Anaheim, California, USA.Copyright 2008 ACM 978-1-60558-115-6/08/0006 ...$5.00.

Keywords

Dynamic Thermal Management, Operating System, Tem-perature

1. INTRODUCTION

Chip multiprocessors (CMPs) have already been employedas the main trend in new generation processors. A CMP in-cludes multiple cores within one single die area to increasethe microprocessors’ performance. However, the increasedcomplexity and decreased feature sizes have caused very highpower density in modern processors. The power dissipatedis converted into heat and the processors are pushing thelimits of packaging and cooling solutions. The increased op-erating temperature potentially aﬀects the system reliability.Moreover, leakage power increases exponentially with oper-ating temperature. Increasing leakage power can furtherraise the temperature resulting in a thermal runaway [3].Hence, there is a need to control temperature at all levels of system design.Recently, many hardware and software-based DynamicThermal Management (DTM) [3, 5, 6, 13, 14] techniqueshave been proposed in sense of that they, except [14], startto control the temperature after the current temperaturereaches at thecritical temperaturethreshold. DynamicTher-mal Management can be characterized as temporal or spa-tial. Temporal management schemes, such as Dynamic Fre-quency Scaling (DFS),Dynamic Voltage Scaling (DVS),clockgating, slowdown the CPU computation to reduce heat dis-sipation. Although they could eﬀectively reduce tempera-ture, they incur signiﬁcant performance overhead. On theother hand, spatial management schemes, such as threadmigration, can reduce the temperature without throttlingthe computation [10]. However, neighboring thermal eﬀectand application thermal behavior are not considered in priorworks. Due to packaging technology in CMP, the tempera-ture of each core will be aﬀected by other cores. The temper-ature diﬀerential between cores can be as much as 10

∼

15

◦

C [11]. There are signiﬁcant variations in the thermal be-havior among diﬀerent applications [11, 14].Motivated by these facts, we propose a Predictive Dy-namic Thermal Management (PDTM) in the context of mul-ticore systems. Our PDTM scheme utilizes an advancedfuture temperature prediction model for each core to esti-mate the thermal behavior considering both core tempera-ture and applications temperature variations and take ap-propriate measures to avoid thermal emergencies. To theauthors’ best knowledge, no prior attempt has been made

734

41.2

to implement the temperature prediction model along withthe thermal-aware scheduling on a real four-core productunder Linux environment. The experimental results on In-tel’s Quad-Core system running two

SPEC2006

benchmarkssimultaneously show the proposed PDTM lowers temper-ature by about 5% in average and reduces up to 3

◦

C inpeak temperature with only at most 8% performance over-head compared to Linux standard scheduler without DTM.Moreover, to validate the presented PDTM, we also rebuiltHRTM [10], and our PDTM outperforms HRTM in reducingaverage temperature by about 7%, performance overhead by0.15%, and peak temperature by about 3

◦

C, while runningsingle benchmark. The main contributions of this paper aresummarized as follows:

•

We propose an advanced future temperature predic-tion model for multicore systems with only 1.6% errorin average.

•

We demonstrate that our scheme outperforms the ex-isting DTM schemes (HRTM and HybDTM) and pro-vides thermal fairness among cores.

•

The proposed PDTM incurs low performance overheadwhich is only 1% when running single benchmark, and8% when running two benchmarks simultaneously.

•

Most importantly, there is no additional hardware unitrequired for our prediction model and thermal-awarescheme. It means that our model and scheme is scal-able for all the multicore systems and can be appliedto real-world CMP products.The remainder of the paper is organized as follows : Theexisting DTM will be introduced in Section 2. Section 3provides the explanation of proposed temperature predictionmodel-ABTM and CBTM in detail. In Section 4, we explainthe system overview of PDTM and how to apply PDTM inmulticore systems. In Section 5, the implementation andanalysis results are discussed and conclusions are providedin Section 6.

2. RELATED WORK

Several thermal control techniques have been proposedand applied in modern processors via either hardware-basedor software mechanisms [3, 12]. Hardware-based DTM mech-anisms, such as Dynamic Frequency Scaling (DFS) and Dy-namic Voltage Scaling (DVS), as well as clock gating, areable to eﬀectively reduce processor’s temperature and guar-antee thermal safety, but with high execution performanceoverhead. Therefore, as the multicore processors becomepopular, some software-based mechanisms, such as powerdensity management in a CMP has been studied in [10].The proposed mechanism, called heat-and-run, has two keycomponents: SMT thread assignment and CMP thread mi-gration. Within heat-and-run the SMT thread assignmentattempts to increase processor-resource utilization by co-scheduling threads which use complementary resources; onthe other hand, the CMP thread migration cools overheatedcores by migrating threads away from overheated cores andassigning them to free SMT contexts on alternate cores tomaintain throughput while allowing cooling overheated cores.They showed that for four cores CMP running ﬁve threads,heat-and-run thread assignment (HRTA) and heat-and-runthread migration (HRTM) achieve 9% higher average through-put than stop-go and 6% higher average throughput thanDVS. Moreover, when performance is constrained by tem-perature, the performance gains brought by thread migra-tion and the importance of limiting the migration frequencyto reduce performance overhead has been conﬁrmed in [9].In [9], a new migration method for temperature-constrainedmulticore is proposed to exchange threads whenever the si-multaneous occurrence of a cold and a hot core is detected.The authors demonstrate that their method yields the samethroughput with HRTM, but requires much less migrations.However, both of these two works above are based on sim-ulated results, and neglect the thermal-correlation betweencores. The power dissipated by the rest of the chip is as-sumed to be negligible. Most importantly, the migrationaction in [9] above is triggered by the current tempera-ture (when the temperature is higher than maximum al-lowed temperature) in these two papers; however, instead of considering the current temperature, we believe that an ac-curate future temperature prediction model could performbetter in lowering the peak temperature.In [8], HybDTM, a methodology for ﬁne-grained, coor-dinated thermal management using both software (prior-ity scheduling) and hardware (clock gating) techniques, isproposed. In order to estimate temperature, HybDTM pro-posed a regression-based thermal model based on using hard-ware performance counters. However, HybDTM can not ef-fectively reduce overheat temperature without performanceoverhead, because real temperature cannot be estimatedsolely by hardware performance counter, and both of pri-ority scheduling and clock gating will introduce high perfor-mance overhead. Their performance overhead is 9.9% com-pared to the case without any DTM. Therefore, we proposePDTM which includes an advanced future temperature pre-diction model with very low performance overhead for thereal-world product (Intel’s Quad-Core). Rather than us-ing the performance counter for temperature, we utilize theregression analysis for application-based thermal behavioras ﬁne-grained scheme, and core-based thermal behavior ascoarse-grained scheme to provide very accurate temperatureprediction model for DTM.

3. PREDICTIVE THERMAL MODEL

In this section, we present a thermal model to predict thefuture temperature at any point during the execution of aspeciﬁc application. The model is based on our observationthat the rate of change in temperature during the execu-tion of an application depends on the diﬀerence betweenthe current temperature and the steady state temperatureof the application

1

. Moreover, the thermal behavior is dif-ferent among applications. Since the system temperatureis aﬀected by both each application’s thermal behavior andeach processors thermal pattern, we deﬁne the application-based thermal model and the processor-based thermal modelin this paper.

3.1 The Application-based Thermal Model

The Application-based Thermal Model (ABTM) accom-modates short-term thermal behavior in order to predict the

1

The steady state temperature of an application is deﬁnedas the temperature the system would reach if the applicationis executed inﬁnitely.

735

0501001502002503005560657075808590time(sec)

t e m p e r a t u r e ( C e l s i u s )

Figure 1: Real temperature of one core on running

bzip2

benchmark

future temperature in ﬁne-grained. As shown in Figure 1,there are rapid temperature changes even when the workloadis statically 100%. Speciﬁcally, this model ﬁrst derives thethermal behavior from local intervals (short term temper-ature reactions) and then predicts the future temperatureby incorporating this behavior into a regression based ap-proach that is known as the Recursive Least Square Method(RLSM). In the general least-squares problem, the outputof a linear model

y

is given by the linear parameterized ex-pression

y

=

θ

1

f

1

(

u

) +

θ

2

f

2

(

u

) +

···

+

θ

n

f

n

(

u

)

,

(1)where

u

= [

u

1

,

u

2

,

···

,

u

n

] is the model’s input vector,

f

1

,...,

f

n

are known functions of

u

, and

θ

1

,

θ

2

,...,

θ

n

are un-known parameters to be estimated. In our study, let theinput vector,

u

, and the output vector,

y

, be time units andworking temperature respectively. To identify the unknownparameters

θ

i

, experiments usually have to be performed toobtain a training data set composed of data pairs (

u

i

;

y

i

),

i

= 1,

···

,

m

}

. Expressed in matrix notation, the followingequation can be obtained:

Y

=

Xθ

where

X

is an

m

×

n

matrix:

X

=

f

1

(

u

1

)

···

f

n

(

u

1

).........

f

1

(

u

m

)

···

f

n

(

u

m

)

(2)

θ

is a

n

×

1 unknown parameter vector:

θ

= [

θ

1

,θ

2

,...,θ

n

]

T

(3)and

Y

is a

n

×

1 output vector:

Y

= [

Y

1

,Y

2

,...,Y

n

]

T

(4)If

X

T

X

is nonsingular, the least square estimator can bederived as

θ

= (

X

T

X

)

−

1

X

T

Y,

(5)Denote the

i

th

row of the joint data matrix [

X

:

Y

] by[

X

T i

:

Y

i

]. Suppose that a new data pair [

X

T k

+1

:

Y

k

+1

]becomes available as the (

k

+ 1)

th

entry in the data set.To avoid recalculating the least squares estimator using allinput and output data samples, let

P

k

= (

X

T

X

)

−

1

for the

k

th

in Equation (5). Likewise, the recursive least squaremethod at (

k

+ 1)

th

can be developed as

P

k

+1

=

P

k

−

P

k

x

k

+1

x

T k

+1

P

k

1 +

y

T k

+1

P

k

y

k

+1

,

(6)

t

i

+△t

temperature trigger threshold migration threshold

△t

t

i

temperature time currenttemperature future temperature

Figure 2: The calculation of

∆

t

(migration time) us-ing ABTM

where

y

k

+1

is the output vector and

x

k

+1

is input vector of of

f

k

+1

.

θ

k

+1

=

θ

k

+

P

k

+1

x

k

+1

(

y

k

+1

−

x

T k

+1

θ

k

) (7)where matrix

P

is an intermediate variable in the algorithm.Eventually, we get future temperature,

y

n

, by an applicationthermal behavior using the current

θ

vector. Detailed de-scriptions of the Least Square Method and Recursive LeastSquare Method can be found in the literatures [4]. WithEquation (1), ABTM can predict future temperature for anapplication as shown in Figure 2. How the ABTM appliedin PDTM is explained in 3.3

3.2 The Core-based Thermal Model

The heat transfer equations model the steady state tem-perature of systems with heat sources [7]. It has been ob-served in those models that the temperature changes expo-nentially to the steady state starting from any initial tem-perature. In other words, the rate of temperature change isproportional to the diﬀerence between the current tempera-ture and the steady state [7]. We initially assume that thesteady state temperature of the application is known. Laterwe will relax this constraint. Let

T

ss

be the steady statetemperature of an application. Let

T

(

t

) represent the tem-perature at time t and let

T

init

be the temperature when anapplication starts execution (

T

(0)=

T

init

). The predictionmodel assumes that the rate of change of temperature isproportional to the diﬀerence between the current temper-ature and the steady state temperature of the application[15]. Thus

dT dt

=

b

×

(

T

ss

−

T

)

.

(8)Solving Equation (8) with

T

(0) =

T

init

and

T

(

∞

)=

T

ss

, weget

T

(

t

) =

T

ss

−

(

T

ss

−

T

init

)

×

e

−

bt

(9)where

b

is a processor-speciﬁc constant. The value of

b

is de-termined using Equation (8) by observing heating and cool-ing curves corresponding to all

SPEC2006

benchmarks on thecore. Also, since the value of

b

is diﬀerent to the amountof workload,

b

should be determined by the workload oneach processor. Running several benchmarks, we obtained

b

= 0

.

009 when the workload is 100%. We precompute thesteady state temperature of an application oﬄine. Then byrearranging Equation (9), we get the steady state tempera-

736