# 1. Introduction

Demand response (DR) to time-varying pricing, such as time-of-use (TOU) pricing and critical peak pricing (CPP), has recently attracted growing attention [1-3]. The benefits of DR include lower electricity bills, reduced peak demand, deferred infrastructure investment, and the mitigation of market power held by a monopoly or an oligopoly [4-6]. These effects can be enhanced by the advanced metering infrastructure (AMI) and by smart grid technologies such as bidirectional communication [7, 8].

The benefits of DR can be obtained with various types of appliances, such as electric vehicle charging systems, dishwashers, and heating, ventilation and air conditioning (HVAC) systems [9, 10]. Among the possible systems for DR, this paper focuses on HVAC systems excluding ventilation, which are denoted as thermostatically controlled loads (TCLs). This is because TCLs account for a considerable percentage of residential electricity consumption [11], and the air within the indoor environment can be used as thermal storage to delay or advance electricity usage [12]. In fact, approximately 48% of residential electricity consumption in the United States in 2009 was used for heating and cooling [13]. In addition, because human operators cannot reasonably respond to frequent changes (e.g., every 5 min) in electricity prices, automated control is a desirable scheme for DR [14]. Thus, the possibility of adding a fully automated DR function to TCLs is another reason for selecting them as appropriate DR resources in the smart grid environment [15-17]. The addition of DR capability to a TCL with an existing on-off controller can be described by a hierarchical structure consisting of supervisory control and regulatory control [18]. In other words, the function of the existing on-off controller corresponds to the regulatory control on the timescale of seconds, and the response to electricity prices for reducing the electricity cost corresponds to the supervisory control on the timescale of minutes. DR capability can be effectively implemented in a TCL with intelligent control methods such as adaptive control and evolutionary algorithms [19-21]. However, traditional on-off control is still commonly used in TCLs because of its low cost [22, 23]. As a result, it is useful to make TCLs with an existing on-off controller newly available as DR resources by adding DR capability. Because the existing on-off control corresponds to the regulatory control in the hierarchical structure, adding DR capability amounts to developing a suitable supervisory control [18].

As a specific supervisory control method for implementing DR capability in a TCL, changing the set point of the room temperature according to the electricity prices is proposed in [24]. A linear function, a step function, and an exponential function are also presented in [24] as possible shapes of the set point change function. However, the choice of a specific set point change function is left as a separate problem to be addressed with other optimization techniques. In [7, 16, 17], the use of on-off control for employing TCLs as DR resources is considered. However, those studies focus not on the implementation of the DR function in an individual TCL, but on the centralized direct load control of aggregate TCLs. The decentralized configuration in [18] can be used as another realization of the supervisory control, in which an additional controller operates independently and the control signals are generated by combining the output of the additional controller with the output of the existing regulatory controller. However, such a decentralized implementation for adding DR capability to a TCL has not been studied yet. Thus, as specific realizations of the supervisory control in the hierarchical structure, two methods are proposed in this paper for adding DR capability to an individual TCL with an existing on-off controller. The first method optimally changes the set point, which is a specific realization of the set point change function. The second method places an identified system without delay in parallel with the actual TCL and its existing on-off controller, which is a specific realization of the decentralized configuration.

The remainder of this paper is organized as follows. The dynamic model of a TCL and the operation of the on-off controller in a TCL are described in Section 2. Further, the formulation of the DR operation of a TCL without delay as a dynamic programming problem and the determination of the optimal solution are also discussed in Section 2. The proposed methods for adding DR capability to a TCL with an existing on-off controller are presented in Section 3. Section 4 is the simulation section, in which the performance of the proposed methods is analyzed through the simulations with an electric heater. The conclusions are provided in Section 5.

# 2. DR Operation of a TCL without Delay

## 2.1 Dynamic model of a TCL

A dynamic model of a TCL that is simple yet captures the actual dynamic behavior with considerable accuracy is proposed in [25]. In particular, the model for a heating load, or a heater, is represented by a differential equation as

For a cooling load or an air-conditioner, the model is expressed as

where T is the interior average temperature; τe is the effective thermal constant; Tf is the ambient temperature, to which T asymptotically converges when the heater or the air-conditioner is turned off; Tg is the temperature gain of the heater or the air-conditioner, which relates to its heat capacity; and w is a binary variable (0: off, 1: on) that denotes the state of the on-off controller. The parameter τe reflects the heat loss conditions of homes and buildings, such as the status of doors and windows (open or closed) and the insulation. For example, if a window is open, heat loss increases so that τe takes a small value, which implies that the room temperature changes quickly toward the ambient temperature [25]. Insulation is likewise reflected in τe: better insulation slows heat loss, so recent improvements in the insulation technology of homes and buildings make τe large. The discretized model of a TCL for computational purposes is presented in [26].

As an alternative dynamic model of a TCL, it is proposed to use the first-order plus dead-time (FOPDT) system described in [27, 28], whose transfer function is represented as

where K, τ and TD are the gain, time constant and deadtime of the system, respectively. This model is more useful when it is necessary to reduce the computational burden for the identification and control of the system [27, 28]. The parameter τ is very similar to τe in (1) and (2), which means that τ reflects the conditions of homes and buildings about heat loss, such as the insulation technology applied in homes and buildings. Different from the model in (1) or (2), the FOPDT model does not include an ambient temperature parameter. This is because the parameter of ambient temperature is implicitly reflected in the transfer function in (3) around the operating range. Actually, the ambient temperature cannot be known exactly when the TCL is operating at a target temperature, because the ambient temperature is a converged value when the TCL is continuously turned off. Moreover, with regard to the on-off control of the TCL, the ambient temperature is not used in determining the control. Thus, the FOPDT model is a reasonable choice as a dynamic model of the TCL when the aspects of identification and control are considered. Meanwhile, as will be described in Section 3, the proposed methods are based on the identification, and the focus of these methods is to develop a control strategy for the TCL for DR. Consequently, the FOPDT model is selected as the dynamic model of the TCL in this paper.
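For reference, an FOPDT system with gain K, time constant τ, and dead time TD is conventionally written in the following textbook form. It is given here as the standard expression for this model class, using the symbols defined above, and not as a quotation of (3).

```latex
H(s) = \frac{K\, e^{-T_D s}}{\tau s + 1}
```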

## 2.2 Operation of on-off controller

The existing on-off controller usually makes a binary decision between the on-state and the off-state [16, 17]. In the on-state, the controller can additionally be assigned a power consumption level. Thus, a variable Pk is introduced to indicate the power consumption level during the on-state. Because Pk is related to the capacity of the TCL, it has a maximum value, denoted as Pmax. Then, for a heating load, the controller output (or the system input) uk is expressed as

For a cooling load, it can be similarly represented as

In (4) and (5), xk, xref, and δ denote the room temperature at time k, the set point of the room temperature, and the dead band of the on-off control, respectively.
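The on-off logic described above can be sketched as follows. This is a minimal illustration and not the exact form of (4) and (5): it assumes that the controller switches at xref − δ and xref + δ and holds its previous output inside the dead band. The function names and the argument `u_prev` are hypothetical.

```python
def on_off_heater(x_k, x_ref, delta, p_k, u_prev):
    """Hysteresis on-off control for a heating load (assumed thresholds).

    Turns on at power p_k below x_ref - delta, turns off above
    x_ref + delta, and keeps the previous output inside the dead band.
    """
    if x_k < x_ref - delta:
        return p_k
    if x_k > x_ref + delta:
        return 0.0
    return u_prev


def on_off_cooler(x_k, x_ref, delta, p_k, u_prev):
    """Mirror image of the heater logic for a cooling load."""
    if x_k > x_ref + delta:
        return p_k
    if x_k < x_ref - delta:
        return 0.0
    return u_prev
```

With these assumed thresholds, a heater with xref = 25 and δ = 1 turns on below 24, turns off above 26, and otherwise keeps its current state.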

## 2.3 Optimal solution of DR operation without delay

Without the delay TD , H(s) in (3) can be represented in the continuous-time domain as

Let the current time be t0 and the control input u(t) be constant for t0 ≤ t < t0 + TS , where TS is the sampling time. Then, the solution of (6) can be determined as

After the time period of TS , the value of (7) becomes

Thus, (6) can be converted into the discrete-time linear system for the sampling time TS as:

where

The sampling time TS corresponds to the timescale of the supervisory control for DR, for example, 5 min for realtime pricing [4, 29].
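As a numerical illustration of the discretization above, the following sketch assumes that the delay-free model corresponds to τ·dx/dt + x = K·u. For this system and a control held constant over one sampling period, the standard zero-order-hold result is A = exp(−TS/τ) and B = K(1 − A). The function names and the parameter values in the usage note are hypothetical and are not taken from the paper.

```python
import math


def discretize_first_order(gain, tau, ts):
    """Zero-order-hold discretization of tau*dx/dt + x = gain*u.

    Returns (a, b) such that x[k+1] = a*x[k] + b*u[k].
    """
    a = math.exp(-ts / tau)
    b = gain * (1.0 - a)
    return a, b


def simulate_continuous(gain, tau, x0, u, ts, steps=100_000):
    """Forward-Euler integration of the continuous model over one period ts."""
    dt = ts / steps
    x = x0
    for _ in range(steps):
        x += dt * (gain * u - x) / tau
    return x
```

For example, with the hypothetical values K = 2, τ = 30 min, and TS = 5 min, the one-step prediction `a * x0 + b * u` agrees with `simulate_continuous` to within the integration error.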

As presented in [30], the DR function can be formulated as a dynamic programming problem as

where

- Jk(xk): cost-to-go function at time k
- gN(xN): terminal cost
- gk(xk,uk): transition cost at time k
- ρk: electricity price at time k
- xk: state related to room temperature at time k
- admissible minimum state at time k
- admissible maximum state at time k
- uk: control related to electricity consumption at time k
- minimum control at time k which satisfies the constraint of (11-c)
- maximum control at time k which satisfies the constraint of (11-c)
- N: number of times control is applied

The cost-to-go function Jk( xk ) can be composed as

From the principle of optimality of dynamic programming, the optimal cost-to-go function can be represented as

By the backward DP procedure from the last time N, the optimal cost-to-go function can be derived as [30]

The first term {ρk − A⋅ρk+1}uk in (14) shows that the optimal cost-to-go function is linear in the control uk . Then, using this linear property, the optimal solution of (11) is determined as

At first glance, (15) seems to depend only on the price because its branch condition contains only the price term. However, the optimal solution (15) also depends on the room temperature because the minimum and maximum controls are not constant but vary with the current room temperature xk . This is why they carry the subscript k. For example, suppose that the set point of a heating system is equal to 26℃ . Then, a control bound may be equal to 0.5 when xk = 25℃ , but may increase to 1.0 when xk = 24℃ .

The optimal solution in (15) can be interpreted as that the electricity consumption decreases when the price at the current time period is greater than that in the next time period. This is the same as the general DR operation. In addition, (15) suggests that the determination of high or low price depends on the system characteristics such as the value of A. In order to achieve a significant benefit from DR, the comfort of the user of the TCL should be inevitably sacrificed to some degree [15, 26]. Thus, it is desirable to give the user the choice of allowed discomfort [26]. For the TCL, this discomfort level corresponds to the temperature range for DR as given as the constraint in (11-c).
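The structure of the optimal solution can be sketched in code as follows. The sketch assumes the delay-free model x[k+1] = A·x[k] + B·u[k] with B > 0, a physical control range of [0, Pmax], and a price-only branch condition ρk − A·ρk+1 > 0, as described in the text. The function names, the clipping of infeasible bounds, and the handling of the equality case are assumptions for illustration.

```python
def _clip(value, low, high):
    return min(max(value, low), high)


def control_bounds(x_k, a, b, x_min, x_max, p_max):
    """Smallest and largest controls that keep a*x_k + b*u within
    [x_min, x_max], clipped to the physical range [0, p_max] (b > 0)."""
    u_lo = _clip((x_min - a * x_k) / b, 0.0, p_max)
    u_hi = _clip((x_max - a * x_k) / b, 0.0, p_max)
    return u_lo, u_hi


def optimal_dr_control(rho_k, rho_next, x_k, a, b, x_min, x_max, p_max):
    """Minimum control when the current price is 'high' relative to
    a * next price, maximum control otherwise (equality treated as 'low')."""
    u_lo, u_hi = control_bounds(x_k, a, b, x_min, x_max, p_max)
    return u_lo if rho_k - a * rho_next > 0 else u_hi
```

With the hypothetical values A = 0.9 and B = 2.0, a current price of 38 against a next price of 40 still selects the minimum control because 38 − 0.9 × 40 > 0, which illustrates how the classification of a price as high or low depends on A.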

# 3. Methods for Adding DR Capability

When it is assumed that the time delay is a multiple of the sampling time TS , that is, TD = mTS , the discrete-time linear system without delay in (9) can be represented as the system with delay

Accordingly, the optimal solution in (15) for the TCL without delay should also be modified as

where the two control values are the minimum and maximum controls, respectively, at the time index k + m that satisfy the constraint of (11-c).

The optimal control in (17) is based on the state at the future time index k + m. Although (17) is valid in a mathematical sense, this dependence on the future state seriously limits its use in practical applications. Therefore, as solutions for practical use, this paper proposes two methods for adding DR capability to a TCL.
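The effect of the delay can be illustrated with a short simulation. The sketch assumes that the delayed system takes the usual input-delay form x[k+1] = A·x[k] + B·u[k−m], and that the controls before k = 0 are zero. The function name and the numerical values are hypothetical.

```python
from collections import deque


def simulate_delayed(a, b, m, x0, controls):
    """Simulate x[k+1] = a*x[k] + b*u[k-m] with u = 0 before k = 0."""
    pending = deque([0.0] * m)  # controls issued but not yet applied
    x = x0
    states = [x]
    for u in controls:
        pending.append(u)
        applied = pending.popleft()  # control issued m steps earlier
        x = a * x + b * applied
        states.append(x)
    return states
```

With m = 2, a control issued at time k first changes the state at time k + 3, which is why a control law evaluated at time k must reason about a state that has not yet been observed.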

## 3.1 Method of changing a set point

In order to resolve the dependence on the unknown future state, (17) can be converted into an optimal law for changing the set point. For a TCL such as an electric heater, a rise in room temperature results from more electricity consumption. Thus, the minimum and maximum controls correspond to the minimum and maximum admissible states, respectively. Then, (17) can be transformed into the optimal set point law as

On the other hand, for a TCL such as an air conditioning system, a decrease in room temperature results from more electricity consumption. Then, the minimum and maximum controls correspond to the maximum and minimum admissible states, respectively, and (17) can be transformed into

Unlike (17), (18) or (19) can be applied to the DR function of a real-life TCL because the minimum and maximum admissible states can be decided in advance as the admissible range of room temperature for the DR function. For example, when the original set point of an electric heater is 25℃ without regard to DR, the admissible range for DR can be set in advance by the user as [24℃ , 26℃] for a certain time duration including the future time index of k + 1 + m in (18) or (19). This method of changing a set point is represented as a block diagram in Fig. 1. The system identification block is intentionally included in Fig. 1 to emphasize that information on the system dynamics is needed when determining the set point in (18) or (19). The performance of this method depends entirely on the operation of the existing controller. Thus, when the existing controller is not operating properly, an undesirable situation may occur in which the addition of DR capability increases the cost of power consumption compared with the case without DR.
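The set point law can be sketched as follows. This is a simplified illustration: it reuses the price-only branch condition ρk − A·ρk+1 > 0 and omits the time-index shifts by m that appear in (18) and (19). The function name and the `heating` flag are hypothetical.

```python
def dr_set_point(rho_k, rho_next, a, x_min, x_max, heating=True):
    """Choose the set point from the admissible range [x_min, x_max].

    A 'high' current price selects the bound that needs less electricity:
    the lower bound for a heater, the upper bound for an air conditioner.
    """
    price_is_high = rho_k - a * rho_next > 0
    if heating:
        return x_min if price_is_high else x_max
    return x_max if price_is_high else x_min
```

For the heater example above, the law returns 24 when the current price is high and 26 when it is low, so that heating is advanced into the cheaper period.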

**Fig. 1.** Block diagram of the method of changing a set point.

## 3.2 Method of paralleling an identified system

Let xk and uk each be separated into two components: the contribution of the original system structure without the DR function, and the additional component owing to the added DR capability. Then, by the superposition property of the linear system, the state xk can be expressed as

Substituting (20) into (16) gives the system equation with separated state variables as

In particular, the original-structure component of uk can be interpreted as the output of the existing on-off controller when the DR function is not included. However, this separation cannot be observed explicitly in real situations.

In the method of paralleling an identified system without delay, the operation of the original system structure is not modified. In other words, the existing on-off controller operates according to the specified reference value, as it originally does as in (4) or (5). Instead, DR capability is implemented by adding the control output for DR determined from the identified system without delay, which is an ideal system for computational purpose, as follows:

Then, the constraints in (11-c) should be changed for the identified system of (22) as

where xref is the specified reference value of the existing controller set by the user regardless of the DR function. Further, from (9) and (15), the optimal DR control of the identified system without delay for a TCL such as an electric heater can be determined as

Similarly, for a TCL such as an air conditioning system, the optimal DR control of the identified system without delay can be expressed as

When the admissible minimum and maximum states of the identified system are set as constant values for all k, then, from the optimal control law for the system without delay, the state of the identified system takes one of these two values. Each value indicates an allowed deviation of temperature for DR set by the user. Then, the optimal control in (24) or (25) can take four possible values as

where the values correspond respectively to the following four cases of state transition as

The values in (26) can be interpreted as the supervisory control bias for DR, which is added to the output of the existing on-off controller. When one of the values in (26) is added to the output of the existing on-off controller, the resulting value may be less than zero or greater than Pmax . However, the lower limit (zero) and the upper limit ( Pmax ) are physical constraints of the TCL; therefore, the saturation function is applied before the TCL as

Because of the separation of the state in (20), the error ek provided to the existing on-off controller becomes

Thus, the reference of the existing controller is effectively shifted from xref by the DR component of the state. It should be noted that this DR component is not a measured quantity; however, it can be computed from the identified system with the optimal control of (24) or (25). This method of paralleling an identified system without delay is represented schematically in Fig. 2.
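The supervisory bias values and the saturation step can be sketched as follows. The sketch assumes that the state of the identified system is a deviation of ±Δ around xref and that each bias is the control that moves the delay-free model x[k+1] = A·x[k] + B·u[k] from one deviation to the other. The function names and the labels "low" and "high" are hypothetical.

```python
def transition_biases(a, b, dev):
    """Control needed for each of the four transitions between the
    deviations -dev ('low') and +dev ('high') of the identified system."""
    levels = {"low": -dev, "high": dev}
    return {
        (src, dst): (levels[dst] - a * levels[src]) / b
        for src in levels
        for dst in levels
    }


def saturate(u, p_max):
    """Physical limits of the TCL input: 0 <= u <= p_max."""
    return min(max(u, 0.0), p_max)


def applied_input(u_on_off, bias, p_max):
    """Input to the TCL: on-off output plus DR bias, then saturation."""
    return saturate(u_on_off + bias, p_max)
```

Because two of the four biases are negative and two are positive, several combinations of on-off output and bias collapse to 0 or Pmax after saturation. This matches the observation that fewer distinct inputs reach the TCL than the eight nominal combinations.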

**Fig. 2.** Block diagram of the method of paralleling an identified system without delay.

## 3.3 Comments on payback effect

The proposed methods are intended for adding DR capability to an individual TCL. However, when the number of TCLs becomes large and all the TCLs operate in the same way according to common electricity prices, the stability of the power system can be affected by the collective DR operations [16]. In particular, an adverse effect such as a new peak demand is likely to occur during an off-peak period just after the DR operation stops [15]. The new peak demand may in turn cause the electricity prices to fluctuate. This phenomenon is similar to the so-called ‘payback effect’ [31-34], which can appear as a form of delayed consumption when the demand is restored after a service interruption.

The solution to this negative effect of DR operation is the diversity of demand [16, 25]. Thus, it is necessary to investigate whether the diversity property holds in the proposed methods so that the negative effects can be avoided. The first diversity element in the proposed methods is the temperature range for DR specified by the user. There is a trade-off between the cost benefit from DR and the comfort level. Thus, the specific temperature range for DR may differ from user to user, which diversifies the distribution of the on- and off-states in time. The second diversity element is the power consumption level of the existing on-off controller set by the user. Different values of Pk in the on-state imply different time durations for the same change of temperature, which in turn diversifies the distribution of the on- and off-states in time. The last diversity element is the set of specific parameter values of the dynamic model of a TCL as given in (3). Because the area and structure of each room differ, the specific dynamic models also differ from one another, which further supports the diversification of the on- and off-states in time.

# 4. Simulation and Verification

## 4.1 Simulation settings

In the simulation, an electric heater is considered as a TCL, whose transfer function is the same as that described in [28] and represented as

where the unit of s is min⁻¹. Four cases are defined according to the value of TD : 0 (no delay), 1, 3, and 5 min. The sampling time TS is set to 5 min, considering the update interval of the electricity price in real-time pricing [4, 29]. According to (10), the parameters of the discrete-time linear system are determined as

The dead band δ in (4) is set to 1, such that the operation of the on-off controller is represented as

The upper limit of the controller output, or Pmax , is set to 1. The output values of 0 and 1 correspond to the actual power consumption of 0 kW and 12 kW, respectively [28]. Simulations are performed for two values of Pk in (32), which are Pk = Pmax = 1.0 and Pk = Pmax / 2 = 0.5 . Simulation time N is set to 200 min. As the settings of room temperature, the following values are used for all k :

The supervisory control bias values in the four cases of state transitions for the method of paralleling an identified system can be determined from (26) and (27) as

As a metric for the performance verification of the proposed methods, the ratio of the cost reduction with DR to the cost without DR, that is, the cost benefit from the added DR capability, is defined as follows:

The cost benefit can be significantly affected by the electricity prices because the DR operation depends on them. Thus, the simulations are performed 100 times with different profiles of time-varying prices, which are generated from the historical data on the locational marginal prices of PJM in 2008 [35]. Then, the average values of the cost benefit and the temperature are analyzed. The simulations are performed with Matlab/Simulink.
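The cost-benefit metric and its averaging over trials can be sketched as follows. The sketch assumes that the cost of one trial is the sum of price times control over all time steps and that the benefit is reported as a fraction rather than a percentage. The function names are hypothetical, and no PJM data are reproduced here.

```python
def electricity_cost(prices, controls):
    """Total cost of one trial: sum of price * control over time steps."""
    return sum(p * u for p, u in zip(prices, controls))


def cost_benefit(cost_without_dr, cost_with_dr):
    """Cost reduction with DR relative to the cost without DR.

    Positive when DR lowers the cost, negative when DR raises it.
    """
    return (cost_without_dr - cost_with_dr) / cost_without_dr


def mean_cost_benefit(trials):
    """Average benefit over (cost_without_dr, cost_with_dr) pairs."""
    benefits = [cost_benefit(c0, c1) for c0, c1 in trials]
    return sum(benefits) / len(benefits)
```

A trial in which DR raises the cost yields a negative value, which corresponds to the negative cost benefit counted per trial in the results.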

## 4.2 Simulation results

The simulation results of 100 trials for Pk = Pmax = 1.0 are summarized in Table 1. Some items are not applicable to the case without DR: the cost benefit is always zero by the definition in (35), and the average temperature is always the same because it does not depend on the time-varying prices. These items are therefore indicated with hyphens in Table 1.

**Table 1.** Simulation results of 100 trials when Pk = Pmax for various values of time delay.

It can be observed from Table 1 that, on average, both methods achieve a positive cost benefit compared with the case without DR. The cost benefit tends to decrease slightly as the delay increases. Although the difference is not large, the cost benefit with the method of changing a set point (abbreviated as “set point method”) is greater than that with the method of paralleling an identified system (abbreviated as “paralleling method”). In terms of the average temperature, there is no significant difference between the proposed methods and the case without DR. Moreover, the average temperature increases as the delay increases, both without DR and with the proposed methods. This is because full power is applied in the on-state, which makes the overshoot above the high temperature bound larger than the undershoot below the low temperature bound. This can be clearly observed from the temperature variation of a selected trial shown in Fig. 3, where the range for DR is also indicated as the shaded band.

**Fig. 3.** Variation in the temperature and control inputs to TCL for a selected trial among 100 trials when Pk = Pmax; (a) TD = 0; (b) TD = 1 min; (c) TD = 3 min; (d) TD = 5 min.

The oscillatory temperature variation is an inherent characteristic of the on-off controller; therefore, the deviation from xref appears even in the cases without DR, and it becomes larger as the delay increases. In other words, the on-off controller produces overshoot and undershoot around the set point of the room temperature even without DR. Similarly, the range for DR in the proposed methods acts as a loose constraint for the on-off controller because the room temperature deviates from this range. The temperature deviations of the proposed methods are larger than those in the case without DR. Nevertheless, the important point is that the proposed methods achieve a cost benefit without causing excessive deviation compared with the case without DR. Between the proposed methods, the paralleling method yields a smaller deviation from the temperature range than the set point method when the delay is small. In contrast, when the delay is large, the set point method becomes the better choice.

The control inputs to the TCL for various values of TD are also shown in Fig. 3 along with the corresponding variations in temperature. For the set point method, the values in the on- and off-states are 1 and 0, respectively, which are the same as in the case without DR. However, the switching instants differ from those in the case without DR. With the set point method, the power consumption is advanced in time by adjusting the set point when the electricity price is low. This is the reason for the cost benefit of the set point method. On the other hand, the values in the on- and off-states with the paralleling method differ from those in the case without DR. In other words, eight values are possible by combining the on/off states with the four values from the identified system given in (34). However, only four values appear as the actual control signal owing to the saturation constraints of the existing on-off controller. This differentiation of the control input contributes to the cost benefit of the paralleling method.

Although the proposed methods achieve a cost benefit on average compared with the case without DR, the cost with the proposed methods can be higher for specific price profiles. The numbers of trials with a negative cost benefit for various values of delay are listed in Table 2. The reason for the negative cost benefit is the larger overshoot in the high temperature region, which results in more power consumption. The negative cost benefit occurs slightly more often when the delay is greater. Between the two methods, the set point method is slightly better than the paralleling method in terms of the occurrence of the negative cost benefit.

**Table 2.** Number of trials among 100 trials with the negative cost benefit when Pk = Pmax.

In the paralleling method in particular, the saturation function of the existing on-off controller prevents the control input of the identified system from being fully applied. This negative saturation effect can be relieved by setting a lower power in the on-state of the on-off controller, for example, Pk = Pmax / 2 = 0.5 instead of Pk = Pmax = 1.0 . The simulation results of 100 trials for Pk = Pmax / 2 = 0.5 are summarized in Table 3. It can be observed from Table 3 that the performance of the paralleling method is significantly improved compared with the results for Pk = Pmax given in Table 1. Moreover, in contrast to the results for Pk = Pmax, the cost benefit with the paralleling method is better than that with the set point method. In the same vein, according to the numbers of trials with a negative cost benefit for Pk = Pmax / 2 listed in Table 4, a negative cost benefit does not occur with the paralleling method. These performance improvements for the paralleling method result from the increase in the number of possible control input values. In other words, six of the eight possible values are included as the final control input when Pk = Pmax/2, whereas only four values are included when Pk = Pmax. This increase in the number of control input values for the paralleling method can be clearly observed in the time variation of the control input for a selected trial shown in Fig. 4.

**Table 3.** Simulation results of 100 trials when Pk = Pmax/2 for various values of time delay.

**Table 4.** Number of trials among 100 trials with the negative cost benefit when Pk = Pmax/2.

**Fig. 4.** Variation in the temperature and control inputs to TCL for a selected trial among 100 trials when Pk = Pmax/2; (a) TD = 0; (b) TD = 1 min; (c) TD = 3 min; (d) TD = 5 min.

Compared with the results for Pk = Pmax = 1.0 , the following observations are common to the two methods when Pk = Pmax / 2 = 0.5 : the average temperature is closer to the user set point; the standard deviations of both the temperature and the cost benefit become smaller; and the possibility of a negative cost benefit becomes lower. Consequently, when the cost benefit must be achieved consistently by adding DR capability to a TCL with an existing on-off controller, it is recommended to set a suitable power consumption level for the on-state of the existing controller instead of the maximum power level. However, when the power consumption level in the on-state is close or equal to the maximum power level, the set point method is recommended over the paralleling method because its cost benefit is slightly better and its possibility of a negative cost benefit is slightly lower.

# 5. Conclusion

Adding the DR function to TCLs in the smart grid environment is necessary for achieving the desirable benefits of DR because a TCL can be one of the most appropriate DR resources. A considerable number of TCLs still use an on-off controller. Thus, two methods were proposed to add the DR function to a TCL with an existing on-off controller: a method of changing a set point and a method of paralleling an identified system without delay. These methods were derived from the optimal solution of dynamic programming.

The proposed methods were verified through simulations with an electric heater for different power consumption levels in the on-state. The simulation results show that both proposed methods achieve a positive cost benefit on average compared with the case without DR. Although a cost benefit can be obtained when the maximum power is set in the on-state, a medium power consumption level in the on-state brings further benefits, such as a reduced temperature deviation from the specified temperature range for DR and a lower possibility of a negative cost benefit. Consequently, it is important to set a suitable power consumption level for the on-state of the existing on-off controller in order to achieve the cost benefit consistently. The results also suggest that, particularly when the delay is large, the method of changing a set point can be a better choice than the method of paralleling an identified system.