Journal of University of Shanghai for Science and Technology, 2020, Vol. 42, Issue 5: 467-478


Real-time pricing strategy in consideration of renewable energy sources by using Markov decision process for its uncertainty
XU Zhihong1,2, GAO Yan1, CHENG Panhong1
1. Business School, University of Shanghai for Science and Technology, Shanghai 200093, China;
2. Public Teaching Department, Rizhao Polytechnic, Rizhao 276826, China
Abstract: A Markov decision process is applied to represent the uncertainty in the scheduling of various types of appliances and in renewable energy generation, and a new model that maximizes expected user welfare is proposed. To balance the utility of users' electricity consumption (the degree of satisfaction users obtain from the purchased electricity) against the payment cost, a weight factor is introduced as an unknown variable. The model is solved by an improved simulated annealing algorithm, and the optimal weight factor is obtained. The simulation results verify the rationality of the model and the feasibility of the algorithm, which can guide users to optimize energy allocation and schedule different appliances so as to maximize the overall interest of users.
Key words: renewable energy source; Markov decision process; simulated annealing algorithm; real-time pricing

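a. Taking different types of appliances into account and considering the uncertainty of users' electricity consumption behavior, appliances are divided into three classes: must-run, elastic, and semielastic. A weight factor is introduced as an unknown variable to balance user utility against payment cost, and an expected user welfare maximization model is established.

b. A Markov decision process is used to represent the uncertainty of renewable energy generation. The effect of renewable energy on users' use of each class of appliance is studied, guiding users to optimize energy allocation and schedule the appliances.

c. Based on the characteristics of users' consumption behavior and the uncertainty of renewable energy generation, the problem is discretized into several subproblems, and a user-side simulated annealing algorithm is proposed that solves the optimization problem and yields the optimal scheduling strategy.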

1 Problem formulation

1.1 Markov decision process

1.2 Appliance classification

 $l_{a_{u,i}} = (l_{a_{u,i}}^t,\; l_{a_{u,i}}^{t + 1},\; \cdots ,\; l_{a_{u,i}}^K)$ (1)
1.2.1 Must-run appliances

1.2.2 Elastic appliances

 $0 \leqslant l_{{a_{u,i}}}^k \leqslant r_{{a_{u,i}}}^{\max },\;\;\forall k \in \mathbb{K}$ (2)

 $\begin{aligned} P\{ S_{B_i}(k + 1) &= l_{B_i}^{k + 1} \mid S_{B_i}(0) = l_{B_i}^0,\; S_{B_i}(1) = l_{B_i}^1, \cdots ,S_{B_i}(k) = l_{B_i}^k \} \\ &= P\{ S_{B_i}(k + 1) = l_{B_i}^{k + 1} \mid S_{B_i}(k) = l_{B_i}^k \} \end{aligned}$

 $\sum\limits_{n = 1}^N {{\text{π}} _{{B_i}}^n(k) = 1}$ (3)

$p_{k,k + 1}^i = P\{ S_{B_i}(k + 1) = l_{B_i}^{k + 1} \mid S_{B_i}(k) = l_{B_i}^k \}$. Given the electricity price $\lambda^k$ at time $k$, the state transition probability matrix is expressed as

 $P_{k,k + 1}^i(\lambda^k) = \begin{bmatrix} a_{11}^i + b_{11}^i\lambda^k & \cdots & a_{1N}^i + b_{1N}^i\lambda^k \\ a_{21}^i + b_{21}^i\lambda^k & \cdots & a_{2N}^i + b_{2N}^i\lambda^k \\ \vdots & & \vdots \\ a_{N1}^i + b_{N1}^i\lambda^k & \cdots & a_{NN}^i + b_{NN}^i\lambda^k \end{bmatrix}$ (4)

 $\begin{split}&0 \leqslant a_{ef}^i + b_{ef}^i{\lambda ^k} \leqslant 1,\\ &\qquad e = 1,2, \cdots ,N,\;\;f = 1,2, \cdots ,N\end{split}$ (5)

 $\sum\limits_{e = 1}^N {\left( a_{ef}^i + b_{ef}^i{\lambda ^k} \right)} = 1, \;f = 1,2, \cdots ,N$ (6)

Let $\bar A_{ef}^i = [a_{ef}^i,b_{ef}^i]$. Clearly, $P_{k,k + 1}^i({\lambda ^k})$ is determined by ${\lambda ^k}$ and $\bar A_{ef}^i$, and the value of $P_{k,k + 1}^i({\lambda ^k})$ varies inversely with the price ${\lambda ^k}$.

 $\begin{split} \bar l_{{B_i}}^k = E\left[ {l_{{B_i}}^k} \right] &= S_{{B_i}}^1(k){\text{π}} _1^i(k) + S_{{B_i}}^2(k){\text{π}} _2^i(k) + \cdots +\\& S_{{B_i}}^N(k){\text{π}} _N^i(k),\;\;S_{{B_i}}^n \in {S_{{B_i}}} \end{split}$ (7)
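Equations (4) to (7) can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: the coefficient matrices `A` and `B`, the three load levels in `LEVELS`, and all function names are hypothetical. It also assumes that the state distribution is propagated as $\pi(k+1) = P\,\pi(k)$, which matches Eq. (6) as written (the sum over $e$ equals one for each $f$).

```python
# Hypothetical coefficients: A[e][f] = a_ef, B[e][f] = b_ef (illustrative values only).
A = [[0.2, 0.1, 0.1],
     [0.3, 0.4, 0.3],
     [0.5, 0.5, 0.6]]
B = [[0.2, 0.2, 0.2],
     [0.0, 0.0, 0.0],
     [-0.2, -0.2, -0.2]]
LEVELS = [0.0, 1.0, 2.0]  # assumed load of each state, in kW*h


def transition_matrix(price):
    """Eq. (4): entry (e, f) is a_ef + b_ef * price."""
    n = len(A)
    return [[A[e][f] + B[e][f] * price for f in range(n)] for e in range(n)]


def is_valid(matrix, tol=1e-9):
    """Check Eq. (5) (entries in [0, 1]) and Eq. (6) (sum over e is 1 for each f)."""
    n = len(matrix)
    entries_ok = all(-tol <= matrix[e][f] <= 1 + tol
                     for e in range(n) for f in range(n))
    columns_ok = all(abs(sum(matrix[e][f] for e in range(n)) - 1.0) <= tol
                     for f in range(n))
    return entries_ok and columns_ok


def step(matrix, dist):
    """Propagate the state distribution one slot: new[e] = sum_f P[e][f] * dist[f]."""
    n = len(matrix)
    return [sum(matrix[e][f] * dist[f] for f in range(n)) for e in range(n)]


def expected_load(dist):
    """Eq. (7): probability-weighted sum of the load levels."""
    return sum(level * p for level, p in zip(LEVELS, dist))


uniform = [1 / 3, 1 / 3, 1 / 3]
load_cheap = expected_load(step(transition_matrix(0.0), uniform))
load_dear = expected_load(step(transition_matrix(1.0), uniform))
print(load_cheap, load_dear)
```

With these illustrative coefficients, the higher price shifts probability toward the low-load state, so the expected load at price 1.0 is lower than at price 0.0.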

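1.2.3 Semielastic appliances

Appliances of type ${C_i}$ are semielastic, for example washing machines, dryers, and dishwashers. Within a given time interval, the operating time of a semielastic appliance can be chosen freely, but its total energy consumption is fixed. The start time of such an appliance can be postponed, but its operation must be continuous. That is, once started, the task must be completed within a certain time; stopping the operation before the task is complete is costly (for example, a rice cooker).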

 $\tilde l_{a_{u,i},d}^k = \begin{cases} g_{a_{u,i},d}, & \text{if}\;j \in \left\{ 1,2, \cdots ,\left[ \dfrac{E'_{a_{u,i},d}}{g_{a_{u,i},d}} \right] \right\}\;\text{and}\;k_j \in \mathbb{K}'_{a_{u,i},d} \\ E'_{a_{u,i},d} - g_{a_{u,i},d} \times \left[ \dfrac{E'_{a_{u,i},d}}{g_{a_{u,i},d}} \right], & \text{if}\;j = \left[ \dfrac{E'_{a_{u,i},d}}{g_{a_{u,i},d}} \right] + 1\;\text{and}\;k_j \in \mathbb{K}'_{a_{u,i},d} \\ 0, & \text{otherwise} \end{cases}$ (8)
 $\sum\limits_{k \in {{\mathbb{K}'}_{a_{u,i}^{},d}}} {\tilde l_{a_{u,i}^{},d}^k} = {E'_{a_{u,i}^{},d}}$ (9)
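The consumption pattern in Eqs. (8) and (9) can be checked with a short sketch. It is an illustration under one assumption: the bracket $[\cdot]$ is read as the floor function. The function name `semielastic_profile` and the sample values are hypothetical.

```python
import math


def semielastic_profile(total_energy, rated_power):
    """Per-slot consumption of one run, following Eq. (8).

    The appliance draws rated_power for floor(E'/g) slots and the remainder
    in one further slot; [.] in Eq. (8) is read as the floor (an assumption).
    """
    if total_energy < 0 or rated_power <= 0:
        raise ValueError("total_energy must be >= 0 and rated_power > 0")
    full_slots = math.floor(total_energy / rated_power)
    remainder = total_energy - rated_power * full_slots
    return [rated_power] * full_slots + [remainder]


profile = semielastic_profile(total_energy=2.5, rated_power=1.0)
print(profile, sum(profile))  # [1.0, 1.0, 0.5] 2.5
```

The slot values sum to the total energy requirement, which is the constraint stated in Eq. (9).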

 $S_{{a_{u,i}}}^k \in \left\{ 0,1, \cdots , {D_{a_{u,i}^{}}}\right\}$

$W_{{a_{u,i}},d}^k \in \Bigg\{ {0,1, \cdots ,} {\left\{ {\left[ {\dfrac{{{{E'}_{a_{u,i}^{},d}}}}{{{g_{a_{u,i}^{},d}}}}} \right] + 1} \right\}} \Bigg\}$

$V_{{a_{u,i}},d}^k \in \left\{ { 0,1, \cdots ,} {({\beta _{a_{u,i}^{},d}} - {\alpha _{a_{u,i}^{},d}})} - {\left\{ {\left[ {\dfrac{{{{E'}_{a_{u,i}^{},d}}}}{{{g_{a_{u,i}^{},d}}}}} \right] + 1} \right\}} \right\}$

${w_{a_{u,i},d}}$ and ${v_{a_{u,i},d}}$ denote the maximum working time and the maximum waiting time, respectively.

${w_{a_{u,i},d}} = \left\{ {\left[ {\dfrac{{E'_{a_{u,i},d}}}{{g_{a_{u,i},d}}}} \right] + 1} \right\}$

${v_{a_{u,i}^{},d}} = ({\beta _{a_{u,i}^{},d}} - {\alpha _{a_{u,i}^{},d}}) - \left\{ {\left[ {\dfrac{{{{E'}_{a_{u,i}^{},d}}}}{{{g_{a_{u,i}^{},d}}}}} \right] + 1} \right\}$

 Fig. 1 Markov decision process of semielastic appliances with 2 modes

 $l_{a_{u,i}}^k = \begin{cases} z_{a_{u,i}}^{k^*}(L_{a_{u,i}}^k)\, g_{a_{u,i},d}, & \text{if}\;0 \leqslant W_{a_{u,i},d} \leqslant \left[ \dfrac{E'_{a_{u,i},d}}{g_{a_{u,i},d}} \right] \\ z_{a_{u,i}}^{k^*}(L_{a_{u,i}}^k) \left( E'_{a_{u,i},d} - g_{a_{u,i},d} \left[ \dfrac{E'_{a_{u,i},d}}{g_{a_{u,i},d}} \right] \right), & \text{if}\;\left[ \dfrac{E'_{a_{u,i},d}}{g_{a_{u,i},d}} \right] < W_{a_{u,i},d} \leqslant w_{a_{u,i},d} \\ 0, & \text{otherwise} \end{cases}$ (10)
 $\begin{split} z_{{a_{u,i}}}^{{k^*}}(&S_{{a_{u,i}}}^k,W_{{a_{u,i}},d}^k,V_{{a_{u,i}},d}^k) = 1,\\ &\forall {\beta _{{a_{u,i}},d}} - {w_{{a_{u,i}},d}}\; \leqslant k \leqslant {\beta _{{a_{u,i}},d}}\end{split}$ (11)

 $L_{{a_{u,i}}}^{k + 1} = \left[ \begin{array}{l} {\mathbb{S}_{a_{u,i}^{}}}(L_{{a_{u,i}}}^k,\theta _{{a_{u,i}}}^{k + 1}) \\ {\mathbb{W}_{a_{u,i}^{}}}(L_{{a_{u,i}}}^k,z_{{a_{u,i}}}^k,\theta _{{a_{u,i}}}^{k + 1}) \\ {\mathbb{V}_{a_{u,i}^{}}}(L_{{a_{u,i}}}^k,z_{{a_{u,i}}}^k,\theta _{{a_{u,i}}}^{k + 1}) \end{array} \right],\;\;\;\;k \in \mathbb{K}$ (12)

 $\mathbb{S}_{a_{u,i}}(L_{a_{u,i}}^k,\theta _{a_{u,i}}^{k + 1}) = \begin{cases} \theta _{a_{u,i}}^{k + 1}, & \text{if}\;S_{a_{u,i}}^k = 0 \\ S_{a_{u,i}}^k, & \text{otherwise} \end{cases}$ (13)
 $\mathbb{W}_{a_{u,i}}(L_{a_{u,i}}^k,z_{a_{u,i}}^k,\theta _{a_{u,i}}^{k + 1}) = \begin{cases} W_{a_{u,i},d} - 1, & \text{if}\;S_{a_{u,i}}^k = d_{a_{u,i}},\;2 \leqslant W_{a_{u,i},d} \leqslant w_{a_{u,i},d},\;z_{a_{u,i}}^k = 1 \\ W_{a_{u,i},d}, & \text{if}\;W_{a_{u,i},d} \geqslant 1,\;z_{a_{u,i}}^k = 0 \\ w_{a_{u,i},d}, & \text{if}\;0 \leqslant W_{a_{u,i},d} \leqslant 1\;\text{and}\;\theta _{a_{u,i}}^{k + 1} = d_{a_{u,i}} \\ 0, & \text{if}\;0 \leqslant W_{a_{u,i},d} \leqslant 1\;\text{and}\;\theta _{a_{u,i}}^{k + 1} = 0 \end{cases}$ (14)
 $\mathbb{V}_{a_{u,i}}(L_{a_{u,i}}^k,z_{a_{u,i}}^k,\theta _{a_{u,i}}^{k + 1}) = \begin{cases} V_{a_{u,i},d}^k - 1, & \text{if}\;S_{a_{u,i}}^k = d_{a_{u,i}},\;W_{a_{u,i},d}^k = w_{a_{u,i},d},\;z_{a_{u,i}}^k = 0 \\ v_{a_{u,i}}^k, & \text{if}\;0 \leqslant W_{a_{u,i},d}^k \leqslant 1,\;\theta _{a_{u,i}}^{k + 1} = d_{a_{u,i}} \\ 0, & \text{if}\;S_{a_{u,i}}^k = d_{a_{u,i}},\;2 \leqslant W_{a_{u,i},d}^k \leqslant w_{a_{u,i},d},\;z_{a_{u,i}}^k = 1;\;\text{or if}\;W_{a_{u,i},d}^k = 1\;\text{and}\;\theta _{a_{u,i}}^{k + 1} = 0 \end{cases}$ (15)
1.3 Renewable energy

 $\begin{split} P ({F_i},{F_j}) &= \dfrac{{{\delta _{{F_i},{F_j}}}}}{{\displaystyle\sum\limits_{{F_j} = 1}^F {{\delta _{{F_i},{F_j}}}} }},\;\forall {F_i},{F_j} \in F\\ &\displaystyle\sum\limits_{{F_j} = 1}^F {P({F_i},{F_j})} = 1 \end{split}$ (16)

 $Pcdf({F_i},{F_k}) = \sum\limits_{{F_j} = 1}^{{F_k}} {{{P}} ({F_i},{F_j})} ,\;\;\forall {F_i},{F_k} \in F$ (17)

 $V = {V_{\max }} + {Z_{{F_i}}}({V_{\max }} - {V_{\min }})$ (18)

 $y_i^k = \frac{1}{2} \rho A {(V(k))^3} {C_p}$ (19)
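Equations (16), (17), and (19) can be sketched in a few lines. This is an illustrative example only: the transition counts, the air density value, the turbine parameters, and the function names are all hypothetical, and the mapping from a sampled state to a wind speed in Eq. (18) is deliberately left out.

```python
import random

AIR_DENSITY = 1.225  # kg/m^3, a typical sea-level value (assumed)


def transition_probabilities(counts):
    """Eq. (16): normalize each row of observed transition counts delta."""
    return [[c / sum(row) for c in row] for row in counts]


def cumulative(row):
    """Eq. (17): cumulative distribution over next states for one current state."""
    total, cdf = 0.0, []
    for p in row:
        total += p
        cdf.append(total)
    return cdf


def next_state(cdf, rng):
    """Inverse-CDF sampling of the next wind-speed state."""
    u = rng.random()
    for state, threshold in enumerate(cdf):
        if u <= threshold:
            return state
    return len(cdf) - 1  # guard against rounding in the last threshold


def wind_power(speed, swept_area, power_coefficient):
    """Eq. (19): y = 0.5 * rho * A * V^3 * C_p, in watts for SI inputs."""
    return 0.5 * AIR_DENSITY * swept_area * speed ** 3 * power_coefficient


counts = [[6, 3, 1], [2, 5, 3], [1, 3, 6]]  # hypothetical transition counts
probs = transition_probabilities(counts)
rng = random.Random(0)
state = next_state(cumulative(probs[0]), rng)
print(state, wind_power(speed=5.0, swept_area=10.0, power_coefficient=0.4))
```

Because the output depends on the cube of the wind speed, small changes in the sampled state can produce large changes in available renewable power.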

1.4 Real-time pricing

 ${\lambda ^k}(l_i^k) = \left\{ \begin{array}{l} {m_k},\quad 0 \leqslant l_i^k \leqslant {b_k} \\ {n_k},\quad l_i^k > {b_k} \\ \end{array} \right.$ (20)

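1.5 User utility function

a. The utility function is nondecreasing, i.e., $\dfrac{{{{\partial}} U(l)}}{{{{\partial}} l}} \geqslant 0$

b. Marginal utility is nonincreasing, i.e., the utility function is concave. Therefore, $\dfrac{{{{\rm{\partial}} ^2}U(l)}}{{{\rm{\partial}} {l^2}}} \leqslant 0$

c. When the user consumes no electricity, the utility is zero, i.e., $U(0) = 0$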

 $U(l_i^k,\omega _i^k) = \left\{ \begin{split} & \omega _i^kl_i^k - \frac{{{\alpha _i}}}{2}{\left( {l_i^k} \right)^2},\quad 0 \leqslant l_i^k \leqslant \frac{{\omega _i^k}}{{{\alpha _i}}} \\& \frac{{{{\left( {\omega _i^k} \right)}^2}}}{{2{\alpha _i}}},\quad \quad \quad \quad \;l_i^k > \frac{{\omega _i^k}}{{{\alpha _i}}} \\ \end{split} \right.$ (21)
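The quadratic utility in Eq. (21) is straightforward to evaluate. The sketch below is illustrative; the function name `utility` and the parameter values $\omega = 4$, $\alpha = 2$ are assumptions chosen so that the saturation point $\omega/\alpha = 2$ is easy to see.

```python
def utility(load, omega, alpha):
    """Eq. (21): quadratic utility that saturates at load = omega / alpha."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    saturation = omega / alpha
    if load <= saturation:
        return omega * load - 0.5 * alpha * load ** 2
    return omega ** 2 / (2 * alpha)


values = [utility(load, omega=4.0, alpha=2.0) for load in (0.0, 1.0, 2.0, 3.0)]
print(values)  # [0.0, 3.0, 4.0, 4.0]
```

The sample values show the three properties listed above: the utility is zero at zero load, it never decreases, and each extra unit of load adds less until the function is flat.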

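1.6 User payment function

$P({l_i}^k)$ denotes the payment function of user $i$ at time $k$. The model in this paper does not consider equipment costs. Given the price in Eq. (20), for the total electricity consumption ${l_i}^k$ at time $k$, the user's cost function $P({l_i}^k)$ is the maximum of the following two functions [26]: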

 $P({l_i}^k) = {\lambda ^k}{l_i}^k = {m_k}{l_i}^k,\;P({l_i}^k) = {\lambda ^k}{l_i}^k = {n_k}{l_i}^k + ({m_k} - {n_k}){b_k}$

 $P({l_i}^k) = {\lambda ^k}{l_i}^k = \max \left\{ {{m_k}{l_i}^k,{n_k}{l_i}^k + ({m_k} - {n_k}){b_k}} \right\}$ (22)
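Equation (22) can be evaluated directly as the larger of two linear tariffs. In this small sketch the function name `payment` and the sample prices are hypothetical; the parameters correspond to $m_k$, $n_k$, and $b_k$.

```python
def payment(load, low_price, high_price, threshold):
    """Eq. (22): the larger of the two linear tariffs m*l and n*l + (m - n)*b."""
    return max(low_price * load,
               high_price * load + (low_price - high_price) * threshold)


below = payment(load=1.0, low_price=0.7, high_price=1.5, threshold=2.0)
above = payment(load=3.0, low_price=0.7, high_price=1.5, threshold=2.0)
print(below, above)
```

When $n_k > m_k$, the first tariff is the larger one below the threshold $b_k$ and the second is the larger one above it, so the maximum charges $m_k$ per unit up to $b_k$ and $n_k$ per unit beyond it.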
2 User welfare maximization model
2.1 Problem formulation

 $\begin{split} &{\rm{P1}}\;\;\;\;\;\max \;{\rm{E}}\left( {\displaystyle\sum\limits_{k = t}^K {\displaystyle\sum\limits_{i = 1}^q {\left[ {\beta U(l_i^k,\omega _i^k) - (1 - \beta )P(l_i^k)} \right]} } } \right)\\ &{\rm{s.t.}}\;\;\;\;\;{\text{Eqs. }}(2)\sim (7)\;\;\;\;\;\forall k \in \mathbb{K},\forall a_{u,i}^{} \in {B_i} \end{split}\tag{23(a)}$
 $\begin{split} \quad{\text{Eqs. }}(8)\sim (15)\;\;\;\;\;&\forall k \in {\mathbb{K}'_{a_{u,i}^{},d}},\\ &\forall a_{u,i}^{} \in {C_i},{d_{a_{u,i}^{}}} \in {\mathbb{D}_{a_{u,i}^{}}}\end{split}\quad\tag{23(b)}$
 $\qquad\;\;\; {\text{Eqs. }}(16)\sim (19),\;\;\;\forall {F_i},{F_k},{F_j} \in F\qquad \tag{23(c)}$
 $\qquad\quad l_i^k = E_{{a_{u,i}}}^k + \sum\limits_{a_{u,i}^{} \in {B_i}} {l_{{a_{u,i}}}^k} + \sum\limits_{a_{u,i}^{} \in {C_i}} {l_{{a_{u,i}}}^k}\quad \tag{23(d)}$
 ${\rm{E}}\left[ {\sum\limits_{i = 1}^q {l_i^k} } \right] \leqslant {G_k},\;\forall k \in \mathbb{K}\qquad\qquad\tag{23(e)}$
 ${l_i}^k = {\left( {{x_i}^k - {y_i}^k} \right)^ + }\qquad\qquad\qquad\quad\tag{23(f)}$

 $\Pr \left[ {\sum\limits_{i = 1}^q {l_i^k} - {G_k} \geqslant \eta } \right] \leqslant \varepsilon\tag{24}$

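2.2 Improved simulated annealing algorithm

b. Perturb the initial solution to generate a new solution $\bar l_{{B_i}}^{\tau + 1},\; l_{{a_{u,i}}}^{\tau + 1},\;y_i^{\tau + 1},\;\beta _i^{\tau + 1}$ in its neighborhood, and compute the objective function ${V^{\tau + 1}}$.

c. Apply the Metropolis rule to decide whether to accept the new solution. If $\Delta V = {V^{\tau + 1}} - {V^\tau } > 0$, accept ${V^{\tau + 1}}$ as the new solution and go to step 3; otherwise, accept ${V^{\tau + 1}}$ as the new solution when $\exp ( - \Delta V/{T_0}) \geqslant rand$ and go to step 3; if $\Delta V = {V^{\tau + 1}} - {V^\tau } \leqslant 0$, repeat step 2.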
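The perturb-and-accept loop in steps b and c can be sketched on a toy problem. This is not the paper's improved algorithm: it optimizes a single load value for one user in one time slot, and the function names, cooling schedule, perturbation width, and parameter values are all hypothetical. For a worse candidate it uses the acceptance probability $\exp(\Delta V/T)$, the standard Metropolis form for a maximization problem, which is an assumption about the intended sign.

```python
import math
import random


def simulated_annealing(objective, x0, lower, upper,
                        t0=1.0, cooling=0.999, steps=5000, width=0.5, seed=0):
    """Maximize objective on [lower, upper] with a Metropolis acceptance rule."""
    rng = random.Random(seed)
    current, current_value = x0, objective(x0)
    best, best_value = current, current_value
    temperature = t0
    for _ in range(steps):
        # Step b: perturb the current solution and clip it to the feasible interval.
        candidate = min(upper, max(lower, current + rng.uniform(-width, width)))
        candidate_value = objective(candidate)
        delta = candidate_value - current_value
        # Step c: always accept an improvement; accept a worse solution with
        # probability exp(delta / T), the standard form for maximization (assumed).
        if delta > 0 or math.exp(delta / temperature) >= rng.random():
            current, current_value = candidate, candidate_value
            if current_value > best_value:
                best, best_value = current, current_value
        temperature *= cooling  # geometric cooling (assumed schedule)
    return best, best_value


def welfare(load, beta=0.5, omega=4.0, alpha=2.0, price=0.7):
    """Toy single-user, single-slot welfare: beta * U - (1 - beta) * price * load."""
    capped = min(load, omega / alpha)
    util = omega * capped - 0.5 * alpha * capped ** 2
    return beta * util - (1 - beta) * price * load


best_load, best_welfare = simulated_annealing(welfare, x0=0.0, lower=0.0, upper=4.0)
print(best_load, best_welfare)
```

For this toy objective the maximizer can be found by hand at a load of 1.65, which gives a quick check that the search settles near the right value.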

 Fig. 2 Flowchart of the simulated annealing algorithm
3 Numerical simulation

 $m_k = \begin{cases} 0.3\;\text{yuan}/({\rm{kW\cdot h}}), & [0{:}00,\;10{:}00] \\ 0.7\;\text{yuan}/({\rm{kW\cdot h}}), & [10{:}00,\;16{:}00] \\ 1\;\text{yuan}/({\rm{kW\cdot h}}), & [16{:}00,\;20{:}00] \\ 0.8\;\text{yuan}/({\rm{kW\cdot h}}), & [20{:}00,\;24{:}00] \end{cases}$
 $n_k = \begin{cases} 0.9\;\text{yuan}/({\rm{kW\cdot h}}), & [0{:}00,\;8{:}00] \\ 0.7\;\text{yuan}/({\rm{kW\cdot h}}), & [8{:}00,\;13{:}00] \\ 1.5\;\text{yuan}/({\rm{kW\cdot h}}), & [13{:}00,\;24{:}00] \end{cases}$

 $U(l_i^k,\omega _i^k) = \left\{ \begin{aligned} & \omega _i^kl_i^k - \frac{{{\alpha _i}}}{2}{\left( {l_i^k} \right)^2},\quad 0 \leqslant l_i^k \leqslant \frac{{\omega _i^k}}{{{\alpha _i}}} \\ & \frac{{{{\left( {\omega _i^k} \right)}^2}}}{{2{\alpha _i}}},\quad \quad \quad \quad \;l_i^k > \frac{{\omega _i^k}}{{{\alpha _i}}} \end{aligned} \right.$
3.1 Determination of the optimal weight factor in the objective function

 Fig. 3 Weight value

 Fig. 4 Total welfare of users under different weight values

 Fig. 5 Convergence and iteration times under different weight values

 Fig. 6 Electricity consumption of elastic appliances under different weight values

 Fig. 7 Electricity consumption of semielastic appliances under different weight values

 Fig. 8 Electricity consumption from renewable energy sources under different weight values
3.2 Discussion of the cases with and without renewable energy under the optimal weight factor

 Fig. 9 Convergence and iteration times under optimal weight value

 Fig. 10 Electricity consumption of elastic appliances under optimal weight value

 Fig. 11 Electricity consumption of semielastic appliances under optimal weight value

 Fig. 12 Total welfare of users under optimal weight value
4 Conclusion