Solving Constrained Optimization Problems with the Generalized Reduced Gradient Method

The generalized reduced gradient (GRG) method extends the reduced gradient method to optimization problems with nonlinear constraints. Its core idea is to use the equality constraints to eliminate some of the variables, thereby reducing the number of independent optimization variables.

Theory

Consider the optimization problem:
$$
\begin{aligned}
\min \quad & f(X) \\
\mathrm{s.t.} \quad & h_j(X)\leq 0, \quad j=1,2,\dots,m \\
& l_k(X)=0, \quad k=1,2,\dots,l \\
& x_i^l\leq x_i\leq x_i^u, \quad i=1,2,\dots,n
\end{aligned}
$$

Adding slack variables to the inequality constraints, the problem can be written as:
$$
\begin{aligned}
\min \quad & f(X) \\
\mathrm{s.t.} \quad & h_j(X)+x_{n+j}=0, \quad j=1,2,\dots,m \\
& l_k(X)=0, \quad k=1,2,\dots,l \\
& x_i^l\leq x_i\leq x_i^u, \quad i=1,2,\dots,n \\
& x_{n+j}\geq 0, \quad j=1,2,\dots,m
\end{aligned}
$$
The model can then be written in the unified form:
$$
\begin{aligned}
\min \quad & f(X) \\
\mathrm{s.t.} \quad & g_j(X)=0, \quad j=1,2,\dots,m+l \\
& x_i^l\leq x_i\leq x_i^u, \quad i=1,2,\dots,n+m
\end{aligned}
$$
The generalized reduced gradient is based on variable elimination. Partition the variables into two parts:
$$ X= \begin{bmatrix} Y \\ Z \end{bmatrix} $$
where $Y$ are the independent (design) variables and $Z$ are the dependent (state) variables. Since $X$ must satisfy the constraints $g_j(X)=0$, differentiating the constraints gives
$$\mathrm{d}g=[C]\,\mathrm{d}Y+[D]\,\mathrm{d}Z$$
where
$$
[C]= \begin{bmatrix} \frac{\partial g_1}{\partial y_1}& \cdots&\frac{\partial g_1}{\partial y_{n-l}} \\ \vdots&\ddots &\vdots \\ \frac{\partial g_{m+l}}{\partial y_1}&\cdots&\frac{\partial g_{m+l}}{\partial y_{n-l}} \end{bmatrix}, \qquad
[D]= \begin{bmatrix} \frac{\partial g_1}{\partial z_1}& \cdots&\frac{\partial g_1}{\partial z_{m+l}} \\ \vdots&\ddots &\vdots \\ \frac{\partial g_{m+l}}{\partial z_1}&\cdots&\frac{\partial g_{m+l}}{\partial z_{m+l}} \end{bmatrix}
$$
Since the constraints must remain satisfied as $X$ varies ($g\equiv 0$), we have $$\mathrm{d}g=0$$ and therefore
$$
\begin{aligned}
\mathrm{d}Z &= -[D]^{-1}[C]\,\mathrm{d}Y \\
\mathrm{d}f(X) &= \left(\nabla^T_Y f-\nabla^T_Z f\,[D]^{-1}[C]\right)\mathrm{d}Y
\end{aligned}
$$
This can also be written as
$$
\frac{\mathrm{d}f(X)}{\mathrm{d}Y}=G_R, \qquad
G_R=\nabla_Y f-\left([D]^{-1}[C]\right)^T\nabla_Z f
$$
where $G_R$ is the generalized reduced gradient.
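As a minimal numerical illustration (not part of the original implementation), the sketch below evaluates $G_R$ for the second test problem of this post (Example 2 below) at the feasible point $(2,16,5)$, taking $x_1$ as the independent variable and $x_2,x_3$ as the dependent variables; the point and the partition are chosen here purely for illustration.

```python
import numpy as np
from sympy import symbols, exp, diff

# Reduced-gradient evaluation for Example 2 (illustrative partition and point):
#   minimize  f = 4*exp(x1) - x2**2 + x3**3 - 12
#   s.t.      x1**2 + x2 - 20 = 0,  x1 + x3 - 7 = 0
x1, x2, x3 = symbols('x1 x2 x3')
f = 4*exp(x1) - x2**2 + x3**3 - 12
g = [x1**2 + x2 - 20, x1 + x3 - 7]

Y = [x1]          # independent (design) variables
Z = [x2, x3]      # dependent (state) variables
point = {x1: 2, x2: 16, x3: 5}   # feasible: 4 + 16 = 20 and 2 + 5 = 7

# [C] = dg/dY and [D] = dg/dZ evaluated at the current point
C = np.array([[float(diff(gi, y).subs(point)) for y in Y] for gi in g])
D = np.array([[float(diff(gi, z).subs(point)) for z in Z] for gi in g])
grad_Y = np.array([float(diff(f, y).subs(point)) for y in Y])
grad_Z = np.array([float(diff(f, z).subs(point)) for z in Z])

# G_R = grad_Y f - ([D]^{-1} [C])^T grad_Z f
G_R = grad_Y - (np.linalg.inv(D) @ C).T @ grad_Z
print(G_R)
```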
The algorithm can be implemented as follows:
1. Initialize the variables $X$, partitioned into $Y$ and $Z$. The partition should be chosen so that the matrix $[D]$ is nonsingular; variables that sit at their lower or upper bounds are best taken as independent variables; and since the slack variables enter the constraints linearly, they should be designated as dependent variables.
2. Compute the generalized reduced gradient $G_R$.
3. Check convergence. If the norm of $G_R$ is sufficiently small, the current point can be taken as the optimal solution.
4. Compute the search direction. $G_R$ plays the role of the gradient of an unconstrained objective, so methods such as steepest descent or conjugate gradients can be used to determine the search direction.
5. Determine the step length along the search direction with a one-dimensional line search, making sure the step does not push the variables outside their bounds. Because the constraints are generally nonlinear, the dependent variables must be corrected after each trial step (e.g. by Newton iterations) so that the equality constraints are satisfied again, as illustrated in the sketch after this list.
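The feasibility-restoration part of step 5 can be isolated into a small sketch. The helper below is hypothetical and written for the constraints of Example 2: it re-solves the dependent variables by Newton iterations $Z \leftarrow Z - [D]^{-1}g(Y,Z)$ after the independent variables have moved. The full implementation in the Code section performs the same correction inside its line search (step 3(b)).

```python
import numpy as np
from sympy import symbols, diff

# Newton correction of the dependent variables Z so that the equality
# constraints are satisfied again after the independent variables Y change.
# Constraints taken from Example 2; the helper name is illustrative.
x1, x2, x3 = symbols('x1 x2 x3')
g = [x1**2 + x2 - 20, x1 + x3 - 7]
Y, Z = [x1], [x2, x3]

def restore_feasibility(y_val, z_val, tol=1e-6, max_iter=20):
    z = np.array(z_val, dtype=float)
    for _ in range(max_iter):
        point = dict(zip(Y + Z, [float(v) for v in y_val] + [float(v) for v in z]))
        g_val = np.array([float(gi.subs(point)) for gi in g])
        if np.all(np.abs(g_val) < tol):
            break
        D = np.array([[float(diff(gi, zi).subs(point)) for zi in Z] for gi in g])
        z = z - np.linalg.solve(D, g_val)   # Newton step: z <- z - [D]^{-1} g
    return z

# After Y = x1 moves from 2.0 to 2.5, Z = (x2, x3) is re-solved for feasibility
print(restore_feasibility([2.5], [16.0, 5.0]))
```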

Numerical Examples

Example 1: linearly constrained problem

$$
\begin{aligned}
\min \quad obj =\ & x_1^3+3x_2^3+2(x_1-x_2)^2+2e^{-(x_1+x_2)} \\
\mathrm{s.t.} \quad & x_1-x_2\geq -2 \\
& x_1+x_2\leq 25 \\
& x_1\geq 0 \\
& x_2\geq 0
\end{aligned}
$$
The iteration converges quickly; the detailed numerical results can be reproduced by running the code below.
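Concretely, the script converts the two inequality constraints into equalities with slack variables $x_3\geq 0$ and $x_4\geq 0$ (this is exactly the `hxs` list in the code):
$$
\begin{aligned}
x_1-x_2+2-x_3 &= 0 \\
x_1+x_2-25+x_4 &= 0
\end{aligned}
$$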

Example 2: nonlinearly constrained problem

$$
\begin{aligned}
\min \quad obj =\ & 4e^{x_1}-x_2^2+x_3^3-12 \\
\mathrm{s.t.} \quad & x_1^2+x_2=20 \\
& x_1+x_3=7 \\
& x_1\geq 0 \\
& x_2\geq 0 \\
& x_3\geq 0
\end{aligned}
$$
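Both constraints are already equalities, so no slack variables are needed here. This example corresponds to the first commented-out setup block in the code below; to run it, comment out the Example 1 setup and uncomment that block.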

Code

```python
# Generalized Reduced Gradient Algorithm

import numpy as np
import matplotlib.pyplot as plt
from sympy import *
import pandas as pd


def generalized_reduced_gradient():
    ans_list = []
    alpha_list = []
    d_list = []

    # Example 1: the two inequality constraints are rewritten as equalities
    # with the slack variables x3 and x4.
    x1, x2, x3, x4 = symbols('x1 x2 x3 x4')
    xvars = [x1, x2, x3, x4]
    x_l = [0, 0, 0, 0]                                        # lower bounds of the variables
    fx = x1**3 + 3*x2**3 + 2*(x1 - x2)**2 + 2*exp(-x1 - x2)   # function to be minimized
    hxs = [x1 - x2 + 2 - x3, x1 + x2 - 25 + x4]               # equality constraints
    xcurr = np.array([2, 4, 5, 1])                            # starting solution

    # Example 2:
    # x1, x2, x3 = symbols('x1 x2 x3')
    # xvars = [x1, x2, x3]
    # x_l = [0, 0, 0]
    # fx = 4 * exp(x1) - x2 ** 2 + x3**3 - 12  # Function to be minimized
    # hxs = [20 - x1**2 - x2, x1 + x3 - 7]     # Constraints to be obeyed
    # xcurr = np.array([2, 4, 5])              # Starting solution

    # x1, x2, x3 = symbols('x1 x2 x3')
    # xvars = [x1, x2, x3]
    # x_l = [0, 0, 0]
    # fx = 4 * x1 - x2 ** 2 + x3**2 - 12              # Function to be minimized
    # hxs = [x1 + x3 - 7, x1 + x2 - 50, 20 - x1*x3]   # Constraints to be obeyed
    # xcurr = np.array([2, 4, 5])                     # Starting solution

    alpha_0 = 1  # Parameter initializations
    gamma = 0.4
    max_iter = 10
    max_outer_iter = 30
    eps_1, eps_2, eps_3 = 0.001, 0.001, 0.001

    dfx = np.array([diff(fx, xvar) for xvar in xvars])
    dhxs = np.array([[diff(hx, xvar) for xvar in xvars] for hx in hxs])
    nonbasic_vars = len(xvars) - len(hxs)   # number of independent (design) variables
    opt_sols = []

    for outer_iter in range(max_outer_iter):

        print('\n\nOuter loop iteration: {0}, optimal solution: {1}'.format(outer_iter + 1, xcurr))
        ans_list.append([float(xcurr[i]) for i in range(len(xcurr))])
        opt_sols.append(fx.subs(zip(xvars, xcurr)))

        # Step 1: compute the generalized reduced gradient

        delta_f = np.array([df.subs(zip(xvars, xcurr)) for df in dfx])
        delta_h = np.array([[dh.subs(zip(xvars, xcurr)) for dh in dhx] for dhx in dhxs])  # Value of h'_i(xcurr) for all i
        J = np.array([dhx[nonbasic_vars:] for dhx in delta_h])  # J is [D] in the theory
        C = np.array([dhx[:nonbasic_vars] for dhx in delta_h])  # C is [C] in the theory
        delta_f_bar = delta_f[nonbasic_vars:]    # gradient w.r.t. dependent variables Z
        delta_f_cap = delta_f[:nonbasic_vars]    # gradient w.r.t. independent variables Y

        J_inv = np.linalg.inv(np.array(J, dtype=float))
        delta_f_tilde = delta_f_cap - delta_f_bar.dot(J_inv.dot(C))  # reduced gradient G_R

        # Step 2: convergence test on the reduced gradient

        if np.linalg.norm(np.array(delta_f_tilde, dtype=float)) <= eps_1:
            break

        d_bar = -delta_f_tilde.T  # Direction of search in current iteration
        d_cap = -J_inv.dot(C.dot(d_bar))
        d = np.concatenate((d_bar, d_cap)).T
        d_list.append([float(d[i]) for i in range(len(d))])

        # Step 3: line search along d

        alpha = alpha_0

        while alpha > 0.001:

            print('\nAlpha value: {0}\n'.format(alpha))

            # Step 3(a): trial point, clipped to the lower bounds
            print(xcurr.T, alpha, d)
            v = xcurr.T + alpha * d
            v = np.array([x_l[i] if v[i] < x_l[i] else v[i] for i in range(len(v))])
            v_bar = v[:nonbasic_vars]
            v_cap = v[nonbasic_vars:]
            flag = False

            for iters in range(max_iter):  # one-dimensional search
                print('Iteration: {0}, optimal solution obtained at x = {1}'.format(iters + 1, v))
                h = np.array([hx.subs(zip(xvars, v)) for hx in hxs])
                if all([abs(h_i) < eps_2 for h_i in h]):  # Check if candidate satisfies all constraints
                    if fx.subs(zip(xvars, xcurr)) <= fx.subs(zip(xvars, v)):
                        alpha = alpha * gamma
                        break
                    else:
                        xcurr = v  # Obtained a candidate better than the current optimal solution
                        flag = True
                        break

                # Step 3(b): Newton correction of the dependent variables

                delta_h_v = np.array([[dh.subs(zip(xvars, v)) for dh in dhx] for dhx in dhxs])
                J_inv_v = np.linalg.inv(np.array([dhx[nonbasic_vars:] for dhx in delta_h_v], dtype=float))
                v_next_cap = v_cap - J_inv_v.dot(h)
                # Clip the dependent variables to their lower bounds
                v_next_cap = np.array([x_l[nonbasic_vars + i] if v_next_cap[i] < x_l[nonbasic_vars + i] else v_next_cap[i]
                                       for i in range(len(v_next_cap))])

                # Step 3(c)

                if abs(np.linalg.norm(np.array(v_cap - v_next_cap, dtype=float), 1)) > eps_3:
                    v_cap = v_next_cap
                    v = np.concatenate((v_bar, v_cap))
                else:
                    v_cap = v_next_cap
                    v = np.concatenate((v_bar, v_cap))
                    h = np.array([hx.subs(zip(xvars, v)) for hx in hxs])
                    if all([abs(h_i) < eps_2 for h_i in h]):

                        # Step 3(d)

                        if fx.subs(zip(xvars, xcurr)) <= fx.subs(zip(xvars, v)):
                            alpha = alpha * gamma  # Search for lower values of alpha
                            break
                        else:
                            xcurr = v
                            flag = True
                            break
                    else:
                        alpha = alpha * gamma
                        break

            if flag == True:
                break

    print('\n\nFinal solution obtained is: {0}'.format(xcurr))
    print('Value of the function at this point: {0}\n'.format(fx.subs(zip(xvars, xcurr))))

    plt.plot(opt_sols, 'ro')  # Plot the solutions obtained after every iteration
    plt.show()
    print(d_list)
    print(ans_list)


if __name__ == '__main__':
    generalized_reduced_gradient()
```

Reference: Singiresu S. Rao, Engineering Optimization: Theory and Practice. John Wiley & Sons, 2009.
