Gradient descent algorithm
- Minimizes a cost function
- Gradient descent is used in many minimization problems
- For a given cost function cost(W, b), it finds the W and b that minimize the cost
- It can also be applied to more general cost functions: cost(w1, w2, ...) (see the short sketch below)
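As a concrete example, the cost minimized in the TensorFlow code later in this post is the mean squared error of a linear hypothesis W * x (a minimal sketch; the bias b is omitted there, as in the code below):

```python
import tensorflow as tf

# cost(W) = mean((W * x - y)^2), the mean squared error of the hypothesis W * x
def cost_fn(W, x, y):
    return tf.reduce_mean(tf.square(W * x - y))
```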
How does it work?
- Start with initial guesses
- Start at (0, 0) (or any other value)
- Keep changing W and b a little bit to try to reduce cost(W, b)
- Each time you change the parameters, pick the gradient that reduces cost(W, b) the most
- Repeat
- Do so until you converge to a local minimum
- It has an interesting property
- Where you start can determine which minimum you end up in (illustrated in the sketch below)
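A minimal sketch of this loop in plain Python (the cost f(w) = w^4 - 2w^2 + 0.3w and its gradient are hypothetical, chosen only because f has two local minima):

```python
def gradient_descent(grad_fn, w_init, alpha=0.01, steps=1000, tol=1e-8):
    """Minimize a 1-D function given its gradient grad_fn (illustrative sketch)."""
    w = w_init
    for _ in range(steps):
        step = alpha * grad_fn(w)   # change w a little bit in the downhill direction
        w -= step
        if abs(step) < tol:         # converged: the updates have become negligible
            break
    return w

# Hypothetical non-convex cost f(w) = w**4 - 2*w**2 + 0.3*w with two local minima,
# so the starting point determines which minimum we end up in.
grad = lambda w: 4 * w**3 - 4 * w + 0.3
print(gradient_descent(grad, w_init=-2.0))  # converges near the left minimum  (~ -1.04)
print(gradient_descent(grad, w_init=+2.0))  # converges near the right minimum (~ +0.96)
```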
```python
import tensorflow as tf

x_data = [1., 2., 3., 4.]
y_data = [1., 3., 5., 7.]
X = tf.constant(x_data)
Y = tf.constant(y_data)

# tf.random.normal: draw one random number from a normal distribution
# (mean -100, standard deviation 100) as the initial weight
W = tf.Variable(tf.random.normal([1], -100., 100.))

for step in range(300):
    hypothesis = W * X
    cost = tf.reduce_mean(tf.square(hypothesis - Y))

    alpha = 0.01
    # gradient of the cost with respect to W (constant factor of 2 omitted)
    gradient = tf.reduce_mean(tf.multiply(tf.multiply(W, X) - Y, X))
    descent = W - tf.multiply(alpha, gradient)
    W.assign(descent)

    if step % 10 == 0:
        print('{:5} | {:10.4f} | {:10.6f}'.format(
            step, cost.numpy(), W.numpy()[0]))
```
Output (step | cost | W):

```
    0 | 78516.1484 | -122.657616
   10 | 30189.4043 | -75.677628
   20 | 11607.8047 | -46.546268
   30 |  4463.1934 | -28.482494
   40 |  1716.0947 | -17.281507
   50 |   659.8373 | -10.335999
   60 |   253.7070 |  -6.029227
   70 |    97.5501 |  -3.358683
   80 |    37.5080 |  -1.702733
   90 |    14.4218 |  -0.675911
  100 |     5.5452 |  -0.039199
  110 |     2.1321 |   0.355613
  120 |     0.8198 |   0.600429
  130 |     0.3152 |   0.752234
  140 |     0.1212 |   0.846365
  150 |     0.0466 |   0.904734
  160 |     0.0179 |   0.940928
  170 |     0.0069 |   0.963370
  180 |     0.0026 |   0.977287
  190 |     0.0010 |   0.985916
  200 |     0.0004 |   0.991267
  210 |     0.0002 |   0.994585
  220 |     0.0001 |   0.996642
  230 |     0.0000 |   0.997918
  240 |     0.0000 |   0.998709
  250 |     0.0000 |   0.999199
  260 |     0.0000 |   0.999504
  270 |     0.0000 |   0.999692
  280 |     0.0000 |   0.999809
  290 |     0.0000 |   0.999882
```
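Note that the gradient in the loop above is written out by hand from cost = mean((W * X - Y)^2), with the constant factor of 2 dropped, which only rescales the learning rate. A rough equivalent that lets TensorFlow compute the gradient automatically (a sketch assuming TF 2.x eager mode and reusing X, Y, W from the code above; not part of the original run) would be:

```python
alpha = 0.01
for step in range(300):
    with tf.GradientTape() as tape:
        cost = tf.reduce_mean(tf.square(W * X - Y))
    grad = tape.gradient(cost, W)  # d(cost)/dW via automatic differentiation
    W.assign_sub(alpha * grad)     # W <- W - alpha * grad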