ํ•ด๋Š”์„  2020. 2. 21. 21:53

๋ณธ ๊ธ€์€ '๋ชจ๋‘๋ฅผ ์œ„ํ•œ ๋”ฅ๋Ÿฌ๋‹ ์‹œ์ฆŒ 2'์™€ 'pytorch๋กœ ์‹œ์ž‘ํ•˜๋Š” ๋”ฅ ๋Ÿฌ๋‹ ์ž…๋ฌธ'์„ ๋ณด๋ฉฐ ๊ณต๋ถ€ํ•œ ๋‚ด์šฉ์„ ์ •๋ฆฌํ•œ ๊ธ€์ž…๋‹ˆ๋‹ค.

ํ•„์ž์˜ ์˜๊ฒฌ์ด ์„ž์—ฌ ๋“ค์–ด๊ฐ€ ๋ถ€์ •ํ™•ํ•œ ๋‚ด์šฉ์ด ์กด์žฌํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.


์„ ํ˜• ํšŒ๊ท€ = ํ•™์Šต ๋ฐ์ดํ„ฐ์™€ ๊ฐ€์žฅ ์ž˜ ๋งž๋Š” ํ•˜๋‚˜์˜ ์ง์„ ์„ ์ฐพ๋Š” ๊ณผ์ •!

0. ์ „์ฒด์ ์ธ ํ•™์Šต ๋ฐฉ๋ฒ•

(๋‹ค๋ฅธ ํ•™์Šต์—๋„ ์ ์šฉ๋จ!)

1) ๋ฐ์ดํ„ฐ์— ์ ํ•ฉํ•œ ๊ฐ€์„ค์„ ์„ค์ •ํ•œ๋‹ค.

2) cost function, ์ฆ‰ ์˜ค์ฐจ๋น„์šฉ loss๋ฅผ ๊ตฌํ•  ํ•จ์ˆ˜๋ฅผ ๊ฒฐ์ •ํ•œ๋‹ค.

3) ์˜ค์ฐจํ•จ์ˆ˜์˜ ๊ธฐ์šธ๊ธฐ๋ฅผ ์ด์šฉํ•ด์„œ ๋ชจ๋ธ์„ ๊ฐœ์„ ํ•œ๋‹ค.(optimizer)

4) ๋งŒ์กฑ์Šค๋Ÿฌ์šด ๊ฒฐ๊ณผ๊ฐ€ ๋‚˜์˜ฌ ๋•Œ ๊นŒ์ง€ 3์„ ๋ฐ˜๋ณตํ•œ๋‹ค.

 

 

1. x๊ฐ€ ํ•˜๋‚˜์ธ linear regression ๊ตฌํ˜„ํ•˜๊ธฐ

1) Hypothesis

๊ฐ€์„ค์€ ์ธ๊ณต์‹ ๊ฒฝ๋ง์˜ ๊ตฌ์กฐ๋ฅผ ๋‚˜ํƒ€๋‚ธ๋‹ค. ์ฆ‰, ์ฃผ์–ด์ง„ x์— ๋Œ€ํ•ด ์–ด๋–ค y๋ฅผ ๋ฑ‰์„ ์ง€ ์•Œ๋ ค์ค€๋‹ค.

W = weight (๊ฐ€์ค‘์น˜)

b = bias (ํŽธํ–ฅ)
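
Expressed with tensors (the same way the full code below does it), the hypothesis is a single broadcasted operation:

import torch

x_train = torch.FloatTensor([[1], [2], [3]])
W = torch.zeros(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)

hypothesis = x_train * W + b   # H(x) = W*x + b, applied to every sample at once
print(hypothesis)              # all zeros before training, since W and b start at 0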

 

 

2) ๋น„์šฉํ•จ์ˆ˜ ๊ณ„์‚ฐ

(๋น„์šฉ ํ•จ์ˆ˜(cost function) = ์†์‹ค ํ•จ์ˆ˜(loss function) = ์˜ค์ฐจ ํ•จ์ˆ˜(error function) = ๋ชฉ์  ํ•จ์ˆ˜(objective function))

 

The cost is computed with MSE (Mean Squared Error).

Simply put, it is the average of the squared errors of the individual predictions!

 

 

3) ์ตœ์ ํ™” optimizer

์ด ๊ณผ์ •์„ ํ†ตํ•ด์„œ cost function์˜ ๊ฐ’์„ ์ตœ์†Œ๋กœ ํ•˜๋Š” W์™€ b๋ฅผ ์ฐพ๋Š”๋‹ค.

๊ทธ๋ฆฌ๊ณ  W์™€ b๋ฅผ ์ฐพ์•„๋‚ด๋Š” ๊ณผ์ •์„ 'ํ•™์Šต'์ด๋ผ๊ณ  ํ•œ๋‹ค.

 

์—ฌ๊ธฐ์„œ๋Š” ๊ฒฝ์‚ฌํ•˜๊ฐ•๋ฒ•, Gradient Descent๋ฅผ ์ด์šฉํ•ด์„œ ์ตœ์ ํ™”ํ•œ๋‹ค. ์ฆ‰, ๊ธฐ์šธ๊ธฐ๋ฅผ ์ฐพ์•„์„œ ๊ธฐ์šธ๊ธฐ๊ฐ€ ์ž‘์•„์ง€๋Š” ์ชฝ์œผ๋กœ ์›€์ง์ด๊ฒŒ ๋งŒ๋“ ๋‹ค.

๊ธฐ์šธ๊ธฐ๊ฐ€ ์Œ์ˆ˜์ผ ๋•Œ๋Š” ์ฆ๊ฐ€, ์–‘์ˆ˜์ผ ๋•Œ์—” ๊ฐ์†Œํ•ด์•ผ ํ•œ๋‹ค.

 

๐Ÿ’ก ํ’€๊ณ ์žํ•˜๋Š” ๊ฐ ๋ฌธ์ œ์— ๋”ฐ๋ผ ๊ฐ€์„ค, ๋น„์šฉ ํ•จ์ˆ˜, ์˜ตํ‹ฐ๋งˆ์ด์ €๋Š” ์ „๋ถ€ ๋‹ค๋ฅผ ์ˆ˜ ์žˆ์œผ๋ฉฐ ์„ ํ˜• ํšŒ๊ท€์— ๊ฐ€์žฅ ์ ํ•ฉํ•œ ๋น„์šฉ ํ•จ์ˆ˜๋Š” ํ‰๊ท  ์ œ๊ณฑ ์˜ค์ฐจ, ์˜ตํ‹ฐ๋งˆ์ด์ €๋Š” ๊ฒฝ์‚ฌ ํ•˜๊ฐ•๋ฒ•์ž…๋‹ˆ๋‹ค.

 

3. full code

import torch
import torch.nn as nn                 # used when building neural network modules (not used directly in this example)
import torch.nn.functional as F       # functional API (not used directly in this example)
import torch.optim as optim           # used to update the weights

# data
x_train = torch.FloatTensor([[1], [2], [3]])
y_train = torch.FloatTensor([[2], [4], [6]])
# x is the input, y is the target
# the goal is to build a model that outputs y when given x

# initialize the model parameters
W = torch.zeros(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
# requires_grad marks these tensors as learnable (automatic differentiation enabled)

# set up the optimizer
optimizer = optim.SGD([W, b], lr=0.01)

nb_epochs = 2900
for epoch in range(nb_epochs + 1):

    # compute H(x) - the hypothesis
    hypothesis = x_train * W + b

    # compute the cost - the mean of the squared errors (MSE)
    cost = torch.mean((hypothesis - y_train) ** 2)

    # improve H(x) using the cost
    optimizer.zero_grad() # reset the gradients to 0 (PyTorch accumulates them, so they must be cleared before each backward pass)
    cost.backward() # compute the gradients of the cost w.r.t. W and b
    optimizer.step() # update W and b

    # print a log every 100 epochs
    if epoch % 100 == 0:
        print('Epoch {:4d}/{} W: {:.3f}, b: {:.3f} Cost: {:.6f}'.format(
            epoch, nb_epochs, W.item(), b.item(), cost.item()
        ))
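
For this data (y = 2x exactly), W should approach 2, b should approach 0, and the cost should shrink toward 0 as the epochs go by. Once the loop above has run, the learned W and b can be used for prediction; a small illustrative example with a new input x = 4:

with torch.no_grad():                    # no gradients needed for inference
    new_x = torch.FloatTensor([[4]])
    prediction = new_x * W + b
    print(prediction)                    # should be close to 8, since W ≈ 2 and b ≈ 0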

 

 


<Reference>

https://deeplearningzerotoall.github.io/season2/lec_pytorch.html

https://wikidocs.net/53560