• 02. Linear Regression

    2020. 2. 21.

    by. ํ•ด๋Š”์„ 

    ๋ณธ ๊ธ€์€ '๋ชจ๋‘๋ฅผ ์œ„ํ•œ ๋”ฅ๋Ÿฌ๋‹ ์‹œ์ฆŒ 2'์™€ 'pytorch๋กœ ์‹œ์ž‘ํ•˜๋Š” ๋”ฅ ๋Ÿฌ๋‹ ์ž…๋ฌธ'์„ ๋ณด๋ฉฐ ๊ณต๋ถ€ํ•œ ๋‚ด์šฉ์„ ์ •๋ฆฌํ•œ ๊ธ€์ž…๋‹ˆ๋‹ค.

    ํ•„์ž์˜ ์˜๊ฒฌ์ด ์„ž์—ฌ ๋“ค์–ด๊ฐ€ ๋ถ€์ •ํ™•ํ•œ ๋‚ด์šฉ์ด ์กด์žฌํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.


    ์„ ํ˜• ํšŒ๊ท€ = ํ•™์Šต ๋ฐ์ดํ„ฐ์™€ ๊ฐ€์žฅ ์ž˜ ๋งž๋Š” ํ•˜๋‚˜์˜ ์ง์„ ์„ ์ฐพ๋Š” ๊ณผ์ •!

    0. ์ „์ฒด์ ์ธ ํ•™์Šต ๋ฐฉ๋ฒ•

    (๋‹ค๋ฅธ ํ•™์Šต์—๋„ ์ ์šฉ๋จ!)

    1) ๋ฐ์ดํ„ฐ์— ์ ํ•ฉํ•œ ๊ฐ€์„ค์„ ์„ค์ •ํ•œ๋‹ค.

    2) cost function, ์ฆ‰ ์˜ค์ฐจ๋น„์šฉ loss๋ฅผ ๊ตฌํ•  ํ•จ์ˆ˜๋ฅผ ๊ฒฐ์ •ํ•œ๋‹ค.

    3) ์˜ค์ฐจํ•จ์ˆ˜์˜ ๊ธฐ์šธ๊ธฐ๋ฅผ ์ด์šฉํ•ด์„œ ๋ชจ๋ธ์„ ๊ฐœ์„ ํ•œ๋‹ค.(optimizer)

    4) ๋งŒ์กฑ์Šค๋Ÿฌ์šด ๊ฒฐ๊ณผ๊ฐ€ ๋‚˜์˜ฌ ๋•Œ ๊นŒ์ง€ 3์„ ๋ฐ˜๋ณตํ•œ๋‹ค.

     

     

    1. x๊ฐ€ ํ•˜๋‚˜์ธ linear regression ๊ตฌํ˜„ํ•˜๊ธฐ

    1) Hypothesis

    ๊ฐ€์„ค์€ ์ธ๊ณต์‹ ๊ฒฝ๋ง์˜ ๊ตฌ์กฐ๋ฅผ ๋‚˜ํƒ€๋‚ธ๋‹ค. ์ฆ‰, ์ฃผ์–ด์ง„ x์— ๋Œ€ํ•ด ์–ด๋–ค y๋ฅผ ๋ฑ‰์„ ์ง€ ์•Œ๋ ค์ค€๋‹ค.

    W = weight (๊ฐ€์ค‘์น˜)

    b = bias (ํŽธํ–ฅ)
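
    As a minimal sketch, the hypothesis can be written directly in PyTorch. The values of W and b below are made-up placeholders, only to show the shape of the computation:

    import torch

    # Hypothetical parameter values, just to illustrate H(x) = W*x + b
    x = torch.FloatTensor([[1], [2], [3]])
    W = torch.tensor([0.5])
    b = torch.tensor([0.1])

    hypothesis = x * W + b   # predicted y for each x
    print(hypothesis)        # tensor([[0.6000], [1.1000], [1.6000]])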

     

     

    2) ๋น„์šฉํ•จ์ˆ˜ ๊ณ„์‚ฐ

    (๋น„์šฉ ํ•จ์ˆ˜(cost function) = ์†์‹ค ํ•จ์ˆ˜(loss function) = ์˜ค์ฐจ ํ•จ์ˆ˜(error function) = ๋ชฉ์  ํ•จ์ˆ˜(objective function))

     

    The cost is computed with MSE (Mean Squared Error).

    Simply put, MSE is the average of the squared errors (the squared differences between the predictions and the true values).

     

     

    3) ์ตœ์ ํ™” optimizer

    ์ด ๊ณผ์ •์„ ํ†ตํ•ด์„œ cost function์˜ ๊ฐ’์„ ์ตœ์†Œ๋กœ ํ•˜๋Š” W์™€ b๋ฅผ ์ฐพ๋Š”๋‹ค.

    ๊ทธ๋ฆฌ๊ณ  W์™€ b๋ฅผ ์ฐพ์•„๋‚ด๋Š” ๊ณผ์ •์„ 'ํ•™์Šต'์ด๋ผ๊ณ  ํ•œ๋‹ค.

     

    ์—ฌ๊ธฐ์„œ๋Š” ๊ฒฝ์‚ฌํ•˜๊ฐ•๋ฒ•, Gradient Descent๋ฅผ ์ด์šฉํ•ด์„œ ์ตœ์ ํ™”ํ•œ๋‹ค. ์ฆ‰, ๊ธฐ์šธ๊ธฐ๋ฅผ ์ฐพ์•„์„œ ๊ธฐ์šธ๊ธฐ๊ฐ€ ์ž‘์•„์ง€๋Š” ์ชฝ์œผ๋กœ ์›€์ง์ด๊ฒŒ ๋งŒ๋“ ๋‹ค.

    ๊ธฐ์šธ๊ธฐ๊ฐ€ ์Œ์ˆ˜์ผ ๋•Œ๋Š” ์ฆ๊ฐ€, ์–‘์ˆ˜์ผ ๋•Œ์—” ๊ฐ์†Œํ•ด์•ผ ํ•œ๋‹ค.

     

    ๐Ÿ’ก ํ’€๊ณ ์žํ•˜๋Š” ๊ฐ ๋ฌธ์ œ์— ๋”ฐ๋ผ ๊ฐ€์„ค, ๋น„์šฉ ํ•จ์ˆ˜, ์˜ตํ‹ฐ๋งˆ์ด์ €๋Š” ์ „๋ถ€ ๋‹ค๋ฅผ ์ˆ˜ ์žˆ์œผ๋ฉฐ ์„ ํ˜• ํšŒ๊ท€์— ๊ฐ€์žฅ ์ ํ•ฉํ•œ ๋น„์šฉ ํ•จ์ˆ˜๋Š” ํ‰๊ท  ์ œ๊ณฑ ์˜ค์ฐจ, ์˜ตํ‹ฐ๋งˆ์ด์ €๋Š” ๊ฒฝ์‚ฌ ํ•˜๊ฐ•๋ฒ•์ž…๋‹ˆ๋‹ค.

     

    3. Full code

    import torch
    import torch.nn as nn               # used for building neural network modules (not needed in this minimal example)
    import torch.nn.functional as F     # loss functions, etc. (not needed in this minimal example)
    import torch.optim as optim         # used to update the weights
    
    # Data
    x_train = torch.FloatTensor([[1], [2], [3]])
    y_train = torch.FloatTensor([[2], [4], [6]])
    # x is the input, y is the target
    # We need a model that outputs y when given x.
    
    # Initialize the model parameters
    W = torch.zeros(1, requires_grad=True)
    b = torch.zeros(1, requires_grad=True)
    # requires_grad marks these tensors as learnable (autograd will track them)
    
    # Set up the optimizer
    optimizer = optim.SGD([W, b], lr=0.01)
    
    nb_epochs = 2900
    for epoch in range(nb_epochs + 1):
    
        # Compute H(x) - the hypothesis
        hypothesis = x_train * W + b
    
        # Compute the cost - the mean of the squared errors (MSE)
        cost = torch.mean((hypothesis - y_train) ** 2)
    
        # Improve H(x) using the cost
        optimizer.zero_grad()  # reset the gradients to 0 (PyTorch accumulates gradients, so they must be cleared at every step)
        cost.backward()        # compute the gradients of the cost with respect to W and b
        optimizer.step()       # update W and b
    
        # Print a log every 100 epochs
        if epoch % 100 == 0:
            print('Epoch {:4d}/{} W: {:.3f}, b: {:.3f} Cost: {:.6f}'.format(
                epoch, nb_epochs, W.item(), b.item(), cost.item()
            ))
    
    
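    For reference, the same model can also be written with PyTorch's higher-level API (nn.Linear and F.mse_loss). This is only an equivalent sketch, not the lecture's code:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    import torch.optim as optim

    x_train = torch.FloatTensor([[1], [2], [3]])
    y_train = torch.FloatTensor([[2], [4], [6]])

    model = nn.Linear(1, 1)  # one input feature, one output: y = Wx + b
    optimizer = optim.SGD(model.parameters(), lr=0.01)

    for epoch in range(2000):
        prediction = model(x_train)             # H(x)
        cost = F.mse_loss(prediction, y_train)  # MSE cost

        optimizer.zero_grad()
        cost.backward()
        optimizer.step()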

     

     


    <Reference>

    https://deeplearningzerotoall.github.io/season2/lec_pytorch.html

    https://wikidocs.net/53560

     

     

     

     

    '๐Ÿ“šSTUDY > ๐Ÿ”ฅPytorch ML&DL' ์นดํ…Œ๊ณ ๋ฆฌ์˜ ๋‹ค๋ฅธ ๊ธ€

    05. Logistic Regression  (0) 2020.02.28
    04-2. Loading Data(Mini batch and data load)  (0) 2020.02.24
    04-1. Multivariable Linear regression  (0) 2020.02.24
    03. Deeper Look at Gradient Descent  (0) 2020.02.24
    01. PyTorch ๊ธฐ์ดˆ  (0) 2020.02.13

    ๋Œ“๊ธ€