07. Tips and MNIST data

    2020. 3. 1.

    by. ํ•ด๋Š”์„ 

    ๋ณธ ๊ธ€์€ '๋ชจ๋‘๋ฅผ ์œ„ํ•œ ๋”ฅ๋Ÿฌ๋‹ ์‹œ์ฆŒ 2'์™€ 'pytorch๋กœ ์‹œ์ž‘ํ•˜๋Š” ๋”ฅ ๋Ÿฌ๋‹ ์ž…๋ฌธ'์„ ๋ณด๋ฉฐ ๊ณต๋ถ€ํ•œ ๋‚ด์šฉ์„ ์ •๋ฆฌํ•œ ๊ธ€์ž…๋‹ˆ๋‹ค.

    ํ•„์ž์˜ ์˜๊ฒฌ์ด ์„ž์—ฌ ๋“ค์–ด๊ฐ€ ๋ถ€์ •ํ™•ํ•œ ๋‚ด์šฉ์ด ์กด์žฌํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.


    1. ํ•™์Šต๋ฅ  (Learning late)

    gradient descent๋ฅผ ํ•  ๋•Œ, ๋ฐœ์ž๊ตญ์˜ ํฌ๊ธฐ๋ฅผ ์ •ํ•˜๋Š” ์ผ! ์ฆ‰, ์–ผ๋งˆ๋‚˜ ์ด๋™ํ• ์ง€ ์ •ํ•œ๋‹ค.

    ์ฒ˜์Œ์— ํ•œ 0.1 ์ •๋„์—์„œ ์‹œ์ž‘ํ•˜๋‹ค๊ฐ€ ๋„ˆ๋ฌด ๋ฐœ์‚ฐํ•˜๋ฉด ์ค„์ด๊ณ , ๋„ˆ๋ฌด ์ง„์ „์ด ์—†์œผ๋ฉด(๋ณ€ํ™”๊ฐ€ ์—†์œผ๋ฉด) ๋Š˜๋ฆฌ๋Š” ์‹์œผ๋กœ ์ตœ์ ์˜ ํ•™์Šต๋ฅ ์„ ์ฐพ์•„๊ฐ€์•ผ ํ•œ๋‹ค.

     

    2. Data Preprocessing

    ๋ฐ์ดํ„ฐ ์ „์ฒ˜๋ฆฌ.

    ๋ฐ์ดํ„ฐ์˜ ํฌ๊ธฐ ํŽธ์ฐจ๊ฐ€ ๋„ˆ๋ฌด ํฌ๋ฉด, ํฐ ๊ฐ’์—๋งŒ ์ง‘์ค‘ํ•˜๊ฒŒ ๋˜์–ด์„œ ํ•™์Šต์ด ์ž˜ ๋˜์ง€ ์•Š์„ ์ˆ˜ ์žˆ๋‹ค.

    ์ •๊ทœ๋ถ„ํฌํ™”(standardization)์„ ํ†ตํ•ด์„œ ๊ท ๋“ฑํ•˜๊ฒŒ ๋งŒ๋“ค์–ด ์ค˜์•ผ ํ•œ๋‹ค.

    mu = x_train.mean(dim=0)                 # per-feature mean
    sigma = x_train.std(dim=0)               # per-feature standard deviation
    norm_x_train = (x_train - mu) / sigma    # zero mean, unit variance
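
    One caveat the snippet does not show: a test split should be standardized with the training set's statistics rather than its own, so that both splits end up on the same scale. A minimal sketch, assuming a hypothetical x_test tensor:

    norm_x_test = (x_test - mu) / sigma  # reuse the training mu/sigma (x_test is hypothetical)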

    ์ž๋งคํ’ˆ์œผ๋กœ ๊ฐ€์ค‘์น˜ ์ „์ฒ˜๋ฆฌ๋„ ์žˆ๋‹ค. 09-2์— ๋‚˜์˜จ๋‹ค.

     

    3. MNIST

    ๋ฐฉ๋Œ€ํ•œ ์–‘์˜ ์†๊ธ€์”จ ๋ฐ์ดํ„ฐ๋ฅผ ๋ชจ์•„๋†“์€ ๋ฐ์ดํ„ฐ Set!

     

    1) torchvision

    ์œ ๋ช…ํ•œ ๋ฐ์ดํ„ฐ ์…‹, ๋ชจ๋ธ ์•„ํ‚คํ…์ณ, ํŠธ๋žœ์Šค ํผ ๋“ฑ๋“ฑ์„ ์ œ๊ณตํ•œ๋‹ค.

     

    2) introduction

    • 28 x 28 image => 784 (number of inputs)
    • 1-channel gray image (the image is black and white = 1 channel)
    • 0~9 digits (the digits 0 through 9, i.e., the number of outputs is 10)
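
    These numbers are easy to check directly (a small sketch using the same dsets.MNIST call introduced below):

    import torchvision.datasets as dsets
    import torchvision.transforms as transforms

    mnist_train = dsets.MNIST(root='MNIST_data/', train=True,
                              transform=transforms.ToTensor(), download=True)
    img, label = mnist_train[0]
    print(img.shape)  # torch.Size([1, 28, 28]) -> 1 channel, 28 x 28 pixels
    print(label)      # an integer in 0~9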

    3) Reading data

    import torchvision.datasets as dsets
    
    ...
    
    mnist_train = dsets.MNIST(root="MNIST_data/", train=True, transform=transforms.ToTensor(), download=True)
    mnist_test = dsets.MNIST(root="MNIST_data/", train=False, transform=transforms.ToTensor(), download=True)

    dsets.MNIST

    • root = the path where the MNIST dataset is (or will be) stored
    • train = if True, use the training data; if False, use the test data
    • transform = decides the form of the data. ToTensor() converts it into the tensor format PyTorch uses.
    • download = if True, download the dataset when it is not present at the given path.
    data_loader = torch.utils.data.DataLoader(dataset=mnist_train, batch_size=batch_size, shuffle=True, drop_last=True)

    DataLoader: used to make mini-batches.

    • dataset = the dataset to load the data from
    • batch_size = the size of the pieces the data is split into
    • shuffle = reshuffle the data every epoch
    • drop_last = if True, drop the last batch when the data does not divide evenly and an odd remainder is left. (This prevents a final batch that is smaller than the other mini-batches from being relatively over-weighted; see the small sketch below.)
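
    A tiny demonstration of drop_last on a toy 10-sample dataset (the toy variable is illustrative, not from the lecture):

    import torch
    from torch.utils.data import TensorDataset, DataLoader

    toy = TensorDataset(torch.arange(10).float())
    print(len(DataLoader(toy, batch_size=3, drop_last=False)))  # 4 (last batch has only 1 sample)
    print(len(DataLoader(toy, batch_size=3, drop_last=True)))   # 3 (the remainder is dropped)
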
    for epoch in range(training_epochs):
        ...
        for X, Y in data_loader:
            X = X.view(-1, 28 * 28).to(device)

    x์—๋Š” mnist image, y์—๋Š” label(0~9)์ด ๋“ค์–ด๊ฐ.

    view๋ฅผ ์ด์šฉํ•ด์„œ ๋ฐ์ดํ„ฐ ํ˜•์‹์„ ๋ฐ”๊ฟ”์ค€๋‹ค. (B, 1, 28, 28) -> (B, 784)

     

    4) etc...

    • Epoch : the entire dataset has been used once for training => one epoch has passed.
    • batch size : the size of the pieces the full dataset is cut into.
    • Iteration : the number of chunks you get by dividing the full dataset by the batch size. Running that many iterations -> 1 epoch.
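
    For example, the MNIST training set has 60,000 images; with batch_size = 100, one epoch consists of 60,000 / 100 = 600 iterations.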

    5) Softmax training

     

    import torch
    import torchvision.datasets as dsets
    import torchvision.transforms as transforms
    import matplotlib.pyplot as plt
    import random
    
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    
    # for reproducibility
    random.seed(777)
    torch.manual_seed(777)
    if device == 'cuda':
        torch.cuda.manual_seed_all(777)
    
    # hyperparameters
    training_epochs = 15
    batch_size = 100
    
    # MNIST dataset
    mnist_train = dsets.MNIST(root='MNIST_data/',
                              train=True,
                              transform=transforms.ToTensor(),
                              download=True)
    
    mnist_test = dsets.MNIST(root='MNIST_data/',
                             train=False,
                             transform=transforms.ToTensor(),
                             download=True)
    
    data_loader = torch.utils.data.DataLoader(dataset=mnist_train,
                                              batch_size=batch_size,
                                              shuffle=True,
                                              drop_last=True)
    
    
    
    
    # MNIST data image of shape 28 * 28 = 784 (inputs); output => 0~9 (10 classes)
    
    linear = torch.nn.Linear(784, 10, bias=True).to(device)
    # to() = decides where the computation runs (default is CPU)
    # bias = whether to use a bias term (default is True)
    
    # define cost/loss & optimizer
    criterion = torch.nn.CrossEntropyLoss().to(device)    # Softmax is internally computed.
    optimizer = torch.optim.SGD(linear.parameters(), lr=0.1)
    
    for epoch in range(training_epochs):  # training_epochs was set to 15 above
        avg_cost = 0
        total_batch = len(data_loader)
    
        for X, Y in data_loader:  # run the iterations (a full pass = 1 epoch)
            # with batch size 100, X becomes a (100, 784) tensor in the line below
            X = X.view(-1, 28 * 28).to(device)
            # the labels are integers 0~9, not one-hot encoded
            Y = Y.to(device)
    
            optimizer.zero_grad()
            hypothesis = linear(X)
            cost = criterion(hypothesis, Y)
            cost.backward()
            optimizer.step()
    
            avg_cost += cost / total_batch
    
        print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost))
    
    print('Learning finished')
    
    
    
    # Test the model using test sets
    with torch.no_grad():
        # .data / .targets are the current attribute names (test_data / test_labels are deprecated)
        X_test = mnist_test.data.view(-1, 28 * 28).float().to(device)
        Y_test = mnist_test.targets.to(device)
    
        prediction = linear(X_test)
        correct_prediction = torch.argmax(prediction, 1) == Y_test
        accuracy = correct_prediction.float().mean()
        print('Accuracy:', accuracy.item())
    
        # Get one and predict
        r = random.randint(0, len(mnist_test) - 1)
        X_single_data = mnist_test.data[r:r + 1].view(-1, 28 * 28).float().to(device)
        Y_single_data = mnist_test.targets[r:r + 1].to(device)
    
        print('Label: ', Y_single_data.item())
        single_prediction = linear(X_single_data)
        print('Prediction: ', torch.argmax(single_prediction, 1).item())
    
        plt.imshow(mnist_test.data[r:r + 1].view(28, 28), cmap='Greys', interpolation='nearest')
        plt.show()
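
    One caveat about the evaluation above: ToTensor() scales the training pixels to [0, 1], while mnist_test.data holds raw 0~255 values, so the test inputs are on a different scale from what the model saw during training. Dividing X_test (and X_single_data) by 255 would make the two consistent.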

    <Reference>

    https://deeplearningzerotoall.github.io/season2/lec_pytorch.html

    https://wikidocs.net/60324

     

     

    '๐Ÿ“šSTUDY > ๐Ÿ”ฅPytorch ML&DL' ์นดํ…Œ๊ณ ๋ฆฌ์˜ ๋‹ค๋ฅธ ๊ธ€

    09-1. ํ™œ์„ฑํ™” ํ•จ์ˆ˜(Activation function)  (0) 2020.03.07
    08. Perceptron  (0) 2020.03.03
    06. softmax classification  (0) 2020.02.28
    05. Logistic Regression  (0) 2020.02.28
    04-2. Loading Data(Mini batch and data load)  (0) 2020.02.24

    ๋Œ“๊ธ€