๋ณธ๋ฌธ ๋ฐ”๋กœ๊ฐ€๊ธฐ
์นดํ…Œ๊ณ ๋ฆฌ ์—†์Œ

[PyTorch Zero to All Lecture Notes] Lecture 3: Gradient Descent

by hyerong 2024. 1. 14.

 

๊ฐ•์˜ ์ฃผ์ œ : 3๊ฐ• Gradient descendent - ๊ฒฝ์‚ฌ ํ•˜๊ฐ•๋ฒ•
๊ฐ•์˜ ๋ชฉํ‘œ : ๊ฒฝ์‚ฌํ•˜๊ฐ•๋ฒ• ์•Œ๊ณ ๋ฆฌ์ฆ˜์— ๋Œ€ํ•ด ์•Œ์•„๋ณด๊ณ  ์†์‹ค์„ ์ตœ์†Œํ™”ํ•˜๋Š” W๊ฐ’์„ ์ฐพ๋Š” ๊ณผ์ •๊ณผ ๋ฐฉ๋ฒ•์„ ์•Œ์•„๋ณธ๋‹ค. ๋˜ํ•œ ๋ฐฉ์ •์‹์„ ์‚ฌ์šฉํ•˜์—ฌ ๊ฐ€์ค‘์น˜๋ฅผ ์—…๋ฐ์ดํŠธํ•˜๊ณ  ๊ธฐ์šธ๊ธฐ๋ฅผ ๊ณ„์‚ฐํ•˜๋Š” ๋ฐฉ๋ฒ•์— ๋Œ€ํ•ด ์ด์•ผ๊ธฐํ•œ๋‹ค. 
- ๐ŸŽฏ The goal of training or learning in machine learning is to find the optimal value of W that minimizes the loss function.
- ๐Ÿงญ The gradient descent algorithm provides a systematic way to identify the optimal value of W by iteratively updating the weights based on the gradient of the loss function.
- ๐Ÿ’ก The algorithm uses the learning rate (alpha) to control the step size of weight updates, and the gradient provides the direction of movement in the loss graph.
- ๐Ÿงฎ The loss function and gradient can be expressed mathematically, allowing for easy computation and implementation.
- ๐Ÿ’ป Implementation steps involve defining the data, initializing weights, defining the forward network, computing the loss and gradient, and updating the weights in each iteration (epoch).
- ๐Ÿ“Š The algorithm's effectiveness can be observed through the reduction in loss and convergence of the weight value towards the optimal value (true W).
- โœ… Testing the trained system by providing input values and comparing the predicted output to the true values helps validate the effectiveness of the algorithm.
- ๐ŸŽฏ ๊ธฐ๊ณ„ํ•™์Šต์—์„œ ํ›ˆ๋ จ ๋˜๋Š” ํ•™์Šต์˜ ๋ชฉํ‘œ๋Š” ์†์‹ค ํ•จ์ˆ˜๋ฅผ ์ตœ์†Œํ™”ํ•˜๋Š” ์ตœ์ ์˜ W๊ฐ’์„ ์ฐพ๋Š” ๊ฒƒ์ด๋‹ค. 

- ๐Ÿงญ ๊ฒฝ์‚ฌ ํ•˜๊ฐ•๋ฒ• ์•Œ๊ณ ๋ฆฌ์ฆ˜์€ ์†์‹ค ํ•จ์ˆ˜์˜ ๊ฒฝ์‚ฌ๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ๊ฐ€์ค‘์น˜๋ฅผ ๋ฐ˜๋ณต์ ์œผ๋กœ ์ˆ˜์ •ํ•˜์—ฌ ์ตœ์ ์˜ W๊ฐ’์„ ์‹๋ณ„ํ•  ์ˆ˜ ์žˆ๋Š” ์ฒด๊ณ„์ ์ธ ๋ฐฉ๋ฒ•์„ ์ œ๊ณตํ•œ๋‹ค. 

- ๐Ÿ’ก ์•Œ๊ณ ๋ฆฌ์ฆ˜์€ ํ•™์Šต๋ฅ (์•ŒํŒŒ)๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๊ฐ€์ค‘์น˜ ์—…๋ฐ์ดํŠธ์˜ ๋‹จ๊ณ„ ํฌ๊ธฐ๋ฅผ ์ œ์–ดํ•˜๊ณ  ๊ธฐ์šธ๊ธฐ๋Š” ์†์‹ค ๊ทธ๋ž˜ํ”„์—์„œ ์ด๋™ ๋ฐฉํ–ฅ์„ ์ œ๊ณตํ•œ๋‹ค. 

- ๐Ÿงฎ ์†์‹ค ํ•จ์ˆ˜์™€ ๊ธฐ์šธ๊ธฐ๋ฅผ ์ˆ˜ํ•™์ ์œผ๋กœ ํ‘œํ˜„ํ•  ์ˆ˜ ์žˆ์–ด ๊ณ„์‚ฐ๊ณผ ๊ตฌํ˜„์ด ์šฉ์ดํ•˜๋‹ค.
 
- ๐Ÿ’ป ๊ตฌํ˜„ ๋‹จ๊ณ„์—๋Š” ๋ฐ์ดํ„ฐ ์ •์˜, ๊ฐ€์ค‘์น˜ ์ดˆ๊ธฐํ™”, ์ˆœ๋ฐฉํ–ฅ ๋„คํŠธ์›Œํฌ ์ •์˜, ์†์‹ค ๋ฐ ๊ธฐ์šธ๊ธฐ ๊ณ„์‚ฐ, ๊ฐ ๋ฐ˜๋ณต(epoch)์˜ ๊ฐ€์ค‘์น˜ ์—…๋ฐ์ดํŠธ๊ฐ€ ํฌํ•จ๋œ๋‹ค. 

- ๐Ÿ“Š ์•Œ๊ณ ๋ฆฌ์ฆ˜์˜ ํšจ์œจ์„ฑ์€ ์†์‹ค ๊ฐ์†Œ์™€ ๊ฐ€์ค‘์น˜ ๊ฐ’์ด ์ตœ์ ์˜ ๊ฐ’(true W)์œผ๋กœ ์ˆ˜๋ ด๋˜๋Š” ๊ฒƒ์„ ํ†ตํ•ด ๊ด€์ฐฐํ•  ์ˆ˜ ์žˆ๋‹ค. 

- โœ… ์ž…๋ ฅ ๊ฐ’์„ ์ œ๊ณตํ•˜๊ณ  ์˜ˆ์ธก๋œ ์ถœ๋ ฅ์„ ์‹ค์ œ ๊ฐ’๊ณผ ๋น„๊ตํ•˜์—ฌ ํ›ˆ๋ จ๋œ ์‹œ์Šคํ…œ์„ ํ…Œ์ŠคํŠธํ•˜๋ฉด ์•Œ๊ณ ๋ฆฌ์ฆ˜์˜ ํšจ์œจ์„ฑ์„ ๊ฒ€์ฆํ•˜๋Š”๋ฐ ๋„์›€์ด ๋œ๋‹ค.