The Transactions of the Korean Institute of Electrical Engineers

  1. (Dept. of Data Science, Kangwon National University, Republic of Korea.)



CNN, Deep Learning, Multiple Instance Learning, Non-contact Farrowing Status Classification, Segment Anything Model

1. Introduction

The swine industry accounts for an important share of the global livestock sector and has become established as an industry that contributes to food security. In Korea in particular, per-capita pork consumption was reported at about 30 kg in 2024, the highest among the major meats [1]. In this industry structure, pig reproduction is a key factor tied directly to productivity and profitability. Farrowing is an especially important period for the health and welfare of both the sow and her piglets, and identifying abnormal signs before and after farrowing early and responding quickly helps raise piglet survival rates and the efficiency of follow-up management. However, Korean pig farms face difficulties with continuous observation and rapid response around farrowing because of an aging workforce and labor constraints. Fig. 1 shows the age distribution of swine farm household heads from 2015 to 2023; in 2023 the share of household heads aged 60 or older had risen to 60% [2]. The use of foreign workers is expanding in response to the labor shortage, but differences in skill level and communication barriers make it hard for this to translate into higher productivity [3]. To address these problems, various approaches have been proposed to recognize the sow's state around farrowing automatically and to detect abnormal signs early. Studies have analyzed activity and posture changes with accelerometers and detected behavior with pressure and contact sensors [4], and these approaches all rely on contact-based monitoring. Contact-based methods, however, incur costs for sensor attachment and maintenance, accumulate equipment costs per animal, and face several constraints in field use, including hygiene management issues and stress on the animals.
This study therefore proposes a deep-learning-based non-contact farrowing status classification system that performs binary classification of non-farrowing versus farrowing from sow images acquired in a farrowing house. The proposed method determines farrowing status from images alone, without attaching additional sensors, which improves field applicability, and it aims to automate monitoring around farrowing in order to support more efficient management in labor-constrained environments.

Fig. 1. Age distribution of swine farm household heads by year

2. Related Work

A variety of contact-based monitoring methods have been studied in the swine field to identify the time of farrowing accurately and to respond early. Lipori et al. attached wearable sensors to sows, measured activity, heat flux, and skin temperature signals, and analyzed them to develop a system that predicts the onset of farrowing [5]. Mayrhuber et al. detected the start of pre-farrowing nest-building behavior from ear-tag accelerometer signals and presented a system that uses this behavioral signal to predict the onset of farrowing [6]. Oczak et al. quantified sow activity with an ear-tag three-axis accelerometer and, by comparing it with video-based measurement, presented an activity monitoring system that can be used to analyze behavioral changes around farrowing [7]. Such contact-based monitoring, however, can add burden for sensor attachment and maintenance, and management costs can rise because of equipment damage. To overcome these limitations, non-contact approaches that use deep learning to detect the state around farrowing automatically or to predict its timing are increasing. Yang et al. performed Convolutional Neural Network (CNN)-based sow detection and posture classification and detected posture transition intervals along the time axis, proposing a system that automatically detects the posture changes needed for management around farrowing [8]. Witte et al. captured the appearance of piglets with YOLOv5 object detection and combined it with an EfficientNet-based model to present a pipeline that detects farrowing automatically [9]. Wutke et al. applied the Noisy Student training strategy to a CNN-based newborn piglet detector on farrowing-house video and proposed a method that improves detection performance even with limited data [10].

As these studies show, approaches that use a CNN to estimate farrowing status per frame or per image have been widely used. However, the visual cues directly related to farrowing are often concentrated in a limited region around the vulva rather than spread over the whole image, and real data typically provide only image-level labels, with no location labels for that region. In this study, a CNN is therefore used as a feature extractor and combined with a Multiple Instance Learning (MIL)-based aggregation structure, which assigns larger weights to the regions, among the many local features in an image, that are more likely to contain farrowing-related cues, thereby improving farrowing status classification performance.

3. Methods

This study proposes a deep-learning-based non-contact farrowing status classification system that performs binary classification of farrowing versus non-farrowing from sow images. In a pig barn, farrowing-related information is not always located at the center of the frame because the sow's posture and position change, so a Region of Interest (ROI) referenced to the object is extracted precisely from the sow segmentation produced by the Segment Anything Model (SAM), which minimizes the influence of background information. An MIL-based aggregation scheme is then applied: the importance of each patch instance is learned and the instance outputs are combined by a weighted sum to produce the final prediction.

3.1 Dataset Composition

The data used in this study were collected at a pig barn located in Haman-gun, Gyeongsangnam-do, Republic of Korea. A 2D camera (Deep-eyes) was installed in a top-view configuration so that the sow could be observed from above, fixed at a height of 2.3 m above the floor. The collected data were stored as MP4 video, and frames were extracted from the video at 10 fps to build the image data. Each image contains one sow and several piglets, and because of the recording environment, illumination changes and occlusion can produce samples that are unsuitable for training and evaluation. Livestock experts therefore reviewed image quality and scene suitability and excluded outlier samples that were difficult to use, leaving a final set of 23,203 images. Each image was then assigned to one of two classes, farrowing or non-farrowing, and the dataset was divided into train, validation, and test sets at a ratio of approximately 6:2:2. The split was made per individual sow so that the same animal does not appear in more than one set. The number of samples differs between classes, but the difference is not large, so training proceeded without any separate imbalance handling. The detailed dataset composition is given in Table 1.

Table 1. Dataset composition for sow farrowing status classification (heads, images)

| Type | Unit | Train | Validation | Test | Total |
|---|---|---|---|---|---|
| Non-farrowing | Heads | 1,460 | 486 | 488 | 2,434 |
| Non-farrowing | Images | 6,159 | 2,269 | 2,058 | 10,486 |
| Farrowing | Heads | 1,723 | 574 | 575 | 2,872 |
| Farrowing | Images | 7,741 | 2,468 | 2,508 | 12,717 |
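The per-sow split described in Section 3.1 can be illustrated with a minimal sketch. It assumes each image is tagged with a sow identifier; the function name `split_by_sow`, the file names, and the identifiers are hypothetical and are not taken from the study. Whole sows, not individual images, are assigned to a subset, so no animal appears in more than one set.

```python
import random
from collections import defaultdict

def split_by_sow(image_to_sow, ratios=(0.6, 0.2, 0.2), seed=0):
    """Assign whole sows (not single images) to train/val/test subsets."""
    sows = sorted(set(image_to_sow.values()))
    random.Random(seed).shuffle(sows)          # reproducible shuffle of sow IDs
    n_train = round(len(sows) * ratios[0])
    n_val = round(len(sows) * ratios[1])
    subset_of = {}
    for i, sow in enumerate(sows):
        if i < n_train:
            subset_of[sow] = "train"
        elif i < n_train + n_val:
            subset_of[sow] = "val"
        else:
            subset_of[sow] = "test"
    splits = defaultdict(list)
    for image, sow in image_to_sow.items():
        splits[subset_of[sow]].append(image)   # every image follows its sow
    return dict(splits)

# Toy data: 10 hypothetical sows with 3 images each.
toy = {f"sow{s:02d}_img{i}.jpg": f"sow{s:02d}" for s in range(10) for i in range(3)}
splits = split_by_sow(toy)
```

Because the ratio is applied to sows, the image-level ratio is only approximately 6:2:2 when sows contribute different numbers of images, which matches the "approximately" in the text.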

3.2 Precise ROI Cropping Using SAM-Based Sow Segmentation

In pig barn video, farrowing-related cues are not always located at the center of the frame because of the camera viewpoint and the sow's posture and orientation. Farrowing events in particular often appear as local changes around the vulva, so a simple center crop can miss these cues or include too much background. This study therefore extracts an object-centered ROI from the SAM-based sow segmentation and applies a precise crop that includes the vulva region, where farrowing cues are concentrated.

SAM is a segmentation model that generates segmentation masks for objects and regions [11]. Its architecture includes an image encoder that extracts features from the input image and a mask decoder that uses those features to predict a pixel-level segmentation mask (together with a prompt encoder that embeds point or box prompts). The image encoder produces a high-dimensional feature representation that reflects the spatial information of the input, and the mask decoder estimates the object boundary from these features. This structure yields a segmentation mask that carries the shape of the object. SAM is applied to the input image to generate the sow segmentation mask. The centroid $c=(c_x, c_y)$ of the sow is then computed from $\Omega$, the set of pixel coordinates classified as sow. The centroid coordinates $(c_x, c_y)$ are the arithmetic mean of the pixel coordinates in $\Omega$, as given in Eq. (1).

$c_x = \frac{1}{|\Omega|} \sum_{(x,y) \in \Omega} x, \quad c_y = \frac{1}{|\Omega|} \sum_{(x,y) \in \Omega} y$ (1)

The computed centroid $c$ represents the overall position of the sow, so it serves as the reference for aligning the ROI to the object's coordinate frame. To place the vulva region closer to the center of the ROI, the $x$ coordinate of $c$ is kept and the $y$ coordinate is set to the lowest point of the segmentation mask (the largest $y$ in image coordinates), which gives the bottom reference point $b=(b_x, b_y)$ in Eq. (2).

$b_x = c_x, \quad b_y = \max\{y \mid (x,y) \in \Omega\}$ (2)
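Eqs. (1) and (2) can be checked on a small example. The sketch below assumes the mask is a 2D list of 0/1 values indexed as `mask[y][x]`, with `y` increasing downward as in image coordinates; the function name `centroid_and_bottom` and the toy mask are illustrative only and do not come from the study.

```python
def centroid_and_bottom(mask):
    """Return centroid c (Eq. 1) and bottom reference point b (Eq. 2).

    mask[y][x] is truthy for sow pixels; y grows downward.
    """
    omega = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not omega:
        raise ValueError("mask contains no foreground pixels")
    c_x = sum(x for x, _ in omega) / len(omega)   # mean x over Omega
    c_y = sum(y for _, y in omega) / len(omega)   # mean y over Omega
    b_y = max(y for _, y in omega)                # lowest mask row in the image
    return (c_x, c_y), (c_x, b_y)

# Toy 4x5 mask: a 2x3 body with one extra pixel below it.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
]
c, b = centroid_and_bottom(mask)
```

For this toy mask $\Omega$ has seven pixels, so $c = (2, 12/7)$ and $b = (2, 3)$: the reference point keeps the centroid's column and drops to the last row that still contains mask pixels.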

Using the reference point $b$ defined in this way as the ROI center keeps the ROI aligned to the sow even when she moves left or right within the frame or changes posture, and at the same time shifts the center so that the vulva region falls inside the ROI. The width and height of the ROI were set to the same fixed values in all experiments, and the fixed center crop used the same size so that the comparison is fair. Fig. 2 shows an example of defining the ROI from the centroid $c$ and the bottom reference point $b$ computed from the sow segmentation mask. The extracted ROI is resized to the input size of the classification model, normalized, and then used as input to the CNN-based classifier. All ROI images were resized to 384×384, and in the final MIL setting each image was divided into a 4×4 grid of patches, so that each patch is 96×96.
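The cropping and patching steps above can be sketched as follows. The ROI size, the frame size, and the way the box is handled at the image border (shifted inward here) are assumptions, since the text does not state them; `roi_box` and `patch_grid` are hypothetical helper names.

```python
def roi_box(b, roi_w, roi_h, img_w, img_h):
    """Fixed-size box centred on b, shifted inward so it stays inside the image.

    Border handling by shifting is an assumption; padding would be an alternative.
    """
    x0 = int(round(b[0] - roi_w / 2))
    y0 = int(round(b[1] - roi_h / 2))
    x0 = max(0, min(x0, img_w - roi_w))
    y0 = max(0, min(y0, img_h - roi_h))
    return x0, y0, x0 + roi_w, y0 + roi_h

def patch_grid(size=384, n=4):
    """Boxes (x0, y0, x1, y1) of an n x n grid over a size x size image, row by row."""
    step = size // n
    return [(col * step, row * step, (col + 1) * step, (row + 1) * step)
            for row in range(n) for col in range(n)]

# Hypothetical values: a 1280x720 frame, a 600x600 ROI, b near the lower edge.
box = roi_box(b=(640, 700), roi_w=600, roi_h=600, img_w=1280, img_h=720)
patches = patch_grid(384, 4)
```

With these example numbers the box is pushed up to `(340, 120, 940, 720)` so that it stays inside the frame, and the 4×4 grid yields 16 patches of 96×96 pixels, matching the patch size given in the text.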

Fig. 2. SAM-based ROI cropping process using centroid and bottom-center point derived from the segmentation mask

3.3 Deep-Learning-Based Classification Model

3.3.1 ConvNeXt

In sow farrowing data, local information such as the fine morphological features seen at the rear of the sow and the partial appearance of piglets is important for classification. This study therefore adopts the CNN-based ConvNeXt model to learn the local features observed in the ROI [12]. ConvNeXt is an image classification model that starts from ResNet-50 and improves performance through changes to the structural design and the training procedure. It consists of several stages and blocks, designed so that the spatial resolution of the feature map decreases and the number of channels increases as the stages get deeper. This allows features to be extracted step by step, from low-level contour and texture information to high-level morphological features, and allows the local patterns related to farrowing status to be learned stably. Inside each block, a depthwise convolution extracts spatial features per channel efficiently [13]. Pointwise convolutions then expand the channel dimension in the middle of the block and reduce it again, which preserves representational power while reducing computation. A 7×7 kernel is used to obtain a wide receptive field and capture broader spatial context, so fine features in the lower ROI can be learned together with the surrounding structural information. Following the design principle of Swin Transformer, ConvNeXt sets the block ratio across stages to 1:1:3:1 in its Tiny variant (3:3:9:3 blocks), while the larger variants, including ConvNeXt-Base, use 3:3:27:3 blocks [14]. In the stem, a 4×4 convolution layer with stride 4 performs downsampling and is followed by normalization, which yields stable initial features and reduces the computational load of the later stages. ConvNeXt comes in tiny, small, base, and large variants according to model size. Considering the memory and computational cost of the experimental environment, this study uses ConvNeXt-Base.
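The stage design described above (stride-4 stem, then resolution halving and channel growth per stage) can be made concrete with a short calculation. This is a sketch under assumptions: the channel widths `(128, 256, 512, 1024)` are those of the published ConvNeXt-Base configuration, the downsampling between stages is a stride-2 layer, and the function name `convnext_stage_shapes` is hypothetical. Whether the study feeds patches at 96×96 or resizes them first is not stated, so both input sizes are shown.

```python
def convnext_stage_shapes(input_size, dims=(128, 256, 512, 1024)):
    """Feature-map shape (channels, height, width) after each ConvNeXt stage."""
    side = input_size // 4            # stem: 4x4 convolution with stride 4
    shapes = []
    for i, channels in enumerate(dims):
        if i > 0:
            side //= 2                # stride-2 downsampling before stages 2-4
        shapes.append((channels, side, side))
    return shapes

full_roi = convnext_stage_shapes(384)   # whole 384x384 ROI
one_patch = convnext_stage_shapes(96)   # a single 96x96 patch
```

Under these assumptions a 384×384 ROI ends at a 12×12 map with 1024 channels, whereas a 96×96 patch ends at 3×3, which illustrates how much spatial context each instance retains at the last stage.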

3.3.2 Multiple Instance Learning-Based Aggregation

In this study, precise ROI cropping minimizes the influence of the background and ConvNeXt learns local features. When the input is processed as a single image, however, information from different regions is collapsed into one prediction, and the output of the regions that matter for classification may not be reflected sufficiently. This study therefore constructs the input image as multiple instances and applies an MIL scheme that combines the per-instance outputs obtained from ConvNeXt [15].

MIL is a learning method for settings in which only one label is given per input: the input is organized into several instances and trained as a whole. Individual instances receive no ground-truth label, and training uses only the label of the entire input. In this study the input image is divided into patch-level instances, each instance is fed to ConvNeXt to compute a per-instance output, and these outputs are aggregated in the combination stage to produce the final prediction. Fig. 3 shows the MIL-based ConvNeXt architecture used in this study.

Fig. 3. Multiple Instance Learning-based ConvNeXt model architecture

Because it uses patch-level inputs, this design may look similar to Transformer-family models. A Transformer, however, learns the relations between patch tokens, whereas the MIL aggregation scheme divides the input image into several patches, feeds each patch to ConvNeXt to obtain a patch-level prediction score, and aggregates these scores into the final prediction [16]. In this study the final prediction is therefore obtained by combining patch-level outputs rather than by learning relations between patches.

Two ways of integrating the instance predictions are top-k aggregation and attention-based aggregation [17]. Top-k aggregation selects the k highest instance prediction scores and integrates the predictions of the selected instances. In this study k was set empirically to 5. This choice is intended to reflect the important local information sufficiently while limiting the inflow of background information; a systematic sensitivity analysis of k remains to be carried out in future work. Attention-based aggregation, by contrast, learns the importance of each patch, assigns weights accordingly, and integrates the instance predictions by a weighted sum. Both aggregation schemes were applied and their classification performance was compared.
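The two aggregation rules can be contrasted on a toy bag of 16 patch scores. This is a minimal sketch, not the study's implementation: averaging the top-k scores is an assumption (the text only says the selected predictions are integrated), and the attention logits are supplied by hand here, whereas in a trained model they would come from a small learned scoring network applied to each instance, as in attention-based MIL [15]. All scores and names are hypothetical.

```python
import math

def top_k_pool(scores, k=5):
    """Average the k highest instance scores (averaging is an assumption)."""
    top = sorted(scores, reverse=True)[:k]
    return sum(top) / len(top)

def attention_pool(scores, attn_logits):
    """Softmax the attention logits into weights, then take the weighted sum."""
    shift = max(attn_logits)                       # subtract max for stability
    exps = [math.exp(a - shift) for a in attn_logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    bag_score = sum(w * s for w, s in zip(weights, scores))
    return bag_score, weights

# Hypothetical per-patch farrowing scores for a 4x4 grid (16 instances).
scores = [0.05, 0.10, 0.08, 0.04,
          0.12, 0.30, 0.25, 0.07,
          0.20, 0.85, 0.90, 0.15,
          0.10, 0.70, 0.75, 0.09]
# Hypothetical attention logits, here simply proportional to the scores.
attn_logits = [4.0 * s for s in scores]

bag_topk = top_k_pool(scores, k=5)
bag_attn, weights = attention_pool(scores, attn_logits)
```

Top-k ignores the 11 lowest-scoring patches entirely, while the attention weights are nonzero for every patch, so weaker cues spread over several regions still contribute to the bag-level score.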

4. Results

All model training and performance analysis were carried out in Python 3.10.13 with PyTorch 2.1.2 and CUDA 11.8. The experiments ran on a system with an NVIDIA TITAN RTX GPU and 64 GB of RAM, and the same training settings were applied to every model for a consistent comparison. The optimizer was AdamW, the batch size was 32, and the learning rate was 1e-4. The experimental environment and training settings are listed in Table 2. For evaluation, TP (True Positive), FP (False Positive), TN (True Negative), and FN (False Negative) were obtained from the confusion matrix and four metrics were computed: precision, recall, F1-score, and accuracy. Precision is the fraction of samples predicted positive that are actually positive, and recall is the fraction of actually positive samples that are correctly detected as positive. The F1-score is the harmonic mean of precision and recall, and accuracy is the fraction of all samples that are classified correctly. In farrowing classification it is important to reduce the error of missing an actual farrowing event, so recall was given particular weight. Recall alone, however, says little about false detections, and because the two classes differ somewhat in sample count, though not severely, accuracy alone may not fully reflect the differences between models. The F1-score, which reflects both precision and recall, was therefore used as the main evaluation metric. The formulas are given in Eqs. (3) to (6). To make the metrics more reliable, training was repeated three times on the same data split and every reported metric is the mean over the repeated runs. In Tables 3 and 4, ± denotes the standard deviation over the repeated runs.

$\mathrm{Precision} = \frac{TP}{TP + FP}$ (3)
$\mathrm{Recall} = \frac{TP}{TP + FN}$ (4)
$\mathrm{F1\text{-}score} = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$ (5)
$\mathrm{Accuracy} = \frac{TP + TN}{TP + FN + FP + TN}$ (6)
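Eqs. (3) to (6) and the mean ± standard deviation reporting can be sketched as below. The confusion-matrix counts and the three runs are made-up numbers for illustration, and using the sample standard deviation (`statistics.stdev`) is an assumption, since the text does not say whether the sample or population form was used.

```python
from statistics import mean, stdev

def classification_metrics(tp, fp, tn, fn):
    """Precision, recall, F1-score and accuracy from confusion-matrix counts.

    Assumes tp + fp > 0 and tp + fn > 0 so that no division by zero occurs.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

def summarize(runs, key):
    """Mean and sample standard deviation (in %) of one metric over repeated runs."""
    values = [100 * run[key] for run in runs]
    return mean(values), stdev(values)

# Hypothetical counts for a single run.
single = classification_metrics(tp=80, fp=20, tn=90, fn=10)

# Three hypothetical repeated runs on the same split.
runs = [
    classification_metrics(tp=80, fp=20, tn=90, fn=20),
    classification_metrics(tp=85, fp=18, tn=92, fn=15),
    classification_metrics(tp=90, fp=22, tn=88, fn=10),
]
recall_mean, recall_std = summarize(runs, "recall")
```

In the single-run example, recall is 80/90 while precision is 80/100, and the F1-score equals 2TP/(2TP + FP + FN) = 160/190, which is the same quantity as Eq. (5) written directly in terms of counts.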

Table 2. Experimental setup and training settings

| Component | Setting |
|---|---|
| CPU / GPU | Intel Xeon W-2133 / NVIDIA TITAN RTX |
| Python / PyTorch | 3.10.13 / 2.1.2 |
| Batch size | 32 |
| Learning rate | $1 \times 10^{-4}$ |
| Optimizer | AdamW |

To evaluate how the input configuration affects farrowing status classification, three settings were compared: the original input, a fixed crop of a given size taken from the center of the original image, and a crop of the object-centered ROI extracted from the SAM-based sow segmentation. Table 3 shows the metrics for each cropping method. With the fixed center crop and the SAM-based ROI crop, recall and F1-score improved by roughly 2 percentage points or more over the original input. With the SAM-based ROI crop in particular, recall reached 84.57%, an improvement of 2.69 percentage points, and the F1-score reached 84.62%, an improvement of 2.65 percentage points. This is interpreted as the result of segmenting the sow with SAM and then cropping the ROI precisely so that the rear region is included. With a fixed center crop the rear region can be only partially included in the ROI depending on the sow's position and posture, whereas the SAM-based ROI crop includes it more reliably. As a result, the SAM-based ROI crop provides the information that matters for farrowing classification more consistently, which contributed to the performance gain.

Table 3. Performance metrics by cropping method (unit: %)

| Method | Precision | Recall | F1-score | Accuracy |
|---|---|---|---|---|
| Original | 82.61 ± 0.77 | 81.88 ± 1.09 | 81.97 ± 0.90 | 82.25 ± 0.63 |
| Fixed center crop | 84.85 ± 1.36 | 83.95 ± 0.37 | 84.12 ± 0.45 | 84.37 ± 0.59 |
| SAM-based ROI crop | 84.94 ± 1.55 | 84.57 ± 1.03 | 84.62 ± 1.25 | 84.78 ± 1.30 |

This study also compared how farrowing classification performance changes across models and aggregation schemes when MIL-based aggregation is applied to ConvNeXt. Table 4 reports, for inputs that all use the same SAM-based ROI crop, the performance of the single models Vision Transformer, EfficientNetV2, and ConvNeXt, and of ConvNeXt with the two MIL aggregation schemes, top-k and attention. For a fair comparison, every model in Table 4 was trained and evaluated with the same SAM-based ROI input and the same training settings. Because this study targets image-level binary classification, YOLO-family methods that combine object detection results with subsequent rule-based decisions were not included as baselines. In the single-model comparison, ConvNeXt achieved the best performance among the single backbones, with a recall of 84.57% and an F1-score of 84.62%. This indicates that ConvNeXt learned the features more effectively than the other single models, so ConvNeXt was chosen as the base model for the MIL experiments. In the MIL experiments, top-k aggregation and attention-based aggregation were each applied to ConvNeXt and compared. ConvNeXt with MIL aggregation tended to improve recall and F1-score over the single ConvNeXt model. The attention-based scheme in particular recorded the highest recall, F1-score, and accuracy of all compared models, improving recall by 0.90 percentage points and F1-score by 1.06 percentage points over the single ConvNeXt model. The likely reason is that, although the meaningful cues in farrowing images sometimes appear strongly in only a specific patch, information that contributes to classification, such as the partial appearance of piglets, the shape around the vulva, and changes in body position, can also be spread over several regions.
Top-k aggregation uses only some of the instances, so when the cues are spread out the information may not be reflected sufficiently. Attention-based aggregation, by contrast, learns the importance of each instance and can reflect information from several regions together, which can be seen as the reason for its more stable performance.

To test the statistical significance of this improvement, a Wilcoxon signed-rank test was performed between the proposed model and the comparison models. The proposed model showed a statistically significant improvement over the comparison models, with a p-value below 0.001. In addition to this statistical result, FLOPs and parameter counts were computed to compare model complexity. Table 5 compares the computational cost and parameter count of each model. Because the actual FPS depends strongly on hardware and implementation, this study reports FLOPs and parameter count as hardware-independent complexity measures. Among the compared models, EfficientNetV2 had the lowest computational cost and parameter count, and Vision Transformer and ConvNeXt showed relatively similar values. For the ConvNeXt model with MIL-based aggregation, FLOPs and parameter count increased because of the additional modules, and this came together with the observed improvement in recall and F1-score. The benefit of the proposed method should therefore be interpreted with the trade-off between performance gain and increased computational complexity in mind.
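For readers unfamiliar with the test named above, the following is a minimal standard-library sketch of an exact two-sided Wilcoxon signed-rank test. The text does not state which paired quantities were compared, so the paired scores below are purely hypothetical, and the exact enumeration used here is only practical for a small number of pairs (library implementations switch to a normal approximation for larger samples).

```python
from itertools import product

def wilcoxon_signed_rank_exact(x, y):
    """Exact two-sided Wilcoxon signed-rank test for a small number of pairs."""
    diffs = [a - b for a, b in zip(x, y) if a != b]   # zero differences are dropped
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1                # tied magnitudes share the mean rank
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    statistic = min(w_plus, total - w_plus)
    # Under the null hypothesis every sign pattern is equally likely,
    # so enumerate all 2^n patterns and count those at least as extreme.
    hits = 0
    for signs in product((0, 1), repeat=n):
        wp = sum(r for r, s in zip(ranks, signs) if s)
        if min(wp, total - wp) <= statistic:
            hits += 1
    return statistic, hits / 2 ** n

# Hypothetical paired scores (e.g. per-subset scores of two models).
proposed = [86, 88, 85, 90, 87, 91, 89, 92, 94, 96]
baseline = [85, 86, 82, 86, 82, 85, 82, 84, 85, 86]
statistic, p_value = wilcoxon_signed_rank_exact(proposed, baseline)
```

In this toy case all ten differences favor the first model, so the statistic is 0 and the exact two-sided p-value is 2/2^10. This also shows that the smallest attainable p-value depends on the number of pairs entering the test.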

Table 4. Performance metrics of the models according to the MIL aggregation method (unit: %)

| Method | Precision | Recall | F1-score | Accuracy |
|---|---|---|---|---|
| Vision Transformer | 81.53 ± 1.00 | 81.76 ± 0.97 | 81.60 ± 1.10 | 81.58 ± 1.14 |
| EfficientNetV2 | 82.57 ± 1.17 | 81.68 ± 0.91 | 81.87 ± 0.89 | 82.19 ± 0.87 |
| ConvNeXt | 84.94 ± 1.55 | 84.57 ± 1.03 | 84.62 ± 1.25 | 84.78 ± 1.30 |
| ConvNeXt+MIL(Top-5) | 85.80 ± 0.78 | 85.12 ± 1.09 | 85.14 ± 0.85 | 85.35 ± 1.56 |
| ConvNeXt+MIL(Attention) | 86.47 ± 1.30 | 85.47 ± 1.16 | 85.68 ± 0.86 | 85.95 ± 1.61 |

Table 5. Comparison of computational cost and parameter counts across models

| Method | FLOPs | Parameters |
|---|---|---|
| Vision Transformer | 33.72G | 85.64M |
| EfficientNetV2 | 30.65G | 52.45M |
| ConvNeXt | 30.70G | 87.51M |
| ConvNeXt+MIL(Top-5) | 45.96G | 98.07M |
| ConvNeXt+MIL(Attention) | 45.95G | 97.54M |
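The trade-off between accuracy and complexity can be quantified directly from the values in Table 5. The short calculation below expresses the extra cost of the two MIL variants relative to the single ConvNeXt model; the dictionary layout and the function name `relative_increase` are illustrative only.

```python
# (FLOPs in G, parameters in M), copied from Table 5.
table5 = {
    "ConvNeXt": (30.70, 87.51),
    "ConvNeXt+MIL(Top-5)": (45.96, 98.07),
    "ConvNeXt+MIL(Attention)": (45.95, 97.54),
}

def relative_increase(new, base):
    """Percentage increase of `new` over `base`."""
    return 100.0 * (new - base) / base

base_flops, base_params = table5["ConvNeXt"]
overhead = {
    name: (round(relative_increase(flops, base_flops), 1),
           round(relative_increase(params, base_params), 1))
    for name, (flops, params) in table5.items()
    if name != "ConvNeXt"
}
```

By this calculation the attention-based variant costs about 49.7% more FLOPs and about 11.5% more parameters than the single ConvNeXt model, against gains of 0.90 and 1.06 percentage points in recall and F1-score reported in Table 4.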

To see how the performance of the MIL-based ConvNeXt model changes with the number of patch divisions, comparison experiments were run with 3×3, 4×4, and 5×5 patches; the results are given in Table 6. The 4×4 setting performed best. With 3×3 patches there are too few patches to reflect local feature information sufficiently, and with 5×5 patches the image is divided so finely that each patch carries less information, so the overall context does not appear to be used effectively. These results show that in an MIL structure the number of patch divisions affects the balance between local information and overall context, and that the 4×4 setting is the most appropriate in this study.

Table 6. Performance metrics of the MIL-based ConvNeXt model according to the number of patch divisions (unit: %)

| Method | Precision | Recall | F1-score | Accuracy |
|---|---|---|---|---|
| 3×3 patches | 86.00 ± 1.19 | 85.02 ± 1.21 | 85.04 ± 1.81 | 85.35 ± 1.55 |
| 4×4 patches | 86.47 ± 1.30 | 85.47 ± 1.16 | 85.68 ± 0.86 | 85.95 ± 1.61 |
| 5×5 patches | 85.34 ± 1.86 | 84.82 ± 1.88 | 84.96 ± 1.80 | 85.16 ± 1.70 |

5. Conclusion

This study proposed a deep-learning-based non-contact farrowing status classification system that classifies farrowing status from sow images. Because the sow's position and posture change in the pig barn, farrowing-related information does not stay at a fixed location in the frame; to address this, precise ROI cropping based on SAM sow segmentation was applied. In addition, because a single CNN model is limited in how well it reflects region-wise discriminative information, an MIL-based aggregation scheme was applied. In the experiments, the proposed method achieved the best performance among the compared models, with a recall of 85.47% and an F1-score of 85.68%, improvements of 3.59 and 3.71 percentage points, respectively, over the original input. These results indicate that minimizing the influence of background information through precise ROI cropping and reflecting region-wise discriminative information comprehensively through MIL-based aggregation both contributed to the performance gain.

This study, however, is based on data collected in a single pig barn, so its validation of generalization to different housing environments and recording conditions is limited. Future work will validate generalization with data collected from a variety of pig barns and will evaluate additional backbones that are lighter or more recent, in order to analyze practical applicability more closely. Beyond determining farrowing status, the research will also be extended by organizing the main behaviors that occur during farrowing into fine-grained classes and using them to identify abnormal signs early. Finally, with field deployment in mind, model compression and inference speed improvements will be studied to examine whether the system can be extended to real-time monitoring.

Acknowledgements

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (No. 2022R1I1A3053872); in part by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2023-00242528); and by the Korea Institute of Planning and Evaluation for Technology in Food, Agriculture and Forestry (IPET) and the Korea Smart Farm R&D Foundation (KosFarm) through the Smart Farm Innovation Technology Development Program, funded by the Ministry of Agriculture, Food and Rural Affairs (MAFRA), the Ministry of Science and ICT (MSIT), and the Rural Development Administration (RDA) (RS-2025-02315218).

References

[1] Korea Rural Economic Institute (KREI), "Agricultural Outlook 2025 Report," 2025.
[2] Statistics Korea, "Farm Households by Age of Farm Household Head (Census of Agriculture, Forestry and Fisheries)," 2023.
[3] Livestock Environmental Management Institute, "Comparison of the Proportion of Foreign Workers on Farms by Livestock Species," 2023.
[4] I. Traulsen, "Using Acceleration Data to Automatically Detect the Onset of Farrowing in Sows," Sensors, vol. 18, no. 1, Art. no. 170, 2018.
[5] C. Lipori, B. F. A. Laurenssen, I. Reimert, N. M. Soede, A. Youssef, "A Wearable Software Sensor for Parturition Onset Prediction in Sows," pp. 1315-1323, 2024.
[6] E. Mayrhuber, K. Maschat, D. Brunner, S. M. Winkler, M. Oczak, "Improved and interpretable accelerometer-based farrowing prediction," Biosystems Engineering, vol. 263, Art. no. 104381, 2026.
[7] M. Oczak, F. Bayer, S. Vetter, K. Maschat, J. Baumgartner, "Comparison of the automated monitoring of the sow activity in farrowing pens using video and accelerometer data," Computers and Electronics in Agriculture, vol. 192, Art. no. 106517, 2022.
[8] X. Yang, C. Zheng, C. Zou, H. Gan, S. Li, S. Huang, Y. Xue, "A CNN-based posture change detection for lactating sow in untrimmed depth videos," Computers and Electronics in Agriculture, vol. 185, Art. no. 106139, 2021.
[9] J. H. Witte, J. Gerberding, C. Lensches, I. Traulsen, "Using Deep Learning for automated birth detection during farrowing," pp. 141-154, 2022.
[10] M. Wutke, C. Lensches, U. Hartmann, I. Traulsen, "Towards automatic farrowing monitoring-A Noisy Student approach for improving detection performance of newborn piglets," PLOS ONE, vol. 19, no. 10, 2024.
[11] A. Kirillov, E. Mintun, N. Ravi, H. Mao, C. Rolland, L. Gustafson, T. Xiao, S. Whitehead, A. C. Berg, W.-Y. Lo, P. Dollár, R. Girshick, "Segment Anything," pp. 4015-4026, 2023.
[12] Z. Liu, H. Mao, C.-Y. Wu, C. Feichtenhofer, T. Darrell, S. Xie, "A ConvNet for the 2020s," pp. 11976-11986, 2022.
[13] F. Chollet, "Xception: Deep learning with depthwise separable convolutions," pp. 1251-1258, 2017.
[14] Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, B. Guo, "Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows," pp. 10012-10022, 2021.
[15] M. Ilse, J. Tomczak, M. Welling, "Attention-based Deep Multiple Instance Learning," pp. 2127-2136, 2018.
[16] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, N. Houlsby, "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale," 2021.
[17] D. J. Araújo, "Key Patches Are All You Need: A Multiple Instance Learning Framework for Robust Medical Diagnosis," 2024.

About the Authors

Hyeong-sik Won (원형식)

Hyeong-sik Won received the B.S. degree in Electronic Engineering from Kangwon National University and the M.S. degree from the Department of Data Science, Kangwon National University.

Hyun-chong Cho (조현종)

Hyun-chong Cho received his M.S. and Ph.D. degrees in electrical and computer engineering from the University of Florida, USA, in 2009. During 2010–2011, he was a Research Fellow at the University of Michigan, Ann Arbor, USA. From 2012 to 2013, he was a Chief Research Engineer at LG Electronics, South Korea. He is currently a Professor at the Department of Electronics Engineering, the Department of Data Science, and the Interdisciplinary Graduate Program for BIT Medical, Kangwon National University, South Korea.