• ๋Œ€ํ•œ์ „๊ธฐํ•™ํšŒ
Mobile QR Code QR CODE : The Transactions of the Korean Institute of Electrical Engineers
  • COPE
  • kcse
  • ํ•œ๊ตญ๊ณผํ•™๊ธฐ์ˆ ๋‹จ์ฒด์ด์—ฐํ•ฉํšŒ
  • ํ•œ๊ตญํ•™์ˆ ์ง€์ธ์šฉ์ƒ‰์ธ
  • Scopus
  • crossref
  • orcid

  1. (Dept. of Interdisciplinary Graduate Program for BIT Medical Convergence, Kangwon National University, Korea.)
  2. (Dept. of Electronics Engineering, Kangwon National University, Korea.)
  3. (Dept. of Internal Medicine & Institute of Health Sciences, Gyeongsang National University School of Medicine and Gyeongsang National University Hospital, Korea.)



Keywords: CADx, Gastric Diagnosis, Classification, Convolutional Neural Network, Deep Learning, Vision Transformer

1. ์„œ ๋ก 

2020๋…„ ๊ตญ์ œ์•”์—ฐ๊ตฌ์†Œ IARC(International Agency for Research on Cancer)์—์„œ ๋ฐœํ‘œํ•œ ๋ณด๊ณ ์„œ์— ๋”ฐ๋ฅด๋ฉด ์œ„์•”์€ ํŠนํžˆ ํ•œ๊ตญ, ์ค‘๊ตญ, ์ผ๋ณธ ๋“ฑ ๋™์•„์‹œ์•„์ธ์—๊ฒŒ ๋ฐœ๋ณ‘๋ฅ ์ด ๋†’์€ ์งˆ๋ณ‘์ด๋‹ค(1). ๋™์•„์‹œ์•„ ์ง€์—ญ์€ ์ธ๊ตฌ 10๋งŒ ๋ช…๋‹น 32.5๋ช…์œผ๋กœ 2์œ„์ธ ๋™์œ ๋Ÿฝ ๋ฐœ๋ณ‘๋ฅ ์ธ 17.4๋ช…๋ณด๋‹ค 15.1๋ช… ๋” ๋†’์€ ์ˆ˜์น˜๋ฅผ ๊ฐ€์ง€๊ณ  ์žˆ๋‹ค. ๋‹ค์Œ ๊ทธ๋ฆผ 1์€ ์œ„์•” ๋ฐœ๋ณ‘๋ฅ ์ด ๊ฐ€์žฅ ๋†’์€ ์ƒ์œ„ 5๊ฐœ ์ง€์—ญ์˜ ๋ฐœ๋ณ‘๋ฅ ์„ ๋‚˜ํƒ€๋‚ด์—ˆ๋‹ค. ์œ„์•” ์ดˆ๊ธฐ์—๋Š” ์ฆ์ƒ์„ ๋ณด์ด์ง€ ์•Š์•„ ์กฐ๊ธฐ ๋ฐœ๊ฒฌ์ด ํž˜๋“ค๋‹ค. ํ•˜์ง€๋งŒ ์ฆ์ƒ์„ ๋ณด์ด๊ณ  ์ถ”ํ›„ ๋ฐœ๊ฒฌ ์‹œ ์ด๋ฏธ ์ „์ด๋œ ์ง„ํ–‰์„ฑ ์œ„์•”์ผ ๊ฒฝ์šฐ๊ฐ€ ๋งค์šฐ ๋†’๋‹ค. ์กฐ๊ธฐ ์œ„์•”๊ณผ ์ง„ํ–‰์„ฑ ์œ„์•”์€ ์ƒ์กด์œจ์— ํฐ ์ฐจ์ด๋ฅผ ๋ณด์ธ๋‹ค.

๊ทธ๋ฆผ. 1. ์œ„์•” ๋ฐœ๋ณ‘๋ฅ ์ด ๊ฐ€์žฅ ๋†’์€ 5๊ฐœ ์ง€์—ญ์˜ ๋ฐœ๋ณ‘๋ฅ 

Fig. 1. Incidence rates in the five regions with the highest incidence of gastric cancer

../../Resources/kiee/KIEE.2023.72.11.1399/fig1.png

2023๋…„ ๋Œ€ํ•œ๋ฏผ๊ตญ ๋ณด๊ฑด๋ณต์ง€๋ถ€์—์„œ ๋ฐœํ‘œํ•œ 2020๋…„ ๊ตญ๊ฐ€์•”๋“ฑ๋ก ์‚ฌ์—…๋ณด๊ณ ์„œ์— ๋”ฐ๋ฅด๋ฉด ๋ณ‘๋ณ€์ด ๊ตญํ•œ๋œ 1๊ธฐ ํ™˜์ž์˜ 5๋…„ ์ƒ์กด์œจ์€ 97.5%๋กœ ์ƒ๋‹นํžˆ ๋†’์œผ๋‚˜ 2-3๊ธฐ ๊ตญ์†Œ ๋ณ‘๋ณ€ ํ™˜์ž์˜ 5๋…„ ์ƒ์กด์œจ์€ 62.3%, ์›๊ฒฉ์ „์ด๊ฐ€ ์žˆ๋Š” ๋ง๊ธฐ ์œ„์•” ํ™˜์ž์˜ ์ƒ์กด์œจ์€ 6.7%๋กœ ๊ธ‰๊ฒฉํžˆ ๊ฐ์†Œํ•œ๋‹ค(2). ๋”ฐ๋ผ์„œ ๋Œ€ํ•œ๋ฏผ๊ตญ ๋ณด๊ฑด๋ณต์ง€๋ถ€์—์„œ๋Š” ์œ„์•” ์˜ˆ๋ฐฉ๊ณผ ์กฐ๊ธฐ ๋ฐœ๊ฒฌ์„ ์œ„ํ•ด 40์„ธ ์ด์ƒ ๊ตญ๋ฏผ์„ ๋Œ€์ƒ์œผ๋กœ 2๋…„๋งˆ๋‹ค ์œ„๋‚ด์‹œ๊ฒฝ ๊ฒ€์‚ฌ๋ฅผ ๋ฐ›์„ ์ˆ˜ ์žˆ๋„๋ก ๊ตญ๊ฐ€์•”๊ฒ€์ง„ ํ”„๋กœ๊ทธ๋žจ์„ ์šด์˜ํ•˜๊ณ  ์žˆ๋‹ค. ๊ตญ๋ฆฝ์•”์„ผํ„ฐ์˜ ์œ„์•” ๊ฒ€์ง„ ์ˆ˜๊ฒ€๋ฅ ์— ๋”ฐ๋ฅด๋ฉด 2004๋…„ ์•ฝ 40%์— ๋ถˆ๊ณผํ•˜๋˜ ์ˆ˜๊ฒ€๋ฅ  ์ถ”์ด๊ฐ€ ๊พธ์ค€ํžˆ ์ƒ์Šนํ•˜์—ฌ 2012๋…„์— 74.2%๋ฅผ ๋‹ฌ์„ฑํ•˜๊ณ  ๊พธ์ค€ํžˆ 70% ์ด์ƒ์„ ์œ ์ง€ ์ค‘์ด๋‹ค(3). ํ•˜์ง€๋งŒ ๋Œ€ํ•œ์˜ํ•™ํšŒ์˜ 2021๋…„ ๋ถ„๊ณผ์ „๋ฌธ์˜ ์ œ๋„ ์—ฐ๋ณด์— ๋”ฐ๋ฅด๋ฉด ์‹ ๊ทœ ์†Œํ™”๊ธฐ๋‚ด๊ณผ ๋ถ„๊ณผ์ „๋ฌธ์˜ ์ˆ˜๋Š” 2012๋…„ 200๋ช… ์ดํ›„ ๊พธ์ค€ํžˆ ๊ฐ์†Œํ•˜์—ฌ 2021๋…„์—๋Š” 120๋ช…์œผ๋กœ ์ตœ์ €์น˜๋ฅผ ๋‹ฌ์„ฑํ•˜์˜€๋‹ค(4). ์ด๋Š” ๋‚ด์‹œ๊ฒฝ์„ ๊ฒ€์‚ฌํ•  ์ „๋ฌธ์˜ ์ˆ˜๊ฐ€ ๋ถ€์กฑํ•ด์ง€๊ณ , ์ „๋ฌธ์˜๊ฐ€ ๋ถ€๋‹ดํ•ด์•ผ ํ•  ์ˆ˜๊ฒ€์ž ์ˆ˜๋Š” ์ฆ๊ฐ€ํ•จ์„ ์˜๋ฏธํ•œ๋‹ค. ํ•œ ๋‚ด์‹œ๊ฒฝ ์ „๋ฌธ์˜๊ฐ€ ๋ถ€๋‹ดํ•ด์•ผ ํ•  ์ˆ˜๊ฒ€์ž ์ˆ˜๊ฐ€ ์ฆ๊ฐ€ํ• ์ˆ˜๋ก ํ”ผ๋กœ๋„๋Š” ์ฆ๊ฐ€ํ•˜๊ณ , ์ด๋Š” ์˜ค์ง„๊ณผ ๊ฐ™์€ ์น˜๋ช…์ ์ธ ์‹ค์ˆ˜๋ฅผ ์œ ๋ฐœํ•  ์ˆ˜ ์žˆ๋Š” ์š”์ธ์ด ๋œ๋‹ค. ์ด๋ฅผ ํ•ด๊ฒฐํ•˜๊ณ ์ž ์œ„๋‚ด์‹œ๊ฒฝ ๋‹จ๊ณ„์—์„œ ์ผ์ •ํ•˜๊ณ  ์ •ํ™•๋„ ๋†’์€ ์ง„๋‹จ์œผ๋กœ ์˜์‚ฌ์—๊ฒŒ 2์ฐจ ์˜๊ฒฌ์„ ์ œ์‹œํ•˜๋Š” ๋”ฅ๋Ÿฌ๋‹ ๊ธฐ๋ฐ˜ ์œ„ ๋ณ‘๋ณ€ ์œ„๋‚ด์‹œ๊ฒฝ ์ด๋ฏธ์ง€ ๋ถ„๋ฅ˜ CADx(Computer-Aided Diagnosis, ์ปดํ“จํ„ฐ ๋ณด์กฐ ์ง„๋‹จ ์‹œ์Šคํ…œ)๋ฅผ ์œ„ํ•œ ๊ฐœ๋ฐœ์ด ํ™œ๋ฐœํžˆ ์—ฐ๊ตฌ๋˜๊ณ  ์žˆ๋‹ค. ์œ„๋‚ด์‹œ๊ฒฝ ์ด๋ฏธ์ง€๋ฅผ ํ†ตํ•œ ์œ„์žฅ๊ด€ ์งˆํ™˜ ๋ถ„๋ฅ˜ ๋ชจ๋ธ CADx ์‹œ์Šคํ…œ ๊ฐœ๋ฐœ ์—ฐ๊ตฌ๊ฐ€ ์ˆ˜ํ–‰๋˜์—ˆ๋‹ค(5). ๊ธฐ์กด ์—ฐ๊ตฌ๋ฅผ ํ†ตํ•ด CADx ์‹œ์Šคํ…œ์ด ์˜์‚ฌ์—๊ฒŒ ์ผ๊ด€์ ์ธ 2์ฐจ ์˜๊ฒฌ์„ ์ œ์‹œํ•˜๊ณ  ์•ˆ์ •์ ์ธ ์„ฑ๋Šฅ์„ ๋ณด์—ฌ์ค„ ์ˆ˜ ์žˆ์Œ์„ ๋ณด์˜€๋‹ค. D-CNN(Deep Convolution Neural Network, ์‹ฌ์ธต ์ปจ๋ณผ๋ฃจ์…˜ ์‹ ๊ฒฝ๋ง)์„ ํ™œ์šฉํ•˜์—ฌ ์œ„ ์งˆํ™˜์„ ๋ถ„๋ฅ˜ํ•˜๋Š” ์—ฐ๊ตฌ๋„ ์ง„ํ–‰๋˜์—ˆ๋‹ค(6). InceptionV3 ๋ฐ DenseNet-201๋ฅผ ํ†ตํ•ด ํŠน์ง• ๋ฒกํ„ฐ๋ฅผ ์ถ”์ถœํ•˜๊ณ  Binary dragonfly ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์ ์šฉํ•˜์—ฌ ์ •ํ™•๋„ 99.8%์˜ ์„ฑ๋Šฅ์„ ๋ณด์˜€๋‹ค. ์œ„ ์งˆํ™˜๋ฟ๋งŒ ์•„๋‹ˆ๋ผ ์กฐ๊ธฐ ์œ„์•”์„ ์ง„๋‹จํ•˜๋Š” ์—ฐ๊ตฌ ๋˜ํ•œ ์ˆ˜ํ–‰๋˜์—ˆ๋‹ค(7). ResNet-50๋ฅผ ํ™œ์šฉํ•˜์—ฌ ์กฐ๊ธฐ ์œ„์•” ๋ถ„๋ฅ˜ ์ •ํ™•๋„ 98.7%๋ฅผ ๋‹ฌ์„ฑํ•˜์˜€๋‹ค. ๋˜ํ•œ ์œ„ ๋ณ‘๋ณ€ ํƒ์ง€์— ๊ด€ํ•œ ์—ฐ๊ตฌ๋„ ์ˆ˜ํ–‰๋˜์—ˆ๋‹ค(8). ์—‘์Šค๋ ˆ์ด(X-ray) ์ด๋ฏธ์ง€๋ฅผ ํ†ตํ•ด ์•…์„ฑ ์˜์—ญ์„ ๊ฒ€์ถœํ•˜๊ธฐ ์œ„ํ•˜์—ฌ Faster R-CNN ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•˜์—ฌ ์žฌํ˜„์œจ 92.3%๋ฅผ ๋‹ฌ์„ฑํ•˜์˜€๋‹ค. ์กฐ๊ธฐ ์œ„์•”(EGC) ๋ฐ ์ง„ํ–‰์„ฑ ์œ„์•”(AGC) ๋ณ‘๋ณ€์„ ๋ถ„ํ•  ํ•˜๋Š” ์—ฐ๊ตฌ ๋˜ํ•œ ์ง„ํ–‰๋˜์—ˆ๋‹ค(9). U-Net์„ ํ†ตํ•ด ์กฐ๊ธฐ ์œ„์•”์˜ Dice ๊ณ„์ˆ˜๋Š” 0.555, ์ง„ํ–‰์„ฑ ์œ„์•”์˜ Dice ๊ณ„์ˆ˜๋Š” 0.716์„ ๋‹ฌ์„ฑํ•˜์˜€๋‹ค. ์ „๋ฐ˜์ ์œผ๋กœ ๊ธฐ์กด ์—ฐ๊ตฌ์—์„œ๋Š” ํƒ€ ๋ถ„์•ผ์— ๋น„ํ•ด ๋ฐ์ดํ„ฐ ์ˆ˜์ง‘์— ์žฅ๊ธฐ๊ฐ„์ด ์†Œ์š”๋˜๊ณ  ์ž‘์€ ๋ฐ์ดํ„ฐ์…‹์„ ๊ตฌ์„ฑํ•จ์„ ํ™•์ธํ•  ์ˆ˜ ์žˆ๋‹ค. ์ด๋Ÿฌํ•œ ์ ์„ ํ•ด๊ฒฐํ•˜๊ณ ์ž ๋ฐ์ดํ„ฐ ์ฆ๋Œ€๋ฅผ ํ†ตํ•ด ์„ฑ๋Šฅ์„ ํ–ฅ์ƒํ•œ ์—ฐ๊ตฌ๊ฐ€ ์ง„ํ–‰๋˜์—ˆ๋‹ค(10). ๊ธฐ์กด ์—ฐ๊ตฌ์—์„œ๋Š” ์ด๋ฏธ์ง€๋ฅผ ํšŒ์ „, ์ด๋™, ์ ˆ๋‹จ, ์คŒ ๋ฐ ๋’ค์ง‘๊ธฐ๋ฅผ ํฌํ•จํ•œ ์•„ํ™‰ ๊ฐ€์ง€ ์œ ํ˜•์˜ ์ฆ๋Œ€๊ธฐ๋ฒ•์„ ์ ์šฉํ•˜์˜€๋‹ค. ๊ทธ ๊ฒฐ๊ณผ ์ฆ๋Œ€๊ธฐ๋ฒ•์„ ์‚ฌ์šฉํ•˜์ง€ ์•Š์€ ๋„คํŠธ์›Œํฌ๋ณด๋‹ค AUC๊ฐ€ 1.5% ํ–ฅ์ƒํ•˜๋Š” ๊ฒฐ๊ณผ๋ฅผ ์–ป์—ˆ๋‹ค. ๋ฐ์ดํ„ฐ ์ฆ๋Œ€๋ฟ๋งŒ ์•„๋‹ˆ๋ผ ์ „์ด ํ•™์Šต์„ ํ•จ๊ป˜ ์ ์šฉํ•˜์—ฌ ์œ„๋‚ด์‹œ๊ฒฝ์„ ํ†ตํ•œ ์œ„์žฅ ์งˆํ™˜์„ ๋ถ„๋ฅ˜ํ•˜๋Š” ์—ฐ๊ตฌ๋„ ์ˆ˜ํ–‰๋˜์—ˆ๋‹ค(11). 
์ƒํ•˜์ขŒ์šฐ ์ด๋™, ํ™•๋Œ€/์ถ•์†Œ, ๋ฐ๊ธฐ ์กฐ์ •, ์ƒํ•˜ ๋ฐ˜์ „ ๋“ฑ์„ ํ™œ์šฉํ•œ ๋ฐ์ดํ„ฐ ์ฆ๋Œ€ ๋ฐฉ์‹๊ณผ ์ „์ด ํ•™์Šต์„ ํ†ตํ•ด VGGNet์€ 10.61%, InceptionNet์€ 11.8%, ResNet์€ 14.99%์˜ ์„ฑ๋Šฅ ๊ฐœ์„ ์„ ๋ณด์˜€๋‹ค.

๊ธฐ์กด ์—ฐ๊ตฌ๋ฅผ ํ†ตํ•ด ์•Œ ์ˆ˜ ์žˆ๋“ฏ์ด ์˜๋ฃŒ ๋ฐ์ดํ„ฐ๋Š” ํ™˜์ž์˜ ๋™์˜ ์™€ IRB(Institutional Review Board, ๊ธฐ๊ด€ ๊ฒ€ํ†  ์œ„์›ํšŒ)์˜ ์Šน์ธ์ด ํ•„์ˆ˜์ ์ด์–ด์„œ ์ˆ˜์ง‘ ๊ธฐ๊ฐ„์ด ์˜ค๋ž˜ ๊ฑธ๋ฆฌ๊ณ  ์ž‘์€ ๋ฐ์ดํ„ฐ์…‹์ด ์ฃผ๋กœ ๊ตฌ์„ฑ๋œ๋‹ค. ์ด๋Ÿฌํ•œ ๋ฐ์ดํ„ฐ์…‹์€ CADx์˜ ์„ฑ๋Šฅ์„ ๊ฐ์†Œ์‹œํ‚ค๋Š” ์š”์ธ์ด ๋  ์ˆ˜ ์žˆ๋‹ค. ์ด๋ฅผ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•˜์—ฌ Google์—์„œ ๊ฐœ๋ฐœํ•œ AutoAugment ์ฆ๋Œ€์ •์ฑ…์„ ์ ์šฉํ•˜์˜€๋‹ค. ๋˜ํ•œ ๋ณธ ์—ฐ๊ตฌ์—์„œ๋Š” SAM(Sharpness Aware Minimization) ์ตœ์ ํ™” ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์ ์šฉํ•œ CNN ๊ธฐ๋ฐ˜์˜ ConvNeXt์™€ ํŠธ๋žœ์Šคํฌ๋จธ ๊ธฐ๋ฐ˜์˜ ViT(Vision Transformer)๋ฅผ ์ ์šฉํ•˜์˜€๋‹ค. ์ด๋ฅผ ํ†ตํ•ด ์›๋ณธ ๋ฐ์ดํ„ฐ์…‹๊ณผ ์ฆ๋Œ€ ๋ฐ์ดํ„ฐ์…‹์˜ ํ•™์Šต ๊ฒฐ๊ณผ๋ฅผ ๊ฐ๊ฐ ๋น„๊ตํ•จ์œผ๋กœ์จ ๋น„์ •์ƒ ์œ„ ๋ณ‘๋ณ€ ๋ถ„๋ฅ˜ ์ปดํ“จํ„ฐ ๋ณด์กฐ ์ง„๋‹จ ์‹œ์Šคํ…œ์˜ ์„ฑ๋Šฅ์„ ํ–ฅ์ƒํ•˜๊ณ ์ž ํ•œ๋‹ค.

2. ํ•™์Šต ๋ฐ์ดํ„ฐ

๋ณธ ์—ฐ๊ตฌ์—์„œ๋Š” ๊ฒฝ์ƒ๊ตญ๋ฆฝ๋Œ€ํ•™๊ต๋ณ‘์›์˜ ์†Œํ™”๊ธฐ๋‚ด๊ณผ๋ฅผ ํ†ตํ•ด ์ˆ˜์ง‘ํ•œ ๋น„์ •์ƒ ๋ฐ ์ •์ƒ ๋ฐฑ์ƒ‰๊ด‘ ์œ„๋‚ด์‹œ๊ฒฝ ์ด๋ฏธ์ง€๋ฅผ ์‚ฌ์šฉํ•˜์˜€๋‹ค. ๋ณธ ์—ฐ๊ตฌ์— ์‚ฌ์šฉ๋œ ๋ชจ๋“  ๋ฐ์ดํ„ฐ์…‹์€ ํ™˜์ž ๊ฐœ์ธ์˜ ๋™์˜์™€ IRB์˜ ์Šน์ธ์„ ๋ฐ›๊ณ  ์ˆ˜์ง‘๋˜์—ˆ๋‹ค. ๋˜ํ•œ ์กฐ์ง ๊ฒ€์‚ฌ ๋ฐ ์ „๋ฌธ์˜์˜ 2์ฐจ ๊ฒ€์ฆ์„ ํ†ตํ•ด ๋ฐ์ดํ„ฐ์…‹์˜ ์‹ ๋ขฐ๋„๋ฅผ ํ–ฅ์ƒํ•˜์˜€๋‹ค.

2.1 ๋ฐ์ดํ„ฐ์…‹ ๊ตฌ์„ฑ

๋ณธ ์—ฐ๊ตฌ์— ์‚ฌ์šฉ๋œ ๋ฐ์ดํ„ฐ์…‹์€ ์ด 96๋ช…์˜ ํ™˜์ž๋กœ๋ถ€ํ„ฐ ๋น„์ •์ƒ ์ด๋ฏธ์ง€ 300์žฅ๊ณผ ์ •์ƒ ์ด๋ฏธ์ง€ 300์žฅ์„ ์ˆ˜์ง‘ํ•˜์—ฌ ์ด 600์žฅ์„ ๊ตฌ์„ฑํ•˜์˜€๋‹ค. ์œ„๋‚ด์‹œ๊ฒฝ ์ด๋ฏธ์ง€๋Š” ํ•œ ํ™˜์ž๋กœ๋ถ€ํ„ฐ ์—ฌ๋Ÿฌ ์žฅ์ด ์ˆ˜์ง‘๋  ๊ฐ€๋Šฅ์„ฑ์ด ์žˆ๋‹ค. ํ•œ ํ™˜์ž์˜ ์ด๋ฏธ์ง€๊ฐ€ ํ›ˆ๋ จ ๋ฐ์ดํ„ฐ์…‹๊ณผ ํ…Œ์ŠคํŠธ ๋ฐ์ดํ„ฐ์…‹์— ๋“ค์–ด๊ฐ€๋ฉด ์ถฉ๋ถ„ํ•œ ํ•™์Šต์ด ๋˜์ง€ ์•Š๊ณ , ๊ณผ์ ํ•ฉ์ด ๋ฐœ์ƒํ•  ๊ฐ€๋Šฅ์„ฑ์ด ๋†’๊ธฐ์— ํ™˜์ž ๊ตฌ์„ฑ์ด ์ค‘์š”ํ•˜๋‹ค. ๋”ฐ๋ผ์„œ ๋ณธ ์—ฐ๊ตฌ์—์„œ๋Š” ํ›ˆ๋ จ ๋ฐ ๊ฒ€์ฆ, ํ…Œ์ŠคํŠธ ๋ฐ์ดํ„ฐ์…‹์— ํ™˜์ž๊ฐ€ ๊ฒน์น˜์ง€ ์•Š๋„๋ก ๋ฌด์ž‘์œ„ ๋ถ„๋ฐฐํ•˜์˜€๋‹ค. ๋ณธ ์—ฐ๊ตฌ์— ์ ์šฉํ•œ ํ™˜์ž ์ˆ˜ ๋ฐ ์ด๋ฏธ์ง€ ์ˆ˜ ๊ตฌ์„ฑ์€ ํ‘œ 1์— ๋‚˜ํƒ€๋‚ด์—ˆ๋‹ค.

ํ‘œ 1. ๋น„์ •์ƒ ์œ„๋‚ด์‹œ๊ฒฝ ์ด๋ฏธ์ง€ ๋ฐ์ดํ„ฐ์…‹ ๊ตฌ์„ฑ(๋‹จ์œ„ : ์žฅ)

Table 1. Construction of abnormal gastroscopy image dataset

Number     Type                       Train   Valid   Test
Patients   Normal                       28      10      10
           Abnormal   Gastritis          9       5       3
           Abnormal   Ulcer              7       1       1
           Abnormal   Polyp              2       2       2
           Abnormal   Others            10       2       4
Images     Normal                      180      60      60
           Abnormal   Gastritis         77      40      23
           Abnormal   Ulcer             51       6       4
           Abnormal   Polyp              8       7       6
           Abnormal   Others            44       7      27
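
To make the patient-level split concrete, the sketch below groups images by patient ID so that no patient spans two subsets, using scikit-learn's GroupShuffleSplit. The helper name and the 20%/25% hold-out fractions are illustrative assumptions, not details given in the paper.

```python
# Sketch: patient-level train/valid/test split (assumptions noted above).
from sklearn.model_selection import GroupShuffleSplit

def split_by_patient(image_paths, patient_ids, seed=42):
    # Hold out 20% of patients for the test set.
    outer = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=seed)
    trainval_idx, test_idx = next(outer.split(image_paths, groups=patient_ids))

    # From the remaining patients, hold out 25% for validation
    # (0.25 of the remaining 80% = 20% of all patients).
    inner_groups = [patient_ids[i] for i in trainval_idx]
    inner = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=seed)
    train_rel, valid_rel = next(inner.split(trainval_idx, groups=inner_groups))

    train_idx = [trainval_idx[i] for i in train_rel]
    valid_idx = [trainval_idx[i] for i in valid_rel]
    return train_idx, valid_idx, list(test_idx)
```

Because the split is made over patients rather than images, the image count per subset depends on how many images each held-out patient contributed, which is consistent with the uneven per-class counts in Table 1.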

2.2 ๋ฐ์ดํ„ฐ ์ฆ๋Œ€

์œ„ ๋ณ‘๋ณ€์€ ์ฆ์ƒ์ด ๋‹ค์–‘ํ•œ ๋งŒํผ ์—ฌ๋Ÿฌ ํŠน์ง•์„ ํฌํ•จํ•ด์•ผ ํ•œ๋‹ค. ์ด๋Ÿฌํ•œ ํŠน์ง•์€ ๋”ฅ๋Ÿฌ๋‹ ํ•™์Šต์— ์žˆ์–ด ๋งค์šฐ ์ค‘์š”ํ•œ ์š”์†Œ์ด๋ฉฐ ์œ„ ๋ณ‘๋ณ€ ์ด๋ฏธ์ง€๊ฐ€ ๋ถ€์กฑํ•  ๋•Œ ๊ณผ์ ํ•ฉ์ด๋‚˜ ํ•™์Šต์ด ์ถฉ๋ถ„ํžˆ ๋˜์ง€ ์•Š์„ ๊ฐ€๋Šฅ์„ฑ์ด ๋งค์šฐ ๋†’๋‹ค. ์œ„๋‚ด์‹œ๊ฒฝ ์ด๋ฏธ์ง€ ๋ฐ์ดํ„ฐ์…‹์€ ์˜๋ฃŒ ์˜์ƒ์ด๋ฏ€๋กœ ํ™˜์ž ๊ฐœ์ธ์ •๋ณด๋ณดํ˜ธ๋ฅผ ์œ„ํ•ด ํ™˜์ž์˜ ๋™์˜ ๋ฐ ์ต๋ช…ํ™”๋ฅผ ํ•„์š”๋กœ ํ•œ๋‹ค. ๊ทธ๋ ‡๊ธฐ์— ์ˆ˜์ง‘์— ์˜ค๋žœ ๊ธฐ๊ฐ„์ด ์†Œ์š”๋˜๊ณ , ๋‹ค๋ฅธ ๋ถ„์•ผ์˜ ๋ฐ์ดํ„ฐ์…‹๋ณด๋‹ค ์ž‘์€ ๋ฐ์ดํ„ฐ์…‹์ด ๊ตฌ์„ฑ๋œ๋‹ค. ์ด๋Ÿฌํ•œ ์˜๋ฃŒ ์˜์ƒ ๋ถ€์กฑ ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•˜๊ณ ์ž ๋ณธ ์—ฐ๊ตฌ์—์„œ๋Š” AutoAugment๋ฅผ ์ ์šฉํ•˜์˜€๋‹ค(12). AutoAugment๋Š” ๊ตฌ๊ธ€์—์„œ ์ œ์•ˆํ•œ ์ฆ๋Œ€์ •์ฑ…์œผ๋กœ ์—ฌ๋Ÿฌ ๋ฐ์ดํ„ฐ์…‹์— ๋Œ€ํ•œ ์ตœ์ ์˜ ์ฆ๋Œ€์ •์ฑ…์„ ์ œ๊ณตํ•œ๋‹ค. AutoAugment๋Š” 25๊ฐœ์˜ ํ•˜์œ„ ์ •์ฑ…์œผ๋กœ ๊ตฌ์„ฑ๋˜๋ฉฐ ๊ฐ ํ•˜์œ„ ์ •์ฑ…์€ Shear, Translate, Auto Contrast, Equalize ๋“ฑ 16๊ฐ€์ง€ ์˜์ƒ์ฒ˜๋ฆฌ ๋ฐฉ์‹ ์ค‘ 2๊ฐ€์ง€์™€ ๊ฐ ๋ฐฉ์‹์˜ ์ ์šฉ ํ™•๋ฅ  10๋‹จ๊ณ„, ์ ์šฉ ๊ฐ•๋„ 11๋‹จ๊ณ„๋กœ ๊ตฌ์„ฑ๋œ๋‹ค. ์ด๋ฅผ ํ†ตํ•ด ์‚ฌ์ „ ํ•™์Šต๋œ ๋ชจ๋ธ์„ ํ†ตํ•ด ์ •ํ•ด์ง„ ์ฆ๋Œ€์ •์ฑ…์„ ์œ„ ๋ณ‘๋ณ€ ์ด๋ฏธ์ง€์— ์ ์šฉํ•˜์—ฌ ์›๋ณธ ๋ฐ์ดํ„ฐ์…‹์˜ 25๋ฐฐ๋ฅผ ์ฆ๋Œ€ํ•˜์—ฌ ๋ณธ ์—ฐ๊ตฌ์— ์‚ฌ์šฉํ•˜์˜€๋‹ค. AutoAugment๋Š” Cifar10, ImageNet, ๊ทธ๋ฆฌ๊ณ  SVHN ์„ธ ๊ฐ€์ง€ ์ฃผ์š” ๋ฐ์ดํ„ฐ์…‹์— ๋Œ€ํ•ด ํŠนํ™”๋œ ์ด๋ฏธ์ง€ ์ฆ๋Œ€์ •์ฑ…์„ ์ œ์‹œํ•œ๋‹ค. Cifar10 ๋ฐ์ดํ„ฐ์…‹์€ 32x32 ํ”ฝ์…€ ํฌ๊ธฐ์˜ ์ด๋ฏธ์ง€๋กœ, ์ด 10๊ฐœ์˜ ํด๋ž˜์Šค๊ฐ€ ์žˆ๋‹ค(13). ๋ฐ˜๋ฉด, ImageNet์€ 1,000๊ฐœ ์ด์ƒ์˜ ํด๋ž˜์Šค์™€ ํ•จ๊ป˜ 140๋งŒ ๊ฐœ๊ฐ€ ๋„˜๋Š” ์ด๋ฏธ์ง€๋ฅผ ํฌํ•จํ•˜๊ณ  ์žˆ๋‹ค(14). SVHN์€ Google Street View์—์„œ ์ถ”์ถœํ•œ ์ˆซ์ž ์ด๋ฏธ์ง€๋กœ, ๋Œ€๋žต 10๋งŒ ๊ฐœ์˜ ์ด๋ฏธ์ง€๋กœ ์ด๋ฃจ์–ด์ ธ ์žˆ๋‹ค(15). ๋ณธ ์—ฐ๊ตฌ์—์„œ๋Š” ์ด ์ค‘์—์„œ๋„ ๊ฐ€์žฅ ๋ฐฉ๋Œ€ํ•œ ์ด๋ฏธ์ง€์™€ ํด๋ž˜์Šค๋ฅผ ๊ฐ€์ง„ ImageNet์˜ ์ฆ๋Œ€์ •์ฑ…์„ ํ™œ์šฉํ•˜์˜€๋‹ค.

3. ๋”ฅ๋Ÿฌ๋‹ ๋ชจ๋ธ๊ณผ ์ตœ์ ํ™” ์•Œ๊ณ ๋ฆฌ์ฆ˜

3.1 ConvNeXt

ViT(Vision Transformer)๋Š” 2020๋…„ ๋“ฑ์žฅํ•ด ๊ธฐ์กด CNN ๊ธฐ๋ฐ˜ ๋ชจ๋ธ๋ณด๋‹ค ๋†’์€ ์„ฑ๋Šฅ์„ ๋ณด์—ฌ์ฃผ๋Š” ๋ชจ๋ธ์ด๋‹ค(16). ํ•˜์ง€๋งŒ Transformer๋Š” ์ž…๋ ฅ์˜ ์œ„์น˜๊ฐ€ ๋ณ€ํ•˜๋ฉด ์ถœ๋ ฅ์˜ ์œ„์น˜๊ฐ€ ๋ณ€ํ•œ ์ •๋ณด๋ฅผ ์œ ์ง€ํ•˜๊ธฐ ํž˜๋“ค์–ด Inductive bias๊ฐ€ CNN ๊ธฐ๋ฐ˜ ๋ชจ๋ธ์— ๋น„ํ•ด ์ƒ๋Œ€์ ์œผ๋กœ ๋ถ€์กฑํ•œ ๋ชจ์Šต์„ ๋ณด์˜€๋‹ค. ConvNeXt๋Š” Inductive bias๋ฅผ ์œ„ํ•ด CNN์ด ์ค‘์š”ํ•˜๋‹ค๋Š” ์ ์„ ํ™œ์šฉํ•˜์˜€๋‹ค(17). Resnet-50์„ ์ตœ์‹ ํ™” ๊ธฐ๋ฒ•์„ ์ ์šฉํ•˜์—ฌ ์„ฑ๋Šฅ์„ ์˜ฌ๋ฆฌ๊ณ ์ž ํ•˜์˜€๋‹ค(18). Mixup, Cutmix, Random Augment์™€ ๊ฐ™์€ ๋ฐ์ดํ„ฐ ์ฆ๊ฐ• ๊ธฐ๋ฒ•๋งŒ ์•„๋‹ˆ๋ผ Stochastic depth, label smoothing๊ณผ ๊ฐ™์€ ๊ธฐ๋ฒ•๋„ ์ ์šฉํ•˜์˜€๋‹ค. ๋˜ํ•œ Swin transformer์˜ ๊ตฌ์กฐ๋ฅผ ํ™œ์šฉํ•˜์—ฌ Stage๋งˆ๋‹ค ๋ธ”๋ก์˜ ๊ฐœ์ˆ˜๋ฅผ 3:4:6:3์—์„œ 3:3:9:3์œผ๋กœ ๋ณ€ํ™˜ํ•˜๊ณ  Stem ๋ถ€๋ถ„์—์„œ๋„ ์„ฑ๋Šฅ์„ ํ–ฅ์ƒํ•˜๊ธฐ ์œ„ํ•ด ViT์˜ Patchify layer์˜ ํ˜•ํƒœ์ธ 4x4 Convolution, 4 stride๋กœ ๋ณ€๊ฒฝํ•˜์˜€๋‹ค. ๋˜ํ•œ ResNeXt์˜ ResNeXt-ify๋ฅผ ์ ์šฉํ•˜์˜€๋‹ค. 256์ฐจ์› ์ž…๋ ฅ์„ ์ด 32 path๋กœ ๋‚˜๋ˆ„๊ณ  ์ฑ„๋„ ์ˆ˜๋ฅผ 4๋กœ ์ค„์ธ ํ›„ ๋‹ค์‹œ 256์ฑ„๋„๋กœ ํ‚ค์›Œ ๋ชจ๋“  path๋ฅผ ํ•ฉ์น˜๊ณ  Depthwise separable convolution์„ ์ถ”๊ฐ€๋กœ ๋ฐฐ์น˜ํ•˜์˜€๋‹ค. ์ด๋ฅผ ํ†ตํ•ด ์—ฐ์‚ฐ๋Ÿ‰์€ ์ค„์ด๊ณ  ์„ฑ๋Šฅ์€ ํ–ฅ์ƒํ•˜๋Š” ๊ฒฐ๊ณผ๋ฅผ ๋‹ฌ์„ฑํ•˜์˜€๋‹ค. ๋‹ค์Œ ๊ทธ๋ฆผ 2๋Š” ResNeXt-ify์˜ ์ž์„ธํ•œ ๊ตฌ์กฐ๋ฅผ ๋‚˜ํƒ€๋‚ด์—ˆ๋‹ค.

๊ทธ๋ฆผ. 2. ConvNeXt์˜ ResNeXt-ify์˜ ์ƒ์„ธ ๊ตฌ์กฐ

Fig. 2. Detailed structure of ResNeXt-ify by ConvNeXt

../../Resources/kiee/KIEE.2023.72.11.1399/fig2.png

ํ™œ์„ฑํ™” ํ•จ์ˆ˜๋Š” ์‹ ๊ฒฝ๋ง์˜ ๊ฐ ๋…ธ๋“œ์—์„œ ๋น„์„ ํ˜•์„ ์ถ”๊ฐ€ํ•œ๋‹ค. ํŠธ๋žœ์Šคํฌ๋จธ ๊ตฌ์กฐ์—๋Š” ๋ชจ๋ธ์˜ ๋ณต์žก์„ฑ์„ ์ค„์ด๊ณ  ๊ณ„์‚ฐ ํšจ์œจ์„ฑ์„ ๋†’์ด๊ณ ์ž ์ผ๋ฐ˜ CNN๋ชจ๋ธ๋ณด๋‹ค ๋” ์ ์€ ํ™œ์„ฑํ™” ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•œ๋‹ค. ์ด๋ฅผ ConvNeXt์— ์ ์šฉํ•˜๊ณ ์ž 1x1 Convolution layer๋ฅผ ์ œ์™ธํ•˜๊ณ  ๋‚จ์€ ๋ ˆ์ด์–ด์—์„œ ํ™œ์„ฑํ™” ํ•จ์ˆ˜๋ฅผ ์ œ๊ฑฐํ•˜์˜€๋‹ค. ๋˜ํ•œ ๋ฐฐ์น˜ ์ •๊ทœํ™” ์ˆ˜๋ฅผ ์ค„์ด๊ณ  ๋ ˆ์ด์–ด ์ •๊ทœํ™”๋ฅผ ์ถ”๊ฐ€ํ•˜์—ฌ ์„ฑ๋Šฅ์„ ํ–ฅ์ƒํ•˜๊ณ  ์ผ๋ฐ˜ํ™” ์„ฑ๋Šฅ์„ ๋‹ฌ์„ฑํ•  ์ˆ˜ ์žˆ๋„๋ก ํ•˜์˜€๋‹ค. ๋ณธ ์—ฐ๊ตฌ์—์„œ๋Š” 21,000๊ฐœ์˜ ํด๋ž˜์Šค์™€ 1,000๋งŒ ์žฅ ์ด์ƒ์˜ ์ด๋ฏธ์ง€๋กœ ๊ตฌ์„ฑ๋œ ImageNet21k ๋ฐ์ดํ„ฐ์…‹์„ ์ ์šฉํ•˜์—ฌ ์‚ฌ์ „ ํ›ˆ๋ จ๋œ ConvNeXt-B ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•˜์˜€๋‹ค.

3.2 ViT (Vision Transformer)

ํŠธ๋žœ์Šคํฌ๋จธ ๊ตฌ์กฐ๋Š” ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ ๋ถ„์•ผ์—์„œ ๋›ฐ์–ด๋‚œ ์„ฑ๋Šฅ์œผ๋กœ ์„ฑ๊ณต์„ ๊ฑฐ๋’€๋‹ค. ํŠธ๋žœ์Šคํฌ๋จธ๋Š” ๋‹จ์–ด๋‚˜ ๋ฌธ์žฅ์˜ ๊ด€๊ณ„๋ฅผ ํŒŒ์•…ํ•  ์ˆ˜ ์žˆ๊ฒŒ ํ•ด์ฃผ๋Š” ๊ตฌ์กฐ๋กœ ์ด๋ฅผ ํ†ตํ•ด ๋ฌธ์žฅ์˜ ๋ฌธ๋งฅ์„ ์ดํ•ดํ•  ์ˆ˜ ์žˆ๋‹ค. ์ด๋Ÿฌํ•œ ์ ์„ ์‘์šฉํ•˜์—ฌ ํŠธ๋žœ์Šคํฌ๋จธ์— Vision task๋ฅผ ์ ‘๋ชฉํ•˜๋ ค๋Š” ์‹œ๋„๊ฐ€ ์ˆ˜ํ–‰๋˜์—ˆ๋‹ค. ๊ทธ ๊ฒฐ๊ณผ๋กœ ViT(Vision Transformer)๊ฐ€ ๋“ฑ์žฅํ•˜์˜€๋‹ค. ViT๋Š” ๊ธฐ์กด์˜ CNN ๋ชจ๋ธ๊ณผ๋Š” ๋‹ค๋ฅด๊ฒŒ ์ด๋ฏธ์ง€๋ฅผ ํŒจ์น˜ ๋‹จ์œ„๋กœ ๋‚˜๋ˆ„์–ด ํŠธ๋žœ์Šคํฌ๋จธ์˜ ์ž…๋ ฅ์œผ๋กœ ์‚ฌ์šฉํ•œ๋‹ค. ๊ฐ ํŒจ์น˜๋Š” ์ด๋ฏธ์ง€์˜ ์ผ๋ถ€๋ถ„์„ ๋‚˜ํƒ€๋‚ด๋ฉฐ, ์ด ํŒจ์น˜๋“ค ์‚ฌ์ด์˜ ๊ด€๊ณ„๋ฅผ ํŠธ๋žœ์Šคํฌ๋จธ๊ฐ€ ํ•™์Šตํ•˜๊ฒŒ ๋œ๋‹ค. ๋‹ค์Œ ๊ทธ๋ฆผ 3์€ ViT์˜ ์ „์ฒด์ ์ธ ํ๋ฆ„์„ ๋‚˜ํƒ€๋‚ด์—ˆ๋‹ค.

๊ทธ๋ฆผ. 3. ViT ์•„ํ‚คํ…์ณ ์„ธ๋ถ€ ๊ตฌ์กฐ

Fig. 3. ViT architecture detailed structure

../../Resources/kiee/KIEE.2023.72.11.1399/fig3.png

ViT๋Š” ํŒจ์น˜ ์ž„๋ฒ ๋”ฉ(Patch Embedding)์„ ํ†ตํ•˜์—ฌ ์ด๋ฏธ์ง€๋ฅผ ๊ณ ์ •๋œ ํฌ๊ธฐ์˜ ํŒจ์น˜๋กœ ๋‚˜๋ˆ„๊ณ  ์ด๋ฅผ 1์ฐจ์› ๋ฒกํ„ฐ๋กœ ๋ณ€ํ™˜ํ•œ ๋‹ค์Œ ์œ„์น˜ ์ž„๋ฒ ๋”ฉ(Position Embedding)์œผ๋กœ ๊ฐ ํŒจ์น˜์— ์œ„์น˜ ์ •๋ณด๋ฅผ ์ถ”๊ฐ€ํ•œ๋‹ค. ๋ณ€ํ™˜๋œ 1์ฐจ์› ์ž…๋ ฅ ๋ฒกํ„ฐ๋Š” ํŠธ๋žœ์Šคํฌ๋จธ ์ธ์ฝ”๋”์—์„œ Feed- forward ์‹ ๊ฒฝ๋ง๊ณผ Multi-head Self-Attention์„ ํ†ตํ•ด ์ด๋ฏธ์ง€์˜ ๋ณต์žกํ•œ ํŒจํ„ด๊ณผ ์ •๋ณด๋ฅผ ์ถ”์ถœํ•œ๋‹ค. ๋งˆ์ง€๋ง‰์œผ๋กœ MLP(Multi Layer Perceptron, ๋‹ค์ธต ํผ์…‰ํŠธ๋ก ) Head์—์„œ ์ž…๋ ฅ ์ด๋ฏธ์ง€์˜ ํด๋ž˜์Šค๋ฅผ ๋ถ„๋ฅ˜ํ•˜๋Š” ์ž‘์—…์„ ์ˆ˜ํ–‰ํ•œ๋‹ค. ๋ณธ ์—ฐ๊ตฌ์—์„œ๋Š” ConvNeXt ๋ชจ๋ธ๊ณผ ๋น„๊ตํ•˜๊ธฐ ์œ„ํ•˜์—ฌ 21,000๊ฐœ์˜ ํด๋ž˜์Šค์™€ 1,000๋งŒ ์žฅ ์ด์ƒ ์ด๋ฏธ์ง€๋กœ ๊ตฌ์„ฑ๋œ ImageNet21k๋กœ ์‚ฌ์ „ ํ›ˆ๋ จ๋œ ViT-B ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•˜์˜€๋‹ค.

3.3 SAM (Sharpness Aware Minimization)

ํ˜„์žฌ ๋งŽ์€ ๋”ฅ๋Ÿฌ๋‹ ๋ชจ๋ธ์˜ ํ›ˆ๋ จ ์†์‹ค์˜ ๊ทธ๋ž˜ํ”„๋Š” ๋ณต์žกํ•˜๊ณ  ๋‚ ์นด๋กœ์šด ํ˜•ํƒœ๋ฅผ ๊ฐ€์ง„๋‹ค. ์ „์—ญ ์ตœ์†Œ ์†์‹ค์„ ์ฐพ๊ธฐ ์œ„ํ•ด ์ผ๋ฐ˜ํ™”๊ฐ€ ํ•„์ˆ˜์ ์ธ๋ฐ, SAM(Sharpness Aware Minimization) ์ตœ์ ํ™” ์•Œ๊ณ ๋ฆฌ์ฆ˜์€ ์†์‹ค ๊ฐ’์„ ๋‚ฎ์ถ”๊ณ  ๋‚ ์นด๋กœ์šด ํ˜•ํƒœ ๋˜ํ•œ ์ตœ์†Œํ™”ํ•จ์œผ๋กœ์จ ์ผ๋ฐ˜ํ™”๋ฅผ ๊ฐœ์„ ํ•˜์˜€๋‹ค(19). SAM ์ตœ์ ํ™” ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์ˆœ์ฐจ์ ์œผ๋กœ ํ‘œํ˜„ํ•˜๋ฉด ๋‹ค์Œ ์‹๊ณผ ๊ฐ™๋‹ค. $W(t)$๋Š” ํ˜„์žฌ ๊ฐ€์ค‘์น˜์ด๋ฉฐ, $\nabla_{W}L(W_{t},\:X,\:Y)$๋Š” $W(t)$์—์„œ์˜ ์†์‹ค ํ•จ์ˆ˜ $L$์— ๋Œ€ํ•œ ๊ธฐ์šธ๊ธฐ์ด๋‹ค. $X$ ๋ฐ $Y$๋Š” ์ž…๋ ฅ ๋ฐ์ดํ„ฐ์™€ ํ•ด๋‹น ๋ ˆ์ด๋ธ”์ด๋‹ค. $\rho$๋Š” ๊ต๋ž€์˜ ํฌ๊ธฐ๋ฅผ ์กฐ์ ˆํ•˜๋Š” ํ•˜์ดํผํŒŒ๋ผ๋ฏธํ„ฐ, $\eta$๋Š” ํ•™์Šต๋ฅ (LR, Learning Rate)์ด๋‹ค.

(1)
$W_{adv} = W_t + \rho \cdot \dfrac{\nabla_W L(W_t,\, X,\, Y)}{\left\| \nabla_W L(W_t,\, X,\, Y) \right\|_2}$

(2)
$g = \nabla_W L(W_{adv},\, X,\, Y)$

(3)
$W_{t+1} = W_t - \eta\, g$

$W(t)$์—์„œ ์†์‹ค์ด ์ตœ๋Œ€ํ™”๋˜๋„๋ก ํ•˜๋Š” ๊ต๋ž€(perturbation)์„ ์ฐพ๋Š”๋‹ค. ์ด ๊ต๋ž€์„ ์ ์šฉํ•œ ํ›„์˜ ๊ฐ€์ค‘์น˜๋ฅผ $W_{adv}$๋ผ๊ณ  ํ•œ๋‹ค. ์ดํ›„ $W_{adv}$์—์„œ ์†์‹ค์ด ์ตœ์†Œํ™”๋˜๋„๋ก ํ•˜๋Š” ๊ธฐ์šธ๊ธฐ๋ฅผ ๊ณ„์‚ฐ ํ›„ ์›๋ž˜ ๊ฐ€์ค‘์น˜ $W(t)$์—์„œ ๊ฒฝ์‚ฌ ํ•˜๊ฐ•๋ฒ•์„ ์‹œํ–‰ํ•˜์—ฌ ๊ฐ€์ค‘์น˜๋ฅผ ์—…๋ฐ์ดํŠธํ•œ๋‹ค. ์ด๋Ÿฌํ•œ ์ ˆ์ฐจ๋ฅผ ํ†ตํ•ด SAM์€ ํ›ˆ๋ จ ์ค‘์— ๋ชจ๋ธ์˜ ๊ฐ€์ค‘์น˜์— ๊ต๋ž€์„ ์ฃผ์–ด ์†์‹ค์˜ ๋‚ ์นด๋กœ์šด ๋ถ€๋ถ„์„ ์ธ์‹ํ•˜๊ณ  ์ด๋ฅผ ์ตœ์†Œํ™”ํ•œ๋‹ค.

4. ์—ฐ๊ตฌ๊ฒฐ๊ณผ

๋ณธ ์—ฐ๊ตฌ๋Š” ConvNeXt์™€ ViT์— SAM ์ตœ์ ํ™” ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์ ์šฉํ•˜์—ฌ ๋น„์ •์ƒ๊ณผ ์ •์ƒ ์œ„๋‚ด์‹œ๊ฒฝ ์ด๋ฏธ์ง€๋ฅผ ๋ถ„๋ฅ˜ํ•˜๊ณ ์ž ํ•˜์˜€๋‹ค. ํ•™์Šต์€ ๊ณผ์ ํ•ฉ์„ ๋ฐฉ์ง€ํ•˜๊ณ ์ž ํ•™์Šต ๋ฐ ๊ฒ€์ฆ ์†์‹ค๊ฐ’์ด ๋” ์ด์ƒ์˜ ์ˆ˜๋ ด์„ ๋ณด์ด์ง€ ์•Š์„ ๋•Œ ์ข…๋ฃŒํ•˜์˜€๋‹ค. ๋˜ํ•œ ์„ฑ๋Šฅ์„ ํ–ฅ์ƒํ•˜๊ณ ์ž ์ฆ๋Œ€์ •์ฑ…์ธ AutoAugment๋ฅผ ์ ์šฉํ•˜์—ฌ ํ•™์Šต ๋ฐ์ดํ„ฐ๋ฅผ 25๋ฐฐ ์ฆ๋Œ€ํ•˜๊ณ  ์›๋ณธ ๋ฐ์ดํ„ฐ์…‹์„ ๋”ํ•˜์˜€๋‹ค. ์›๋ณธ ๋ฐ์ดํ„ฐ์…‹๊ณผ ์ฆ๋Œ€ํ•œ ๋ฐ์ดํ„ฐ์…‹์˜ ํ•™์Šต ๋ฐ์ดํ„ฐ๋ฅผ ํ‘œ 2์— ์ž์„ธํžˆ ๋‚˜ํƒ€๋‚ด์—ˆ๋‹ค.

ํ‘œ 2. ์œ„๋‚ด์‹œ๊ฒฝ ์ด๋ฏธ์ง€ ํ•™์Šต ์›๋ณธ ๋ฐ ์ฆ๋Œ€ ๋ฐ์ดํ„ฐ์…‹ ๊ตฌ์„ฑ(๋‹จ์œ„ : ์žฅ)

Table 2. Constructing original and augmented gastroscopy image learning dataset

Type    Number of Training Images
        Original    Augmented
ABN     180         4,680
NOR     180         4,680

์ด๋ฅผ ํ†ตํ•ด ๊ฐ ๋ชจ๋ธ์ด ๋น„์ •์ƒ๊ณผ ์ •์ƒ ์ด๋ฏธ์ง€์˜ ๋ฏธ์„ธํ•œ ๋ณ€ํ™”์™€ ๋‹ค์–‘ํ•œ ํŒจํ„ด์„ ํ•™์Šตํ•˜์—ฌ ๊ณผ์ ํ•ฉ์„ ๋ฐฉ์ง€ํ•˜๊ณ  CADx์˜ ์„ฑ๋Šฅ์„ ํ–ฅ์ƒํ•  ์ˆ˜ ์žˆ๋„๋ก ํ•˜์˜€๋‹ค.

ํ•™์Šต๋œ ๋ชจ๋ธ์˜ ์„ฑ๋Šฅ ํ‰๊ฐ€๋ฅผ ๋น„๊ตํ•˜๊ธฐ ์œ„ํ•ด ๋ชจ๋ธ๋งˆ๋‹ค ์›๋ณธ ๋ฐ์ดํ„ฐ์…‹๊ณผ ์ฆ๋Œ€ ๋ฐ์ดํ„ฐ์…‹์„ ์ ์šฉํ•˜์—ฌ ํ…Œ์ŠคํŠธ๋ฅผ ์ง„ํ–‰ํ•˜์˜€๋‹ค. ๋ถ„๋ฅ˜ ์„ฑ๋Šฅ์œผ๋กœ๋Š” ์ •๋ฐ€๋„(Precision), ๋ฏผ๊ฐ๋„(Sensitivity), F1-score, ์ •ํ™•๋„(Accuracy)๋ฅผ ํ‰๊ฐ€์ง€ํ‘œ๋กœ ์‚ฌ์šฉํ•˜์˜€๋‹ค. ์ •๋ฐ€๋„๋Š” ๋ชจ๋ธ์ด ๋ถ„๋ฅ˜ํ•œ ๋น„์ •์ƒ ์ด๋ฏธ์ง€ ์ค‘ ์‹ค์ œ ๋น„์ •์ƒ ์ด๋ฏธ์ง€์ธ ๊ฒฝ์šฐ์˜ ๋น„์œจ์ด๋‹ค. ๋ฏผ๊ฐ๋„๋Š” ์ „์ฒด ๋น„์ •์ƒ ํด๋ž˜์Šค ์ค‘ ๋ชจ๋ธ์ด ๋น„์ •์ƒ ํด๋ž˜์Šค ๋ถ„๋ฅ˜์— ์„ฑ๊ณตํ•œ ๋น„์œจ์„ ์˜๋ฏธํ•œ๋‹ค. F1-score๋Š” ์ •๋ฐ€๋„์™€ ๋ฏผ๊ฐ๋„์˜ ์กฐํ™”ํ‰๊ท ์œผ๋กœ ํŽธํ–ฅ๋œ ํ‰๊ฐ€๋ฅผ ๋ฐฉ์ง€ํ•œ๋‹ค. ์ •ํ™•๋„๋Š” ์ „์ฒด ์ด๋ฏธ์ง€ ์ค‘ ๋ชจ๋ธ์ด ์ •ํ™•ํ•˜๊ฒŒ ๋น„์ •์ƒ๊ณผ ์ •์ƒ์„ ๋ถ„๋ฅ˜ํ•œ ๋น„์œจ์„ ์˜๋ฏธํ•œ๋‹ค. ํ‘œ 3์€ ์›๋ณธ ๋ฐ์ดํ„ฐ์…‹ ๋ฐ ์ฆ๋Œ€ํ•œ ๋ฐ์ดํ„ฐ์…‹์œผ๋กœ ํ•™์Šต๋œ ConvNeXt ๋ชจ๋ธ๊ณผ SAM ์ตœ์ ํ™” ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์ ์šฉํ•œ ๋ชจ๋ธ์˜ ๋น„์ •์ƒ๊ณผ ์ •์ƒ ํ…Œ์ŠคํŠธ ์ด๋ฏธ์ง€ ๋ถ„๋ฅ˜ ๊ฒฐ๊ณผ์ด๋‹ค.

ํ‘œ 3. ConvNeXt ๋ชจ๋ธ์˜ ์„ธ๋ถ€ ๋ถ„๋ฅ˜ ์„ฑ๋Šฅ

Table 3. Detailed classification performance of ConvNeXt modelv

Model              Metric         Abnormal vs Normal
                                  Original    AutoAugment
ConvNeXt-Base      Precision      0.7770      0.7756
                   Sensitivity    0.7167      0.7583
                   F1-score       0.7456      0.7668
                   Accuracy       0.7167      0.7583
ConvNeXt-Base      Precision      0.9585      0.9833
(SAM Optimizer)    Sensitivity    0.9583      0.9833
                   F1-score       0.9584      0.9833
                   Accuracy       0.9583      0.9833

๋ชจ๋“  ํ‰๊ฐ€์ง€ํ‘œ๋ฅผ ๋น„๊ตํ–ˆ์„ ๋•Œ, ConvNeXt์— SAM์„ ์ ์šฉํ•˜์˜€์„ ๊ฒฝ์šฐ ๋งค์šฐ ํฐ ์„ฑ๋Šฅ ์ฐจ์ด๋ฅผ ๋ณด์˜€๋‹ค. ์ด๋Š” ์˜๋ฃŒ ์˜์ƒ์„ ํ†ตํ•ด ๊ฐœ๋ฐœํ•œ CADx์˜ ํ‰๊ฐ€์ง€ํ‘œ๋Š” ์ •ํ™•๋„๋ณด๋‹ค ๋ฏผ๊ฐ๋„๊ฐ€ ๋” ์ค‘์š”ํ•œ ์˜๋ฏธ๋ฅผ ๋‚ดํฌํ•œ๋‹ค. ์˜๋ฃŒ ๋ฐ์ดํ„ฐ๋Š” ํด๋ž˜์Šค๋งˆ๋‹ค ํ™˜์ž์˜ ๋ถ„ํฌ๊ฐ€ ๋‹ฌ๋ผ ๋ถˆ๊ท ํ˜•ํ•œ ๋ฐ์ดํ„ฐ์…‹์ด ํ˜•์„ฑ๋  ์ˆ˜ ์žˆ์œผ๋ฏ€๋กœ ์ •ํ™•๋„๋งŒ์œผ๋กœ ํ‰๊ฐ€ํ•˜๋ฉด ์‹ ๋ขฐ๋„๊ฐ€ ๊ฐ์†Œํ•œ๋‹ค. ์‹ค์ œ๋กœ๋Š” ๋น„์ •์ƒ์ด๋‚˜ ์ •์ƒ์œผ๋กœ ๋ถ„๋ฅ˜ํ•  ๊ฒฝ์šฐ, ํ™˜์ž์˜ ์œ„ ๋ณ‘๋ณ€์˜ ๋ฐœ๊ฒฌ์ด ๋Šฆ์–ด ์น˜๋ฃŒ ์‹œ๊ธฐ๋ฅผ ๋†“์น  ์ˆ˜ ์žˆ๋‹ค. ๋”ฐ๋ผ์„œ ๋ฏผ๊ฐ๋„๋Š” ์ค‘์š”ํ•œ ํ‰๊ฐ€์ง€ํ‘œ๋กœ ์‚ฌ์šฉ๋œ๋‹ค. ํ•˜์ง€๋งŒ ๋ณธ ์—ฐ๊ตฌ์—์„œ๋Š” ๋ฐ์ดํ„ฐ์…‹์˜ ๊ฐ ํด๋ž˜์Šค ์žฅ ์ˆ˜๊ฐ€ ๋™์ผํ•˜๋ฏ€๋กœ ๋ฏผ๊ฐ๋„์™€ ์ •ํ™•๋„๊ฐ€ ๋™์ผํ•œ ๋ชจ์Šต์„ ๋ณด์—ฌ์ค€๋‹ค. SAM์„ ์ ์šฉํ•˜์ง€ ์•Š์€ ConvNeXt์˜ ๋ฏผ๊ฐ๋„๋ฅผ ๋ณด๋ฉด ์›๋ณธ ๋ฐ์ดํ„ฐ์…‹์œผ๋กœ ํ•™์Šตํ•˜์˜€์„ ๋•Œ๋Š” 0.7167์„ ๋‹ฌ์„ฑํ•˜์˜€๊ณ  ์ฆ๋Œ€ ๋ฐ์ดํ„ฐ์…‹์œผ๋กœ ํ•™์Šตํ•˜์˜€์„ ๋•Œ 0.7583์„ ๋‹ฌ์„ฑํ•˜์—ฌ ์›๋ณธ ๋ฐ์ดํ„ฐ์…‹ ๋Œ€๋น„ 4.16%์˜ ์„ฑ๋Šฅ ํ–ฅ์ƒ์„ ๋ณด์˜€๋‹ค. ๋ฐ˜๋ฉด์— Original ๋ชจ๋ธ์— SAM ์ตœ์ ํ™” ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์ ์šฉํ•˜์˜€์„ ๊ฒฝ์šฐ, 0.9583์—์„œ 0.9833์œผ๋กœ SAM์„ ์ ์šฉํ•˜์ง€ ์•Š์€ ConvNeXt๋ณด๋‹ค ์„ฑ๋Šฅ ํ–ฅ์ƒ ํญ 2.5%๋กœ ์›๋ณธ ๋ฐ์ดํ„ฐ์…‹ ๋Œ€๋น„ ํ–ฅ์ƒํญ์€ ์ ์ง€๋งŒ Original ๋ชจ๋ธ ๋Œ€๋น„ 24.16%, 22.5%์˜ ์„ฑ๋Šฅ ํ–ฅ์ƒํญ์„ ๋ณด์˜€๋‹ค. ๋‹ค์Œ ํ‘œ 4๋Š” ViT-Base ๋ชจ๋ธ์˜ ์„ธ๋ถ€ ๋ถ„๋ฅ˜ ์„ฑ๋Šฅ์„ ๋‚˜ํƒ€๋‚ด์—ˆ๋‹ค.

ํ‘œ 4. ViT ๋ชจ๋ธ์˜ ์„ธ๋ถ€ ๋ถ„๋ฅ˜ ์„ฑ๋Šฅ

Table 4. Detailed classification performance of ViT model

Model              Metric         Abnormal vs Normal
                                  Original    AutoAugment
ViT-Base           Precision      0.9520      0.9520
                   Sensitivity    0.9500      0.9500
                   F1-score       0.9510      0.9510
                   Accuracy       0.9500      0.9500
ViT-Base           Precision      0.7769      0.9595
(SAM Optimizer)    Sensitivity    0.7750      0.9583
                   F1-score       0.7760      0.9589
                   Accuracy       0.7750      0.9583

SAM์„ ์ ์šฉํ•˜์ง€ ์•Š์€ ViT์˜ Sensitivity๋ฅผ ํ™•์ธํ•˜๋ฉด ์›๋ณธ ๋ฐ์ดํ„ฐ์…‹์œผ๋กœ ํ•™์Šตํ•˜์˜€์„ ๋•Œ 0.9500์„ ๋‹ฌ์„ฑํ•˜์˜€์œผ๋ฉฐ ์ฆ๋Œ€ ๋ฐ์ดํ„ฐ์…‹์œผ๋กœ ํ•™์Šตํ•˜์˜€์„ ๋•Œ ๋˜ํ•œ ์›๋ณธ ๋ฐ์ดํ„ฐ์…‹๊ณผ ๋™์ผํ•œ ์„ฑ๋Šฅ์ด ๋‚˜์™”๋‹ค. SAM์„ ์ ์šฉํ•œ ViT์˜ ๊ฒฝ์šฐ ์›๋ณธ ๋ฐ์ดํ„ฐ์…‹์œผ๋กœ ํ•™์Šตํ•˜์˜€์„ ๋•Œ๋Š” 0.7750์œผ๋กœ ๋ณด๋‹ค ๋‚ฎ์€ ์„ฑ๋Šฅ์„ ๋ณด์—ฌ์ฃผ์—ˆ์ง€๋งŒ ์ฆ๋Œ€ ๋ฐ์ดํ„ฐ์…‹์œผ๋กœ ํ•™์Šตํ•˜์˜€์„ ๋•Œ 0.9583์œผ๋กœ ์›๋ณธ ๋ฐ์ดํ„ฐ์…‹ ๋Œ€๋น„ 18.33% ์„ฑ๋Šฅ ํ–ฅ์ƒํญ์„ ๋ณด์˜€๋‹ค. ๊ฒฐ๊ณผ์ ์œผ๋กœ, ์ฆ๋Œ€ ๋ฐ์ดํ„ฐ์…‹์„ ์‚ฌ์šฉํ•˜๊ณ  SAM์„ ์ ์šฉํ•œ ๋ชจ๋ธ์ด ๋” ์šฐ์ˆ˜ํ•œ ์„ฑ๋Šฅ์„ ๊ฐ€์ง์„ ์•Œ ์ˆ˜ ์žˆ๋‹ค.

5. ๊ฒฐ ๋ก 

๋ณธ ์—ฐ๊ตฌ์—์„œ๋Š” ๊ฒฝ์ƒ๊ตญ๋ฆฝ๋Œ€ํ•™๊ต๋ณ‘์› ์†Œํ™”๊ธฐ๋‚ด๊ณผ์—์„œ ์ˆ˜์ง‘ํ•œ ๋น„์ •์ƒ ๋ฐ ์ •์ƒ ์œ„๋‚ด์‹œ๊ฒฝ ์ด๋ฏธ์ง€ ๋ฐ์ดํ„ฐ์…‹์„ ์‚ฌ์šฉํ•˜์—ฌ ๋น„์ •์ƒ๊ณผ ์ •์ƒ ํด๋ž˜์Šค ์ด๋ฏธ์ง€๋ฅผ ๋ถ„๋ฅ˜ํ•˜๋Š” CADx ์‹œ์Šคํ…œ์„ SAM ์ตœ์ ํ™” ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์ ์šฉํ•˜์—ฌ ์„ฑ๋Šฅ์„ ํ–ฅ์ƒํ•˜๋Š” ์—ฐ๊ตฌ๋ฅผ ์ง„ํ–‰ํ•˜์˜€๋‹ค. ๋ณธ ์—ฐ๊ตฌ์— ์‚ฌ์šฉ๋œ ๋ฐ์ดํ„ฐ์…‹์€ ์˜๋ฃŒ ์˜์ƒ์œผ๋กœ ์ˆ˜์ง‘์ด ์–ด๋ ค์›Œ ์ด์— ๋”ฐ๋ผ ์ž‘์€ ๋ฐ์ดํ„ฐ์…‹์ด ๊ตฌ์„ฑ๋˜์—ˆ๋‹ค. ์ด๋Š” ํ•™์Šตํ•ด์•ผ ํ•˜๋Š” ๋‹ค์–‘ํ•œ ๋ณ‘๋ณ€์˜ ํŠน์ง•์ด ๋ถ€์กฑํ•˜์—ฌ ๊ณผ์ ํ•ฉ์„ ์œ ๋ฐœํ•˜๊ณ  ์„ฑ๋Šฅ์„ ํ•˜๋ฝํ•  ์ˆ˜ ์žˆ๋‹ค. ์ด๋Ÿฌํ•œ ์ ์„ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•ด ๊ตฌ๊ธ€์—์„œ ์ œ์•ˆํ•œ AutoAugment๋ฅผ ํ™œ์šฉํ•˜์—ฌ ์›๋ณธ ๋ฐ์ดํ„ฐ์…‹์„ ์ฆ๋Œ€ํ•˜์˜€๋‹ค. ์ด๋ฅผ ํ†ตํ•ด ์ถฉ๋ถ„ํ•œ ํŒจํ„ด๊ณผ ๋ณ‘๋ณ€์˜ ํŠน์ง•์„ ํ•™์Šต์— ์ ์šฉํ•˜์˜€๋‹ค.

์ด๋ฏธ์ง€ ๋ถ„๋ฅ˜ ๋ชจ๋ธ์ธ ConvNeXt์™€ ViT์— SAM ์ตœ์ ํ™” ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์ ์šฉํ•˜์—ฌ ์›๋ณธ ๋ฐ์ดํ„ฐ์…‹๊ณผ ์ฆ๋Œ€ ๋ฐ์ดํ„ฐ์…‹์œผ๋กœ ํ•™์Šตํ•œ ๋ถ„๋ฅ˜ ๋ชจ๋ธ์˜ ์„ฑ๋Šฅ์„ ๊ฐ๊ฐ ๋น„๊ตํ•˜์˜€๋‹ค. ์›๋ณธ ๋ฐ์ดํ„ฐ์…‹์— ๋Œ€ํ•œ ConvNeXt์™€ ViT์— SAM ์ตœ์ ํ™” ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์ ์šฉํ•˜์˜€์„ ๋•Œ Original ๋ชจ๋ธ์— ๋น„ํ•ด ConvNeXt๋Š” 24.16% ๋†’์€ ์„ฑ๋Šฅ์„ ๋ณด์˜€์ง€๋งŒ ViT๋ชจ๋ธ์—์„œ๋Š” 17.5% ๋‚ฎ์€ ์„ฑ๋Šฅ์„ ๋ณด์˜€๋‹ค. ํ•˜์ง€๋งŒ ImageNet ์ฆ๋Œ€์ •์ฑ…์„ ์ ์šฉํ•œ AutoAugment ์ฆ๋Œ€ ๋ฐ์ดํ„ฐ์…‹์— ๋Œ€ํ•œ ConvNeXt์™€ ViT์— SAM ์ตœ์ ํ™” ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์ ์šฉํ•œ ๋ชจ๋ธ์˜ ๋ฏผ๊ฐ๋„๋Š” Original ๋ชจ๋ธ๊ณผ ๋น„๊ตํ•˜์—ฌ ConvNeXt๋Š” 22.5% ๋†’์€ ์„ฑ๋Šฅ์„ ๋ณด์˜€๊ณ , ViT ๋˜ํ•œ 0.83% ๋†’์€ ์„ฑ๋Šฅ์„ ๋ณด์˜€๋‹ค. ์ด๋ฅผ ํ†ตํ•ด SAM ์ตœ์ ํ™” ์•Œ๊ณ ๋ฆฌ์ฆ˜์ด CADx ์„ฑ๋Šฅ์„ ์ถฉ๋ถ„ํžˆ ํ–ฅ์ƒํ•  ์ˆ˜ ์žˆ๋‹ค๋Š” ์‚ฌ์‹ค์„ ์ž…์ฆํ•˜์˜€๋‹ค.

๋ณธ ์—ฐ๊ตฌ์—์„œ๋Š” SAM ์ตœ์ ํ™” ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์ ์šฉํ•œ CNN ๊ธฐ๋ฐ˜์˜ ๋ชจ๋ธ๊ณผ ํŠธ๋žœ์Šคํฌ๋จธ ๊ธฐ๋ฐ˜์˜ ๋ชจ๋ธ์„ ํ†ตํ•ด CADx์˜ ์„ฑ๋Šฅ์„ ํ–ฅ์ƒํ•˜๊ณ ์ž ํ•˜์˜€๋‹ค. ํ•˜์ง€๋งŒ SAM Optimizer๋ฅผ ์ ์šฉํ•จ์— ์žˆ์–ด ํ•™์Šต๋ฅ  ๋ฐ Rho ๊ฐ’๊ณผ ๊ฐ™์€ ํ•˜์ดํผํŒŒ๋ผ๋ฏธํ„ฐ๋Š” ๋ณ„๋„๋กœ ์กฐ์ •ํ•˜์ง€ ์•Š์•„ ์™„๋ฒฝํžˆ SAM ์ตœ์ ํ™” ์•Œ๊ณ ๋ฆฌ์ฆ˜์˜ ์„ฑ๋Šฅ์„ ๋Œ์–ด๋ƒˆ๋‹ค๊ณ  ๋ณด๊ธฐ์— ์–ด๋ ค์šด ์ ์ด ์กด์žฌํ•œ๋‹ค. ์ถ”ํ›„ ์—ฐ๊ตฌ์—์„œ๋Š” ์ด๋Ÿฌํ•œ ์ ์„ ๊ฐœ์„ ํ•˜๊ธฐ ์œ„ํ•˜์—ฌ ํ•˜์ดํผํŒŒ๋ผ๋ฏธํ„ฐ ๋ฏธ์„ธ์กฐ์ •์„ ํ†ตํ•ด SAM ์ตœ์ ํ™” ์•Œ๊ณ ๋ฆฌ์ฆ˜์˜ ์„ฑ๋Šฅ์„ ๋”์šฑ ํ–ฅ์ƒํ•  ์ˆ˜ ์žˆ๋Š” ์—ฐ๊ตฌ๋ฅผ ์ง„ํ–‰ํ•  ์˜ˆ์ •์ด๋‹ค. ๋˜ํ•œ, ๊ธฐ๋ณธ ์ตœ์ ํ™” ์•Œ๊ณ ๋ฆฌ์ฆ˜์œผ๋กœ SGD(Stochastic Gradient Descent)๋ฟ๋งŒ ์•„๋‹ˆ๋ผ ์ตœ์‹  ์ตœ์ ํ™” ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์ถ”๊ฐ€๋กœ ์ ์šฉํ•˜์—ฌ ๋” ๋น ๋ฅด๊ฒŒ ์†์‹ค๊ฐ’์„ ์ˆ˜๋ ดํ•  ์ˆ˜ ์žˆ๋„๋ก ์ถ”๊ฐ€ ์—ฐ๊ตฌ๋ฅผ ์ง„ํ–‰ํ•  ์˜ˆ์ •์ด๋‹ค. ๋˜ํ•œ ImageNet ๊ธฐ๋ฐ˜์˜ ์ฆ๋Œ€์ •์ฑ…์„ ์ ์šฉํ•œ AutoAugment๋Š” ๋น„์ •์ƒ ๋ฐ ์ •์ƒ ์œ„๋‚ด์‹œ๊ฒฝ ์ด๋ฏธ์ง€์— ์ ํ•ฉํ•œ ์ฆ๋Œ€์ •์ฑ…์ด๋ผ๊ณ  ๋ณด๊ธฐ์— ํž˜๋“ค๋‹ค. ์ด๋Ÿฌํ•œ ์ ์„ ๋ฐ”ํƒ•์œผ๋กœ ์ถ”ํ›„ ์—ฐ๊ตฌ์—์„œ๋Š” ๊ณผ์ ํ•ฉ์„ ๋ง‰๊ณ  ๋‹ค์–‘ํ•œ ์œ„ ๋ณ‘๋ณ€์˜ ํŒจํ„ด์„ ํ•™์Šตํ•  ์ˆ˜ ์žˆ๋Š” ์œ„๋‚ด์‹œ๊ฒฝ ์ด๋ฏธ์ง€์— ๋งž๋Š” ๋ฐ์ดํ„ฐ ์ฆ๋Œ€์ •์ฑ…์„ ๊ฐœ๋ฐœํ•˜๋Š” ์—ฐ๊ตฌ๋ฅผ ์ˆ˜ํ–‰ ์˜ˆ์ •์ด๋‹ค.

Acknowledgements

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (No. 2022R1I1A3053872), and by the "Regional Innovation Strategy (RIS)" program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (MOE) (2022RIS-005).

References

1. F. Bray, J. Ferlay, I. Soerjomataram, R. L. Siegel, L. A. Torre, and A. Jemal, 2018, Global Cancer Statistics 2018: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries, Int. J. Cancer, Vol. 144, pp. 1941-1953.
2. Ministry of Health and Welfare, 2023, Annual Report of the National Cancer Registration Program 2020, Ministry of Health and Welfare.
3. Korea National Cancer Center, 2021, Cancer Trend Report through Data.
4. Korean Academy of Medical Sciences, 2021, Annual Report of Medical Subspecialty in Korea 2021.
5. O. Attallah and M. Sharkas, 2021, GASTRO-CADx: A Three Stages Framework for Diagnosing Gastrointestinal Diseases, PeerJ Computer Science, Vol. 7, e423.
6. F. Mohammad and M. Al-Razgan, 2022, Deep Feature Fusion and Optimization-Based Approach for Stomach Disease Classification, Sensors, Vol. 22, No. 7, 2801.
7. H. Ueyama, 2021, Application of Artificial Intelligence with a Convolutional Neural Network for Early Gastric Cancer Diagnosis Based on Magnifying Endoscopy with Narrow-Band Imaging, Journal of Gastroenterology and Hepatology, Vol. 36, No. 2, pp. 482-489.
8. H. Okamoto, Q. Cap, T. Nomura, H. Iyatomi, and J. Hashimoto, 2019, Stochastic Gastric Image Augmentation for Cancer Detection from X-ray Images, Proceedings of the 2019 IEEE International Conference on Big Data, pp. 4858-4863.
9. A. Teramoto, T. Shibata, H. Yamada, Y. Hirooka, K. Saito, and H. Fujita, 2022, Detection and Characterization of Gastric Cancer Using a Cascade Deep Learning Model in Endoscopic Images, Diagnostics, Vol. 12, No. 8, 1996.
10. Y. Sakai, S. Takemoto, K. Hori, M. Nishimura, H. Ikematsu, T. Yano, and H. Yokota, 2018, Automatic Detection of Early Gastric Cancer in Endoscopic Images Using a Transferring Convolutional Neural Network, Proceedings of the 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 4138-4141.
11. M. Kang, S. Kang, and K. Oh, 2020, Verification of the Effect of Data Augmentation and Transfer Learning on the Performance Improvement of CNN-Based Gastroscope Classification/Segmentation, Proceedings of the Korea Information Science Society Conference, pp. 593-595.
12. E. D. Cubuk, B. Zoph, D. Mane, V. Vasudevan, and Q. V. Le, 2019, AutoAugment: Learning Augmentation Strategies from Data, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 113-123.
13. A. Krizhevsky, 2009, Learning Multiple Layers of Features from Tiny Images, Technical Report.
14. J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, 2009, ImageNet: A Large-Scale Hierarchical Image Database, IEEE Conference on Computer Vision and Pattern Recognition, pp. 248-255.
15. Y. Netzer, T. Wang, A. Coates, A. Bissacco, B. Wu, and A. Y. Ng, 2011, Reading Digits in Natural Images with Unsupervised Feature Learning, Neural Information Processing Systems (NIPS).
16. A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby, 2020, An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, arXiv preprint arXiv:2010.11929.
17. Z. Liu, H. Mao, C.-Y. Wu, C. Feichtenhofer, T. Darrell, and S. Xie, 2022, A ConvNet for the 2020s, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11976-11986.
18. S. Xie, R. Girshick, P. Dollár, Z. Tu, and K. He, 2017, Aggregated Residual Transformations for Deep Neural Networks, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1492-1500.
19. P. Foret, A. Kleiner, H. Mobahi, and B. Neyshabur, 2020, Sharpness-Aware Minimization for Efficiently Improving Generalization, arXiv preprint arXiv:2010.01412.

์ €์ž์†Œ๊ฐœ

๋ฐ•์žฌ๋ฒ” (Jae-beom Park)
../../Resources/kiee/KIEE.2023.72.11.1399/au1.png

Jae-beom Park is currently working toward the B.S. and M.S. degrees in the Interdisciplinary Graduate Program for BIT Medical Convergence at Kangwon National University, South Korea.

๊น€๋ฏผ์ค€ (Min-jun Kim)
../../Resources/kiee/KIEE.2023.72.11.1399/au2.png

Min-jun Kim is currently working toward the B.S. degree in Electrical and Electronic Engineering at Kangwon National University, South Korea.

์›ํ˜•์‹(Hyeong-sik Won)
../../Resources/kiee/KIEE.2023.72.11.1399/au3.png

Hyeong-sik Won is currently working toward the B.S. degree in Electrical and Electronic Engineering at Kangwon National University, South Korea.

์กฐํ˜„์ง„(Hyun Chin Cho)
../../Resources/kiee/KIEE.2023.72.11.1399/au4.png

She received the M.S. and Ph.D. degrees in Internal Medicine from Gyeongsang National University School of Medicine, Jinju, South Korea, in 2008 and 2014, respectively. From 2009 to 2010, she was a Fellow at Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea.

During 2011-2015, she was a professor at Samsung Changwon Hospital, Sungkyunkwan University School of Medicine, Changwon, South Korea.

She is currently a professor at Gyeongsang National University School of Medicine and Gyeongsang National University Hospital, Jinju, Korea.

์กฐํ˜„์ข… (Hyun-chong Cho)
../../Resources/kiee/KIEE.2023.72.11.1399/au5.png

Hyun-chong Cho received his M.S. and Ph.D. degrees in electrical and computer engineering from the University of Florida, USA, in 2009.

During 2010โ€“2011, he was a Research Fellow at the University of Michigan, Ann Arbor, USA.

From 2012 to 2013, he was a Chief Research Engineer at LG Electronics, South Korea.

He is currently a Professor with the Department of Electronics Engineering and the Interdisciplinary Graduate Program for BIT Medical Convergence, Kangwon National University, South Korea.