Sanghyuk Chun

  • Research Scientist
  • sanghyuk.chun [at]
  • Scholar | Github | Twitter | CV (as of Jan 15, 2021)

I'm a research scientist at NAVER AI LAB and NAVER CLOVA, working on machine learning and its applications. Before joining NAVER, I worked as a research engineer on the Advanced Recommendation Technology (ART) team at Kakao from 2016 to 2018.

I received a master's degree in Electrical Engineering from Korea Advanced Institute of Science and Technology (KAIST) in 2016. During my master's study, I researched a scalable algorithm for robust subspace clustering (based on robust PCA and k-means clustering). Before my master's study, I worked at IUM-SOCIUS as a software engineering intern in 2012. I also did research internships at the Networked and Distributed Computing Systems Lab at KAIST and at NAVER Labs, during summer 2013 and fall 2015, respectively.

NAVER AI LAB is looking for motivated research interns and full-time research scientists (topics: real-world biases, uncertainty estimation, robustness, causality, explainability, large-scale learning, self-supervised learning, multi-modal learning). If you are interested in joining our group, please send me an email with your academic CV and desired topics.

Research Interests

Reliable machine learning with limited supervision. Real-world machine learning models often suffer from reliability issues: (1) a lack of generalizability to unseen biases or corruptions, (2) improper uncertainty estimation, and (3) decisions that are not explainable to humans. Achieving reliable machine decisions would require a large number of annotations covering every possible situation, e.g., traffic signs under every possible weather condition, which is highly impractical in most cases. Instead of collecting or generating all possible situations, my research focuses on developing reliable machine learning models with only limited human supervision. In particular, I am interested in the following types of supervision: (1) human inductive bias without additional labeling, (2) extra multi-modal information related to the original task, and (3) weak supervision or semi-supervision, which requires only a reasonable number of additional annotations.


  • _1/2021 : 1 paper [AdamP] is accepted at ICLR 2021.
  • 12/2020 : 1 paper [LF-Font] is accepted at AAAI 2021.
  • _7/2020 : 1 paper [DM-Font] is accepted at ECCV 2020.
  • _6/2020 : Receiving the best paper runner-up award at AICCW CVPR 2020.
  • _6/2020 : Receiving an outstanding reviewer award at CVPR 2020.
  • _6/2020 : Giving a talk at CVPR 2020 NAVER interactive session.
  • _6/2020 : 1 paper [ReBias] is accepted at ICML 2020.
  • _4/2020 : 1 paper [DM-Font short] is accepted at CVPR 2020 workshop.
  • _2/2020 : 1 paper [wsoleval] is accepted at CVPR 2020.
  • _1/2020 : 1 paper [HCNN] is accepted at ICASSP 2020.
  • 10/2019 : 1 paper [HCNN short] is accepted at ISMIR late break demo.
  • 10/2019 : Working at Naver Labs Europe as a visiting researcher (Oct - Dec 2019)
  • _7/2019 : 2 papers [CutMix] [WCT2] are accepted at ICCV 2019 (1 oral presentation).
  • _6/2019 : Giving a talk at ICML 2019 Expo workshop.
  • _5/2019 : 2 papers [MTSA] [RegEval] are accepted at ICML 2019 workshops (1 oral presentation).
  • _5/2019 : Giving a talk at ICLR 2019 Expo talk.
  • _3/2019 : 1 paper [PRM] is accepted at ICLR 2019 workshop.


(C: peer-reviewed conference, W: peer-reviewed workshop, A: arxiv preprint, O: others)
(authors contributed equally)

See also at my Google Scholar.

  • Probabilistic Embeddings for Cross-Modal Retrieval.
    • Sanghyuk Chun, Seong Joon Oh, Rafael Sampaio de Rezende, Yannis Kalantidis, Diane Larlus
    • preprint. paper | bibtex
  • Re-labeling ImageNet: from Single to Multi-Labels, from Global to Localized Labels.
    • Sangdoo Yun, Seong Joon Oh, Byeongho Heo, Dongyoon Han, Junsuk Choe, Sanghyuk Chun
    • preprint. paper | code | bibtex
  • AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights.
    • Byeongho Heo, Sanghyuk Chun, Seong Joon Oh, Dongyoon Han, Sangdoo Yun, Gyuwan Kim, Youngjung Uh, Jung-Woo Ha
    • ICLR 2021. paper | code | project page | pypi | bibtex
  • Few-shot Font Generation with Localized Style Representations and Factorization.
    • Song Park, Sanghyuk Chun, Junbum Cha, Bado Lee, Hyunjung Shim
    • AAAI 2021. paper | code | bibtex
  • Evaluation for Weakly Supervised Object Localization: Protocol, Metrics, and Datasets.
  • Few-shot Compositional Font Generation with Dual Memory.
    • Junbum Cha, Sanghyuk Chun, Gayoung Lee, Bado Lee, Seonghyeon Kim, Hwalsuk Lee
    • ECCV 2020. paper | code | video | bibtex
  • Learning De-biased Representations with Biased Representations.
  • Toward High-quality Few-shot Font Generation with Dual Memory. Oral presentation, Best paper runner-up award
    • Junbum Cha, Sanghyuk Chun, Gayoung Lee, Bado Lee, Seonghyeon Kim, Hwalsuk Lee
    • CVPR Workshop 2020. paper | bibtex
  • Evaluating Weakly Supervised Object Localization Methods Right.
  • Data-driven Harmonic Filters for Audio Representation Learning.
  • Neural Approximation of Auto-Regressive Process through Confidence Guided Sampling.
    • YoungJoon Yoo, Sanghyuk Chun, Sangdoo Yun, Jung-Woo Ha, Jaejun Yoo
    • preprint. paper | bibtex
  • Toward Interpretable Music Tagging with Self-attention.
  • CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features. Oral presentation
  • Photorealistic Style Transfer via Wavelet Transforms.
  • Automatic Music Tagging with Harmonic CNN.
    • Minz Won, Sanghyuk Chun, Oriol Nieto, Xavier Serra
    • ISMIR LBD 2019. paper | code | bibtex
  • An Empirical Evaluation on Robustness and Uncertainty of Regularization methods.
    • Sanghyuk Chun, Seong Joon Oh, Sangdoo Yun, Dongyoon Han, Junsuk Choe, Youngjoon Yoo
    • ICML Workshop 2019. paper | bibtex
  • Visualizing and Understanding Self-attention based Music Tagging. Oral presentation
  • Where To Be Adversarial Perturbations Added? Investigating and Manipulating Pixel Robustness Using Input Gradients.
    • Jisung Hwang, Younghoon Kim, Sanghyuk Chun, Jaejun Yoo, Ji-Hoon Kim, Dongyoon Han
    • ICLR Workshop 2019. paper | bibtex
~ 2018
  • Multi-Domain Processing via Hybrid Denoising Networks for Speech Enhancement.
  • A Study on Intelligent Personalized Push Notification with User History.
    • Hyunjong Lee, Youngin Jo, Sanghyuk Chun, Kwangseob Kim
    • Big Data 2017. paper | bibtex
  • Scalable Iterative Algorithm for Robust Subspace Clustering: Convergence and Initialization.
    • Master's Thesis, Korea Advanced Institute of Science and Technology, 2016 (advised by Jinwoo Shin) paper | code

Academic Activities

Professional Service
  • Reviewer: CVPR 2020 (outstanding reviewer), NeurIPS 2020, ACCV 2020, WACV 2021, AAAI 2021, ICLR 2021, CVPR 2021, ICML 2021, ICCV 2021.
  • Outstanding reviewer award, CVPR 2020
  • Best paper runner-up award, AI for Content Creation Workshop at CVPR 2020

Industry Experience

NAVER AI Research (2018 ~ Now)
  • Hangul
    DM-Font teaser
    Hangul Handwriting Font Generation

    Distributed at 2019 Hangul's day (한글날), [Full font list]

    • Hangul (Korean alphabet, 한글) originally consists of only 24 sub-letters (ㄱ, ㅋ, ㄴ, ㄷ, ㅌ, ㅁ, ㅂ, ㅍ, ㄹ, ㅅ, ㅈ, ㅊ, ㅇ, ㅎ, ㅡ, ㅣ, ㅗ, ㅏ, ㅜ, ㅓ, ㅛ, ㅑ, ㅠ, ㅕ), but by combining them, there exist 11,172 valid characters in Hangul. For example, "한" is a combination of ㅎ, ㅏ, and ㄴ, and "쐰" is a combination of ㅅ, ㅅ, ㅗ, ㅣ, and ㄴ. This makes generating a new Hangul font very expensive and time-consuming. Meanwhile, since 2008, Naver has distributed Korean fonts for free (named Nanum fonts, 나눔 글꼴).
    • In 2019, we developed a technology for fully-personalized Hangul generation with only 152 characters. We opened an event page where users can submit their own handwriting. The full generated font list can be found at [this link]. Details of the generation technique used for the service were presented at Deview 2019 [Link].
    • This work was also extended to few-shot generation based on compositionality. See the papers at the AI for Content Creation Workshop (AICCW) at CVPR 2020 (short paper) [Link], ECCV 2020 (full paper) [Link], and AAAI 2021 [Link].
    • [BONUS] You can play with my handwriting here
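The 11,172-character count above follows from how Unicode lays out Hangul syllables algorithmically: 19 initial consonants × 21 medial vowels × 28 finals (including "no final"), arranged contiguously from U+AC00. A minimal sketch of this composition rule (a Unicode illustration, not part of the font-generation system itself):

```python
# Unicode Hangul Syllables block: syllables are arranged algorithmically,
# so composition is plain arithmetic on jamo indices.
INITIALS = 19   # initial consonants (ㄱ ... ㅎ)
MEDIALS = 21    # medial vowels (ㅏ ... ㅣ)
FINALS = 28     # final consonants; index 0 means "no final"
BASE = 0xAC00   # code point of the first syllable, "가"

def compose(initial: int, medial: int, final: int = 0) -> str:
    """Compose a Hangul syllable character from jamo indices."""
    return chr(BASE + (initial * MEDIALS + medial) * FINALS + final)

# "한" = initial ㅎ (index 18) + medial ㅏ (index 0) + final ㄴ (index 4)
print(compose(18, 0, 4))            # → 한
print(INITIALS * MEDIALS * FINALS)  # → 11172
```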
  • example sticker
    Example emoji from LINE sticker shop.
    Emoji Recommendation (LINE Timeline)

    Deployed in Jan. 2019

    • LINE is a major messenger in East Asia (Japan, Taiwan, Thailand, Indonesia, and Korea). In the application, users can buy and use numerous emojis, a.k.a. LINE Stickers.
    • In this project, we recommended emojis to users based on their profile picture (cross-domain recommendation).
    • I researched and developed the entire pipeline of the cross-domain recommendation system, as well as its operation tools.
Kakao Advanced Recommendation Technology (ART) team (2016 ~ 2018)
  • Kakao
    Recommender Systems (Kakao services)

    Feb. 2016 - Feb. 2018

    • I developed and maintained a large-scale real-time recommender system (Toros [PyCon Talk] [AI Report]) for various services in Daum and Kakao. I mainly worked on content-based representation modeling (for textual, visual, and musical data), collaborative filtering, user embedding, user clustering, and ranking systems based on multi-armed bandits.
    • Textual domain: Daum News similar article recommendation, Brunch (blog service) similar post recommendation, Daum Cafe (community service) hit item recommendation.
    • Visual domain: Daum Webtoon and Kakao Page similar item recommendation, video recommendation for a news article (cross-domain recommendation).
    • Audio domain: music recommendation for Kakao Mini (smart speaker), Melon and Kakao Music.
    • Online to offline: Kakao Hairshop style recommendation.
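The bandit-based ranking mentioned above can be illustrated with a minimal epsilon-greedy sketch. This is illustrative only; the function name and numbers are assumptions, not the production Toros system:

```python
import random

def epsilon_greedy_rank(clicks, views, epsilon=0.1, rng=random):
    """Pick an item index: explore a random item with probability epsilon,
    otherwise exploit the item with the highest empirical click-through rate."""
    n = len(clicks)
    if rng.random() < epsilon:
        return rng.randrange(n)  # explore: uniformly random item
    ctr = [c / v if v else 0.0 for c, v in zip(clicks, views)]
    return max(range(n), key=ctr.__getitem__)  # exploit: best CTR so far

# With exploration disabled, the item with the best CTR (30/100) is chosen.
print(epsilon_greedy_rank([30, 5], [100, 100], epsilon=0.0))  # → 0
```

In a real ranking system the exploration step balances showing proven items against gathering feedback on new ones; production systems typically use more sample-efficient variants such as Thompson sampling or UCB.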
  • IPPN
    System overview.
    Personalized Push Notification with User History (Daum, Kakao Page)

    Deployed in 2017

    • Mobile push notifications (or alerts) are widely used in mobile applications to attain a high user retention rate. However, frequent push notifications make users feel fatigued, often resulting in application removal. Usually, a push notification system is rule-based and managed by human labor. In this project, we researched and developed a personalized push notification system based on user activity and interests. The system has been applied to the Daum and Kakao Page mobile applications. More details are in our paper.
  • Daum Shopping
    Large-Scale Item Categorization in e-Commerce (Daum Shopping)

    Deployed in 2017

    • Accurate categorization helps users search for desired items in e-commerce by category, e.g., clothes / shoes / sneakers. However, categorization is usually performed by rule-based systems or human labor, which leads to low coverage of categorized items. Even automatic item categorization is difficult due to the web-scale data size, the highly unbalanced label distribution, and noisy labels. I developed a large-scale item categorization system for Daum Shopping based on a deep network, from the operation tool to the categorization API.
  • Naver Labs
    Research internship (Naver Labs)

    Aug. 2015 - Dec. 2015

    • During the internship, I added batch normalization (BN) to AlexNet, Inception v2, and VGG on ImageNet using Caffe. I also researched batch normalization for sequential models, e.g., RNNs, using Lua Torch.
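For reference, the batch-normalization forward pass mentioned above can be sketched in a few lines of NumPy. This is the textbook formulation, not the Caffe implementation:

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then scale and shift.
    x: (batch, features); gamma, beta: learnable (features,) parameters."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)  # zero mean, unit variance per feature
    return gamma * x_hat + beta

x = np.array([[1.0, 2.0], [3.0, 4.0]])
out = batch_norm_forward(x, gamma=np.ones(2), beta=np.zeros(2))
print(out.mean(axis=0))  # each feature is normalized to (near) zero mean
```

Extending this to RNNs, the research question at the time, is non-trivial because the batch statistics must be handled per time step or shared across the sequence.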
    Software engineer (IUM-SOCIUS)

    Jun. 2012 - Jan. 2013

    • I worked as a web developer at IUM-SOCIUS. During the internship, I developed and maintained internal batch services (Java Spring Batch), an internal statistics service (Python Flask, MongoDB), internal admin tools (Python Django, MySQL), and the main service systems (Java Spring, Ruby on Rails, MariaDB).

Education and Career

  • M.S. (2014.03 - 2016.02), School of Electrical Engineering, KAIST
  • B.S. (2009.03 - 2014.02), School of Electrical Engineering and School of Management Science (double major), KAIST