I'm a lead research scientist at NAVER AI Lab, working on machine learning and its applications. My research aims to expand machine knowledge with insufficient human supervision.
Machine knowledge: Existing machine learning models cannot understand the problem itself [Shortcut learning tutorial]. This causes many real-world problems, such as discrimination by machines and poor generalizability to unseen (or minor) corruptions, environments, or groups. Current state-of-the-art machines only "predict" rather than perform "logical thinking based on logical reasoning". Because models prefer to learn shortcuts [WCST-ML], training them as usual will lead to biased models. If it is difficult to make machines understand the problem itself, what can we do?
Expanding machine knowledge: We therefore need to build machines with a causal understanding of the problem. Our models should not learn undesirable shortcut features [ReBias] [StyleAugment], and should be robust to unseen corruptions [CutMix] [RegEval] [ReLabel] [PiT] and significant distribution shifts [SWAD] [MIRO]. We also need machines that do not discriminate against certain demographic groups [CGL] [FairDRO]. We expect a model to say "I don't know" when it receives unexpected inputs [PCME]. At the very least, we expect a model to explain why it made a given decision [MTSA] [MTSA WS] [WSOL eval] [WSOL Eval journal], and how it can be fixed (e.g., more data collection? more annotations? filtering?). My research focuses on expanding machine knowledge from "just prediction" to "logical reasoning". Unfortunately, in many cases, the existing evaluation protocols or metrics are not reliable enough to measure whether machines have learned proper knowledge. I have also worked on fair evaluation benchmarks and metrics to mitigate this issue [ECCV Caption] [PCME] [WSOL eval] [WSOL Eval journal] [RegEval].
Why "insufficient human supervision"? Maybe we can make such models with large-scale datasets if we have explicit human annotations for every possible situation. Furthermore, data collection itself is even non-trivial in many scenarios. As I have witnessed the power of large-scale data points and models in NAVER [CutMix] [AdamP] [ReLabel] [PiT] [ImageNet PPF WS] [ViDT], my assumption is that learning with tremendously many data points (crawled from web) would mimic many possible situations. However, human annotations are too expensive and infeasible in many practical scenarios. We need other approaches rather than the fully supervised approach. My recent research aims to build reliable machine learning models with limited number of additional information (e.g., bias labels) but more data [ReLabel] [CGL]. In particular, I have focused on learning with vision-language datasets [PCME] [ECCV Caption] [CompoDiff].
I am looking for motivated research interns interested in the following topics:
If you are interested in joining our group or collaborating with me, please send an email to me (or naverai at navercorp.com) with your academic CV and desired topics. Please check our full publication list and our job description. Please be aware that we offer a 6-month internship (no extension is possible due to legal regulations). That is, we expect interns to finish their research project within 6 months (i.e., submitting a full paper to a top-tier conference, releasing their code officially, ...). We therefore expect strong publication records (e.g., 1+ research papers relevant to the desired topic) from interns. Officially, we can work from either home or the office (located in Seoul, Korea [Google map]). However, if you do not have Korean citizenship, it is, as of now (July 2022), almost impossible to work in Korea. Lastly, our hiring process is notoriously slow, usually taking more than 2 months, so please contact us well in advance.
(C: peer-reviewed conference, W: peer-reviewed workshop, A: arxiv preprint, O: others)
(❋authors contributed equally)
See also my Google Scholar.
Topics: Reliable ML, Learning with limited annotations, Modality-specific tasks, Generative models, Other topics
Distributed on Hangul Day 2019 (한글날), [Full font list]
Deployed in Jan. 2019
Feb. 2016 - Feb. 2018
Deployed in 2017
Deployed in 2017
Aug. 2015 - Dec. 2015
Jun. 2012 - Jan. 2013