Oz T. Jang
He is a researcher in artificial intelligence and an amateur researcher in mathematics. He works on reasoning, AI safety, and multimodal learning. He is also an incoming visiting faculty member at the National University of Singapore, and was previously a visiting researcher at Carnegie Mellon University.
He always supports Slow Science.
Email: echo b3p0LmpAaWNsb3VkLmNvbQ== | base64 -d
GitHub /
Scholar
If you would like to join his group in any other capacity, please fill out this form and then send him a short email note without any attachments.
AutoRL-LRM: Automated RL Algorithm Optimization for Large Reasoning Models
Oz T. Jang
[paper]
(Under review)
[code]
An autonomous research agent that discovers, implements, and evaluates RL algorithms to improve mathematical reasoning, guided by metrics.
2025
VRHF: RLVR from Human Feedback
Oz T. Jang
[paper]
(In preparation)
Proposes combining RLHF with RLVR to improve large reasoning models.
2025
Double Zero: Self-Evolving Reasoning with Zero Data, Zero RL
Oz T. Jang
[paper]
(In preparation)
Proposes Double Zero: self-evolving large reasoning models trained without data and without SFT.
2025
SFT-hybrid-RFT Large Reasoning Model
Oz T. Jang
[paper]
(In preparation)
Proposes an SFT-hybrid-RFT large reasoning model.
2025
LookaheadBio: Bilevel Optimization with Lookahead Moving Averages for LLM Data Reweighting
Ruijie Xie, Chaoyue Zhao, Oz T. Jang
[paper]
(Under review)
The first bilevel optimization algorithm to incorporate Lookahead's slow-fast weight mechanism for both model parameters and data weights, achieving comprehensive variance reduction.
2025
Lookahead-LSTM Optimizer: A Meta-Learning K-steps Method
Oz T. Jang, Teng Yang, Xiaozhu Hu, Zi Yang, Chifong Wong
[paper]
[code]
Proposes Lookahead-LSTM, a meta-learning optimization method that improves generalization and data transferability.
2020 Spring
Pro bono office hour
Starting January 2025, he commits 1–2 hours every week to providing guidance, suggestions, and/or mentorship for students from underrepresented groups or anyone in need.
Please fill in this form if you are interested.