Shenghao Wu

[Update Dec 2023] I recently joined TikTok (ByteDance) as a research scientist working on AI for Science.

I graduated with a joint Ph.D. between the Neuroscience Institute 🧠 and the Machine Learning Department at CMU. I was fortunate to be advised by Prof. Byron Yu, Prof. Matthew Smith, and Prof. Brent Doiron. My research develops machine learning algorithms to automate and accelerate scientific discovery and decision-making. I have also worked with Prof. Leila Wehbe and Prof. Aaditya Ramdas on individual identification and privacy in brain recordings, and with Prof. Woody Zhu on generative models and causal inference.

Before coming to CMU, I obtained a bachelor’s degree in mathematics and statistics from Columbia University, where I worked with Prof. Liam Paninski on a machine learning pipeline for neural signal processing. I also spent two wonderful years in Hong Kong studying computing mathematics at City University of Hong Kong, before enrolling in the joint bachelor’s degree program between CityU and Columbia.

Education

Ph.D. in Neural Computation and Machine Learning, Carnegie Mellon University, 2018-2023

Thesis committee: Byron Yu, Matthew Smith, Brent Doiron, Chengcheng Huang, Robert Kass, Tatiana Engel

B.A. in Mathematics-Statistics, Columbia University, 2015-2018

Summa Cum Laude, Phi Beta Kappa, Statistics Department Honor

B.S. in Computing Mathematics, City University of Hong Kong, 2013-2015
Mainland Student Full Tuition Scholarship, First Class Honours

Work Experience

Research Scientist, Machine Learning for Science, TikTok (ByteDance), 2023-Present
Deep learning for computational biology.

Research Scientist Intern, Applied Machine Learning Group, TikTok (ByteDance), 2022
Developed a reinforcement learning framework based on graph neural networks to accelerate molecular simulation.

Research Interests

My research interests lie at the intersection of machine learning and science. I work closely with experimentalists and develop ML algorithms (e.g., generative models, Bayesian optimization, reinforcement learning, causal inference) with applications in neuroscience (e.g., brain circuit modeling), health care (e.g., generative decision making), and scientific computing (e.g., molecular graph partitioning).

Preprints

📄 Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas. Paper
Chengyuan Deng, Yiqun Duan, Xin Jin, Heng Chang, Yijun Tian, Han Liu, Henry Peng Zou, Yiqiao Jin, Yijia Xiao, Yichen Wang, Shenghao Wu, Zongxing Xie, Kuofeng Gao, Sihong He, Jun Zhuang, Lu Cheng, Haohan Wang
Keywords: large language models, differential privacy, security.

📄 YASS: Yet Another Spike Sorter applied to large-scale multi-electrode array recordings in primate retina. Paper Code
JinHyung Lee, Catalin Mitelut, Hooshmand Shokri, Ian Kinsella, Nishchal Dethe, Shenghao Wu, Kevin Li, Eduardo Blancas Reyes, Denis Turcu, Eleanor Batty, Young Joon Kim, Nora Brackbill, Alexandra Kling, Georges Goetz, E.J. Chichilnisky, David Carlson, Liam Paninski
Keywords: signal processing, convolutional neural networks, generative models, spike sorting.

Journal Publications

📄 Automated customization of large-scale spiking network models to neuronal population activity. (Nature Computational Science) Paper Code
Shenghao Wu, Chengcheng Huang, Adam Snyder, Matthew Smith, Brent Doiron, Byron Yu
Keywords: Bayesian optimization, dimensionality reduction, spatiotemporal data, spiking neural networks, brain-computer interface.

📄 Brainprints: identifying individuals from magnetoencephalograms. (Nature Communications Biology) Paper Code
Shenghao Wu, Aaditya Ramdas, Leila Wehbe
Keywords: neuroimaging, data privacy, feature engineering, multi-modal recordings.

Conference Publications and Presentations

📄 Counterfactual Generative Models for Time-Varying Treatments. (KDD 2024, talk; NeurIPS 2023 Deep Generative Models for Health Workshop, spotlight) Paper Code
Shenghao Wu, Wenbin Zhou, Minshuo Chen, Shixiang Zhu
Keywords: diffusion models, causal inference, longitudinal data, counterfactual prediction.

📄 Interpreting area-to-area differences in spiking variability using spiking network models. (Society for Neuroscience, 2023)
Shenghao Wu, Adam Snyder, Chengcheng Huang, Matthew Smith, Byron Yu, Brent Doiron
Keywords: deep neural emulators, mechanistic models, spiking neural networks.

📄 RLCG: When Reinforcement Learning Meets Coarse Graining. (NeurIPS 2022 AI4Science Workshop) Paper
Shenghao Wu, Tianyi Liu, Zhirui Wang, Wen Yan, Yingxiang Yang
Keywords: graph neural networks, reinforcement learning, molecular dynamics.

📄 Automatic fitting of spiking network models to neuronal activity reveals limits of model flexibility. (Computational and Systems Neuroscience, 2020) Abstract
Shenghao Wu, Chengcheng Huang, Adam Snyder, Matthew Smith, Brent Doiron, Byron Yu

📄 Neural networks for sorting neurons. (Computational and Systems Neuroscience, 2020) Abstract
JinHyung Lee, Catalin Mitelut, Ian Kinsella, Shenghao Wu, Eleanor Batty, Liam Paninski, Hooshmand Shokri, Ari Pakman, Yueqi Wang, Nishchal Dethe, Kevin Li, Eduardo Blancas Reyes, Alexandra Tikidji-Hamburyan, Georges Goetz, E.J. Chichilnisky, David Carlson

📄 Riskalyzer: Inferring Individual Risk-Taking Propensity Using Phone Metadata. (ACM UbiComp 2018) Paper
Vivek Singh, Rushil Goyal, Shenghao Wu

Services

PC Member: NeurIPS (2024), KDD (2024, 2023), ICML/NeurIPS AI4Science Workshop (2024, 2023), NeurIPS Generative AI & Biology Workshop (2023), Brain Informatics (2023), ACAIN (2021, 2022, 2023).

Journal Reviewer: Nature Communications Biology, Neurocomputing, Information Fusion.