Welcome, my name is Zexi Huang. I am a Machine Learning Scientist at TikTok Recommendation. I obtained my PhD in Computer Science from the Department of Computer Science, University of California, Santa Barbara (UCSB) in April 2023. Prior to that, I received my Bachelor's in Computer Science and Technology with the highest honor from Yingcai Honors College, University of Electronic Science and Technology of China (UESTC), in 2018.
My research focuses on machine learning and data mining on information-rich data. At TikTok, I work on developing state-of-the-art machine learning solutions for the billion-scale recommendation system of TikTok Live. My PhD dissertation at the Dynamic Networks: Analysis and Modeling Lab, UCSB, is on representation learning for information-rich graphs, with Prof. Ambuj Singh as my advisor. Previously, I had multiple applied science internships at Books Tech, Amazon, where I leveraged graph-based machine learning techniques to solve large-scale industry problems including fraud detection, inventory management, and content discovery. As an undergraduate, I also interned at the Computational Intelligence Lab, Nanyang Technological University, working with Prof. Sinno Jialin Pan on a transfer learning framework for community detection in multiplex networks.
Feb. 2024
May 2023
Apr. 2023
Sep. 2018 - Apr. 2023
- Course GPA: 4.0/4.0. UCSB Computer Science Outstanding Scholar Fellow (top 4 out of 63 PhD students).
- Advisor: Prof. Ambuj Singh. Dissertation: Learning Representations for Information-rich Graphs.
Sep. 2014 - Jun. 2018
- GPA: 3.96/4.0, Avg. Score: 92.79/100 (1/90 in freshman year, 1/87 in sophomore year, 1/93 in junior year).
- Advisors: Prof. Junming Shao and Prof. Sinno Jialin Pan. Thesis: Transfer Learning for Community Detection in Multiplex Networks.
Feb. 2016 - Jun. 2016
- GPA: 4.0/4.0, Avg. Score: 96.44/100, Straight A+.
May 2023 - Present
- Owned the iterations of the core ranking models of the recommendation system for TikTok Live, leveraging representation learning, multi-task learning, knowledge distillation, and sequence modeling, with +2% user watch-live duration gains from online A/B experiments.
- Developed the host go-live model that captures the relationship between watch-live and go-live with causal inference and uplift modeling to motivate authorized hosts to go live, achieving +2% user go-live penetration in online A/B experiments.
Jun. 2022 - Sep. 2022
- Developed a stochastic inventory management model based on dynamic programming for Amazon’s Print-On-Demand business, optimizing the ordering strategy for 42.4M units of books and realizing an annual saving of $10.4M.
- Designed and implemented a graph-based NLP model for long text embedding and classification using graph neural networks, outperforming state-of-the-art transformer-based models on Kindle book contents.
Jun. 2021 - Sep. 2021
- Designed graph-based machine learning models for fraud detection based on multi-modal signals in Kindle Direct Publishing.
- Implemented the models in an end-to-end fashion and deployed them to graphs with millions of nodes and billions of edges in production.
- Validation results show that the models can surface fraud rings undetected by existing processes with an estimated annual value of $2.4M.
Jun. 2020 - Sep. 2020
- Proposed to augment existing fraud detection methods with graph-based machine learning models for Kindle Direct Publishing.
- Designed and implemented various heuristics and an embedding framework for attributed heterogeneous multiplex networks.
- The models are deployed in production, and results show that they surface up to 15 times more fraud than the existing processes.
Sep. 2017 - Feb. 2018
- Proposed to refine community detection results in some layers with transferred knowledge from other layers in multiplex networks.
- Designed a representation-based community detection framework and implemented it with an extended symmetric NMF approach.
- Our algorithm outperforms other representation-based community detection algorithms, especially when the target layer is noisy.
[AAAI-2024] Aritra Bhowmick, Mert Kosan, Zexi Huang, Ambuj Singh, Sourav Medya. DGCLUSTER: A Neural Framework for Attributed Graph Clustering via Modularity Maximization. AAAI Conference on Artificial Intelligence, 2024. [Code]
[WSDM-2023] Zexi Huang*, Mert Kosan*, Sourav Medya, Sayan Ranu, Ambuj Singh. Global Counterfactual Explainer for Graph Neural Networks. ACM International Conference on Web Search and Data Mining, 2023. (*: equal contribution) [Code] [Slides] [Talk]
[WSDM-2022] Zexi Huang, Arlei Silva, Ambuj Singh. POLE: Polarized Embedding for Signed Networks. ACM International Conference on Web Search and Data Mining, 2022. [Code] [Poster] [Slides] [Talk]
[KDD-2021] Zexi Huang, Arlei Silva, Ambuj Singh. A Broader Picture of Random-walk Based Graph Embedding. ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 2021. [Code] [Poster] [Slides] [Talk] [Talk (in Chinese)]
[Preprint] Zexi Huang, Mert Kosan, Arlei Silva, Ambuj Singh. Link Prediction without Graph Neural Networks. arXiv preprint arXiv:2305.13656, 2023. [Code]
[Preprint] Wei Ye, Zexi Huang, Yunqi Hong, Ambuj Singh. Graph Neural Diffusion Networks for Semi-supervised Learning. arXiv preprint arXiv:2201.09698, 2022. [Code]
Apr. 2021 - Apr. 2023
- Developed a stability optimization algorithm for multiscale community detection based on the pointwise mutual information similarity.
- Preliminary experiments show that the proposed algorithm can uncover the natural scales for different communities in the graph.
May 2021 - Feb. 2023
- Scrutinized the training and evaluation of link prediction methods and identified their limitations in handling class imbalance.
- Proposed a novel topology-centric framework that combines graph learning, topological heuristics, and an N-pair loss for link prediction.
- Results showed that the proposed method is 145% more accurate, trains 11 times faster, and infers 6,000 times faster than the state-of-the-art methods.
Oct. 2021 - Aug. 2022
- Formulated the novel problem of global counterfactual reasoning/explanation of graph neural networks for graph classification.
- Proposed GCFExplainer, the first global explainer powered by vertex-reinforced random walks on an edit map with a greedy summary.
- Results showed that GCFExplainer not only provides crucial high-level insights but also outperforms existing methods in recourse quality.
Oct. 2020 - Aug. 2021
- Designed a novel polarization measure for signed graphs and showed that existing methods fail in polarized signed link prediction.
- Proposed a polarized embedding algorithm that captures both topological and signed similarity jointly via signed autocovariance.
- Extensive experiments showed that the proposed model outperforms state-of-the-art methods by up to one order of magnitude.
Apr. 2020 - Oct. 2021
- Interpreted the layer-wise propagation rule of GCN from the perspective of power iteration and analyzed its converging process.
- Designed a novel GCN architecture that learns to aggregate multiscale information based on graph diffusions with a neural network.
- Illustrated the effectiveness and efficiency of the proposed model by extensive comparative studies with state-of-the-art methods.
Sep. 2018 - Feb. 2021
- Presented a unified view of embedding, covering different random-walk processes, similarity metrics, and embedding algorithms.
- Showed both theoretical and empirical evidence of the superiority of the novel autocovariance embedding in link prediction.
- Illustrated ways to exploit the multiscale nature of random-walk similarity to further optimize embedding performance.
Jan. 2020 - Aug. 2020
- Extended Prospect Theory to model group-level risky decision-making dynamics with an interpersonal influence system.
- Results on two human-subject experiments show that group behavior shifts towards consensus and that the shift is explained by interpersonal influence.
Jul. 2016 - Aug. 2017
- Proposed an intuitive algorithm for fast overlapping community detection in networks.
- Introduced gain and loss functions from game theory to model the intrinsic dynamics between nodes.
- Results are comparable to state-of-the-art algorithms.
2020 - 2021
Fall 2019
Spring 2019
Winter 2019
Fall 2018
Feb. 2023
Feb. 2022
Sep. 2020
Sep. 2018
Sep. 2018
Jun. 2018
Jun. 2018
Dec. 2017
Dec. 2017
Dec. 2017
Oct. 2017
May 2017
Dec. 2016
Dec. 2015
Dec. 2015
Dec. 2015
- KDD’23
- AAAI’23-24
- KDD’22
- SDM’22
- TNNLS’23
- Neural Networks’23
- TIST’22-24
- TKDD’21-24
- KDD’20-21
- WebConf’21
- Graduate Affairs Committee, Department of Computer Science, UCSB
- SB Hacks V, VII, VIII Hackathon