Publications

2026
(2026). Massive Spikes in LLMs are Bias Vectors: Mechanistic Uncovering and Spike-Free Quantization. arXiv ’26.
2025
(2025). Optimizing Compute Core Assignment for Dynamic Batch Inference in AI Inference Accelerator. ACM SAC ’25.
2023
(2023). Design of Analog-AI Hardware Accelerators for Transformer-based Language Models. IEDM ’23.