I am a Ph.D. candidate in Computer Science at the University of North Carolina at Charlotte, supervised by Dr. Pu Wang in the GENIUS Lab. In industry, I work as a researcher with the Computer Vision teams at Amazon and Lowe’s, where I develop large-scale multimodal language models (MLLMs) to improve operational efficiency and customer experience in complex, real-world environments. I’ll also be joining Google as a Student Researcher on the Multimodal GenAI (AR/VR) team, working on multimodal and generative AI for immersive technologies.
My research interests lie at the intersection of computer vision and generative AI, with a focus on 3D human modeling. Specifically, I work on 3D human pose estimation and mesh reconstruction via generative masked modeling. I am also interested in developing multimodal motion synthesis frameworks that generate controllable, high-fidelity 3D human animations for real-time applications.
If you have any research opportunities or open positions, please feel free to reach out at msaleem2@charlotte.edu.
Google, San Francisco, CA Nov. 2025 – Present
Amazon Inc., Boston, MA June 2025 – Present
Lowe’s, Charlotte, NC Sept. 2023 – Present