I am a Ph.D. candidate in Computer Science at the University of North Carolina at Charlotte, supervised by
Dr. Pu Wang
in the GENIUS Lab. In industry, I have worked as a researcher with the Multimodal GenAI teams at
Amazon and
Lowe’s, where I developed large-scale multimodal large language models (MLLMs) to improve operational efficiency and customer experience in complex, real-world environments. I am currently a Research Scientist Intern at
Google on the Extended Reality (AR/VR) team, where I work on advancing multimodal and generative AI for immersive technologies.
My research focuses on building multimodal foundation models that unify real-time perception with high-fidelity synthesis. I aim to develop context-aware AI systems capable of perceiving, reconstructing, and interacting with complex human behavior across both physical and virtual environments. My work centers on multimodal motion synthesis frameworks that enable controllable, high-quality 3D human animation for real-time applications, as well as 3D human pose estimation and mesh reconstruction using generative masked modeling. Ultimately, I seek to leverage these generative foundations to create AI systems that can both understand human behavior in the physical world and synthesize interactive digital counterparts within immersive XR environments.
If you have any research opportunities or open positions, please feel free to reach out at msaleem2@charlotte.edu.
Google, San Francisco, CA Nov. 2025 – Present
Amazon Inc., Boston, MA June 2025 – Aug. 2025
Lowe’s, Charlotte, NC Sept. 2023 – Oct. 2025