I build intelligent systems that perceive, reason, remember, search, and verify. My work spans computer vision, medical imaging, multimedia retrieval, healthcare AI, multimodal foundation models, neuro-inspired architectures, and trustworthy AI systems deployed in real clinical environments.
I teach and mentor students in the Stanford Biomedical Data Science Program, and I teach healthcare AI topics at other institutions around the country. I also collaborate with several research institutions around the world.
The next frontier of AI is not only generating answers, but ensuring their correctness. As foundation models become more capable, high-stakes domains such as healthcare require systems that can independently verify, explain, and correct outputs before they influence human decisions.
My current research focuses on building Verification AI systems that move beyond generation toward trust, safety, and accountability in multimodal AI.
Recent work introduces verifier models for radiology AI that detect factual inconsistencies, localization errors, and hallucinations in generated clinical reports. This work is a collaboration with Prof. Pingkun Yan's group at RPI. The Master's student who led the work under our guidance recently won the UC Berkeley Data Science ChangeMaker Award for building trustworthy AI.
This includes:
This direction shifts AI systems from generation-centric to verification-centric design.
I shepherded IBM's Granite Vision 3.3 2B foundation model, a top OCRBench performer with 95K+ downloads.
SemCLIP: A Semantic Memory-Aligned Vision Language Model
The formation and recall of memories in the brain rely on the linkage between the episodic and semantic memory subsystems. In this work, I exploited this paradigm to design a new vision-language model that projects visual and textual concepts into a semantic memory space to build stable conceptual associations between objects and the ways of referring to them.
Whether it is picking the right size of stent for a coronary artery or ensuring that the stent has been placed correctly, interventional AI requires high-precision AI architectures. This project, which I led in collaboration with Boston Scientific and MIT, developed a novel segmentation neural network for measuring continuous lumen boundaries in intravascular imaging.
Our stent detection method was subsequently commercialized by Boston Scientific and introduced in clinics through their AVVIGO multi-modality guidance system.
The bioinspired memories project I led, which built a computational model of the hippocampal memory system, launched a new IBM Content-aware storage product.
The trisynaptic circuit of the hippocampal system in the brain holds the key to storing and recalling memories. While the auto-associative features of CA3 cells have been modeled through mathematical frameworks such as Modern Hopfield Networks, turning them into large-scale storage systems has not been practical. This work builds an encoding mechanism, analogous to the dentate gyrus, that is applied before storage in Hopfield networks and that enables accurate recovery of a large number of facts using cross-modal associations.
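As a rough illustration of the idea, and not the project's actual implementation, the sketch below pairs a sparse "pattern-separating" encoder with a single modern Hopfield retrieval step (a softmax-weighted readout over stored keys). The encoder here is an assumption chosen for simplicity: a random expansion followed by top-k winner-take-all. The names `sparse_expand` and `hopfield_retrieve`, the dimensions, and the `beta` value are all hypothetical.

```python
import numpy as np

def sparse_expand(x, proj, k):
    """Hypothetical dentate-gyrus-like encoder (an assumption, not the
    published method): expand with a random projection, keep the k most
    active units, and L2-normalize the resulting sparse code."""
    h = proj @ x
    code = np.zeros_like(h)
    top = np.argpartition(h, -k)[-k:]   # indices of the k largest activations
    code[top] = h[top]
    norm = np.linalg.norm(code)
    return code / norm if norm > 0 else code

def hopfield_retrieve(keys, values, query, beta=20.0):
    """One modern Hopfield update: softmax over key/query similarities,
    then a weighted readout of the associated values."""
    scores = beta * (keys @ query)
    scores -= scores.max()              # subtract max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum()
    return weights @ values, weights

rng = np.random.default_rng(0)
dim, expanded, n_facts, k = 32, 512, 50, 16

proj = rng.standard_normal((expanded, dim))   # fixed random expansion
facts = rng.standard_normal((n_facts, dim))   # toy "facts" to store
keys = np.stack([sparse_expand(f, proj, k) for f in facts])

# Query with a noisy version of fact 7 and read back the stored fact.
noisy = facts[7] + 0.1 * rng.standard_normal(dim)
recalled, weights = hopfield_retrieve(keys, facts, sparse_expand(noisy, proj, k))
best = int(np.argmax(weights))
print("most strongly recalled fact index:", best)
```

In this toy setup the values are the facts themselves; storing embeddings from a second modality as `values` would turn the same readout into a cross-modal association.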
Beyond research contributions, I have helped shape the evolution of computer vision and healthcare AI communities through leadership roles in major international conferences and professional organizations.
My work has translated into deployed AI systems, enterprise platforms, and new product categories across healthcare, enterprise AI, and consumer robotics.
These courses explore emerging advances in multimodal foundation models, clinical AI, verification systems, and future directions for trustworthy healthcare AI. The course taught at Stanford is a full-length course co-taught with Prof. James Zhou and Akshay Chaudhari. The other courses listed are 4- to 8-hour tutorials delivered at the MICCAI conference for medical imaging attendees.