Khurram Azeem Hashmi

Machine Learning Engineer at DFKI and Ph.D. Candidate at RPTU Kaiserslautern

Email / CV / Google Scholar / Twitter / GitHub / LinkedIn

About Me

I am a Machine Learning Engineer at the German Research Center for Artificial Intelligence (DFKI) and a Ph.D. Candidate at RPTU Kaiserslautern-Landau, working under the supervision of Prof. Dr. Didier Stricker. My research focuses on computer vision and deep learning, particularly on robust image and video understanding in challenging environments with limited spatial or labeled information. I was recently nominated for the AI Person of the Year Award 2024, and my work has been published at top-tier conferences including CVPR, ICCV, and WACV.

My current work spans three areas: developing sparse transformers for efficient scene understanding, enhancing visual perception in challenging environments, and creating multi-modal systems for robotic applications. I lead projects such as HERON, which focuses on vision-guided robotic assembly, and I work on open-vocabulary scene understanding for egocentric perception. My approach combines theoretical innovation with practical application, aiming to bridge the gap between academic research and real-world deployment. I am particularly passionate about making AI systems more efficient and interpretable, and I am currently exploring Agentic AI and autonomous learning systems.

Projects

HERON: Multi-Modal Vision Guided Navigation System for Collaborative Assembly Robots
A multi-modal, vision-guided navigation system for collaborative assembly robots, with capabilities for obstacle avoidance, path planning, and precise screw mounting. The system integrates multiple sensor modalities to support robust and efficient human-robot collaboration in assembly tasks.

Prompt: Make a Pasta

Open Vocabulary Egocentric Scene Understanding
This project leverages multi-modal foundation models to enhance egocentric scene understanding in unconstrained environments. The system enables natural language-driven task execution through LLM prompting, allowing users to carry out complex sequences of actions by simply describing a task such as "make a pasta". The approach combines vision-language models with practical augmented reality applications.

Selected Publications [Google Scholar]

Beyond Boxes: Mask-Guided Spatio-Temporal Feature Aggregation for Video Object Detection
K.A. Hashmi, T.U. Sheikh, D. Stricker, M.Z. Afzal
WACV, 2025
pdf / webpage (coming soon) / code (coming soon)

Sparse Semi-DETR: Sparse Learnable Queries for Semi-Supervised Object Detection
T. Shehzadi, K.A. Hashmi, D. Stricker, M.Z. Afzal
CVPR, 2024
pdf / video

FeatEnHancer: Enhancing Hierarchical Features for Object Detection and Beyond Under Low-Light Vision
K.A. Hashmi, G. Kallempudi, D. Stricker, M.Z. Afzal
ICCV, 2023
pdf / webpage / code

BoxMask: Revisiting Bounding Box Supervision for Video Object Detection
K.A. Hashmi, A. Pagani, D. Stricker, M.Z. Afzal
WACV, 2023
pdf / code

Spatio-Temporal Learnable Proposals for End-to-End Video Object Detection
K.A. Hashmi, D. Stricker, M.Z. Afzal
BMVC, 2022
pdf / video

Object Detection with Transformers: A Review
T. Shehzadi, K.A. Hashmi, D. Stricker, M.Z. Afzal
arXiv, 2023
pdf / code

Attention-Guided Disentangled Feature Aggregation for Video Object Detection
S. Muralidhara, K.A. Hashmi, A. Pagani, D. Stricker, M.Z. Afzal
Sensors, 2022
pdf

Exploiting Concepts of Instance Segmentation to Boost Detection in Challenging Environments
K.A. Hashmi, A. Pagani, M. Liwicki, D. Stricker, M.Z. Afzal
Sensors, 2022
pdf

Guided Table Structure Recognition Through Anchor Optimization
K.A. Hashmi, D. Stricker, M. Liwicki, M.Z. Afzal
IEEE Access, 2022
pdf

Services

Reviewer of Conferences:
ICCV 2025, CVPR 2025, ICLR 2024, ECCV 2024, CVPR 2024, WACV 2023, ECCV 2022, BMVC 2022

Reviewer of Journals:
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
IEEE Access
Springer Nature
Neurocomputing
Journal of Imaging

Honors & Awards

Received a merit-based full scholarship covering the entire Bachelor's degree
Nominated for the Best AI Newcomer Award by the German Association of Computer Science
Nominated for the AI Person of the Year Award 2024 by data:unplugged and t3n Magazin

Template credits: Jon Barron.