-
Object-Guided Visual Tokens: Eliciting Compositional Reasoning in Multimodal Language Models
Addressing MLLMs' shortcomings in Compositional Reasoning through CLIP-Segmentation fusion
-
Optimizing Predictions: Vocabulary Reduction and Contrastive Decoding in LLMs
Efficiency-focused early exiting, vocabulary pruning, and contrastive decoding for LLM inference
-
Perception, Localization, Planning and Control on RAE Robots
Computer Vision for Autonomous Robots
-
Model Compression for Machine Translation in Large Language Models
Model Compression for Machine Translation in Large Language Models