-
Object-Guided Visual Tokens:
Eliciting Compositional Reasoning
in Multimodal Language Models
Addressing MLLMs' shortcomings in compositional reasoning through CLIP-segmentation fusion
-
Perception, Localization, Planning and Control on RAE Robots
Computer Vision for Autonomous Robots
-
Model Compression for Machine Translation in Large Language Models
-
Optimizing Predictions: Vocabulary Reduction and Contrastive Decoding in LLMs
Efficient LLM inference through early exiting, vocabulary pruning, and contrastive decoding