What if...?: Thinking Counterfactual Keywords Helps to Mitigate Hallucination in Large Multi-modal Models

Integrated Vision and Language Lab, KAIST
EMNLP Findings 2024

Abstract

This paper presents a way of enhancing the reliability of Large Multi-modal Models (LMMs) in addressing hallucination, where the models generate cross-modally inconsistent responses. Without additional training, we propose Counterfactual Inception, a novel method that implants counterfactual thinking into LMMs using self-generated counterfactual keywords. Our method is grounded in the concept of counterfactual thinking, a cognitive process in which humans consider alternative realities, enabling more extensive context exploration. Bridging this human cognitive mechanism into LMMs, we aim for the models to engage with and generate responses that span a wider contextual scene understanding, mitigating hallucinatory outputs. We further introduce the Plausibility Verification Process (PVP), a simple yet robust keyword constraint that effectively filters out sub-optimal keywords to enable the consistent triggering of counterfactual thinking in the model responses. Comprehensive analyses across various LMMs, including both open-source and proprietary models, corroborate that counterfactual thinking significantly reduces hallucination and helps to broaden contextual understanding based on true visual clues.

Counterfactual Inception Overview

Counterfactual Inception: LMMs generate counterfactual keywords at the object, attribute, and relation levels, then integrate them with a counterfactual prompt to implant counterfactual thinking into the models. To filter out keywords that are either too similar to the visual content or deviate too far from it, we adopt a robust constraint called PVP.
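
The keyword-to-prompt step described above can be sketched roughly as follows. This is a minimal illustration only: the `CounterfactualKeywords` container, the `build_counterfactual_prompt` helper, and the prompt wording are all hypothetical and not taken from the paper. The keywords are assumed to have been generated by the LMM and already filtered by PVP.

```python
from dataclasses import dataclass, field


@dataclass
class CounterfactualKeywords:
    """Hypothetical container for keywords at the three levels named above."""
    objects: list[str] = field(default_factory=list)
    attributes: list[str] = field(default_factory=list)
    relations: list[str] = field(default_factory=list)


def build_counterfactual_prompt(question: str, kw: CounterfactualKeywords) -> str:
    """Combine the keywords with an illustrative counterfactual instruction.

    The wording below is a placeholder, not the prompt used in the paper.
    """
    lines = ["Counterfactual keywords (things that are NOT in the image):"]
    levels = (
        ("object", kw.objects),
        ("attribute", kw.attributes),
        ("relation", kw.relations),
    )
    for level, words in levels:
        if words:  # skip levels for which no keyword survived filtering
            lines.append(f"- {level}: {', '.join(words)}")
    lines.append(
        "Keep these alternatives in mind, then answer using only what is actually visible."
    )
    lines.append(f"Question: {question}")
    return "\n".join(lines)
```

The resulting string would be passed to the model together with the image; no training or weight update is involved, in line with the training-free setting described in the abstract.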

Keyword Analysis

Frequency distribution of the counterfactual keywords. We empirically observed that keywords in the upper half of the distribution are closer to factual than to counterfactual information, so the lower half, excluding the extreme low end, is used as the selection criterion.
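
A rank-based version of this selection rule might look like the sketch below. It assumes each keyword comes with a scalar score measuring how close it is to the image content; how that score is computed and the exact cutoffs are not stated here, so the `drop_lowest` fraction and the function name `select_lower_half` are placeholders chosen for illustration.

```python
def select_lower_half(scored_keywords, drop_lowest=0.1):
    """Keep keywords in the lower half of the score ranking, minus the extreme low end.

    scored_keywords: list of (keyword, score) pairs; a higher score is assumed
                     to mean "closer to what is actually in the image".
    drop_lowest:     assumed fraction of lowest-scoring keywords to discard
                     as too far from the visual content.
    """
    ranked = sorted(scored_keywords, key=lambda pair: pair[1])
    n = len(ranked)
    start = int(n * drop_lowest)  # skip the extreme low end
    stop = n // 2                 # upper half is treated as too factual
    return [keyword for keyword, _ in ranked[start:stop]]
```

With ten scored keywords and the default setting, the lowest-ranked keyword and the five highest-ranked ones are discarded, leaving the four in between.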

Keyword Distribution

Top-5 word occurrences in the counterfactual keywords, obtained through morphological analysis (NLTK), and the cumulative distribution of the self-generated counterfactual keywords. One interesting finding is that GPT-4V likes ice cream (a biased result). Such bias may stem from words that occur frequently in its training data, reflecting a specific weakness in the model's ability to generate diverse alternatives. It also suggests that counterfactual keywords could be used to reveal generative vulnerabilities in the model's alternative responses.
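
The kind of counting behind such a figure can be approximated with the standard library alone. The sketch below is not the authors' analysis script: it replaces the NLTK morphological analysis with plain lowercasing and whitespace splitting, and the helper names `top_k_words` and `cumulative_share` are made up for this example.

```python
from collections import Counter
from itertools import accumulate


def _word_counts(keywords):
    """Count individual words across all keywords (simplified normalization)."""
    return Counter(word for kw in keywords for word in kw.lower().split())


def top_k_words(keywords, k=5):
    """Return the k most frequent words as (word, count) pairs."""
    return _word_counts(keywords).most_common(k)


def cumulative_share(keywords):
    """Cumulative fraction of all word occurrences, from most to least frequent word."""
    counts = [count for _, count in _word_counts(keywords).most_common()]
    total = sum(counts)
    return [running / total for running in accumulate(counts)]
```

A steep initial rise in the cumulative share would indicate that a handful of words dominate the generated keywords, which is the kind of bias the caption points out.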

Qualitative Results

BibTeX

@article{kim2024if,
  title={What if...?: Counterfactual inception to mitigate hallucination effects in large multimodal models},
  author={Kim, Junho and Kim, Yeon Ju and Ro, Yong Man},
  journal={arXiv preprint arXiv:2403.13513},
  year={2024}
}