Use meta-llama/Llama-3.2-3B-Instruct, get unexpected result. #1366

Open
Ningshiqi opened this issue Oct 12, 2024 · 1 comment

Comments

@Ningshiqi

🐛 Bug

To Reproduce

Steps to reproduce the behavior:

I followed https://captum.ai/tutorials/Llama2_LLM_Attribution
My code is below; the only difference is that I changed the model_name.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

from captum.attr import (
    FeatureAblation,
    ShapleyValues,
    LayerIntegratedGradients,
    LLMAttribution,
    LLMGradientAttribution,
    TextTokenInput,
    TextTemplateInput,
    ProductBaselines,
)


model_name = "meta-llama/Llama-3.2-1B-Instruct"



def load_model(model_name, bnb_config):
    n_gpus = torch.cuda.device_count()
    max_memory = "10000MB"

    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        quantization_config=bnb_config,
        device_map="auto", # dispatch efficiently the model on the available ressources
        max_memory = {i: max_memory for i in range(n_gpus)},
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name, token=True)

    # Needed for LLaMA tokenizer
    tokenizer.pad_token = tokenizer.eos_token

    return model, tokenizer

def create_bnb_config():
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_use_double_quant=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    return bnb_config

bnb_config = create_bnb_config()

model, tokenizer = load_model(model_name, bnb_config)

model.eval()



def prompt_fn(*examples):
    main_prompt = "Decide if the following movie review enclosed in quotes is Positive or Negative:\n'I really liked the Avengers, it had a captivating plot!'\nReply only Positive or Negative."
    subset = [elem for elem in examples if elem]
    if not subset:
        prompt = main_prompt
    else:
        prefix = "Here are some examples of movie reviews and classification of whether they were Positive or Negative:\n"
        prompt = prefix + " \n".join(subset) + "\n " + main_prompt
    return "[INST] " + prompt + "[/INST]"

input_examples = [
    "'The movie was ok, the actors weren't great' Negative", 
    "'I loved it, it was an amazing story!' Positive",
    "'Total waste of time!!' Negative", 
    "'Won't recommend' Negative",
]
sv = ShapleyValues(model) 

sv_llm_attr = LLMAttribution(sv, tokenizer)

#attr_res = sv_llm_attr.attribute(inp, target=target, num_trials=3)

inp = TextTemplateInput(
    prompt_fn, 
    values=input_examples,
)
attr_res = sv_llm_attr.attribute(inp)

attr_res.plot_token_attr(show=True)

```
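
A side note on the code above, and only a guess on my part: the `[INST] ... [/INST]` wrapper in `prompt_fn` is the Llama-2 / Mistral prompt format, while Llama-3.x Instruct models ship their own chat template. A minimal sketch of a variant `prompt_fn` that renders the same content with `tokenizer.apply_chat_template` (reusing the tokenizer loaded above) looks like this; I have not confirmed that it changes the attribution result.

```python
# Sketch: same prompt content, but rendered with the model's own chat template
# instead of the hard-coded Llama-2 style "[INST] ... [/INST]" wrapper.
def prompt_fn(*examples):
    main_prompt = (
        "Decide if the following movie review enclosed in quotes is Positive or Negative:\n"
        "'I really liked the Avengers, it had a captivating plot!'\n"
        "Reply only Positive or Negative."
    )
    subset = [elem for elem in examples if elem]
    if subset:
        prefix = (
            "Here are some examples of movie reviews and classification of "
            "whether they were Positive or Negative:\n"
        )
        content = prefix + " \n".join(subset) + "\n " + main_prompt
    else:
        content = main_prompt
    # apply_chat_template renders the template bundled with the tokenizer,
    # including the generation prompt for the assistant turn.
    return tokenizer.apply_chat_template(
        [{"role": "user", "content": content}],
        tokenize=False,
        add_generation_prompt=True,
    )
```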

Expected behavior

It should generate 'Positive' or 'Negative', and plot the attribution scores between the prompt's examples and the output.

The actual output

The system prints a warning: "The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
Setting pad_token_id to eos_token_id:128001 for open-end generation."

And the plotted picture is: [attribution plot screenshot]

Environment

Describe the environment used for Captum


 - Captum: 0.7.0
 - OS (Google Colab): Ubuntu 22.04.3 LTS
 - How Captum / PyTorch was installed: pip
 - Python version: 3.10.12
 - CUDA/cuDNN version:
   ```
   cuda-python                        12.2.1
   cupy-cuda12x                       12.2.0
   jax-cuda12-pjrt                    0.4.33
   jax-cuda12-plugin                  0.4.33
   nvidia-cuda-cupti-cu12             12.6.80
   nvidia-cuda-nvcc-cu12              12.6.77
   nvidia-cuda-runtime-cu12           12.6.77
   ```
 - GPU models and configuration:
 - PyTorch: 2.4.1+cu121
 - Transformers: 4.44.2
 
## The possible problem
Is the attention_mask incorrect when Captum calls model.generate?
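
If that is the cause, one workaround I would try is passing generation kwargs through Captum. This is only a sketch: it assumes `LLMAttribution.attribute` accepts a `gen_args` dict that is forwarded to `model.generate` (that is how I read the current Captum source), and the keys below are ordinary Hugging Face `generate` arguments.

```python
# Sketch (unverified): forward explicit generation args through Captum so that
# model.generate receives a pad_token_id and a length limit, which should at
# least silence the "pad token id not set" warning above.
attr_res = sv_llm_attr.attribute(
    inp,
    gen_args={
        "max_new_tokens": 5,                     # "Positive"/"Negative" is short
        "do_sample": False,                      # greedy decoding, reproducible
        "pad_token_id": tokenizer.eos_token_id,  # explicit pad token
    },
)
attr_res.plot_token_attr(show=True)
```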

@Ningshiqi
Author

Following https://captum.ai/tutorials/Llama2_LLM_Attribution, when I use mistralai/Mistral-7B-Instruct-v0.3, the model generates "POSITIVE", but the scores are all zero.
[screenshot of the attribution plot]
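
Another thing I plan to try (just a guess on my side): pass an explicit `target` and several trials, like the commented-out line in my script above, so the Shapley perturbations are always scored against the same completion rather than whatever the model happens to generate.

```python
# Sketch: attribute against a fixed target string and average over trials,
# mirroring the commented-out call in the reproduction script.
attr_res = sv_llm_attr.attribute(inp, target="Positive", num_trials=3)
attr_res.plot_token_attr(show=True)
```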
