
Fix the chat example for generate when loading a PEFT model (#158)

Geeta Chauhan 1 year ago
parent
commit
322522e9a2
1 changed file with 1 addition and 1 deletion
      inference/chat_completion.py

+ 1 - 1
inference/chat_completion.py

@@ -107,7 +107,7 @@ def main(
             tokens= tokens.unsqueeze(0)
             tokens= tokens.to("cuda:0")
             outputs = model.generate(
-                tokens,
+                input_ids=tokens,
                 max_new_tokens=max_new_tokens,
                 do_sample=do_sample,
                 top_p=top_p,
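The fix passes the tokens as the `input_ids` keyword instead of positionally. A likely reason (an assumption, not stated in the commit): `PeftModel.generate` wraps the base model's `generate` and forwards only keyword arguments, so a positional tensor is dropped or rejected. A minimal sketch of that forwarding behavior, using stand-in classes rather than the real `transformers`/`peft` APIs:

```python
class BaseModel:
    """Stand-in for a HF model whose generate accepts input_ids."""

    def generate(self, input_ids=None, max_new_tokens=0, **kwargs):
        # Echo back what was received so the call path is visible.
        return f"generated from {input_ids}"


class PeftWrapper:
    """Stand-in for PeftModel: generate takes **kwargs only,
    so positional arguments cannot reach the base model."""

    def __init__(self, base):
        self.base = base

    def generate(self, **kwargs):
        return self.base.generate(**kwargs)


model = PeftWrapper(BaseModel())

# Positional call fails: the wrapper's signature has no positional slot.
try:
    model.generate([1, 2, 3])
except TypeError:
    print("positional call rejected")

# Keyword call, matching the fixed line, is forwarded correctly.
print(model.generate(input_ids=[1, 2, 3], max_new_tokens=8))
```

With the keyword form, the same call site works whether `model` is the bare model or the PEFT-wrapped one, which is why the one-line change fixes the example.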