Streaming Model Explorer
Last Updated: October 3, 2024
notebook by Tilde Thurium: Mastodon || Twitter || LinkedIn
Problem: there are so many LLMs these days! Which model is the best for my use case?
This notebook uses Haystack 2.0 to compare the results of sending the same prompt to several different models.
This is a basic demo, limited to a handful of models that support streaming responses. I'd like to support more models in the future, so watch this space for updates.
Models
Haystack’s OpenAIGenerator and CohereGenerator support streaming out of the box.
The other models use the HuggingFaceAPIGenerator.
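All of these generators accept a streaming_callback that fires on every chunk of the response. As a minimal sketch of the mechanism (assuming an OPENAI_API_KEY Colab secret is already saved), streaming a single model's answer to stdout looks like this:
from haystack.components.generators import OpenAIGenerator
from haystack.utils import Secret
from google.colab import userdata

# Print each streamed chunk as it arrives, without trailing newlines.
generator = OpenAIGenerator(
    api_key=Secret.from_token(userdata.get('OPENAI_API_KEY')),
    streaming_callback=lambda chunk: print(chunk.content, end=''),
)
generator.run("Say hello in five words.")
The notebook below does the same thing, except the callback writes into ipywidgets output boxes so several models can stream side by side.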
Prerequisites
- You need Hugging Face, Cohere, and OpenAI API keys. Save them as secrets in your Colab. Click on the key icon in the left menu or see detailed instructions here.
- To use Mistral-7B-Instruct-v0.1, you should also accept Mistral's conditions here: https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1
!pip install -U haystack-ai cohere-haystack "huggingface_hub>=0.22.0"
In order for userdata.get to work, these keys need to be saved as secrets in your Colab. Click on the key icon in the left menu or see detailed instructions here.
from haystack.components.generators import OpenAIGenerator, HuggingFaceAPIGenerator
from haystack_integrations.components.generators.cohere import CohereGenerator
from haystack.utils import Secret
from google.colab import userdata

# API keys are read from Colab secrets (the key icon in the left sidebar).
open_ai_generator = OpenAIGenerator(api_key=Secret.from_token(userdata.get('OPENAI_API_KEY')))
cohere_generator = CohereGenerator(api_key=Secret.from_token(userdata.get('COHERE_API_KEY')))

# The remaining models all go through the Hugging Face serverless Inference API.
hf_generator = HuggingFaceAPIGenerator(
    api_type="serverless_inference_api",
    api_params={"model": "mistralai/Mistral-7B-Instruct-v0.1"},
    token=Secret.from_token(userdata.get('HF_API_KEY')))

hf_generator_2 = HuggingFaceAPIGenerator(
    api_type="serverless_inference_api",
    api_params={"model": "tiiuae/falcon-7b-instruct"},
    token=Secret.from_token(userdata.get('HF_API_KEY')))

hf_generator_3 = HuggingFaceAPIGenerator(
    api_type="serverless_inference_api",
    api_params={"model": "bigscience/bloom"},
    token=Secret.from_token(userdata.get('HF_API_KEY')))
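Before wiring everything together, it's worth sanity-checking one generator on its own. A quick sketch (the question is arbitrary); each generator's run method returns a dict whose "replies" key holds the generated strings:
result = hf_generator.run("What is the capital of France?")
print(result["replies"][0])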
MODELS = [open_ai_generator, cohere_generator, hf_generator, hf_generator_2, hf_generator_3]
The AppendToken dataclass formats the output so that the model name is printed first, and the response text follows in chunks of 5 tokens.
from dataclasses import dataclass, field
import ipywidgets as widgets
from IPython.display import display

@dataclass
class AppendToken:
    output: widgets.Output
    # Use a default_factory so each AppendToken instance gets its own buffer;
    # a bare class-level list would be shared across all models.
    chunks: list = field(default_factory=list)
    chunk_size: int = 5

    def __call__(self, chunk):
        # Invoked by the generator for every streaming chunk; flush the buffer
        # to the output widget once chunk_size chunks have accumulated.
        with self.output:
            text = getattr(chunk, 'content', '')
            self.chunks.append(text)
            if len(self.chunks) == self.chunk_size:
                output_string = ' '.join(self.chunks)
                self.output.append_display_data(output_string)
                self.chunks.clear()

def multiprompt(prompt, models=MODELS):
    # One bordered output widget per model, displayed side by side.
    outputs = [widgets.Output(layout={'border': '1px solid black'}) for _ in models]
    display(widgets.HBox(children=outputs))

    for i, model in enumerate(models):
        model_name = getattr(model, 'model', '')
        outputs[i].append_display_data(f'Model name: {model_name}')
        model.streaming_callback = AppendToken(outputs[i])
        model.run(prompt)
multiprompt("Tell me a cyberpunk story about a black cat.")
This was a very silly example prompt. If you found this demo useful, let me know the kinds of prompts you tested it with!
Mastodon || Twitter || LinkedIn
Thanks for following along.