Newsletter Sending Agent with Experimental Haystack Tools


Last Updated: October 18, 2024

🧑‍🍳 Demo by Stefano Fiorucci (X, LinkedIn) and Tuana Celik (X, LinkedIn)

In this recipe, we will build a newsletter sending agent with 3 tools:

A tool that fetches the top stories from Hacker News
A tool that creates newsletters for a particular audience
A tool that can send emails (with Gmail)

At the time of writing (16 October 2024), we are using the Tool datatypes and components from the haystack-experimental package.


Install dependencies

! pip install haystack-ai trafilatura

Experimental Features

In this demo, we use Haystack features from the haystack-experimental package.

See all the experimental features in the haystack-experimental repository
Discuss each feature in the discussion board

The experimental components and extensions we are using here are:

The Tool and the extended ChatMessage datatypes.
The ToolInvoker component.

from typing import List
from trafilatura import fetch_url, extract
import requests
from getpass import getpass
import os

from haystack_experimental.components.generators.chat import OpenAIChatGenerator
from haystack_experimental.dataclasses import Tool, ChatMessage
from haystack_experimental.components.tools import ToolInvoker

from haystack import Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator

Hacker News Fetcher Tool

In a previous article and recipe, we showed how to create a custom Haystack component called HackerNewsFetcher.

Here, we do something very similar, but we write a plain function and use it as a Tool instead.

📚 Hacker News Summaries with Custom Components

This tool expects top_k as input, and returns that many of the current top stories on Hacker News 🚀

def hacker_news_fetcher(top_k: int = 3):
    # Fetch the IDs of the current top stories
    top_story_ids = requests.get(url='https://hacker-news.firebaseio.com/v0/topstories.json?print=pretty')
    urls = []
    articles = []
    for id_ in top_story_ids.json()[0:top_k]:
        item = requests.get(url=f"https://hacker-news.firebaseio.com/v0/item/{id_}.json?print=pretty").json()
        # Stories either link to an external page or carry their own text
        if 'url' in item:
            urls.append(item['url'])
        elif 'text' in item:
            articles.append(item['text'])

    # Download each linked page and keep the first 500 characters of its extracted text
    for url in urls:
        try:
            downloaded = fetch_url(url)
            text = extract(downloaded)
            if text is not None:
                articles.append(text[:500])
        except Exception as e:
            print(e)
            print(f"Couldn't download {url}, skipped")

    return articles

hacker_news_fetcher_tool = Tool(name="hacker_news_fetcher",
                                description="Fetch the top k articles from hacker news",
                                function=hacker_news_fetcher,
                                parameters={
                                    "type": "object",
                                    "properties": {
                                        "top_k": {
                                            "type": "integer",
                                            "description": "The number of articles to fetch"
                                        }
                                    },
                                })
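The fetcher's selection logic can be checked without network access. The following is a minimal sketch, not part of the original recipe: `split_stories`, `collect_articles`, the `download` parameter, and the sample items are hypothetical names introduced for illustration. It mirrors the behaviour above, where inline story texts are kept whole and downloaded pages are cut to 500 characters.

```python
from typing import Callable, Dict, List, Optional, Tuple


def split_stories(items: List[Dict]) -> Tuple[List[str], List[str]]:
    """Separate story items into external URLs and inline texts."""
    urls: List[str] = []
    texts: List[str] = []
    for item in items:
        if "url" in item:
            urls.append(item["url"])
        elif "text" in item:
            texts.append(item["text"])
    return urls, texts


def collect_articles(
    items: List[Dict],
    download: Callable[[str], Optional[str]],
    max_chars: int = 500,
) -> List[str]:
    """Inline texts come first, followed by truncated downloaded pages."""
    urls, articles = split_stories(items)
    for url in urls:
        text = download(url)  # stand-in for fetch_url + extract
        if text is not None:
            articles.append(text[:max_chars])
    return articles


# Hypothetical sample data: one link, one text post, one item with neither.
sample_items = [
    {"id": 1, "url": "https://example.com/a"},
    {"id": 2, "text": "Ask HN: inline text"},
    {"id": 3, "title": "no url or text"},
]
fake_pages = {"https://example.com/a": "x" * 800}

articles = collect_articles(sample_items, download=fake_pages.get)
print([len(a) for a in articles])
```

Passing the downloader in as an argument is only a testing convenience here; the tool function above calls trafilatura directly.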

For the newsletter generation tool, we create a Haystack pipeline and turn the pipeline itself into a tool.

Our tool will expect the following inputs:

articles: The content to base the newsletter on
target_people: The audience we want to target, for example “engineers” may be our target audience
n_words: The number of words we want to limit our newsletter to

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass("Enter your OpenAI API key: ")

Enter your OpenAI API key: ··········

template = """
Create an entertaining newsletter for {{target_people}} based on the following articles.
The newsletter should be well structured, with a unique angle and a maximum of {{n_words}} words.

Articles:
{% for article in articles %}
    {{ article }}
    ---
{% endfor %}
"""

newsletter_pipe = Pipeline()
newsletter_pipe.add_component("prompt_builder", PromptBuilder(template=template))
newsletter_pipe.add_component("llm", OpenAIGenerator(model="gpt-4o-mini"))
newsletter_pipe.connect("prompt_builder", "llm")

<haystack.core.pipeline.pipeline.Pipeline object at 0x7e3fedb2c6a0>
🚅 Components
  - prompt_builder: PromptBuilder
  - llm: OpenAIGenerator
🛤️ Connections
  - prompt_builder.prompt -> llm.prompt (str)

def newsletter_pipeline_func(articles: List[str], target_people: str = "programmers", n_words: int = 100):
    result = newsletter_pipe.run({"prompt_builder": {"articles": articles, "target_people": target_people, "n_words": n_words}})

    return {"reply": result["llm"]["replies"][0]}

newsletter_tool = Tool(name="newsletter_generator",
                          description="Generate a newsletter based on some articles",
                            function=newsletter_pipeline_func,
                            parameters={
                                "type": "object",
                                "properties": {
                                    "articles": {
                                        "type": "array",
                                        "items": {
                                            "type": "string",
                                            "description": "The articles to base the newsletter on",
                                        }
                                    },
                                    "target_people": {
                                        "type": "string",
                                        "description": "The target audience for the newsletter",
                                    },
                                    "n_words": {
                                        "type": "integer",
                                        "description": "The number of words to summarize the newsletter to",
                                    }
                                },
                                "required": ["articles"],
                            })

Send Email Tool

Here, we are creating a Gmail tool. You log in with your Gmail account, which allows the final Agent to send emails to others from your address.

⚠️ Note: To be able to use the Gmail tool, you have to create an app password for your Gmail account, which will be the sender. You can delete it afterwards.

To configure our email Tool, you have to provide the following information about the sender email account 👇

if "NAME" not in os.environ:
    os.environ["NAME"] = input("What's your name? ")
if "SENDER_EMAIL" not in os.environ:
    os.environ["SENDER_EMAIL"] = getpass("Enter your Gmail e-mail: ")
if "GMAIL_APP_PASSWORD" not in os.environ:
    os.environ["GMAIL_APP_PASSWORD"] = getpass("Enter your Gmail App Password: ")

Next, we create a Tool that expects the following input:

receiver: The email address that we want to send an email to
body: The body of the email
subject: The subject line for the email.

import smtplib, ssl
from email.mime.text import MIMEText

def send_email(receiver: str, body: str, subject: str):
  msg = MIMEText(body)
  sender_email = os.environ['SENDER_EMAIL']
  sender_name = os.environ['NAME']
  sender = f"{sender_name} <{sender_email}>"
  msg['Subject'] = subject
  msg['From'] = sender
  msg['To'] = receiver
  port = 465  # For SSL
  smtp_server = "smtp.gmail.com"
  password = os.environ["GMAIL_APP_PASSWORD"]
  context = ssl.create_default_context()
  with smtplib.SMTP_SSL(smtp_server, port, context=context) as server:
      server.login(sender_email, password)
      server.sendmail(sender_email, receiver, msg.as_string())
  return 'Email sent!'

email_tool = Tool(name="email",
                  description="Send emails with specific content",
                  function=send_email,
                  parameters={
                      "type": "object",
                      "properties": {
                          "receiver": {
                              "type": "string",
                              "description": "The email of the receiver"
                          },
                          "body": {
                              "type": "string",
                              "description": "The content of the email"
                          },
                          "subject": {
                              "type": "string",
                              "description": "The subject of the email"
                          }
                      },
                  })
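The message-building part of an email tool can be inspected without opening an SMTP connection. The sketch below uses only the standard library; `build_message` and the example addresses are hypothetical, and `formataddr` is used here as one way to produce the "Name <address>" form.

```python
from email.mime.text import MIMEText
from email.utils import formataddr


def build_message(sender_name: str, sender_email: str, receiver: str,
                  subject: str, body: str) -> MIMEText:
    """Build a plain-text email message without sending it."""
    msg = MIMEText(body)
    msg["Subject"] = subject
    msg["From"] = formataddr((sender_name, sender_email))
    msg["To"] = receiver
    return msg


msg = build_message(
    sender_name="Ada",
    sender_email="ada@example.com",
    receiver="bob@example.com",
    subject="Hello",
    body="Newsletter body",
)
# Printing the serialized message shows the headers that would go over the wire.
print(msg.as_string())
```

Checking the headers this way is a cheap sanity test before handing real credentials to the agent.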

Now, we build a newsletter-creating chat agent that we can ask to write newsletters and to send them to given email addresses.

chat_generator = OpenAIChatGenerator(tools=[hacker_news_fetcher_tool, newsletter_tool, email_tool])

tool_invoker = ToolInvoker(tools=[hacker_news_fetcher_tool, newsletter_tool, email_tool])

messages = [
        ChatMessage.from_system(
            """Prepare a tool call if needed, otherwise use your knowledge to respond to the user.
            If the invocation of a tool requires the result of another tool, prepare only one call at a time.

            Each time you receive the result of a tool call, ask yourself: "Am I done with the task?".
            If not and you need to invoke another tool, prepare the next tool call.
            If you are done, respond with just the final result."""
        )
    ]

while True:
    user_input = input("\n\nwaiting for input (type 'exit' or 'quit' to stop)\n🧑: ")
    if user_input.lower() == "exit" or user_input.lower() == "quit":
        break
    messages.append(ChatMessage.from_user(user_input))

    while True:
        print("⌛ iterating...")

        replies = chat_generator.run(messages=messages)["replies"]
        messages.extend(replies)

        # Check for tool calls and handle them
        if not replies[0].tool_calls:
            break
        tool_calls = replies[0].tool_calls

        # Print tool calls for debugging
        for tc in tool_calls:
            print("\n TOOL CALL:")
            print(f"\t{tc.id}")
            print(f"\t{tc.tool_name}")
            for k,v in tc.arguments.items():
                v_truncated = str(v)[:50]
                print(f"\t{k}: {v_truncated}{'' if len(v_truncated) == len(str(v)) else '...'}")

        tool_messages = tool_invoker.run(messages=replies)["tool_messages"]
        messages.extend(tool_messages)


    # Print the final AI response after all tool calls are resolved
    print(f"🤖: {messages[-1].text}")

waiting for input (type 'exit' or 'quit' to stop)
🧑: What is the top HN article now?
⌛ iterating...

 TOOL CALL:
	call_aOluHPdSMAGosryayVwNxOvL
	hacker_news_fetcher
	top_k: 1
⌛ iterating...

 TOOL CALL:
	call_G6Z10LdGwJgdqxspgIWwvsnl
	hacker_news_fetcher
	top_k: 1
⌛ iterating...
🤖: It appears that I'm facing an issue retrieving data from Hacker News, as the response does not contain the actual article information. Unfortunately, I'm unable to provide the top article at this moment. You might want to check the Hacker News website directly for the latest articles.


waiting for input (type 'exit' or 'quit' to stop)
🧑: What's the top 2 HN articles?
⌛ iterating...

 TOOL CALL:
	call_SWI39GuYw579wRDhQVs02WUu
	hacker_news_fetcher
	top_k: 2
⌛ iterating...
🤖: It seems I'm encountering difficulties retrieving the top articles from Hacker News properly. However, based on the partial information I received, one of the articles is:

1. **Adobe's new image rotation tool** - "Project Turnable" lets users fully rotate 2D vectors. This tool was showcased at Adobe's annual MAX conference as part of their "Sneaks" segment, where engineers present innovative ideas that may or may not be fully developed.

Unfortunately, the first article did not provide relevant content. For the most accurate and complete information on the top Hacker News articles, I recommend checking the Hacker News website directly.


waiting for input (type 'exit' or 'quit' to stop)
🧑: Create a newsletter targeted at engineers based on this article. No more than 100 words.
⌛ iterating...

 TOOL CALL:
	call_duVE2eBKkCe3wpJOuq6NEJte
	newsletter_generator
	articles: ["Adobe's new image rotation tool is one of the mo...
	target_people: engineers
	n_words: 100
⌛ iterating...
🤖: **Engineering Whirlwind**  
*Issue #42: AI Innovations Turned Up to Eleven*

Hello, Innovators!

Dive into Adobe's latest gem, *Project Turntable*! This fascinating tool offers engineers a chance to fully rotate 2D vectors like never before. Unveiled at their MAX conference, it sits at the intersection of imagination and engineering mastery. As Adobe's engineers sneak out innovative ideas, warm up those creative engines—who knows what else might come spinning your way?

Stay sharp and keep spinning those ideas!  
— The Engineering Brigade 🌟


waiting for input (type 'exit' or 'quit' to stop)
🧑: Email this to tuana.celik@deepset.ai You can decide on the subjectline
⌛ iterating...

 TOOL CALL:
	call_e7thnZ8Bq1kBjU4dyrGLP0jK
	email
	receiver: tuana.celik@deepset.ai
	body: **Engineering Whirlwind**  
*Issue #42: AI Innovat...
	subject: Latest Innovations: Adobe's Project Turntable
⌛ iterating...
🤖: The newsletter has been successfully emailed to tuana.celik@deepset.ai with the subject "Latest Innovations: Adobe's Project Turntable." If you need any further assistance, feel free to ask!


waiting for input (type 'exit' or 'quit' to stop)
🧑: exit
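The control flow of the inner loop (generate, run any requested tools, feed the results back, stop when no tool is requested) can be sketched with stand-ins and no API key. Everything in this block is hypothetical and for illustration only: `FakeMessage`, `FakeToolCall`, `run_agent_turn`, and `scripted_generate` are not Haystack classes, and the `max_steps` guard is an addition that the loop above does not have.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class FakeToolCall:
    tool_name: str
    arguments: Dict[str, Any]


@dataclass
class FakeMessage:
    role: str
    text: str = ""
    tool_calls: List[FakeToolCall] = field(default_factory=list)


def run_agent_turn(
    messages: List[FakeMessage],
    generate: Callable[[List[FakeMessage]], FakeMessage],
    tools: Dict[str, Callable[..., Any]],
    max_steps: int = 5,
) -> FakeMessage:
    """Call the model until it answers without requesting a tool."""
    for _ in range(max_steps):
        reply = generate(messages)
        messages.append(reply)
        if not reply.tool_calls:
            return reply
        # Run each requested tool and append its result to the conversation.
        for call in reply.tool_calls:
            result = tools[call.tool_name](**call.arguments)
            messages.append(FakeMessage(role="tool", text=str(result)))
    raise RuntimeError("Agent did not finish within max_steps")


def scripted_generate(messages: List[FakeMessage]) -> FakeMessage:
    # Stand-in for the chat generator: request a tool once, then answer.
    if not any(m.role == "tool" for m in messages):
        return FakeMessage(role="assistant",
                           tool_calls=[FakeToolCall("echo", {"value": "hi"})])
    return FakeMessage(role="assistant", text=f"Tool said: {messages[-1].text}")


history = [FakeMessage(role="user", text="Say hi")]
final = run_agent_turn(history, scripted_generate, {"echo": lambda value: value.upper()})
print(final.text)
```

A step limit like `max_steps` is one way to keep a model that keeps requesting tools from looping indefinitely.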

Extras: Converting Tools

Convert functions into Tools

This feature is not yet released, so we need to install haystack-experimental from main.

! pip install git+https://github.com/deepset-ai/haystack-experimental@main#egg=haystack-experimental

from typing import Annotated
from pprint import pp

Writing the JSON schema is not fun… 🤔

def newsletter_pipeline_func(articles: List[str], target_people: str = "programmers", n_words: int = 100):
    result = newsletter_pipe.run({"prompt_builder": {"articles": articles, "target_people": target_people, "n_words": n_words}})

    return {"reply": result["llm"]["replies"][0]}

newsletter_tool = Tool(name="newsletter_generator",
                          description="Generate a newsletter based on some articles",
                            function=newsletter_pipeline_func,
                            parameters={
                                "type": "object",
                                "properties": {
                                    "articles": {
                                        "type": "array",
                                        "items": {
                                            "type": "string",
                                            "description": "The articles to include in the newsletter",
                                        }
                                    },
                                    "target_people": {
                                        "type": "string",
                                        "description": "The target audience for the newsletter",
                                    },
                                    "n_words": {
                                        "type": "integer",
                                        "description": "The number of words to summarize the newsletter to",
                                    }
                                },
                                "required": ["articles"],
                            })

We can do this instead 👇

def newsletter_pipeline_func(
    articles: Annotated[List[str], "The articles to include in the newsletter"],
    target_people: Annotated[str, "The target audience for the newsletter"] = "programmers",
    n_words: Annotated[int, "The number of words to summarize the newsletter to"] = 100
    ):
    """Generate a newsletter based on some articles"""

    result = newsletter_pipe.run({"prompt_builder": {"articles": articles, "target_people": target_people, "n_words": n_words}})

    return {"reply": result["llm"]["replies"][0]}

newsletter_tool = Tool.from_function(newsletter_pipeline_func)

pp(newsletter_tool, width=200)

Tool(name='newsletter_pipeline_func',
     description='Generate a newsletter based on some articles',
     parameters={'properties': {'articles': {'items': {'type': 'string'}, 'type': 'array', 'description': 'The articles to include in the newsletter'},
                                'target_people': {'default': 'programmers', 'type': 'string', 'description': 'The target audience for the newsletter'},
                                'n_words': {'default': 100, 'type': 'integer', 'description': 'The number of words to summarize the newsletter to'}},
                 'required': ['articles'],
                 'type': 'object'},
     function=<function newsletter_pipeline_func at 0x7f6fd96511b0>)
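The idea behind deriving a schema from a signature can be illustrated with the standard library alone. The sketch below is not the implementation of Tool.from_function; `schema_from_function` and `make_newsletter` are hypothetical, and only a few simple types are handled.

```python
import inspect
from typing import Annotated, Any, Callable, Dict, List, get_args, get_origin, get_type_hints

# Only a handful of Python types are mapped in this sketch.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def _json_type(tp: Any) -> Dict[str, Any]:
    if get_origin(tp) is list:
        (item_type,) = get_args(tp)
        return {"type": "array", "items": _json_type(item_type)}
    return {"type": _JSON_TYPES[tp]}


def schema_from_function(func: Callable) -> Dict[str, Any]:
    """Build a JSON-schema-like dict from type hints and Annotated metadata."""
    hints = get_type_hints(func, include_extras=True)
    properties: Dict[str, Any] = {}
    required: List[str] = []
    for name, param in inspect.signature(func).parameters.items():
        hint = hints[name]
        description = None
        if get_origin(hint) is Annotated:
            # Annotated[T, "text"] unpacks to (T, "text").
            hint, *metadata = get_args(hint)
            description = metadata[0] if metadata else None
        prop = _json_type(hint)
        if description:
            prop["description"] = description
        if param.default is inspect.Parameter.empty:
            required.append(name)
        else:
            prop["default"] = param.default
        properties[name] = prop
    return {"type": "object", "properties": properties, "required": required}


def make_newsletter(
    articles: Annotated[List[str], "The articles to include in the newsletter"],
    n_words: Annotated[int, "The number of words to summarize the newsletter to"] = 100,
):
    """Generate a newsletter based on some articles"""
    return {"reply": f"{len(articles)} articles, {n_words} words"}


schema = schema_from_function(make_newsletter)
print(schema)
```

Parameters without a default end up in `required`, which matches the shape of the generated parameters printed above.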

Convert Pre-Existing Tools into Haystack Tools

Haystack is quite flexible: if you already have tools defined elsewhere, you can convert them to Haystack tools. For example, LangChain has several interesting tools that we can convert into Haystack tools.

!pip install langchain-community

from pydantic import create_model
from haystack_experimental.dataclasses.tool import _remove_title_from_schema

def convert_langchain_tool_to_haystack_tool(langchain_tool):
    tool_name = langchain_tool.name
    tool_description = langchain_tool.description

    def invocation_adapter(**kwargs):
        return langchain_tool.invoke(input=kwargs)

    tool_function = invocation_adapter

    model_fields = langchain_tool.args_schema.model_fields

    fields = {name: (field.annotation, field.default) for name, field in model_fields.items()}
    descriptions = {name: field.description for name, field in model_fields.items()}

    model = create_model(tool_name, **fields)
    schema = model.model_json_schema()

    # we don't want to include title keywords in the schema, as they contain redundant information
    # there is no programmatic way to prevent Pydantic from adding them, so we remove them later
    # see https://github.com/pydantic/pydantic/discussions/8504
    _remove_title_from_schema(schema)

    # add parameters descriptions to the schema
    for name, description in descriptions.items():
        if name in schema["properties"]:
            schema["properties"][name]["description"] = description

    return Tool(name=tool_name, description=tool_description, parameters=schema, function=tool_function)
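The title-stripping step can be pictured as a recursive walk over the schema. The following is a hedged sketch of that idea, not the code of the private `_remove_title_from_schema` helper, whose behaviour may differ (for instance, it is called above for its in-place effect, while this version returns a copy). `strip_titles` and the sample schema are hypothetical.

```python
from typing import Any


def strip_titles(schema: Any) -> Any:
    """Return a copy of the schema without string-valued "title" keys."""
    if isinstance(schema, dict):
        return {
            key: strip_titles(value)
            for key, value in schema.items()
            # Keep a property that happens to be named "title": its value is a dict.
            if not (key == "title" and isinstance(value, str))
        }
    if isinstance(schema, list):
        return [strip_titles(value) for value in schema]
    return schema


raw_schema = {
    "title": "list_directory",
    "type": "object",
    "properties": {
        "dir_path": {"title": "Dir Path", "type": "string", "default": "."},
        "title": {"type": "string"},
    },
}

cleaned = strip_titles(raw_schema)
print(cleaned)
```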

from langchain_community.agent_toolkits import FileManagementToolkit
toolkit = FileManagementToolkit(
    root_dir="/"
)  # If you don't provide a root_dir, operations will default to the current working directory
toolkit.get_tools()

[CopyFileTool(root_dir='/'),
 DeleteFileTool(root_dir='/'),
 FileSearchTool(root_dir='/'),
 MoveFileTool(root_dir='/'),
 ReadFileTool(root_dir='/'),
 WriteFileTool(root_dir='/'),
 ListDirectoryTool(root_dir='/')]

langchain_listdir_tool = toolkit.get_tools()[-1]

haystack_listdir_tool = convert_langchain_tool_to_haystack_tool(langchain_listdir_tool)

from haystack_experimental.components import OpenAIChatGenerator, ToolInvoker
from haystack_experimental.dataclasses import ChatMessage

chat_generator = OpenAIChatGenerator(model="gpt-4o-mini", tools=[haystack_listdir_tool])
tool_invoker = ToolInvoker(tools=[haystack_listdir_tool])

user_message = ChatMessage.from_user("List the files in /content/sample_data")

replies = chat_generator.run(messages=[user_message])["replies"]
# print(f"assistant messages: {replies}")

if replies[0].tool_calls:

    tool_messages = tool_invoker.run(messages=replies)["tool_messages"]
    # print(f"tool messages: {tool_messages}")

    # we pass all the messages to the Chat Generator
    messages = [user_message] + replies + tool_messages
    final_replies = chat_generator.run(messages=messages)["replies"]
    print(f"{final_replies[0].text}")

The files in the `/content/sample_data` directory are:

1. anscombe.json
2. README.md
3. mnist_train_small.csv
4. california_housing_train.csv
5. california_housing_test.csv
6. mnist_test.csv