Let's face it… nobody enjoys spending hours parsing invoices. Beyond being mind-numbingly tedious, it's a massive waste of time that keeps your team from doing what they do best.
At Needle, we built something to change that: we combined our AI search with Crew AI to analyze invoices automatically. Feed it your raw invoice data, and it produces detailed expense reports. No more spreadsheet wrestling matches; just clear insights that help you make better decisions.
The Challenge: Transforming Unstructured Invoice Data
Every organization generates invoices, but manually combing through them to extract key expense details is inefficient and error-prone. Our goal was simple:
Retrieve invoice data quickly: Use Needle’s ability to search across unstructured data sources.
Analyze expenses automatically: Leverage Crew AI to build an “Expense Analyst” agent.
Generate actionable insights: Output a structured, markdown report highlighting spending patterns and potential cost optimizations.
The Integration: Needle Meets Crew AI
At a high level, our integration involves two components:
Data Retrieval: Needle’s API searches a connected knowledge base (in our case, a collection of invoice text files) and returns relevant data.
Automated Analysis & Reporting: A Crew AI agent analyzes the retrieved data, categorizes expenses, and outputs a comprehensive report.
Let’s break down the key parts of the integration.
Code Walkthrough
1. Searching Your Invoice Data
We start by defining a tool that leverages Needle’s AI search. This function sends a query to our Needle collection and returns the top matching results from our invoice data. Make sure your OpenAI and Needle API keys are set in your .env file.
from needle.v1 import NeedleClient
from crewai.tools import tool


@tool("Search Knowledge Base")
def search_knowledge_base(query: str) -> str:
    """
    Retrieve information from your knowledge base containing unstructured data such as
    invoices, reports, emails, and more.

    Args:
        query (str): The search query to find relevant invoice data.
    """
    ndl = NeedleClient()
    return ndl.collections.search(
        collection_id="clt_01JJVNDYX8CCK8TN8FJ6WYFKH9",  # Replace with your actual collection ID
        text=query,
        top_k=20,
    )
This snippet defines the search_knowledge_base function as a tool that our AI agent can use. It calls NeedleClient to search a specific collection for invoice-related information.
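The tool hands back whatever the Needle client returns. If you would rather pass the agent a compact, labeled string, one option is to flatten the results first. The sketch below is self-contained and uses a stand-in Result class; the content and file_id field names are assumptions about the result shape, so check them against the Needle client version you have installed. The sample invoice text is made up for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Result:
    """Stand-in for one search hit (hypothetical; field names are assumptions)."""
    content: str
    file_id: str


def format_results(results: Iterable[Result], max_chars: int = 500) -> str:
    """Join search hits into one labeled text block, truncating long chunks."""
    blocks = []
    for i, r in enumerate(results, start=1):
        text = r.content.strip()
        if len(text) > max_chars:
            text = text[:max_chars].rstrip() + " [truncated]"
        blocks.append(f"[{i}] file={r.file_id}\n{text}")
    return "\n\n".join(blocks) if blocks else "No matching invoice data found."


if __name__ == "__main__":
    # Made-up sample hits, not real search output.
    hits = [
        Result("Invoice A\nTotal: 120.00", "file_a"),
        Result("Invoice B\nTotal: 80.50", "file_b"),
    ]
    print(format_results(hits))
```

Returning an explicit "no results" message gives the agent something concrete to react to instead of an empty string.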
2. Building the Expense Analyst Agent
Next, we configure our Crew AI agent. This agent is given a role, a goal, and even a backstory to guide its actions. The agent uses our search tool to pull in invoice data and then processes it to generate an analysis.
from crewai import Agent, Task, Crew

analyst = Agent(
    role="Expense Analyst",
    goal="Create detailed expense analysis and categorization from invoice data",
    backstory="""
        You are a meticulous expense analyst with expertise in financial data analysis
        and cost categorization. You excel at breaking down expenses, identifying patterns,
        and providing actionable cost-saving insights.
    """,
    verbose=True,
    tools=[search_knowledge_base],
)
This setup gives the agent context and purpose, making it more than just a script—it becomes a virtual expense analyst.
3. Defining the Analysis Task
We then define a task that outlines the steps our agent must follow. This includes grouping expenses, calculating totals, and providing recommendations.
analysis_task = Task(
    description="""
        Search, find, and analyze invoices to create a detailed expense drilldown report.

        Steps to follow:
        1. Group expenses and calculate total spend by vendor.
        2. Calculate the gross total spend.
        3. Identify potential cost-saving opportunities.

        The report should include:
        - An executive summary.
        - A vendor-wise breakdown.
        - Recommendations for cost optimization.
    """,
    expected_output="""
        An expense analysis report with clear sections and actionable recommendations in markdown format.
    """,
    agent=analyst,
)
This task instructs the agent on how to transform raw invoice data into a structured and meaningful report.
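Because the agent does this arithmetic inside a language model, it can be worth cross-checking its totals against a deterministic calculation. The sketch below shows the grouping and summing that steps 1 and 2 describe, in plain Python. The vendor names and amounts are made-up sample values, not data from the example repository, and the (vendor, amount) tuple shape is an assumption.

```python
from collections import defaultdict
from decimal import Decimal


def spend_by_vendor(line_items):
    """Sum amounts per vendor; amounts are strings to avoid float rounding."""
    totals = defaultdict(Decimal)  # Decimal() starts each vendor at 0
    for vendor, amount in line_items:
        totals[vendor] += Decimal(amount)
    return dict(totals)


def gross_total(totals):
    """Add up the per-vendor totals."""
    return sum(totals.values(), Decimal("0"))


# Made-up sample line items: (vendor, amount)
items = [
    ("Acme Cloud", "120.00"),
    ("Paper Co", "35.50"),
    ("Acme Cloud", "80.25"),
]
totals = spend_by_vendor(items)
# Print vendors from highest to lowest spend
for vendor, total in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{vendor}: {total}")
print(f"Gross total: {gross_total(totals)}")
```

Decimal is used instead of float so that currency amounts add up exactly.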
4. Orchestrating the Process
Finally, we put everything together in our main script. Crew AI orchestrates the agent and task, executing the entire workflow when the script is run.
if __name__ == "__main__":
    crew = Crew(agents=[analyst], tasks=[analysis_task], verbose=True)
    crew.kickoff()
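The script does not keep the value returned by crew.kickoff(). If you want to save the report, one option is to capture that return value and write its string form to a file. The helper below is a minimal sketch: save_report and the expense_report.md file name are hypothetical, and it assumes that calling str() on the result yields the final markdown, which you should confirm for your Crew AI version.

```python
from pathlib import Path


def save_report(result, path="expense_report.md"):
    """Write the string form of a crew result to a markdown file."""
    out = Path(path)
    # Normalize surrounding whitespace and end the file with one newline
    out.write_text(str(result).strip() + "\n", encoding="utf-8")
    return out


# Intended use in main.py (not run here, since it needs API keys):
#   result = crew.kickoff()
#   save_report(result)
```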
To run the project, you simply install the dependencies and execute the script:
pipenv install
pipenv run python main.py
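If the run fails immediately, a missing API key is a likely cause. The preflight check below is a small sketch that reports which keys are absent. The variable names NEEDLE_API_KEY and OPENAI_API_KEY are assumptions based on each library's usual defaults; adjust them if your setup differs. Since pipenv run loads the .env file automatically, the check only reads the process environment.

```python
import os

# Assumed variable names; change them if your configuration differs.
REQUIRED_KEYS = ("NEEDLE_API_KEY", "OPENAI_API_KEY")


def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the required keys that are unset or blank in the environment."""
    env = os.environ if env is None else env
    return [k for k in required if not env.get(k, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing keys:", ", ".join(missing))
    else:
        print("All required keys are set.")
```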
Project Structure & Invoice Files
In our repository (found under needle-examples/invoice_summarizer), you’ll notice a directory called invoices containing multiple invoice text files (e.g., INV-11776.txt and INV-24749.txt). These files serve as the unstructured data that Needle’s AI search scans through, simulating a real-world scenario where invoices are stored across various documents.
Bringing It All Together
This integration between Needle and Crew AI exemplifies how automation can turn a traditionally manual process into a streamlined, efficient workflow. With just a few lines of code:
Needle rapidly locates the relevant data.
Crew AI analyzes that data and outputs a comprehensive report.
The result? A system that saves time, reduces errors and provides actionable insights for better expense management.
Final Thoughts
Whether you’re dealing with a small set of invoices or a vast repository of financial data, automating your analysis process can transform how you work. Our project with Needle and Crew AI shows that even complex tasks like expense analysis can be automated with a bit of creativity and the right tools.
If you’re interested in diving deeper or exploring similar integrations, feel free to reach out or comment below. Let’s keep pushing the boundaries of what AI can do for your workflows!
Happy automating!
Stay tuned to our blog for more updates, technical deep-dives, and success stories from the world of AI-powered automation.