Automating Invoice Analysis with Needle & Crew AI

From PDF Chaos to Clear Insights in Seconds

Feb 04, 2025

Let's face it… nobody enjoys spending hours parsing invoices. Beyond being mind-numbingly tedious, it's a massive waste of time that keeps your team from doing what they do best.

At Needle, we built something to take that work off your plate: we combined our AI search with Crew AI to analyze invoices automatically. Feed it your raw invoice data, and it spits out detailed expense reports. No more spreadsheet wrestling matches; just clear insights that help you make better decisions.

The Challenge: Transforming Unstructured Invoice Data

Every organization generates invoices, but manually combing through them to extract key expense details is inefficient and error-prone. Our goal was simple:

Retrieve invoice data quickly: Use Needle’s ability to search across unstructured data sources.
Analyze expenses automatically: Leverage Crew AI to build an “Expense Analyst” agent.
Generate actionable insights: Output a structured, markdown report highlighting spending patterns and potential cost optimizations.

The Integration: Needle Meets Crew AI

At a high level, our integration involves two components:

Data Retrieval: Needle’s API searches a connected knowledge base (in our case, a collection of invoice text files) and returns relevant data.
Automated Analysis & Reporting: A Crew AI agent analyzes the retrieved data, categorizes expenses, and outputs a comprehensive report.

Let’s break down the key parts of the integration.

Code Walkthrough

1. Searching Your Invoice Data

We start by defining a tool that leverages Needle’s AI search. This function sends a query to our Needle collection and returns the top matching results (up to 20, per top_k) from our invoice data. Make sure your OpenAI and Needle API keys are set in your .env file.

from needle.v1 import NeedleClient
from crewai.tools import tool

@tool("Search Knowledge Base")
def search_knowledge_base(query: str) -> str:
    """
    Retrieve information from your knowledge base containing unstructured data such as
    invoices, reports, emails, and more.
    
    Args:
        query (str): The search query to find relevant invoice data.
    """
    ndl = NeedleClient()
    return ndl.collections.search(
        collection_id="clt_01JJVNDYX8CCK8TN8FJ6WYFKH9",  # Replace with your actual collection ID
        text=query,
        top_k=20,
    )

This snippet defines the search_knowledge_base function as a tool that our AI agent can use. The NeedleClient is called to search within a specific collection for invoice-related information.
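One detail worth noting: the tool is annotated as returning str, while the search call appears to hand back a list of result objects. If you would rather pass the agent plain text, you could join the results before returning them. The following is a minimal sketch, not the code from our repository: format_results is a hypothetical helper, it assumes each result exposes a content attribute with the matched text, and the sample values are made up.

```python
from types import SimpleNamespace


def format_results(results, max_chars=2000):
    """Join search results into one numbered plain-text block.

    Assumes each result has a `content` attribute (an assumption about the
    shape of the search results); falls back to str(result) otherwise.
    """
    chunks = []
    for index, result in enumerate(results, start=1):
        text = getattr(result, "content", None)
        if text is None:
            text = str(result)
        # Trim very long chunks so the agent's prompt stays manageable.
        chunks.append(f"[{index}] {text.strip()[:max_chars]}")
    return "\n\n".join(chunks)


# Stand-in objects with made-up text; real results would come from Needle.
sample = [
    SimpleNamespace(content="  Vendor A, total 120.00 "),
    SimpleNamespace(content="Vendor B, total 80.50"),
]
formatted = format_results(sample)
```

Inside the tool, you would then return format_results(...) around the search call instead of the raw results.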

2. Building the Expense Analyst Agent

Next, we configure our Crew AI agent. This agent is given a role, a goal, and even a backstory to guide its actions. The agent uses our search tool to pull in invoice data and then processes it to generate an analysis.

from crewai import Agent, Task, Crew

analyst = Agent(
    role="Expense Analyst",
    goal="Create detailed expense analysis and categorization from invoice data",
    backstory="""
        You are a meticulous expense analyst with expertise in financial data analysis
        and cost categorization. You excel at breaking down expenses, identifying patterns,
        and providing actionable cost-saving insights.
    """,
    verbose=True,
    tools=[search_knowledge_base],
)

This setup gives the agent context and purpose, making it more than just a script: it becomes a virtual expense analyst.

3. Defining the Analysis Task

We then define a task that outlines the steps our agent must follow. This includes grouping expenses, calculating totals, and providing recommendations.

analysis_task = Task(
    description="""
        Search, find, and analyze invoices to create a detailed expense drilldown report.
        
        Steps to follow:
        1. Group expenses and calculate total spend by vendor.
        2. Calculate the gross total spend.
        3. Identify potential cost-saving opportunities.
        
        The report should include:
        - An executive summary.
        - A vendor-wise breakdown.
        - Recommendations for cost optimization.
    """,
    expected_output="""
        An expense analysis report with clear sections and actionable recommendations in markdown format.
    """,
    agent=analyst,
)

This task instructs the agent on how to transform raw invoice data into a structured and meaningful report.
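The agent performs these steps through the language model, so it can be handy to have a plain-Python version of the arithmetic for spot-checking a report. Here is a small sketch of steps 1 and 2 (totals by vendor, then the gross total). The summarize_by_vendor helper and the vendor names and amounts are hypothetical, and it assumes you already have (vendor, amount) pairs extracted from the invoices.

```python
from collections import defaultdict
from decimal import Decimal


def summarize_by_vendor(line_items):
    """Return (per-vendor totals, gross total) for (vendor, amount) pairs."""
    totals = defaultdict(Decimal)  # Decimal() starts each vendor at 0
    for vendor, amount in line_items:
        # Decimal avoids the rounding drift floats can introduce with money.
        totals[vendor] += Decimal(amount)
    gross_total = sum(totals.values(), Decimal("0"))
    # Sort vendors by spend, largest first, as a report would list them.
    ranked = dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))
    return ranked, gross_total


# Made-up figures for illustration; not taken from the example invoices.
items = [
    ("Acme Hosting", "120.00"),
    ("Paper Co", "35.50"),
    ("Acme Hosting", "80.25"),
]
by_vendor, gross = summarize_by_vendor(items)
```

Comparing numbers like these against the vendor-wise breakdown in the generated report is a quick way to catch a miscalculated total.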

4. Orchestrating the Process

Finally, we put everything together in our main script. Crew AI orchestrates the agent and task, executing the entire workflow when the script is run.

if __name__ == "__main__":
    crew = Crew(agents=[analyst], tasks=[analysis_task], verbose=True)
    crew.kickoff()
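The script above runs the crew but does not keep the result. If you want the markdown report on disk, one option is to capture what kickoff() returns and write it to a file. This is a sketch under assumptions: save_report and the expense_report.md filename are hypothetical, and it assumes the returned object either carries the final text in a raw attribute or renders to that text via str().

```python
from pathlib import Path


def save_report(result, path="expense_report.md"):
    """Write the crew's final output to a markdown file and return the path."""
    # Prefer a `raw` attribute if present (assumed to hold the final text);
    # otherwise fall back to the object's string form.
    text = getattr(result, "raw", None)
    if not isinstance(text, str):
        text = str(result)
    report_path = Path(path)
    # Normalize trailing whitespace so the file ends with a single newline.
    report_path.write_text(text.rstrip() + "\n", encoding="utf-8")
    return report_path
```

In the main script, that would look like save_report(crew.kickoff()) in place of the bare crew.kickoff() call.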

To run the project, you simply install the dependencies and execute the script:

pipenv install
pipenv run python main.py
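Since pipenv run loads the .env file automatically, the script itself never has to read it. If a key is missing, though, the failure only shows up once the agent starts working. A small pre-flight check can surface that earlier. The sketch below is an optional addition, not part of the repository: missing_keys is a hypothetical helper, and OPENAI_API_KEY and NEEDLE_API_KEY are the variable names we assume the two clients read.

```python
import os

# Assumed environment variable names for the OpenAI and Needle API keys.
REQUIRED_KEYS = ("OPENAI_API_KEY", "NEEDLE_API_KEY")


def missing_keys(env=None):
    """Return the required key names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]
```

Calling missing_keys() at the top of main.py and exiting with a clear message when the list is non-empty keeps configuration problems separate from analysis problems.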

Project Structure & Invoice Files

In our repository (found under needle-examples/invoice_summarizer), you’ll notice a directory called invoices containing multiple invoice text files (for example, INV-11776.txt and INV-24749.txt). These files serve as the unstructured data that Needle’s AI search scans through, simulating a real-world scenario where invoices are stored across various documents.

Bringing It All Together

This integration between Needle and Crew AI exemplifies how automation can turn a traditionally manual process into a streamlined, efficient workflow. With just a few lines of code:

Needle rapidly locates the relevant data.
Crew AI analyzes that data and outputs a comprehensive report.

The result? A system that saves time, reduces errors and provides actionable insights for better expense management.

Final Thoughts

Whether you’re dealing with a small set of invoices or a vast repository of financial data, automating your analysis process can transform how you work. Our project with Needle and Crew AI shows that even complex tasks like expense analysis can be automated with a bit of creativity and the right tools.

If you’re interested in diving deeper or exploring similar integrations, feel free to reach out or comment below. Let’s keep pushing the boundaries of what AI can do for your workflows!

Happy automating!

Stay tuned to our blog for more updates, technical deep-dives, and success stories from the world of AI-powered automation.

Needle