Artificial Intelligence (AI)

Artificial Intelligence (AI)

AI For Spreadsheets: Bringing AI To excel

AI For Spreadsheets: Bringing AI To excel

AI For Spreadsheets: Bringing AI To excel

Apr 9, 2025

|

10

min read

Excel VBA Meets AI: Building a RAG for PDF Chat

Recently, Google unveiled a powerful new feature: AI for Spreadsheets, allowing users to call AI directly from Google Sheets. Interestingly, we had already been exploring a very similar idea — and built a working prototype for a client weeks before the announcement.

The Challenge: Extracting Financial Data from Legal PDFs

We were approached by a client in the legal sector struggling with a frustrating and time-consuming task. Their team frequently needed to extract compensation amounts from archived legal PDF documents based on a written case description.

Here’s what their original workflow looked like:

  1. Read a new case description

  2. Identify the relevant PDF document (from hundreds)

  3. Skim through the lengthy, complex document

  4. Locate compensation-related information

  5. Manually copy and paste data into Excel

  6. Perform final calculations

As you can imagine, this process was slow, repetitive, and error-prone — especially with legal documents that are often long and dense.

Our Solution: A Web-Based AI Tool

Initially, we developed a web-based solution that would take the case description, select the relevant pdf and return the correct financial amount.

  • The user enters a case description

  • The system automatically identifies the most relevant PDF

  • It extracts and returns the appropriate compensation amount

  • It highlights the specific section in the PDF and provides a direct link to it

Below you can find a screenshot of the web tool (we blurred some information).

This already sped up the process dramatically. But there was still a bottleneck: users had to manually copy results back into Excel.

So we thought:
Why not bring the entire process inside Excel itself?

A Smarter Approach: AI-Powered Excel Integration

That’s when we created a new solution: a VBA-based AI assistant inside Excel, powered by Google’s Gemini API. This setup allows users to interact with AI and their PDF documents directly from within Excel — no switching between tools, no copy-paste needed.

The system is built entirely with:

  • VBA macros, to orchestrate user interaction and data flow

  • Microsoft Word, for automated extraction of text from PDF files

  • Google's Gemini API, used for the AI functionality

This approach is lightweight, local-first, and requires no external infrastructure.

How It Works: Behind the Scenes

Here’s a breakdown of what happens when the user clicks the “Get Response” button in Excel:

  1. PDF Selection
    The user selects one ore more pdfs an entire folder containing relevant documents.

  2. Text Extraction
    Using Microsoft Word's COM automation, the system extracts all text from each PDF.

  3. Prompt Generation
    The system builds a structured prompt combining extracted content with the user’s case description and business rules.

  4. API Request Execution
    The prompt is submitted via a curl command. The system captures and stores the response returned by the AI.

  5. Response Parsing
    The AI’s output is processed programmatically. Specific data points (e.g., monetary amounts or legal references) are extracted using tag-based logic.

  6. In-Excel Display
    The final results are rendered directly in the Excel interface enabling the user to review key information without ever opening the original PDF.

Let's talk code: Excel + AI + Shell

To connect Excel with the Gemini API, we initially experimented with the MSXML2.XMLHTTP library a common method for making HTTP requests in VBA. However, it fell short when it came to sending structured payloads or dealing with file attachments. Instead, we took a more robust approach: We used VBA to execute a shell command that runs cURL.

This allowed us to:

  • Build and send a structured JSON prompt to the Gemini API

  • Retrieve the AI’s response into a .json file

  • Read and parse the response directly back into Excel

Let's look at some snippets of the code:

shellCommand = """" & curlPath & """ 
  https://generativelanguage.googleapis.com/v1beta/models"
  & 
        "?key=" & apiKey & " -H ""Content-Type: application/json"" -d @" & """" &
  jsonInput & """" & _" >
  " & """" & jsonOutput & """"

This code block builds a curl command that sends a JSON file containing the prompt to the Gemini API. It also handles automatic retrieval of the response into a .json file .

Sub CreateGeminiRequestJSON(prompt As String, pdfPaths As Collection, rules As String)
    Dim jsonFile As Integer, jsonPath As String, jsonText As String
    Dim promptEscaped As String, rulesEscaped As String
    Dim partsText As String

    jsonPath = ThisWorkbook.Path & "\gemini_request.json"
    jsonFile = FreeFile

    promptEscaped = Replace(prompt, """", "\""")
    rulesEscaped = Replace(rules, """", "\""")
    
    jsonText = "{""contents"":["
    jsonText = jsonText & "{""role"":""user"",""parts"":[{""text"":""" & rulesEscaped & """}]},"

    jsonText = jsonText & "{""role"":""user"",""parts"":[{""text"":""" & promptEscaped & """}]}"

    jsonText = jsonText & "]," & vbCrLf

    jsonText = jsonText & """generationConfig"":{""temperature"":0,""topK"":40,""topP"":0.95,""maxOutputTokens"":8192}}"

    Open jsonPath For Output As #jsonFile
    Print #jsonFile, jsonText
    Close #jsonFile
End Sub

This function prepares the content of the AI ​​query. It combines business analysis rules and extracted text from documents to generate a structured prompt, conforming to the format required by the Gemini API. The prompt is then saved as a JSON file, ready to be sent.

Function ExtractTextFromPDF(pdfPath As String) As String
    Dim wordApp As Object
    Dim doc As Object
    Dim text As String
    On Error GoTo ErrorHandler

    Set wordApp = CreateObject("Word.Application")
    wordApp.Visible = False
    Set doc = wordApp.Documents.Open(pdfPath)

    text = doc.Content.text

    doc.Close False
    wordApp.Quit
    Set doc = Nothing
    Set wordApp = Nothing

    ExtractTextFromPDF = text
    Exit Function

ErrorHandler:
    ExtractTextFromPDF = "[Erreur de lecture du fichier : " & pdfPath & "]"
    If Not doc Is Nothing Then doc.Close False
    If Not wordApp Is Nothing Then wordApp.Quit
End Function

This feature uses the Microsoft Word application to open a PDF file and extract all the plain text from it. It returns the textual content of the document as a string, which can be used in AI prompt generation or for other automated processing.

This method proved to be lightweight, flexible, and easy to debug, and it worked without needing external libraries or dependencies.

Recently, Google unveiled a powerful new feature: AI for Spreadsheets, allowing users to call AI directly from Google Sheets. Interestingly, we had already been exploring a very similar idea — and built a working prototype for a client weeks before the announcement.

The Challenge: Extracting Financial Data from Legal PDFs

We were approached by a client in the legal sector struggling with a frustrating and time-consuming task. Their team frequently needed to extract compensation amounts from archived legal PDF documents based on a written case description.

Here’s what their original workflow looked like:

  1. Read a new case description

  2. Identify the relevant PDF document (from hundreds)

  3. Skim through the lengthy, complex document

  4. Locate compensation-related information

  5. Manually copy and paste data into Excel

  6. Perform final calculations

As you can imagine, this process was slow, repetitive, and error-prone — especially with legal documents that are often long and dense.

Our Solution: A Web-Based AI Tool

Initially, we developed a web-based solution that would take the case description, select the relevant pdf and return the correct financial amount.

  • The user enters a case description

  • The system automatically identifies the most relevant PDF

  • It extracts and returns the appropriate compensation amount

  • It highlights the specific section in the PDF and provides a direct link to it

Below you can find a screenshot of the web tool (we blurred some information).

This already sped up the process dramatically. But there was still a bottleneck: users had to manually copy results back into Excel.

So we thought:
Why not bring the entire process inside Excel itself?

A Smarter Approach: AI-Powered Excel Integration

That’s when we created a new solution: a VBA-based AI assistant inside Excel, powered by Google’s Gemini API. This setup allows users to interact with AI and their PDF documents directly from within Excel — no switching between tools, no copy-paste needed.

The system is built entirely with:

  • VBA macros, to orchestrate user interaction and data flow

  • Microsoft Word, for automated extraction of text from PDF files

  • Google's Gemini API, used for the AI functionality

This approach is lightweight, local-first, and requires no external infrastructure.

How It Works: Behind the Scenes

Here’s a breakdown of what happens when the user clicks the “Get Response” button in Excel:

  1. PDF Selection
    The user selects one ore more pdfs an entire folder containing relevant documents.

  2. Text Extraction
    Using Microsoft Word's COM automation, the system extracts all text from each PDF.

  3. Prompt Generation
    The system builds a structured prompt combining extracted content with the user’s case description and business rules.

  4. API Request Execution
    The prompt is submitted via a curl command. The system captures and stores the response returned by the AI.

  5. Response Parsing
    The AI’s output is processed programmatically. Specific data points (e.g., monetary amounts or legal references) are extracted using tag-based logic.

  6. In-Excel Display
    The final results are rendered directly in the Excel interface enabling the user to review key information without ever opening the original PDF.

Let's talk code: Excel + AI + Shell

To connect Excel with the Gemini API, we initially experimented with the MSXML2.XMLHTTP library a common method for making HTTP requests in VBA. However, it fell short when it came to sending structured payloads or dealing with file attachments. Instead, we took a more robust approach: We used VBA to execute a shell command that runs cURL.

This allowed us to:

  • Build and send a structured JSON prompt to the Gemini API

  • Retrieve the AI’s response into a .json file

  • Read and parse the response directly back into Excel

Let's look at some snippets of the code:

shellCommand = """" & curlPath & """ 
  https://generativelanguage.googleapis.com/v1beta/models"
  & 
        "?key=" & apiKey & " -H ""Content-Type: application/json"" -d @" & """" &
  jsonInput & """" & _" >
  " & """" & jsonOutput & """"

This code block builds a curl command that sends a JSON file containing the prompt to the Gemini API. It also handles automatic retrieval of the response into a .json file .

Sub CreateGeminiRequestJSON(prompt As String, pdfPaths As Collection, rules As String)
    Dim jsonFile As Integer, jsonPath As String, jsonText As String
    Dim promptEscaped As String, rulesEscaped As String
    Dim partsText As String

    jsonPath = ThisWorkbook.Path & "\gemini_request.json"
    jsonFile = FreeFile

    promptEscaped = Replace(prompt, """", "\""")
    rulesEscaped = Replace(rules, """", "\""")
    
    jsonText = "{""contents"":["
    jsonText = jsonText & "{""role"":""user"",""parts"":[{""text"":""" & rulesEscaped & """}]},"

    jsonText = jsonText & "{""role"":""user"",""parts"":[{""text"":""" & promptEscaped & """}]}"

    jsonText = jsonText & "]," & vbCrLf

    jsonText = jsonText & """generationConfig"":{""temperature"":0,""topK"":40,""topP"":0.95,""maxOutputTokens"":8192}}"

    Open jsonPath For Output As #jsonFile
    Print #jsonFile, jsonText
    Close #jsonFile
End Sub

This function prepares the content of the AI ​​query. It combines business analysis rules and extracted text from documents to generate a structured prompt, conforming to the format required by the Gemini API. The prompt is then saved as a JSON file, ready to be sent.

Function ExtractTextFromPDF(pdfPath As String) As String
    Dim wordApp As Object
    Dim doc As Object
    Dim text As String
    On Error GoTo ErrorHandler

    Set wordApp = CreateObject("Word.Application")
    wordApp.Visible = False
    Set doc = wordApp.Documents.Open(pdfPath)

    text = doc.Content.text

    doc.Close False
    wordApp.Quit
    Set doc = Nothing
    Set wordApp = Nothing

    ExtractTextFromPDF = text
    Exit Function

ErrorHandler:
    ExtractTextFromPDF = "[Erreur de lecture du fichier : " & pdfPath & "]"
    If Not doc Is Nothing Then doc.Close False
    If Not wordApp Is Nothing Then wordApp.Quit
End Function

This feature uses the Microsoft Word application to open a PDF file and extract all the plain text from it. It returns the textual content of the document as a string, which can be used in AI prompt generation or for other automated processing.

This method proved to be lightweight, flexible, and easy to debug, and it worked without needing external libraries or dependencies.

Recently, Google unveiled a powerful new feature: AI for Spreadsheets, allowing users to call AI directly from Google Sheets. Interestingly, we had already been exploring a very similar idea — and built a working prototype for a client weeks before the announcement.

The Challenge: Extracting Financial Data from Legal PDFs

We were approached by a client in the legal sector struggling with a frustrating and time-consuming task. Their team frequently needed to extract compensation amounts from archived legal PDF documents based on a written case description.

Here’s what their original workflow looked like:

  1. Read a new case description

  2. Identify the relevant PDF document (from hundreds)

  3. Skim through the lengthy, complex document

  4. Locate compensation-related information

  5. Manually copy and paste data into Excel

  6. Perform final calculations

As you can imagine, this process was slow, repetitive, and error-prone — especially with legal documents that are often long and dense.

Our Solution: A Web-Based AI Tool

Initially, we developed a web-based solution that would take the case description, select the relevant pdf and return the correct financial amount.

  • The user enters a case description

  • The system automatically identifies the most relevant PDF

  • It extracts and returns the appropriate compensation amount

  • It highlights the specific section in the PDF and provides a direct link to it

Below you can find a screenshot of the web tool (we blurred some information).

This already sped up the process dramatically. But there was still a bottleneck: users had to manually copy results back into Excel.

So we thought:
Why not bring the entire process inside Excel itself?

A Smarter Approach: AI-Powered Excel Integration

That’s when we created a new solution: a VBA-based AI assistant inside Excel, powered by Google’s Gemini API. This setup allows users to interact with AI and their PDF documents directly from within Excel — no switching between tools, no copy-paste needed.

The system is built entirely with:

  • VBA macros, to orchestrate user interaction and data flow

  • Microsoft Word, for automated extraction of text from PDF files

  • Google's Gemini API, used for the AI functionality

This approach is lightweight, local-first, and requires no external infrastructure.

How It Works: Behind the Scenes

Here’s a breakdown of what happens when the user clicks the “Get Response” button in Excel:

  1. PDF Selection
    The user selects one ore more pdfs an entire folder containing relevant documents.

  2. Text Extraction
    Using Microsoft Word's COM automation, the system extracts all text from each PDF.

  3. Prompt Generation
    The system builds a structured prompt combining extracted content with the user’s case description and business rules.

  4. API Request Execution
    The prompt is submitted via a curl command. The system captures and stores the response returned by the AI.

  5. Response Parsing
    The AI’s output is processed programmatically. Specific data points (e.g., monetary amounts or legal references) are extracted using tag-based logic.

  6. In-Excel Display
    The final results are rendered directly in the Excel interface enabling the user to review key information without ever opening the original PDF.

Let's talk code: Excel + AI + Shell

To connect Excel with the Gemini API, we initially experimented with the MSXML2.XMLHTTP library a common method for making HTTP requests in VBA. However, it fell short when it came to sending structured payloads or dealing with file attachments. Instead, we took a more robust approach: We used VBA to execute a shell command that runs cURL.

This allowed us to:

  • Build and send a structured JSON prompt to the Gemini API

  • Retrieve the AI’s response into a .json file

  • Read and parse the response directly back into Excel

Let's look at some snippets of the code:

shellCommand = """" & curlPath & """ 
  https://generativelanguage.googleapis.com/v1beta/models"
  & 
        "?key=" & apiKey & " -H ""Content-Type: application/json"" -d @" & """" &
  jsonInput & """" & _" >
  " & """" & jsonOutput & """"

This code block builds a curl command that sends a JSON file containing the prompt to the Gemini API. It also handles automatic retrieval of the response into a .json file .

Sub CreateGeminiRequestJSON(prompt As String, pdfPaths As Collection, rules As String)
    Dim jsonFile As Integer, jsonPath As String, jsonText As String
    Dim promptEscaped As String, rulesEscaped As String
    Dim partsText As String

    jsonPath = ThisWorkbook.Path & "\gemini_request.json"
    jsonFile = FreeFile

    promptEscaped = Replace(prompt, """", "\""")
    rulesEscaped = Replace(rules, """", "\""")
    
    jsonText = "{""contents"":["
    jsonText = jsonText & "{""role"":""user"",""parts"":[{""text"":""" & rulesEscaped & """}]},"

    jsonText = jsonText & "{""role"":""user"",""parts"":[{""text"":""" & promptEscaped & """}]}"

    jsonText = jsonText & "]," & vbCrLf

    jsonText = jsonText & """generationConfig"":{""temperature"":0,""topK"":40,""topP"":0.95,""maxOutputTokens"":8192}}"

    Open jsonPath For Output As #jsonFile
    Print #jsonFile, jsonText
    Close #jsonFile
End Sub

This function prepares the content of the AI ​​query. It combines business analysis rules and extracted text from documents to generate a structured prompt, conforming to the format required by the Gemini API. The prompt is then saved as a JSON file, ready to be sent.

Function ExtractTextFromPDF(pdfPath As String) As String
    Dim wordApp As Object
    Dim doc As Object
    Dim text As String
    On Error GoTo ErrorHandler

    Set wordApp = CreateObject("Word.Application")
    wordApp.Visible = False
    Set doc = wordApp.Documents.Open(pdfPath)

    text = doc.Content.text

    doc.Close False
    wordApp.Quit
    Set doc = Nothing
    Set wordApp = Nothing

    ExtractTextFromPDF = text
    Exit Function

ErrorHandler:
    ExtractTextFromPDF = "[Erreur de lecture du fichier : " & pdfPath & "]"
    If Not doc Is Nothing Then doc.Close False
    If Not wordApp Is Nothing Then wordApp.Quit
End Function

This feature uses the Microsoft Word application to open a PDF file and extract all the plain text from it. It returns the textual content of the document as a string, which can be used in AI prompt generation or for other automated processing.

This method proved to be lightweight, flexible, and easy to debug, and it worked without needing external libraries or dependencies.


Copyrights © 2025 by BIT SOLUTIONS.

Copyrights © 2025 by BIT SOLUTIONS.

Copyrights © 2025 by BIT SOLUTIONS.