Overview
Our n8n workflow automates HSN code assignment for invoices. It collects an invoice in PDF format from the user, sends it to the OCR.space API to extract the text, parses and cleans that text with a Mistral AI agent, looks up the correct HSN codes in a Google Sheet, and returns the modified invoice with the appropriate codes. The whole process runs through APIs, with no manual steps.
How does it work?
Step 1: On Form Submission (Trigger Node)
- Type: n8n Form Trigger (Form Submission)
- Goal: Trigger the workflow when a user uploads an invoice through an n8n form.
- Process:
- Receives uploaded invoice PDF.
- Relays the file data to the next node.
Step 2: OCR Response (HTTP Request Node)
- Type: HTTP Request
- Purpose: Sends the uploaded invoice to an OCR API (such as OCR.space) to extract the text from the PDF.
- Configuration:
- Method: POST
- URL: https://api.ocr.space/parse/image
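To make the request and response shape concrete, here is a minimal Python sketch of what this node exchanges with the OCR service. It only builds the form fields and reads the response, with no network call. The field names (`apikey`, `language`, `filetype`) and response keys (`ParsedResults`, `ParsedText`, `IsErroredOnProcessing`, `ErrorMessage`) reflect our understanding of the OCR.space API and should be checked against its documentation. The helper names are hypothetical.

```python
from typing import Any

OCR_URL = "https://api.ocr.space/parse/image"  # URL configured in Step 2


def build_form_fields(api_key: str, language: str = "eng") -> dict[str, str]:
    """Form fields sent alongside the uploaded PDF (multipart/form-data).

    Assumption: these parameter names match the OCR.space API.
    """
    return {"apikey": api_key, "language": language, "filetype": "PDF"}


def extract_text(response: dict[str, Any]) -> str:
    """Join the text of every parsed page in an OCR response."""
    if response.get("IsErroredOnProcessing"):
        raise ValueError(f"OCR failed: {response.get('ErrorMessage')}")
    pages = response.get("ParsedResults") or []
    # Each page carries its own ParsedText; strip trailing line breaks.
    return "\n".join(page.get("ParsedText", "").strip() for page in pages)
```

In the workflow itself, the HTTP Request node does this work; the sketch is only meant to show what the next node can expect to receive.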
Step 3: AI Agent (Integration of the Mistral Cloud Chat Model)
- Type: AI Agent Node
- Goal: Clean and structure the raw OCR text.
- Logic:
- Prompt: “Take all structured data from this OCR text and return it in usable JSON format. Fields may include item name, qty, value of items, category, etc.”
Step 4: Code Node (Pre-HSN Processing)
- Type: Code
- Purpose: Parses the AI-generated JSON into structured item objects that the HSN lookup steps can use.
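The parsing in this step could look roughly like the Python sketch below. It assumes the agent returns either a JSON array or an object with an `items` array, possibly wrapped in a Markdown code fence, and that the keys are `item_name`, `qty`, `value`, and `category`. Those key names and the `parse_ai_items` helper are assumptions for illustration; the actual keys depend on what the agent emits.

```python
import json
import re
from typing import Any

FENCE = "`" * 3  # Markdown code fence that chat models often wrap JSON in
FENCED_JSON = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_ai_items(raw: str) -> list[dict[str, Any]]:
    """Turn the agent's JSON reply into a list of normalised item dicts."""
    text = raw.strip()
    fenced = FENCED_JSON.search(text)
    if fenced:
        text = fenced.group(1).strip()
    data = json.loads(text)
    # Accept {"items": [...]}, a single item object, or a bare list.
    if isinstance(data, dict):
        data = data.get("items", [data])
    items = []
    for entry in data:
        items.append({
            "name": str(entry.get("item_name", "")).strip(),
            "qty": float(entry.get("qty", 0) or 0),
            "value": float(entry.get("value", 0) or 0),
            "category": str(entry.get("category", "")).strip(),
        })
    return items
```

Normalising numbers and trimming whitespace here keeps the later matching step free of type checks.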
Step 5: Get Row(s) from Sheet (Google Sheets Node)
- Type: Google Sheets → Read
- Purpose: Reads the HSN Master Sheet, which maps keywords/categories to HSN codes.
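As an illustration of how the rows read from the sheet might be turned into a lookup table, here is a small Python sketch. The column names `Keyword` and `HSN Code` are assumptions, since the actual headers of the master sheet are not given here, and `build_hsn_lookup` is a hypothetical helper.

```python
from typing import Any


def build_hsn_lookup(
    rows: list[dict[str, Any]],
    keyword_col: str = "Keyword",   # assumed column header
    hsn_col: str = "HSN Code",      # assumed column header
) -> dict[str, str]:
    """Map lower-cased keywords to HSN codes, skipping incomplete rows."""
    lookup: dict[str, str] = {}
    for row in rows:
        keyword = str(row.get(keyword_col, "")).strip().lower()
        hsn = str(row.get(hsn_col, "")).strip()
        if keyword and hsn:
            # Keep the first code seen if a keyword appears more than once.
            lookup.setdefault(keyword, hsn)
    return lookup
```

Lower-casing the keywords once here means the matching step can compare against lower-cased item text without repeating the work.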
Step 6: Code1 Node (HSN Auto-Assignment)
- Type: Code
- Objective: Associate the HSN codes in the Google Sheet with the invoice item categories.
- Logic:
- Search the master sheet for keyword matches against each invoice item.
- Assign the corresponding HSN code.
- If HSN not available – mention as “HSN Not
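The matching logic of this step could be sketched in Python as follows. This is one possible approach, assuming a simple case-insensitive substring match of each keyword against the item's name and category; the real Code1 node may match differently. The `assign_hsn` helper and the `name`, `category`, and `hsn` keys are hypothetical, and the fallback label is passed in by the caller rather than hard-coded.

```python
from typing import Any


def assign_hsn(
    items: list[dict[str, Any]],
    lookup: dict[str, str],
    fallback_label: str,
) -> list[dict[str, Any]]:
    """Return copies of the items with an 'hsn' field added.

    lookup maps lower-cased keywords to HSN codes; the first keyword
    found in an item's name or category wins.
    """
    assigned = []
    for item in items:
        haystack = f"{item.get('name', '')} {item.get('category', '')}".lower()
        hsn = next(
            (code for keyword, code in lookup.items() if keyword in haystack),
            fallback_label,  # used when no keyword matches
        )
        assigned.append({**item, "hsn": hsn})
    return assigned
```

Because the first matching keyword wins, the order of rows in the master sheet decides ties, so more specific keywords would need to come first.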
