Overview
This n8n workflow automates HSN code assignment for invoices. A user uploads an invoice PDF through an n8n form, and the workflow sends the file to the OCR.space API to extract its text. A Mistral AI agent then parses and cleans that text into structured fields, the matching HSN codes are looked up in a Google Sheet, and the invoice is returned with the appropriate codes attached. Nothing is looked up manually; every step runs through an API.
How does it work?
Step 1: On Form Submission (Trigger Node)
- Type: n8n Form Trigger (Form Submission)
- Goal: Trigger the workflow when a user uploads an invoice through an n8n form.
- Process:
- Receives uploaded invoice PDF.
- Relays the file data to the next node.
Step 2: OCR Response (HTTP Request Node)
- Type: HTTP Request
- Purpose: Sends the uploaded invoice to an OCR API (such as OCR.space) to extract the text from the PDF.
- Configuration:
- Method: POST
- URL: https://api.ocr.space/parse/image
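As a rough illustration of what can be done with the reply to this request, the sketch below pulls the recognized text out of an OCR.space-style JSON response. It assumes the reply carries a `ParsedResults` list with one `ParsedText` entry per page and an `IsErroredOnProcessing` flag. The helper name `extract_ocr_text` and the sample reply are made up for this example, and the workflow's own node may handle the response differently.

```python
import json


def extract_ocr_text(response_body: str) -> str:
    """Join the text of every parsed page in an OCR.space-style JSON reply."""
    data = json.loads(response_body)

    # Assumed error flag: raise so a failed OCR call is not mistaken for an empty invoice.
    if data.get("IsErroredOnProcessing"):
        raise ValueError(f"OCR failed: {data.get('ErrorMessage')}")

    pages = data.get("ParsedResults") or []
    return "\n".join(page.get("ParsedText", "").strip() for page in pages)


# Hypothetical reply for a one-page invoice (not real API output).
sample_reply = json.dumps({
    "IsErroredOnProcessing": False,
    "ParsedResults": [{"ParsedText": "Invoice No: 101\r\nCotton Shirt 2 500\r\n"}],
})

print(extract_ocr_text(sample_reply))
```

Raising on the error flag keeps a failed OCR call from reaching the AI Agent as if it were an empty invoice.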
Step 3: AI Agent (Integration of the Mistral Cloud Chat Model)
- Type: AI Agent Node
- Goal: Clean and structure the raw OCR output.
- Logic:
- Prompt: “Extract all structured data from this OCR text and return it in usable JSON format. Fields may include item name, qty, value of items, category, etc.”
Step 4: Code Node (Pre-HSN Processing)
- Type: Code
- Purpose: Parses the AI-generated JSON into separate item objects that the later HSN steps can work with.
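One way the parsing in this step could look is sketched below in Python. It is not the workflow's actual Code node: the function name `parse_agent_output`, the field names (`item_name`, `qty`, `value`, `category`), and the optional `items` wrapper key are all assumptions based on the fields named in the prompt. The sketch also assumes quantities and values arrive as plain numbers or numeric strings.

```python
import json


def parse_agent_output(raw: str) -> list:
    """Turn the agent's text reply into a list of normalized item dictionaries."""
    # The model may add chatter around the JSON, so keep only the outermost JSON span.
    starts = [i for i in (raw.find("{"), raw.find("[")) if i != -1]
    end = max(raw.rfind("}"), raw.rfind("]"))
    if not starts or end == -1:
        raise ValueError("no JSON found in agent output")
    data = json.loads(raw[min(starts):end + 1])

    # Accept either a bare list of items or an object with an "items" key (assumed shapes).
    items = data if isinstance(data, list) else data.get("items", [])

    return [
        {
            "item_name": str(item.get("item_name", "")).strip(),
            "qty": float(item.get("qty") or 0),
            "value": float(item.get("value") or 0),
            "category": str(item.get("category", "")).strip(),
        }
        for item in items
    ]


# Hypothetical agent reply with extra text around the JSON.
raw_reply = (
    "Here is the extracted data:\n"
    '{"items": [{"item_name": " Cotton Shirt ", "qty": "2", '
    '"value": 500, "category": "Garments"}]}\n'
    "Let me know if anything is missing."
)

print(parse_agent_output(raw_reply))
```

Normalizing every item to the same keys and types at this point means the matching step does not need to guard against missing fields.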
Step 5: Get Row(s) from Sheet (Google Sheets Node)
- Type: Google Sheets → Read
- Purpose: Reads the HSN master sheet, which maps keywords/categories to HSN codes.
Step 6: Code1 Node (HSN Auto-Assignment)
- Type: Code
- Objective: Match invoice item categories to the HSN codes in the Google Sheet.
- Logic:
- For each invoice item, search the master sheet for keyword matches.
- Assign the corresponding HSN code.
- If HSN not available – mention as “HSN Not
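The matching logic above can be sketched roughly as follows. This is a minimal illustration, not the workflow's actual Code1 node. It assumes each sheet row has `keyword` and `hsn_code` columns, that items carry `item_name` and `category` fields, and that the first matching keyword wins. The fallback label in `NOT_FOUND_LABEL` is a placeholder chosen for this example, and the sample rows are illustrative only.

```python
# Placeholder fallback label chosen for this sketch; substitute the workflow's own wording.
NOT_FOUND_LABEL = "HSN Not Found"


def assign_hsn(items, master_rows):
    """Attach an HSN code to each item via case-insensitive keyword matching."""
    result = []
    for item in items:
        # Search both the item name and its category for a master-sheet keyword.
        haystack = f"{item.get('item_name', '')} {item.get('category', '')}".lower()
        hsn = NOT_FOUND_LABEL
        for row in master_rows:
            keyword = str(row.get("keyword", "")).strip().lower()
            if keyword and keyword in haystack:
                hsn = str(row.get("hsn_code", "")).strip()
                break  # first matching row wins (an assumption of this sketch)
        result.append({**item, "hsn_code": hsn})
    return result


# Illustrative master-sheet rows and invoice items (hypothetical data).
master_rows = [
    {"keyword": "shirt", "hsn_code": "6205"},
    {"keyword": "laptop", "hsn_code": "8471"},
]
items = [
    {"item_name": "Cotton Shirt", "category": "Garments"},
    {"item_name": "Office Chair", "category": "Furniture"},
]

for row in assign_hsn(items, master_rows):
    print(row["item_name"], "->", row["hsn_code"])
```

Plain substring matching is simple but can over-match short keywords, so a real sheet may need more specific keywords or whole-word matching.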

Technology Stack Included
Key Benefits
Accurate OCR Data Extraction
Leverages an OCR API to capture text from uploaded invoices with a high degree of accuracy to reduce data entry errors.
AI-Powered Parsing & Cleaning
The AI Agent cleans the raw OCR output and structures it into usable JSON.
Dynamic HSN Code Assignment
Matches extracted product descriptions against your HSN Google Sheet in real time. Because the sheet is read on every run, any changes to HSN codes in Google Sheets apply to the next invoice processed.
Scalable & Reusable
Can be extended to support various types of documents (invoices, purchase orders, receipts) with ease.
Improved Compliance & GST Readiness
Assigns HSN codes automatically from a single master sheet, which helps prevent mismatches in GST returns and avoid penalties.
Time & Cost Savings
Automates manual searching and saves your accounting team hours.
Real-Time Processing
Outputs structured invoice data immediately after you submit the form.
Closure
This documentation outlines the step-by-step process, the role of each node, and the technologies used. The workflow itself takes an invoice PDF from the user, extracts its text through the OCR.space API, and returns the invoice with HSN codes assigned, saving the time otherwise spent on manual lookup.