Build a GPT-Powered Data Analyst on Your Local Machine

Your private AI data assistant that answers questions on your CSVs — entirely offline, totally free.

Imagine This…

You’ve got a CSV file loaded with thousands of rows.
It’s a typical day at work, or maybe it’s your side project’s sales data.

You wonder:

  • “What’s the average sales by product?”

  • “Which region performed best last quarter?”

  • “Can I see a table showing total revenue by month?”

Normally, that means opening up Excel, fiddling with pivot tables, or writing Pandas code that looks like:

df.groupby('Product')['Sales'].mean()

Not exactly something everyone enjoys.
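
For reference, here is what that one-liner does on a tiny table. The product names and numbers are made up purely for illustration; only the groupby call comes from the line above.

```python
import pandas as pd

# Made-up sales rows, purely for illustration.
df = pd.DataFrame(
    {
        "Product": ["Widget", "Widget", "Gadget", "Gadget"],
        "Sales": [100, 300, 50, 150],
    }
)

# Group rows by product, then average the Sales column within each group.
avg_sales = df.groupby("Product")["Sales"].mean()
print(avg_sales)
```

The result is a pandas Series indexed by product, with one average per product.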


What if you could just ask?

What if — instead — you could simply type:

Average sales by product

…and instantly get the answer, calculated from your CSV, right on your laptop?

✅ No manual coding
✅ No uploading your sensitive data to OpenAI or some cloud service
✅ No API key costs
✅ 100% offline, secure, private

That’s exactly what we’re building today.
You’ll end up with your own personal GPT-powered Data Analyst, running entirely on your local machine, answering your CSV questions in plain English.


What Exactly Are We Building?

Here’s the dream:

✅ A slick web app where you upload a CSV
✅ Ask any natural language question about it
✅ Your local GPT (like Mistral via Ollama) reads a sample of your CSV and figures out the answer
✅ Gives you a direct response — like a markdown table or text summary
✅ All happens offline, so your data never leaves your machine.

It’s basically like having ChatGPT read your CSV, but living entirely on your laptop. 🔥


Tools We’re Using
Tool                    Why we’re using it
Python                  The glue to hold it all together
Streamlit               Build a beautiful, interactive UI
Pandas                  Load & explore your CSVs
Ollama                  Runs large language models locally
Mistral (or LLaMA 3)    Local GPT-like brain for reasoning
Requests                Talk to the Ollama API

That’s it.
No OpenAI keys, no sending your private data over the internet.


The Big Deal: Why Local Matters
  • Data privacy: Your CSV stays on your laptop.

  • No surprise bills: It costs zero.

  • No network round trip: Requests go to localhost, so there’s no cloud latency. How fast answers come back depends on your hardware.

  • Freedom to tweak: Want to switch to LLaMA 3 or your own finetuned model? Just change one line.


What Happens Under the Hood?

Here’s the simple architecture:

[ You ] ---> Upload CSV + type question ---> [ Streamlit ]
                          |
                          V
               [ Pandas DataFrame ]
                          |
            Send small CSV sample + question
                          |
                          V
                  [ Ollama + Mistral ]
                          |
                    Receives answer
                          |
                          V
                Display answer in app

You can literally ask:

“Average sales by product”

…and your local GPT figures it out from the data you gave it.
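
The flow in the diagram can be sketched as two small functions. This is a minimal sketch, not the app itself: build_prompt, answer_question and fake_generate are hypothetical names, and the model is replaced by a stand-in so you can try the flow before Ollama is even installed.

```python
import io
from typing import Callable

import pandas as pd


def build_prompt(df: pd.DataFrame, question: str, max_rows: int = 100) -> str:
    """Combine a small CSV sample and the user's question into one prompt."""
    csv_sample = df.head(max_rows).to_csv(index=False)
    return (
        "You are an expert data analyst.\n"
        f"Given the following CSV data:\n{csv_sample}\n"
        f"Answer this question without writing any code:\n{question}\n"
    )


def answer_question(
    df: pd.DataFrame, question: str, generate: Callable[[str], str]
) -> str:
    """Run the flow from the diagram: DataFrame -> prompt -> model -> answer."""
    return generate(build_prompt(df, question))


# Stand-in for the model so the flow can be tried without Ollama running.
def fake_generate(prompt: str) -> str:
    return f"(model would answer here; prompt was {len(prompt)} characters)"


df = pd.read_csv(io.StringIO("Product,Sales\nWidget,100\nGadget,50\n"))
print(answer_question(df, "Average sales by product", fake_generate))
```

Passing the model in as a plain function keeps the data handling separate from the HTTP call, which makes each half easier to test.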


Prerequisites

✅ A machine with at least 8GB RAM (16GB is better).
✅ Python 3.8 or newer installed.
✅ Ollama installed to run your local GPT.
✅ Basic comfort running a few terminal commands.


Let’s Build This Step by Step

1️⃣ Install Ollama & Mistral

Head to https://ollama.com/download and install Ollama.

Then open your terminal and run:

ollama pull mistral

This downloads the Mistral 7B model, a 7-billion-parameter open-weight model that will act as our local GPT-style brain.

Test it:

ollama run mistral

Try typing:

> What is 7 * 6?

[Screenshot: Mistral setup in the command prompt]


Boom. Instant local inference.
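
If you’d rather confirm the setup from Python, Ollama also exposes a local HTTP API. The sketch below asks the /api/tags endpoint which models are installed. The function names are hypothetical, and it assumes Ollama is listening on its default port, 11434.

```python
import json
import urllib.error
import urllib.request

OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"  # default Ollama port


def installed_models(payload: dict) -> list:
    """Pull model names out of the JSON returned by /api/tags."""
    return [m["name"] for m in payload.get("models", [])]


def fetch_installed_models(url: str = OLLAMA_TAGS_URL, timeout: float = 5.0) -> list:
    """Return installed model names, or an empty list if Ollama is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return installed_models(json.load(resp))
    except (urllib.error.URLError, OSError, ValueError):
        return []


if __name__ == "__main__":
    names = fetch_installed_models()
    print("Installed models:", names or "none found (is Ollama running?)")
```

After pulling Mistral, the printed list should include a name that starts with mistral.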


2️⃣ Set Up Python Environment

Make a folder for your app:

mkdir gpt-data-analyst && cd gpt-data-analyst

Create a virtual environment:

python -m venv venv
source venv/bin/activate    # On Windows: venv\Scripts\activate

If venv\Scripts\activate throws an error in Windows PowerShell, run this first, then run the activate command again:

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope Process

Install dependencies:

pip install streamlit pandas requests
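
To double-check that the install worked inside the virtual environment, a quick script like this one can help. It’s only a sketch, and missing_packages is a hypothetical helper name. It uses the standard library to see whether each package can be found, without actually importing it.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(["streamlit", "pandas", "requests"])
if missing:
    print("Missing:", ", ".join(missing), "- run: pip install", " ".join(missing))
else:
    print("All dependencies are installed.")
```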

3️⃣ The Streamlit App Code

Now create a file app.py with this code:

import pandas as pd
import requests
import streamlit as st

st.set_page_config(page_title="Local GPT Data Analyst", layout="wide")
st.title("🧠 GPT-Powered Data Analyst (Natural Language Answer via Ollama)")

uploaded_file = st.file_uploader("📁 Upload your CSV file", type=["csv"])
if uploaded_file:
    df = pd.read_csv(uploaded_file)
    st.subheader("📊 Preview of Uploaded Data")
    st.dataframe(df.head())

    question = st.text_input("🔎 Ask your question (e.g. 'Average sales by product'):")
    if question:
        with st.spinner("Thinking…"):
            # Take a small sample to avoid overloading the prompt
            csv_sample = df.head(100).to_csv(index=False)
            prompt = f"""
You are an expert data analyst.
Given the following CSV data:
{csv_sample}
Answer this question in plain text (or table if relevant), without writing any code:
\"\"\"{question}\"\"\"
Provide a direct, clear answer only.
"""
            # Call the local Ollama endpoint
            try:
                res = requests.post(
                    "http://localhost:11434/api/generate",
                    json={"model": "mistral", "prompt": prompt, "stream": False},
                    timeout=300,  # local generation can be slow on modest hardware
                )
                res.raise_for_status()  # surface HTTP errors instead of a KeyError
                response = res.json()["response"]
                st.subheader("🤖 Raw LLM Response")
                st.text(response)
                # Display the answer directly (rendered as markdown)
                st.subheader("✅ Answer")
                st.write(response)
            except Exception as e:
                st.error(f"🚨 Error contacting Ollama: {e}")
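
One thing worth knowing: when Ollama can’t serve a request (for example, the model name hasn’t been pulled), the JSON body carries an "error" field instead of "response". A small helper could make that case explicit. This is a sketch under that assumption, and extract_answer is a hypothetical name, not part of Ollama or Requests.

```python
def extract_answer(payload: dict) -> str:
    """Return the model's text from an Ollama /api/generate JSON body."""
    if "error" in payload:
        # Ollama reports problems such as an unknown model name in an "error" field.
        raise RuntimeError(f"Ollama returned an error: {payload['error']}")
    answer = payload.get("response", "").strip()
    if not answer:
        raise RuntimeError("Ollama returned an empty response.")
    return answer
```

In the app, you could then write response = extract_answer(res.json()) and let the existing except block show the message.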

 


4️⃣ Run Your Local Data Analyst!

Start it up:

streamlit run app.py

You’ll see something like:

Local URL: http://localhost:8501


Click it. Upload your CSV. Ask your question.


Watch your own GPT-powered data analyst at work — entirely on your machine.


Example Questions to Try
Question                              Example Output
“Average sales by product”            Markdown table with products + avg
“Total revenue by month”              Table with Month & Revenue columns
“Top 5 regions by sales volume”       Direct ranked list
“Which product had highest sales?”    Simple text answer

Your local GPT (Mistral) figures this out by looking at the CSV sample and your question.


💡 Why We Use head(100)

We pass only the first 100 rows to the model to:

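✅ Keep the prompt size small (a local model’s context window can’t fit a large CSV).
✅ Still give enough context to understand your data’s columns & values.

The trade-off: the model only sees those 100 rows, so totals and averages describe the sample, not the whole file. For a huge CSV, treat the answers as a quick look rather than exact figures.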
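
To see how far a 100-row sample can drift from the full file, here is a small sketch on synthetic data. The numbers are generated, not real sales, and are chosen so the difference is easy to spot.

```python
import pandas as pd

# Synthetic data: 1,000 rows whose Sales values climb from 0 to 999.
df = pd.DataFrame({"Sales": range(1000)})

sample_mean = df.head(100)["Sales"].mean()  # what the model can see
full_mean = df["Sales"].mean()              # what the whole file says

print(f"First 100 rows: {sample_mean}")
print(f"All rows:       {full_mean}")
```

If your file is sorted by date or by size, the first 100 rows can look very different from the rest. Swapping df.head(100) for df.sample(n=100, random_state=0) is one option for a more representative slice when the file has at least 100 rows.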


Looks Amazing, But Why Offline?

Because:

✅ Your data never leaves your laptop (perfect for private financial, medical, or company data).
✅ No API costs or rate limits.
✅ Works even without internet.
✅ You can swap to other models anytime:

json={"model": "llama3", "prompt": prompt, "stream": False}
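
If you switch models often, you might prefer not to edit the code each time. Here is one possible sketch that reads the model name from an environment variable. ANALYST_MODEL and build_payload are made-up names for this example, not something Ollama defines.

```python
import os

DEFAULT_MODEL = "mistral"


def build_payload(prompt: str, model=None) -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    # ANALYST_MODEL is a made-up variable name for this sketch, not an Ollama setting.
    chosen = model or os.environ.get("ANALYST_MODEL", DEFAULT_MODEL)
    return {"model": chosen, "prompt": prompt, "stream": False}


print(build_payload("Average sales by product"))
```

Whichever model you name must already be pulled with ollama pull, or the request will fail.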

What Makes This So Fun

It’s a little ChatGPT that knows pandas & your CSV, running right on your computer.

Under the hood:

  1. Streamlit handles the UI & file upload.

  2. pandas loads the CSV.

  3. A prompt is crafted:

    "You are an expert data analyst. Given this CSV data... answer this question..."
    
  4. It’s sent to Ollama on localhost:11434.

  5. Ollama + Mistral processes it and returns a plain answer (or markdown table).

  6. Streamlit displays it neatly.


Want To Level Up?

✅ Try a bigger model (like llama3)
✅ Let it generate plots too by extending your prompt.
✅ Save Q&A history in a local file.
✅ Or build a chatbot-style memory that recalls earlier questions.
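
As a starting point for the Q&A history idea, here is a minimal sketch that appends each question and answer to a local JSON Lines file. The function names, the record fields and the file layout are all assumptions for illustration. Nothing here is built into Streamlit or Ollama.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def save_qa(path, question: str, answer: str) -> None:
    """Append one question/answer pair as a single JSON line."""
    record = {
        "asked_at": datetime.now(timezone.utc).isoformat(),
        "question": question,
        "answer": answer,
    }
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")


def load_history(path) -> list:
    """Read every saved pair back, oldest first; empty list if no file yet."""
    p = Path(path)
    if not p.exists():
        return []
    with p.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

In the app, you could call save_qa("history.jsonl", question, response) right after the answer is displayed. The same history could later feed a chatbot-style memory.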


Real-World Uses

💼 Finance teams:
“What were monthly expenses by category last year?”

📈 Marketing:
“Show me top 5 campaigns by leads.”

🛒 Sales:
“Which region closed most deals?”

🏠 DIY:
“Analyze electricity usage by season from my smart meter data.”


🔒 Security Note

We run exec()-free here: your GPT gives direct natural language answers.
So there is no risk of model-generated Python code running on your machine.
On top of that, your data can’t leak over the internet, because the whole thing runs locally and offline.

(If you do ever want GPT to return actual pandas code to exec(), keep it local only. Always check outputs.)


Wrap Up

That’s it — you just built your very own GPT-powered Data Analyst,
that runs entirely on your laptop, costs nothing, needs no API key, and keeps your data safe.

✅ It’s like ChatGPT — but it only talks to you, only sees your CSVs, and lives inside your laptop.

Pretty awesome, right?


From PrepEngi

At PrepEngi, I love helping data folks & developers build real-world, hands-on tools that blend AI with engineering.
If this inspired you, share it, fork it, or reach out. I’d love to see what you build!