Local AI, Advanced: Building a Private Tech-Docs Assistant with Ollama and Python
Handing your internal SOPs, ops notes, runbooks, and troubleshooting guides to a cloud AI for answers has one big pain point, and it is usually not cost: it is data-leakage risk and compliance.
This article builds a fully local, private tech-docs assistant:
- The LLM runs locally (Ollama)
- Documents are vectorized (embeddings) and stored in SQLite (no extra vector database needed)
- At query time, relevant chunks are retrieved first, then handed to the model to compose an answer (RAG)
- Answers cite the source of each quoted chunk, so you can go back and verify
Table of Contents
- 1. Overall architecture (RAG)
- 2. Installing Ollama and preparing the models
- 3. Quick test of the Ollama API
- 4. Python project setup
- 5. Implementation: private tech-docs assistant (SQLite vector retrieval + answering)
- 6. Practical tuning: accuracy, security, performance
- 7. Operations: updates, backups, access control
- FAQ
- Further reading
1. Overall Architecture (RAG)
The point of RAG (Retrieval-Augmented Generation): retrieve the data first, then have the model answer, instead of letting it improvise from vague memory.
User question
│
├─(1) Embed the question (embedding)
│
├─(2) Fetch Top-K similar chunks from SQLite (cosine similarity)
│
└─(3) Send "chunks + question" to the LLM to generate the answer
    └─(4) Answer cites its sources (file/chunk)
You will run two kinds of models side by side:
- Chat/generation model: composes the readable answer
- Embedding model: turns text into vectors, used for similarity retrieval
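To make "similarity retrieval" concrete: embeddings are just lists of floats, and two texts count as similar when the angle between their vectors is small. A minimal sketch with toy 3-dimensional vectors (real embedding models return hundreds of dimensions; the values here are made up for illustration):

```python
import math

def cosine_sim(a, b):
    # cos(theta) = (a . b) / (|a| * |b|); 1.0 means identical direction
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 0.0 if na == 0 or nb == 0 else dot / (na * nb)

doc = [0.9, 0.1, 0.0]          # e.g. a chunk about "Zabbix Proxy"
query_close = [0.8, 0.2, 0.0]  # a question on the same topic
query_far = [0.0, 0.1, 0.9]    # an unrelated question

print(cosine_sim(doc, query_close) > cosine_sim(doc, query_far))  # True
```

The same `cosine_sim` logic reappears in the full script below; the retrieval step is nothing more than computing this score for every stored chunk and keeping the top few.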
2. Installing Ollama and Preparing the Models
2.1 Install Ollama on Linux
curl -fsSL https://ollama.com/install.sh | sh
If you want it to run persistently as a systemd service (common on servers), you can create and enable the service per the official docs, or simply use the one the installer sets up. When you need to adjust environment variables, use systemctl edit to create an override.
2.2 Download models (one chat model plus one embeddings model is recommended)
# Chat model (pick one)
ollama pull gemma3
# or
ollama pull llama3.2
# Embeddings model (pick one)
ollama pull embeddinggemma
# or
ollama pull all-minilm
Tip: embedding models are usually lighter than chat models. Choose according to your hardware (CPU-only works too, just slower).
3. Quick Test of the Ollama API
By default the Ollama API serves on the local machine (base URL: http://localhost:11434/api); you can verify it with curl first.
(The REST API streams its output by default; if you want the complete result in one response, set "stream": false.)
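When streaming is left on, /api/generate returns one JSON object per line, each carrying a "response" fragment and a "done" flag. A small sketch of joining such a stream into a full answer (the sample lines below are illustrative, not real model output):

```python
import json

def join_stream(ndjson_lines):
    # Concatenate the "response" fragments until "done" is true
    parts = []
    for line in ndjson_lines:
        obj = json.loads(line)
        parts.append(obj.get("response", ""))
        if obj.get("done"):
            break
    return "".join(parts)

sample = [
    '{"response": "RAG keeps answers ", "done": false}',
    '{"response": "grounded in your docs.", "done": true}',
]
print(join_stream(sample))  # RAG keeps answers grounded in your docs.
```

In practice you would feed this function the lines of a streamed HTTP response; the examples in this article sidestep it by sending "stream": false.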
3.1 Text generation (/api/generate)
curl http://localhost:11434/api/generate -d '{
"model": "gemma3",
"prompt": "่ซ็จไธ้ปๅๅบ:RAG ็ๆ ธๅฟๅนๅผๆฏไป้บผ?",
"stream": false
}'
3.2 Chat (/api/chat)
curl http://localhost:11434/api/chat -d '{
"model": "gemma3",
"messages": [
{ "role": "system", "content": "ไฝ ๆฏๅด่ฌน็ๆ่กๆไปถๅฉๆ,ๅ็ญ่ฆ้ไธๅผ็จไพๆบ。" },
{ "role": "user", "content": "ไป้บผๆ
ๅข้ฉๅ็จ Active Check?" }
],
"stream": false
}'
3.3 Embeddings (/api/embed)
curl http://localhost:11434/api/embed -d '{
"model": "embeddinggemma",
"input": "Zabbix Proxy ๅฏไปฅ็ทฉ่ก่ณๆไธฆ้ไฝ Server ๅฃๅ",
"truncate": true
}'
4. Python Project Setup
Dependencies are kept minimal: requests plus sqlite3 (standard library) are all you need.
mkdir private-docs-assistant && cd private-docs-assistant
python -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install requests
Suggested project layout:
private-docs-assistant/
  docs/          # your private docs (md/txt/html are all fine here)
  assistant.py   # main program (ingest + ask)
  kb.sqlite3     # SQLite knowledge base (auto-generated)
5. Implementation: Private Tech-Docs Assistant (SQLite Vector Retrieval + Answering)
This script provides two commands:
- python assistant.py ingest ./docs: chunk the documents + compute embeddings + store them in SQLite
- python assistant.py ask "your question": retrieve the Top-K chunks + generate an answer with the LLM (with sources)
#!/usr/bin/env python3
# assistant.py
import argparse
import json
import math
import os
import re
import sqlite3
from dataclasses import dataclass
from pathlib import Path
from typing import List, Tuple

import requests

OLLAMA_BASE = os.environ.get("OLLAMA_BASE", "http://localhost:11434/api")
CHAT_MODEL = os.environ.get("OLLAMA_CHAT_MODEL", "gemma3")
EMBED_MODEL = os.environ.get("OLLAMA_EMBED_MODEL", "embeddinggemma")
DB_PATH = os.environ.get("KB_DB", "kb.sqlite3")


# --- Text processing: simple chunking (adapt to your file formats as needed) ---

def read_text_file(path: Path) -> str:
    text = path.read_text(encoding="utf-8", errors="ignore")
    # Rough cleanup: collapse extra whitespace
    text = re.sub(r"[ \t]+", " ", text)
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


def chunk_text(text: str, chunk_size: int = 900, overlap: int = 150) -> List[str]:
    if not text:
        return []
    chunks = []
    i = 0
    n = len(text)
    while i < n:
        j = min(n, i + chunk_size)
        chunk = text[i:j].strip()
        if chunk:
            chunks.append(chunk)
        i = j - overlap
        if i < 0:
            i = 0
        if j == n:
            break
    return chunks


# --- Ollama API ---

def ollama_embed(texts: List[str]) -> List[List[float]]:
    # /api/embed accepts input as a string or an array of strings
    r = requests.post(
        f"{OLLAMA_BASE}/embed",
        json={"model": EMBED_MODEL, "input": texts, "truncate": True},
        timeout=300,
    )
    r.raise_for_status()
    data = r.json()
    return data["embeddings"]


def ollama_chat(system: str, user: str) -> str:
    payload = {
        "model": CHAT_MODEL,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
        "stream": False,
    }
    r = requests.post(f"{OLLAMA_BASE}/chat", json=payload, timeout=300)
    r.raise_for_status()
    data = r.json()
    return data["message"]["content"]


# --- Vector math ---

def cosine_sim(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 0.0
    return dot / (na * nb)


# --- SQLite KB ---

def init_db(conn: sqlite3.Connection) -> None:
    conn.execute("""
        CREATE TABLE IF NOT EXISTS chunks (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            path TEXT NOT NULL,
            chunk_index INTEGER NOT NULL,
            content TEXT NOT NULL,
            embedding TEXT NOT NULL
        )
    """)
    conn.execute("CREATE INDEX IF NOT EXISTS idx_chunks_path ON chunks(path)")
    conn.commit()


def save_chunks(conn: sqlite3.Connection, path: str, chunks: List[str], embeds: List[List[float]]) -> None:
    conn.execute("DELETE FROM chunks WHERE path = ?", (path,))
    rows = [
        (path, i, chunks[i], json.dumps(embeds[i], ensure_ascii=False))
        for i in range(len(chunks))
    ]
    conn.executemany(
        "INSERT INTO chunks(path, chunk_index, content, embedding) VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()


def load_all(conn: sqlite3.Connection) -> List[Tuple[str, int, str, List[float]]]:
    cur = conn.execute("SELECT path, chunk_index, content, embedding FROM chunks")
    out = []
    for path, idx, content, emb_json in cur.fetchall():
        out.append((path, idx, content, json.loads(emb_json)))
    return out


@dataclass
class Hit:
    score: float
    path: str
    chunk_index: int
    content: str


def search(conn: sqlite3.Connection, query: str, top_k: int = 5) -> List[Hit]:
    q_emb = ollama_embed([query])[0]
    items = load_all(conn)
    scored = []
    for path, idx, content, emb in items:
        s = cosine_sim(q_emb, emb)
        scored.append(Hit(score=s, path=path, chunk_index=idx, content=content))
    scored.sort(key=lambda x: x.score, reverse=True)
    return scored[:top_k]


# --- CLI ---

def cmd_ingest(args) -> None:
    docs_dir = Path(args.dir).resolve()
    assert docs_dir.exists(), f"docs dir not found: {docs_dir}"
    conn = sqlite3.connect(DB_PATH)
    init_db(conn)
    files = []
    for ext in ("*.md", "*.txt", "*.log", "*.html"):
        files.extend(docs_dir.rglob(ext))
    if not files:
        print("No ingestible files found (md/txt/log/html).")
        return
    for fp in sorted(files):
        text = read_text_file(fp)
        chunks = chunk_text(text, chunk_size=args.chunk_size, overlap=args.overlap)
        if not chunks:
            continue
        embeds = ollama_embed(chunks)
        rel_path = str(fp.relative_to(docs_dir))
        save_chunks(conn, rel_path, chunks, embeds)
        print(f"[OK] {rel_path} chunks={len(chunks)}")
    print("Done: knowledge base updated.")


def cmd_ask(args) -> None:
    conn = sqlite3.connect(DB_PATH)
    init_db(conn)
    hits = search(conn, args.question, top_k=args.top_k)
    if not hits:
        print("No relevant chunks found. Make sure you have run ingest, or rephrase the question.")
        return
    context_lines = []
    for i, h in enumerate(hits, 1):
        context_lines.append(
            f"[{i}] {h.path}#{h.chunk_index} (score={h.score:.3f})\n{h.content}\n"
        )
    context = "\n---\n".join(context_lines)
    system = (
        "You are a private tech-docs assistant. Answer ONLY from the [Cited excerpts] the user provides. "
        "If the excerpts are not enough to draw a conclusion, say so directly "
        "('The excerpts are insufficient; please add: ...') instead of making things up. "
        "Answer format:\n"
        "1) Conclusion first (bullet points)\n"
        "2) Then actionable steps (if applicable)\n"
        "3) Finally list the excerpt numbers you cited (e.g. Citations: [1][3])\n"
    )
    user = (
        f"[Question]\n{args.question}\n\n"
        f"[Cited excerpts]\n{context}\n\n"
        "Please answer now."
    )
    ans = ollama_chat(system, user)
    print(ans)


def main():
    p = argparse.ArgumentParser(description="Private Tech Docs Assistant (Ollama + SQLite)")
    sub = p.add_subparsers(dest="cmd", required=True)

    p_ing = sub.add_parser("ingest", help="ingest docs into sqlite kb")
    p_ing.add_argument("dir", help="docs folder, e.g. ./docs")
    p_ing.add_argument("--chunk-size", type=int, default=900)
    p_ing.add_argument("--overlap", type=int, default=150)
    p_ing.set_defaults(func=cmd_ingest)

    p_ask = sub.add_parser("ask", help="ask a question with retrieval")
    p_ask.add_argument("question", help="your question")
    p_ask.add_argument("--top-k", type=int, default=5)
    p_ask.set_defaults(func=cmd_ask)

    args = p.parse_args()
    args.func(args)


if __name__ == "__main__":
    main()
5.1 Usage
# 1) Ingest your documents first
python assistant.py ingest ./docs
# 2) Start asking questions
python assistant.py ask "Summarize the troubleshooting order in this runbook for when the service fails to start"
6. Practical Tuning: Accuracy, Security, Performance
6.1 Accuracy: Top-K, chunk size, and answer rules
- Top-K: 5 is usually enough; if your documents are large and answers tend to miss relevant passages, raise it to 8-12.
- Chunk size: around 900 characters is a common compromise; too small fragments the context, too large mixes unrelated topics into one chunk.
- Answer rules (system prompt): explicitly require "answer only from the cited excerpts"; if they are insufficient, say so.
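To see how chunk size and overlap interact, here is the article's chunking loop run on a toy string with deliberately tiny values (10/3 instead of the 900/150 defaults): each chunk repeats the last 3 characters of the previous one, so a sentence cut at a boundary still appears whole in at least one chunk.

```python
def chunk_text(text, chunk_size=10, overlap=3):
    # Same sliding-window logic as assistant.py, with toy defaults
    chunks, i, n = [], 0, len(text)
    while i < n:
        j = min(n, i + chunk_size)
        chunk = text[i:j].strip()
        if chunk:
            chunks.append(chunk)
        i = max(0, j - overlap)
        if j == n:
            break
    return chunks

print(chunk_text("abcdefghijklmnopqrst"))
# ['abcdefghij', 'hijklmnopq', 'opqrst']
```

Note the repeated "hij" and "opq" at the seams: that redundancy is the price you pay for not losing cross-boundary context.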
6.2 Security: do not expose the Ollama API casually
By default, Ollama serves its API only on localhost (this article assumes localhost as well). If you genuinely need cross-machine access, prefer:
- An SSH tunnel (simplest)
- A reverse proxy with access control in front (e.g. Basic Auth / IP allowlist / mTLS)
If you run Ollama under systemd, you can also set environment variables (such as OLLAMA_HOST) through a service override; see the official FAQ for examples of setting environment variables.
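As a sketch, a systemd override (created with `systemctl edit ollama`) that pins the listen address explicitly might look like the following; the value shown simply restates the default localhost binding, which is the safe choice unless you have a tunnel or proxy in front:

```ini
# /etc/systemd/system/ollama.service.d/override.conf
[Service]
Environment="OLLAMA_HOST=127.0.0.1:11434"
```

After editing, apply it with `systemctl daemon-reload && systemctl restart ollama`.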
6.3 Performance: keep models resident for faster responses
- This article uses /api/chat and /api/embed; both accept a keep_alive option to keep the model loaded, avoiding the cost of reloading it on every request.
- Document ingest is an offline job; schedule it off-hours (e.g. overnight) and keep business hours for ask only.
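keep_alive simply rides along in the request body. A sketch of what the /api/chat payload looks like with it set ("30m" keeps the model loaded for 30 minutes after the call; -1 keeps it loaded indefinitely):

```python
import json

# Illustrative payload; the "ping" message is just a placeholder
payload = {
    "model": "gemma3",
    "messages": [{"role": "user", "content": "ping"}],
    "stream": False,
    "keep_alive": "30m",  # duration string; -1 = keep loaded indefinitely
}
print(json.dumps(payload, indent=2))
```

In the article's script you would add the same key to the dicts built in ollama_chat and ollama_embed.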
7. Operations: Updates, Backups, Access Control
7.1 Updating Ollama
curl -fsSL https://ollama.com/install.sh | sh
7.2 Backing up the knowledge base
cp kb.sqlite3 kb.sqlite3.bak
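A plain cp is fine as long as nothing is writing to the file at that moment. To take a backup while the assistant may be running, Python's sqlite3 backup API copies the database consistently:

```python
import sqlite3

src = sqlite3.connect("kb.sqlite3")
# Demo only: ensure some schema exists so there is something to copy
src.execute("CREATE TABLE IF NOT EXISTS chunks (id INTEGER PRIMARY KEY, content TEXT)")
src.commit()

dst = sqlite3.connect("kb.sqlite3.bak")
src.backup(dst)  # consistent online copy, safe even with concurrent writers
dst.close()
src.close()
```

The backup API re-copies pages that change mid-copy, which is exactly the guarantee cp cannot give you on a live database.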
7.3 Access control
- Keep docs/ and kb.sqlite3 in a path readable only by administrators.
- For multiple users, wrap the assistant as an internal service with per-account access control, rather than letting everyone read all documents directly.
FAQ
Q1: Why not just use the chat model for embeddings?
Because embedding models are purpose-built for retrieval: more stable vector quality, faster, and cheaper to run. The chat model's job is to compose the wording of the final answer.
Q2: Is SQLite really enough?
For small to medium corpora (thousands to tens of thousands of chunks) it is usually fine. If you grow to millions of chunks, or need concurrent multi-user queries, consider a dedicated vector database, or at least split the retrieval layer into its own service.
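Before reaching for a vector database, one common intermediate step (an assumption of this sketch, not something the article's script does) is to score every chunk in a single vectorized pass instead of a Python loop. A sketch with numpy, assuming the embeddings are already loaded into one matrix:

```python
import numpy as np

def top_k_indices(query_emb, matrix, k=5):
    # Normalize the rows and the query, then one matrix-vector product
    q = np.asarray(query_emb, dtype=float)
    q = q / np.linalg.norm(q)
    m = np.asarray(matrix, dtype=float)
    m = m / np.linalg.norm(m, axis=1, keepdims=True)
    scores = m @ q  # cosine similarity against every chunk at once
    order = np.argsort(scores)[::-1][:k]
    return order.tolist(), scores[order].tolist()

matrix = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]  # toy 2-D "embeddings"
idx, scores = top_k_indices([1.0, 0.1], matrix, k=2)
print(idx)  # [0, 2]
```

This keeps SQLite as the store of record while moving only the hot scoring path to a vectorized implementation.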
Q3: How do I make answers read like technical documentation rather than chat?
Make the system prompt stricter: demand "bullet-point conclusions, steps, commands, risks, rollback procedures", and force it to end with the cited excerpt numbers.
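As an illustration (the wording here is a sketch, not the exact prompt used in the script above), a hardened system prompt might look like:

```python
STRICT_SYSTEM = (
    "You are a technical-documentation assistant. Answer ONLY from the "
    "provided excerpts.\n"
    "Structure every answer as: 1) Conclusion 2) Steps/commands "
    "3) Risks and rollback 4) Citations, e.g. [1][3].\n"
    "If the excerpts do not cover the question, reply exactly: "
    "'Insufficient excerpts; please add: ...' and nothing else."
)
print(STRICT_SYSTEM)
```

The fixed refusal sentence matters: it gives you a string you can grep for when evaluating how often the assistant admits it does not know.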
Finally, tune to your corpus: whether your documents are mainly SOPs, runbooks, Markdown notes, wikis, or HTML manuals, and whether you want answers to lean strict or conversational, should drive how you adjust the chunking rules, Top-K, and prompts.