Initial setup: Flask app with CI/CD
parent d257c3e753
commit 267f668663
4 changed files with 1005 additions and 1 deletion

.github/workflows/deploy.yml (new file, 90 additions)
@@ -0,0 +1,90 @@
name: Deploy to Infomaniak

on:
  push:
    branches:
      - main
  workflow_dispatch:  # manual trigger possible

env:
  APP_DIR: /opt/ollama-webapp
  SERVICE_NAME: ollama-webapp

jobs:
  deploy:
    runs-on: ubuntu-latest

    steps:
      # 1. Check out the code
      - name: Checkout code
        uses: actions/checkout@v4

      # 2. SSH setup
      - name: Setup SSH
        run: |
          mkdir -p ~/.ssh
          echo "${{ secrets.SSH_PRIVATE_KEY }}" > ~/.ssh/deploy_key
          chmod 600 ~/.ssh/deploy_key
          ssh-keyscan -H ${{ secrets.SERVER_HOST }} >> ~/.ssh/known_hosts

      # 3. Copy files to the server
      - name: Deploy files to server
        run: |
          rsync -avz --delete \
            -e "ssh -i ~/.ssh/deploy_key -o StrictHostKeyChecking=no" \
            --exclude '.git' \
            --exclude '.github' \
            --exclude '__pycache__' \
            --exclude '*.pyc' \
            --exclude 'venv' \
            --exclude '.env' \
            --exclude 'logs' \
            ./ ${{ secrets.SERVER_USER }}@${{ secrets.SERVER_HOST }}:${{ env.APP_DIR }}/app/

      # 4. Install dependencies and restart the service
      - name: Install dependencies and restart service
        run: |
          ssh -i ~/.ssh/deploy_key -o StrictHostKeyChecking=no \
            ${{ secrets.SERVER_USER }}@${{ secrets.SERVER_HOST }} << 'ENDSSH'

            echo "📦 Installing dependencies..."
            cd /opt/ollama-webapp
            ./venv/bin/pip install -r app/requirements.txt --quiet --upgrade

            echo "🔄 Restarting service..."
            sudo systemctl restart ollama-webapp

            echo "⏳ Waiting for service to start..."
            sleep 5

            echo "📊 Service status:"
            sudo systemctl status ollama-webapp --no-pager -l

            echo "✅ Deployment complete!"
          ENDSSH

      # 5. Health check
      - name: Health Check
        run: |
          echo "🏥 Running health check..."
          sleep 3

          HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" \
            http://${{ secrets.SERVER_HOST }}:5000/api/health || echo "000")

          if [ "$HTTP_STATUS" = "200" ]; then
            echo "✅ Health check passed! (HTTP $HTTP_STATUS)"
          else
            echo "❌ Health check failed! (HTTP $HTTP_STATUS)"
            exit 1
          fi

      # 6. Deployment summary
      - name: Deployment Summary
        if: success()
        run: |
          echo "🎉 Deployment successful!"
          echo ""
          echo "📍 App URL: http://${{ secrets.SERVER_HOST }}:5000"
          echo "📍 Health: http://${{ secrets.SERVER_HOST }}:5000/api/health"
          echo "📍 Ollama: http://${{ secrets.SERVER_HOST }}:5000/api/ollama/status"

README.md (22 changes)
@@ -1 +1,21 @@
-# private-llm
+# Private LLM - Belegscanner
+
+AI document analysis with local Ollama vision models.
+
+## Features
+
+- Analyze invoices, receipts, and bank statements
+- Recognize handwriting
+- PDF support
+- 100% local, no cloud APIs
+
+## Tech Stack
+
+- **Backend:** Python Flask
+- **AI:** Ollama Vision Models
+- **Server:** Infomaniak Swiss Cloud (GPU)
+
+## Deployment
+
+Automatic deployment via GitHub Actions on push to `main`.

@@ -3,3 +3,4 @@ flask-cors>=4.0.0
requests>=2.31.0
werkzeug>=3.0.0
pymupdf>=1.24.0
gunicorn>=21.0.0

setupserver.md (new file, 893 additions)
@@ -0,0 +1,893 @@
# Local LLM Server - Complete Setup Guide

From GitHub repo setup and Cursor through to automated deployment on Infomaniak.

---

## Access

## Server Data

- IP: 83.228.200.109
- Instance: `local-llm`, Ubuntu 24.04 LTS Noble Numbat, addresses 83.228.226.58 and 2001:1600:16:10::7e3, flavor `nvl4-a8-ram16-disk0`, key pair `ollama-deploy-key`, status Active, availability zone `az-1`
- Connect: `ssh -i "C:\Users\pmots\Downloads\ollama-deploy-key.pem" ubuntu@83.228.200.109`

## Overview

```
┌─────────────────┐                    ┌─────────────────┐
│     Cursor      │◄──── sync ────────▶│     GitHub      │
│   (local dev)   │                    │   private-llm   │
└─────────────────┘                    └────────┬────────┘
                                                │
                                                │ Push to main
                                                ▼
                                       ┌─────────────────┐
                                       │ GitHub Actions  │
                                       └────────┬────────┘
                                                │
                                                │ SSH Deploy
                                                ▼
                                       ┌─────────────────┐
                                       │ Infomaniak GPU  │
                                       │     Server      │
                                       │  ┌───────────┐  │
                                       │  │  Ollama   │  │
                                       │  │  + Flask  │  │
                                       │  │  (LLM +   │  │
                                       │  │  Vision)  │  │
                                       │  └───────────┘  │
                                       └─────────────────┘
```

---

# Part A: GitHub Repository Setup

## A.1 Clone the repository in Cursor

Open a terminal in Cursor or locally:

```bash
# Change to your projects folder
cd ~/Projects  # or wherever you keep your projects

# Clone the repository
git clone https://github.com/valueonag/private-llm.git

# Change into the folder
cd private-llm
```

## A.2 Connect Cursor to the repo

**Option 1: Open the folder in Cursor**
1. Open Cursor
2. **File → Open Folder**
3. Select the `private-llm` folder

**Option 2: From the terminal**
```bash
cd ~/Projects/private-llm
cursor .
```

## A.3 Create the project structure

Create the following structure:

```
private-llm/
├── app.py                  # Your Flask app
├── requirements.txt        # Python dependencies
├── templates/
│   └── index.html          # Frontend template
├── static/                 # CSS, JS, images (optional)
├── .github/
│   └── workflows/
│       └── deploy.yml      # CI/CD pipeline
├── .gitignore
└── README.md
```

### A.3.1 requirements.txt

Create `requirements.txt`:

```txt
flask>=3.0.0
flask-cors>=4.0.0
requests>=2.31.0
pymupdf>=1.24.0
gunicorn>=21.0.0
```

### A.3.2 .gitignore

Create `.gitignore`:

```gitignore
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
venv/
env/
.venv/

# IDE
.idea/
.vscode/
*.swp
*.swo
.cursor/

# OS
.DS_Store
Thumbs.db

# Logs
*.log
logs/

# Environment
.env
.env.local

# Test
.pytest_cache/
.coverage
htmlcov/
```

### A.3.3 README.md

Create `README.md`:

```markdown
# Private LLM - Belegscanner

AI document analysis with local Ollama vision models.

## Features

- Analyze invoices, receipts, and bank statements
- Recognize handwriting
- PDF support
- 100% local, no cloud APIs

## Tech Stack

- **Backend:** Python Flask
- **AI:** Ollama Vision Models
- **Server:** Infomaniak Swiss Cloud (GPU)

## Deployment

Automatic deployment via GitHub Actions on push to `main`.
```

---

# Part B: GitHub Actions Deploy Workflow

## B.1 Create the workflow file

Create `.github/workflows/deploy.yml`:

```yaml
name: Deploy to Infomaniak

on:
  push:
    branches:
      - main
  workflow_dispatch:  # manual trigger possible

env:
  APP_DIR: /opt/ollama-webapp
  SERVICE_NAME: ollama-webapp

jobs:
  deploy:
    runs-on: ubuntu-latest

    steps:
      # 1. Check out the code
      - name: Checkout code
        uses: actions/checkout@v4

      # 2. SSH setup
      - name: Setup SSH
        run: |
          mkdir -p ~/.ssh
          echo "${{ secrets.SSH_PRIVATE_KEY }}" > ~/.ssh/deploy_key
          chmod 600 ~/.ssh/deploy_key
          ssh-keyscan -H ${{ secrets.SERVER_HOST }} >> ~/.ssh/known_hosts

      # 3. Copy files to the server
      - name: Deploy files to server
        run: |
          rsync -avz --delete \
            -e "ssh -i ~/.ssh/deploy_key -o StrictHostKeyChecking=no" \
            --exclude '.git' \
            --exclude '.github' \
            --exclude '__pycache__' \
            --exclude '*.pyc' \
            --exclude 'venv' \
            --exclude '.env' \
            --exclude 'logs' \
            ./ ${{ secrets.SERVER_USER }}@${{ secrets.SERVER_HOST }}:${{ env.APP_DIR }}/app/

      # 4. Install dependencies and restart the service
      - name: Install dependencies and restart service
        run: |
          ssh -i ~/.ssh/deploy_key -o StrictHostKeyChecking=no \
            ${{ secrets.SERVER_USER }}@${{ secrets.SERVER_HOST }} << 'ENDSSH'

            echo "📦 Installing dependencies..."
            cd /opt/ollama-webapp
            ./venv/bin/pip install -r app/requirements.txt --quiet --upgrade

            echo "🔄 Restarting service..."
            sudo systemctl restart ollama-webapp

            echo "⏳ Waiting for service to start..."
            sleep 5

            echo "📊 Service status:"
            sudo systemctl status ollama-webapp --no-pager -l

            echo "✅ Deployment complete!"
          ENDSSH

      # 5. Health check
      - name: Health Check
        run: |
          echo "🏥 Running health check..."
          sleep 3

          HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" \
            http://${{ secrets.SERVER_HOST }}:5000/api/health || echo "000")

          if [ "$HTTP_STATUS" = "200" ]; then
            echo "✅ Health check passed! (HTTP $HTTP_STATUS)"
          else
            echo "❌ Health check failed! (HTTP $HTTP_STATUS)"
            exit 1
          fi

      # 6. Deployment summary
      - name: Deployment Summary
        if: success()
        run: |
          echo "🎉 Deployment successful!"
          echo ""
          echo "📍 App URL: http://${{ secrets.SERVER_HOST }}:5000"
          echo "📍 Health: http://${{ secrets.SERVER_HOST }}:5000/api/health"
          echo "📍 Ollama: http://${{ secrets.SERVER_HOST }}:5000/api/ollama/status"
```
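
The health-check step in the workflow waits a fixed 3 seconds and then probes `/api/health` once. A minimal sketch of the same check with retries, written in Python, may help if the service sometimes needs longer to start. The names `http_status` and `wait_for_health`, the retry count, and the injectable `fetch` and `sleep` parameters are assumptions for illustration and are not part of the workflow:

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Return the HTTP status code for url, or 0 if no response was received."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, timeout, DNS failure, ...


def wait_for_health(url, attempts=10, delay=3.0, fetch=http_status, sleep=time.sleep):
    """Poll url until it answers 200; return True on success, False otherwise."""
    for attempt in range(1, attempts + 1):
        if fetch(url) == 200:
            return True
        if attempt < attempts:
            sleep(delay)  # wait before the next probe
    return False
```

Called as `wait_for_health("http://YOUR-IP:5000/api/health")`, this gives the service up to roughly half a minute to come up instead of failing on the first refused connection.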

## B.2 Set up GitHub Secrets

1. Go to: `https://github.com/valueonag/private-llm/settings/secrets/actions`
2. Click **New repository secret**
3. Create these 3 secrets:

| Secret Name | Value | Description |
|-------------|-------|-------------|
| `SERVER_HOST` | `185.xxx.xxx.xxx` | Your server IP |
| `SERVER_USER` | `ubuntu` | SSH user |
| `SSH_PRIVATE_KEY` | `-----BEGIN OPENSSH...` | The private key (entire contents) |

---

# Part C: Infomaniak Server Setup

## C.1 Horizon Dashboard Login

1. Open: https://api.pub1.infomaniak.cloud/horizon
2. Login details:

| Field | Value |
|-------|-------|
| **Domain** | `PCU-MPXPVCR` |
| **User Name** | `PCU-MPXPVCR` |
| **Password** | Your OpenStack password |

---

## C.2 Create an SSH key pair

1. Go to: **Compute → Key Pairs**
2. Click: **Create Key Pair**
3. Fill in:

| Field | Value |
|-------|-------|
| **Key Pair Name** | `ollama-deploy-key` |
| **Key Type** | `SSH Key` |

4. Click **Create Key Pair**
5. ⚠️ **IMPORTANT:** The `.pem` file is downloaded automatically
   - Store it safely (e.g. `~/Downloads/ollama-deploy-key.pem`)
   - You will need it later for SSH access!

---

## C.3 Create a security group

1. Go to: **Network → Security Groups**
2. Click: **Create Security Group**
3. Fill in:

| Field | Value |
|-------|-------|
| **Name** | `ollama-webapp` |
| **Description** | `Ports for Ollama and Flask app` |

4. Click **Create Security Group**
5. In the list: click **Manage Rules** next to `ollama-webapp`
6. Click **Add Rule** and create these rules:

| Rule | Direction | Ether Type | IP Protocol | Port Range | CIDR |
|------|-----------|------------|-------------|------------|------|
| 1 | Ingress | IPv4 | TCP | 22 | `0.0.0.0/0` |
| 2 | Ingress | IPv4 | TCP | 80 | `0.0.0.0/0` |
| 3 | Ingress | IPv4 | TCP | 443 | `0.0.0.0/0` |
| 4 | Ingress | IPv4 | TCP | 5000 | `0.0.0.0/0` |
| 5 | Ingress | IPv4 | TCP | 11434 | `0.0.0.0/0` |

**For each rule:**
- Click **Add Rule**
- Direction: `Ingress`
- Under "Rule", select: `Custom TCP Rule`
- Port: enter the respective port number
- CIDR: `0.0.0.0/0`
- Click **Add**
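
The rule table can also be written down as data, which makes it easy to see which ports are reachable from any address. A small sketch: the `Rule` type and the `world_open_ports` helper are hypothetical names, and the purposes attached to each port (5000 for the gunicorn-served app, 11434 as Ollama's default API port) are annotations added here, not values entered in Horizon:

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    port: int
    cidr: str
    purpose: str


# The five ingress rules from the table above.
RULES = [
    Rule(22, "0.0.0.0/0", "SSH"),
    Rule(80, "0.0.0.0/0", "HTTP"),
    Rule(443, "0.0.0.0/0", "HTTPS"),
    Rule(5000, "0.0.0.0/0", "Flask app (gunicorn)"),
    Rule(11434, "0.0.0.0/0", "Ollama API"),
]


def world_open_ports(rules):
    """Return the ports whose CIDR covers the entire IPv4 address space."""
    return [r.port for r in rules if ipaddress.ip_network(r.cidr).prefixlen == 0]
```

With the rules as listed, every port is returned. Narrowing a CIDR, for example to a single address with a `/32` suffix, removes that port from the result.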

---

## C.4 Create the GPU instance

1. Go to: **Compute → Instances**
2. Click: **Launch Instance**

### Tab 1: Details

| Field | Value |
|-------|-------|
| **Instance Name** | `local-llm` |
| **Description** | `Local LLM Server` |
| **Availability Zone** | `nova` (leave the default) |
| **Count** | `1` |

→ Click **Next**

### Tab 2: Source

| Field | Value |
|-------|-------|
| **Select Boot Source** | `Image` |
| **Create New Volume** | `Yes` ✓ |
| **Volume Size (GB)** | `150` |
| **Delete Volume on Instance Delete** | `No` ✗ |

**Select the image:**
1. In the lower list "Available", find: `Ubuntu 24.04 LTS Noble Numbat`
2. Click the **↑ arrow** to its right
3. The image appears at the top under "Allocated"

→ Click **Next**

### Tab 3: Flavor

**Select a GPU flavor:**

In the "Available" list, look for GPU flavors (names start with `nvl4-`, `t4-`, or `a2-`):

| Flavor | GPU | VRAM | vCPUs | RAM | Recommendation |
|--------|-----|------|-------|-----|----------------|
| `nvl4-12-46-0` | L4 | 24GB | 12 | 46GB | ✓ Best choice |
| `t4-8-32-0` | T4 | 16GB | 8 | 32GB | Budget |
| `a2-8-32-0` | A2 | 16GB | 8 | 32GB | Alternative |

1. Find the flavor you want (e.g. with "L4" or "nvl4" in the name)
2. Click the **↑ arrow** to its right
3. The flavor appears at the top under "Allocated"

→ Click **Next**

### Tab 4: Networks

| Field | Value |
|-------|-------|
| **Network** | `ext-net1` |

1. In the "Available" list, find: `ext-net1`
2. Click the **↑ arrow**
3. `ext-net1` appears under "Allocated"

→ Click **Next**

### Tab 5: Network Ports

Skip this tab (leave it empty).

→ Click **Next**

### Tab 6: Security Groups

1. If `default` is listed under "Allocated": click the **↓ arrow** to remove it
2. In "Available", find: `ollama-webapp`
3. Click the **↑ arrow**
4. Only `ollama-webapp` should be listed under "Allocated"

→ Click **Next**

### Tab 7: Key Pair

1. In "Available", find: `ollama-deploy-key`
2. Click the **↑ arrow**
3. "Allocated" now shows: `ollama-deploy-key`

→ Click **Next**

### Tab 8: Configuration (optional)

Skip this tab (leave it empty).

→ Click **Next**

### Tab 9: Server Groups (optional)

Skip this tab (leave it empty).

→ Click **Next**

### Tab 10: Scheduler Hints (optional)

Skip this tab (leave it empty).

→ Click **Next**

### Tab 11: Metadata (optional)

Skip this tab (leave it empty).

---

### Launch the instance

Click: **Launch Instance**

⏳ Wait until the status changes from `Build` to `Active` (about 1-3 minutes).

---

## C.5 Assign a floating IP

The instance needs a public IP address:

1. Go to: **Network → Floating IPs**
2. Click: **Allocate IP to Project**
3. Select:

| Field | Value |
|-------|-------|
| **Pool** | `ext-net1` |
| **Description** | `IP for Ollama server` |

4. Click **Allocate IP**
5. In the list: click **Associate** next to the new IP
6. Select:

| Field | Value |
|-------|-------|
| **Port to be associated** | `local-llm` (your instance) |

7. Click **Associate**

📝 **Write down the IP address** (e.g. `185.132.xxx.xxx`). You need it for:
- SSH access
- GitHub Secrets
- Browser access to the app

---

## C.6 Summary of your instance

After a successful setup you have:

| Component | Value |
|-----------|-------|
| **Instance Name** | `local-llm` |
| **Description** | `Local LLM Server` |
| **Image** | Ubuntu 24.04 LTS |
| **Flavor** | GPU with L4/T4/A2 |
| **Disk** | 150 GB |
| **Network** | `ext-net1` |
| **Security Group** | `ollama-webapp` |
| **Key Pair** | `ollama-deploy-key` |
| **Floating IP** | `185.xxx.xxx.xxx` |

## C.7 Server base setup

SSH to the server:

```bash
ssh -i ~/Downloads/ollama-deploy-key.pem ubuntu@185.xxx.xxx.xxx
```

### Update the system

```bash
sudo apt update && sudo apt upgrade -y
```

### Install the NVIDIA driver

```bash
sudo apt install -y nvidia-driver-550
sudo reboot
```

After the reboot, reconnect and verify:

```bash
nvidia-smi
```

### Install Ollama

```bash
# Install
curl -fsSL https://ollama.com/install.sh | sh

# Configure
sudo systemctl edit ollama
```

Insert:

```ini
[Service]
Environment="OLLAMA_HOST=0.0.0.0"
Environment="OLLAMA_ORIGINS=*"
Environment="OLLAMA_NUM_PARALLEL=4"
Environment="OLLAMA_MAX_LOADED_MODELS=2"
```

```bash
sudo systemctl restart ollama
sudo systemctl enable ollama
```

### Download models

```bash
ollama pull granite3.2-vision
ollama pull qwen2.5vl:7b
ollama pull deepseek-ocr
```
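
Besides `ollama list`, the Ollama HTTP API can confirm that the service is listening and which models it has stored. A minimal sketch that reads the `/api/tags` endpoint; it assumes Ollama's default port 11434 and a response of the form `{"models": [{"name": ...}]}`, and the function names are illustrative:

```python
import json
import urllib.request


def parse_model_names(payload):
    """Extract model names from the JSON text returned by GET /api/tags."""
    data = json.loads(payload)
    return [m["name"] for m in data.get("models", [])]


def list_models(base_url="http://localhost:11434", timeout=5.0):
    """Ask a running Ollama instance which models it has available locally."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return parse_model_names(resp.read().decode("utf-8"))
```

Run on the server, `list_models()` should include the names pulled above once the downloads have finished.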

## C.8 Prepare the Python environment

```bash
# Install packages
sudo apt install -y python3-pip python3-venv git

# Create the app directory
sudo mkdir -p /opt/ollama-webapp/{app,venv,logs}
sudo chown -R ubuntu:ubuntu /opt/ollama-webapp

# Create the virtual environment
python3 -m venv /opt/ollama-webapp/venv

# Install base packages
/opt/ollama-webapp/venv/bin/pip install --upgrade pip
/opt/ollama-webapp/venv/bin/pip install flask flask-cors requests pymupdf gunicorn
```
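
To confirm that the virtual environment really contains these packages, their import names can be probed without importing them. A sketch using `importlib.util.find_spec`; the mapping from pip package to import name (for example `flask-cors` to `flask_cors` and `pymupdf` to `fitz`) is stated as an assumption to check against your installed versions, and `missing_packages` is a hypothetical helper:

```python
import importlib.util

# pip package name -> module name used in `import` statements (assumed mapping)
REQUIRED = {
    "flask": "flask",
    "flask-cors": "flask_cors",
    "requests": "requests",
    "pymupdf": "fitz",
    "gunicorn": "gunicorn",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose module cannot be found by the interpreter."""
    return [pkg for pkg, module in required.items()
            if importlib.util.find_spec(module) is None]
```

Saved to a file and run with `/opt/ollama-webapp/venv/bin/python`, a call to `missing_packages()` returns an empty list when everything is installed.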

## C.9 Create the deploy SSH key (for GitHub Actions)

```bash
# Create the key
ssh-keygen -t ed25519 -C "github-actions-deploy" -f ~/.ssh/github_deploy_key -N ""

# Add it to authorized_keys
cat ~/.ssh/github_deploy_key.pub >> ~/.ssh/authorized_keys

# Show the private key - COPY THIS INTO GITHUB SECRETS!
echo ""
echo "=========================================="
echo "SAVE THIS KEY IN GITHUB AS 'SSH_PRIVATE_KEY':"
echo "=========================================="
cat ~/.ssh/github_deploy_key
echo ""
echo "=========================================="
```

**Copy the complete private key** (including `-----BEGIN...` and `-----END...`) and save it as the GitHub Secret `SSH_PRIVATE_KEY`.
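
One thing that can go wrong here is a secret that lost its first or last line during copy and paste. A rough sanity check, assuming the OpenSSH private key framing that `ssh-keygen -t ed25519` writes; `looks_like_openssh_key` is a hypothetical helper that only inspects the framing lines, not the key material itself:

```python
BEGIN = "-----BEGIN OPENSSH PRIVATE KEY-----"
END = "-----END OPENSSH PRIVATE KEY-----"


def looks_like_openssh_key(text):
    """Check that text has the BEGIN/END lines and a non-empty body between them."""
    lines = [line.strip() for line in text.strip().splitlines()]
    if len(lines) < 3:
        return False  # need at least BEGIN, one body line, END
    return lines[0] == BEGIN and lines[-1] == END and all(lines[1:-1])
```

Run it locally against the text you are about to paste; never print or log the key itself.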

## C.10 Create the systemd service

```bash
sudo nano /etc/systemd/system/ollama-webapp.service
```

Contents:

```ini
[Unit]
Description=Belegscanner Flask App
After=network.target ollama.service
Wants=ollama.service

[Service]
Type=simple
User=ubuntu
Group=ubuntu
WorkingDirectory=/opt/ollama-webapp/app
Environment="PATH=/opt/ollama-webapp/venv/bin:/usr/bin"
Environment="FLASK_ENV=production"
ExecStart=/opt/ollama-webapp/venv/bin/gunicorn \
    --bind 0.0.0.0:5000 \
    --workers 2 \
    --timeout 3600 \
    --access-logfile /opt/ollama-webapp/logs/access.log \
    --error-logfile /opt/ollama-webapp/logs/error.log \
    app:app
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```
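
The `app:app` argument tells gunicorn to import the module `app` (here `app.py` in the working directory) and serve the WSGI callable named `app` inside it. The real application is a Flask app whose code is not part of this guide. As a stand-in, the following standard-library sketch shows about the smallest callable that would satisfy both gunicorn and the workflow's `/api/health` check; the JSON bodies are assumptions:

```python
import json


def app(environ, start_response):
    """Minimal WSGI callable: answers /api/health with 200, everything else with 404."""
    if environ.get("PATH_INFO") == "/api/health":
        status, payload = "200 OK", {"status": "ok"}
    else:
        status, payload = "404 Not Found", {"error": "not found"}
    body = json.dumps(payload).encode("utf-8")
    start_response(status, [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

A Flask instance is itself a WSGI callable, which is why `app = Flask(__name__)` in `app.py` works with the same `app:app` argument.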

Enable it:

```bash
sudo systemctl daemon-reload
sudo systemctl enable ollama-webapp
```

## C.11 Sudo rights for GitHub Actions

```bash
sudo visudo
```

Add at the end. A sudoers entry that lists arguments only matches that exact command line, so the `status ... --no-pager -l` call used by the workflow gets its own entry:

```
ubuntu ALL=(ALL) NOPASSWD: /bin/systemctl restart ollama-webapp
ubuntu ALL=(ALL) NOPASSWD: /bin/systemctl status ollama-webapp
ubuntu ALL=(ALL) NOPASSWD: /bin/systemctl status ollama-webapp --no-pager -l
ubuntu ALL=(ALL) NOPASSWD: /bin/systemctl stop ollama-webapp
ubuntu ALL=(ALL) NOPASSWD: /bin/systemctl start ollama-webapp
```

## C.12 Create the templates folder

If your Flask app uses templates:

```bash
mkdir -p /opt/ollama-webapp/app/templates
mkdir -p /opt/ollama-webapp/app/static
```

---

# Part D: First Commit and Deploy

## D.1 In Cursor: commit the code

Open a terminal in Cursor (in the `private-llm` folder):

```bash
# Check the status
git status

# Add all files
git add .

# Create the commit
git commit -m "Initial setup: Flask app with CI/CD"

# Push to GitHub
git push origin main
```

## D.2 Watch GitHub Actions

1. Go to: `https://github.com/valueonag/private-llm/actions`
2. You should see a running workflow
3. Click it to follow the progress

## D.3 Test the app

After a successful deploy:

```bash
# Health check
curl http://185.xxx.xxx.xxx:5000/api/health

# Ollama status
curl http://185.xxx.xxx.xxx:5000/api/ollama/status

# Open in the browser (macOS; on Linux use xdg-open)
open http://185.xxx.xxx.xxx:5000
```
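
The same checks can be scripted. A sketch that builds the three URLs from a host and reports which ones answer with HTTP 200; `endpoint_urls`, `get_status`, and `smoke_test` are illustrative names, and the `fetch` argument is any function that returns a status code for a URL:

```python
import urllib.error
import urllib.request

# Paths of the three endpoints referenced in this guide.
PATHS = {"app": "/", "health": "/api/health", "ollama": "/api/ollama/status"}


def endpoint_urls(host, port=5000):
    """Build the app, health, and Ollama status URLs for a given host."""
    return {name: f"http://{host}:{port}{path}" for name, path in PATHS.items()}


def get_status(url, timeout=5.0):
    """Return the HTTP status code for url, or 0 if no response was received."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
    except (urllib.error.URLError, OSError):
        return 0


def smoke_test(host, fetch=get_status):
    """Return {endpoint name: True/False} depending on whether it answered 200."""
    return {name: fetch(url) == 200 for name, url in endpoint_urls(host).items()}
```

Calling `smoke_test("185.xxx.xxx.xxx")` with your real IP returns one boolean per endpoint, which is easier to scan than three separate `curl` outputs.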

---

# Part E: Development Workflow

## E.1 Local development in Cursor

```bash
# Create the virtual environment (once)
cd ~/Projects/private-llm
python3 -m venv venv
source venv/bin/activate  # Mac/Linux
# or: .\venv\Scripts\activate  # Windows

# Install dependencies
pip install -r requirements.txt

# Start locally (with a local Ollama)
python app.py
```

## E.2 Deploy changes

```bash
# Stage the changes
git add .

# Commit with a description
git commit -m "Feature: New function XYZ"

# Push to GitHub → automatic deploy!
git push origin main
```

## E.3 Cursor Git integration

In Cursor you can also use the GUI:

1. **Source Control** tab (left sidebar)
2. Review the changes
3. **+** to stage files
4. Enter a commit message
5. **✓** to commit
6. **...** → **Push** to push

---

# Part F: Useful Commands

## Server commands

```bash
# SSH to the server
ssh -i ~/Downloads/ollama-deploy-key.pem ubuntu@185.xxx.xxx.xxx

# Service status
sudo systemctl status ollama-webapp
sudo systemctl status ollama

# View logs
tail -f /opt/ollama-webapp/logs/access.log
tail -f /opt/ollama-webapp/logs/error.log
sudo journalctl -u ollama-webapp -f

# Restart the service
sudo systemctl restart ollama-webapp

# GPU status
nvidia-smi

# Ollama models
ollama list
ollama pull <model>
```

## Git commands

```bash
# Status
git status

# Show changes
git diff

# Commit and push
git add .
git commit -m "Description"
git push origin main

# Pull from GitHub (in case someone else has pushed)
git pull origin main
```

---

# Part G: Troubleshooting

## Deploy fails

1. **Check the GitHub Actions tab** and read the error message
2. **Test SSH:**
   ```bash
   ssh -i ~/.ssh/github_deploy_key ubuntu@185.xxx.xxx.xxx
   ```
3. **Check the secrets:** are all 3 secrets correct?

## App does not start

```bash
# On the server:
sudo systemctl status ollama-webapp -l
cat /opt/ollama-webapp/logs/error.log
```

## Ollama not reachable

```bash
# Check the status
sudo systemctl status ollama

# Restart
sudo systemctl restart ollama

# Logs
sudo journalctl -u ollama -f
```

---

# Checklist

## GitHub Setup
- [ ] Repository cloned
- [ ] Cursor connected to the repo
- [ ] `requirements.txt` created
- [ ] `.gitignore` created
- [ ] `.github/workflows/deploy.yml` created
- [ ] GitHub Secrets configured:
  - [ ] `SERVER_HOST`
  - [ ] `SERVER_USER`
  - [ ] `SSH_PRIVATE_KEY`

## Server Setup
- [ ] GPU instance created
- [ ] Floating IP assigned
- [ ] NVIDIA driver installed
- [ ] Ollama installed and configured
- [ ] Models downloaded
- [ ] Python venv created
- [ ] Deploy SSH key created
- [ ] Systemd service created
- [ ] Sudo rights configured

## First Deploy
- [ ] Code committed and pushed
- [ ] GitHub Actions run succeeded
- [ ] Health check works
- [ ] App reachable in the browser

---

# URLs

| Service | URL |
|---------|-----|
| App | `http://YOUR-IP:5000` |
| Health Check | `http://YOUR-IP:5000/api/health` |
| Ollama Status | `http://YOUR-IP:5000/api/ollama/status` |
| GitHub Repo | `https://github.com/valueonag/private-llm` |
| GitHub Actions | `https://github.com/valueonag/private-llm/actions` |