# Local LLM Server - Complete Setup Guide

From GitHub repository setup and Cursor to automated deployment on Infomaniak.

---

## Access

## Server Data

IP: 83.228.200.109

Instance: local-llm, Ubuntu 24.04 LTS Noble Numbat, 83.228.226.58, 2001:1600:16:10::7e3, nvl4-a8-ram16-disk0, ollama-deploy-key, Active, az-1

Connect: `ssh -i "C:\Users\pmots\Downloads\ollama-deploy-key.pem" ubuntu@83.228.200.109`

## Overview

```
┌─────────────────┐                     ┌─────────────────┐
│     Cursor      │◄────── sync ───────▶│     GitHub      │
│  (local dev)    │                     │   private-llm   │
└─────────────────┘                     └────────┬────────┘
                                                 │
                                                 │ Push to main
                                                 ▼
                                        ┌─────────────────┐
                                        │ GitHub Actions  │
                                        └────────┬────────┘
                                                 │
                                                 │ SSH Deploy
                                                 ▼
                                        ┌─────────────────┐
                                        │ Infomaniak GPU  │
                                        │ 83.228.200.109  │
                                        │  ┌───────────┐  │
                                        │  │  Ollama   │  │
                                        │  │  + Flask  │  │
                                        │  │  (LLM +   │  │
                                        │  │  Vision)  │  │
                                        │  └───────────┘  │
                                        └─────────────────┘
```


---

## Final Configuration

| Component | Value |
|-----------|-------|
| **Server IP** | `83.228.200.109` |
| **GPU** | NVIDIA L4 (24 GB VRAM) |
| **OS** | Ubuntu 24.04 LTS |
| **SSH User** | `ubuntu` |
| **App Port** | `5000` |
| **Ollama Port** | `11434` |
| **GitHub Repo** | `https://github.com/valueonag/private-llm` |

### Installed Models

| Model | Use |
|-------|-----|
| `granite3.2-vision` | Invoices, receipts, documents |
| `qwen2.5vl:7b` | Handwriting |
| `deepseek-ocr` | OCR / text extraction |

### URLs

| Service | URL |
|---------|-----|
| **App** | http://83.228.200.109:5000 |
| **Health Check** | http://83.228.200.109:5000/api/health |
| **Ollama Status** | http://83.228.200.109:5000/api/ollama/status |


---

# Part A: GitHub Repository Setup

## A.1 Clone the Repository in Cursor

```bash
cd ~/Projects
git clone https://github.com/valueonag/private-llm.git
cd private-llm
cursor .
```

## A.2 Project Structure

```
private-llm/
├── app.py                  # Flask app
├── requirements.txt        # Python dependencies
├── templates/
│   └── index.html          # Frontend template
├── static/                 # CSS, JS, images (optional)
├── .github/
│   └── workflows/
│       └── deploy.yml      # CI/CD pipeline
├── .gitignore
└── README.md
```

## A.3 Important Files

### requirements.txt

```txt
flask>=3.0.0
flask-cors>=4.0.0
requests>=2.31.0
pymupdf>=1.24.0
gunicorn>=21.0.0
```

### .gitignore

```gitignore
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
venv/
env/
.venv/

# IDE
.idea/
.vscode/
*.swp
*.swo
.cursor/

# OS
.DS_Store
Thumbs.db

# Logs
*.log
logs/

# Environment
.env
.env.local

# Test
.pytest_cache/
.coverage
htmlcov/
```


---

# Part B: GitHub Actions Deploy Workflow

## B.1 Set Up the GitHub Secret

1. Go to: `https://github.com/valueonag/private-llm/settings/secrets/actions`
2. Click **New repository secret**
3. Name: `SSH_PRIVATE_KEY`
4. Value: the contents of the `ollama-deploy-key.pem` file

**This is the only secret required.**

## B.2 Deploy Workflow (.github/workflows/deploy.yml)

```yaml
name: Deploy to Infomaniak

on:
  push:
    branches:
      - main
  workflow_dispatch:

env:
  APP_DIR: /opt/ollama-webapp
  SERVICE_NAME: ollama-webapp
  SERVER_HOST: 83.228.200.109
  SERVER_USER: ubuntu

jobs:
  deploy:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Setup SSH
        run: |
          mkdir -p ~/.ssh
          echo "${{ secrets.SSH_PRIVATE_KEY }}" > ~/.ssh/deploy_key
          chmod 600 ~/.ssh/deploy_key
          ssh-keyscan -H ${{ env.SERVER_HOST }} >> ~/.ssh/known_hosts

      - name: Deploy files to server
        run: |
          rsync -avz --delete \
            -e "ssh -i ~/.ssh/deploy_key -o StrictHostKeyChecking=no" \
            --exclude '.git' \
            --exclude '.github' \
            --exclude '__pycache__' \
            --exclude '*.pyc' \
            --exclude 'venv' \
            --exclude '.env' \
            --exclude 'logs' \
            ./ ${{ env.SERVER_USER }}@${{ env.SERVER_HOST }}:${{ env.APP_DIR }}/app/

      - name: Install dependencies and restart service
        run: |
          ssh -i ~/.ssh/deploy_key -o StrictHostKeyChecking=no \
            ${{ env.SERVER_USER }}@${{ env.SERVER_HOST }} << 'ENDSSH'

          echo "Installing dependencies..."
          cd /opt/ollama-webapp
          ./venv/bin/pip install -r app/requirements.txt --quiet --upgrade

          echo "Restarting service..."
          sudo systemctl restart ollama-webapp

          echo "Waiting for service to start..."
          sleep 5

          echo "Service status:"
          sudo systemctl status ollama-webapp --no-pager -l

          echo "Deployment complete!"
          ENDSSH

      - name: Health Check
        run: |
          echo "Running health check..."
          sleep 3

          HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" \
            http://${{ env.SERVER_HOST }}:5000/api/health || echo "000")

          if [ "$HTTP_STATUS" = "200" ]; then
            echo "Health check passed! (HTTP $HTTP_STATUS)"
          else
            echo "Health check failed! (HTTP $HTTP_STATUS)"
            exit 1
          fi

      - name: Deployment Summary
        if: success()
        run: |
          echo "Deployment successful!"
          echo ""
          echo "App URL: http://${{ env.SERVER_HOST }}:5000"
          echo "Health:  http://${{ env.SERVER_HOST }}:5000/api/health"
```
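
The health-check step in the workflow waits a fixed few seconds and probes the endpoint once. As an illustration only, the same check could be written with retries. This is a minimal sketch using the Python standard library; it assumes, as the workflow does, that `/api/health` answers with HTTP 200 once the app is ready. The function names and the retry parameters are hypothetical and not part of the repository.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=5.0):
    """Return the HTTP status code for url, or 0 if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, timeout, DNS failure, ...


def wait_until_healthy(url, attempts=5, delay=3.0,
                       fetch=fetch_status, sleep=time.sleep):
    """Poll url until it returns 200 or the attempts are used up."""
    for attempt in range(1, attempts + 1):
        if fetch(url) == 200:
            return True
        if attempt < attempts:
            sleep(delay)  # give gunicorn time to come up before retrying
    return False
```

Calling `wait_until_healthy("http://83.228.200.109:5000/api/health")` returns `True` as soon as one probe succeeds. The `fetch` and `sleep` parameters exist only so the loop can be exercised without a network.
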

---

# Part C: Infomaniak Server Setup

## C.1 Horizon Dashboard Login

1. Open: https://api.pub1.infomaniak.cloud/horizon
2. Login details:

| Field | Value |
|-------|-------|
| **Domain** | `PCU-MPXPVCR` |
| **User Name** | `PCU-MPXPVCR` |
| **Password** | Your OpenStack password |

---

## C.2 Create an SSH Key Pair

1. Go to: **Compute → Key Pairs**
2. Click: **Create Key Pair**
3. Fill in:

| Field | Value |
|-------|-------|
| **Key Pair Name** | `ollama-deploy-key` |
| **Key Type** | `SSH Key` |

4. Click **Create Key Pair**
5. **IMPORTANT:** The `.pem` file downloads automatically. Store it somewhere safe!


---

## C.3 Create a Security Group

1. Go to: **Network → Security Groups**
2. Click: **Create Security Group**
3. Fill in:

| Field | Value |
|-------|-------|
| **Name** | `ollama-webapp` |
| **Description** | `Ports for Ollama and Flask app` |

4. Click **Create Security Group**
5. In the list, click **Manage Rules** next to `ollama-webapp`
6. Click **Add Rule** and create these rules:

| Rule | Direction | Ether Type | IP Protocol | Port Range | CIDR |
|------|-----------|------------|-------------|------------|------|
| 1 | Ingress | IPv4 | TCP | 22 | `0.0.0.0/0` |
| 2 | Ingress | IPv4 | TCP | 80 | `0.0.0.0/0` |
| 3 | Ingress | IPv4 | TCP | 443 | `0.0.0.0/0` |
| 4 | Ingress | IPv4 | TCP | 5000 | `0.0.0.0/0` |
| 5 | Ingress | IPv4 | TCP | 11434 | `0.0.0.0/0` |

**For each rule:**
- Click **Add Rule**
- Direction: `Ingress`
- Under "Rule", select `Custom TCP Rule`
- Port: enter the respective port number
- CIDR: `0.0.0.0/0`
- Click **Add**


---

## C.4 Create a Private Network

Infomaniak provides only external networks by default, so a private network has to be created.

### Step 1: Create the Network

1. **Network → Networks → Create Network**

| Tab | Field | Value |
|-----|-------|-------|
| **Network** | Network Name | `private-network` |
| | Enable Admin State | Yes |
| | Create Subnet | Yes |
| **Subnet** | Subnet Name | `private-subnet` |
| | Network Address | `192.168.100.0/24` |
| | IP Version | `IPv4` |
| | Gateway IP | `192.168.100.1` |
| **Subnet Details** | Enable DHCP | Yes |
| | DNS Name Servers | `8.8.8.8` |

2. Click **Create**
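
The subnet values in the table can be sanity-checked before typing them into Horizon. The following is a small sketch using Python's standard `ipaddress` module; the helper name `describe_subnet` is hypothetical, and the addresses are the ones from the table above.

```python
import ipaddress


def describe_subnet(network, gateway_ip):
    """Summarize a subnet and confirm the gateway lies inside it."""
    hosts = list(network.hosts())  # usable addresses, excluding network/broadcast
    return {
        "gateway_in_subnet": gateway_ip in network,
        "usable_hosts": len(hosts),
        "first_host": str(hosts[0]),
        "last_host": str(hosts[-1]),
        "broadcast": str(network.broadcast_address),
    }


subnet = ipaddress.ip_network("192.168.100.0/24")
gateway = ipaddress.ip_address("192.168.100.1")
summary = describe_subnet(subnet, gateway)
print(summary)
```

For a `/24` this reports 254 usable host addresses, with the gateway `192.168.100.1` being the first of them.
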

### Step 2: Create the Router

1. **Network → Routers → Create Router**

| Field | Value |
|-------|-------|
| **Router Name** | `main-router` |
| **Enable Admin State** | Yes |
| **External Network** | `ext-floating1` |

2. Click **Create Router**

### Step 3: Connect the Router to the Subnet

1. Click `main-router`
2. Go to the **Interfaces** tab
3. Click: **Add Interface**

| Field | Value |
|-------|-------|
| **Subnet** | `private-subnet` |

4. Click **Submit**


---

## C.5 Create the GPU Instance

1. Go to: **Compute → Instances**
2. Click: **Launch Instance**

### Tab 1: Details

| Field | Value |
|-------|-------|
| **Instance Name** | `local-llm` |
| **Description** | `Local LLM Server` |
| **Availability Zone** | `nova` (leave the default) |
| **Count** | `1` |

→ Click **Next**

### Tab 2: Source

| Field | Value |
|-------|-------|
| **Select Boot Source** | `Image` |
| **Create New Volume** | `Yes` |
| **Volume Size (GB)** | `150` |
| **Delete Volume on Instance Delete** | `No` |

**Select the image:**
1. In the lower "Available" list, search for: `Ubuntu 24.04 LTS Noble Numbat`
2. Click the **up arrow** to its right
3. The image now appears at the top under "Allocated"

→ Click **Next**

### Tab 3: Flavor

**Select the GPU flavor:**

| Flavor | GPU | VRAM | vCPUs | RAM |
|--------|-----|------|-------|-----|
| `nvl4-a8-ram16-disk0` | L4 | 24 GB | 8 | 16 GB |

1. Search the list for `nvl4`
2. Select `nvl4-a8-ram16-disk0` (or similar)
3. Click the **up arrow**

→ Click **Next**

### Tab 4: Networks

| Field | Value |
|-------|-------|
| **Network** | `private-network` |

**IMPORTANT:** Select `private-network`, NOT `ext-net1`!

1. In the "Available" list, find: `private-network`
2. Click the **up arrow**

→ Click **Next**

### Tab 5: Network Ports

Skip this tab (leave it empty).

→ Click **Next**

### Tab 6: Security Groups

1. If `default` is listed under "Allocated", click the **down arrow** to remove it
2. In "Available", find: `ollama-webapp`
3. Click the **up arrow**
4. Only `ollama-webapp` should be listed under "Allocated"

→ Click **Next**

### Tab 7: Key Pair

1. In "Available", find: `ollama-deploy-key`
2. Click the **up arrow**
3. "Allocated" now lists: `ollama-deploy-key`

→ Click **Next**

### Tabs 8-11: Optional

Skip these tabs (leave them empty).

### Launch the Instance

Click: **Launch Instance**

Wait until the status changes from `Build` to `Active` (about 1-3 minutes).


---

## C.6 Assign a Floating IP

1. Go to: **Network → Floating IPs**
2. Click: **Allocate IP to Project**
3. Select:

| Field | Value |
|-------|-------|
| **Pool** | `ext-floating1` |
| **Description** | `IP for Local LLM Server` |

4. Click **Allocate IP**
5. In the list, click **Associate** next to the new IP
6. Select your instance: `local-llm`
7. Click **Associate**

**Note the IP address** (e.g. `83.228.200.109`)

---

## C.7 Instance Summary

| Component | Value |
|-----------|-------|
| **Instance Name** | `local-llm` |
| **Description** | `Local LLM Server` |
| **Image** | Ubuntu 24.04 LTS |
| **Flavor** | `nvl4-a8-ram16-disk0` (NVIDIA L4) |
| **Disk** | 150 GB |
| **Network** | `private-network` |
| **Security Group** | `ollama-webapp` |
| **Key Pair** | `ollama-deploy-key` |
| **Floating IP** | `83.228.200.109` |


---

# Part D: Server Setup

## D.1 SSH Connection

**Windows PowerShell:**

```powershell
ssh -i "C:\Users\pmots\Downloads\ollama-deploy-key.pem" ubuntu@83.228.200.109
```

**Mac/Linux:**

```bash
chmod 400 ~/Downloads/ollama-deploy-key.pem
ssh -i ~/Downloads/ollama-deploy-key.pem ubuntu@83.228.200.109
```

---

## D.2 Prepare the System

```bash
# If dpkg errors occur
sudo dpkg --configure -a

# Update the system
sudo apt update
sudo apt upgrade -y
```

---

## D.3 Check the GPU

```bash
nvidia-smi
```

The output should show an NVIDIA L4 with 24 GB VRAM (the drivers are preinstalled).


---

## D.4 Install Ollama

```bash
curl -fsSL https://ollama.com/install.sh | sh
```

### Configure Ollama for Network Access

```bash
sudo systemctl edit ollama
```

Insert the following (between the comment lines):

```ini
[Service]
Environment="OLLAMA_HOST=0.0.0.0"
Environment="OLLAMA_ORIGINS=*"
Environment="OLLAMA_NUM_PARALLEL=4"
Environment="OLLAMA_MAX_LOADED_MODELS=2"
```

Save: `Ctrl+X`, then `Y`, then `Enter`

```bash
sudo systemctl restart ollama
sudo systemctl enable ollama
```

---

## D.5 Download the Models

```bash
# For documents (invoices, receipts)
ollama pull granite3.2-vision

# For handwriting
ollama pull qwen2.5vl:7b

# OCR specialist
ollama pull deepseek-ocr
```

### Check the Models

```bash
ollama list
```
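
The same check can be made over Ollama's HTTP API, which lists installed models at `GET /api/tags`. The sketch below assumes Ollama is listening on the default port `11434` and that models pulled without an explicit tag are reported as `<name>:latest`; the helper names are hypothetical.

```python
import json
import urllib.request

# Models this guide pulls in D.5.
REQUIRED_MODELS = ["granite3.2-vision", "qwen2.5vl:7b", "deepseek-ocr"]


def normalize(name):
    """Add the ':latest' tag that Ollama reports for untagged pulls."""
    return name if ":" in name else name + ":latest"


def missing_models(tags_response, required=REQUIRED_MODELS):
    """Return the required models absent from an /api/tags response."""
    installed = {model["name"] for model in tags_response.get("models", [])}
    return [name for name in required if normalize(name) not in installed]


def fetch_tags(base_url="http://localhost:11434", timeout=5.0):
    """Fetch the installed-model list from a running Ollama server."""
    with urllib.request.urlopen(base_url + "/api/tags", timeout=timeout) as response:
        return json.load(response)
```

On the server, `missing_models(fetch_tags())` should return an empty list once all three pulls have finished.
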

---

## D.6 Set Up the Python Environment

```bash
# Install packages
sudo apt install -y python3-pip python3-venv python3.12-venv git

# Create the app directory
sudo mkdir -p /opt/ollama-webapp/{app,venv,logs}
sudo chown -R ubuntu:ubuntu /opt/ollama-webapp

# Create the virtual environment
python3 -m venv /opt/ollama-webapp/venv

# Install base packages
/opt/ollama-webapp/venv/bin/pip install --upgrade pip
/opt/ollama-webapp/venv/bin/pip install flask flask-cors requests pymupdf gunicorn
```


---

## D.7 Create the Systemd Service

```bash
sudo nano /etc/systemd/system/ollama-webapp.service
```

Contents:

```ini
[Unit]
Description=Belegscanner Flask App
After=network.target ollama.service
Wants=ollama.service

[Service]
Type=simple
User=ubuntu
Group=ubuntu
WorkingDirectory=/opt/ollama-webapp/app
Environment="PATH=/opt/ollama-webapp/venv/bin:/usr/bin"
Environment="FLASK_ENV=production"
ExecStart=/opt/ollama-webapp/venv/bin/gunicorn \
    --bind 0.0.0.0:5000 \
    --workers 2 \
    --timeout 3600 \
    --access-logfile /opt/ollama-webapp/logs/access.log \
    --error-logfile /opt/ollama-webapp/logs/error.log \
    app:app
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```

Save: `Ctrl+X`, then `Y`, then `Enter`

```bash
sudo systemctl daemon-reload
sudo systemctl enable ollama-webapp
```
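
The gunicorn flags in `ExecStart` could alternatively live in a Python config file kept in the repository. The following is only a sketch of what that might look like; the file name `gunicorn.conf.py` and its placement next to `app.py` are assumptions, and the values simply mirror the command-line flags above.

```python
# gunicorn.conf.py (hypothetical file, not part of the original setup)
# Each setting mirrors one command-line flag from the systemd unit.

bind = "0.0.0.0:5000"   # --bind
workers = 2             # --workers
timeout = 3600          # --timeout, in seconds; long to allow slow model calls

accesslog = "/opt/ollama-webapp/logs/access.log"  # --access-logfile
errorlog = "/opt/ollama-webapp/logs/error.log"    # --error-logfile
```

With such a file in place, `ExecStart` would shrink to `gunicorn --config gunicorn.conf.py app:app`, and tuning changes would go through the normal Git deploy instead of an edit on the server.
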

---

## D.8 Sudo Rights for GitHub Actions

```bash
sudo visudo
```

Append at the end:

```
ubuntu ALL=(ALL) NOPASSWD: /bin/systemctl restart ollama-webapp
ubuntu ALL=(ALL) NOPASSWD: /bin/systemctl status ollama-webapp
ubuntu ALL=(ALL) NOPASSWD: /bin/systemctl stop ollama-webapp
ubuntu ALL=(ALL) NOPASSWD: /bin/systemctl start ollama-webapp
```

Save: `Ctrl+X`, then `Y`, then `Enter`


---

# Part E: Development Workflow

## E.1 Local Development

```bash
# In Cursor: open a terminal
cd ~/Projects/private-llm

# Virtual environment (one-time)
python3 -m venv venv
source venv/bin/activate  # Mac/Linux
# or: .\venv\Scripts\activate  # Windows

# Install dependencies
pip install -r requirements.txt

# Start locally
python app.py
```

## E.2 Deployment (automatic)

```bash
# Save changes
git add .
git commit -m "Description of the change"
git push origin main

# GitHub Actions deploys automatically!
```

## E.3 Check the Deployment Status

1. Go to: https://github.com/valueonag/private-llm/actions
2. Click the latest workflow run
3. Green = success, red = failure


---

# Part F: Server Commands

## SSH Connection

```bash
ssh -i "C:\Users\pmots\Downloads\ollama-deploy-key.pem" ubuntu@83.228.200.109
```

## Service Management

```bash
# Check status
sudo systemctl status ollama-webapp
sudo systemctl status ollama

# Restart
sudo systemctl restart ollama-webapp
sudo systemctl restart ollama

# View logs
tail -f /opt/ollama-webapp/logs/access.log
tail -f /opt/ollama-webapp/logs/error.log
sudo journalctl -u ollama-webapp -f
sudo journalctl -u ollama -f
```

## GPU Status

```bash
nvidia-smi
```

## Ollama Models

```bash
# List
ollama list

# Add a new model
ollama pull <modell-name>

# Remove a model
ollama rm <modell-name>

# Test a model
ollama run granite3.2-vision "Describe this image"
```
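
A vision model only has something to describe when an image is supplied. Over the HTTP API, images are sent base64-encoded in the `images` field of a `POST /api/generate` request. The sketch below shows one way to build and send such a request with the standard library; the function names and the default timeout are hypothetical, and the URL assumes Ollama on its default port.

```python
import base64
import json
import urllib.request


def build_generate_request(model, prompt, image_bytes):
    """Build the JSON body for a non-streaming /api/generate call."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # return one JSON object instead of a stream
    }


def generate(payload, base_url="http://localhost:11434", timeout=300.0):
    """Send the payload to Ollama and return the generated text."""
    request = urllib.request.Request(
        base_url + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["response"]
```

For example, reading a scanned receipt with `open("receipt.png", "rb").read()` (a hypothetical file) and passing the bytes to `build_generate_request("granite3.2-vision", "Describe this image", ...)` yields a payload that `generate` can submit on the server.
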

---

# Part G: Troubleshooting

## App Not Reachable

```bash
# Check service status
sudo systemctl status ollama-webapp

# Check logs
tail -50 /opt/ollama-webapp/logs/error.log

# Check the port (ss ships with Ubuntu 24.04; netstat needs the net-tools package)
sudo ss -tlnp | grep 5000
```
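
If the service runs on the server but the app is still unreachable from outside, the security group or floating IP is a likely cause. A quick way to test from a workstation is a plain TCP connection attempt. This is a minimal sketch with the standard `socket` module; the function name `port_is_open` is hypothetical.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Running `port_is_open("83.228.200.109", 5000)` from outside the server distinguishes a network-level block (returns `False`) from an application error (returns `True`, but the page still fails).
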

## Ollama Not Reachable

```bash
# Check status
sudo systemctl status ollama

# Restart
sudo systemctl restart ollama

# Check logs
sudo journalctl -u ollama -f
```

## GitHub Actions Failed

1. Go to: https://github.com/valueonag/private-llm/actions
2. Click the failed run
3. Click the failed step
4. Read the error message

**Common problems:**
- Wrong SSH key: check the `SSH_PRIVATE_KEY` secret
- Server not reachable: check the floating IP and the security group
- Syntax error in the code: test locally before pushing

## GPU Not Detected

```bash
# Check the driver
nvidia-smi

# If it is missing
sudo apt install -y nvidia-driver-550
sudo reboot
```


---

# Checklist

## One-Time Setup

- [x] Infomaniak Public Cloud account
- [x] GPU quota enabled
- [x] SSH key pair created (`ollama-deploy-key`)
- [x] Security group created (`ollama-webapp`)
- [x] Private network created (`private-network`)
- [x] Router created (`main-router`)
- [x] GPU instance created (`local-llm`)
- [x] Floating IP assigned (`83.228.200.109`)
- [x] Ollama installed and configured
- [x] Models downloaded
- [x] Python environment set up
- [x] Systemd service created
- [x] GitHub secret configured (`SSH_PRIVATE_KEY`)
- [x] GitHub Actions workflow created

## On Every Deploy

- [ ] Change the code
- [ ] `git add .`
- [ ] `git commit -m "Description"`
- [ ] `git push origin main`
- [ ] Check GitHub Actions
- [ ] Test the app


---

# Costs

## Infomaniak Public Cloud (estimated)

| Component | Approx. price |
|-----------|---------------|
| GPU L4 instance (24/7) | ~CHF 580/month |
| GPU L4 instance (8 h/day, Mon-Fri) | ~CHF 140/month |
| Block storage 150 GB | ~CHF 15/month |
| Floating IP | ~CHF 3/month |
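
The part-time figure in the table follows from the 24/7 price if the instance is billed in proportion to the hours it runs. That proportional billing is an assumption here, not something stated by the price list. A short sketch of the arithmetic, with a hypothetical helper name:

```python
HOURS_PER_WEEK = 24 * 7  # 168


def monthly_cost(full_time_price, hours_per_day, days_per_week):
    """Scale the 24/7 monthly price by the fraction of the week in use."""
    running_fraction = hours_per_day * days_per_week / HOURS_PER_WEEK
    return full_time_price * running_fraction


part_time = monthly_cost(580, hours_per_day=8, days_per_week=5)
print(f"~CHF {part_time:.2f}/month")  # 580 * 40 / 168
```

Eight hours on five days is 40 of 168 weekly hours, so the estimate comes to roughly CHF 138 per month, in line with the ~CHF 140 in the table.
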

**Tip:** Stop the instance when it is not needed!

```bash
# In the Horizon dashboard: Compute → Instances → Shut Off Instance
# Or via CLI
openstack server stop local-llm
```