vigilops
Health Warn
- License — License: Apache-2.0
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Low visibility — Only 5 GitHub stars
Code Fail
- Hardcoded secret — Potential hardcoded credential in agent/agent.example.yaml
Permissions Pass
- Permissions — No dangerous permissions requested
This tool is an open-source AI monitoring platform that analyzes system alerts, performs automated root cause analysis, and executes auto-remediation runbooks to fix issues. It integrates via the Model Context Protocol (MCP) to allow AI assistants to query live production data.
Security Assessment
Overall Risk: Medium. The platform executes shell commands and scripts via its runbooks, which carries inherent risk in any deployment. A significant security finding is the presence of a hardcoded credential in `agent/agent.example.yaml`, which is a bad practice and could easily lead to accidental secret leaks. The tool also makes network requests to external services, specifically requiring an API key for DeepSeek to facilitate its AI analysis. Additionally, because it grants AI assistants access to live production environments through MCP, the boundary of what data is exposed must be strictly managed. On a positive note, the project does not request any dangerous repository-level permissions.
Quality Assessment
The project is properly licensed under the permissive Apache-2.0 license and was updated very recently, indicating active maintenance. However, its community visibility is extremely low, currently sitting at only 5 GitHub stars. This lack of widespread adoption means the codebase has undergone minimal peer review, making it difficult to assess its reliability or long-term support.
Verdict
Use with caution: the platform executes shell commands and handles live production data, and a hardcoded secret failing basic security checks suggests it needs strict manual review before deployment.
AI-powered open-source monitoring platform with auto-remediation. 13 built-in runbooks, MCP integration (global first), DeepSeek root cause analysis. 5-minute Docker setup.
VigilOps
Your team gets 200+ alerts daily. 80% are noise. AI fixes them while you sleep.
Live Demo | Install | Docs | Chinese Docs
What Makes VigilOps Different
You've tried Grafana + Prometheus. You know Datadog. They tell you something broke. None of them fix it.
VigilOps is the first open-source AI platform that doesn't just monitor — it heals:
- AI Analyzes — DeepSeek reads logs, metrics, topology to find the real cause
- AI Decides — Picks the right Runbook from 13 built-in auto-remediation scripts
- AI Fixes — Executes the fix with safety checks and approval workflows
- AI Learns — Same problems get resolved faster next time
Global First: World's first open-source monitoring platform with MCP (Model Context Protocol) integration — your AI coding assistant can query live production data directly.
Quickstart
Try Online (no install): demo.lchuangnet.com — [email protected] / demo123
Self-Host in 3 Steps:
git clone https://github.com/LinChuang2008/vigilops.git && cd vigilops
cp .env.example .env # Optional: add DeepSeek API key for live AI
docker compose up -d # Open http://localhost:3001
First registered account becomes admin. On first startup, the backend auto-creates tables, alert rules, and dashboard components.
Feature Comparison
| Feature | VigilOps | Nightingale | Prom+Grafana | Datadog | Zabbix |
|---|---|---|---|---|---|
| AI Root Cause Analysis | Built-in | - | - | Enterprise | - |
| Auto-Remediation | 13 Runbooks | - | - | Enterprise | - |
| MCP Integration | First | - | - | Early | - |
| PromQL Queries | ✓ | - | Native | Enterprise | - |
| Self-Hosted | Docker | K8s/Docker | Complex | SaaS | Yes |
| Cost | Free | Free/Ent | Free | $$$ | Free/Ent |
| Setup Time | 5 min | 30 min | 2+ hrs | 5 min | 1+ hr |
Sweet Spot: Small-to-medium teams who want AI-powered ops without enterprise licensing costs.
Honest disclaimer: We're early stage. For mission-critical systems at scale, use proven solutions. For teams ready to experiment with AI ops, we're your best bet.
How It Works
Alert Fires      AI Diagnosis         Auto-Fix              Resolved
┌──────────┐     ┌──────────────┐     ┌────────────────┐    ┌────────────┐
│ Disk 95% │────>│ "Log rotation│────>│ log_rotation   │───>│ Disk 60%   │
│ on prod  │     │  needed on   │     │ runbook starts │    │ Fixed in   │
│ server   │     │  /var/log"   │     │ safely         │    │ 2 minutes  │
└──────────┘     └──────────────┘     └────────────────┘    └────────────┘
13 Built-in Runbooks: disk_cleanup | service_restart | memory_pressure | log_rotation | zombie_killer | connection_reset | cpu_high | docker_cleanup | network_diag | mysql_health | redis_health | nginx_fix | swap_pressure
AI Runbook Generator: Describe a scenario in natural language, and AI generates an executable Runbook with safety checks — via /api/v1/ai/generate-runbook.
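A call to the generator endpoint might look like the sketch below. Only the path `/api/v1/ai/generate-runbook` comes from this README; the POST method, the `description` field name, the Bearer-token header, and the host and port (borrowed from the webhook example) are assumptions, so check the API Reference for the actual schema.

```python
import json
import urllib.request

# Assumption: same host and port as the webhook example in this README.
BASE_URL = "http://your-vigilops:8001"


def build_generate_runbook_request(description: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a request to the runbook generator endpoint."""
    # Hypothetical body: "description" is a guessed field name, not a documented one.
    body = json.dumps({"description": description}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/ai/generate-runbook",
        data=body,
        method="POST",  # assumption: the endpoint accepts POST with a JSON body
        headers={
            "Content-Type": "application/json",
            # Assumption: Bearer auth, as used by the AlertManager webhook.
            "Authorization": f"Bearer {token}",
        },
    )


req = build_generate_runbook_request(
    "Restart nginx when it returns 502 for five minutes", "YOUR_TOKEN"
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req, timeout=30)
```

Building the request separately from sending it makes it easy to inspect what would go over the wire before pointing it at a real server.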
Prometheus AlertManager Bridge
Already running Prometheus? Add a webhook receiver to alertmanager.yml, route alerts to it, and get AI diagnosis on every alert:
receivers:
  - name: 'vigilops'
    webhook_configs:
      - url: 'http://your-vigilops:8001/api/v1/webhooks/alertmanager'
        http_config:
          authorization:
            type: Bearer
            credentials: 'YOUR_TOKEN'
route:
  receiver: 'vigilops'
What happens: Prometheus fires alert → VigilOps receives it → AI analyzes root cause → diagnosis appears in real-time on the Demo page via SSE.
Two modes: Diagnosis-only (safe, read-only analysis) or Auto-remediation (AI picks and executes the right Runbook).
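To try the bridge without waiting for a real alert, you could post a hand-built payload to the webhook. The sketch below follows Alertmanager's version 4 webhook format; the alert name, labels, and `externalURL` are illustrative placeholders, and which fields VigilOps actually reads is not stated in this README.

```python
import json
import urllib.request
from datetime import datetime, timezone

# URL shape taken from the receiver config in this README; the host is a placeholder.
WEBHOOK_URL = "http://your-vigilops:8001/api/v1/webhooks/alertmanager"


def sample_payload(alertname: str, instance: str, severity: str = "critical") -> dict:
    """Return one firing alert in Alertmanager's version 4 webhook format."""
    labels = {"alertname": alertname, "instance": instance, "severity": severity}
    return {
        "version": "4",
        "groupKey": '{}:{alertname="' + alertname + '"}',
        "status": "firing",
        "receiver": "vigilops",
        "groupLabels": {"alertname": alertname},
        "commonLabels": labels,
        "commonAnnotations": {},
        "externalURL": "http://alertmanager.example:9093",  # placeholder
        "alerts": [
            {
                "status": "firing",
                "labels": labels,
                "annotations": {"summary": f"{alertname} on {instance}"},
                "startsAt": datetime.now(timezone.utc).isoformat(),
                # Zero time, as Alertmanager sends for alerts that have not resolved.
                "endsAt": "0001-01-01T00:00:00Z",
                "generatorURL": "",
            }
        ],
    }


def build_webhook_request(payload: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST that Alertmanager would make."""
    return urllib.request.Request(
        WEBHOOK_URL,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json", "Authorization": f"Bearer {token}"},
    )


request = build_webhook_request(sample_payload("DiskAlmostFull", "prod-server-01"), "YOUR_TOKEN")
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request, timeout=10)
```

Starting in diagnosis-only mode is the lower-risk way to run a test like this, since no Runbook is executed.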
Screenshots
Dashboard — Real-time metrics across all hosts
AI Alert Analysis — Root cause + recommended action
MCP Integration — Global Open Source First
Your AI assistant (Claude Code, Cursor) queries live production data via MCP:
# Enable in backend/.env
VIGILOPS_MCP_ENABLED=true
VIGILOPS_MCP_PORT=8003
VIGILOPS_MCP_API_KEY=your-secret-token
Note: Authentication via `VIGILOPS_MCP_API_KEY` is required in production.
5 MCP Tools: get_servers_health | get_alerts | search_logs | analyze_incident | get_topology
Ask your AI: "Show all critical alerts on prod-server-01" / "Analyze last night's CPU spike" / "Search for OOM errors in the past 2 hours"
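Behind a prompt like these, an MCP client sends a JSON-RPC 2.0 `tools/call` request naming one of the five tools. The sketch below only builds that message. The tool names come from the list above, but the argument names (`severity`, `host`) are hypothetical, because this README does not document each tool's input schema.

```python
import itertools
import json

# Tool names as listed in this README.
MCP_TOOLS = {
    "get_servers_health",
    "get_alerts",
    "search_logs",
    "analyze_incident",
    "get_topology",
}

_request_ids = itertools.count(1)


def tools_call_message(tool: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    if tool not in MCP_TOOLS:
        raise ValueError(f"unknown tool: {tool}")
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# Hypothetical arguments: the real schema may use different names.
message = tools_call_message("get_alerts", {"severity": "critical", "host": "prod-server-01"})
print(json.dumps(message, indent=2))
```

In practice an MCP-aware assistant such as Claude Code or Cursor discovers the real argument schema through the protocol's tool listing, so you rarely write this message by hand.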
PromQL Query Support
Query metrics using familiar PromQL syntax via API:
# Instant query
GET /api/v1/promql/query?query=vigilops_host_cpu_percent
# Range query
GET /api/v1/promql/query_range?query=avg(vigilops_host_cpu_percent)&start=...&end=...&step=5m
# Supported: rate(), avg(), sum(), min(), max(), count(), avg_over_time(), label matchers
Compatible with Prometheus HTTP API format for Grafana integration.
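A range query URL can be assembled as in the sketch below. The path, the `query`, `start`, `end`, and `step` parameters, and the metric name come from the examples above; passing `start` and `end` as Unix timestamps is an assumption based on the Prometheus HTTP API convention, and the base URL is a placeholder.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Placeholder host; port 8001 is borrowed from the webhook example in this README.
BASE_URL = "http://your-vigilops:8001"


def range_query_url(query: str, end: datetime, window: timedelta, step: str = "5m") -> str:
    """Build a query_range URL covering `window` up to `end`."""
    start = end - window
    params = {
        "query": query,
        # Assumption: Unix timestamps in seconds, as in the Prometheus HTTP API.
        "start": int(start.timestamp()),
        "end": int(end.timestamp()),
        "step": step,
    }
    # urlencode percent-escapes the parentheses and other reserved characters.
    return f"{BASE_URL}/api/v1/promql/query_range?{urlencode(params)}"


url = range_query_url(
    "avg(vigilops_host_cpu_percent)",
    end=datetime.now(timezone.utc),
    window=timedelta(hours=1),
)
print(url)
```

Encoding the query with `urlencode` matters once expressions include label matchers, since braces, quotes, and equals signs are not safe to paste into a URL unescaped.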
Agent — Cross-Platform Monitoring
The VigilOps Agent collects system metrics, discovers services, and monitors databases. It runs on Linux, Windows/Windows Server, and macOS.
Linux:
pip install vigilops-agent
vigilops-agent run -c /etc/vigilops/agent.yaml
Windows (PowerShell):
.\scripts\install-windows-agent.ps1 -ServerUrl "http://your-server:8001" -Token "your-token"
.\scripts\install-windows-service.ps1 # Register as Windows Service
| Feature | Linux | Windows | macOS |
|---|---|---|---|
| CPU / Memory / Disk / Network | ✓ | ✓ | ✓ |
| Docker Service Discovery | ✓ | ✓ | ✓ |
| Host Service Discovery | ✓ (ss) | ✓ (netstat) | - |
| Database Monitoring | ✓ | ✓ | ✓ |
| Log Collection | ✓ | ✓ | ✓ |
Installation
Prerequisites
- Docker 20+ & Docker Compose v2+
- 4 CPU cores; 8 GB RAM to build, 2 GB RAM at runtime
Environment Variables
| Variable | Required | Description |
|---|---|---|
| `POSTGRES_PASSWORD` | Yes | Database password |
| `JWT_SECRET_KEY` | Yes | `openssl rand -hex 32` |
| `AI_API_KEY` | Yes | DeepSeek API key |
| `AI_AUTO_SCAN` | Recommended | Auto-analyze alerts (`true`) |
See docs/installation.md for full guide.
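Before running `docker compose up -d`, a small pre-flight check can catch missing settings. The sketch below is not part of VigilOps: the helper names are hypothetical, and it only checks that the three variables marked as required above are non-empty. `secrets.token_hex(32)` yields 32 random bytes as 64 hex characters, the same shape as the output of `openssl rand -hex 32`.

```python
import os
import secrets

# Variables marked "Yes" in the table above.
REQUIRED = ("POSTGRES_PASSWORD", "JWT_SECRET_KEY", "AI_API_KEY")


def missing_required(env: dict) -> list:
    """Return the required variable names that are unset or blank in `env`."""
    return [name for name in REQUIRED if not env.get(name, "").strip()]


def new_jwt_secret() -> str:
    """Generate a 64-character hex secret from 32 random bytes."""
    return secrets.token_hex(32)


missing = missing_required(dict(os.environ))
if missing:
    print("Missing required variables:", ", ".join(missing))
else:
    print("All required variables are set.")
```

Note that this reads the process environment, not the `.env` file, so export the variables first or adapt the check to parse `.env`.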
Tech Stack
| Layer | Technology |
|---|---|
| Frontend | React 19, TypeScript, Vite, Ant Design 6, ECharts 6 |
| Backend | Python 3.9+, FastAPI, SQLAlchemy, AsyncIO |
| Database | PostgreSQL 15+, Redis 7+ |
| AI | DeepSeek API (configurable LLM) |
| Agent | Python 3.9+, psutil — Linux / Windows / macOS |
| Deploy | Docker Compose, Helm Chart (K8s) |
Documentation
Getting Started | Installation | User Guide | API Reference | Architecture | Contributing | Changelog
Contributing
We need contributors who understand alert fatigue firsthand. See CONTRIBUTING.md.
cp .env.example .env
docker compose -f docker-compose.dev.yml up -d
pip install -r requirements-dev.txt
cd frontend && npm install
Community
Apache 2.0 — Use it, fork it, ship it commercially.