flask-ai-agent-studio
Health Warn
- License — License: MIT
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Low visibility — Only 5 GitHub stars
Code Pass
- Code scan — Scanned 12 files during light audit, no dangerous patterns found
Permissions Pass
- Permissions — No dangerous permissions requested
This is a self-hosted Flask web application that serves as an AI assistant. It provides a rich feature set including Retrieval-Augmented Generation (RAG), vision capabilities, and multi-tool execution within a canvas document editing interface.
Security Assessment
The overall risk is rated as Medium. The light code scan found no dangerous patterns, hardcoded secrets, or requests for dangerous permissions. However, the application intrinsically handles sensitive data and makes external network requests to various LLM providers (DeepSeek, OpenRouter, MiniMax). It also uses SQLite for local storage, which includes persisting chat history and usage metadata. Additionally, because the tool supports OCR and complex file processing, users should be aware of the security implications of the data they ingest, especially if hosting it in a exposed environment rather than locally.
Quality Assessment
The project is very new and currently has low community visibility with only 5 GitHub stars. It is actively maintained, with the most recent push occurring today. It is released under the standard MIT license. The developers note that while AI assisted in writing the code, humans reviewed and validated every line, which suggests a deliberate approach to quality control despite the project's early stage.
Verdict
Use with caution.
A self-hosted Flask AI assistant with RAG, vision, multi-tool execution, and canvas document editing. Full workflow automation in one open-source platform.
Flask ChatBot: Multi-Provider + Tools + RAG + Multimodal + Canvas
AI-Assisted Development Notice: This project was developed with AI assistance. All code, architecture decisions, and documentation have been written, reviewed, and validated by humans. Every line has passed human review before inclusion.
A feature-rich, single-page Flask chat application designed for advanced LLM interactions. It supports multiple providers (DeepSeek, OpenRouter, MiniMax), complex multi-step tool usage, Local RAG, persistent memory, multimodal inputs (Vision/OCR), and an interactive Canvas/Workspace environment.
Unlike basic prompt/response wrappers, this app persists deep conversation states in SQLite, supports branch regeneration, streams reasoning/tool traces, and features a robust prompt-budgeting system.
🌟 Core Features
- Models & Routing: Native support for DeepSeek and MiniMax, plus full OpenRouter integration (with proxy rotation, provider scoping, and model capability detection).
- Persistent Memory & RAG: Conversation-scoped memory, persona-scoped memory, persistent scratchpads, and a local ChromaDB-backed RAG system for document and chat history retrieval.
- Multimodal & Attachments: Document extraction (PDF, DOCX, CSV, Code) and Image processing via local OCR (PaddleOCR), Vision LLMs, or direct multimodal injection.
- Canvas & Workspace: An interactive UI panel for the model to create, edit, search, and manage markdown or code documents. Includes project-mode for local file sandbox execution.
- Advanced Chat Controls: Slash commands (
/check), message editing/branching, history pruning, automatic summarization, and entropy-aware context selection. - Observability: Detailed usage panels, provider vs. local token estimates, caching diagnostics, and rotating agent trace logs.
📸 Screenshots
🚀 Installation
Quick Start
bash install.sh
The interactive installer configures your environment, selects hardware profiles (CPU/CUDA), and downloads required models (like BGE-M3 for RAG).
Manual Setup
- Environment:
python3 -m venv .venv source .venv/bin/activate - Dependencies:
pip install -r requirements.txt # Core pip install -r requirements-rag.txt # Optional: RAG features pip install -r requirements-ocr-paddle.txt # Optional: Local OCR - Configuration:
Copy.env.exampleto.envand add at least one API key:DEEPSEEK_API_KEY=your-key OPENROUTER_API_KEY=your-key MINIMAX_API_KEY=your-key - Run:
python core/app.py # Access at http://127.0.0.1:5000
⚙️ Configuration (Environment Variables)
Most app settings can be dynamically changed via the /settings UI and are stored in SQLite. The following environment variables dictate core infrastructure:
Core & Security
| Variable | Default | Description |
|---|---|---|
FLASK_SECRET_KEY |
required | Secret key for Flask sessions. |
LOGIN_PIN |
empty | Enables basic PIN-based authentication if set. |
FORCE_HTTPS |
false |
Redirects HTTP to HTTPS (requires reverse proxy). |
AGENT_TRACE_LOG_ENABLED |
true |
Enables JSON-lines trace logging. |
Storage Directories
| Variable | Default | Description |
|---|---|---|
IMAGE_STORAGE_DIR |
./data/images |
Uploaded images. |
DOCUMENT_STORAGE_DIR |
./data/documents |
Uploaded documents. |
PROJECT_WORKSPACE_ROOT |
./data/workspaces |
Sandboxes for workspace tools. |
CHROMA_DB_PATH |
./chroma_db |
RAG vector database persistence. |
RAG & AI Features
| Variable | Default | Description |
|---|---|---|
RAG_ENABLED |
true |
Enables knowledge-base features. |
RAG_EMBED_MODEL |
BAAI/bge-m3 |
Embedding model to use. |
BGE_M3_DEVICE |
auto |
Set to cpu or leave auto for CUDA. |
OCR_ENABLED |
true |
Enables local PaddleOCR processing. |
YOUTUBE_TRANSCRIPTS_ENABLED |
false |
Enables YouTube transcript extraction tool. |
(Note: Prompt budgets, fetch limits, and UI parameters are manageable directly in the App's UI Settings page).
🛠️ Available Tools (Agent Capabilities)
The LLM is equipped with a vast array of tools. Schemas are strictly validated before execution.
Memory & Personalization
save_to_conversation_memory/delete_conversation_memory_entry: Manage short-term chat facts.save_to_persona_memory/delete_persona_memory_entry: Manage cross-chat persona facts.append_scratchpad/replace_scratchpad/read_scratchpad: Manage long-term durable user facts.ask_clarifying_question: Halts execution to ask the user a structured question.image_explain: Queries follow-up details about uploaded images.
Knowledge Base & Search
search_knowledge_base: Semantic search over chats, docs, and tool results (RAG).search_tool_memory: Search successfully cached past web results.search_web/search_news_ddgs/search_news_google: Web discovery.fetch_url/fetch_url_summarized: Fetch, clean, and summarize web pages.scroll_fetched_content/grep_fetched_content: Deep-dive into long web pages.
Canvas & Document Editing
create_canvas_document/delete_canvas_document/clear_canvas: File management.rewrite_canvas_document/batch_canvas_edits: Edit file contents.search_canvas_document/scroll_canvas_document/expand_canvas_document: Read operations.set_canvas_viewport: Pin a line range to the context window.validate_canvas_document/preview_canvas_changes: Non-mutating checks.
Workspace (Local Sandbox)
write_project_tree,create_directory,create_file,update_file,read_file,search_files: Full filesystem operations isolated to the workspace root.
🔌 HTTP API Endpoints
The backend provides a comprehensive REST API.
| Method | Path | Purpose |
|---|---|---|
GET/POST |
/chat |
Main streamed chat endpoint (NDJSON format). |
POST |
/api/chat-runs/<id>/cancel |
Gracefully halt streaming generation. |
GET |
/api/conversations |
List all conversations. |
GET |
/api/conversations/<id> |
Load specific conversation history. |
POST |
/api/conversations/<id>/summarize |
Force history summarization. |
POST |
/api/messages/<id>/prune |
Prune specific messages from history. |
GET |
/api/conversations/<id>/export |
Export chat (MD, JSON, DOCX, PDF). |
GET |
/api/rag/search |
Search ChromaDB via REST. |
POST |
/api/rag/ingest |
Upload external documents to RAG. |
GET |
/api/activity |
Paginated audit logs of LLM invocations. |
🏗️ Architecture & Storage
- Caching Strategy: Context is structured to keep system prompts static at the top, volatile data (time, tool traces) at the bottom. This maximizes provider-side prompt caching (Anthropic, DeepSeek, Gemini).
- Databases:
- SQLite (
chatbot.db): Stores conversations, messages, settings, user profiles, assets, and tool memory. - ChromaDB: Stores embeddings for RAG document retrieval.
- SQLite (
- Assets: Images and parsed documents are stored safely in
./data/. - Workspaces: Project files managed by the LLM are stored in
./data/workspaces/.
🛡️ Security & Operations
- Production Deployment: It is highly recommended to run behind a reverse proxy (Nginx/Caddy) with HTTPS. Set
FORCE_HTTPS=trueandSESSION_COOKIE_SECURE=true. - Rate Limiting: Supports local memory limiting, or shared state via
SECURITY_RATE_LIMIT_REDIS_ENABLED. - SSRF Protection: Web fetching tools (
fetch_url) block localhost and private IP addresses by default. - Sanitization: Markdown and HTML outputs are sanitized before browser rendering.
❓ Troubleshooting
- CUDA/GPU Errors: If RAG or OCR crashes due to GPU issues, set
BGE_M3_DEVICE=cpuand ensureOCR_ENABLED=false(or install the CPU version of PaddlePaddle). - Proxy Rotation Fails: Ensure
proxies.txtis formatted correctly (one per line, e.g.,http://ip:port). Requires app restart. - Image Uploads Blocked: Ensure
OCR_ENABLED=trueOR that you have selected a Vision-capable model in the Settings page.
License
MIT
Reviews (0)
Sign in to leave a review.
Leave a reviewNo results found