ENTERPRISE AI PLATFORM
NEW: Self-Healing RAG · Ollama Auto-Detect · Sandbox Skills · Agent Academy

THE AI THAT
ACTUALLY RUNS YOUR
INFRASTRUCTURE.

Build autonomous agents that plan before they act, SSH into servers, query databases, manage OpenStack clouds, and delegate to each other. A deep planning loop drives every run. The dual-path OCR/RAG architecture has self-healing embeddings (Ollama auto-detect, exponential-backoff retries, ChromaDB → Postgres vector fallback). The 6-stage RAG pipeline combines semantic chunking, hybrid search (RRF), cross-encoder re-ranking, and a Redis-backed query cache. Create sandboxed skills on the spot from chat, and grow your fleet with the Agent Academy. It all runs on your hardware. Your data never leaves.

$ git clone -b allinone https://github.com/muhammedali275/AI-Orchestrator-Studio && cd AI-Orchestrator-Studio && sudo bash allinone/setup.sh
Works With Everything
🐧Linux / RHEL
🪟Windows
🐳Docker
☸️Kubernetes
🗃️PostgreSQL
🔶Oracle DB
🍃MongoDB
🔴Redis
🔍Elasticsearch
🧠Ollama
vLLM
☁️Azure OpenAI
🤖OpenAI
🔮Anthropic
💬MS Teams
📱Telegram
💼Slack
🌐REST API
📊Prometheus
🔧Ansible
🖥️VMware
🔑Active Directory
☁️OpenStack / HCS
🛡️LDAP / SAML
🧩ChromaDB
🎯Cross-Encoder
What It Does

Not a chatbot.
An operating system for AI agents.

Every feature you need to go from "idea" to "autonomous agent running in production" — in one platform.

🤖

Universal Agent Builder

Visual agent creation with custom system prompts, skill assignments, and multi-LLM routing. No code required.

🔀

Multi-LLM Routing

10+ providers (OpenAI, Azure, Anthropic, Ollama, vLLM, Cohere, HuggingFace, LlamaCpp, TextGen, Custom). Per-agent task routing (code_gen → Codellama, rag_answer → GPT-4) with automatic fallback chains.

🔗

Agent Delegation

Team Leader agents delegate to specialists automatically. Recursive multi-agent execution with depth guards.
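A depth guard for recursive delegation can be sketched roughly as follows. This is an illustration under stated assumptions, not the platform's implementation: the `Agent` class, its `run` method, the topic-matching rule, and `DelegationDepthError` are hypothetical names, and `MAX_DEPTH = 3` mirrors the "max depth 3" figure shown in the architecture diagram.

```python
from dataclasses import dataclass, field

MAX_DEPTH = 3  # assumed limit, taken from "max depth 3" in the architecture diagram


class DelegationDepthError(RuntimeError):
    """Raised when a delegation chain goes deeper than MAX_DEPTH."""


@dataclass
class Agent:
    name: str
    # Hypothetical mapping: topic keyword -> specialist agent.
    specialists: dict = field(default_factory=dict)

    def run(self, task: str, depth: int = 0) -> str:
        # Guard first, so a delegation cycle cannot recurse forever.
        if depth > MAX_DEPTH:
            raise DelegationDepthError(f"{self.name}: depth {depth} exceeds {MAX_DEPTH}")
        for topic, specialist in self.specialists.items():
            if topic in task.lower():
                return specialist.run(task, depth + 1)
        # No specialist matched, so this agent handles the task itself.
        return f"{self.name} handled: {task}"


leader = Agent("team-leader", {"oracle": Agent("oracle-dba")})
print(leader.run("Check Oracle tablespace usage"))
```

Passing `depth` explicitly keeps the guard stateless, so the same agent object can serve several concurrent delegation chains.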

📄

6-Stage RAG Pipeline

Semantic chunking → ingest-time embedding → hybrid BM25 + vector search with RRF fusion → metadata filtering → cross-encoder re-ranking → Redis semantic cache.

💚

Self-Healing Embeddings

Auto-detects local Ollama, retries transient failures with exponential backoff, and falls back from ChromaDB to a Postgres-backed vector store on RHEL 9 and other systems with SQLite older than 3.35. Bulk reindex and embed-pending endpoints handle corpora of 1000+ documents.
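The retry behaviour can be illustrated with a small capped exponential backoff helper. This is a sketch, not the project's code: `backoff_delays` and `call_with_retry` are hypothetical names, and treating the "2/4/8/15s" schedule from the pipeline diagram as a doubling delay capped at 15 seconds, with one wait after each failed try, is an assumption.

```python
import time


def backoff_delays(attempts=4, base=2.0, cap=15.0):
    """Doubling delays capped at `cap`: 2, 4, 8, 15 seconds by default."""
    return [min(base * 2 ** i, cap) for i in range(attempts)]


def call_with_retry(fn, is_transient, delays=None, sleep=time.sleep):
    """Call fn(); on a transient error wait and retry, once per delay.

    Assumption: one wait follows each failed try, then a final try is made
    whose error (if any) propagates to the caller.
    """
    delays = backoff_delays() if delays is None else delays
    for delay in delays:
        try:
            return fn()
        except Exception as exc:
            if not is_transient(exc):
                raise  # permanent errors are not retried
            sleep(delay)
    return fn()


def is_transient(exc):
    # Stand-in for "timeouts + 5xx + 429"; a real client would inspect status codes.
    return isinstance(exc, (TimeoutError, ConnectionError))
```

Injecting `sleep` makes the schedule testable without actually waiting.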

🔍

Dual-Path OCR Engine

Path A: sync app-server (PyPDF2/PyMuPDF). Path B: async Celery workers with 3-strategy cascade (Tesseract 5 → easyocr → LLM Vision). 8-step image preprocessing. Auto-promote into RAG vector store.

🔑

Multi-Credential Binding

SSH keys, DB logins, API tokens — bind multiple credentials to one agent. Auto-inject by type into matching skills.
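Type-based injection could look something like the sketch below. All names here (`Credential`, `inject_credentials`, and the type labels `ssh_key`, `db_login`, `api_token`) are hypothetical, and the "first binding of each type wins" rule is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    name: str
    type: str    # hypothetical labels such as "ssh_key", "db_login", "api_token"
    secret: str


def inject_credentials(skill_needs, bound):
    """Pick one bound credential for each type a skill declares it needs."""
    by_type = {}
    for cred in bound:
        # Assumption: the first credential bound for a type is the one injected.
        by_type.setdefault(cred.type, cred)
    missing = [t for t in skill_needs if t not in by_type]
    if missing:
        raise LookupError("no bound credential for: " + ", ".join(missing))
    return {t: by_type[t] for t in skill_needs}


bound = [
    Credential("prod-ssh", "ssh_key", "<private key>"),
    Credential("oracle-ro", "db_login", "<user:password>"),
]
print(inject_credentials(["ssh_key"], bound)["ssh_key"].name)
```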

🛡️

Enterprise Security

AES-256 vault, RBAC, AD/LDAP login, audit trails, TLS, data governance with PII masking and ISO/NIST/GDPR regulatory references.

🧠

Deep Agent Planning

Agents plan before they act. 4-tier tool enforcement: native function calls → ReAct text parsing → false-completion re-prompt → intent auto-dispatch. 10-round execution loop with self-correction. Works with ANY LLM provider.
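The second tier, ReAct text parsing, can be sketched as a small parser that pulls an `ACTION` / `ACTION_INPUT` pair out of plain model output. The exact line format, the JSON-with-plain-text-fallback handling, and the `parse_react` name are assumptions for illustration; the platform's real parser may differ.

```python
import json
import re

# Assumed format: "ACTION: <skill>" immediately followed by "ACTION_INPUT: <args>".
_ACTION_RE = re.compile(
    r"^ACTION:[ \t]*(?P<name>[\w.-]+)[ \t]*\n^ACTION_INPUT:[ \t]*(?P<args>.+)$",
    re.MULTILINE,
)


def parse_react(text):
    """Return (skill_name, args_dict), or None if the text contains no action."""
    match = _ACTION_RE.search(text.replace("\r\n", "\n"))
    if match is None:
        return None
    raw = match.group("args").strip()
    try:
        args = json.loads(raw)
    except json.JSONDecodeError:
        args = {"input": raw}   # plain-text input, wrapped for uniformity
    if not isinstance(args, dict):
        args = {"input": args}  # a bare JSON scalar or list
    return match.group("name"), args


reply = 'Thought: check disk usage\nACTION: ssh_command\nACTION_INPUT: {"host": "web01", "cmd": "df -h"}'
print(parse_react(reply))
```

A `None` result is the natural hook for the later tiers: the caller can re-prompt or fall through to keyword-based dispatch.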

☁️

OpenStack / HCS

15 built-in skills for Nova, Neutron, Cinder, Glance, Keystone, Heat, Swift. Manage compute, network, and storage with natural language.

Agent Scheduler

Cron-based scheduling for automated checks, daily reports, compliance scans. Full execution history tracking.

📡

Channel Connectors

MS Teams, Slack, Telegram, REST API, Webhooks. Built-in API Gateway with rate limiting and usage analytics.

Semantic Query Cache

Redis-backed cache that detects near-duplicate queries via embedding cosine similarity (>0.95). Sub-millisecond cache hits, 24h TTL, per-agent scoping.
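The near-duplicate check can be sketched as below. To stay self-contained, the example uses an in-memory dict instead of Redis and takes embeddings as plain lists, so `SemanticCache` and its methods are hypothetical; only the 0.95 threshold, the 24h TTL, and the per-agent scoping come from the description above.

```python
import math
import time


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """In-memory stand-in for the Redis-backed cache (illustration only)."""

    def __init__(self, threshold=0.95, ttl_seconds=24 * 3600, clock=time.time):
        self.threshold = threshold
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._entries = {}  # agent_id -> list of (embedding, answer, expires_at)

    def put(self, agent_id, embedding, answer):
        expires_at = self._clock() + self.ttl_seconds
        self._entries.setdefault(agent_id, []).append((embedding, answer, expires_at))

    def get(self, agent_id, embedding):
        now = self._clock()
        # Drop expired entries, then look for the closest stored query.
        live = [e for e in self._entries.get(agent_id, []) if e[2] > now]
        self._entries[agent_id] = live
        best = max(live, key=lambda e: cosine(e[0], embedding), default=None)
        if best is not None and cosine(best[0], embedding) > self.threshold:
            return best[1]
        return None
```

The linear scan is fine for a sketch; a production cache would need an indexed similarity lookup to keep hits fast.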

🖥️

Infrastructure Admin

Edit .env config from the UI (App, DB, Worker, Storage). Live health-check panel tests PostgreSQL, Redis, ChromaDB, Celery, and vLLM connectivity.

🧪

On-Spot Sandbox Skills

Create & attach a Python skill to an agent in one API call — sandbox enforced by default. Inline Python skills run in a hardened executor with import allow-list, no-network, CPU + wall-clock limits, and stdout capture.
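One piece of such a sandbox, the import allow-list, can be approximated with a static check over the skill's source using Python's `ast` module. This is only a sketch: `ALLOWED_IMPORTS` and `disallowed_imports` are hypothetical, the allow-list contents are invented for the example, and a static scan does not by itself block dynamic imports, network access, or runaway CPU use, which the hardened executor handles separately.

```python
import ast

# Hypothetical allow-list; the platform's real list is not shown on this page.
ALLOWED_IMPORTS = frozenset({"math", "json", "re", "statistics"})


def disallowed_imports(source, allowed=ALLOWED_IMPORTS):
    """Return the sorted top-level modules a skill imports that are not allowed."""
    bad = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports have no module name; they are rejected as "".
            names = [node.module or ""]
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            if root not in allowed:
                bad.add(root)
    return sorted(bad)


skill_source = "import math\nimport os\nfrom subprocess import run\n"
print(disallowed_imports(skill_source))
```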

🎓

Agent Academy

A guided learning track for new agents: import a .skill pack, auto-ingest its references into RAG, run a self-quiz, and graduate to production. Includes ready-made tracks for Presales, Finance, Infra, Legal, and Tibco-style integration agents.

Architecture

Deep agent execution.
Multi-LLM routing.

From user message to tool execution — every layer is deterministic, observable, and provider-agnostic. No black boxes.

Agent Execution Pipeline

Complete message flow from user to final answer — 10 stages, 4 fallback tiers.

Message → Agent → LLM → Tool → Response
💬 User Message: Chat Studio · REST API · Teams · Slack · Telegram
🔀 Enterprise Orchestrator: Deterministic 3-path routing (no LLM)
▼ ANALYTICS ▼ DOCUMENTS ▼ GENERAL
📊 KPI / Metrics API (keyword: dashboard, chart, KPI)
📄 RAG Search (keyword: document, search, find)
🤖 UniversalAgentExecutor: Full agent pipeline ▼
Context Assembly
📋 System Prompt: Base agent prompt
🧩 Internal Skills: Prompt-injected (not callable)
📚 RAG Chunks: Vector search → top-K inject
🧠 Deep Planning: Plan-before-act prompt
⚡ ModelRouter.select(): Task classification → LLM connection
🌐 LLMClient: Auto-detect provider from URL · Probe endpoints
LLM Providers
OpenAI: Native tool_calls
Azure OpenAI: Native tool_calls
Anthropic: Native tool_calls
vLLM / Ollama: ReAct text fallback
Custom: Any OpenAI-compat
4-Tier Tool Calling (max 10 rounds)
Tier 1 · Native: LLM returns tool_calls[] → execute directly
Tier 2 · ReAct: Parse ACTION / ACTION_INPUT from text output
Tier 3 · Re-prompt: Detect false completion → force tool call
Tier 4 · Auto-dispatch: Keyword scoring → invoke best-match skill
Skill Execution
🔑 Credential Inject: 3-tier discovery (config → RBAC → auto)
⚙️ Skill Handler: SSH · SQL · HTTP · Python · Ansible · …
🔗 Delegation: Sub-agent spawn (max depth 3)
✅ Final Answer: Synthesised response → User
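The first hop of this flow, the orchestrator's deterministic 3-path routing, might be sketched as a keyword lookup with no LLM call. The keywords come from the diagram above; the `route` function, the prefix matching (so that "documents" matches "document"), and checking analytics before documents are assumptions for this example.

```python
import re

# Keywords as listed in the diagram; the dict order decides precedence (an assumption).
ROUTES = {
    "analytics": ("dashboard", "chart", "kpi"),
    "documents": ("document", "search", "find"),
}


def route(message):
    """Return "analytics", "documents", or "general" without calling an LLM."""
    tokens = re.findall(r"[a-z0-9]+", message.lower())
    for path, keywords in ROUTES.items():
        # Prefix match so simple plurals ("charts", "documents") still hit.
        if any(token.startswith(kw) for token in tokens for kw in keywords):
            return path
    return "general"  # everything else goes to the full agent pipeline


print(route("Show me the KPI dashboard"))
print(route("Restart nginx on web01"))
```

Because the function is pure and keyword-driven, the same message always takes the same path, which is what makes the routing layer observable and easy to test.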

Multi-LLM Routing Engine

Per-agent task-aware model selection with automatic fallback chains. No single-LLM lock-in.

🎯

Task Classification

Keyword scoring classifies each message into a task type before selecting the LLM.

reasoning · code_gen · rag_answer · summarize · classify · extract · translate · chat · tool_call · planning
🔀

Per-Agent Routing

Each agent defines its own routing config with primary, task-specific, and fallback LLM connections.

task_routing{} · primary_connection · fallback_chain[] · cost_aware
🛡️

Fallback Chain

If primary LLM fails (timeout, rate limit, error), the chain automatically tries the next provider.

primary → fallback[0] → fallback[1] → system_default
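Taken together, the three cards describe a classify, select, then fall back loop, which could be sketched as follows. The keyword lists, the connection names, and the functions `classify`, `candidate_chain`, and `complete` are hypothetical; trying a task-specific connection ahead of the primary one is also an assumption, since the page only gives the chain from `primary` onward.

```python
# Hypothetical keyword lists for a few of the task types named above.
TASK_KEYWORDS = {
    "code_gen": ("code", "function", "script", "bug"),
    "summarize": ("summarize", "summary"),
    "translate": ("translate",),
}


def classify(message):
    """Keyword scoring: the task with the most keyword hits wins, else "chat"."""
    text = message.lower()
    scores = {task: sum(text.count(kw) for kw in kws) for task, kws in TASK_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "chat"


def candidate_chain(routing, task):
    """Task-specific connection (if any), then primary, fallbacks, system default."""
    chain = [
        routing.get("task_routing", {}).get(task),
        routing["primary_connection"],
        *routing.get("fallback_chain", []),
        "system_default",
    ]
    # Drop the missing task entry and duplicates while keeping order.
    return [conn for conn in dict.fromkeys(chain) if conn]


def complete(routing, message, call):
    """Try each connection in order; `call(conn, message)` is supplied by the caller."""
    errors = []
    for conn in candidate_chain(routing, classify(message)):
        try:
            return conn, call(conn, message)
        except Exception as exc:  # timeout, rate limit, provider error
            errors.append(f"{conn}: {exc}")
    raise RuntimeError("all LLM connections failed: " + "; ".join(errors))
```

A routing config in this sketch is a plain dict, for example `{"primary_connection": "gpt-4", "task_routing": {"code_gen": "codellama"}, "fallback_chain": ["local-ollama"]}`.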
Provider Auto-Detection · Endpoint Probing
OpenAI: api.openai.com
Azure: *.openai.azure.com
Anthropic: api.anthropic.com
Cohere: api.cohere.ai
vLLM: /v1/completions
Ollama: :11434
TextGen: :5001
LlamaCpp: :8080
HuggingFace: api-inference
Custom: any endpoint
LLMClient auto-detects provider from URL · probes multiple candidate endpoints · caches the working one
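URL-based detection along these lines can be sketched with the standard library. The host, port, and path hints are the ones listed above; the `detect_provider` function, the order of the checks, and the returned labels are assumptions, and the endpoint probing and caching steps are not shown.

```python
from urllib.parse import urlparse


def detect_provider(base_url):
    """Guess the LLM provider from a base URL using host, port, and path hints."""
    parsed = urlparse(base_url)
    host = (parsed.hostname or "").lower()
    port = parsed.port
    path = parsed.path.lower()

    # Hosted APIs are recognised by hostname.
    if host == "api.openai.com":
        return "openai"
    if host.endswith(".openai.azure.com"):
        return "azure"
    if host == "api.anthropic.com":
        return "anthropic"
    if host == "api.cohere.ai":
        return "cohere"
    if "api-inference" in host:
        return "huggingface"
    # Self-hosted servers are recognised by their default ports.
    if port == 11434:
        return "ollama"
    if port == 5001:
        return "textgen"
    if port == 8080:
        return "llamacpp"
    if path.startswith("/v1/completions"):
        return "vllm"
    return "custom"


print(detect_provider("http://localhost:11434"))
```

Default ports are only hints, so a real client would confirm the guess by probing an endpoint before caching it.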

Enhanced RAG Pipeline v2 · self-healing

End-to-end document intelligence — from upload to grounded answer. Self-healing embeddings, vector-store fallback, bulk reindex APIs.

Ingest → Embed → Store → Retrieve → Re-Rank → Answer
Ingest · Path A (sync) + Path B (async OCR)
⬆️ Upload / NFS Scan: POST /api/documents/upload · batch-scan
🔍 OCR Cascade: Tesseract → easyocr → LLM Vision
📄 Native Extract: PyPDF2 · PyMuPDF · docx · md
① Semantic Chunking
✂️ ChunkingPolicy: paragraph · heading · sentence · auto-merge
📦 1500 chars / 300 overlap (env: RAG_CHUNK_SIZE · RAG_CHUNK_OVERLAP)
🏷️ Enriched Metadata: doc_id · page · source · upload_date
② Embedding · Self-Healing Provider Chain
Tier 1 · Explicit: EMBEDDING_BASE_URL / API_KEY → OpenAI-compat /v1/embeddings
Tier 2 · Auto-Detect: Local Ollama probe :11434 → qwen3-embedding:0.6b (default)
Tier 3 · LLM Endpoint: Re-use chat LLM’s /v1/embeddings (legacy mode)
Retry · Backoff: 4 attempts · 2/4/8/15s · timeouts + 5xx + 429 · sticky API flag
③ Vector Store · Backend Fallback
🟢 ChromaDB (default): persistent on-disk · agent-scoped collection
🔄 _SQLiteVecBackend: Postgres-backed JSON column · RHEL9 / sqlite<3.35 fallback
🔁 Bulk APIs: /embed-pending · /reindex-all · by-name OR by-id
🔎 Query Time
❓ User Question: via Chat Studio · REST · Teams · Slack
⚡ Semantic Cache Check: Redis · cosine > 0.95 · 24h TTL → HIT short-circuits
④ Hybrid Retrieval · RRF Fusion
🔤 BM25 Keyword: structural detection · weight 0.3
🧩 Vector kNN: cosine · weight 0.7
🔀 RRF Merge: 1 / (60 + rank) · dual-source boost
⑤ Metadata Filter + Cross-Encoder Re-Rank
🎯 Metadata WHERE: doc_type · source_file · upload_date range
🧠 Cross-Encoder: ms-marco-MiniLM-L-6-v2 · ~5ms / chunk · zero API cost
🚶 LLM Re-Rank Fallback: if cross-encoder unavailable
⑥ Grounded Answer · Quality-Aware
📝 Top-K Inject: chunks → system prompt · source citations
🧮 Quality Detector: garbled / runon detector · URL/code/base64 strip
🔁 Auto-Fallback Model: hard-cap 90s · progress events emitted
💾 Cache Write: Redis · 24h TTL · per-agent scope
✅ Final Answer: cited · grounded · streamed via SSE w/ heartbeats
Tunable via env: RAG_CHUNK_SIZE · RAG_CHUNK_OVERLAP · EMBEDDING_BASE_URL · EMBEDDING_API_MAX_RETRIES · EMBEDDING_API_TIMEOUT · DB_POOL_SIZE · RAG_CACHE_TTL_SECONDS
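As an illustration of how `RAG_CHUNK_SIZE` and `RAG_CHUNK_OVERLAP` might drive chunking, here is a simplified paragraph-aware chunker. It is a sketch, not the platform's `ChunkingPolicy`: it splits on blank lines only (no heading or sentence detection), and the `chunk_text` function and its merge and hard-split rules are assumptions. The 1500 and 300 defaults are the values quoted in the pipeline.

```python
import os

CHUNK_SIZE = int(os.environ.get("RAG_CHUNK_SIZE", "1500"))
CHUNK_OVERLAP = int(os.environ.get("RAG_CHUNK_OVERLAP", "300"))


def chunk_text(text, size=CHUNK_SIZE, overlap=CHUNK_OVERLAP):
    """Merge paragraphs into chunks of at most `size` characters with overlap."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= size:
            current = candidate  # paragraph still fits: keep merging
            continue
        if current:
            chunks.append(current)
        # Seed the next chunk with the tail of the previous one (the overlap).
        tail = current[-overlap:] if overlap else ""
        current = f"{tail}\n\n{para}" if tail else para
        # Hard-split anything that is still longer than the limit.
        while len(current) > size:
            chunks.append(current[:size])
            current = current[size - overlap:]
    if current:
        chunks.append(current)
    return chunks
```

Carrying the tail of the previous chunk forward means a sentence that straddles a boundary is still retrievable from at least one chunk.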

Infrastructure Tiers

Separated app server, database layer, async workers, and LLM nodes. Designed for horizontal scaling.

🖥️

App Server

React 18 + FastAPI + nginx. Enterprise Orchestrator, UniversalAgentExecutor, ModelRouter, 6-stage RAG, LLMClient.

React 18 · FastAPI · nginx · ChromaDB · Cross-Encoder
🗄️

Database & Cache

All relational data, chat memory, rate limiting, task brokering, semantic query cache, vector embeddings.

PostgreSQL 16 · Redis · ChromaDB · RAG Cache
⚙️

OCR Worker

Async Celery workers. 3-strategy cascade: Tesseract → easyocr → LLM Vision. 8-step image preprocessing. Auto-promotes to RAG.

Celery · Tesseract 5 · easyocr · OpenCV
🧠

LLM Server

10+ providers via auto-detection. Per-agent task routing. Primary + fallback chains with cost-aware selection.

vLLM · Ollama · Azure · OpenAI · Anthropic
Skill Library

95+ pre-built skills.
37 categories.

From SSH commands to Oracle DBA, from OpenStack cloud to a 6-stage RAG pipeline — agents come ready to work.

SSH Command · Shell Exec · SQL Query · HTTP API Call · RAG Retrieval · Agent Delegation · Brave Web Search · Docker Mgmt · Oracle DBA · PostgreSQL Admin · MongoDB Query · Redis Ops · Elasticsearch · VM Management · Ansible Playbook · Kubernetes Ops · Prometheus Query · Cisco Network · InfoSec Scan · Vuln Assessment · Scrum / Agile · Project Mgmt · Call Center QA · M365 / Copilot · Schema Inspector · Migration Manager · Code Execution · File Operations · NLP Analysis · Data Lookup · Image Generation · HuggingFace · Notifications · Security Audit · Cloud Integration · Solution Architecture · OpenStack Nova · OpenStack Neutron · OpenStack Cinder · Heat Stacks · Keystone Auth · Deep Agent Plan · AD / LDAP Login · Data Governance · Semantic Chunking · RRF Hybrid Search · Cross-Encoder Rerank · Semantic Cache · 📦 Presales Agent · 📦 Finance Agent · 📦 Infra Agent · 📦 Legal Agent
Pre-Built Template Agents

Import-ready .skill packages with embedded knowledge bases. Upload → agents start working immediately.

💼

Presales Agent

presales-agent.skill

RFP responses, proposals, competitive analysis, objection handling (LAER), ROI/TCO calculations, demo preparation.

📄 rfp-templates.md 📄 objection-handling.md 📄 roi-models.md
💰

Finance Agent

finance-agent.skill

Financial analysis, budgeting, forecasting, compliance reporting, revenue recognition, audit preparation.

📄 financial-models.md 📄 compliance-frameworks.md
🖥️

Infrastructure Agent

infra-agent.skill

Server management, monitoring, incident response, capacity planning, patching, backup/recovery procedures.

📄 runbooks/ 📄 architecture-guides/
⚖️

Legal Agent

legal-agent.skill

Contract review, NDA analysis, regulatory compliance, risk assessment, legal terminology, policy drafting.

📄 contract-templates.md 📄 regulatory-guides.md
Use Cases

Built for real workflows.
Not demos.

See how teams across departments use AI Orchestrator Studio (AOS) to automate complex, multi-step operations.

🖥️

Infrastructure Operations

Team Leader agent delegates to Linux, VMware, Oracle, and AD sub-agents — all with isolated credentials.

  • SSH into servers, check disk/services/logs
  • Manage VMware VMs via vCenter API
  • Run Oracle SQL queries and health checks
  • Automated compliance and patching reports
📑

Document Intelligence

6-stage RAG pipeline: semantic chunking, ingest-time embedding, hybrid RRF search, metadata filtering, cross-encoder re-ranking, and Redis semantic cache.

  • Semantic chunking at paragraph/heading boundaries
  • Hybrid BM25 + vector search with RRF fusion
  • Cross-encoder re-ranks top results locally (no API cost)
  • Redis cache detects near-duplicate queries instantly
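The hybrid fusion step can be illustrated with a small weighted Reciprocal Rank Fusion function. The `1 / (60 + rank)` term and the 0.3 / 0.7 weights for BM25 and vector search are the figures quoted in the pipeline description; how the platform actually combines weights with RRF is not stated, so multiplying each term by its source weight, along with the `rrf_fuse` name, is an assumption.

```python
def rrf_fuse(rankings, weights=None, k=60):
    """Fuse ranked lists of document ids with (optionally weighted) RRF.

    rankings: dict mapping source name -> list of doc ids, best first.
    weights:  dict mapping source name -> weight (defaults to 1.0 per source).
    """
    weights = weights or {}
    scores = {}
    for source, ranked_ids in rankings.items():
        weight = weights.get(source, 1.0)
        for rank, doc_id in enumerate(ranked_ids, start=1):
            # Each appearance contributes weight / (k + rank); documents found by
            # both sources accumulate two terms and so rise in the fused list.
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


rankings = {
    "bm25": ["d1", "d2", "d3"],
    "vector": ["d2", "d4", "d1"],
}
print(rrf_fuse(rankings, weights={"bm25": 0.3, "vector": 0.7}))
```

Because RRF works on ranks rather than raw scores, the BM25 and cosine scales never need to be normalised against each other.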
🔐

Security & Compliance

AD/LDAP enterprise login, regulatory-grade data governance, InfoSec scanning, and full audit trails.

  • Active Directory / LDAP authentication
  • Data governance with ISO 27001, NIST, GDPR, HIPAA, PCI DSS references
  • Network vulnerability assessment & SIEM
  • ISO / SOC2 compliance evidence & PII masking
☁️

Cloud & Infrastructure

Manage OpenStack/HCS clouds, VMware vCenter, and Kubernetes with natural language. Deep Agent plans multi-step operations before executing.

  • OpenStack: Nova servers, Neutron networks, Cinder volumes
  • Heat stack orchestration & Keystone auth
  • Deep Agent planning with fallback & self-correction
  • Multi-credential binding across cloud platforms

Ready to deploy
your first agent?

On your infrastructure. Your data never leaves. No vendor lock-in. Open source.