AI Features

Artificial intelligence in Gramax Enterprise Server extends documentation workflows with two key functions:

  1. AI search — semantic search, not only keyword-based.

  2. AI editor — an LLM-based assistant for text generation, editing, and speech transcription.

A dedicated LLM service is used and runs together with Gramax.

Image update

If you previously downloaded docker-compose.yaml and .env for Gramax, update them to the latest version so they include components required for AI services.

Requirements

OpenAI API-compatible inference server:

  • Cloud services: OpenAI, DeepSeek.

  • Local or open-source models: LM Studio, Llama.cpp, vLLM, or another server that supports the OpenAI API.

Configure LLM service

To enable AI features, configure variables in the .env file (located next to docker-compose.yaml). The configuration covers both vector search and the AI editor.

Required variables:

  • VECTORDB__TYPE, VECTORDB__HOST

  • EMBEDDING__TYPE, EMBEDDING__MODEL, EMBEDDING__DIMENSIONS, EMBEDDING__APIKEY

  • CHAT__TYPE, CHAT__MODEL, CHAT__APIKEY

  • AUTH__ADMIN__TOKEN

Example of a complete .env configuration:

# --- Vector database settings --- VECTORDB__TYPE=qdrant VECTORDB__HOST=http://db:6333 # Internal Qdrant service address from docker-compose # --- Embedding settings (for vector search) --- EMBEDDING__TYPE=openai EMBEDDING__MODEL=text-embedding-3-small # Model for creating vectors EMBEDDING__DIMENSIONS=1536 # Vector dimensions for the specified model EMBEDDING__APIKEY=<YOUR_EMBEDDING_PROVIDER_KEY> # --- AI editor settings (Chat) --- CHAT__TYPE=openai CHAT__MODEL=gpt-4o # Model for text generation CHAT__APIKEY=<YOUR_CHAT_KEY> # --- LLM service access key --- AUTH__ADMIN__TOKEN=<CREATE_SECRET_KEY>

Streaming Search Responses

For streaming to work correctly, you need to properly configure a proxy that will proxy requests to the LLM service.

Example for Angie

upstream vm-ai-gramax { zone http:vm-ai-gramax 1m; server <ip>:<port>; } server { listen <ip>:80; server_name ai-server.gramax www.ai-server.gramax; return 301 https://$server_name$request_uri; } server { listen <ip>:443 ssl; http2 on; server_name ai-server.gramax www.ai-server.gramax; status_zone https:ai-server.gramax; include /etc/angie/http.d/common/ssl-gram-ax.conf; include /etc/angie/http.d/common/noindex_robots.conf; include /etc/angie/http.d/common/error_pages.conf; location / { include /etc/angie/http.d/common/reverse-proxy.conf; client_max_body_size 500m; proxy_http_version 1.1; proxy_pass http://vm-ai-gramax; proxy_buffering off; proxy_cache off; proxy_read_timeout 3600s; proxy_send_timeout 3600s; tcp_nodelay on; } }

Connect Gramax to LLM service

After setting up the LLM service, configure the main Gramax application to communicate with it. Set the following variables in the .env file:

  • AI_TOKEN — authorization token. Set this to the same value as AUTH__ADMIN__TOKEN for the LLM service.

  • AI_SERVER_URL — URL of the LLM service. If you haven't changed the settings, this will be {GES_URL}/ai.

  • AI_INSTANCE_NAME — unique identifier for your portal. This allows a single LLM service to work with multiple independent Gramax portals. Choose any unique name, for example my-docs-portal.

Start and stop

To start Gramax Enterprise Server together with LLM services, use:

docker compose --profile ai up -d

To stop, use:

docker compose --profile ai down

Environment variable reference

Detailed description of all variables for fine-tuning the LLM service.

Vector database settings

  • VECTORDB__TYPE — vector database type. Required.

    • Value: qdrant.

  • VECTORDB__HOST — Qdrant database connection address. Required.

    • Default: http://db:6333

Embedding settings (for search)

  • EMBEDDING__TYPE — embedding provider type. Required.

    • Values:

      • openai: for OpenAI and any other API-compatible services (e.g., Deepseek, OpenRouter, Ollama).

  • EMBEDDING__MODEL — model name for the provider. The model must support embeddings. Required.

    • Note: See your provider's documentation (OpenAI, Ollama, etc.) for available models and their names.

    • Examples: text-embedding-3-large, text-embedding-3-small, mxbai-embed-large.

  • EMBEDDING__DIMENSIONS — vector dimensions produced by the model. This value is usually specified in the model documentation. Required.

    • Example: 1536

  • EMBEDDING__APIKEY — API key for the provider service. Some providers may require it even if it is not used; in that case, any string can be provided.

  • EMBEDDING__HOST — API server address for the provider. Required for OpenAI-compatible providers (other than OpenAI itself) or for a remote Ollama instance.

    • Examples: https://api.deepseek.com/v1, http://my-ollama-host:11434.

  • EMBEDDING__SOCKSPROXYURL — SOCKS5 proxy address. Useful when access to the provider's API is only available through a proxy (e.g., due to corporate restrictions or blocks).

    • Format: socks5://user:password@host:port

    • Example: socks5://proxy_user:proxy_pass@192.168.1.1:1080

Chat LLM settings (for AI editor)

The AI editor uses models that support Chat Completions — the standard method for interacting with chat models where the model continues a dialogue or performs a text task.

  • CHAT__TYPE — provider type. Required.

    • Value: openai (supports OpenAI and compatible services).

  • CHAT__MODEL — model name for text generation. Required.

    • Examples: gpt-4o, gpt-3.5-turbo.

  • CHAT__APIKEY — API key for the provider service. Some providers may require it even if it is not used; in that case, any string can be provided.

  • CHAT__SOCKSPROXYURL — SOCKS5 proxy address. Useful when access to the provider's API is only available through a proxy.

    • Format: socks5://user:password@host:port

General settings

  • AUTH__ADMIN__TOKEN — secret token for authorizing requests from Gramax to the LLM service. Choose a strong string value. Required.

Advanced settings

CORS settings

  • CORS__ALLOWED_{INDEX} — adds an allowed origin for CORS. {INDEX} is the element index starting from 0. If not set, requests from any origin are allowed.

    • Example: CORS__ALLOWED_0="http://example.com", CORS__ALLOWED_1="https://my-site.io".

Logging settings

You can configure where and at what level the LLM service sends logs.

  • LOGGING__CONSOLE — enables/disables log output to the Docker console.

    • Values: true (default), false.

To send logs to external systems (e.g., a file or Elasticsearch), configure targets using {INDEX} for each one.

Example: Log to file

LOGGING__TARGETS_0__TYPE=file LOGGING__TARGETS_0__LEVEL=info # Log level (trace, debug, info, warn, error, fatal) LOGGING__TARGETS_0__OPTIONS__DESTINATION=app/logs/llm-service.log # File path

Example: Log to Elasticsearch

LOGGING__TARGETS_1__TYPE=elasticsearch LOGGING__TARGETS_1__LEVEL=warn LOGGING__TARGETS_1__OPTIONS__NODE=http://localhost:9200 LOGGING__TARGETS_1__OPTIONS__INDEX=gramax-llm-logs # Optional auth LOGGING__TARGETS_1__OPTIONS__AUTH__USERNAME=elastic LOGGING__TARGETS_1__OPTIONS__AUTH__PASSWORD=your_password

All logging variables:

  • LOGGING__TARGETS_{INDEX}__TYPE(Required for target) Type: file, seq, elasticsearch.

  • LOGGING__TARGETS_{INDEX}__LEVEL — minimum log level for this target. Default: trace.

  • LOGGING__TARGETS_{INDEX}__OPTIONS__DESTINATION(For type="file") File path.

  • LOGGING__TARGETS_{INDEX}__OPTIONS__SERVERURL(For type="seq") Seq server address.

  • LOGGING__TARGETS_{INDEX}__OPTIONS__APIKEY — (For type="seq") API key for Seq.

  • LOGGING__TARGETS_{INDEX}__OPTIONS__NODE(For type="elasticsearch") Elasticsearch node address.

  • LOGGING__TARGETS_{INDEX}__OPTIONS__INDEX(For type="elasticsearch") Index name in Elasticsearch.

  • LOGGING__TARGETS_{INDEX}__OPTIONS__AUTH__* — Elasticsearch auth settings (USERNAME, PASSWORD, APIKEY).