
# LLM Module

Converts Reddit posts into dramatic text-message conversations by calling Ollama's `/api/chat` endpoint directly with native constrained JSON generation.

## LlmService

### `rewriteAsConversation(post: RedditPost): Promise<Result<Conversation, LlmError>>`

Sends a Reddit post to Ollama with a system prompt instructing it to rewrite the content as a two-person iMessage conversation. Uses Ollama's `format` parameter for token-level constrained JSON output.
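
For illustration, the request might look like the following sketch. The endpoint, `format` field, and `options` keys are Ollama's documented chat API; the schema object, system prompt text, and `RedditPost` field access are assumptions, not the module's actual internals.

```typescript
// Sketch of a constrained-generation request to Ollama's chat API.
// SYSTEM_PROMPT, post fields, and conversationJsonSchema are illustrative names.
const res = await fetch('http://localhost:11434/api/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: 'qwen3.5:9b',
    stream: false,
    messages: [
      { role: 'system', content: SYSTEM_PROMPT },              // rewrite-as-conversation instructions
      { role: 'user', content: `${post.title}\n\n${post.body}` },
    ],
    // Passing a JSON schema here enables token-level constrained generation.
    format: conversationJsonSchema,
    options: {
      num_predict: 8192, // LLM_MAX_TOKENS
      temperature: 0.8,  // LLM_TEMPERATURE
    },
  }),
});
const data = await res.json();
// data.message.content should now be schema-conforming JSON text.
```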

Features:

- Direct Ollama `/api/chat` calls (no OpenAI SDK or LiteLLM proxy)
- Native constrained generation via Ollama's `format` parameter with a JSON schema
- Automatic retry (2 attempts) with escalating JSON schema feedback
- Thinking-field fallback — extracts JSON from `thinking` when `content` is empty (a Qwen3.5 quirk; see the sketch after this list)
- Strips `<think>` tags as a safety net
- Validates output against a Zod schema
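
A rough sketch of how these safeguards might compose; the helper names (`callOllama`, `conversationSchema`, `ok`/`err`) are hypothetical stand-ins for the module's actual internals.

```typescript
// Illustrative composition of retry, thinking-field fallback, <think> stripping,
// and Zod validation. All helper names are hypothetical.
async function generateWithSafeguards(post: RedditPost): Promise<Result<Conversation, LlmError>> {
  for (let attempt = 1; attempt <= 2; attempt++) {
    // On the second attempt, callOllama would append escalating schema feedback.
    const reply = await callOllama(post, attempt);
    // Qwen3.5 quirk: content can come back empty with the JSON routed to `thinking`.
    let raw = reply.message.content || reply.message.thinking || '';
    // Safety net: strip any <think>...</think> blocks that leaked into content.
    raw = raw.replace(/<think>[\s\S]*?<\/think>/g, '').trim();
    try {
      const parsed = conversationSchema.safeParse(JSON.parse(raw));
      if (parsed.success) return ok(parsed.data); // Zod-validated output
    } catch {
      // JSON.parse failed; fall through to the next attempt.
    }
  }
  return err({ kind: 'invalid_output' });
}
```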

## Types

### Conversation

```typescript
interface Conversation {
  leftName: string;   // Descriptive relationship label (e.g. "My Boss", "Best Friend")
  rightName: string;  // Always "Me" — viewer perspective
  hookText?: string;  // TikTok scroll-stopping hook line
  messages: ConversationMessage[];
}
```

Name conventions (an example follows the list):

- `leftName` uses short descriptive relationship labels (2-3 words): "My Boss", "My Ex", "Best Friend", "My Mom", "My Roommate"
- `rightName` is always "Me" — the viewer is the right-side protagonist (blue bubbles)
- `hookText` is a TikTok scroll-stopper starting with "When...", "POV:", "The moment...", or a direct shocking statement
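
A hypothetical object that follows these conventions (all values invented for illustration):

```typescript
const example: Conversation = {
  leftName: 'My Boss',   // short relationship label (2-3 words)
  rightName: 'Me',       // always the viewer's side
  hookText: 'POV: your boss texts you at 2am',
  messages: [
    { sender: 'left', text: 'are you awake' },
    { sender: 'right', text: 'It is 2am. What happened?' },
  ],
};
```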

### ConversationMessage

```typescript
interface ConversationMessage {
  sender: 'left' | 'right';
  text: string;
}
```
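
Since outputs are validated against a Zod schema, that schema presumably mirrors these two interfaces. A minimal sketch, where any constraints beyond the interface shapes (minimum lengths, the `"Me"` literal) are assumptions drawn from the conventions above:

```typescript
import { z } from 'zod';

const conversationMessageSchema = z.object({
  sender: z.enum(['left', 'right']),
  text: z.string().min(1),
});

const conversationSchema = z.object({
  leftName: z.string().min(1),     // e.g. "My Boss"
  rightName: z.literal('Me'),      // always the viewer
  hookText: z.string().optional(), // TikTok hook line
  messages: z.array(conversationMessageSchema).min(1),
});

// The interfaces above could also be derived from the schema instead of duplicated:
type Conversation = z.infer<typeof conversationSchema>;
```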

## Configuration

| Variable | Default | Description |
| --- | --- | --- |
| `OLLAMA_BASE_URL` | `http://localhost:11434` | Ollama API base URL |
| `LLM_MODEL` | `qwen3.5:9b` | Ollama model name (colon format) |
| `LLM_MAX_TOKENS` | `8192` | Max output tokens (`num_predict`) |
| `LLM_TEMPERATURE` | `0.8` | Sampling temperature |
| `LLM_TIMEOUT_MS` | `600000` | Request timeout (10 min) |
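
These variables might be read along the following lines (a sketch; the config object shape is illustrative):

```typescript
// Illustrative config loader applying the defaults from the table above.
const config = {
  baseUrl: process.env.OLLAMA_BASE_URL ?? 'http://localhost:11434',
  model: process.env.LLM_MODEL ?? 'qwen3.5:9b',
  maxTokens: Number(process.env.LLM_MAX_TOKENS ?? 8192),    // sent as num_predict
  temperature: Number(process.env.LLM_TEMPERATURE ?? 0.8),
  timeoutMs: Number(process.env.LLM_TIMEOUT_MS ?? 600_000), // 10 minutes
};
```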

## LLM Setup

The module calls Ollama directly — no proxy layer needed.

1. Install Ollama
2. Pull the model: `ollama pull qwen3.5:9b`
3. Ollama runs on `http://localhost:11434` by default
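
To sanity-check the setup from code, one option is to query Ollama's `/api/tags` endpoint, which lists the models that have been pulled (a sketch; the check itself is illustrative):

```typescript
// Verify Ollama is reachable and the configured model has been pulled.
async function checkOllama(baseUrl = 'http://localhost:11434', model = 'qwen3.5:9b'): Promise<void> {
  const res = await fetch(`${baseUrl}/api/tags`);
  if (!res.ok) throw new Error(`Ollama not reachable at ${baseUrl}`);
  const { models } = (await res.json()) as { models: { name: string }[] };
  if (!models.some((m) => m.name === model)) {
    throw new Error(`Model ${model} not found; run: ollama pull ${model}`);
  }
}
```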

::: tip
The module uses Ollama's native `format` parameter for constrained JSON generation. This provides token-level grammar enforcement, guaranteeing valid JSON structure without relying on prompt engineering alone.
:::

::: warning
Qwen3.5 with verbose system prompts may route all of its output to the `thinking` field and leave `content` empty. The service handles this automatically via the thinking-field fallback and retry, but keep the system prompt concise if you modify it.
:::
