Media & Entertainment / Social Automation
AI-Powered Social Automation

Automating Movie Q&A on X with Hybrid RAG + Fine-Tuned LLMs

Built a production-ready X (Twitter) bot that answers movie questions in replies to mentions, powered by a hybrid RAG + fine-tuned LLM backend.

"Shipped a modular X bot that reliably replies to mentions with movie answers, backed by swappable RAG and fine-tuned inference modes."
GenAI · RAG · Fine-tuning · FastAPI · Qdrant · Hugging Face · Ollama · X API · Python · Docker
Backend
Python · FastAPI · PostgreSQL · Qdrant · Docker
AI Models
Mistral 7B Instruct · Llama 3.2 3B (fine-tuned)
Infrastructure
Hugging Face Hub · Hugging Face Inference Endpoints · Google Cloud (Vertex AI / Model Garden, Colab) · Ollama (local inference) · X API (Basic/Paid)
$4,500/mo avoided
Avoided real-time streaming API costs
Chose periodic mention polling + queueing instead of X Streaming API
15k
Monthly X API usage ceiling validated
Client-reported dashboard usage (e.g., 168/15k) was used to tune polling cadence
3 AI modes
Inference strategies available without rewrites
AI_ENV_TYPE routes between local RAG (Ollama), hosted RAG (HF/Qdrant), and the fine-tuned model endpoint
1 repo integration
Reduced handoff friction
Integrated working bot + AI modules into client’s FastAPI repo

Problem Statement

The client needed a bot similar to existing interactive reply bots on X: one that monitors mentions and automatically replies with accurate, movie-specific answers.

Our Approach

Engineered a Python/FastAPI-based bot + AI services architecture with swappable inference backends (local Ollama RAG, hosted Hugging Face RAG, and a fine-tuned model) selected via configuration rather than code changes.

Hybrid AI Runtime (RAG + Fine-Tuned + Web Search toggle)

Technical Details
Implemented retrieval-augmented generation using Qdrant as the vector store, with Ollama for local inference and Hugging Face endpoints for hosted inference, plus a toggleable web-search mode.
Business Value
Enabled the client to demo and iterate quickly with multiple AI backends, switching strategies via configuration without rewrites.
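The AI_ENV_TYPE switch described above can be sketched as a simple dispatch table. This is a hedged illustration: the mode names, handler functions, and return values are assumptions for demonstration, not the client's actual code.

```python
import os

# Hypothetical dispatch between the three inference modes; each handler
# stands in for a real backend (Ollama, hosted HF/Qdrant, fine-tuned endpoint).

def answer_local_rag(question: str) -> str:
    # would query a local Qdrant collection and generate with Ollama
    return f"[local-rag] {question}"

def answer_hosted_rag(question: str) -> str:
    # would query hosted Qdrant and a Hugging Face inference endpoint
    return f"[hosted-rag] {question}"

def answer_finetuned(question: str) -> str:
    # would call the fine-tuned model's published endpoint URL
    return f"[fine-tuned] {question}"

ROUTES = {
    "local_rag": answer_local_rag,
    "hosted_rag": answer_hosted_rag,
    "finetuned": answer_finetuned,
}

def answer(question: str) -> str:
    """Route a question to whichever backend AI_ENV_TYPE selects."""
    mode = os.getenv("AI_ENV_TYPE", "local_rag")
    if mode not in ROUTES:
        raise ValueError(f"Unknown AI_ENV_TYPE: {mode!r}")
    return ROUTES[mode](question)
```

Because the routing lives behind one function, swapping strategies is a config change, not a rewrite.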

Challenges We Solved

X API permissions, rate limits, and non-streaming architecture

The bot needed to respond to mentions, but streaming access was cost-prohibitive on the client's X API tier.

Implemented periodic mention polling with robust rate-limit handling and a reply queue, keeping usage well within the monthly API ceiling.

X API · Python · FastAPI
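The polling-instead-of-streaming approach can be sketched as below. This is an illustrative outline, not the bot's real code: the `client` object, endpoint name, and interval are placeholders, while the `x-rate-limit-reset` header is the one X's API actually returns on 429 responses.

```python
import time

def seconds_until_reset(headers: dict, now: float) -> float:
    """How long to back off after a 429, from X's rate-limit headers."""
    # x-rate-limit-reset is an epoch timestamp; default to a 60s wait
    reset = float(headers.get("x-rate-limit-reset", now + 60))
    return max(reset - now, 1.0)  # never busy-loop: wait at least 1 second

def poll_mentions(client, since_id=None, interval=120):
    """Periodically poll mentions (placeholder client), yielding new ones."""
    while True:
        resp = client.get_mentions(since_id=since_id)
        if resp.status_code == 429:  # rate-limited: sleep until window resets
            time.sleep(seconds_until_reset(resp.headers, time.time()))
            continue
        for mention in resp.json().get("data", []):
            since_id = max(since_id or 0, int(mention["id"]))
            yield mention
        time.sleep(interval)  # polling cadence tuned against the monthly ceiling
```

Tracking `since_id` ensures each poll only fetches mentions newer than the last one handled, which is what keeps monthly request counts low.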

RAG ingestion reliability with change management

Client required vector DB reset/reinsert behavior and later asked for an on-demand way to trigger re-ingestion.

Built a dedicated ingestion script and an ingestion API endpoint supporting collection reset and re-insertion.

Qdrant · Docker · Python · FastAPI
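The CSV → embeddings → Qdrant pipeline can be sketched as follows. This is a simplified assumption-laden illustration: `embed()` stands in for the real embedding model, the CSV columns (`title`, `plot`) are hypothetical, and the point dicts mirror the `id`/`vector`/`payload` shape Qdrant's upsert API expects.

```python
import csv
import io

def embed(text: str) -> list[float]:
    # placeholder embedding; the real pipeline would use an embedding model
    return [float(len(text)), float(text.count(" ") + 1)]

def csv_to_points(csv_text: str) -> list[dict]:
    """Turn movie CSV rows into Qdrant-style points (id, vector, payload)."""
    points = []
    for i, row in enumerate(csv.DictReader(io.StringIO(csv_text))):
        doc = f"{row['title']}: {row['plot']}"
        points.append({"id": i, "vector": embed(doc), "payload": dict(row)})
    return points

# Re-ingestion then becomes: recreate the collection, upsert the points --
# exposed both as a standalone script and as a FastAPI endpoint, so the
# vector DB can be reset and reloaded on demand.
```

Keeping point construction pure (no client calls) makes the ingestion step easy to test before the Qdrant upsert runs.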

Modular refactor into production FastAPI framework

Initial implementation needed to be merged into a pre-existing FastAPI repository owned by the client.

Refactored codebase to move AI services out of twitter_bot into a standalone module mounted via APIRouter.

FastAPI · APIRouter · Python · GitHub

Fine-tune deployment to Hugging Face + endpoint publishing

Client required a fine-tuned model that could be served via URL; access constraints complicated the initial Google Cloud (Vertex AI / Model Garden) path.

Delivered a working Colab fine-tuning notebook, published the fine-tuned Llama 3.2 3B model to Hugging Face Hub, and exposed it via a Hugging Face Inference Endpoint.

Google Colab · Hugging Face Hub · Hugging Face Inference Endpoints · Transformers
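Once the fine-tuned model is behind an endpoint URL, calling it is a small HTTP request. A minimal sketch, assuming a standard text-generation payload: `ENDPOINT_URL` is a placeholder, and the exact response shape depends on the endpoint's task configuration.

```python
import json
import urllib.request

ENDPOINT_URL = "https://example.endpoints.huggingface.cloud"  # placeholder URL

def build_payload(question: str, max_new_tokens: int = 128) -> bytes:
    """JSON body for a text-generation request to an inference endpoint."""
    body = {"inputs": question, "parameters": {"max_new_tokens": max_new_tokens}}
    return json.dumps(body).encode("utf-8")

def ask_endpoint(question: str, token: str) -> str:
    """Send the question to the hosted fine-tuned model (network call)."""
    req = urllib.request.Request(
        ENDPOINT_URL,
        data=build_payload(question),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:  # not executed in this sketch
        return json.loads(resp.read())[0]["generated_text"]
```

Serving the model behind a URL is what lets the bot treat the fine-tuned mode as just another interchangeable backend.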

Project Timeline

1

Discovery

Aligned on a bot that monitors X mentions and replies with movie answers, within the client's X API tier and budget.

2

Build

Delivered RAG API and ingestion pipeline (CSV → embeddings → Qdrant).

3

Launch

Integrated the bot into the client's production repo, validated API usage against the monthly ceiling, and handed off a single working codebase.

Ready to Build Something Similar?

Let's discuss how we can help transform your business with AI.

Start Your Project