Parsing Telegram channels, groups, and chats with LLM processing
Hello everyone! You've probably experienced this: you open a Telegram chat and there are thousands of new messages for the day. Somewhere in this pile is an important reply to your question or a discussion of the topic you need. Or perhaps you need to track certain messages for business purposes.
Of course, you can spend a lot of time searching manually, but it's much more interesting to teach a userbot to parse the chat history and create a convenient database for semantic search.
Here's where the combination of the Qdrant vector database and LLM comes into play:
The userbot collects messages and converts them into embeddings.
Qdrant stores these vectors and retrieves only the closest fragments upon request.
LLM receives exactly these pieces and formulates the final answer in human language.
If any of the terms are unclear, we recommend reviewing our previous articles on working with Qdrant. Links to them will be provided at the end of the article.
In this article, we will break down how to create such an application: from basic chat parsing to semantic search using LLM.
What is the essence of parsing Telegram through a bot?
When we say "semantic search," it's important to understand that this is not magic, but a very specific chain of steps.
Let's go through everything step by step:
1. Collecting messages
The first stage is the userbot (built, for example, with Telethon). It connects to your Telegram account and "listens" to the selected chats. As soon as a new message arrives, the bot saves it. If desired, you can also load the old history, but most often it's enough to process new incoming messages.
Why a userbot? A regular bot does not have access to chat history (it only sees what is written in real time). A userbot, on the other hand, is essentially you in the form of a program, so it has full access to all messages.
2. Converting text to vectors (embeddings)
For the computer to "understand the meaning" of messages, the text needs to be transformed into a numerical representation.
These are embeddings — a set of numbers (a vector) that describe not the shape of the words, but their meaning.
Example:
"Error during payment" and "Payment not going through" — different sets of words, but their vectors will be close to each other because the meaning is the same.
But "Today the weather was nice" will be far away, because the theme is different.
There are plenty of embedding models available now: from OpenAI's text-embedding-3-small to fully local ones (e5-base, mxbai-embed-large).
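To see the idea in action, here is a minimal sketch using the sentence-transformers library and the intfloat/e5-small model (the same model used in the code later in this article); any embedding model would behave similarly:
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("intfloat/e5-small")

sentences = [
    "Error during payment",
    "Payment not going through",
    "Today the weather was nice",
]
# normalize_embeddings=True makes cosine similarity a simple dot product
emb = model.encode(sentences, normalize_embeddings=True)

# Phrases with the same meaning land close together
print(util.cos_sim(emb[0], emb[1]))  # high: same topic, different words
print(util.cos_sim(emb[0], emb[2]))  # low: unrelated topic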
3. Storing in Qdrant
All the received vectors are stored in the Qdrant vector database. It's specifically designed for this kind of data: it quickly finds the closest vectors among millions of records.
Each message in the database is stored together with customizable metadata:
the message id;
the text itself;
the author (can be anonymized);
the date and time.
This means that at any moment, we can ask: "Give me the most similar parts of the chat to this message," and Qdrant will find them in milliseconds.
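To make the storage step concrete, here is a minimal sketch of writing one message with its metadata into Qdrant. The collection name mirrors the one used later in the article; the author and date values are hypothetical.
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from sentence_transformers import SentenceTransformer

qdrant = QdrantClient(url="http://localhost:6333")
embedder = SentenceTransformer("intfloat/e5-small")  # 384-dimensional vectors

# One-time setup; skip this call if the collection already exists
qdrant.create_collection(
    collection_name="chat_messages",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)

text = "My favorite color is black"
qdrant.upsert(
    collection_name="chat_messages",
    points=[PointStruct(
        id=42,  # message id
        vector=embedder.encode(text, normalize_embeddings=True).tolist(),
        payload={
            "text": text,                # the text itself
            "author": "user1",           # can be anonymized
            "date": "2024-05-01T12:30",  # date and time
        },
    )],
)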
4. User query
Now imagine that you want to know: "What colors were discussed yesterday?"
Your question is similarly transformed into a vector and sent to Qdrant.
The database searches for the most semantically similar chat fragments and returns them (for example, 5–10 pieces).
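Continuing the sketch above, the query side looks symmetrical: the question is embedded with the same model, and Qdrant returns the closest stored points.
# Reuses qdrant and embedder from the previous sketch
question = "What colors were discussed yesterday?"
q_vec = embedder.encode(question, normalize_embeddings=True).tolist()

hits = qdrant.search(
    collection_name="chat_messages",
    query_vector=q_vec,
    limit=5,  # the 5 most similar fragments
)
for hit in hits:
    print(hit.score, hit.payload["text"])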
5. Processing in LLM
Now it's time for the large language model (LLM).
We take the found fragments and insert them into the prompt:
Question: "What colors were discussed yesterday?"
Chat context:
1) [12:30] user1: My favorite color is black
2) [12:35] user2: I’m thinking of repainting the site in blue, it’ll look fresher
3) [12:40] user3: I’m for green, it looks calmer
4) [12:50] user4: Red attracts attention better
...
Formulate the answer briefly and to the point.
The LLM sees only this small piece of context, not the entire chat, and formulates the result:
“Yesterday several colors were discussed: black, blue, green, and red.”
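For clarity, assembling such a prompt in Python might look like the sketch below; the fragments here are hypothetical, and the actual request code appears in search.py later.
question = "What colors were discussed yesterday?"
fragments = [
    "[12:30] user1: My favorite color is black",
    "[12:35] user2: I'm thinking of repainting the site in blue",
]

# Number the retrieved fragments and wrap them in the template above
prompt = (
    f"Question: {question}\n"
    "Chat context:\n"
    + "\n".join(f"{i}) {frag}" for i, frag in enumerate(fragments, 1))
    + "\nFormulate the answer briefly and to the point."
)
print(prompt)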
6. What we get as a result
No endless scrolling.
The answer comes in human-friendly form.
If desired, links to the original messages can be attached to verify the context, but for now this will do.
Thus, the LLM never collapses under the weight of thousands of messages: it only works with the sample that Qdrant prepared in advance.
Additionally: can you get banned for this?
Technically, Telegram does not approve of userbots, since they access an account through unofficial clients. In practice, however, reading and storing history carries almost no risk: millions of people use Telethon and Pyrogram, and bans usually come for spam or aggressive activity. For peace of mind, you can create a separate account for the userbot so as not to risk your main one.
Practical part: building an MVP of a Telegram parser bot
To avoid going deep into fine-tuning, we’ll make a minimal working example:
the userbot reads messages from the chat;
turns them into embeddings;
saves them in Qdrant;
then upon request retrieves the relevant pieces and passes them to the LLM.
The application itself runs as two independent modules, each with its own job. The first, bot.py, collects all new incoming messages. The second, search.py, searches through them.
1. Dependencies
In requirements.txt we add:
telethon==1.36.0
qdrant-client==1.12.1
httpx==0.27.2
python-dotenv==1.0.1
sentence-transformers==5.1.0
2. Environment variables
It’s recommended to store all important and secret data, such as the bot token and Qdrant password, in environment variables. In Amvera, where we will deploy our service, this is the “Variables” tab.
For this project we use the following variables:
API_ID = 123456
API_HASH = your_api_hash
SESSION_PATH = /data/tg.session # Save here
QDRANT_URL = http://localhost:6333
QDRANT_COLLECTION = chat_messages
LLM_BASE_URL = https://kong-proxy.yc.amvera.ru/api/v1/models/llama
LLM_API_KEY = key
LLM_MODEL = llama70b
PYTHONUNBUFFERED = 1
You can check the current API methods in the Swagger documentation.
2.1. How to get API_ID and API_HASH for a userbot
For a userbot to connect to your Telegram account, you need special keys:
API_ID
API_HASH
You can get them on the official Telegram developer website:
Log in to my.telegram.org with your Telegram account.
Select API development tools.
Enter the application name (any will do, for example chat-search), a short name, and fill in the form.
After saving, your API_ID and API_HASH will appear.
3. Userbot on Telethon
import os, asyncio
from dotenv import load_dotenv
from telethon import TelegramClient, events
from qdrant_client import QdrantClient
from qdrant_client.models import VectorParams, Distance, PointStruct
from sentence_transformers import SentenceTransformer

load_dotenv()

API_ID = int(os.getenv("API_ID"))
API_HASH = os.getenv("API_HASH")
SESSION_PATH = os.getenv("SESSION_PATH", "session")
QDRANT_URL = os.getenv("QDRANT_URL", "http://localhost:6333")
COLLECTION = os.getenv("QDRANT_COLLECTION", "chat_messages")

client = TelegramClient(SESSION_PATH, API_ID, API_HASH)
qdrant = QdrantClient(url=QDRANT_URL)
embedder = SentenceTransformer("intfloat/e5-small")

# Create the collection once; e5-small produces 384-dimensional vectors
try:
    qdrant.get_collection(COLLECTION)
except Exception:
    qdrant.create_collection(
        collection_name=COLLECTION,
        vectors_config=VectorParams(size=384, distance=Distance.COSINE),
    )

async def embed_text(text: str):
    # normalize_embeddings=True keeps this consistent with search.py
    return embedder.encode(text, normalize_embeddings=True).tolist()

@client.on(events.NewMessage)
async def handler(event):
    # It's recommended to add a chat_id filter here; this implementation saves all messages
    text = event.message.message
    if not text:
        return
    emb = await embed_text(text)
    point = PointStruct(
        # Note: message ids are unique only within a chat; when indexing several
        # chats, derive a globally unique id from chat_id and message id
        id=event.message.id,
        vector=emb,
        payload={"text": text, "chat_id": event.chat_id},
    )
    qdrant.upsert(COLLECTION, points=[point])
    print(f"Saved message: {text[:50]}...")

async def main():
    await client.start()
    print("Userbot started")
    await client.run_until_disconnected()

if __name__ == "__main__":
    asyncio.run(main())
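The handler above only captures new messages. If you also want to index the old history mentioned in step 1, a backfill function along these lines could be added to bot.py; it reuses client, embed_text, and the other names defined above, and the chat identifier "my_chat" is a placeholder.
# Optional: index the existing history of one chat
async def backfill(chat="my_chat", limit=1000):
    points = []
    async for msg in client.iter_messages(chat, limit=limit):
        if not msg.message:  # skip media without text
            continue
        points.append(PointStruct(
            id=msg.id,
            vector=await embed_text(msg.message),
            payload={"text": msg.message, "chat_id": msg.chat_id},
        ))
    if points:
        qdrant.upsert(COLLECTION, points=points)
    print(f"Backfilled {len(points)} messages")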
4. Semantic search + LLM answer
In a separate file, search.py:
import os
import asyncio
import httpx
from dotenv import load_dotenv
from qdrant_client import QdrantClient
from sentence_transformers import SentenceTransformer

load_dotenv()

QDRANT_URL = os.getenv("QDRANT_URL", "http://localhost:6333")
QDRANT_COLLECTION = os.getenv("QDRANT_COLLECTION", "chat_messages")
LLM_BASE_URL = os.getenv("LLM_BASE_URL")  # e.g.: https://kong-proxy.yc.amvera.ru/api/v1/models/llama
LLM_API_KEY = os.getenv("LLM_API_KEY")
LLM_MODEL = os.getenv("LLM_MODEL", "llama8b")

qdrant = QdrantClient(url=QDRANT_URL)
embedder = SentenceTransformer("intfloat/e5-small")

def embed_text(text: str):
    return embedder.encode([text], normalize_embeddings=True).tolist()[0]

async def llm_answer(question, context):
    headers = {
        "X-Auth-Token": f"Bearer {LLM_API_KEY}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": LLM_MODEL,
        "messages": [
            {"role": "system", "text": "You help find important information in the chat. Respond briefly and to the point."},
            {"role": "user", "text": f"Question: {question}\nContext:\n{context}"},
        ],
    }
    async with httpx.AsyncClient(timeout=120.0) as http:
        r = await http.post(LLM_BASE_URL, headers=headers, json=payload)
        r.raise_for_status()
        data = r.json()
        return data["choices"][0]["message"]["text"]

async def search(question):
    vec = embed_text(question)
    # search() still works in qdrant-client 1.12; newer code can use query_points()
    results = qdrant.search(collection_name=QDRANT_COLLECTION, query_vector=vec, limit=5)
    context = "\n".join(p.payload["text"] for p in results)
    answer = await llm_answer(question, context)
    print("Question:", question)
    print("Answer:", answer)

if __name__ == "__main__":
    q = input("Enter your question: ")
    asyncio.run(search(q))
Example of input and output data.
Question: "What colors were discussed yesterday?"
Context from chat:
1) [12:30] user1: My favorite color is black
2) [12:35] user2: I'm thinking of repainting the site in blue, it will feel fresher
3) [12:40] user3: I vote for green, it looks calmer
4) [12:50] user4: Red attracts more attention
...
Formulate the answer briefly and to the point.
5. How it works together
Run bot.py: the userbot listens to the chat and saves all new messages to Qdrant.
Run python search.py: enter a question and get an answer from the LLM.
You will also need to deploy the database (Qdrant in our case) and connect the LLM via API.
Example:
Question: What colors were discussed yesterday?
Answer: Yesterday several colors were discussed: black, blue, green and red.
This is just an example of parsing Telegram channels, groups, and chats via a bot
We showed the simplest pipeline that can be built with Python, Qdrant, and LLaMA. This code works, but it’s more of an educational framework than a ready-to-use production application.
Why so:
each project has a different scale (from a small group to a corporate chat with hundreds of thousands of messages);
different privacy requirements (sometimes you can store the entire history, sometimes you need to anonymize authors or remove personal data);
different embedding models and LLMs produce different results (some work better for short messages, others for long dialogues).
So you shouldn’t see the example as the ultimate tech solution. It’s much more useful to take this template and try it yourself.
Relevant articles
Your own ChatGPT on documents: building RAG from scratch.
All about Qdrant. Overview of the vector database.
Universal website parsing in Python: requests vs headless, tokens, cookies, proxies, and IP rotation.