FastFetchBot

Demo: https://t.me/aturretrss_bot

A social media content fetching service with a Telegram Bot client, built as a monorepo with two microservices.

Send a social media URL to the bot, and it fetches and archives the content for you. Supports most mainstream social media platforms.

Roadmap

Before v1.0.0

  • Support more social media platforms
    • Douyin
    • TikTok
    • Threads
    • Bluesky
  • Enhance the scraping for existing platforms
    • Instagram
    • Reddit
    • Weibo
    • WeChat Public Account Articles
    • Douban
  • More general content scraping and enhanced features
    • General webpage scraping (with more third-party platforms...)
    • LLM Content Translation
    • LLM Content Summarization
    • Podcast Feeds
    • Audio transcription
    • Image OCR
  • Support for more integrations
    • Inoreader
    • Notion
  • Architecture Refactoring
    • Use a message queue for tgbot <-> API communication
  • Code Refactoring
    • Unified API response format
    • Better error handling
    • Better logging
  • Database Support
    • Persistent storage for scraped content
    • Persistent storage for user settings
  • More user interface options
    • Web UI
    • Discord Bot Integration

Architecture

FastFetchBot is organized as a UV workspace monorepo with three packages:

FastFetchBot/
├── packages/shared/          # fastfetchbot-shared: common models, utilities, logger
├── apps/api/                 # FastAPI server: scrapers, storage, routing
├── apps/telegram-bot/        # Telegram Bot: webhook/polling, message handling
├── app/                      # Legacy re-export wrappers (backward compatibility)
├── pyproject.toml            # Root workspace configuration
└── uv.lock                   # Lockfile for the entire workspace

Service                             Port    Description
API Server (apps/api/)              10450   FastAPI app with all platform scrapers, file export, and storage
Telegram Bot (apps/telegram-bot/)   10451   Receives messages via webhook or long polling, calls the API server

The Telegram Bot communicates with the API server over HTTP. In Docker, this is http://api:10450.
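
As a rough illustration of that call path, the bot could forward a URL to the API server like this. This is only a sketch assuming httpx; the /scrape route and payload shape are hypothetical, not the repository's actual API:

import os

import httpx

# URL the bot uses to reach the API server; http://api:10450 inside Docker.
API_SERVER_URL = os.environ.get("API_SERVER_URL", "http://localhost:10450")

async def fetch_content(url: str) -> dict:
    """Ask the API server to scrape `url` and return the parsed result."""
    async with httpx.AsyncClient(base_url=API_SERVER_URL, timeout=60) as client:
        resp = await client.post("/scrape", json={"url": url})
        resp.raise_for_status()
        return resp.json()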

Installation

Docker (Recommended)

  1. Copy docker-compose.template.yml to docker-compose.yml.
  2. Create a .env file from template.env and fill in the environment variables.
  3. If you need large file support (>50 MB), fill in TELEGRAM_API_ID and TELEGRAM_API_HASH in the compose file for the local Telegram Bot API server. Otherwise, comment out the telegram-bot-api service.
  4. Start the services:

docker-compose up -d

The compose file pulls pre-built images from GitHub Container Registry:

  • ghcr.io/aturret/fastfetchbot-api:latest
  • ghcr.io/aturret/fastfetchbot-telegram-bot:latest

To build locally instead, uncomment the build: blocks and comment out the image: lines in docker-compose.yml.

Local Development

Requires Python 3.12 and uv.

# Install all dependencies (including dev)
uv sync

# Run the API server
cd apps/api
uv run gunicorn -k uvicorn.workers.UvicornWorker src.main:app --preload

# Run the Telegram Bot (in a separate terminal)
cd apps/telegram-bot
uv run python -m core.main

Telegram Bot Modes

The bot supports two modes, controlled by the TELEGRAM_BOT_MODE environment variable:

Mode           Value               Use Case
Long Polling   polling (default)   Local development, simple deployments without a reverse proxy
Webhook        webhook             Production with a public HTTPS URL

In both modes, the bot also runs an HTTP server on port 10451 that exposes the /send_message callback endpoint (used by the Inoreader integration) and a /health endpoint.
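
A minimal sketch of the mode switch, assuming python-telegram-bot as the bot framework (an assumption; the project's actual library may differ, and the small HTTP server for /send_message and /health is omitted here):

import os

from telegram.ext import Application

TOKEN = os.environ["TELEGRAM_BOT_TOKEN"]
MODE = os.environ.get("TELEGRAM_BOT_MODE", "polling")
BASE_URL = os.environ.get("BASE_URL", "example.com")

app = Application.builder().token(TOKEN).build()

if MODE == "webhook":
    # Webhook mode: Telegram pushes updates to a public HTTPS URL.
    app.run_webhook(
        listen="0.0.0.0",
        port=10451,
        url_path=TOKEN,
        webhook_url=f"https://{BASE_URL}/{TOKEN}",
    )
else:
    # Polling mode: the bot pulls updates from Telegram's servers.
    app.run_polling()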

Development

Commands

uv sync                    # Install all dependencies
uv run pytest              # Run tests
uv run pytest -v           # Run tests with verbose output
uv run black .             # Format code

Adding a New Platform Scraper

  1. Create a new scraper module in apps/api/src/services/scrapers/<platform>/
  2. Implement the scraper class following the existing patterns (see the skeleton sketch after this list)
  3. Add a platform-specific router in apps/api/src/routers/
  4. Register the scraper in ScraperManager
  5. Add configuration variables in apps/api/src/config.py
  6. Create tests in tests/cases/
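
Since the base-class and registration APIs aren't documented here, the following is only a hypothetical skeleton for steps 2 and 4; the class shape, the URL_PATTERN attribute, and the ScraperManager.register call are assumptions, not the repository's actual interfaces.

# apps/api/src/services/scrapers/example/scraper.py
# Hypothetical skeleton: the real base class and registration API may differ.
import httpx


class ExampleScraper:
    """Scraper for a hypothetical 'example' platform."""

    # Assumed convention: a regex the router uses to match incoming URLs.
    URL_PATTERN = r"https?://(www\.)?example\.com/post/(?P<id>\w+)"

    async def scrape(self, url: str) -> dict:
        async with httpx.AsyncClient() as client:
            resp = await client.get(url)
            resp.raise_for_status()
        # Normalize into the shared content model from packages/shared.
        return {"title": "Example post", "text": resp.text, "media": []}


# Step 4 (assumed call shape): make the scraper discoverable.
# ScraperManager.register("example", ExampleScraper)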

Docker Build

# Build both services locally
docker-compose build

# Or build individually
docker build -f apps/api/Dockerfile -t fastfetchbot-api .
docker build -f apps/telegram-bot/Dockerfile -t fastfetchbot-telegram-bot .

Note: Both Dockerfiles use the repository root as the build context (.) because they need access to pyproject.toml, uv.lock, and packages/shared/.

Environment Variables

Many scrapers require authentication cookies. You can extract cookies using the browser extension Get cookies.txt LOCALLY.

See template.env for a complete reference with comments.

Required

Variable             Description
BASE_URL             Public domain of the server (e.g. example.com). Used for webhook URL construction.
TELEGRAM_BOT_TOKEN   Bot token from @BotFather
TELEGRAM_CHAT_ID     Default chat ID for the bot

Service Communication (Docker)

Variable                    Default                  Description
API_SERVER_URL              http://localhost:10450   URL the Telegram Bot uses to call the API server. Set to http://api:10450 in Docker.
TELEGRAM_BOT_CALLBACK_URL   http://localhost:10451   URL the API server uses to call the Telegram Bot. Set to http://telegram-bot:10451 in Docker.
TELEGRAM_BOT_MODE           polling                  polling or webhook

Optional

API Server

Variable   Default          Description
PORT       10450            API server port
API_KEY    auto-generated   API key for authentication

Telegram

Variable                      Default   Description
TELEBOT_API_SERVER_HOST       None      Local Telegram Bot API server host
TELEBOT_API_SERVER_PORT       None      Local Telegram Bot API server port
TELEGRAM_CHANNEL_ID           None      Channel ID(s) for the bot, comma-separated
TELEGRAM_CHANNEL_ADMIN_LIST   None      User IDs allowed to post to the channel, comma-separated
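
For the comma-separated variables above, a typical parsing approach looks like this (illustrative only; the project's actual parsing may differ):

import os

def split_ids(name: str) -> list[str]:
    """Split a comma-separated env var like TELEGRAM_CHANNEL_ID into IDs."""
    raw = os.environ.get(name, "")
    return [item.strip() for item in raw.split(",") if item.strip()]

channel_ids = split_ids("TELEGRAM_CHANNEL_ID")
admin_ids = split_ids("TELEGRAM_CHANNEL_ADMIN_LIST")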

Platform Cookies & Credentials

Platform      Variables
Twitter       TWITTER_CT0, TWITTER_AUTH_TOKEN
Reddit        REDDIT_CLIENT_ID, REDDIT_CLIENT_SECRET, REDDIT_USERNAME, REDDIT_PASSWORD
Weibo         WEIBO_COOKIES
Xiaohongshu   See Xiaohongshu Setup below
Instagram     X_RAPIDAPI_KEY
Zhihu         Store cookies in conf/zhihu_cookies.json

Xiaohongshu Setup

Xiaohongshu (XHS) API requests require a cryptographic signature (x-s, x-t, etc.) that must be computed by a dedicated signing proxy. FastFetchBot delegates this to an external sign server.

Note: We currently use a closed-source sign server. You will need to run your own compatible signing proxy and point SIGN_SERVER_URL at it.

The sign server must accept POST /signsrv/v1/xhs/sign with a JSON body:

{"uri": "/api/sns/web/v1/feed", "data": {...}, "cookies": "a1=..."}

and return:

{"isok": true, "data": {"x_s": "...", "x_t": "...", "x_s_common": "...", "x_b3_traceid": "..."}}

Cookie configuration (two options, with the file taking priority; see the loading sketch after this list):

  • File (recommended): Create apps/api/conf/xhs_cookies.txt containing your XHS cookies as a single line:

    a1=xxxxxxxx; web_id=xxxxxxxx; web_session=xxxxxxxx
    

    Log in to xiaohongshu.com in your browser, then copy the cookie values from DevTools → Application → Cookies, or use the Get cookies.txt LOCALLY extension.

  • Environment variables (legacy fallback): Set XIAOHONGSHU_A1, XIAOHONGSHU_WEBID, and XIAOHONGSHU_WEBSESSION individually. Used only when the cookie file is absent.
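
A loading sketch that implements this priority order (illustrative; the project's actual loader may differ):

import os
from pathlib import Path

def load_xhs_cookies(path: str = "conf/xhs_cookies.txt") -> dict:
    """Load XHS cookies from the file, falling back to legacy env vars."""
    cookie_file = Path(path)
    if cookie_file.exists():
        # Parse "a1=...; web_id=...; web_session=..." into a dict.
        line = cookie_file.read_text().strip()
        return dict(pair.split("=", 1) for pair in line.split("; ") if "=" in pair)
    return {
        "a1": os.environ.get("XIAOHONGSHU_A1", ""),
        "web_id": os.environ.get("XIAOHONGSHU_WEBID", ""),
        "web_session": os.environ.get("XIAOHONGSHU_WEBSESSION", ""),
    }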

Variable                 Default                 Description
SIGN_SERVER_URL          http://localhost:8989   URL of the XHS signing proxy
XHS_COOKIE_PATH          conf/xhs_cookies.txt    Path to the cookie file (overrides the default location)
XIAOHONGSHU_A1           None                    a1 cookie value (legacy fallback)
XIAOHONGSHU_WEBID        None                    web_id cookie value (legacy fallback)
XIAOHONGSHU_WEBSESSION   None                    web_session cookie value (legacy fallback)

Cloud Services

Variable                Description
OPENAI_API_KEY          OpenAI API key for audio transcription
AWS_ACCESS_KEY_ID       Amazon S3 access key
AWS_SECRET_ACCESS_KEY   Amazon S3 secret key
AWS_S3_BUCKET_NAME      S3 bucket name
AWS_S3_REGION_NAME      S3 region
AWS_DOMAIN_HOST         Custom domain bound to the S3 bucket
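
If the S3 variables are set, exported files can be uploaded with boto3. A minimal sketch under that assumption (not necessarily how this repository performs uploads):

import os

import boto3

# boto3 also reads AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY from the
# environment automatically; they are passed explicitly here for clarity.
s3 = boto3.client(
    "s3",
    aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
    region_name=os.environ["AWS_S3_REGION_NAME"],
)
s3.upload_file("export.html", os.environ["AWS_S3_BUCKET_NAME"], "export.html")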

General Webpage Scraping

Variable               Default     Description
GENERAL_SCRAPING_ON    false       Enable scraping for unrecognized URLs
GENERAL_SCRAPING_API   FIRECRAWL   Backend: FIRECRAWL or ZYTE
FIRECRAWL_API_URL      -           Firecrawl API server URL
FIRECRAWL_API_KEY      -           Firecrawl API key
ZYTE_API_KEY           -           Zyte API key

Supported Content Types

Social Media

  • Twitter
  • Bluesky
  • Instagram
  • Threads
  • Reddit (beta; only some post types are supported)
  • Weibo
  • WeChat Public Account Articles
  • Zhihu
  • Douban
  • Xiaohongshu

Video

  • YouTube
  • Bilibili

CI/CD

The GitHub Actions pipeline (.github/workflows/ci.yml) automatically builds and pushes both microservice images to GitHub Container Registry on every push to main:

  • ghcr.io/aturret/fastfetchbot-api:latest
  • ghcr.io/aturret/fastfetchbot-telegram-bot:latest

Acknowledgements

The HTML to Telegra.ph converter is based on html-telegraph-poster. I have since split it out of FastFetchBot as an independent Python package: html-telegraph-poster-v2.

The original Xiaohongshu scraper was based on MediaCrawler. The current implementation uses a custom httpx-based adapter with an external signing proxy.

The Weibo scraper is based on weiboSpider.

The Twitter scraper is based on twitter-api-client.

The Zhihu scraper is based on fxzhihu.

All of the above code is licensed under the MIT license; I either used it as-is or modified it to implement certain features. My thanks to all of these projects for their contributions.
