renjue 3ed3d34900 Speed up Docker apt install with configurable mirror.

Add APT_MIRROR build arg to switch Debian source before apt-get update, and document faster build command using BuildKit and host network.

Co-authored-by: Cursor <cursoragent@cursor.com>

2026-05-09 16:32:11 +08:00

.vscode

Implement full media crawler workflow with Flask backend and Vue frontend.

2026-05-09 16:16:18 +08:00

backend

Implement full media crawler workflow with Flask backend and Vue frontend.

2026-05-09 16:16:18 +08:00

docker

Update Docker runtime to pull latest code on startup.

2026-05-09 16:19:18 +08:00

docs

Implement full media crawler workflow with Flask backend and Vue frontend.

2026-05-09 16:16:18 +08:00

frontend

Implement full media crawler workflow with Flask backend and Vue frontend.

2026-05-09 16:16:18 +08:00

sdk

Implement full media crawler workflow with Flask backend and Vue frontend.

2026-05-09 16:16:18 +08:00

.dockerignore

Implement full media crawler workflow with Flask backend and Vue frontend.

2026-05-09 16:16:18 +08:00

.gitignore

Implement full media crawler workflow with Flask backend and Vue frontend.

2026-05-09 16:16:18 +08:00

Dockerfile

Speed up Docker apt install with configurable mirror.

2026-05-09 16:32:11 +08:00

README.md

Speed up Docker apt install with configurable mirror.

2026-05-09 16:32:11 +08:00

README.md

media_crawler

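A project that crawls media resources and imports them into the library automatically. It is split into a frontend and a backend:

frontend/: Vue + JavaScript pages
backend/: Python Flask API (orchestrates TMDB -> HDHIVE -> Emby -> CMS)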

Running locally

1) Start the backend (Flask)

cd backend
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
python app.py

Default ports:

Backend: 14620
Frontend: 14621

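Configure the HDHIVE parameters in the backend .env according to the OpenAPI documentation:

HDHIVE_BASE_URL: for example https://hdhive.com
HDHIVE_API_KEY: sent as the X-API-Key request header (required)
HDHIVE_ACCESS_TOKEN: sent as Authorization: Bearer ... (optional, depending on the authorization scenario)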
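
The mapping from these settings to request headers can be sketched as follows. This is a minimal illustration, not the backend's actual code: the function name build_hdhive_headers is hypothetical, and only the variable names and header names come from the list above.

```python
import os


def build_hdhive_headers(env=None):
    """Build HDHive request headers from environment-style settings.

    Hypothetical helper: the real backend may structure this differently.
    """
    env = os.environ if env is None else env

    # HDHIVE_API_KEY is required and is sent as X-API-Key.
    api_key = env.get("HDHIVE_API_KEY", "").strip()
    if not api_key:
        raise ValueError("HDHIVE_API_KEY is required")
    headers = {"X-API-Key": api_key}

    # HDHIVE_ACCESS_TOKEN is optional and is sent as a Bearer token.
    token = env.get("HDHIVE_ACCESS_TOKEN", "").strip()
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers
```

Passing a plain dict instead of os.environ keeps the helper easy to exercise without touching the real environment.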

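The backend supports two modes for the CMS import parameters:

Provide a fixed token directly: CMS_TOKEN
Log in automatically to obtain a token: CMS_LOGIN_URL + CMS_USERNAME + CMS_PASSWORD

The import endpoint is taken from CMS_ADD_SHARE_URL (or built by appending /api/cloud/add_share_down to CMS_BASE_URL).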
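
A rough sketch of how the two modes and the endpoint fallback could be resolved is shown below. The function name resolve_cms_settings and the returned dict layout are hypothetical, and the choice to prefer CMS_TOKEN when both modes are configured is an assumption that the README does not confirm.

```python
import os


def resolve_cms_settings(env=None):
    """Resolve the CMS import endpoint and auth mode (hypothetical helper)."""
    env = os.environ if env is None else env

    # Prefer an explicit endpoint; otherwise append the path to CMS_BASE_URL.
    add_share_url = env.get("CMS_ADD_SHARE_URL", "").strip()
    if not add_share_url:
        base = env.get("CMS_BASE_URL", "").strip().rstrip("/")
        if not base:
            raise ValueError("Set CMS_ADD_SHARE_URL or CMS_BASE_URL")
        add_share_url = base + "/api/cloud/add_share_down"

    # Assumption: a fixed token takes precedence over automatic login.
    token = env.get("CMS_TOKEN", "").strip()
    if token:
        return {"add_share_url": add_share_url, "mode": "token", "token": token}

    # Login mode needs all three values.
    login_keys = ("CMS_LOGIN_URL", "CMS_USERNAME", "CMS_PASSWORD")
    missing = [key for key in login_keys if not env.get(key, "").strip()]
    if missing:
        raise ValueError("Missing CMS settings: " + ", ".join(missing))
    return {
        "add_share_url": add_share_url,
        "mode": "login",
        "login_url": env["CMS_LOGIN_URL"].strip(),
        "username": env["CMS_USERNAME"].strip(),
        "password": env["CMS_PASSWORD"],
    }
```

Failing early on missing values surfaces configuration mistakes at startup rather than on the first import task.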

2) Start the frontend (Vite)

cd frontend
cp .env.example .env
npm install
npm run dev

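Page features

The main page supports searching TMDB for movies and TV shows by keyword
Clicking a poster opens the detail page, which lists the HDHive resources
Clicking a resource link triggers an import task (imported by resource slug)
The task center is on a secondary page at /tasks
The frontend calls the Flask API: POST /api/tasks, GET /api/tasks, GET /api/tasks/{taskId}/logs
Additional endpoints used by the frontend: GET /api/media/search, GET /api/media/{type}/{tmdbId}
Task status, results, errors, and step logs are displayed
The backend stores tasks, resources, and logs in SQLite and enforces idempotency by tmdb_id
The backend uses a unified set of error categories: validation, authentication, authorization, rate_limit, not_found, business_rule, network, timeout, upstream, internal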
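
One way to get idempotency by tmdb_id in SQLite is a UNIQUE constraint combined with INSERT OR IGNORE, sketched below. The table layout, column names, and the open_db and create_task helpers are assumptions made for illustration; the project's actual schema is not shown in this README.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open a database with a minimal, hypothetical tasks table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS tasks (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            tmdb_id INTEGER NOT NULL UNIQUE,
            status TEXT NOT NULL DEFAULT 'pending'
        )
        """
    )
    return conn


def create_task(conn, tmdb_id):
    """Return (task_id, created); a repeated tmdb_id reuses the existing row."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO tasks (tmdb_id) VALUES (?)", (tmdb_id,)
        )
        created = cur.rowcount == 1  # 0 when the UNIQUE constraint skipped it
        row = conn.execute(
            "SELECT id FROM tasks WHERE tmdb_id = ?", (tmdb_id,)
        ).fetchone()
    return row[0], created
```

Submitting the same tmdb_id twice returns the same task id, with created set to False on the second call, so a repeated click does not start a duplicate import.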

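Running in a single Docker container

The project provides a Dockerfile that runs the frontend and backend in a single container. On startup the container first pulls the latest code from Git and then starts:

Flask backend: 14620
Frontend preview server: 14621

Build the image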

docker build -t media-crawler:latest .

If the build is slow, use the acceleration options (BuildKit + host network + optional apt mirror):

DOCKER_BUILDKIT=1 docker build --network=host \
  --build-arg APT_MIRROR=mirrors.aliyun.com \
  -t media-crawler:latest .

Run the container

docker run --rm -it \
  -p 14620:14620 \
  -p 14621:14621 \
  --env-file backend/.env \
  -e GIT_REPO_URL=https://git.rc707blog.top/rose_cat707/media_crawler.git \
  -e GIT_BRANCH=main \
  media-crawler:latest

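Optional environment variables:

GIT_REPO_URL: repository URL the container pulls code from on startup
GIT_BRANCH: branch to pull (default: main)
WORKTREE_DIR: code directory inside the container (default: /app/runtime)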
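
The defaults above can be illustrated with a small sketch. The container's real startup script is not shown here, so the helper names resolve_startup_config and clone_command, and the use of a shallow git clone, are assumptions rather than a description of what the entrypoint actually runs.

```python
import os


def resolve_startup_config(env=None):
    """Apply the documented defaults to the startup variables (hypothetical)."""
    env = os.environ if env is None else env
    return {
        "repo_url": env.get("GIT_REPO_URL", "").strip(),
        # Documented defaults: branch "main", worktree "/app/runtime".
        "branch": env.get("GIT_BRANCH", "").strip() or "main",
        "worktree_dir": env.get("WORKTREE_DIR", "").strip() or "/app/runtime",
    }


def clone_command(config):
    """Build an argument list for a shallow clone (assumed strategy)."""
    if not config["repo_url"]:
        raise ValueError("GIT_REPO_URL is not set")
    return [
        "git", "clone", "--depth", "1",
        "--branch", config["branch"],
        config["repo_url"], config["worktree_dir"],
    ]
```

Returning an argument list rather than a shell string means the values can be passed to subprocess.run without shell quoting concerns.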

Languages

Python 58.9%

Vue 23.4%

CSS 6.5%

Shell 6.5%

JavaScript 3.3%

Other 1.4%
