Compare commits

...

4 Commits

Author SHA1 Message Date
Admin
29d0eeb7e8 docs: add architecture diagrams (D2 + Mermaid)
All checks were successful
CI / Scraper / Test (pull_request) Successful in 10s
CI / UI / Build (pull_request) Successful in 28s
CI / UI / Docker Push (pull_request) Has been skipped
CI / Scraper / Lint (pull_request) Successful in 1m5s
CI / Scraper / Docker Push (pull_request) Has been skipped
iOS CI / Build (pull_request) Successful in 8m7s
iOS CI / Test (pull_request) Successful in 16m14s
Adds docs/architecture.d2 and docs/architecture.mermaid.md showing the
docker-compose-new.yml service topology — storage, application, init
containers, and external dependencies with annotated connections.

Also includes the rendered docs/architecture.svg (D2 output).

View live: d2 --watch docs/architecture.d2
View in Gitea: navigate to docs/architecture.mermaid.md in the web UI.
2026-03-21 20:35:03 +05:00
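
For reference, the committed SVG can be regenerated from the D2 source with the d2 CLI (a usage sketch; input and output paths follow the repo layout named above):

```sh
# Re-render the diagram after editing the D2 source.
d2 docs/architecture.d2 docs/architecture.svg
```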
Admin
fabe9724c2 fix(scraper): add Brotli decompression to HTTP client
Some checks failed
CI / Scraper / Lint (pull_request) Failing after 21s
CI / Scraper / Test (pull_request) Failing after 21s
CI / UI / Build (pull_request) Failing after 21s
CI / Scraper / Docker Push (pull_request) Has been skipped
CI / UI / Docker Push (pull_request) Has been skipped
iOS CI / Build (pull_request) Successful in 3m53s
iOS CI / Test (pull_request) Successful in 6m32s
novelfire.net responds with Content-Encoding: br when the scraper
advertises 'gzip, deflate, br'. The client only handled gzip, so
Brotli-compressed bytes were fed raw into the HTML parser, producing
garbage: empty titles, zero chapters, and selector failures.

Added github.com/andybalholm/brotli and wired it into GetContent
alongside the existing gzip path.
2026-03-20 11:19:28 +05:00
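
The server-side behavior is easy to confirm from a shell (a sketch; assumes curl is installed and novelfire.net still serves Brotli when it is offered):

```sh
# Advertise the same encodings the scraper sends and inspect the response headers.
curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip, deflate, br' 'https://novelfire.net/' \
  | grep -i '^content-encoding'
# "content-encoding: br" means the body bytes are Brotli-compressed and must be
# decompressed before they reach the HTML parser.
```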
Admin
4c9bb4adde feat: add pb-init-v2.sh for v2 stack; wire into docker-compose-new.yml
All checks were successful
CI / Scraper / Lint (pull_request) Successful in 12s
CI / Scraper / Test (pull_request) Successful in 16s
CI / UI / Build (pull_request) Successful in 16s
CI / Scraper / Docker Push (pull_request) Has been skipped
CI / UI / Docker Push (pull_request) Has been skipped
iOS CI / Build (pull_request) Successful in 5m20s
iOS CI / Test (pull_request) Successful in 7m4s
Minimal PocketBase bootstrap for the v2 stack (backend + runner + ui-v2).
Creates only the 6 collections actually used by v2:
  books, chapters_idx, ranking, progress, scraping_tasks, audio_jobs

Drops v1-only collections (app_users, user_settings, audio_cache,
book_comments, comment_votes, user_library, user_sessions,
user_subscriptions) and unused fields (date_label, user_id/audio_time
on progress). heartbeat_at is included in create_collection from the
start and also covered by ensure_field for existing instances.

docker-compose-new.yml pb-init service now mounts pb-init-v2.sh.
2026-03-15 21:58:02 +05:00
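
A quick smoke test after the bootstrap runs (a sketch; uses PocketBase's collections list endpoint, and $TOKEN is a superuser token obtained the same way the script does below):

```sh
# List collection names; the six v2 collections should all be present.
curl -s -H "Authorization: Bearer $TOKEN" \
  "http://pocketbase:8090/api/collections?perPage=200" |
  python3 -c 'import sys, json; print(sorted(c["name"] for c in json.load(sys.stdin)["items"]))'
# Expect audio_jobs, books, chapters_idx, progress, ranking and scraping_tasks
# in the output (PocketBase system collections may appear alongside them).
```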
Admin
22b6ee824e fix(pb-init): use python3 for JSON parsing in ensure_field; add heartbeat_at fields
All checks were successful
CI / Scraper / Test (pull_request) Successful in 14s
CI / UI / Build (pull_request) Successful in 17s
CI / UI / Docker Push (pull_request) Has been skipped
CI / Scraper / Lint (pull_request) Successful in 21s
CI / Scraper / Docker Push (pull_request) Has been skipped
iOS CI / Build (pull_request) Successful in 8m22s
iOS CI / Test (pull_request) Successful in 12m35s
The sed-based collection id and fields extraction was greedy and broke on
collections with multiple fields (grabbed the last field id instead of the
top-level collection id → PATCH to wrong URL → 404).

Rewrite ensure_field to use python3 for reliable JSON parsing. Also add the
missing heartbeat_at (date) field to scraping_tasks and audio_jobs, which was
never applied on the initial deploy because the bug prevented the PATCH.
2026-03-15 21:53:54 +05:00
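
The sed failure mode reproduces in isolation (illustrative JSON; col_123, f1 and f2 are made-up ids):

```sh
# Greedy .* anchors to the LAST "id" key, so a field id shadows the
# top-level collection id; python3 indexes the document unambiguously.
SCHEMA='{"id":"col_123","fields":[{"id":"f1","name":"slug"},{"id":"f2","name":"title"}]}'
echo "$SCHEMA" | sed 's/.*"id":"\([^"]*\)".*/\1/'    # -> f2 (wrong: a field id)
echo "$SCHEMA" | python3 -c 'import sys, json; print(json.load(sys.stdin)["id"])'    # -> col_123
```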
9 changed files with 552 additions and 12 deletions

docker-compose-new.yml

@@ -65,7 +65,7 @@ services:
       POCKETBASE_ADMIN_EMAIL: "${POCKETBASE_ADMIN_EMAIL:-admin@libnovel.local}"
       POCKETBASE_ADMIN_PASSWORD: "${POCKETBASE_ADMIN_PASSWORD:-changeme123}"
     volumes:
-      - ./scripts/pb-init.sh:/pb-init.sh:ro
+      - ./scripts/pb-init-v2.sh:/pb-init.sh:ro
    entrypoint: ["sh", "/pb-init.sh"]

# ─── Backend API ──────────────────────────────────────────────────────────────
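
On an already-deployed stack, the one-shot init service can be re-run by itself to pick up the new script (a sketch; pb-init is the service name implied by the excerpt above):

```sh
# Recreate and run only the bootstrap container against the running stack.
docker compose -f docker-compose-new.yml run --rm pb-init
```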

docs/architecture.d2 Normal file (+99 lines)

@@ -0,0 +1,99 @@
direction: right
# ─── External ─────────────────────────────────────────────────────────────────
novelfire: novelfire.net {
shape: cloud
style.fill: "#f0f4ff"
}
kokoro: Kokoro-FastAPI TTS {
shape: cloud
style.fill: "#f0f4ff"
}
browser: Browser / iOS App {
shape: person
style.fill: "#fff9e6"
}
# ─── Init containers (one-shot) ───────────────────────────────────────────────
init: Init containers {
style.fill: "#f5f5f5"
style.stroke-dash: 4
minio-init: minio-init {
shape: rectangle
label: "minio-init\n(mc: create buckets)"
}
pb-init: pb-init {
shape: rectangle
label: "pb-init\n(bootstrap collections)"
}
}
# ─── Storage ──────────────────────────────────────────────────────────────────
storage: Storage {
style.fill: "#eaf7ea"
minio: MinIO {
shape: cylinder
label: "MinIO :9000\n\nbuckets:\n libnovel-chapters\n libnovel-audio\n libnovel-avatars\n libnovel-browse"
}
pocketbase: PocketBase {
shape: cylinder
label: "PocketBase :8090\n\ncollections:\n books chapters_idx\n audio_cache progress\n scrape_jobs app_users\n ranking"
}
}
# ─── Application ──────────────────────────────────────────────────────────────
app: Application {
style.fill: "#eef3ff"
backend: backend {
shape: rectangle
label: "Backend API :8080\n(Go — HTTP API server)"
}
runner: runner {
shape: rectangle
label: "Runner\n(Go — background worker\nscraping + TTS jobs)"
}
ui: ui {
shape: rectangle
label: "SvelteKit UI :5252\n(adapter-node)"
}
}
# ─── Init → Storage deps ──────────────────────────────────────────────────────
init.minio-init -> storage.minio: create buckets {style.stroke-dash: 4}
init.pb-init -> storage.pocketbase: bootstrap schema {style.stroke-dash: 4}
# ─── App → Storage ────────────────────────────────────────────────────────────
app.backend -> storage.minio: blobs (chapters, audio,\navatars, browse)
app.backend -> storage.pocketbase: structured records\n(books, progress, jobs…)
app.runner -> storage.minio: write chapter markdown\n& audio MP3s
app.runner -> storage.pocketbase: read/update scrape jobs\nwrite book records
# ─── App internal ─────────────────────────────────────────────────────────────
app.ui -> app.backend: REST API calls\n(server-side)
# ─── External → App ───────────────────────────────────────────────────────────
app.runner -> novelfire: scrape\n(HTTP GET)
app.runner -> kokoro: TTS generation\n(HTTP POST)
# ─── Browser ──────────────────────────────────────────────────────────────────
browser -> app.ui: HTTPS :5252
browser -> storage.minio: presigned URLs\n(audio / chapter downloads)

docs/architecture.mermaid.md Normal file (+47 lines)

@@ -0,0 +1,47 @@
```mermaid
graph LR
%% ── External ──────────────────────────────────────────────────────────
NF([novelfire.net])
KK([Kokoro-FastAPI TTS])
CL([Browser / iOS App])
%% ── Init containers ───────────────────────────────────────────────────
subgraph INIT["Init containers (one-shot)"]
MI[minio-init\nmc: create buckets]
PI[pb-init\nbootstrap collections]
end
%% ── Storage ───────────────────────────────────────────────────────────
subgraph STORAGE["Storage"]
MN[(MinIO :9000\nchapters · audio\navatars · browse)]
PB[(PocketBase :8090\nbooks · chapters_idx\nranking · progress\nscraping_tasks · audio_jobs)]
end
%% ── Application ───────────────────────────────────────────────────────
subgraph APP["Application"]
BE[Backend API :8080\nGo HTTP server]
RN[Runner\nGo background worker]
UI[SvelteKit UI :5252]
end
%% ── Init → Storage ────────────────────────────────────────────────────
MI -.->|create buckets| MN
PI -.->|bootstrap schema| PB
%% ── App → Storage ─────────────────────────────────────────────────────
BE -->|blobs| MN
BE -->|structured records| PB
RN -->|chapter markdown & audio| MN
RN -->|read/update jobs & books| PB
%% ── App internal ──────────────────────────────────────────────────────
UI -->|REST API| BE
%% ── Runner → External ─────────────────────────────────────────────────
RN -->|scrape HTTP GET| NF
RN -->|TTS HTTP POST| KK
%% ── Client ────────────────────────────────────────────────────────────
CL -->|HTTPS :5252| UI
CL -->|presigned URLs| MN
```

docs/architecture.svg Normal file (+119 lines, 43 KiB)

File diff suppressed because one or more lines are too long

go.mod

@@ -10,6 +10,7 @@ require (
require (
	github.com/BurntSushi/toml v1.4.1-0.20240526193622-a339e1f7089c // indirect
+	github.com/andybalholm/brotli v1.2.0 // indirect
	github.com/davecgh/go-spew v1.1.1 // indirect
	github.com/dustin/go-humanize v1.0.1 // indirect
	github.com/go-ini/ini v1.67.0 // indirect

go.sum

@@ -1,5 +1,7 @@
github.com/BurntSushi/toml v1.4.1-0.20240526193622-a339e1f7089c h1:pxW6RcqyfI9/kWtOwnv/G+AzdKuy2ZrqINhenH4HyNs=
github.com/BurntSushi/toml v1.4.1-0.20240526193622-a339e1f7089c/go.mod h1:ukJfTF/6rtPPRCnwkur4qwRxa8vTRFBF0uk2lLoLwho=
+github.com/andybalholm/brotli v1.2.0 h1:ukwgCxwYrmACq68yiUqwIWnGY0cTPox/M94sVwToPjQ=
+github.com/andybalholm/brotli v1.2.0/go.mod h1:rzTDkvFWvIrjDXZHkuS16NPggd91W3kUSvPlQ1pLaKY=
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/dustin/go-humanize v1.0.1 h1:GzkhY7T5VNhEkwH0PVJgjz+fX1rhBrR7pRT3mDkpeCY=

scraper HTTP client (Go)

@@ -10,6 +10,8 @@ import (
"os"
"strings"
"time"
"github.com/andybalholm/brotli"
)
type httpClient struct {
@@ -106,16 +108,17 @@ func (c *httpClient) GetContent(ctx context.Context, req ContentRequest) (string
// net/http decompresses gzip automatically only when it sets the header
// itself; since we set Accept-Encoding explicitly we must do it ourselves.
body := resp.Body
if strings.EqualFold(resp.Header.Get("Content-Encoding"), "gzip") {
switch strings.ToLower(resp.Header.Get("Content-Encoding")) {
case "gzip":
gr, gzErr := gzip.NewReader(resp.Body)
if gzErr != nil {
return "", fmt.Errorf("http: gzip reader: %w", gzErr)
}
defer gr.Close()
body = gr
case "br":
body = io.NopCloser(brotli.NewReader(resp.Body))
}
// br (Brotli) decompression requires an external package; skip for now —
// the server will fall back to gzip or plain text for unknown encodings.
raw, err := io.ReadAll(body)
if err != nil {

scripts/pb-init-v2.sh Executable file (+257 lines)

@@ -0,0 +1,257 @@
#!/bin/sh
# pb-init-v2.sh — idempotent PocketBase collection bootstrap for the v2 stack
#
# Creates all collections required by libnovel v2 (backend + runner + ui-v2).
# Safe to re-run: POST returns 400/422 when a collection already exists; both
# are treated as success. The ensure_field helper adds fields to existing
# instances without touching fields that are already present.
#
# Collections created:
# books — book metadata
# chapters_idx — per-chapter index (title, number)
# ranking — novelfire ranking snapshots
# progress — per-session reading progress
# scraping_tasks — scrape job queue (runner ↔ backend)
# audio_jobs — TTS job queue (runner ↔ backend)
#
# Required env vars (with defaults matching docker-compose-new.yml):
# POCKETBASE_URL http://pocketbase:8090
# POCKETBASE_ADMIN_EMAIL admin@libnovel.local
# POCKETBASE_ADMIN_PASSWORD changeme123
set -e
PB_URL="${POCKETBASE_URL:-http://pocketbase:8090}"
PB_EMAIL="${POCKETBASE_ADMIN_EMAIL:-admin@libnovel.local}"
PB_PASSWORD="${POCKETBASE_ADMIN_PASSWORD:-changeme123}"
log() { echo "[pb-init-v2] $*"; }
# ─── 0. Ensure curl and python3 are available ────────────────────────────────
if ! command -v curl > /dev/null 2>&1; then
  apk add --no-cache curl > /dev/null 2>&1
fi
if ! command -v python3 > /dev/null 2>&1; then
  apk add --no-cache python3 > /dev/null 2>&1
fi

# ─── 1. Wait for PocketBase to be ready ──────────────────────────────────────
log "waiting for PocketBase at $PB_URL ..."
until curl -sf "$PB_URL/api/health" > /dev/null 2>&1; do
  sleep 2
done
log "PocketBase is up"
# ─── 2. Ensure the superuser exists ──────────────────────────────────────────
#
# On a fresh install PocketBase v0.23+ exposes a one-time install token in the
# /_/ redirect Location header. Use it to create the superuser if needed; on
# subsequent runs the token is gone and we fall through to normal auth.
log "ensuring superuser $PB_EMAIL exists ..."
LOCATION=$(curl -sf -o /dev/null -w "%{redirect_url}" "$PB_URL/_/" 2>/dev/null || true)
if echo "$LOCATION" | grep -q "pbinstal/"; then
INSTALL_TOKEN=$(echo "$LOCATION" | sed 's|.*pbinstal/||' | tr -d ' \r\n')
log "install token found — creating superuser via install endpoint"
curl -sf -X POST "$PB_URL/api/collections/_superusers/records" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $INSTALL_TOKEN" \
-d "{\"email\":\"$PB_EMAIL\",\"password\":\"$PB_PASSWORD\",\"passwordConfirm\":\"$PB_PASSWORD\"}" \
> /dev/null 2>&1 || true
log "superuser create attempted (may already exist)"
fi
# ─── 3. Authenticate and obtain a superuser token ────────────────────────────
log "authenticating as $PB_EMAIL ..."
AUTH_RESPONSE=$(curl -sf -X POST "$PB_URL/api/collections/_superusers/auth-with-password" \
  -H "Content-Type: application/json" \
  -d "{\"identity\":\"$PB_EMAIL\",\"password\":\"$PB_PASSWORD\"}")
TOKEN=$(echo "$AUTH_RESPONSE" | sed 's/.*"token":"\([^"]*\)".*/\1/')
if [ -z "$TOKEN" ] || [ "$TOKEN" = "$AUTH_RESPONSE" ]; then
  log "ERROR: failed to obtain auth token. Response: $AUTH_RESPONSE"
  exit 1
fi
log "auth token obtained"
# ─── 4. Helpers ──────────────────────────────────────────────────────────────
# create_collection NAME JSON_BODY
# POSTs to /api/collections. 400/422 = already exists → treated as success.
create_collection() {
  NAME="$1"
  BODY="$2"
  STATUS=$(curl -s -o /dev/null -w "%{http_code}" \
    -X POST "$PB_URL/api/collections" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $TOKEN" \
    -d "$BODY")
  case "$STATUS" in
    200|201) log "created collection: $NAME" ;;
    400|422) log "collection already exists (skipped): $NAME" ;;
    *)       log "WARNING: unexpected status $STATUS for collection: $NAME" ;;
  esac
}
# ensure_field COLLECTION FIELD_NAME FIELD_TYPE
#
# Uses python3 to parse the collection schema, then PATCHes the full fields
# array with the new field appended — only if it is not already present.
# python3 is required to correctly extract the top-level collection id from
# the JSON response (sed-based extraction is unreliable on multi-field schemas
# because the greedy pattern picks up a field id instead of the collection id).
ensure_field() {
  COLL="$1"
  FIELD_NAME="$2"
  FIELD_TYPE="$3"
  SCHEMA=$(curl -sf \
    -H "Authorization: Bearer $TOKEN" \
    "$PB_URL/api/collections/$COLL" 2>/dev/null)
  PARSED=$(echo "$SCHEMA" | python3 -c "
import sys, json
try:
    d = json.load(sys.stdin)
    fields = d.get('fields', [])
    exists = any(f.get('name') == '$FIELD_NAME' for f in fields)
    print('exists=' + str(exists))
    print('id=' + d.get('id', ''))
    if not exists:
        fields.append({'name': '$FIELD_NAME', 'type': '$FIELD_TYPE'})
        print('fields=' + json.dumps(fields))
except Exception as e:
    print('error=' + str(e))
" 2>/dev/null)
  if echo "$PARSED" | grep -q "^exists=True"; then
    log "field $COLL.$FIELD_NAME already exists — skipping"
    return
  fi
  COLLECTION_ID=$(echo "$PARSED" | grep "^id=" | sed 's/^id=//')
  if [ -z "$COLLECTION_ID" ]; then
    log "WARNING: could not get id for collection $COLL — skipping ensure_field"
    return
  fi
  NEW_FIELDS=$(echo "$PARSED" | grep "^fields=" | sed 's/^fields=//')
  STATUS=$(curl -s -o /dev/null -w "%{http_code}" \
    -X PATCH "$PB_URL/api/collections/$COLLECTION_ID" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $TOKEN" \
    -d "{\"fields\":${NEW_FIELDS}}")
  case "$STATUS" in
    200|201) log "patched $COLL — added field: $FIELD_NAME ($FIELD_TYPE)" ;;
    *)       log "WARNING: patch returned $STATUS when adding $FIELD_NAME to $COLL" ;;
  esac
}
# ─── 5. Collections ───────────────────────────────────────────────────────────
# books — one record per scraped novel
create_collection "books" '{
"name": "books",
"type": "base",
"fields": [
{"name": "slug", "type": "text", "required": true},
{"name": "title", "type": "text", "required": true},
{"name": "author", "type": "text"},
{"name": "cover", "type": "text"},
{"name": "status", "type": "text"},
{"name": "genres", "type": "json"},
{"name": "summary", "type": "text"},
{"name": "total_chapters", "type": "number"},
{"name": "source_url", "type": "text"},
{"name": "ranking", "type": "number"}
]
}'
# chapters_idx — lightweight chapter list (no content; content lives in MinIO)
create_collection "chapters_idx" '{
"name": "chapters_idx",
"type": "base",
"fields": [
{"name": "slug", "type": "text", "required": true},
{"name": "number", "type": "number", "required": true},
{"name": "title", "type": "text"}
]
}'
# ranking — periodic novelfire ranking snapshots
create_collection "ranking" '{
"name": "ranking",
"type": "base",
"fields": [
{"name": "rank", "type": "number", "required": true},
{"name": "slug", "type": "text", "required": true},
{"name": "title", "type": "text"},
{"name": "author", "type": "text"},
{"name": "cover", "type": "text"},
{"name": "status", "type": "text"},
{"name": "genres", "type": "json"},
{"name": "source_url", "type": "text"}
]
}'
# progress — per-session reading progress (no user accounts required)
create_collection "progress" '{
"name": "progress",
"type": "base",
"fields": [
{"name": "session_id", "type": "text", "required": true},
{"name": "slug", "type": "text", "required": true},
{"name": "chapter", "type": "number"}
]
}'
# scraping_tasks — scrape job queue consumed by the runner
create_collection "scraping_tasks" '{
"name": "scraping_tasks",
"type": "base",
"fields": [
{"name": "kind", "type": "text"},
{"name": "target_url", "type": "text"},
{"name": "from_chapter", "type": "number"},
{"name": "to_chapter", "type": "number"},
{"name": "worker_id", "type": "text"},
{"name": "status", "type": "text", "required": true},
{"name": "books_found", "type": "number"},
{"name": "chapters_scraped", "type": "number"},
{"name": "chapters_skipped", "type": "number"},
{"name": "errors", "type": "number"},
{"name": "error_message", "type": "text"},
{"name": "started", "type": "date"},
{"name": "finished", "type": "date"},
{"name": "heartbeat_at", "type": "date"}
]
}'
# audio_jobs — TTS generation queue consumed by the runner
create_collection "audio_jobs" '{
"name": "audio_jobs",
"type": "base",
"fields": [
{"name": "cache_key", "type": "text", "required": true},
{"name": "slug", "type": "text", "required": true},
{"name": "chapter", "type": "number", "required": true},
{"name": "voice", "type": "text"},
{"name": "worker_id", "type": "text"},
{"name": "status", "type": "text", "required": true},
{"name": "error_message", "type": "text"},
{"name": "started", "type": "date"},
{"name": "finished", "type": "date"},
{"name": "heartbeat_at", "type": "date"}
]
}'
# ─── 6. Schema migrations (idempotent — safe to re-run on existing instances) ─
#
# heartbeat_at was added after the initial v2 deploy. ensure_field is a no-op
# if the field already exists (e.g. fresh installs that ran this script from
# the start already have it from the create_collection call above).
ensure_field "scraping_tasks" "heartbeat_at" "date"
ensure_field "audio_jobs" "heartbeat_at" "date"
log "all collections ready"

scripts/pb-init.sh

@@ -98,22 +98,34 @@ ensure_field() {
-H "Authorization: Bearer $TOKEN" \
"$PB_URL/api/collections/$COLL" 2>/dev/null)
# Check if the field already exists (look for "name":"<FIELD_NAME>" in the fields array)
if echo "$SCHEMA" | grep -q "\"name\":\"$FIELD_NAME\""; then
# Use python3 to reliably parse the JSON schema.
PARSED=$(echo "$SCHEMA" | python3 -c "
import sys, json
try:
d = json.load(sys.stdin)
fields = d.get('fields', [])
exists = any(f.get('name') == '$FIELD_NAME' for f in fields)
print('exists=' + str(exists))
print('id=' + d.get('id', ''))
if not exists:
fields.append({'name': '$FIELD_NAME', 'type': '$FIELD_TYPE'})
print('fields=' + json.dumps(fields))
except Exception as e:
print('error=' + str(e))
" 2>/dev/null)
if echo "$PARSED" | grep -q "^exists=True"; then
log "field $COLL.$FIELD_NAME already exists — skipping"
return
fi
COLLECTION_ID=$(echo "$SCHEMA" | sed 's/.*"id":"\([^"]*\)".*/\1/')
if [ -z "$COLLECTION_ID" ] || [ "$COLLECTION_ID" = "$SCHEMA" ]; then
COLLECTION_ID=$(echo "$PARSED" | grep "^id=" | sed 's/^id=//')
if [ -z "$COLLECTION_ID" ]; then
log "WARNING: could not get id for collection $COLL — skipping ensure_field"
return
fi
# Extract current fields array and append the new field before the closing bracket.
CURRENT_FIELDS=$(echo "$SCHEMA" | sed 's/.*"fields":\(\[.*\]\).*/\1/')
TRIMMED=$(echo "$CURRENT_FIELDS" | sed 's/]$//')
NEW_FIELDS="${TRIMMED},{\"name\":\"${FIELD_NAME}\",\"type\":\"${FIELD_TYPE}\"}]"
NEW_FIELDS=$(echo "$PARSED" | grep "^fields=" | sed 's/^fields=//')
PATCH_BODY="{\"fields\":${NEW_FIELDS}}"
STATUS=$(curl -s -o /dev/null -w "%{http_code}" \