Xiaohongshu works sync pagination issue

Ethanfly 14 hours ago
parent
commit
a106f7dc97

+ 170 - 0
server/python/PAGING_LOGIC.md

@@ -0,0 +1,170 @@
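+# Xiaohongshu Works Auto-Pagination Logic
+
+## Page Count Calculation
+
+### 1. Page Counters
+
+The `get_all_works` method keeps two key counters:
+
+- **`iters`**: total number of requests (including failed requests)
+  - Initial value: 0
+  - Each loop iteration: `iters += 1`
+  - Upper limit: 800 (`max_iters = 800`)
+
+- **`page_count`**: number of pages fetched successfully (only pages that contain works are counted)
+  - Initial value: 0
+  - Incremented only when `notes` is non-empty: `page_count += 1`
+  - This is the **number of pages actually fetched**, not the total page count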
+
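+### 2. Total Page Count
+
+The total page count can be computed as:
+
+```python
+total_pages = math.ceil(total / api_page_size)  # requires `import math`
+```
+
+Where:
+- **Total works** (`total`): the total declared by the API, which may come from:
+  - the `notes_count` of the `special.note_time_desc` tag in `tags`
+  - the largest `notes_count` among the other tags in `tags`
+  - `data.total` or `data.total_count`
+  - `data.page.total` or `data.page.totalCount`
+  
+- **Page size** (`api_page_size`): fixed at 20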
+
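+### 3. Pagination Flow
+
+```
+Start loop (at most 800 iterations)
+  ↓
+iters += 1 (total request count)
+  ↓
+Call API: fetch_notes_page(cursor)
+  ↓
+Check whether the response succeeded
+  ├─ Failed → on the first request, switch to scroll mode; otherwise stop
+  └─ Succeeded → continue
+      ↓
+Check whether notes is empty
+  ├─ Empty → stop paginating
+  └─ Non-empty → page_count += 1 (successful pages + 1)
+      ↓
+Parse works and deduplicate
+  ↓
+Check whether all works have been fetched
+  ├─ Accumulated works >= total → stop paginating
+  └─ Otherwise → update cursor and fetch the next page
+```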
+
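+### 4. Pagination Statistics Output
+
+After pagination finishes, a statistics line is printed (fields, in order: total requests, pages fetched, accumulated works, declared total):
+
+```
+分页统计: 总请求次数={iters}, 成功获取页数={page_count}, 累计作品数={len(works)}, 声明总数={total}
+```
+
+Example:
+- Total requests: 20 (every request counts, even ones that return nothing)
+- Pages fetched: 18 (only pages that return works are counted)
+- Accumulated works: 360 (works actually fetched)
+- Declared total: 368 (total works declared by the API)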
+
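+### 5. Page Count Example
+
+Suppose the account has 368 works:
+
+```
+Total works = 368
+Page size = 20
+Theoretical total pages = ceil(368 / 20) = 19 pages
+
+Actual execution:
+- Pages 1-18: 20 works each, 360 in total
+- Page 19: 8 works
+- page_count = 19 (pages fetched successfully)
+- iters = 19 (total requests, assuming all succeed)
+```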
+
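+## Where Logs Are Written
+
+### Python Service Logs
+
+The Python service logs via `print(..., flush=True)` to the **console** (standard output).
+
+To see the full pagination log:
+
+1. **Make sure the Python service is running**
+   ```bash
+   cd server/python
+   python app.py
+   ```
+
+2. **Watch the Python service's console output**
+   - Logs appear in real time in the terminal running the Python service
+   - Every `print` call is written immediately (because of `flush=True`)
+
+3. **Key log markers**
+   - `========== 开始自动分页获取作品 ==========`: pagination started
+   - `第 {iters} 次请求 (cursor={cursor})`: printed once per request
+   - `✅ 第 {page_count} 页获取成功`: a page was fetched successfully
+   - `📊 分页统计`: statistics printed after pagination completes
+
+### Node.js Logs
+
+Node.js logs are written to:
+- the **console** (development)
+- **log files** (production):
+  - `server/logs/combined.log` - all logs
+  - `server/logs/error.log` - error logs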
+
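+## Debugging Tips
+
+If the Python API returns 0 works, possible causes are:
+
+1. **Cookie format problem**
+   - Check that the cookies Node.js passes to Python are correctly formatted
+   - Python expects a JSON-formatted cookie array
+
+2. **Login session expired**
+   - Check whether the account's cookies have expired
+   - Look for a "Cookie 已过期" (cookie expired) error in the Python log
+
+3. **API call failed**
+   - Check the error messages in the Python log
+   - Check the network connection and the API response
+
+4. **Page load problem**
+   - Check that the note management page was reached
+   - Look for navigation timeout errors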
+
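+## How to View the Logs
+
+### Method 1: Watch the Python service console directly
+
+The terminal running the Python service shows all logs.
+
+### Method 2: Redirect logs to a file
+
+```bash
+python app.py > python.log 2>&1
+```
+
+Then inspect `python.log`.
+
+### Method 3: Use the test script
+
+While the test script runs, the Python service's logs still go to the terminal running the Python service:
+
+```bash
+# Terminal 1: run the Python service
+cd server/python
+python app.py
+
+# Terminal 2: run the test script
+cd server
+pnpm exec tsx src/scripts/test-xhs-works-sync.ts 35
+```
+
+The Python service's logs appear in terminal 1.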
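
Reviewer sketch (not part of the commit): the counter behaviour described in PAGING_LOGIC.md can be replayed without a browser. The snippet below is a minimal simulation under stated assumptions: `simulate_paging` and its canned `pages` list are hypothetical stand-ins for `fetch_notes_page`, and the cursor is treated as a plain page index. Only `iters`, `page_count`, the page size of 20 and the 800-request cap come from the document.

```python
from math import ceil

API_PAGE_SIZE = 20  # page size stated in PAGING_LOGIC.md
MAX_ITERS = 800     # request cap stated in PAGING_LOGIC.md


def total_pages(total: int, page_size: int = API_PAGE_SIZE) -> int:
    """Theoretical number of pages needed to hold `total` works."""
    return ceil(total / page_size)


def simulate_paging(pages: list[list[str]], total: int) -> dict:
    """Replay the paging loop over canned pages (hypothetical stand-in for the API)."""
    iters = 0
    page_count = 0
    works: list[str] = []
    seen: set[str] = set()
    cursor = 0  # assumption: the cursor is a simple page index
    while iters < MAX_ITERS:
        iters += 1  # every request counts, including an empty one
        notes = pages[cursor] if cursor < len(pages) else []
        if not notes:
            break  # empty page: stop paginating
        page_count += 1  # only pages that contain works are counted
        new_items = [n for n in notes if n not in seen]  # dedupe against earlier pages
        seen.update(new_items)
        works.extend(new_items)
        if total and len(works) >= total:
            break  # everything the API declared has been fetched
        if not new_items:
            break  # page contained nothing new
        cursor += 1
    return {"iters": iters, "page_count": page_count, "works": len(works)}
```

With 368 works split into pages of 20 and a declared total of 368, the loop stops after 19 requests and 19 pages. If the declared total is larger than what the API actually returns, one extra empty request is made, so `iters` ends one higher than `page_count`.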

BIN
server/python/platforms/__pycache__/base.cpython-311.pyc


BIN
server/python/platforms/__pycache__/douyin.cpython-311.pyc


BIN
server/python/platforms/__pycache__/xiaohongshu.cpython-311.pyc


+ 2 - 0
server/python/platforms/base.py

@@ -121,6 +121,7 @@ class WorksResult:
     has_more: bool = False
     next_page: Any = ""
     error: str = ""
+    debug_info: str = ""  # Debug information
     
     def to_dict(self) -> Dict[str, Any]:
         return {
@@ -131,6 +132,7 @@ class WorksResult:
             "has_more": self.has_more,
             "next_page": self.next_page,
             "error": self.error,
+            "debug_info": self.debug_info,
         }
 
 

+ 303 - 19
server/python/platforms/xiaohongshu.py

@@ -693,28 +693,96 @@ class XiaohongshuPublisher(BasePublisher):
             print(f"[{self.platform_name}] 当前页面: {current_url}", flush=True)
             if "login" in current_url:
                 raise Exception("Cookie 已过期,请重新登录")
+            
+            # Wait for the page to finish loading so the signing function is available
+            print(f"[{self.platform_name}] 等待页面完全加载和签名函数初始化...", flush=True)
+            await asyncio.sleep(3)
+            
+            # Check whether the signing function is available
+            sign_check_attempts = 0
+            max_sign_check_attempts = 10
+            while sign_check_attempts < max_sign_check_attempts:
+                sign_available = await self.page.evaluate("""() => {
+                    return typeof window !== 'undefined' && typeof window._webmsxyw === 'function';
+                }""")
+                if sign_available:
+                    print(f"[{self.platform_name}] ✓ 签名函数 _webmsxyw 已可用", flush=True)
+                    break
+                sign_check_attempts += 1
+                print(f"[{self.platform_name}] ⏳ 等待签名函数... ({sign_check_attempts}/{max_sign_check_attempts})", flush=True)
+                await asyncio.sleep(1)
+            
+            if sign_check_attempts >= max_sign_check_attempts:
+                print(f"[{self.platform_name}] ⚠️ 警告: 签名函数 _webmsxyw 在 {max_sign_check_attempts} 次检查后仍不可用", flush=True)
+                print(f"[{self.platform_name}] 继续尝试,但 API 调用可能会失败", flush=True)
 
             async def fetch_notes_page(p):
+                # Re-check the signing function (checked before every call)
+                sign_available = await self.page.evaluate("""() => {
+                    return typeof window !== 'undefined' && typeof window._webmsxyw === 'function';
+                }""")
+                
+                if not sign_available:
+                    print(f"[{self.platform_name}] ⚠️ 签名函数 _webmsxyw 不可用,等待...", flush=True)
+                    await asyncio.sleep(2)
+                
                 return await self.page.evaluate(
                     """async (pageNum) => {
                         try {
-                            const url = `https://edith.xiaohongshu.com/web_api/sns/v5/creator/note/user/posted?tab=0&page=${pageNum}`;
-                            const headers = { 'Accept': 'application/json' };
+                            // Use the correct API endpoint: /api/galaxy/v2/creator/note/user/posted
+                            const url = `/api/galaxy/v2/creator/note/user/posted?tab=0&page=${pageNum}`;
+                            const headers = {
+                                'Accept': 'application/json, text/plain, */*',
+                                'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6',
+                                'Referer': 'https://creator.xiaohongshu.com/new/note-manager',
+                                'Sec-Fetch-Dest': 'empty',
+                                'Sec-Fetch-Mode': 'cors',
+                                'Sec-Fetch-Site': 'same-origin'
+                            };
+                            
+                            // Try to obtain the signature
+                            let signResult = { hasSign: false, x_s: '', x_t: '', x_s_common: '', error: '' };
                             if (typeof window !== 'undefined' && typeof window._webmsxyw === 'function') {
                                 try {
                                     const sign = window._webmsxyw(url, '');
                                     headers['x-s'] = sign['X-s'];
                                     headers['x-t'] = String(sign['X-t']);
+                                    // Check whether x-s-common is present
+                                    if (sign['X-s-common']) {
+                                        headers['x-s-common'] = sign['X-s-common'];
+                                    }
+                                    signResult = {
+                                        hasSign: true,
+                                        x_s: sign['X-s'] ? sign['X-s'].substring(0, 50) + '...' : '',
+                                        x_t: String(sign['X-t']),
+                                        x_s_common: sign['X-s-common'] ? sign['X-s-common'].substring(0, 50) + '...' : '',
+                                        error: ''
+                                    };
+                                    console.log('签名生成成功:', signResult);
                                 } catch (e) {
-                                    // ignore sign errors and fallback
+                                    signResult.error = e.toString();
+                                    console.error('签名生成失败:', e);
                                 }
+                            } else {
+                                signResult.error = '_webmsxyw function not found';
+                                console.error('签名函数不存在');
                             }
+                            
                             const res = await fetch(url, {
                                 method: 'GET',
                                 credentials: 'include',
                                 headers
                             });
-                            return await res.json();
+                            
+                            const responseData = await res.json();
+                            return {
+                                ...responseData,
+                                _debug: {
+                                    signResult: signResult,
+                                    status: res.status,
+                                    statusText: res.statusText
+                                }
+                            };
                         } catch (e) {
                             return { success: false, error: e.toString() };
                         }
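
Reviewer sketch (not part of the commit): the readiness check added in this hunk polls for `window._webmsxyw` up to ten times, one second apart. The same pattern could be factored into a small helper. The version below is a hypothetical refactor: the name `wait_until` and its parameters are assumptions, and it takes any async predicate so it can be exercised without a browser.

```python
import asyncio
from typing import Awaitable, Callable


async def wait_until(
    predicate: Callable[[], Awaitable[bool]],
    attempts: int = 10,
    delay: float = 1.0,
) -> bool:
    """Poll `predicate` up to `attempts` times, sleeping `delay` seconds after each miss."""
    for _ in range(attempts):
        if await predicate():
            return True
        await asyncio.sleep(delay)
    return False  # caller decides whether to continue or abort
```

In the publisher, the predicate would wrap the same `self.page.evaluate(...)` expression the commit already uses, and a `False` result would map to the existing warning that API calls may fail.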
@@ -765,13 +833,30 @@ class XiaohongshuPublisher(BasePublisher):
             resp = None
             for attempt in range(1, 4):
                 resp = await fetch_notes_page(page)
+                
+                # Print debug information
+                if resp and isinstance(resp, dict) and resp.get('_debug'):
+                    debug_info = resp.get('_debug', {})
+                    sign_result = debug_info.get('signResult', {})
+                    print(f"[{self.platform_name}] 🔍 调试信息: 签名可用: {sign_result.get('hasSign', False)}, X-S: {sign_result.get('x_s', '')}, X-T: {sign_result.get('x_t', '')}, X-S-Common: {sign_result.get('x_s_common', '')}, 签名错误: {sign_result.get('error', '')}, HTTP 状态: {debug_info.get('status', 'N/A')}", flush=True)
+                    resp.pop('_debug', None)
+                
                 if resp and (resp.get('success') or resp.get('code') == 0) and resp.get('data'):
                     break
                 print(f"[{self.platform_name}] 拉取作品列表失败,重试 {attempt}/3: {str(resp)[:200]}", flush=True)
                 await asyncio.sleep(1.2 * attempt)
 
             if not resp or not (resp.get('success') or resp.get('code') == 0) or not resp.get('data'):
-                raise Exception(f"无法获取作品列表数据: {resp.get('msg') if isinstance(resp, dict) else resp}")
+                error_msg = resp.get('msg') if isinstance(resp, dict) else str(resp)
+                # Print detailed error information
+                if isinstance(resp, dict):
+                    if resp.get('msg'):
+                        print(f"[{self.platform_name}] 错误消息: {resp.get('msg')}", flush=True)
+                    if resp.get('message'):
+                        print(f"[{self.platform_name}] 错误消息: {resp.get('message')}", flush=True)
+                    if resp.get('error'):
+                        print(f"[{self.platform_name}] 错误: {resp.get('error')}", flush=True)
+                raise Exception(f"无法获取作品列表数据: {error_msg}")
 
             data = resp.get('data', {}) or {}
             notes = data.get('notes', []) or []
@@ -858,36 +943,112 @@ class XiaohongshuPublisher(BasePublisher):
 
             print(f"[{self.platform_name}] 访问笔记管理页面...", flush=True)
             try:
-                await self.page.goto("https://creator.xiaohongshu.com/new/note-manager", wait_until="domcontentloaded", timeout=30000)
+                await self.page.goto("https://creator.xiaohongshu.com/new/note-manager", wait_until="domcontentloaded", timeout=60000)
+                print(f"[{self.platform_name}] 页面加载成功", flush=True)
             except Exception as nav_error:
                 print(f"[{self.platform_name}] 导航超时,但继续尝试: {nav_error}", flush=True)
+                # Check the current page state even after a timeout
+                try:
+                    await asyncio.sleep(2)
+                    current_url = self.page.url
+                    print(f"[{self.platform_name}] 超时后当前页面: {current_url}", flush=True)
+                except Exception as e:
+                    print(f"[{self.platform_name}] 检查页面状态时出错: {e}", flush=True)
 
             current_url = self.page.url
             print(f"[{self.platform_name}] 当前页面: {current_url}", flush=True)
             if "login" in current_url:
                 raise Exception("Cookie 已过期,请重新登录")
+            
+            # Wait for the page to finish loading so the signing function is available
+            print(f"[{self.platform_name}] 等待页面完全加载和签名函数初始化...", flush=True)
+            await asyncio.sleep(3)
+            
+            # Check whether the signing function is available
+            sign_check_attempts = 0
+            max_sign_check_attempts = 10
+            while sign_check_attempts < max_sign_check_attempts:
+                sign_available = await self.page.evaluate("""() => {
+                    return typeof window !== 'undefined' && typeof window._webmsxyw === 'function';
+                }""")
+                if sign_available:
+                    print(f"[{self.platform_name}] ✓ 签名函数 _webmsxyw 已可用", flush=True)
+                    break
+                sign_check_attempts += 1
+                print(f"[{self.platform_name}] ⏳ 等待签名函数... ({sign_check_attempts}/{max_sign_check_attempts})", flush=True)
+                await asyncio.sleep(1)
+            
+            if sign_check_attempts >= max_sign_check_attempts:
+                print(f"[{self.platform_name}] ⚠️ 警告: 签名函数 _webmsxyw 在 {max_sign_check_attempts} 次检查后仍不可用", flush=True)
+                print(f"[{self.platform_name}] 继续尝试,但 API 调用可能会失败", flush=True)
 
             async def fetch_notes_page(p):
+                # Re-check the signing function (checked before every call)
+                sign_available = await self.page.evaluate("""() => {
+                    return typeof window !== 'undefined' && typeof window._webmsxyw === 'function';
+                }""")
+                
+                if not sign_available:
+                    print(f"[{self.platform_name}] ⚠️ 签名函数 _webmsxyw 不可用,等待...", flush=True)
+                    await asyncio.sleep(2)
+                
                 return await self.page.evaluate(
                     """async (pageNum) => {
                         try {
-                            const url = `https://edith.xiaohongshu.com/web_api/sns/v5/creator/note/user/posted?tab=0&page=${pageNum}`;
-                            const headers = { 'Accept': 'application/json' };
+                            // Use the correct API endpoint: /api/galaxy/v2/creator/note/user/posted
+                            const url = `/api/galaxy/v2/creator/note/user/posted?tab=0&page=${pageNum}`;
+                            const headers = {
+                                'Accept': 'application/json, text/plain, */*',
+                                'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6',
+                                'Referer': 'https://creator.xiaohongshu.com/new/note-manager',
+                                'Sec-Fetch-Dest': 'empty',
+                                'Sec-Fetch-Mode': 'cors',
+                                'Sec-Fetch-Site': 'same-origin'
+                            };
+                            
+                            // Try to obtain the signature
+                            let signResult = { hasSign: false, x_s: '', x_t: '', x_s_common: '', error: '' };
                             if (typeof window !== 'undefined' && typeof window._webmsxyw === 'function') {
                                 try {
                                     const sign = window._webmsxyw(url, '');
                                     headers['x-s'] = sign['X-s'];
                                     headers['x-t'] = String(sign['X-t']);
+                                    // Check whether x-s-common is present
+                                    if (sign['X-s-common']) {
+                                        headers['x-s-common'] = sign['X-s-common'];
+                                    }
+                                    signResult = {
+                                        hasSign: true,
+                                        x_s: sign['X-s'] ? sign['X-s'].substring(0, 50) + '...' : '',
+                                        x_t: String(sign['X-t']),
+                                        x_s_common: sign['X-s-common'] ? sign['X-s-common'].substring(0, 50) + '...' : '',
+                                        error: ''
+                                    };
+                                    console.log('签名生成成功:', signResult);
                                 } catch (e) {
-                                    // ignore sign errors and fallback
+                                    signResult.error = e.toString();
+                                    console.error('签名生成失败:', e);
                                 }
+                            } else {
+                                signResult.error = '_webmsxyw function not found';
+                                console.error('签名函数不存在');
                             }
+                            
                             const res = await fetch(url, {
                                 method: 'GET',
                                 credentials: 'include',
                                 headers
                             });
-                            return await res.json();
+                            
+                            const responseData = await res.json();
+                            return {
+                                ...responseData,
+                                _debug: {
+                                    signResult: signResult,
+                                    status: res.status,
+                                    statusText: res.statusText
+                                }
+                            };
                         } catch (e) {
                             return { success: false, error: e.toString() };
                         }
@@ -945,7 +1106,7 @@ class XiaohongshuPublisher(BasePublisher):
                 async def handle_response(response):
                     nonlocal captured_total
                     url = response.url
-                    if "edith.xiaohongshu.com" not in url or "creator/note/user/posted" not in url:
+                    if ("creator.xiaohongshu.com" not in url and "edith.xiaohongshu.com" not in url) or "creator/note/user/posted" not in url:
                         return
                     try:
                         json_data = await response.json()
@@ -1000,9 +1161,22 @@ class XiaohongshuPublisher(BasePublisher):
 
                 try:
                     try:
-                        await self.page.goto("https://creator.xiaohongshu.com/new/note-manager", wait_until="networkidle", timeout=60000)
+                        # Use a more lenient wait condition to avoid timeouts
+                        await self.page.goto("https://creator.xiaohongshu.com/new/note-manager", wait_until="domcontentloaded", timeout=90000)
+                        print(f"[{self.platform_name}] 页面加载成功", flush=True)
                     except Exception as nav_error:
                         print(f"[{self.platform_name}] 导航异常(继续):{nav_error}", flush=True)
+                        # Keep going even after a timeout; the page may have partially loaded
+                        try:
+                            await asyncio.sleep(3)
+                            current_url = self.page.url
+                            print(f"[{self.platform_name}] 超时后当前页面: {current_url}", flush=True)
+                            if "login" in current_url:
+                                raise Exception("Cookie 已过期,请重新登录")
+                        except Exception as e:
+                            if "Cookie" in str(e):
+                                raise
+                            print(f"[{self.platform_name}] 检查页面状态时出错: {e}", flush=True)
 
                     await asyncio.sleep(2.0)
 
@@ -1085,25 +1259,95 @@ class XiaohongshuPublisher(BasePublisher):
                     except Exception:
                         pass
 
+            # Add a request listener to capture request headers
+            captured_requests = []
+            async def handle_request(request):
+                url = request.url
+                if ("creator.xiaohongshu.com" in url or "edith.xiaohongshu.com" in url) and "creator/note/user/posted" in url:
+                    headers = request.headers
+                    captured_requests.append({
+                        "url": url,
+                        "method": request.method,
+                        "headers": dict(headers),
+                        "timestamp": asyncio.get_event_loop().time()
+                    })
+                    # Print the key headers
+                    x_s = headers.get('x-s', '')
+                    x_t = headers.get('x-t', '')
+                    x_s_common = headers.get('x-s-common', '')
+                    print(f"[{self.platform_name}] 📡 API 请求: {url}", flush=True)
+                    print(f"[{self.platform_name}]    Method: {request.method}", flush=True)
+                    print(f"[{self.platform_name}]    X-S: {x_s[:50] if x_s else '(none)'}...", flush=True)
+                    print(f"[{self.platform_name}]    X-T: {x_t}", flush=True)
+                    print(f"[{self.platform_name}]    X-S-Common: {x_s_common[:50] if x_s_common else '(none)'}...", flush=True)
+                    print(f"[{self.platform_name}]    Cookie: {headers.get('cookie', '')[:100]}...", flush=True)
+            
+            self.page.on("request", handle_request)
+            
             iters = 0
+            page_count = 0  # Number of pages actually fetched
+            print(f"[{self.platform_name}] ========== 开始自动分页获取作品 ==========", flush=True)
+            print(f"[{self.platform_name}] 最大迭代次数: {max_iters}, 每页大小: {api_page_size}", flush=True)
+            
             while iters < max_iters:
                 iters += 1
+                print(f"\n[{self.platform_name}] ---------- 第 {iters} 次请求 (cursor={cursor}) ----------", flush=True)
                 resp = await fetch_notes_page(cursor)
+                
+                # Print debug information
+                if resp and isinstance(resp, dict) and resp.get('_debug'):
+                    debug_info = resp.get('_debug', {})
+                    sign_result = debug_info.get('signResult', {})
+                    print(f"[{self.platform_name}] 🔍 调试信息:", flush=True)
+                    print(f"[{self.platform_name}]    签名可用: {sign_result.get('hasSign', False)}", flush=True)
+                    if sign_result.get('x_s'):
+                        print(f"[{self.platform_name}]    X-S: {sign_result.get('x_s', '')}", flush=True)
+                    if sign_result.get('x_t'):
+                        print(f"[{self.platform_name}]    X-T: {sign_result.get('x_t', '')}", flush=True)
+                    if sign_result.get('error'):
+                        print(f"[{self.platform_name}]    签名错误: {sign_result.get('error', '')}", flush=True)
+                    print(f"[{self.platform_name}]    HTTP 状态: {debug_info.get('status', 'N/A')} {debug_info.get('statusText', '')}", flush=True)
+                    # Remove the debug info so it does not affect later processing
+                    resp.pop('_debug', None)
+                
                 if not resp or not isinstance(resp, dict):
-                    print(f"[{self.platform_name}] 第 {iters} 次拉取无响应,cursor={cursor}", flush=True)
+                    print(f"[{self.platform_name}] ❌ 第 {iters} 次拉取无响应,cursor={cursor}", flush=True)
+                    print(f"[{self.platform_name}] 响应类型: {type(resp)}, 响应内容: {str(resp)[:500]}", flush=True)
                     break
                 if not (resp.get('success') or resp.get('code') == 0) or not resp.get('data'):
-                    print(f"[{self.platform_name}] 拉取失败 cursor={cursor}: {str(resp)[:200]}", flush=True)
+                    error_msg = str(resp)[:500]
+                    print(f"[{self.platform_name}] ❌ 拉取失败 cursor={cursor}", flush=True)
+                    print(f"[{self.platform_name}] 响应详情: {error_msg}", flush=True)
+                    print(f"[{self.platform_name}] success={resp.get('success')}, code={resp.get('code')}, has_data={bool(resp.get('data'))}", flush=True)
+                    # Print detailed error information
+                    if resp.get('msg'):
+                        print(f"[{self.platform_name}] 错误消息: {resp.get('msg')}", flush=True)
+                    if resp.get('message'):
+                        print(f"[{self.platform_name}] 错误消息: {resp.get('message')}", flush=True)
+                    if resp.get('error'):
+                        print(f"[{self.platform_name}] 错误: {resp.get('error')}", flush=True)
+                    # Print debug information
+                    if resp.get('_debug'):
+                        debug_info = resp.get('_debug', {})
+                        print(f"[{self.platform_name}] HTTP 状态: {debug_info.get('status', 'N/A')} {debug_info.get('statusText', '')}", flush=True)
+                        sign_result = debug_info.get('signResult', {})
+                        if sign_result.get('error'):
+                            print(f"[{self.platform_name}] 签名错误: {sign_result.get('error')}", flush=True)
                     if iters == 1:
+                        print(f"[{self.platform_name}] 第一次请求失败,切换到滚动模式", flush=True)
                         return await collect_by_scrolling()
                     break
 
                 data = resp.get('data', {}) or {}
                 notes = data.get('notes', []) or []
                 if not notes:
-                    print(f"[{self.platform_name}] cursor={cursor} 无作品,停止", flush=True)
+                    print(f"[{self.platform_name}] ⚠️ cursor={cursor} 无作品,停止分页", flush=True)
                     break
 
+                # Count the page
+                page_count += 1
+                print(f"[{self.platform_name}] ✅ 第 {page_count} 页获取成功,本页作品数: {len(notes)}", flush=True)
+
                 tags = data.get('tags', []) or []
                 if tags:
                     preferred = 0
@@ -1113,13 +1357,19 @@ class XiaohongshuPublisher(BasePublisher):
                             break
                     if preferred:
                         total = max(total, int(preferred))
+                        print(f"[{self.platform_name}] 📊 从 tags 获取总数: {total} (preferred)", flush=True)
                     else:
-                        total = max(total, max([int(t.get('notes_count', 0) or t.get('notesCount', 0) or t.get('count', 0) or 0) for t in tags] + [0]))
+                        tag_total = max([int(t.get('notes_count', 0) or t.get('notesCount', 0) or t.get('count', 0) or 0) for t in tags] + [0])
+                        total = max(total, tag_total)
+                        if tag_total > 0:
+                            print(f"[{self.platform_name}] 📊 从 tags 获取总数: {total}", flush=True)
                 if not total:
                     t2 = int(data.get('total', 0) or data.get('total_count', 0) or data.get('totalCount', 0) or 0)
                     if not t2 and isinstance(data.get('page', {}), dict):
                         t2 = int(data.get('page', {}).get('total', 0) or data.get('page', {}).get('totalCount', 0) or 0)
                     total = max(total, t2)
+                    if t2 > 0:
+                        print(f"[{self.platform_name}] 📊 从 data.total 获取总数: {total}", flush=True)
 
                 parsed = parse_notes(notes)
                 new_items = []
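
Reviewer sketch (not part of the commit): the hunk above resolves the declared total from `tags` first and falls back to the `data.total*` fields. The function below restates that precedence as a pure function so it can be tested in isolation. The name `resolve_declared_total` is hypothetical, and identifying the preferred tag through an `id` field is an assumption, because the lines that select it are not shown in this diff.

```python
from typing import Any


def _count(tag: dict[str, Any]) -> int:
    """Read a tag's work count from whichever key is present."""
    return int(tag.get("notes_count", 0) or tag.get("notesCount", 0) or tag.get("count", 0) or 0)


def resolve_declared_total(data: dict[str, Any], total: int = 0) -> int:
    """Return the best available declared total for one API response."""
    tags = data.get("tags", []) or []
    if tags:
        # Assumption: the preferred tag is recognised by an `id` field.
        preferred = next(
            (_count(t) for t in tags if t.get("id") == "special.note_time_desc"), 0
        )
        # Prefer the special tag; otherwise use the largest count among all tags.
        total = max(total, preferred or max([_count(t) for t in tags] + [0]))
    if not total:
        t2 = int(data.get("total", 0) or data.get("total_count", 0) or data.get("totalCount", 0) or 0)
        page = data.get("page", {})
        # `page` may be a plain cursor value rather than a dict, hence the type check.
        if not t2 and isinstance(page, dict):
            t2 = int(page.get("total", 0) or page.get("totalCount", 0) or 0)
        total = max(total, t2)
    return total
```

Because `total` is only ever raised with `max`, a later page that omits the count does not reset a value learned from an earlier page.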
@@ -1129,14 +1379,17 @@ class XiaohongshuPublisher(BasePublisher):
                         new_items.append(w)
                 works.extend(new_items)
 
-                print(f"[{self.platform_name}] cursor={cursor} got={len(notes)}, new={len(new_items)}, total_now={len(works)}, declared_total={total}", flush=True)
+                print(f"[{self.platform_name}] 📈 累计统计: 本页新作品={len(new_items)}, 累计作品数={len(works)}, 声明总数={total}", flush=True)
 
                 if total and len(works) >= total:
+                    print(f"[{self.platform_name}] ✅ 已获取全部作品 (累计={len(works)} >= 总数={total}),停止分页", flush=True)
                     break
                 if len(new_items) == 0:
+                    print(f"[{self.platform_name}] ⚠️ 本页无新作品,停止分页", flush=True)
                     break
 
                 next_page = data.get('page', "")
+                old_cursor = cursor
                 if next_page == cursor:
                     next_page = ""
                 if next_page == -1 or str(next_page) == "-1":
@@ -1146,26 +1399,57 @@ class XiaohongshuPublisher(BasePublisher):
                         cursor = cursor + 1
                     else:
                         cursor = len(works) // api_page_size
+                    print(f"[{self.platform_name}] 🔄 下一页 cursor: {old_cursor} -> {cursor} (自动递增)", flush=True)
                 else:
                     cursor = next_page
+                    print(f"[{self.platform_name}] 🔄 下一页 cursor: {old_cursor} -> {cursor} (API返回)", flush=True)
 
                 await asyncio.sleep(0.5)
+            
+            # Remove the request listener
+            try:
+                self.page.remove_listener("request", handle_request)
+            except Exception:
+                pass
+            
+            print(f"\n[{self.platform_name}] ========== 分页完成 ==========", flush=True)
+            print(f"[{self.platform_name}] 📊 分页统计: 总请求次数={iters}, 成功获取页数={page_count}, 累计作品数={len(works)}, 声明总数={total}", flush=True)
+            if captured_requests:
+                print(f"[{self.platform_name}] 📡 捕获到 {len(captured_requests)} 个 API 请求", flush=True)
+                for i, req in enumerate(captured_requests[:3], 1):  # Show only the first 3
+                    print(f"[{self.platform_name}]    请求 {i}: {req['method']} {req['url']}", flush=True)
+                    if 'x-s' in req['headers']:
+                        print(f"[{self.platform_name}]      X-S: {req['headers']['x-s'][:50]}...", flush=True)
+                    if 'x-t' in req['headers']:
+                        print(f"[{self.platform_name}]      X-T: {req['headers']['x-t']}", flush=True)
+            print(f"[{self.platform_name}] ========================================\n", flush=True)
 
         except Exception as e:
             import traceback
+            error_trace = traceback.format_exc()
             print(f"[{self.platform_name}] 发生异常: {e}", flush=True)
             traceback.print_exc()
-            return WorksResult(success=False, platform=self.platform_name, error=str(e))
+            return WorksResult(
+                success=False, 
+                platform=self.platform_name, 
+                error=str(e),
+                debug_info=f"异常详情: {error_trace[:500]}"
+            )
         finally:
             await self.close_browser()
 
+        debug_info = f"总请求次数={iters}, 成功获取页数={page_count}, 累计作品数={len(works)}, 声明总数={total}"
+        if len(works) == 0:
+            debug_info += " | 警告: 没有获取到任何作品,可能原因: Cookie失效、API调用失败、或账号无作品"
+        
         return WorksResult(
             success=True,
             platform=self.platform_name,
             works=works,
             total=total or len(works),
             has_more=False,
-            next_page=-1
+            next_page=-1,
+            debug_info=debug_info
         )
     
     async def get_comments(self, cookies: str, work_id: str, cursor: str = "") -> CommentsResult:

+ 34 - 0
server/python/test_xhs_paging.py

@@ -0,0 +1,34 @@
+#!/usr/bin/env python3
+"""
+Test Xiaohongshu works auto-pagination by calling the Python API directly
+"""
+import asyncio
+import sys
+import json
+from platforms.xiaohongshu import XiaohongshuPublisher
+
+async def main():
+    if len(sys.argv) < 2:
+        print("用法: python test_xhs_paging.py <账号ID>")
+        print("示例: python test_xhs_paging.py 35")
+        sys.exit(1)
+    
+    account_id = sys.argv[1]
+    
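+    # Read the account information from the Node.js database
+    # Here the cookie must be supplied manually or read from the database
+    # To keep things simple, get the cookie from Node.js first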
+    print(f"测试账号 ID: {account_id}")
+    print("=" * 60)
+    print("注意: 此脚本需要从数据库读取账号的 Cookie")
+    print("建议使用 Node.js 脚本 test-xhs-works-sync.ts 来测试")
+    print("=" * 60)
+    
+    # 实际上,我们应该从 Node.js 获取 cookie
+    # 但为了测试,我们可以直接调用 get_all_works
+    # 这里需要手动提供 cookie JSON 字符串
+    print("\n请使用 Node.js 脚本 test-xhs-works-sync.ts 来测试")
+    print("它会自动调用 Python API 并显示详细日志")
+
+if __name__ == "__main__":
+    asyncio.run(main())

+ 2 - 2
server/src/models/entities/Work.ts

@@ -56,11 +56,11 @@ export class Work {
   @Column({ name: 'collect_count', type: 'int', default: 0 })
   collectCount!: number;
 
-  @Column({ type: 'timestamp', name: 'created_at', default: () => 'CURRENT_TIMESTAMP' })
+  @Column({ type: 'datetime', name: 'created_at', default: () => 'CURRENT_TIMESTAMP' })
   createdAt!: Date;
 
   @Column({
-    type: 'timestamp',
+    type: 'datetime',
     name: 'updated_at',
     default: () => 'CURRENT_TIMESTAMP',
     onUpdate: 'CURRENT_TIMESTAMP',

+ 2 - 2
server/src/models/entities/WorkDayStatistics.ts

@@ -27,11 +27,11 @@ export class WorkDayStatistics {
   @Column({ name: 'collect_count', type: 'int', default: 0, comment: '收藏数' })
   collectCount!: number;
 
-  @Column({ type: 'timestamp', name: 'created_at', default: () => 'CURRENT_TIMESTAMP' })
+  @Column({ type: 'datetime', name: 'created_at', default: () => 'CURRENT_TIMESTAMP' })
   createdAt!: Date;
 
   @Column({
-    type: 'timestamp',
+    type: 'datetime',
     name: 'updated_at',
     default: () => 'CURRENT_TIMESTAMP',
     onUpdate: 'CURRENT_TIMESTAMP',

+ 74 - 0
server/src/scripts/check-db-timestamp.ts

@@ -0,0 +1,74 @@
+#!/usr/bin/env tsx
+/**
+ * 检查数据库中实际存储的时间格式
+ */
+import { initDatabase } from '../models/index.js';
+import { AppDataSource } from '../models/index.js';
+import { logger } from '../utils/logger.js';
+
+async function checkDbTimestamp() {
+  try {
+    await initDatabase();
+    logger.info('数据库连接已初始化');
+
+    // 检查 works 表
+    logger.info('\n检查 works 表的时间格式:');
+    const worksResult = await AppDataSource.query(`
+      SELECT 
+        id, 
+        title,
+        created_at,
+        updated_at,
+        DATE_FORMAT(created_at, '%Y-%m-%d %H:%i:%s') as created_at_formatted,
+        DATE_FORMAT(updated_at, '%Y-%m-%d %H:%i:%s') as updated_at_formatted
+      FROM works 
+      ORDER BY id DESC 
+      LIMIT 3
+    `);
+    
+    worksResult.forEach((row: any) => {
+      logger.info(`\n作品 ID: ${row.id}`);
+      logger.info(`标题: ${row.title}`);
+      logger.info(`created_at (原始): ${row.created_at}`);
+      logger.info(`created_at (格式化): ${row.created_at_formatted}`);
+      logger.info(`updated_at (原始): ${row.updated_at}`);
+      logger.info(`updated_at (格式化): ${row.updated_at_formatted}`);
+    });
+
+    // 检查 work_day_statistics 表
+    logger.info('\n检查 work_day_statistics 表的时间格式:');
+    const statsResult = await AppDataSource.query(`
+      SELECT 
+        id,
+        created_at,
+        updated_at,
+        DATE_FORMAT(created_at, '%Y-%m-%d %H:%i:%s') as created_at_formatted,
+        DATE_FORMAT(updated_at, '%Y-%m-%d %H:%i:%s') as updated_at_formatted
+      FROM work_day_statistics 
+      ORDER BY id DESC 
+      LIMIT 3
+    `);
+    
+    statsResult.forEach((row: any) => {
+      logger.info(`\n统计 ID: ${row.id}`);
+      logger.info(`created_at (原始): ${row.created_at}`);
+      logger.info(`created_at (格式化): ${row.created_at_formatted}`);
+      logger.info(`updated_at (原始): ${row.updated_at}`);
+      logger.info(`updated_at (格式化): ${row.updated_at_formatted}`);
+    });
+
+    logger.info('\n✅ 检查完成!');
+
+  } catch (error: any) {
+    logger.error('检查失败:', error);
+    if (error instanceof Error) {
+      logger.error('错误堆栈:', error.stack);
+    }
+    process.exit(1);
+  } finally {
+    await AppDataSource.destroy();
+    process.exit(0);
+  }
+}
+
+checkDbTimestamp().catch(console.error);

+ 227 - 0
server/src/scripts/check-xhs-cookie.ts

@@ -0,0 +1,227 @@
+#!/usr/bin/env tsx
+/**
+ * 检查小红书账号 Cookie 有效性
+ */
+import { initDatabase } from '../models/index.js';
+import { PlatformAccount } from '../models/entities/PlatformAccount.js';
+import { AppDataSource } from '../models/index.js';
+import { CookieManager } from '../automation/cookie.js';
+import { logger } from '../utils/logger.js';
+import { chromium } from 'playwright';
+
+async function main() {
+  logger.info('========================================');
+  logger.info('检查小红书账号 Cookie 有效性');
+  logger.info('========================================');
+
+  await initDatabase();
+  logger.info('数据库连接已初始化');
+
+  const accountId = parseInt(process.argv[2] || '35');
+  
+  const accountRepository = AppDataSource.getRepository(PlatformAccount);
+  const account = await accountRepository.findOne({
+    where: {
+      platform: 'xiaohongshu',
+      id: accountId,
+    },
+  });
+
+  if (!account) {
+    logger.error(`未找到账号 ID: ${accountId}`);
+    process.exit(1);
+  }
+
+  logger.info(`找到账号: ID=${account.id}, 名称=${account.accountName}`);
+
+  if (!account.cookieData) {
+    logger.error('账号没有 Cookie 数据');
+    process.exit(1);
+  }
+
+  // 解密 Cookie
+  let decryptedCookies: string;
+  try {
+    decryptedCookies = CookieManager.decrypt(account.cookieData);
+    logger.info('✓ Cookie 解密成功');
+  } catch {
+    decryptedCookies = account.cookieData;
+    logger.info('✓ 使用原始 Cookie 数据');
+  }
+
+  // 解析 Cookie
+  let cookieList: { name: string; value: string; domain: string; path: string }[];
+  try {
+    cookieList = JSON.parse(decryptedCookies);
+    logger.info(`✓ 解析到 ${cookieList.length} 个 Cookie (JSON 格式)`);
+  } catch {
+    // Plain "name=value; name=value" string format
+    cookieList = decryptedCookies.split(';').map(item => {
+      const [name, ...rest] = item.trim().split('='); // keep any '=' inside the value (e.g. base64 padding)
+      return { name: name.trim(), value: rest.join('=').trim(), domain: '.xiaohongshu.com', path: '/' };
+    }).filter(c => c.name && c.value);
+    logger.info(`✓ 解析到 ${cookieList.length} 个 Cookie (字符串格式)`);
+  }
+
+  if (cookieList.length === 0) {
+    logger.error('❌ Cookie 列表为空');
+    process.exit(1);
+  }
+
+  // 检查关键 Cookie
+  const keyCookies = ['web_session', 'a1', 'webId', 'websectiga'];
+  const foundKeyCookies = cookieList.filter(c => keyCookies.includes(c.name));
+  logger.info(`关键 Cookie: ${foundKeyCookies.map(c => c.name).join(', ') || '未找到'}`);
+
+  // 使用 Playwright 检查 Cookie 有效性
+  logger.info('\n========================================');
+  logger.info('使用浏览器检查 Cookie 有效性...');
+  logger.info('========================================\n');
+
+  const browser = await chromium.launch({ headless: true });
+  const context = await browser.newContext({
+    userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
+  });
+
+  try {
+    const page = await context.newPage();
+
+    // 设置 Cookie
+    await context.addCookies(cookieList.map(c => ({
+      name: c.name,
+      value: c.value,
+      domain: c.domain || '.xiaohongshu.com',
+      path: c.path || '/',
+    })));
+
+    logger.info('✓ Cookie 已设置到浏览器');
+
+    // 访问创作者中心首页
+    logger.info('正在访问创作者中心首页...');
+    try {
+      await page.goto('https://creator.xiaohongshu.com/', {
+        waitUntil: 'domcontentloaded',
+        timeout: 30000,
+      });
+      logger.info('✓ 页面加载成功');
+    } catch (error) {
+      logger.warn(`⚠️ 页面加载超时: ${error}`);
+    }
+
+    // Check the current URL (page.url() is synchronous, so no await is needed)
+    const currentUrl = page.url();
+    logger.info(`当前页面 URL: ${currentUrl}`);
+
+    if (currentUrl.includes('login') || currentUrl.includes('passport')) {
+      logger.error('❌ Cookie 已失效!页面跳转到了登录页');
+      logger.error(`登录页 URL: ${currentUrl}`);
+      process.exit(1);
+    }
+
+    // 检查页面标题
+    const title = await page.title();
+    logger.info(`页面标题: ${title}`);
+
+    // 尝试访问笔记管理页面
+    logger.info('\n正在访问笔记管理页面...');
+    try {
+      await page.goto('https://creator.xiaohongshu.com/new/note-manager', {
+        waitUntil: 'domcontentloaded',
+        timeout: 30000,
+      });
+      
+      await page.waitForTimeout(1000); // wait for the page to settle
+      const noteManagerUrl = page.url();
+      logger.info(`Note manager page URL: ${noteManagerUrl}`);
+
+      if (noteManagerUrl.includes('login') || noteManagerUrl.includes('passport')) {
+        logger.error('❌ Cookie 已失效!笔记管理页面跳转到了登录页');
+        process.exit(1);
+      }
+
+      logger.info('✓ 笔记管理页面访问成功');
+
+      // 尝试调用 API 获取作品列表
+      logger.info('\n正在测试 API 调用...');
+      const apiResult = await page.evaluate(async () => {
+        try {
+          const url = 'https://edith.xiaohongshu.com/web_api/sns/v5/creator/note/user/posted?tab=0&page=0';
+          const headers: Record<string, string> = { 'Accept': 'application/json' };
+          
+          // 尝试获取签名
+          if (typeof window !== 'undefined' && typeof (window as any)._webmsxyw === 'function') {
+            try {
+              const sign = (window as any)._webmsxyw(url, '');
+              headers['x-s'] = sign['X-s'];
+              headers['x-t'] = String(sign['X-t']);
+            } catch (e) {
+              // 忽略签名错误
+            }
+          }
+
+          const res = await fetch(url, {
+            method: 'GET',
+            credentials: 'include',
+            headers,
+          });
+          
+          const data = await res.json();
+          return {
+            success: res.ok,
+            status: res.status,
+            hasData: !!data.data,
+            notesCount: data.data?.notes?.length || 0,
+            code: data.code,
+            message: data.msg || data.message || '',
+          };
+        } catch (e) {
+          return {
+            success: false,
+            error: String(e),
+          };
+        }
+      });
+
+      logger.info(`API 调用结果:`);
+      logger.info(`  成功: ${apiResult.success}`);
+      logger.info(`  状态码: ${apiResult.status || 'N/A'}`);
+      logger.info(`  有数据: ${apiResult.hasData || false}`);
+      logger.info(`  作品数: ${apiResult.notesCount || 0}`);
+      logger.info(`  Code: ${apiResult.code ?? 'N/A'}`);
+      if (apiResult.message) {
+        logger.info(`  消息: ${apiResult.message}`);
+      }
+      if (apiResult.error) {
+        logger.error(`  错误: ${apiResult.error}`);
+      }
+
+      if (apiResult.success && apiResult.hasData) {
+        logger.info('\n✅ Cookie 有效!可以正常获取作品数据');
+      } else if (apiResult.success && !apiResult.hasData) {
+        logger.warn('\n⚠️ Cookie 可能有效,但 API 返回空数据');
+        logger.warn('可能原因: 账号没有作品,或 API 调用失败');
+      } else {
+        logger.error('\n❌ Cookie 可能失效,API 调用失败');
+      }
+
+    } catch (error) {
+      logger.error(`访问笔记管理页面失败: ${error}`);
+      if (error instanceof Error) {
+        logger.error(`错误堆栈: ${error.stack}`);
+      }
+    }
+
+    await browser.close();
+  } catch (error) {
+    logger.error(`检查 Cookie 时发生错误: ${error}`);
+    if (error instanceof Error) {
+      logger.error(`错误堆栈: ${error.stack}`);
+    }
+    await browser.close();
+    process.exit(1);
+  }
+
+  process.exit(0);
+}
+
+main();
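
The string-format fallback in this script splits each cookie on `=`, which drops everything after a second `=` in the value. A minimal sketch of a parser that splits on the first `=` only is shown below. `parseCookieString` and `ParsedCookie` are hypothetical names used for illustration and are not part of this commit; the default domain mirrors the one used in the script.

```typescript
// Hypothetical helper, not part of the commit.
// Parses a "k1=v1; k2=v2" cookie string into Playwright-style cookie objects.
interface ParsedCookie {
  name: string;
  value: string;
  domain: string;
  path: string;
}

function parseCookieString(raw: string, domain = '.xiaohongshu.com'): ParsedCookie[] {
  const cookies: ParsedCookie[] = [];
  for (const item of raw.split(';')) {
    const trimmed = item.trim();
    const eq = trimmed.indexOf('=');
    if (eq <= 0) continue; // skip empty segments and entries without a name
    const name = trimmed.slice(0, eq).trim();
    const value = trimmed.slice(eq + 1).trim(); // everything after the FIRST '='
    if (!value) continue; // skip cookies with an empty value
    cookies.push({ name, value, domain, path: '/' });
  }
  return cookies;
}
```

Splitting on the first `=` keeps values such as base64 strings with `=` padding intact, which matters when the cookie is later replayed in a browser context.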

+ 109 - 0
server/src/scripts/fix-timestamp-format.ts

@@ -0,0 +1,109 @@
+#!/usr/bin/env tsx
+/**
+ * 修复 works 和 work_day_statistics 表的 created_at 和 updated_at 字段格式
+ * 将 timestamp 类型改为 datetime 类型,并修复历史数据
+ * 
+ * 运行: cd server && pnpm exec tsx src/scripts/fix-timestamp-format.ts
+ */
+import { initDatabase } from '../models/index.js';
+import { AppDataSource } from '../models/index.js';
+import { logger } from '../utils/logger.js';
+
+async function fixTimestampFormat() {
+  try {
+    await initDatabase();
+    logger.info('数据库连接已初始化');
+
+    const queryRunner = AppDataSource.createQueryRunner();
+    await queryRunner.connect();
+
+    try {
+      logger.info('\n========================================');
+      logger.info('开始修复时间字段格式...');
+      logger.info('========================================\n');
+
+      // 1. 修改 works 表的字段类型
+      logger.info('1. 修改 works 表的 created_at 和 updated_at 字段类型...');
+      await queryRunner.query(`
+        ALTER TABLE works 
+        MODIFY COLUMN created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
+        MODIFY COLUMN updated_at DATETIME DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
+      `);
+      logger.info('   ✓ works 表字段类型修改完成');
+
+      // 2. 修改 work_day_statistics 表的字段类型
+      logger.info('2. 修改 work_day_statistics 表的 created_at 和 updated_at 字段类型...');
+      await queryRunner.query(`
+        ALTER TABLE work_day_statistics 
+        MODIFY COLUMN created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
+        MODIFY COLUMN updated_at DATETIME DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
+      `);
+      logger.info('   ✓ work_day_statistics 表字段类型修改完成');
+
+      // 3-4. Fix historical data by shifting it to +08:00 (not idempotent: re-running shifts the values again)
+      logger.info('\n3. 修复 works 表的历史数据...');
+      const worksResult = await queryRunner.query(`
+        UPDATE works 
+        SET created_at = CONVERT_TZ(created_at, @@session.time_zone, '+08:00'),
+            updated_at = CONVERT_TZ(updated_at, @@session.time_zone, '+08:00')
+        WHERE created_at IS NOT NULL OR updated_at IS NOT NULL
+      `);
+      logger.info(`   ✓ works 表已更新 ${worksResult.affectedRows || 0} 条记录`);
+
+      logger.info('\n4. 修复 work_day_statistics 表的历史数据...');
+      const statsResult = await queryRunner.query(`
+        UPDATE work_day_statistics 
+        SET created_at = CONVERT_TZ(created_at, @@session.time_zone, '+08:00'),
+            updated_at = CONVERT_TZ(updated_at, @@session.time_zone, '+08:00')
+        WHERE created_at IS NOT NULL OR updated_at IS NOT NULL
+      `);
+      logger.info(`   ✓ work_day_statistics 表已更新 ${statsResult.affectedRows || 0} 条记录`);
+
+      // 5. Verify the result
+      logger.info('\n5. 验证修复结果...');
+      const sampleWorks = await queryRunner.query(`
+        SELECT id, created_at, updated_at 
+        FROM works 
+        ORDER BY id DESC 
+        LIMIT 5
+      `);
+      logger.info('   works 表示例数据:');
+      sampleWorks.forEach((row: any) => {
+        logger.info(`     ID ${row.id}: created_at=${row.created_at}, updated_at=${row.updated_at}`);
+      });
+
+      const sampleStats = await queryRunner.query(`
+        SELECT id, created_at, updated_at 
+        FROM work_day_statistics 
+        ORDER BY id DESC 
+        LIMIT 5
+      `);
+      logger.info('   work_day_statistics 表示例数据:');
+      sampleStats.forEach((row: any) => {
+        logger.info(`     ID ${row.id}: created_at=${row.created_at}, updated_at=${row.updated_at}`);
+      });
+
+      logger.info('\n========================================');
+      logger.info('时间字段格式修复完成!');
+      logger.info('========================================\n');
+
+    } catch (error: any) {
+      logger.error('修复过程中出错:', error);
+      throw error;
+    } finally {
+      await queryRunner.release();
+    }
+
+  } catch (error: any) {
+    logger.error('修复失败:', error);
+    if (error instanceof Error) {
+      logger.error('错误堆栈:', error.stack);
+    }
+    process.exit(1);
+  } finally {
+    await AppDataSource.destroy();
+    process.exit(0);
+  }
+}
+
+fixTimestampFormat().catch(console.error);
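
Because the `UPDATE ... CONVERT_TZ(...)` statements shift the stored values on every run, running this script twice would appear to shift the data twice whenever the session time zone is not `+08:00`. One way to guard against that, sketched below, is to check the column types first and skip the data fix when the columns are already `DATETIME`. `columnsNeedingMigration` and `ColumnInfo` are hypothetical names; the `Field` and `Type` properties are assumed to match the rows returned by `SHOW COLUMNS`, as read in verify-timestamp-format.ts.

```typescript
// Only the SHOW COLUMNS fields used here; assumed shape, not part of the commit.
interface ColumnInfo {
  Field: string;
  Type: string;
}

// Hypothetical guard: returns the names of columns that are still TIMESTAMP
// and therefore have not been migrated yet.
function columnsNeedingMigration(columns: ColumnInfo[]): string[] {
  return columns
    .filter(col => col.Type.toLowerCase().startsWith('timestamp'))
    .map(col => col.Field);
}
```

A caller could run the `ALTER TABLE` and `UPDATE` steps only when this list is non-empty, and log that the table was already migrated otherwise.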

+ 133 - 0
server/src/scripts/test-all-xhs-accounts.ts

@@ -0,0 +1,133 @@
+import { initDatabase } from '../models/index.js';
+import { PlatformAccount } from '../models/entities/PlatformAccount.js';
+import { AppDataSource } from '../models/index.js';
+import { WorkService } from '../services/WorkService.js';
+import { CookieManager } from '../automation/cookie.js';
+import { logger } from '../utils/logger.js';
+
+async function testAllXhsAccounts() {
+  try {
+    await initDatabase();
+    logger.info('数据库连接已初始化');
+
+    // 获取所有小红书账号
+    const accountRepository = AppDataSource.getRepository(PlatformAccount);
+    const accounts = await accountRepository.find({
+      where: { platform: 'xiaohongshu' },
+      order: { id: 'ASC' }
+    });
+
+    if (accounts.length === 0) {
+      logger.info('没有找到小红书账号');
+      return;
+    }
+
+    logger.info(`\n${'='.repeat(60)}`);
+    logger.info(`找到 ${accounts.length} 个小红书账号,开始批量测试...`);
+    logger.info(`${'='.repeat(60)}\n`);
+
+    const results: Array<{
+      id: number;
+      name: string;
+      success: boolean;
+      worksCount: number;
+      error?: string;
+    }> = [];
+
+    const workService = new WorkService();
+
+    for (let i = 0; i < accounts.length; i++) {
+      const account = accounts[i];
+      logger.info(`\n[${i + 1}/${accounts.length}] Testing account: ID=${account.id}, name=${account.accountName}`);
+      logger.info(`${'-'.repeat(60)}`);
+
+      try {
+        if (!account.cookieData) {
+          logger.error('Account has no cookie data');
+          results.push({
+            id: account.id,
+            name: account.accountName || '(not set)',
+            success: false,
+            worksCount: 0,
+            error: 'Account has no cookie data'
+          });
+          continue;
+        }
+
+        // Run the sync
+        const startTime = Date.now();
+        await workService.syncAccountWorks(account.userId, account);
+        const duration = ((Date.now() - startTime) / 1000).toFixed(2);
+
+        // Count the works stored for this account after the sync
+        const { Work } = await import('../models/entities/Work.js');
+        const workRepository = AppDataSource.getRepository(Work);
+        const worksCount = await workRepository.count({
+          where: {
+            accountId: account.id,
+          },
+        });
+
+        logger.info(`✅ Success: got ${worksCount} works (took ${duration}s)`);
+        results.push({
+          id: account.id,
+          name: account.accountName || '(not set)',
+          success: true,
+          worksCount: worksCount
+        });
+      } catch (error: any) {
+        logger.error(`❌ Error: ${error.message || error}`);
+        results.push({
+          id: account.id,
+          name: account.accountName || '(not set)',
+          success: false,
+          worksCount: 0,
+          error: error.message || String(error)
+        });
+      }
+
+      // 每个账号之间稍作延迟,避免请求过快
+      if (i < accounts.length - 1) {
+        await new Promise(resolve => setTimeout(resolve, 2000));
+      }
+    }
+
+    // 打印汇总结果
+    logger.info(`\n${'='.repeat(60)}`);
+    logger.info('测试结果汇总');
+    logger.info(`${'='.repeat(60)}`);
+
+    const successCount = results.filter(r => r.success).length;
+    const failCount = results.filter(r => !r.success).length;
+    const totalWorks = results.reduce((sum, r) => sum + r.worksCount, 0);
+
+    logger.info(`总账号数: ${accounts.length}`);
+    logger.info(`成功: ${successCount} 个`);
+    logger.info(`失败: ${failCount} 个`);
+    logger.info(`总作品数: ${totalWorks} 个`);
+    logger.info(`\n详细结果:`);
+
+    results.forEach((result, index) => {
+      const status = result.success ? '✅' : '❌';
+      logger.info(`${index + 1}. ${status} [ID: ${result.id}] ${result.name}: ${result.success ? `${result.worksCount} 个作品` : result.error}`);
+    });
+
+    if (failCount > 0) {
+      logger.info(`\n失败的账号详情:`);
+      results.filter(r => !r.success).forEach(result => {
+        logger.info(`  - [ID: ${result.id}] ${result.name}: ${result.error}`);
+      });
+    }
+
+    logger.info(`\n${'='.repeat(60)}\n`);
+
+  } catch (error: any) {
+    logger.error('批量测试失败:', error);
+    process.exit(1);
+  } finally {
+    await AppDataSource.destroy();
+    process.exit(0);
+  }
+}
+
+testAllXhsAccounts().catch(console.error);

+ 128 - 0
server/src/scripts/test-python-api-direct.ts

@@ -0,0 +1,128 @@
+#!/usr/bin/env tsx
+/**
+ * 直接测试 Python API - 查看详细响应
+ */
+import { initDatabase } from '../models/index.js';
+import { PlatformAccount } from '../models/entities/PlatformAccount.js';
+import { AppDataSource } from '../models/index.js';
+import { CookieManager } from '../automation/cookie.js';
+import { logger } from '../utils/logger.js';
+
+const PYTHON_SERVICE_URL = process.env.PYTHON_PUBLISH_SERVICE_URL || 'http://localhost:5005';
+
+async function main() {
+  logger.info('========================================');
+  logger.info('直接测试 Python API - 查看详细响应');
+  logger.info('========================================');
+
+  await initDatabase();
+  logger.info('数据库连接已初始化');
+
+  const accountId = parseInt(process.argv[2] || '35');
+  
+  const accountRepository = AppDataSource.getRepository(PlatformAccount);
+  const account = await accountRepository.findOne({
+    where: {
+      platform: 'xiaohongshu',
+      id: accountId,
+    },
+  });
+
+  if (!account) {
+    logger.error(`未找到账号 ID: ${accountId}`);
+    process.exit(1);
+  }
+
+  logger.info(`找到账号: ID=${account.id}, 名称=${account.accountName}`);
+
+  if (!account.cookieData) {
+    logger.error('账号没有 Cookie 数据');
+    process.exit(1);
+  }
+
+  // 解密 Cookie
+  let decryptedCookies: string;
+  try {
+    decryptedCookies = CookieManager.decrypt(account.cookieData);
+    logger.info('Cookie 解密成功');
+  } catch {
+    decryptedCookies = account.cookieData;
+    logger.info('使用原始 Cookie 数据');
+  }
+
+  // 解析 Cookie
+  let cookieList: { name: string; value: string; domain: string; path: string }[];
+  try {
+    cookieList = JSON.parse(decryptedCookies);
+    logger.info(`解析到 ${cookieList.length} 个 Cookie (JSON 格式)`);
+  } catch {
+    // Plain "name=value; name=value" string format
+    cookieList = decryptedCookies.split(';').map(item => {
+      const [name, ...rest] = item.trim().split('='); // keep any '=' inside the value (e.g. base64 padding)
+      return { name: name.trim(), value: rest.join('=').trim(), domain: '', path: '/' };
+    }).filter(c => c.name);
+    logger.info(`解析到 ${cookieList.length} 个 Cookie (字符串格式)`);
+  }
+
+  const cookieString = JSON.stringify(cookieList);
+  logger.info(`Cookie JSON 长度: ${cookieString.length} 字符`);
+
+  // 调用 Python API
+  logger.info('\n========================================');
+  logger.info('调用 Python API: /works');
+  logger.info('========================================\n');
+
+  try {
+    const response = await fetch(`${PYTHON_SERVICE_URL}/works`, {
+      method: 'POST',
+      headers: {
+        'Content-Type': 'application/json',
+      },
+      body: JSON.stringify({
+        platform: 'xiaohongshu',
+        cookie: cookieString,
+        page: 0,
+        page_size: 20,
+        auto_paging: true,
+      }),
+    });
+
+    logger.info(`HTTP 状态码: ${response.status} ${response.statusText}`);
+
+    const result = await response.json();
+    
+    logger.info('\n========================================');
+    logger.info('Python API 响应:');
+    logger.info('========================================');
+    logger.info(`success: ${result.success}`);
+    logger.info(`platform: ${result.platform}`);
+    logger.info(`works count: ${result.works?.length || 0}`);
+    logger.info(`total: ${result.total || 0}`);
+    logger.info(`has_more: ${result.has_more}`);
+    logger.info(`next_page: ${result.next_page}`);
+    logger.info(`error: ${result.error || '(none)'}`);
+    logger.info(`debug_info: ${result.debug_info || '(none)'}`);
+    
+    if (result.works && result.works.length > 0) {
+      logger.info(`\n前 3 个作品:`);
+      result.works.slice(0, 3).forEach((work: any, index: number) => {
+        logger.info(`  ${index + 1}. ${work.title} (ID: ${work.work_id})`);
+      });
+    } else {
+      logger.warn('\n⚠️ 没有获取到任何作品!');
+    }
+
+    logger.info('\n完整响应 JSON:');
+    logger.info(JSON.stringify(result, null, 2));
+
+  } catch (error) {
+    logger.error('调用 Python API 失败:', error);
+    if (error instanceof Error) {
+      logger.error('错误堆栈:', error.stack);
+    }
+  }
+
+  process.exit(0);
+}
+
+main();

+ 134 - 0
server/src/scripts/test-xhs-works-sync.ts

@@ -0,0 +1,134 @@
+#!/usr/bin/env tsx
+/**
+ * 测试小红书作品同步 - 用于测试自动分页逻辑
+ * 
+ * 运行: cd server && pnpm exec tsx src/scripts/test-xhs-works-sync.ts "O_O点心时间到了"
+ */
+import { initDatabase } from '../models/index.js';
+import { PlatformAccount } from '../models/entities/PlatformAccount.js';
+import { AppDataSource } from '../models/index.js';
+import { WorkService } from '../services/WorkService.js';
+import { logger } from '../utils/logger.js';
+import { CookieManager } from '../automation/cookie.js';
+import type { PlatformType } from '@media-manager/shared';
+
+async function main() {
+  logger.info('========================================');
+  logger.info('测试小红书作品同步 - 自动分页统计');
+  logger.info('========================================');
+
+  // 初始化数据库连接
+  await initDatabase();
+  logger.info('数据库连接已初始化');
+
+  const accountArg = process.argv[2];
+  if (!accountArg) {
+    console.log('\n用法: pnpm exec tsx src/scripts/test-xhs-works-sync.ts <账号ID或账号名称>');
+    console.log('\n示例: pnpm exec tsx src/scripts/test-xhs-works-sync.ts 35');
+    console.log('示例: pnpm exec tsx src/scripts/test-xhs-works-sync.ts "O_O点心时间到了"');
+    console.log('\n正在列出所有小红书账号...\n');
+    
+    // 列出所有小红书账号
+    const accountRepository = AppDataSource.getRepository(PlatformAccount);
+    const accounts = await accountRepository.find({
+      where: { platform: 'xiaohongshu' },
+      order: { id: 'ASC' },
+    });
+    
+    if (accounts.length === 0) {
+      logger.error('未找到任何小红书账号');
+      process.exit(1);
+    }
+    
+    console.log('小红书账号列表:');
+    accounts.forEach(acc => {
+      console.log(`  ID: ${acc.id}, 名称: ${acc.accountName || '(未设置)'}, 作品数: ${acc.worksCount}`);
+    });
+    console.log('');
+    process.exit(0);
+  }
+
+  try {
+    // 查找账号 - 先尝试按ID查找,再按名称查找
+    const accountRepository = AppDataSource.getRepository(PlatformAccount);
+    let account: PlatformAccount | null = null;
+    
+    const accountId = Number(accountArg); // NaN for mixed input such as "35abc", unlike parseInt
+    if (Number.isInteger(accountId) && accountId > 0) {
+      // 按ID查找
+      account = await accountRepository.findOne({
+        where: {
+          platform: 'xiaohongshu',
+          id: accountId,
+        },
+      });
+    }
+    
+    if (!account) {
+      // 按名称查找
+      account = await accountRepository.findOne({
+        where: {
+          platform: 'xiaohongshu',
+          accountName: accountArg,
+        },
+      });
+    }
+
+    if (!account) {
+      logger.error(`未找到账号: ${accountArg}`);
+      logger.info('提示: 可以不带参数运行脚本查看所有账号列表');
+      process.exit(1);
+    }
+
+    logger.info(`找到账号: ID=${account.id}, 名称=${account.accountName}, 平台=${account.platform}`);
+    logger.info(`账号信息: 粉丝数=${account.fansCount}, 作品数=${account.worksCount}`);
+
+    if (!account.cookieData) {
+      logger.error('账号没有 Cookie 数据');
+      process.exit(1);
+    }
+
+    // 解密 Cookie
+    let decryptedCookies: string;
+    try {
+      decryptedCookies = CookieManager.decrypt(account.cookieData);
+      logger.info('Cookie 解密成功');
+    } catch {
+      decryptedCookies = account.cookieData;
+      logger.info('使用原始 Cookie 数据');
+    }
+
+    // 调用作品同步
+    logger.info('\n========================================');
+    logger.info('开始同步作品(将显示详细分页日志)...');
+    logger.info('========================================\n');
+
+    const workService = new WorkService();
+    await workService.syncAccountWorks(account.userId, account);
+
+    logger.info('\n========================================');
+    logger.info('作品同步完成!');
+    logger.info('========================================');
+
+    // 查询同步后的作品数量
+    const { Work } = await import('../models/entities/Work.js');
+    const workRepository = AppDataSource.getRepository(Work);
+    const worksCount = await workRepository.count({
+      where: {
+        accountId: account.id,
+      },
+    });
+
+    logger.info(`账号 ${account.accountName} 当前数据库中的作品数: ${worksCount}`);
+  } catch (error) {
+    logger.error('测试失败:', error);
+    if (error instanceof Error) {
+      logger.error('错误堆栈:', error.stack);
+    }
+    process.exit(1);
+  }
+
+  process.exit(0);
+}
+
+main();
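
The script accepts either an account ID or an account name as its argument. A minimal sketch of how that argument could be classified strictly is given below, so that a name starting with digits (for example `35abc`) is never treated as ID 35. `parseAccountArg` and `AccountLookup` are hypothetical names for illustration only.

```typescript
// Hypothetical helper, not part of the commit.
type AccountLookup =
  | { kind: 'id'; id: number }
  | { kind: 'name'; name: string };

function parseAccountArg(arg: string): AccountLookup {
  const trimmed = arg.trim();
  // Only a string made entirely of digits (no leading zero) counts as an ID.
  if (/^[1-9]\d*$/.test(trimmed)) {
    return { kind: 'id', id: Number(trimmed) };
  }
  return { kind: 'name', name: trimmed };
}
```

The script's existing fallback still applies: when an ID lookup finds nothing, the same argument can be tried as an account name, which covers accounts whose name happens to be purely numeric.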

+ 92 - 0
server/src/scripts/verify-timestamp-format.ts

@@ -0,0 +1,92 @@
+#!/usr/bin/env tsx
+/**
+ * 验证 works 和 work_day_statistics 表的时间字段格式
+ */
+import { initDatabase } from '../models/index.js';
+import { AppDataSource } from '../models/index.js';
+import { logger } from '../utils/logger.js';
+
+async function verifyTimestampFormat() {
+  try {
+    await initDatabase();
+    logger.info('数据库连接已初始化');
+
+    logger.info('\n========================================');
+    logger.info('验证时间字段格式...');
+    logger.info('========================================\n');
+
+    // 检查 works 表
+    logger.info('works 表示例数据:');
+    const works = await AppDataSource.query(`
+      SELECT id, 
+             created_at, 
+             updated_at,
+             DATE_FORMAT(created_at, '%Y-%m-%d %H:%i:%s') as created_at_formatted,
+             DATE_FORMAT(updated_at, '%Y-%m-%d %H:%i:%s') as updated_at_formatted
+      FROM works 
+      ORDER BY id DESC 
+      LIMIT 5
+    `);
+    works.forEach((row: any) => {
+      logger.info(`  ID ${row.id}:`);
+      logger.info(`    created_at (原始): ${row.created_at}`);
+      logger.info(`    created_at (格式化): ${row.created_at_formatted}`);
+      logger.info(`    updated_at (原始): ${row.updated_at}`);
+      logger.info(`    updated_at (格式化): ${row.updated_at_formatted}`);
+    });
+
+    // 检查 work_day_statistics 表
+    logger.info('\nwork_day_statistics 表示例数据:');
+    const stats = await AppDataSource.query(`
+      SELECT id, 
+             created_at, 
+             updated_at,
+             DATE_FORMAT(created_at, '%Y-%m-%d %H:%i:%s') as created_at_formatted,
+             DATE_FORMAT(updated_at, '%Y-%m-%d %H:%i:%s') as updated_at_formatted
+      FROM work_day_statistics 
+      ORDER BY id DESC 
+      LIMIT 5
+    `);
+    stats.forEach((row: any) => {
+      logger.info(`  ID ${row.id}:`);
+      logger.info(`    created_at (原始): ${row.created_at}`);
+      logger.info(`    created_at (格式化): ${row.created_at_formatted}`);
+      logger.info(`    updated_at (原始): ${row.updated_at}`);
+      logger.info(`    updated_at (格式化): ${row.updated_at_formatted}`);
+    });
+
+    // 检查表结构
+    logger.info('\n检查表结构:');
+    const worksStructure = await AppDataSource.query(`
+      SHOW COLUMNS FROM works WHERE Field IN ('created_at', 'updated_at')
+    `);
+    logger.info('works 表字段类型:');
+    worksStructure.forEach((col: any) => {
+      logger.info(`  ${col.Field}: ${col.Type}`);
+    });
+
+    const statsStructure = await AppDataSource.query(`
+      SHOW COLUMNS FROM work_day_statistics WHERE Field IN ('created_at', 'updated_at')
+    `);
+    logger.info('work_day_statistics 表字段类型:');
+    statsStructure.forEach((col: any) => {
+      logger.info(`  ${col.Field}: ${col.Type}`);
+    });
+
+    logger.info('\n========================================');
+    logger.info('验证完成!');
+    logger.info('========================================\n');
+
+  } catch (error: any) {
+    logger.error('验证失败:', error);
+    if (error instanceof Error) {
+      logger.error('错误堆栈:', error.stack);
+    }
+    process.exit(1);
+  } finally {
+    await AppDataSource.destroy();
+    process.exit(0);
+  }
+}
+
+verifyTimestampFormat().catch(console.error);

+ 64 - 0
server/src/scripts/verify-work-timestamp.ts

@@ -0,0 +1,64 @@
+#!/usr/bin/env tsx
+/**
+ * 验证 works 表的时间字段格式
+ */
+import { initDatabase } from '../models/index.js';
+import { AppDataSource } from '../models/index.js';
+import { Work } from '../models/entities/Work.js';
+import { logger } from '../utils/logger.js';
+
+async function verifyWorkTimestamp() {
+  try {
+    await initDatabase();
+    logger.info('数据库连接已初始化');
+
+    const repo = AppDataSource.getRepository(Work);
+    
+    // 查询最新的作品
+    const work = await repo.findOne({
+      where: {},
+      order: { id: 'DESC' },
+      select: ['id', 'title', 'createdAt', 'updatedAt']
+    });
+
+    if (work) {
+      logger.info(`\n作品 ID: ${work.id}`);
+      logger.info(`标题: ${work.title}`);
+      logger.info(`createdAt (Date对象): ${work.createdAt}`);
+      logger.info(`updatedAt (Date对象): ${work.updatedAt}`);
+      
+      // Format as a string (toISOString() returns UTC, so this can differ from the value stored in the database)
+      const createdAtStr = work.createdAt instanceof Date 
+        ? work.createdAt.toISOString().replace('T', ' ').substring(0, 19)
+        : String(work.createdAt);
+      const updatedAtStr = work.updatedAt instanceof Date
+        ? work.updatedAt.toISOString().replace('T', ' ').substring(0, 19)
+        : String(work.updatedAt);
+      
+      logger.info(`createdAt (格式化): ${createdAtStr}`);
+      logger.info(`updatedAt (格式化): ${updatedAtStr}`);
+      
+      // 验证格式是否正确 (YYYY-MM-DD HH:mm:ss)
+      const datePattern = /^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}$/;
+      if (datePattern.test(createdAtStr) && datePattern.test(updatedAtStr)) {
+        logger.info('\n✅ 时间格式正确!');
+      } else {
+        logger.warn('\n⚠️ 时间格式可能不正确');
+      }
+    } else {
+      logger.warn('未找到作品记录');
+    }
+
+  } catch (error: any) {
+    logger.error('验证失败:', error);
+    if (error instanceof Error) {
+      logger.error('错误堆栈:', error.stack);
+    }
+    process.exit(1);
+  } finally {
+    await AppDataSource.destroy();
+    process.exit(0);
+  }
+}
+
+verifyWorkTimestamp().catch(console.error);
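
The check above formats the dates with `toISOString()`, which always yields UTC, so the printed value can differ from the `DATE_FORMAT` output of the other verification scripts by the local UTC offset. A minimal sketch of a local-time formatter producing `YYYY-MM-DD HH:mm:ss` follows. `formatLocalDateTime` is a hypothetical helper, and it assumes the Node.js process runs in the time zone the timestamps should be displayed in.

```typescript
// Hypothetical helper, not part of the commit.
// Formats a Date as "YYYY-MM-DD HH:mm:ss" using the process's local time zone.
function formatLocalDateTime(date: Date): string {
  const pad = (n: number): string => String(n).padStart(2, '0');
  const datePart = `${date.getFullYear()}-${pad(date.getMonth() + 1)}-${pad(date.getDate())}`;
  const timePart = `${pad(date.getHours())}:${pad(date.getMinutes())}:${pad(date.getSeconds())}`;
  return `${datePart} ${timePart}`;
}
```

Note that any valid `Date` formatted this way will match the `datePattern` regular expression, so the pattern check confirms the output shape only and says nothing about how the value is stored in MySQL.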

+ 9 - 0
server/src/services/HeadlessBrowserService.ts

@@ -569,7 +569,16 @@ class HeadlessBrowserService {
 
       const result = await response.json();
 
+      // Log the Python API response in detail (for debugging)
+      if (pageIndex === 0) {
+        logger.info(`[Python API] Response for ${platform}: success=${result.success}, works_count=${result.works?.length || 0}, total=${result.total || 0}, has_more=${result.has_more}, error=${result.error || 'none'}`);
+        if (result.error) {
+          logger.warn(`[Python API] Error from Python service: ${result.error}`);
+        }
+      }
+
       if (!result.success) {
+        logger.error(`[Python API] Python service returned error: ${result.error || 'Unknown error'}, full response: ${JSON.stringify(result).substring(0, 500)}`);
         throw new Error(result.error || 'Failed to get works');
       }