feat(baoyu-danger-x-to-markdown): add --download-media flag and media localization

This commit is contained in:
Jim Liu 宝玉 2026-02-12 01:56:24 -06:00
parent 9ff468a6a7
commit 260b71cdd7
8 changed files with 561 additions and 33 deletions

View File

@ -681,6 +681,9 @@ Converts X (Twitter) content to markdown format. Supports tweet threads and X Ar
# JSON output
/baoyu-danger-x-to-markdown https://x.com/username/status/123456 --json
# Download media (images/videos) to local files
/baoyu-danger-x-to-markdown https://x.com/username/status/123456 --download-media
```
**Supported URLs:**
@ -713,7 +716,7 @@ Format plain text or markdown files with proper frontmatter, titles, summaries,
**Workflow**:
1. Read source file and analyze content structure
2. Check/create YAML frontmatter (title, slug, summary, featureImage)
2. Check/create YAML frontmatter (title, slug, summary, coverImage)
3. Handle title: use existing, extract from H1, or generate candidates
4. Apply formatting: headings, bold, lists, code blocks, quotes
5. Save to `{filename}-formatted.md`
@ -725,7 +728,7 @@ Format plain text or markdown files with proper frontmatter, titles, summaries,
| `title` | Use existing, extract H1, or generate candidates |
| `slug` | Infer from file path or generate from title |
| `summary` | Generate engaging summary (100-150 chars) |
| `featureImage` | Check for `imgs/cover.png` in same directory |
| `coverImage` | Check for `imgs/cover.png` in same directory |
**Formatting Rules**:
| Element | Format |

View File

@ -681,6 +681,9 @@ AI 驱动的生成后端。
# JSON 输出
/baoyu-danger-x-to-markdown https://x.com/username/status/123456 --json
# 下载媒体文件(图片/视频)到本地
/baoyu-danger-x-to-markdown https://x.com/username/status/123456 --download-media
```
**支持的 URL**
@ -713,7 +716,7 @@ AI 驱动的生成后端。
**工作流程**
1. 读取源文件并分析内容结构
2. 检查/创建 YAML frontmattertitle、slug、summary、featureImage
2. 检查/创建 YAML frontmattertitle、slug、summary、coverImage
3. 处理标题:使用现有标题、提取 H1 或生成候选标题
4. 应用格式:层级标题、加粗、列表、代码块、引用
5. 保存为 `{文件名}-formatted.md`
@ -725,7 +728,7 @@ AI 驱动的生成后端。
| `title` | 使用现有、提取 H1 或生成候选 |
| `slug` | 从文件路径推断或根据标题生成 |
| `summary` | 生成吸引人的摘要100-150 字) |
| `featureImage` | 检查同目录下 `imgs/cover.png` |
| `coverImage` | 检查同目录下 `imgs/cover.png` |
**格式化规则**
| 元素 | 格式 |

View File

@ -92,16 +92,52 @@ test -f "$HOME/.baoyu-skills/baoyu-danger-x-to-markdown/EXTEND.md" && echo "user
├───────────┼───────────────────────────────────────────────────────────────────────────┤
│ Found │ Read, parse, apply settings │
├───────────┼───────────────────────────────────────────────────────────────────────────┤
│ Not found │ Use defaults
│ Not found │ **MUST** run first-time setup (see below) — do NOT silently create defaults
└───────────┴───────────────────────────────────────────────────────────────────────────┘
**EXTEND.md Supports**: Default output directory | Output format preferences
**EXTEND.md Supports**: Download media by default | Default output directory
### First-Time Setup (BLOCKING)
**CRITICAL**: When EXTEND.md is not found, you **MUST use `AskUserQuestion`** to ask the user for their preferences before creating EXTEND.md. **NEVER** create EXTEND.md with defaults without asking. This is a **BLOCKING** operation — do NOT proceed with any conversion until setup is complete.
Use `AskUserQuestion` with ALL questions in ONE call:
**Question 1** — header: "Media", question: "How to handle images and videos in tweets?"
- "Ask each time (Recommended)" — After saving markdown, ask whether to download media
- "Always download" — Always download media to local imgs/ and videos/ directories
- "Never download" — Keep original remote URLs in markdown
**Question 2** — header: "Output", question: "Default output directory?"
- "x-to-markdown (Recommended)" — Save to ./x-to-markdown/{username}/{tweet-id}.md
- (User may choose "Other" to type a custom path)
**Question 3** — header: "Save", question: "Where to save preferences?"
- "User (Recommended)" — ~/.baoyu-skills/ (all projects)
- "Project" — .baoyu-skills/ (this project only)
After user answers, create EXTEND.md at the chosen location, confirm "Preferences saved to [path]", then continue.
Full reference: [references/config/first-time-setup.md](references/config/first-time-setup.md)
### Supported Keys
| Key | Default | Values | Description |
|-----|---------|--------|-------------|
| `download_media` | `ask` | `ask` / `1` / `0` | `ask` = prompt each time, `1` = always download, `0` = never |
| `default_output_dir` | empty | path or empty | Default output directory (empty = `./x-to-markdown/`) |
**Value priority**:
1. CLI arguments (`--download-media`, `-o`)
2. EXTEND.md
3. Skill defaults
## Usage
```bash
npx -y bun ${SKILL_DIR}/scripts/main.ts <url>
npx -y bun ${SKILL_DIR}/scripts/main.ts <url> -o output.md
npx -y bun ${SKILL_DIR}/scripts/main.ts <url> --download-media
npx -y bun ${SKILL_DIR}/scripts/main.ts <url> --json
```
@ -112,6 +148,7 @@ npx -y bun ${SKILL_DIR}/scripts/main.ts <url> --json
| `<url>` | Tweet or article URL |
| `-o <path>` | Output path |
| `--json` | JSON output |
| `--download-media` | Download image/video assets to local `imgs/` and `videos/`, and rewrite markdown links to local relative paths |
| `--login` | Refresh cookies only |
## Supported URLs
@ -124,9 +161,10 @@ npx -y bun ${SKILL_DIR}/scripts/main.ts <url> --json
```markdown
---
url: https://x.com/user/status/123
url: "https://x.com/user/status/123"
author: "Name (@user)"
tweet_count: 3
tweetCount: 3
coverImage: "https://pbs.twimg.com/media/example.jpg"
---
Content...
@ -134,6 +172,32 @@ Content...
**File structure**: `x-to-markdown/{username}/{tweet-id}.md`
When `--download-media` is enabled:
- Images are saved to `imgs/` next to the markdown file
- Videos are saved to `videos/` next to the markdown file
- Markdown media links are rewritten to local relative paths
## Media Download Workflow
Based on `download_media` setting in EXTEND.md:
| Setting | Behavior |
|---------|----------|
| `1` (always) | Run script with `--download-media` flag |
| `0` (never) | Run script without `--download-media` flag |
| `ask` (default) | Follow the ask-each-time flow below |
### Ask-Each-Time Flow
1. Run script **without** `--download-media` → markdown saved
2. Check saved markdown for remote media URLs (`https://` in image/video links)
3. **If no remote media found** → done, no prompt needed
4. **If remote media found** → use `AskUserQuestion`:
- header: "Media", question: "Download N images/videos to local files?"
- "Yes" — Download to local directories
- "No" — Keep remote URLs
5. If user confirms → run script **again** with `--download-media` (overwrites markdown with localized links)
## Authentication
1. **Environment variables** (preferred): `X_AUTH_TOKEN`, `X_CT0`

View File

@ -0,0 +1,106 @@
---
name: first-time-setup
description: First-time setup flow for baoyu-danger-x-to-markdown preferences
---
# First-Time Setup
## Overview
When no EXTEND.md is found, guide user through preference setup.
**BLOCKING OPERATION**: This setup MUST complete before ANY other workflow steps. Do NOT:
- Start converting tweets or articles
- Ask about URLs or output paths
- Proceed to any conversion
ONLY ask the questions in this setup flow, save EXTEND.md, then continue.
## Setup Flow
```
No EXTEND.md found
|
v
+---------------------+
| AskUserQuestion |
| (all questions) |
+---------------------+
|
v
+---------------------+
| Create EXTEND.md |
+---------------------+
|
v
Continue conversion
```
## Questions
**Language**: Use user's input language or saved language preference.
Use AskUserQuestion with ALL questions in ONE call:
### Question 1: Download Media
```yaml
header: "Media"
question: "How to handle images and videos in tweets?"
options:
- label: "Ask each time (Recommended)"
description: "After saving markdown, ask whether to download media"
- label: "Always download"
description: "Always download media to local imgs/ and videos/ directories"
- label: "Never download"
description: "Keep original remote URLs in markdown"
```
### Question 2: Default Output Directory
```yaml
header: "Output"
question: "Default output directory?"
options:
- label: "x-to-markdown (Recommended)"
description: "Save to ./x-to-markdown/{username}/{tweet-id}.md"
```
Note: User will likely choose "Other" to type a custom path.
### Question 3: Save Location
```yaml
header: "Save"
question: "Where to save preferences?"
options:
- label: "User (Recommended)"
description: "~/.baoyu-skills/ (all projects)"
- label: "Project"
description: ".baoyu-skills/ (this project only)"
```
## Save Locations
| Choice | Path | Scope |
|--------|------|-------|
| User | `~/.baoyu-skills/baoyu-danger-x-to-markdown/EXTEND.md` | All projects |
| Project | `.baoyu-skills/baoyu-danger-x-to-markdown/EXTEND.md` | Current project |
## After Setup
1. Create directory if needed
2. Write EXTEND.md
3. Confirm: "Preferences saved to [path]"
4. Continue with conversion using saved preferences
## EXTEND.md Template
```md
download_media: [ask/1/0]
default_output_dir: [path or empty]
```
## Modifying Preferences Later
Users can edit EXTEND.md directly or delete it to trigger setup again.

View File

@ -6,6 +6,7 @@ import { mkdir, readFile, rename, writeFile } from "node:fs/promises";
import { fetchXArticle } from "./graphql.js";
import { formatArticleMarkdown } from "./markdown.js";
import { localizeMarkdownMedia, type LocalizeMarkdownMediaResult } from "./media-localizer.js";
import { hasRequiredXCookies, loadXCookies, refreshXCookies } from "./cookies.js";
import { resolveXToMarkdownConsentPath } from "./paths.js";
import { tweetToMarkdown } from "./tweet-to-markdown.js";
@ -15,6 +16,7 @@ type CliArgs = {
output: string | null;
json: boolean;
login: boolean;
downloadMedia: boolean;
help: boolean;
};
@ -38,6 +40,7 @@ Usage:
Options:
--output <path>, -o Output path (file or dir). Default: ./x-to-markdown/<slug>/
--json Output as JSON
--download-media Download images/videos to local ./imgs and ./videos next to markdown
--login Refresh cookies only, then exit
--help, -h Show help
@ -45,6 +48,7 @@ Examples:
${cmd} https://x.com/username/status/1234567890
${cmd} https://x.com/i/article/1234567890 -o ./article.md
${cmd} https://x.com/username/status/1234567890 -o ./out/
${cmd} https://x.com/username/status/1234567890 --download-media
${cmd} https://x.com/username/status/1234567890 --json | jq -r '.markdownPath'
${cmd} --login
`);
@ -57,6 +61,7 @@ function parseArgs(argv: string[]): CliArgs {
output: null,
json: false,
login: false,
downloadMedia: false,
help: false,
};
@ -80,6 +85,11 @@ function parseArgs(argv: string[]): CliArgs {
continue;
}
if (a === "--download-media") {
out.downloadMedia = true;
continue;
}
if (a === "--url") {
const v = argv[++i];
if (!v) throw new Error("Missing value for --url");
@ -345,16 +355,17 @@ async function convertArticleToMarkdown(
log(`[x-to-markdown] Fetching article ${articleId}...`);
const article = await fetchXArticle(articleId, cookieMap, false);
const body = formatArticleMarkdown(article).trimEnd();
const { markdown: body, coverUrl } = formatArticleMarkdown(article);
const title = typeof (article as any)?.title === "string" ? String((article as any).title).trim() : "";
const meta = formatMetaMarkdown({
url: `https://x.com/i/article/${articleId}`,
requested_url: inputUrl,
requestedUrl: inputUrl,
title: title || null,
coverImage: coverUrl,
});
return [meta, body].filter(Boolean).join("\n\n").trimEnd();
return [meta, body.trimEnd()].filter(Boolean).join("\n\n").trimEnd();
}
async function main(): Promise<void> {
@ -385,11 +396,24 @@ async function main(): Promise<void> {
const kind = articleId ? ("article" as const) : ("tweet" as const);
const { outputDir, markdownPath, slug } = await resolveOutputPath(normalizedUrl, kind, args.output, log);
const markdown =
let markdown =
kind === "article" && articleId
? await convertArticleToMarkdown(normalizedUrl, articleId, log)
: await tweetToMarkdown(normalizedUrl, { log });
let mediaResult: LocalizeMarkdownMediaResult | null = null;
if (args.downloadMedia) {
mediaResult = await localizeMarkdownMedia(markdown, {
markdownPath,
log,
});
markdown = mediaResult.markdown;
log(
`[x-to-markdown] Media localized: images=${mediaResult.downloadedImages}, videos=${mediaResult.downloadedVideos}`
);
}
await writeFile(markdownPath, markdown, "utf8");
log(`[x-to-markdown] Saved: ${markdownPath}`);
@ -398,11 +422,16 @@ async function main(): Promise<void> {
JSON.stringify(
{
url: articleId ? `https://x.com/i/article/${articleId}` : normalizedUrl,
requested_url: normalizedUrl,
requestedUrl: normalizedUrl,
type: kind,
slug,
outputDir,
markdownPath,
downloadMedia: args.downloadMedia,
downloadedImages: mediaResult?.downloadedImages ?? 0,
downloadedVideos: mediaResult?.downloadedVideos ?? 0,
imageDir: mediaResult?.imageDir ?? null,
videoDir: mediaResult?.videoDir ?? null,
},
null,
2

View File

@ -257,10 +257,15 @@ function renderContentBlocks(
return lines;
}
export function formatArticleMarkdown(article: unknown): string {
export type FormatArticleResult = {
markdown: string;
coverUrl: string | null;
};
export function formatArticleMarkdown(article: unknown): FormatArticleResult {
const candidate = coerceArticleEntity(article);
if (!candidate) {
return `\`\`\`json\n${JSON.stringify(article, null, 2)}\n\`\`\``;
return { markdown: `\`\`\`json\n${JSON.stringify(article, null, 2)}\n\`\`\``, coverUrl: null };
}
const lines: string[] = [];
@ -271,10 +276,8 @@ export function formatArticleMarkdown(article: unknown): string {
lines.push(`# ${title}`);
}
const coverUrl = resolveMediaUrl(candidate.cover_media?.media_info);
const coverUrl = resolveMediaUrl(candidate.cover_media?.media_info) ?? null;
if (coverUrl) {
if (lines.length > 0) lines.push("");
lines.push(`![](${coverUrl})`);
usedUrls.add(coverUrl);
}
@ -294,7 +297,7 @@ export function formatArticleMarkdown(article: unknown): string {
lines.push(candidate.preview_text.trim());
}
const mediaUrls = collectMediaUrls(candidate, usedUrls, coverUrl);
const mediaUrls = collectMediaUrls(candidate, usedUrls, coverUrl ?? undefined);
if (mediaUrls.length > 0) {
lines.push("", "## Media", "");
for (const url of mediaUrls) {
@ -302,5 +305,5 @@ export function formatArticleMarkdown(article: unknown): string {
}
}
return lines.join("\n").trimEnd();
return { markdown: lines.join("\n").trimEnd(), coverUrl };
}

View File

@ -0,0 +1,314 @@
import path from "node:path";
import { mkdir, writeFile } from "node:fs/promises";
type MediaKind = "image" | "video";
type MediaHint = "image" | "unknown";
type MarkdownLinkCandidate = {
url: string;
hint: MediaHint;
};
export type LocalizeMarkdownMediaOptions = {
markdownPath: string;
log?: (message: string) => void;
};
export type LocalizeMarkdownMediaResult = {
markdown: string;
downloadedImages: number;
downloadedVideos: number;
imageDir: string | null;
videoDir: string | null;
};
const MARKDOWN_LINK_RE = /(!?\[[^\]\n]*\])\((<)?(https?:\/\/[^)\s>]+)(>)?\)/g;
const IMAGE_EXTENSIONS = new Set([
"jpg",
"jpeg",
"png",
"webp",
"gif",
"bmp",
"avif",
"heic",
"heif",
"svg",
]);
const VIDEO_EXTENSIONS = new Set(["mp4", "m4v", "mov", "webm", "mkv"]);
const MIME_EXTENSION_MAP: Record<string, string> = {
"image/jpeg": "jpg",
"image/jpg": "jpg",
"image/png": "png",
"image/webp": "webp",
"image/gif": "gif",
"image/bmp": "bmp",
"image/avif": "avif",
"image/heic": "heic",
"image/heif": "heif",
"image/svg+xml": "svg",
"video/mp4": "mp4",
"video/webm": "webm",
"video/quicktime": "mov",
"video/x-m4v": "m4v",
};
const DOWNLOAD_USER_AGENT =
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36";
function normalizeContentType(raw: string | null): string {
return raw?.split(";")[0]?.trim().toLowerCase() ?? "";
}
function normalizeExtension(raw: string | undefined | null): string | undefined {
if (!raw) return undefined;
const trimmed = raw.replace(/^\./, "").trim().toLowerCase();
if (!trimmed) return undefined;
if (trimmed === "jpeg") return "jpg";
if (trimmed === "jpg") return "jpg";
return trimmed;
}
function resolveExtensionFromUrl(rawUrl: string): string | undefined {
try {
const parsed = new URL(rawUrl);
const extFromPath = normalizeExtension(path.posix.extname(parsed.pathname));
if (extFromPath) return extFromPath;
const extFromFormat = normalizeExtension(parsed.searchParams.get("format"));
if (extFromFormat) return extFromFormat;
} catch {
return undefined;
}
return undefined;
}
function resolveKindFromContentType(contentType: string): MediaKind | undefined {
if (!contentType) return undefined;
if (contentType.startsWith("image/")) return "image";
if (contentType.startsWith("video/")) return "video";
return undefined;
}
function resolveKindFromExtension(ext: string | undefined): MediaKind | undefined {
if (!ext) return undefined;
if (IMAGE_EXTENSIONS.has(ext)) return "image";
if (VIDEO_EXTENSIONS.has(ext)) return "video";
return undefined;
}
function resolveKindFromHostname(rawUrl: string): MediaKind | undefined {
try {
const hostname = new URL(rawUrl).hostname.toLowerCase();
if (hostname.includes("video.twimg.com")) return "video";
if (hostname.includes("pbs.twimg.com")) return "image";
} catch {
return undefined;
}
return undefined;
}
function resolveMediaKind(
rawUrl: string,
contentType: string,
extension: string | undefined,
hint: MediaHint
): MediaKind | undefined {
const kindFromType = resolveKindFromContentType(contentType);
if (kindFromType) return kindFromType;
const kindFromExtension = resolveKindFromExtension(extension);
if (kindFromExtension) return kindFromExtension;
const kindFromHost = resolveKindFromHostname(rawUrl);
if (kindFromHost) return kindFromHost;
if (contentType && contentType !== "application/octet-stream") {
return undefined;
}
return hint === "image" ? "image" : undefined;
}
function resolveOutputExtension(
contentType: string,
extension: string | undefined,
kind: MediaKind
): string {
const extFromMime = normalizeExtension(MIME_EXTENSION_MAP[contentType]);
if (extFromMime) return extFromMime;
const normalizedExt = normalizeExtension(extension);
if (normalizedExt) return normalizedExt;
return kind === "video" ? "mp4" : "jpg";
}
function safeDecodeURIComponent(value: string): string {
try {
return decodeURIComponent(value);
} catch {
return value;
}
}
function sanitizeFileSegment(input: string): string {
return input
.replace(/[^a-zA-Z0-9_-]+/g, "-")
.replace(/-+/g, "-")
.replace(/^[-_]+|[-_]+$/g, "")
.slice(0, 48);
}
function resolveFileStem(rawUrl: string, extension: string): string {
try {
const parsed = new URL(rawUrl);
const base = path.posix.basename(parsed.pathname);
if (!base) return "";
const decodedBase = safeDecodeURIComponent(base);
const normalizedExt = normalizeExtension(extension);
const stripExt = normalizedExt ? new RegExp(`\\.${normalizedExt}$`, "i") : null;
const rawStem = stripExt ? decodedBase.replace(stripExt, "") : decodedBase;
return sanitizeFileSegment(rawStem);
} catch {
return "";
}
}
function buildFileName(kind: MediaKind, index: number, sourceUrl: string, extension: string): string {
const stem = resolveFileStem(sourceUrl, extension);
const prefix = kind === "image" ? "img" : "video";
const serial = String(index).padStart(3, "0");
const suffix = stem ? `-${stem}` : "";
return `${prefix}-${serial}${suffix}.${extension}`;
}
const FRONTMATTER_COVER_RE = /^(coverImage:\s*")(https?:\/\/[^"]+)(")/m;
function collectMarkdownLinkCandidates(markdown: string): MarkdownLinkCandidate[] {
MARKDOWN_LINK_RE.lastIndex = 0;
const candidates: MarkdownLinkCandidate[] = [];
const seen = new Set<string>();
let match: RegExpExecArray | null;
while ((match = MARKDOWN_LINK_RE.exec(markdown))) {
const label = match[1] ?? "";
const rawUrl = match[3] ?? "";
if (!rawUrl || seen.has(rawUrl)) continue;
seen.add(rawUrl);
candidates.push({
url: rawUrl,
hint: label.startsWith("![") ? "image" : "unknown",
});
}
const fmMatch = markdown.match(/^---\n([\s\S]*?)\n---/);
if (fmMatch) {
const coverMatch = fmMatch[1]?.match(FRONTMATTER_COVER_RE);
if (coverMatch?.[2] && !seen.has(coverMatch[2])) {
seen.add(coverMatch[2]);
candidates.push({ url: coverMatch[2], hint: "image" });
}
}
return candidates;
}
function rewriteMarkdownMediaLinks(markdown: string, replacements: Map<string, string>): string {
if (replacements.size === 0) return markdown;
MARKDOWN_LINK_RE.lastIndex = 0;
let result = markdown.replace(MARKDOWN_LINK_RE, (full, label, _openAngle, rawUrl) => {
const localPath = replacements.get(rawUrl);
if (!localPath) return full;
return `${label}(${localPath})`;
});
result = result.replace(FRONTMATTER_COVER_RE, (full, prefix, rawUrl, suffix) => {
const localPath = replacements.get(rawUrl);
if (!localPath) return full;
return `${prefix}${localPath}${suffix}`;
});
return result;
}
export async function localizeMarkdownMedia(
markdown: string,
options: LocalizeMarkdownMediaOptions
): Promise<LocalizeMarkdownMediaResult> {
const log = options.log ?? (() => {});
const markdownDir = path.dirname(options.markdownPath);
const candidates = collectMarkdownLinkCandidates(markdown);
if (candidates.length === 0) {
return {
markdown,
downloadedImages: 0,
downloadedVideos: 0,
imageDir: null,
videoDir: null,
};
}
const replacements = new Map<string, string>();
let downloadedImages = 0;
let downloadedVideos = 0;
for (const candidate of candidates) {
try {
const response = await fetch(candidate.url, {
method: "GET",
redirect: "follow",
headers: {
"user-agent": DOWNLOAD_USER_AGENT,
},
});
if (!response.ok) {
log(`[x-to-markdown] Skip media (${response.status}): ${candidate.url}`);
continue;
}
const sourceUrl = response.url || candidate.url;
const contentType = normalizeContentType(response.headers.get("content-type"));
const extension = resolveExtensionFromUrl(sourceUrl) ?? resolveExtensionFromUrl(candidate.url);
const kind = resolveMediaKind(sourceUrl, contentType, extension, candidate.hint);
if (!kind) {
continue;
}
const outputExtension = resolveOutputExtension(contentType, extension, kind);
const nextIndex = kind === "image" ? downloadedImages + 1 : downloadedVideos + 1;
const dirName = kind === "image" ? "imgs" : "videos";
const targetDir = path.join(markdownDir, dirName);
await mkdir(targetDir, { recursive: true });
const fileName = buildFileName(kind, nextIndex, sourceUrl, outputExtension);
const absolutePath = path.join(targetDir, fileName);
const relativePath = path.posix.join(dirName, fileName);
const bytes = Buffer.from(await response.arrayBuffer());
await writeFile(absolutePath, bytes);
replacements.set(candidate.url, relativePath);
if (kind === "image") {
downloadedImages = nextIndex;
} else {
downloadedVideos = nextIndex;
}
} catch (error) {
const message = error instanceof Error ? error.message : String(error ?? "");
log(`[x-to-markdown] Failed to download media ${candidate.url}: ${message}`);
}
}
return {
markdown: rewriteMarkdownMediaLinks(markdown, replacements),
downloadedImages,
downloadedVideos,
imageDir: downloadedImages > 0 ? path.join(markdownDir, "imgs") : null,
videoDir: downloadedVideos > 0 ? path.join(markdownDir, "videos") : null,
};
}

View File

@ -123,22 +123,15 @@ export async function tweetToMarkdown(
const requestedUrl = normalizedUrl || buildTweetUrl(username, tweetId) || inputUrl.trim();
const rootUrl = buildTweetUrl(username, thread.rootId ?? tweetId) ?? requestedUrl;
const meta = formatMetaMarkdown({
url: rootUrl,
requested_url: requestedUrl,
author,
author_name: name ?? null,
author_username: username ?? null,
author_url: authorUrl ?? null,
tweet_count: thread.totalTweets ?? tweets.length,
});
const parts: string[] = [meta];
const articleEntity = await resolveArticleEntityFromTweet(firstTweet, cookieMap);
let coverImage: string | null = null;
let remainingTweets = tweets;
const parts: string[] = [];
if (articleEntity) {
const articleMarkdown = formatArticleMarkdown(articleEntity).trimEnd();
const articleResult = formatArticleMarkdown(articleEntity);
coverImage = articleResult.coverUrl;
const articleMarkdown = articleResult.markdown.trimEnd();
if (articleMarkdown) {
parts.push(articleMarkdown);
const firstTweetText = extractTweetText(firstTweet);
@ -148,6 +141,19 @@ export async function tweetToMarkdown(
}
}
const meta = formatMetaMarkdown({
url: rootUrl,
requestedUrl: requestedUrl,
author,
authorName: name ?? null,
authorUsername: username ?? null,
authorUrl: authorUrl ?? null,
tweetCount: thread.totalTweets ?? tweets.length,
coverImage,
});
parts.unshift(meta);
if (remainingTweets.length > 0) {
const hasArticle = parts.length > 1;
if (hasArticle) {