Release history

skill-seekers/Skill_Seekers releases

Transform 17 source types (docs, GitHub repos, PDFs, videos, Jupyter, Confluence, Notion, Slack/Discord) into AI-ready skills and RAG knowledge. 35 MCP tools for scraping, packaging, enhancing, and exporting to vector databases (Weaviate, Chroma, FAISS, Qdrant). Supports 16+ target platforms.

All releases

No immediate action
v3.7.0 Breaking risk

scan command + opt-in submission

v3.6.0 Breaking risk
Notable features
  • IBM Bob packaging target via `--target bob`
  • GitHub scraper filters: issue state, labels, and since date
  • Per-issue Markdown files for GitHub issues
Full changelog

[3.6.0] - 2026-05-03

Theme: Quality-of-life release — packaging targets, GitHub issue workflow, codebase analysis fixes, and source detection hardening.

Added

  • IBM Bob packaging target — new --target bob adaptor and agent install support for IBM's Bob agent platform (#366)
  • GitHub issue filtering: --github-issue-state, --github-issue-labels, and --github-issue-since filters in the GitHub scraper for narrowing which issues are pulled (#367)
  • Per-issue files — GitHub scraper now writes one Markdown file per issue instead of a single bundle, improving navigation and downstream chunking (#367)
  • Pinecone frontmatter — Pinecone vector exports now include consistent YAML frontmatter for metadata round-tripping (#367)

Fixed

  • Unified scraper now generates codebase_analysis/ index — local sources were producing C3.x outputs with broken SKILL.md links; the unified skill builder now wires up the index and resolves links correctly (#362, #376)
  • Guides fallback fires correctly: unified_skill_builder was emitting a truthy placeholder for empty guides, which suppressed the fallback content; the placeholder has been removed (#364, #375)
  • HTML URLs no longer treated as local files: source_detector now checks for http(s):// before falling through to the local-path branch, fixing false-positive routing (#373)
  • PDF extracted images appear in markdown: pdf_scraper now inserts ![](…) references for images extracted from PDFs so they render in the generated SKILL.md (#369)
  • C3.x output for local sources: the unified command was skipping the C3.x analysis pipeline for local codebase sources; it now emits the full pattern/test/guide/config/router output (#363, #372)
  • Language filter passed to C3.x clone analysis — repos cloned for analysis now respect --languages instead of analyzing every file (fixes #361, #370)
  • Unity vs Unreal detection — Unity projects with C# imports were being misidentified as Unreal; detection now keys on C# import patterns (fixes #365, #368)
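
The source-detection fix above is a matter of ordering: the http(s):// check has to run before the local-path branch. A minimal sketch of that ordering, assuming a hypothetical classify_source() helper rather than the project's actual source_detector API:

```python
import re

# Hypothetical helper for illustration; not the project's source_detector API.
_URL_RE = re.compile(r"^https?://", re.IGNORECASE)


def classify_source(source: str) -> str:
    """Return "web" for http(s) URLs, otherwise fall through to "local"."""
    # Check the scheme first, so a URL ending in .html is not routed as a local file.
    if _URL_RE.match(source):
        return "web"
    return "local"
```

With this ordering, "https://example.com/guide.html" is classified as web even though it ends in .html, while "./docs/guide.html" still takes the local branch.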
v3.5.1 Breaking risk
Breaking changes
  • max_pages default changed from 500 to -1 (unlimited)
  • removal of hardcoded magic numbers in constants.py; now reads defaults.json
Notable features
  • Centralized `defaults.json` config as single source of truth for all default values
  • Low‑signal code snippet filtering via `_is_low_signal_code_snippet()`
  • Pattern description normalization with `_normalize_pattern_description()`
Full changelog

[3.5.1] - 2026-04-12

Added

  • Centralized defaults.json config — single source of truth for all default values (rate_limit, max_pages, workers, async_mode, enhancement, analysis, RAG settings). New defaults.py loader module. All 15+ files that previously hardcoded defaults now read from this file (#356)
  • Low-signal code snippet filtering: _is_low_signal_code_snippet() filters junk patterns like bare True, options, single identifiers from quick references (#360)
  • Pattern description normalization: _normalize_pattern_description() cleans boilerplate prefixes and truncates to first meaningful sentence (#360)
  • Example language priority ranking: _example_language_priority() ranks Python > Bash > JSON > etc. for SKILL.md examples (#360)
  • checkpoint_exists() method on DocToSkillConverter — was called but never defined (#360)
  • Unified config source normalization: DocToSkillConverter.__init__ merges fields from sources[0] into flat config for compatibility (#360)
  • display_name support in SKILL.md generation — produces cleaner titles and slugs (#360)
  • New tests: test_doc_scraper_entrypoint.py (regression for _run_scraping), quick-reference quality tests, docs-only compatibility tests, nested reference coverage tests (#360)

Changed

  • max_pages default is now unlimited (-1) — the scraper fetches all pages unless the user explicitly sets --max-pages. Previously defaulted to 500 (#356)
  • --no-rate-limit flag now works — was defined in CLI arguments but never consumed by ExecutionContext (#356)
  • constants.py reads from defaults.json — no longer contains hardcoded magic numbers (#356)
  • ExecutionContext.ScrapingSettings: rate_limit and max_pages now use real defaults instead of None, preventing None-poisoning downstream (#356)
  • SKILL.md frontmatter cleanup — empty doc_version: and version: fields are now omitted; placeholder sections removed (#360)
  • Enhancement routing through platform adaptors instead of importing nonexistent enhance_skill_md helper (#360)
  • quality_metrics.py uses rglob for nested reference directories in unified skills (#360)
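
The rglob change matters because Path.glob("*.md") only matches files directly inside the directory, while Path.rglob("*.md") also descends into subdirectories. A small sketch of the difference; the directory layout and the list_markdown() helper are made up for illustration:

```python
import tempfile
from pathlib import Path


def list_markdown(root: Path, recursive: bool) -> list[str]:
    """Return sorted .md paths under root, relative to root."""
    matches = root.rglob("*.md") if recursive else root.glob("*.md")
    return sorted(p.relative_to(root).as_posix() for p in matches)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Illustrative layout: one top-level file and one nested reference file.
    (root / "references" / "github").mkdir(parents=True)
    (root / "SKILL.md").write_text("# skill\n")
    (root / "references" / "github" / "index.md").write_text("# index\n")

    top_level = list_markdown(root, recursive=False)  # misses the nested file
    nested = list_markdown(root, recursive=True)  # finds both files
```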

Fixed

  • TypeError: '>' not supported between instances of 'NoneType' and 'int'. rate_limit defaulted to None in ExecutionContext, which flowed through config.get("rate_limit", DEFAULT) (dict.get returns None when the key exists with value None, ignoring the fallback). Fixed in doc_scraper.py (sync + async paths), estimate_pages.py, and sync_config.py (#356, #359)
  • discover_urls() loop never executed with unlimited max_pages: len(discovered) < -1 is always False. Added unlimited mode guard (#356)
  • converter.scrape() called nonexistent method in _run_scraping() — changed to converter.scrape_all() (#360)
  • None-safety for BeautifulSoup attributes: link["href"], sitemap.text, meta_desc["content"] guarded against None XML text nodes (#360)
  • Python 3.10 compatibility — backslash in f-string in quality_metrics.py not supported before 3.12 (#360)
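
The first two fixes in this list come from two easily missed Python behaviours: dict.get only uses its fallback when the key is absent, and a length can never be less than -1. The sketch below reproduces both; the helper names and the default value are assumptions for illustration, not the project's code:

```python
DEFAULT_RATE_LIMIT = 0.5  # illustrative value, not the project's actual default


def rate_limit_from(config: dict) -> float:
    """Fall back to the default when the key is missing OR explicitly None."""
    value = config.get("rate_limit")
    return DEFAULT_RATE_LIMIT if value is None else value


def may_discover_more(discovered_count: int, max_pages: int) -> bool:
    """Treat -1 as unlimited instead of comparing against it directly."""
    if max_pages == -1:
        return True
    return discovered_count < max_pages


poisoned = {"rate_limit": None}
# The key exists, so the fallback is ignored and None comes back.
naive = poisoned.get("rate_limit", DEFAULT_RATE_LIMIT)
# The explicit None check restores the default.
safe = rate_limit_from(poisoned)
```

An explicit `is None` check is used instead of `value or DEFAULT_RATE_LIMIT` so that a legitimate rate limit of 0 is preserved.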
v3.5.0 Breaking risk
⚠ Upgrade required
  • All content extraction features (pattern detection, test examples, how‑to guides, config extraction, router generation) are now enabled by default; no opt‑in required
  • Dynamic routing via `_build_argv()` replaces manual argument forwarding and adds 7 previously missing CLI flags
Breaking changes
  • Renamed `claude-enhanced` merge mode to `ai-enhanced` (backward‑compatible alias retained)
  • Removed hardcoded Claude references across the codebase
  • Removed GitHub API analysis limit of 50 files and config extraction limit of 100 files
Security fixes
  • Removed command injection vulnerability from cloned repo script execution
  • Replaced `git add -A` with targeted staging in marketplace publisher
  • Cleared auth tokens from cached `.git/config` after clone
Notable features
  • Grand Unification: single `create` command for 18 source types with auto‑detection and direct converters
  • Agent‑agnostic `AgentClient` abstraction supporting Claude, Kimi, Codex, Copilot, OpenCode, and custom agents via API‑key detection
  • Headless browser rendering (`--browser` flag) using Playwright to handle JavaScript SPAs
Full changelog

[3.5.0] - 2026-04-09

Theme: Grand Unification — one command, one interface, direct converters. Agent-agnostic architecture, marketplace pipeline, smart SPA discovery, all content extraction enabled by default. 80+ files changed across the codebase.

Added

  • Grand Unification — unified create command as single entry point for all 18 source types with auto-detection, direct converter invocation, and centralized enhancement (#346)
  • Agent-agnostic AgentClient abstraction — all 5 enhancers now support Claude, Kimi, Codex, Copilot, OpenCode, and custom agents via a unified interface. Auto-detects agent from API keys instead of hardcoding (#336)
  • Kimi CLI integration with stdin piping and output parsing (#336)
  • MarketplacePublisher — publish skills to Claude Code plugin marketplace repos (#336)
  • MarketplaceManager — register and manage marketplace repositories (#336)
  • ConfigPublisher — push configs to registered config source repos (#336)
  • push_config MCP tool for automated config publishing (#336)
  • Smart SPA discovery engine — three-layer discovery: sitemap.xml, llms.txt, SPA nav rendering (#336)
  • "browser": true config support for JavaScript SPA sites with browser renderer timeout defaults (60s, domcontentloaded) (#336)
  • Dynamic routing via _build_argv() — replaced manual arg forwarding with dynamic forwarder, added 7 missing CLI flags (#336)
  • Kotlin language support for codebase analysis — Full C3.x pipeline support: AST parsing (classes, objects, functions, data/sealed classes, extension functions, coroutines), dependency extraction, design pattern recognition (object declaration→Singleton, companion object→Factory, sealed class→Strategy), test example extraction (JUnit, Kotest, MockK, Spek), language detection patterns, config detection (build.gradle.kts), and extension maps across all analyzers (#287)
  • Headless browser rendering (--browser flag) — uses Playwright to render JavaScript SPA sites (React, Vue, etc.) that return empty HTML shells. Auto-installs Chromium on first use. Optional dep: pip install "skill-seekers[browser]" (#321)
  • skill-seekers doctor command — 8 diagnostic checks (Python version, package install, git, core/optional deps, API keys, MCP server, output dir) with pass/warn/fail status and --verbose flag (#316)
  • Prompt injection check workflow — bundled prompt-injection-check workflow scans scraped content for injection patterns (role assumption, instruction overrides, delimiter injection, hidden instructions). Added as first stage in default and security-focus workflows. Flags suspicious content without removing it (#324)
  • Codex CLI plugin manifest (.codex-plugin/plugin.json) for OpenAI Codex integration (#350)
  • 6 behavioral UML diagrams — 3 sequence (create pipeline, GitHub+C3.x flow, MCP invocation), 2 activity (source detection, enhancement pipeline), 1 component (runtime dependencies with interface contracts)
  • 134 new tests: test_agent_client.py, test_config_publisher.py, _build_argv tests. Total: 3194 passed, 39 expected skips (#336)

Changed

  • All content extraction features enabled by default — pattern detection, test examples, how-to guides, config extraction, and router generation no longer require explicit opt-in
  • Renamed claude-enhanced merge mode to ai-enhanced — backward compatibility alias kept (#336)
  • Removed 118+ hardcoded Claude references across 60+ files (#336)
  • Refactored 5 enhancers to use AgentClient abstraction (#336)
  • Removed 50-file GitHub API analysis limit (#336)
  • Removed 100-file config extraction limit (#336)
  • Fixed unified scraper default max_pages from 100 to 500 (#336)
  • Centralized enhancement timeouts to 45min default with unlimited support (#336)
  • Excluded slow MCP/e2e tests from CI coverage step to prevent timeout

Fixed

  • glob('*.md') replaced with rglob('*.md') in all adaptors — fixes packaging when skills are in nested directories (#349)
  • scraped_data list-vs-dict bug in conflict detection (#336)
  • base_url passthrough to doc scraper subprocess (#336)
  • URL filtering now uses base directory correctly (#336)
  • C3.x analysis data loss (#336)
  • --enhance-level flag not passed correctly (#336)
  • guide_enhancer method rename: _call_claude_api renamed to _call_ai (#336)
  • 11 pre-existing test failures fixed (#336)
  • Per-file language detection in GitHub scraper (#336)
  • GitHub language detection crashes with TypeError when API response contains non-integer metadata keys (e.g., "url") — now filters to integer values only (#322)
  • C3.x codebase analysis crashes with TypeError: _run_c3_analysis() and _analyze_c3x() passed removed enhance_with_ai/ai_mode kwargs to analyze_codebase() instead of enhance_level (#323)
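
The language-detection fix above filters the response down to integer values before doing arithmetic on it. A minimal sketch of that filter, assuming the response is a mapping of language name to byte count with occasional string metadata mixed in; the language_shares() helper and the sample payload are hypothetical:

```python
def language_shares(raw: dict) -> dict[str, float]:
    """Convert a language -> byte-count mapping into percentages."""
    # Keep only integer byte counts; bool is excluded because it subclasses int.
    counts = {
        name: value
        for name, value in raw.items()
        if isinstance(value, int) and not isinstance(value, bool)
    }
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Made-up payload: two real counts plus a non-integer metadata entry.
shares = language_shares(
    {"Python": 750, "Shell": 250, "url": "https://api.example.invalid/languages"}
)
```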

Security

  • Removed command injection via cloned repo script execution (#336)
  • Replaced git add -A with targeted staging in marketplace publisher (#336)
  • Clear auth tokens from cached .git/config after clone (#336)
  • Use defusedxml for sitemap XML parsing (XXE protection) (#336)
  • Path traversal validation for config names (#336)
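
Path traversal validation for config names usually means rejecting anything that could escape the config directory. The sketch below shows one conservative allow-list approach; the rule and the is_safe_config_name() helper are assumptions, since the changelog does not describe the actual check:

```python
import re

# Hypothetical rule for illustration: letters, digits, dot, underscore, hyphen,
# starting with a letter or digit. Path separators are never allowed.
_SAFE_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def is_safe_config_name(name: str) -> bool:
    """Reject names that could escape the config directory."""
    if ".." in name:
        return False
    # fullmatch ensures the whole name obeys the rule, including any trailing newline.
    return _SAFE_NAME.fullmatch(name) is not None
```

An allow-list is used rather than stripping "../" sequences, because stripping can be bypassed by nested sequences such as "....//".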
v3.4.0 New feature
Notable features
  • 8 new LLM platform adaptors (OpenCode, Kimi, DeepSeek, Qwen, OpenRouter, Together AI, Fireworks AI) bringing total to 12
  • 7 new CLI agent install paths (roo, cline, aider, bolt, kilo, continue, kimi-code) raising count to 18
  • OpenCode skill tools: auto‑splitter and bi‑directional converter
Full changelog

What's New in v3.4.0

Theme: 8 new LLM platform adaptors (12 total), 7 new CLI agent paths (18 total), OpenCode skill tools, SPA site detection, 8 bug fixes, and full UML architecture documentation.

Platform Expansion: 5 → 12 LLM Targets

| New Platform | Flag | Base |
|---|---|---|
| OpenCode | --target opencode | Directory-based, dual YAML |
| Kimi | --target kimi | OpenAI-compatible |
| DeepSeek | --target deepseek | OpenAI-compatible |
| Qwen | --target qwen | OpenAI-compatible |
| OpenRouter | --target openrouter | OpenAI-compatible |
| Together AI | --target together | OpenAI-compatible |
| Fireworks AI | --target fireworks | OpenAI-compatible |

All new platforms inherit from a shared OpenAI-compatible base class for consistent behavior.

Agent Expansion: 11 → 18 Install Paths

New agents: roo, cline, aider, bolt, kilo, continue, kimi-code

OpenCode Skill Tools

  • Skill splitter — auto-split large docs into focused sub-skills with router
  • Bi-directional converter — import/export between OpenCode and any platform format

Distribution

  • Smithery manifest (smithery.yaml)
  • GitHub Actions template for automated skill updates
  • Claude Code Plugin with slash commands

Bug Fixes

  • sanitize_url() crash on Python 3.14 strict urlparse (#284)
  • Blind /index.html.md append breaking non-Docusaurus sites (#277)
  • Unified scraper temp config format (#317)
  • Unicode arrows breaking Windows cp1252 terminals
  • CLI flags in plugin slash commands
  • MiniMax adaptor improvements (#319)
  • Misleading "Scraped N pages" count — now shows (N saved, M skipped) (#320)
  • SPA site detection — warns when site requires JavaScript rendering (#320, #321)

Documentation

  • Full UML architecture — 14 class diagrams synced from source code via StarUML
  • StarUML HTML API reference export
  • Ecosystem section linking all Skill Seekers repos
  • Architecture references in README and CONTRIBUTING
  • Consolidated Docs/ into docs/

Test Results

2929 passed, 39 skipped, 0 failures

Install / Upgrade

pip install --upgrade skill-seekers

Full changelog: https://github.com/yusufkaraaslan/Skill_Seekers/blob/main/CHANGELOG.md

v3.3.0 New feature
⚠ Upgrade required
  • Optional dependency groups (`[jupyter]`, `[asciidoc]`, `[pptx]`, `[confluence]`, `[notion]`, `[rss]`, `[chat]`) added; install with `pip install "skill-seekers[jupyter]"` etc. for new source types
  • Config validator now checks all 17 source types (including previously missing `word` and `video`). Existing configs should be validated after upgrade.
  • CLI has ten new subcommands (`jupyter`, `html`, `openapi`, `asciidoc`, `pptx`, `rss`, `manpage`, `confluence`, `notion`, `chat`) – update scripts or aliases accordingly.
Notable features
  • 10 new source types (Jupyter, HTML, OpenAPI/Swagger, AsciiDoc, PowerPoint, RSS/Atom, Man pages, Confluence, Notion, Slack/Discord) integrated into CLI and multi‑source configs
  • Unified EPUB pipeline added to `skill-seekers` with DRM detection and TOC bug workaround
  • `sync-config` subcommand that crawls navigation links, diffs against config start URLs, and optionally updates the configuration
Full changelog

[3.3.0] - 2026-03-16

Theme: 10 new source types (17 total), EPUB unified integration, sync-config command, performance optimizations, 12 README translations, and 19 bug fixes. 117 files changed, +41,588 lines since v3.2.0.

Supported Source Types (17)

| # | Type | CLI Command | Config Type | Auto-Detection |
|---|------|-------------|-------------|----------------|
| 1 | Documentation (web) | scrape / create <url> | documentation | HTTP/HTTPS URLs |
| 2 | GitHub repository | github / create owner/repo | github | owner/repo or github.com URLs |
| 3 | PDF document | pdf / create file.pdf | pdf | .pdf extension |
| 4 | Word document | word / create file.docx | word | .docx extension |
| 5 | EPUB e-book | epub / create file.epub | epub | .epub extension |
| 6 | Video | video / create <url/file> | video | YouTube/Vimeo URLs, video extensions |
| 7 | Local codebase | analyze / create ./path | local | Directory paths |
| 8 | Jupyter Notebook | jupyter / create file.ipynb | jupyter | .ipynb extension |
| 9 | Local HTML | html / create file.html | html | .html/.htm extensions |
| 10 | OpenAPI/Swagger | openapi / create spec.yaml | openapi | .yaml/.yml with OpenAPI content |
| 11 | AsciiDoc | asciidoc / create file.adoc | asciidoc | .adoc/.asciidoc extensions |
| 12 | PowerPoint | pptx / create file.pptx | pptx | .pptx extension |
| 13 | RSS/Atom feed | rss / create feed.rss | rss | .rss/.atom extensions |
| 14 | Man pages | manpage / create cmd.1 | manpage | .1-.8/.man extensions |
| 15 | Confluence wiki | confluence | confluence | API or export directory |
| 16 | Notion pages | notion | notion | API or export directory |
| 17 | Slack/Discord chat | chat | chat | Export directory or API |

Added

10 New Skill Source Types (17 total)

Skill Seekers now supports 17 source types — up from 7. Every new type is fully integrated into the CLI (skill-seekers <type>), create command auto-detection, unified multi-source configs, config validation, the MCP server, and the skill builder.

  • Jupyter Notebook: skill-seekers jupyter --notebook file.ipynb or skill-seekers create file.ipynb

    • Extracts markdown cells, code cells with outputs, kernel metadata, imports, and language detection
    • Handles single files and directories of notebooks; filters .ipynb_checkpoints
    • Optional dependency: pip install "skill-seekers[jupyter]" (nbformat)
    • Entry point: skill-seekers-jupyter
  • Local HTML: skill-seekers html --html-path file.html or skill-seekers create file.html

    • Parses HTML using BeautifulSoup with smart main content detection (<article>, <main>, .content, largest div)
    • Extracts headings, code blocks, tables (to markdown), images, links; converts inline HTML to markdown
    • Handles single files and directories; supports .html, .htm, .xhtml extensions
    • No extra dependencies (BeautifulSoup is a core dep)
  • OpenAPI/Swagger: skill-seekers openapi --spec spec.yaml or skill-seekers create spec.yaml

    • Parses OpenAPI 3.0/3.1 and Swagger 2.0 specs from YAML or JSON (local files or URLs via --spec-url)
    • Extracts endpoints, parameters, request/response schemas, security schemes, tags
    • Resolves $ref references with circular reference protection; handles allOf/oneOf/anyOf
    • Groups endpoints by tags; generates comprehensive API reference markdown
    • Source detection sniffs YAML file content for openapi: or swagger: keys (avoids false positives on non-API YAML files)
    • Optional dependency: pip install "skill-seekers[openapi]" (pyyaml — already a core dep, guard added for safety)
  • AsciiDoc: skill-seekers asciidoc --asciidoc-path file.adoc or skill-seekers create file.adoc

    • Regex-based parser (no external library required) with optional asciidoc library support
    • Extracts headings (= through =====), [source,lang] code blocks, |=== tables, admonitions (NOTE/TIP/WARNING/IMPORTANT/CAUTION), and include:: directives
    • Converts AsciiDoc formatting to markdown; handles single files and directories
    • Optional dependency: pip install "skill-seekers[asciidoc]" (asciidoc library for advanced rendering)
  • PowerPoint (.pptx): skill-seekers pptx --pptx file.pptx or skill-seekers create file.pptx

    • Extracts slide text, speaker notes, tables, images (with alt text), and grouped shapes
    • Detects code blocks by monospace font analysis (30+ font families)
    • Groups slides into sections by layout type; handles single files and directories
    • Optional dependency: pip install "skill-seekers[pptx]" (python-pptx)
  • RSS/Atom Feeds: skill-seekers rss --feed-url <url> / --feed-path file.rss or skill-seekers create feed.rss

    • Parses RSS 2.0, RSS 1.0, and Atom feeds via feedparser
    • Optionally follows article links (--follow-links, default on) to scrape full page content using BeautifulSoup
    • Extracts article titles, summaries, authors, dates, categories; configurable --max-articles (default 50)
    • Source detection matches .rss and .atom extensions (.xml excluded to avoid false positives)
    • Optional dependency: pip install "skill-seekers[rss]" (feedparser)
  • Man Pages: skill-seekers manpage --man-names git,curl / --man-path dir/ or skill-seekers create git.1

    • Extracts man pages by running man command via subprocess or reading .1-.8/.man files directly
    • Handles gzip/bzip2/xz compressed man files; strips troff/groff formatting (backspace overstriking, macros, font escapes)
    • Parses structured sections (NAME, SYNOPSIS, DESCRIPTION, OPTIONS, EXAMPLES, SEE ALSO)
    • Source detection uses basename heuristic to avoid false positives on log rotation files (e.g., access.log.1)
    • No external dependencies (stdlib only)
  • Confluence: skill-seekers confluence --base-url <url> --space-key <key> or --export-path dir/

    • API mode: fetches pages from Confluence REST API with pagination (atlassian-python-api)
    • Export mode: parses Confluence HTML/XML export directories
    • Extracts page content, code/panel/info/warning macros, page hierarchy, tables
    • Optional dependency: pip install "skill-seekers[confluence]" (atlassian-python-api)
  • Notion: skill-seekers notion --database-id <id> / --page-id <id> or --export-path dir/

    • API mode: fetches pages via Notion API with support for 20+ block types (paragraph, heading, code, callout, toggle, table, etc.)
    • Export mode: parses Notion Markdown/CSV export directories
    • Extracts rich text with annotations (bold, italic, code, links), 16+ property types for database entries
    • Optional dependency: pip install "skill-seekers[notion]" (notion-client)
  • Slack/Discord Chat: skill-seekers chat --export-path dir/ or --token <token> --channel <channel>

    • Slack: parses workspace JSON exports or fetches via Slack Web API (slack_sdk)
    • Discord: parses DiscordChatExporter JSON or fetches via Discord HTTP API
    • Extracts messages, code snippets (fenced blocks), shared URLs, threads, reactions, attachments
    • Generates per-channel summaries and topic categorization
    • Optional dependency: pip install "skill-seekers[chat]" (slack-sdk)
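
The OpenAPI bullet above mentions resolving $ref references with circular reference protection. One common way to do that is to track which pointers are currently being expanded and stop when one repeats. The sketch below illustrates the idea for local "#/..." pointers only; resolve_refs() and the "$circular" marker are hypothetical and not taken from the project:

```python
def resolve_refs(node, root, _active=frozenset()):
    """Inline local '#/...' $ref pointers, stopping at circular references."""
    if isinstance(node, list):
        return [resolve_refs(item, root, _active) for item in node]
    if not isinstance(node, dict):
        return node
    ref = node.get("$ref")
    if isinstance(ref, str) and ref.startswith("#/"):
        if ref in _active:
            # Already expanding this pointer higher up: leave a marker, do not recurse.
            return {"$circular": ref}
        target = root
        # Simplified pointer walk; JSON Pointer escapes (~0, ~1) are not handled.
        for part in ref[2:].split("/"):
            target = target[part]
        return resolve_refs(target, root, _active | {ref})
    return {key: resolve_refs(value, root, _active) for key, value in node.items()}


# A self-referencing schema, as in a linked-list style model.
spec = {
    "components": {
        "schemas": {
            "Node": {
                "type": "object",
                "properties": {"next": {"$ref": "#/components/schemas/Node"}},
            }
        }
    }
}
resolved = resolve_refs({"$ref": "#/components/schemas/Node"}, spec)
```

The set of active pointers is passed down per branch rather than shared globally, so the same schema can still be expanded in two sibling positions without being flagged as circular.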

EPUB Unified Pipeline Integration

  • EPUB (.epub) input support via skill-seekers create book.epub or skill-seekers epub --epub book.epub
    • Extracts chapters, metadata (Dublin Core), code blocks, images, and tables from EPUB 2 and EPUB 3 files
    • DRM detection with clear error messages (Adobe ADEPT, Apple FairPlay, Readium LCP)
    • Font obfuscation correctly identified as non-DRM
    • EPUB 3 TOC bug workaround (ignore_ncx option)
    • --help-epub flag for EPUB-specific help
    • Optional dependency: pip install "skill-seekers[epub]" (ebooklib)
    • 107 tests across 14 test classes
  • EPUB added to unified scraper: _scrape_epub() method, scraped_data["epub"], config validation (_validate_epub_source), and dry-run display. Previously EPUB worked standalone but was missing from multi-source configs.

Unified Skill Builder — Generic Merge System

  • _generic_merge() — Priority-based section merge for any combination of source types not covered by existing pairwise synthesis (docs+github, docs+pdf, etc.). Produces YAML frontmatter + source-attributed sections.
  • _append_extra_sources() — Appends additional source type content (e.g., Jupyter + PPTX) to pairwise-synthesized SKILL.md.
  • _generate_generic_references() — Generates references/<type>/index.md for any source type, with ID resolution fallback chain.
  • _SOURCE_LABELS dict — Human-readable labels for all 17 source types used in merge attribution.

Config Validator Expansion

  • 17 source types in VALID_SOURCE_TYPES — All new types plus word and video now have per-type validation methods.
  • _validate_word_source() — Validates path field for Word documents (was previously missing).
  • _validate_video_source() — Validates url, path, or playlist field for video sources (was previously missing).
  • 11 new _validate_*_source() methods — One for each new type with appropriate required-field checks.

Source Detection Improvements

  • 7 new file extension detections in SourceDetector.detect(): .ipynb, .html/.htm, .pptx, .adoc/.asciidoc, .rss/.atom, .1-.8/.man, .yaml/.yml (with content sniffing)
  • _looks_like_openapi() — Content sniffing for YAML files: only classifies as OpenAPI if the file contains openapi: or swagger: key in first 20 lines (prevents false positives on docker-compose, Ansible, Kubernetes manifests, etc.)
  • Man page basename heuristic: .1-.8 extensions are only detected as man pages if the basename has no dots (e.g., git.1 matches but access.log.1 does not)
  • .xml excluded from RSS detection — Too generic; only .rss and .atom trigger RSS detection
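
The two heuristics above can be sketched in a few lines. This is an illustration under stated assumptions, not the project's implementation: the function names are hypothetical, the OpenAPI check only looks for an unquoted top-level key, and the man page check assumes the numeric extensions run from .1 to .8:

```python
import os
import re

_OPENAPI_KEY = re.compile(r"^(openapi|swagger)\s*:")
_MAN_NAME = re.compile(r"[^.]+\.[1-8]")


def looks_like_openapi(text: str, max_lines: int = 20) -> bool:
    """True if a top-level openapi:/swagger: key appears in the first lines."""
    return any(_OPENAPI_KEY.match(line) for line in text.splitlines()[:max_lines])


def looks_like_man_page(path: str) -> bool:
    """True for names like git.1; False when the basename itself contains dots."""
    return _MAN_NAME.fullmatch(os.path.basename(path)) is not None
```

A docker-compose file that starts with "services:" is rejected by the first check, and "access.log.1" is rejected by the second because "access.log" contains a dot.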

MCP Server Integration

  • scrape_generic tool — New MCP tool handles all 10 new source types via subprocess with per-type flag mapping
  • _PATH_FLAGS / _URL_FLAGS dicts — Correct flag routing for each source type (e.g., jupyter→--notebook, html→--html-path, rss→--feed-url)
  • GENERIC_SOURCE_TYPES tuple — Lists all 10 new types for validation
  • Config validation display: validate_config tool now shows source details for all new types
  • Tool count updated — 33 → 34 tools (scraping tools 10 → 11)

CLI Wiring

  • 10 new CLI subcommands: jupyter, html, openapi, asciidoc, pptx, rss, manpage, confluence, notion, chat in COMMAND_MODULES
  • 10 new argument modules: arguments/{jupyter,html,openapi,asciidoc,pptx,rss,manpage,confluence,notion,chat}.py with per-type *_ARGUMENTS dicts
  • 10 new parser modules: parsers/{jupyter,html,openapi,asciidoc,pptx,rss,manpage,confluence,notion,chat}_parser.py with SubcommandParser implementations
  • create command routing: _route_generic() method for all new types with correct module names and CLI flags
  • 10 new entry points in pyproject.toml — skill-seekers-{jupyter,html,openapi,asciidoc,pptx,rss,manpage,confluence,notion,chat}
  • 7 new optional dependency groups in pyproject.toml — [jupyter], [asciidoc], [pptx], [confluence], [notion], [rss], [chat]
  • [all] group updated — Includes all 7 new optional dependencies

Sync Config Command

  • skill-seekers sync-config — New subcommand that crawls a docs site's navigation, diffs discovered URLs against a config's start_urls, and optionally writes the updated list back with --apply (#306)
    • BFS link discovery with configurable depth (default 2), max-pages, rate-limit
    • Respects url_patterns.include/exclude from config
    • Supports optional nav_seed_urls config field
    • Handles both unified (sources array) and legacy flat config formats
    • MCP sync_config tool included
    • 57 tests (39 unit + 18 E2E with local HTTP server)
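
The sync-config flow described above is a depth-limited breadth-first crawl followed by a diff against the configured start URLs. The sketch below models the site as an in-memory link graph so it runs without network access; discover(), diff_urls(), and the sample paths are hypothetical stand-ins, not the command's real internals:

```python
from collections import deque


def discover(links: dict[str, list[str]], seeds: list[str], max_depth: int = 2) -> list[str]:
    """Breadth-first walk over a link graph, up to max_depth hops from the seeds."""
    seen = set(seeds)
    order = list(seeds)
    queue = deque((url, 0) for url in seeds)
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand pages that are already at the depth limit
        for nxt in links.get(url, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append((nxt, depth + 1))
    return order


def diff_urls(discovered: list[str], start_urls: list[str]) -> tuple[list[str], list[str]]:
    """Return (added, removed) relative to the configured start_urls."""
    known, found = set(start_urls), set(discovered)
    added = [u for u in discovered if u not in known]
    removed = [u for u in start_urls if u not in found]
    return added, removed


# Made-up site structure: page -> pages it links to.
site = {
    "/docs": ["/docs/intro", "/docs/api"],
    "/docs/api": ["/docs/api/v2"],
    "/docs/api/v2": ["/docs/api/v2/deep"],
}
found = discover(site, ["/docs"], max_depth=2)
added, removed = diff_urls(found, ["/docs", "/docs/old"])
```

With a depth of 2, "/docs/api/v2" is discovered but its own links are not followed, so "/docs/api/v2/deep" stays out of the result.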

Workflow & Documentation

  • complex-merge.yaml — New 7-stage AI-powered workflow for complex multi-source merging (source inventory → cross-reference → conflict detection → priority merge → gap analysis → synthesis → quality check)
  • AGENTS.md rewritten — Updated with all 17 source types, scraper pattern docs, project layout, and key pattern documentation
  • 77 new integration tests in test_new_source_types.py — Source detection, config validation, generic merge, CLI wiring, validation, and create command routing
  • docs/BEST_PRACTICES.md — Comprehensive guide for creating high-quality skills: SKILL.md structure, code examples, prerequisites, troubleshooting, quality targets, and real-world Grade F to Grade A example (#206)
  • Documentation updated for 17 source types — 32 files updated across README, CLI reference, feature matrix, MCP reference, config format, API reference, unified scraping, multi-source guide, installation, quick-start, core concepts, user guide, FAQ, troubleshooting, architecture, and all Chinese (zh-CN) translations
  • README translations for 10 languages (12 total) — Added Japanese (日本語), Korean (한국어), Spanish (Español), French (Français), German (Deutsch), Portuguese (Português), Turkish (Türkçe), Arabic (العربية), Hindi (हिन्दी), and Russian (Русский) README translations with language selector bar across all versions

Performance

  • Pre-compiled regex and O(1) URL dedup in doc_scraper — Module-level compiled patterns, _enqueued_urls set for O(1) dedup, cached URL patterns, async error logging fix (#309)
  • Bisect-based line indexing in code_analyzer and dependency_analyzer — O(log n) offset_to_line() via bisect replaces O(n) count("\n") across all 10 language analyzers and all import extractors
  • O(n) parent class map for Python method detection — Replaces O(n²) repeated AST walks in code_analyzer
  • O(1) tree traversal in github_scraper: deque.popleft() replaces list pop(0)
  • Shared build_line_index() / offset_to_line() utilities in cli/utils.py — DRY extraction from code_analyzer and dependency_analyzer
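
The bisect-based line indexing mentioned above trades a one-time O(n) scan for O(log n) lookups. The sketch below reuses the function names from the changelog, but the bodies are an assumption about how such utilities could look, not a copy of cli/utils.py:

```python
from bisect import bisect_right


def build_line_index(text: str) -> list[int]:
    """Return the character offset at which each line starts."""
    starts = [0]
    for pos, ch in enumerate(text):
        if ch == "\n":
            starts.append(pos + 1)
    return starts


def offset_to_line(index: list[int], offset: int) -> int:
    """Map a character offset to a 1-based line number in O(log n)."""
    # The number of line starts at or before the offset is the line number.
    return bisect_right(index, offset)


source = "import os\n\ndef main():\n    pass\n"
index = build_line_index(source)
line_of_pass = offset_to_line(index, source.index("pass"))
```

The result agrees with the slower `text.count("\n", 0, offset) + 1` for every offset, which makes that expression a convenient cross-check in tests.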

Fixed

  • Config validator missing word and video dispatch: _validate_source() had no elif branches for word or video types, silently skipping validation. Added dispatch entries and _validate_word_source() / _validate_video_source() methods.
  • openapi_scraper.py unconditional import yaml — Would crash at import time if pyyaml not installed. Added try/except ImportError guard with YAML_AVAILABLE flag and _check_yaml_deps() helper.
  • asciidoc_scraper.py missing standard arguments: main() manually defined args instead of using add_asciidoc_arguments(). Refactored to use shared argument definitions + added enhancement workflow integration.
  • pptx_scraper.py missing standard arguments — Same issue. Refactored to use add_pptx_arguments().
  • chat_scraper.py missing standard arguments — Same issue. Refactored to use add_chat_arguments().
  • notion_scraper.py missing run_workflows call: --enhance-workflow flags were silently ignored. Added workflow runner integration.
  • openapi_scraper.py return type None: main() returned None instead of int. Fixed to return 0 on success, matching all other scrapers.
  • MCP scrape_generic_tool flag mismatch — Was passing --path/--url as generic flags, but every scraper expects its own flag name (e.g., --notebook, --html-path, --spec). All 10 source types would have failed at runtime. Fixed with per-type _PATH_FLAGS and _URL_FLAGS mappings.
  • Word scraper docx_id key mismatch — Unified scraper data dict used docx_id but generic reference generation looked for word_id. Added word_id alias.
  • main.py docstring stale — Missing all 10 new commands. Updated to list all 27 commands.
  • source_detector.py module docstring stale — Described only 5 source types. Updated to describe 14+ detected types.
  • manpage_parser.py docstring referenced wrong file — Said manpage_scraper.py but actual file is man_scraper.py. Fixed.
  • Parser registry test count — Updated expected count from 25 to 35 for 10 new parsers.
  • 'Invalid IPv6 URL' error on bracket-containing URLs (#284) — URLs with square brackets (e.g., /api/[v1]/users) discovered via BFS crawl or HTML extraction bypassed the original fix in _clean_url(). Added shared sanitize_url() utility applied at every URL ingestion point. 16 new tests.
  • GitHub scraper 'list index out of range' on issue extraction (#269) — PyGithub's PaginatedList slicing could fail on some versions or empty repos. Replaced with itertools.islice().
  • Release workflow version mismatch — GitHub release showed wrong version (v3.1.3 instead of v3.2.0) because no explicit release name was set and sed regex had unescaped dots. Added explicit name/tag_name, version consistency check (tag vs pyproject.toml vs package), and empty release notes fallback.
  • Release workflow Python 3.10 compatibility — Version consistency check used tomllib (Python 3.11+). Replaced with grep/sed for 3.10 compatibility.
  • infer_categories() "tutorial" vs "tutorials" key mismatch — Guard checked 'tutorial' but wrote to 'tutorials' key, risking silent overwrites in category inference.
  • Flaky test_benchmark_metadata_overhead — Stabilized with 20 iterations, a warm-up run, the median instead of the mean, and a 200% threshold (it was failing on CI with 5 iterations and the mean).
  • CI branch protection check permanently pending — Summary job was named 'All Checks Complete' but branch protection required 'Tests'. PRs were stuck as 'Expected — Waiting for status to be reported'. Renamed job to match.
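
The bracket-URL fix in this list (#284) can be illustrated with a minimal sketch. This is not the project's sanitize_url(); it is a hypothetical stand-in that assumes the goal is to percent-encode square brackets everywhere except the host, so that a genuine IPv6 literal such as [::1] is left alone.

```python
from urllib.parse import urlsplit, urlunsplit


def _encode_brackets(component: str) -> str:
    return component.replace("[", "%5B").replace("]", "%5D")


def sanitize_url(url: str) -> str:
    """Percent-encode square brackets outside the host part of a URL.

    Hypothetical stand-in for the shared utility named in the changelog;
    the real implementation may differ.
    """
    if "[" not in url and "]" not in url:
        return url  # nothing to do
    try:
        parts = urlsplit(url)
    except ValueError:
        # Unparseable input: fall back to encoding every bracket.
        return _encode_brackets(url)
    # netloc is passed through untouched so IPv6 hosts keep their brackets.
    return urlunsplit(
        (
            parts.scheme,
            parts.netloc,
            _encode_brackets(parts.path),
            _encode_brackets(parts.query),
            _encode_brackets(parts.fragment),
        )
    )
```

With this sketch, sanitize_url("https://example.com/api/[v1]/users") returns "https://example.com/api/%5Bv1%5D/users", while the host in "http://[::1]:8080/a[b]" is preserved.
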
v3.2.0 Breaking risk
⚠ Upgrade required
  • Install optional extras with `pip install skill-seekers[video]`, `skill-seekers[docx]`, `skill-seekers[pinecone]` or `skill-seekers[all]`.
  • Run `skill-seekers video --setup` after upgrade to auto‑detect and configure GPU dependencies.
Notable features
  • Video Extraction Pipeline: CLI command `skill-seekers video --url` to scrape YouTube/local videos with transcript, OCR, panel detection, code timeline and GPU auto‑setup.
  • Word Document (.docx) Support: Command `skill-seekers word --docx` converts .docx via mammoth → HTML → SKILL.md with smart code block detection.
  • Pinecone Vector Database Adaptor: `skill-seekers package … --format pinecone --upload` provides full CRUD, namespace support and OpenAI/Sentence‑Transformer embedding integration.
Full changelog

v3.2.0 — Video Extraction, Word Support, Pinecone Adaptor

Theme: Video source support, Word document support, Pinecone adaptor, and quality improvements. 94 files changed, +23,500 lines since v3.1.3. 2,540 tests passing.

🎬 Video Extraction Pipeline

Complete video extraction system that converts YouTube videos and local video files into AI-consumable skills.

  • skill-seekers video --url <youtube-url> — New CLI command for video scraping
  • skill-seekers create <youtube-url> — Auto-detects YouTube URLs
  • Transcript extraction — 3-tier fallback: YouTube API → yt-dlp → faster-whisper
  • Visual OCR — Multi-engine ensemble (EasyOCR + pytesseract) for code frames
  • Panel detection — Splits IDE screenshots into independent sub-sections
  • Code timeline — Tracks code evolution across frames with edit history
  • Two-pass AI enhancement — Cleans OCR noise using transcript context
  • GPU auto-detection — skill-seekers video --setup detects CUDA/ROCm/CPU and installs correct PyTorch
  • 197 tests covering models, metadata, transcript, visual, OCR, and CLI
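
The 3-tier transcript fallback listed above can be sketched as a generic "first tier that succeeds wins" loop. The tier functions are supplied by the caller and are hypothetical here; nothing below calls the YouTube API, yt-dlp, or faster-whisper directly, and the project's actual control flow may differ.

```python
from typing import Callable, Optional, Sequence, Tuple

# Each tier is (label, function). The function takes a video URL and returns
# transcript text, or raises / returns None when it cannot produce one.
Tier = Tuple[str, Callable[[str], Optional[str]]]


def fetch_transcript(url: str, tiers: Sequence[Tier]) -> Tuple[str, str]:
    """Try each tier in order; return (label, transcript) from the first success."""
    errors = []
    for label, fetch in tiers:
        try:
            text = fetch(url)
        except Exception as exc:  # a failing tier must not abort the chain
            errors.append(f"{label}: {exc}")
            continue
        if text:
            return label, text
        errors.append(f"{label}: empty transcript")
    raise RuntimeError("all transcript tiers failed: " + "; ".join(errors))
```

Passing the tiers in as data keeps the ordering explicit and makes each tier easy to stub in tests.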

📄 Word Document (.docx) Support

  • skill-seekers word --docx <file> — Full pipeline: mammoth → HTML → sections → SKILL.md
  • skill-seekers create document.docx — Auto-detects .docx files
  • Smart code detection — Identifies monospace paragraphs as code blocks
  • Install: pip install skill-seekers[docx]

🌲 Pinecone Vector Database Adaptor

  • skill-seekers package output/ --format pinecone --upload — Direct Pinecone upload
  • Full CRUD operations with namespace support
  • OpenAI and Sentence Transformers embedding support
  • Batch upsert with configurable batch sizes
  • 764 tests for comprehensive coverage

🐛 Bug Fixes

  • 6 OCR quality fixes — Skip webcam frames, clean IDE decorations, fix duplicate lines, filter UI junk
  • 15 video pipeline fixes — Timeout handling, MCP integration, filename collisions, dependency management
  • Issue #300 — Selector fallback & dry-run link discovery (ReactFlow found 20+ pages, was 1)
  • Issue #301 — setup.sh macOS fix
  • RAG chunking crash — Fixed AttributeError: output_dir
  • Chunk overlap auto-scaling — Scales to max(50, chunk_tokens // 10)
  • Reference file limits removed — No more caps on GitHub issues, releases, or code blocks
  • See CHANGELOG.md for full details
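
The chunk overlap auto-scaling rule above is small enough to show directly. The function name is an assumption; only the formula max(50, chunk_tokens // 10) comes from the release notes.

```python
def auto_chunk_overlap(chunk_tokens: int) -> int:
    """Overlap is one tenth of the chunk size, but never fewer than 50 tokens."""
    return max(50, chunk_tokens // 10)


for size in (256, 512, 2000):
    print(size, auto_chunk_overlap(size))
```

So a 256-token chunk gets the 50-token floor, a 512-token chunk gets 51, and a 2000-token chunk gets 200.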

📦 Install / Upgrade

pip install --upgrade skill-seekers

# With video support
pip install skill-seekers[video]
skill-seekers video --setup  # Auto-detect GPU, install deps

# With Word support
pip install skill-seekers[docx]

# With Pinecone
pip install skill-seekers[pinecone]

# Everything
pip install skill-seekers[all]

Full Changelog: https://github.com/yusufkaraaslan/Skill_Seekers/blob/main/CHANGELOG.md

v3.1.3 Breaking risk
Breaking changes
  • --chunk-size → --chunk-tokens
  • --chunk-overlap → --chunk-overlap-tokens
  • --chunk → --chunk-for-rag
Full changelog

[3.1.3] - 2026-02-24

🐛 Hotfix — Explicit Chunk Flags & Argument Pipeline Cleanup

Fixed

  • Issue #299: skill-seekers package --target claude unrecognised argument crash — _reconstruct_argv() in main.py emits default flag values back into argv when routing subcommands. package_skill.py had a 105-line inline argparser whose flag names differed from those in arguments/package.py, so forwarded flags were rejected. Fixed by replacing the inline block with a call to add_package_arguments(parser) — the single source of truth.
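
The "single source of truth" pattern behind this fix can be sketched with plain argparse. The function name add_package_arguments comes from the release notes, but the flags registered below, their defaults, and build_parser are illustrative assumptions, not the contents of arguments/package.py.

```python
import argparse


def add_package_arguments(parser: argparse.ArgumentParser) -> None:
    """Register package flags once; every entry point calls this function."""
    # Illustrative subset only (assumed names and defaults).
    parser.add_argument("skill_dir")
    parser.add_argument("--target", default="claude")
    parser.add_argument("--chunk-tokens", type=int, default=512)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="package")
    add_package_arguments(parser)
    return parser


# Simulates a router forwarding default flag values back into argv.
forwarded = ["output/react", "--target", "claude", "--chunk-tokens", "512"]
args = build_parser().parse_args(forwarded)
```

Because the router and the subcommand share one definition, a forwarded flag can only be rejected if it is missing from that one function.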

Changed

  • package_skill.py argparser refactored — Replaced ~105 lines of inline argparse duplication with a single add_package_arguments(parser) call. Flag names are now guaranteed consistent with _reconstruct_argv() output, preventing future argument-name drift.
  • Explicit chunk flag names — All --chunk-* flags now include unit suffixes to eliminate ambiguity between RAG tokens and streaming characters:
    • --chunk-size (RAG tokens) → --chunk-tokens
    • --chunk-overlap (RAG tokens) → --chunk-overlap-tokens
    • --chunk (enable RAG chunking) → --chunk-for-rag
    • --streaming-chunk-size (chars) → --streaming-chunk-chars
    • --streaming-overlap (chars) → --streaming-overlap-chars
    • --chunk-size in PDF extractor (pages) → --pdf-pages-per-chunk
  • setup_logging() centralized — Added setup_logging(verbose, quiet) to utils.py and removed 4 duplicate module-level logging.basicConfig() calls from doc_scraper.py, github_scraper.py, codebase_scraper.py, and unified_scraper.py
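
For scripts that still use the old chunk flags, a small migration helper might look like the sketch below. It is a hypothetical helper, not part of skill-seekers. It deliberately leaves --chunk-size alone, because that flag maps to --chunk-tokens for RAG but to --pdf-pages-per-chunk in the PDF extractor and needs a human decision.

```python
# Old flag -> new flag, taken from the rename list above.
# --chunk-size is omitted on purpose: its replacement depends on the command.
RENAMED_FLAGS = {
    "--chunk-overlap": "--chunk-overlap-tokens",
    "--chunk": "--chunk-for-rag",
    "--streaming-chunk-size": "--streaming-chunk-chars",
    "--streaming-overlap": "--streaming-overlap-chars",
}


def migrate_flags(argv):
    """Return a copy of argv with pre-3.1.3 flag names replaced."""
    migrated = []
    for arg in argv:
        name, sep, value = arg.partition("=")  # also handles --flag=value
        migrated.append(RENAMED_FLAGS.get(name, name) + sep + value)
    return migrated
```

Matching on the exact flag name (rather than a prefix) keeps --chunk from accidentally rewriting --chunk-size or --chunk-overlap.
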
v3.1.2 Bug fix
⚠ Upgrade required
  • pip install --upgrade skill-seekers
  • docker pull yusufk/skill-seekers:latest
Full changelog

What's Changed

🐛 Critical Bug Fixes

Gemini enhancement 404 errors — The gemini-2.0-flash-exp model was retired by Google, causing all Gemini enhancement requests to fail with 404. Replaced with gemini-2.5-flash (stable GA).

skill-seekers enhance auto-detection — The documented behaviour of automatically using API mode when an API key is present was never implemented. This release fixes it:

  • ANTHROPIC_API_KEY set → Claude API mode
  • GOOGLE_API_KEY set → Gemini API mode
  • OPENAI_API_KEY set → OpenAI API mode
  • No key → LOCAL mode (Claude Code Max, free)

Use --mode LOCAL to force local mode even when API keys are present.
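
The auto-detection rules above can be sketched as a lookup over environment variables. The function name, the returned mode strings, and the precedence when several keys are set (here: the order in which the release notes list them) are assumptions.

```python
import os

# Checked in the order the release notes list them; the real precedence
# when several keys are set is an assumption.
KEY_TO_MODE = (
    ("ANTHROPIC_API_KEY", "claude"),
    ("GOOGLE_API_KEY", "gemini"),
    ("OPENAI_API_KEY", "openai"),
)


def detect_enhance_mode(env=None, forced_mode=None):
    """Pick an enhancement mode from API keys, unless one is forced."""
    env = os.environ if env is None else env
    if forced_mode:
        return forced_mode  # e.g. --mode LOCAL overrides any key
    for var, mode in KEY_TO_MODE:
        if env.get(var):
            return mode
    return "LOCAL"
```

Accepting env as a parameter keeps the function testable without touching the real process environment.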

create command argument forwarding — Universal flags (--dry-run, --verbose, --quiet, --name, --description) were crashing when used with GitHub, PDF, and codebase sources. All fixed. Also adds --dry-run support to skill-seekers github and skill-seekers pdf.

Upgrade

pip install --upgrade skill-seekers
docker pull yusufk/skill-seekers:latest

Full Changelog

See CHANGELOG.md for complete details.

v3.1.1 Bug fix

Fixed an AttributeError on max_pages in the create command.

Full changelog

What's Changed

  • fix: use getattr for max_pages in create command web routing by @YusufKaraaslanSpyke in https://github.com/yusufkaraaslan/Skill_Seekers/pull/294
  • hotfix: v3.1.1 — fix create command max_pages AttributeError by @yusufkaraaslan in https://github.com/yusufkaraaslan/Skill_Seekers/pull/295
  • Max page hot fix by @yusufkaraaslan in https://github.com/yusufkaraaslan/Skill_Seekers/pull/296
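
The getattr fix named in the first entry follows a common argparse pattern, sketched here with a hand-built namespace. The namespace contents are illustrative; only the use of getattr for max_pages comes from the PR title.

```python
import argparse

# A namespace built by a code path that never registered --max-pages.
args = argparse.Namespace(url="https://docs.react.dev/")

# args.max_pages would raise AttributeError here; getattr supplies a default.
max_pages = getattr(args, "max_pages", None)
```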

Full Changelog: https://github.com/yusufkaraaslan/Skill_Seekers/compare/v3.1.0...v3.1.1

v3.1.0 Breaking risk
Notable features
  • Unified `create` command that auto‑detects source type and consolidates all workflow entry points
  • 65 bundled enhancement workflow presets (e.g., security-focus, api-documentation) with management commands
Full changelog

🎯 v3.1.0 — "Unified CLI & Developer Experience"

One command for everything. 65 workflow presets. 178 production configs. 2280+ tests.


🚀 What's New

✨ Unified create Command — One command to rule them all

No more remembering which command to use. Just create with anything:

# Auto-detects source type
skill-seekers create https://docs.react.dev/          # → web scraper
skill-seekers create facebook/react                   # → GitHub analysis
skill-seekers create ./my-project                     # → local codebase
skill-seekers create tutorial.pdf                     # → PDF extraction
skill-seekers create configs/react.json               # → multi-source unified

# Quick preset shortcut (-p)
skill-seekers create https://docs.react.dev/ -p quick
skill-seekers create facebook/react -p comprehensive

# Progressive help — no more flag overwhelm
skill-seekers create --help            # 13 universal flags (clean)
skill-seekers create --help-web        # web-specific options
skill-seekers create --help-github     # GitHub-specific options
skill-seekers create --help-all        # every flag (120+)
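
How a single argument could be routed to the right scraper is sketched below for the five examples shown. These heuristics are assumptions made for illustration; the project's source_detector may use different rules and covers more source types.

```python
import os
import re

GITHUB_SLUG = re.compile(r"^[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+$")


def detect_source_type(source: str) -> str:
    """Guess which scraper should handle `source` (hypothetical heuristics)."""
    lowered = source.lower()
    # URLs are checked first so they are never mistaken for local paths.
    if lowered.startswith(("http://", "https://")):
        return "web"
    if lowered.endswith(".pdf"):
        return "pdf"
    if lowered.endswith(".json"):
        return "config"
    # Explicit path prefixes or an existing directory mean a local codebase.
    if source.startswith(("./", "../", "/", "~")) or os.path.isdir(source):
        return "local"
    if GITHUB_SLUG.match(source):
        return "github"
    raise ValueError(f"cannot detect source type for {source!r}")
```

Note the ambiguity the ordering resolves: facebook/react is treated as a GitHub slug only when no directory of that name exists locally.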

🔧 65 Enhancement Workflow Presets

Tailor your skills for specific use cases with bundled workflow presets:

# Chain multiple workflows
skill-seekers create facebook/react \
  --enhance-workflow security-focus \
  --enhance-workflow api-documentation

# Manage presets
skill-seekers workflows list                    # Browse all 65 bundled presets
skill-seekers workflows show security-focus     # Inspect a preset
skill-seekers workflows copy security-focus     # Copy to user dir for customization
skill-seekers workflows add my-preset.yaml      # Add custom preset

Bundled presets cover: security-focus, api-documentation, architecture-comprehensive, testing-focus, microservices-patterns, kubernetes-deployment, database-schema, mlops-pipeline, rest-api-design, graphql-schema, responsive-design, performance-optimization, accessibility-a11y and 50+ more.

⚡ Smart Enhancement Dispatcher

# Auto-detects API key or falls back to Claude Code CLI
skill-seekers enhance output/react/

# Explicit target
skill-seekers enhance output/react/ --target gemini

# Docker/root guard — clear error instead of silent failure
# (fixes #286, #289)

📄 ReStructuredText (RST) Support

Sphinx/RST documentation sites now extract content properly — class references, code blocks, tables, and cross-references are all parsed correctly.


🗃️ 178 Production Configs — All Reviewed & Enhanced

All configs in skill-seekers-configs brought to v1.1.0 quality standard:

  • ✅ All max_pages fields removed (deprecated, defaults apply automatically)
  • ✅ 5–13 categories per config, 3–6 keywords each
  • ✅ Semantic selector fallback chains (article, main, div[role='main'])
  • ✅ Outdated URLs fixed (Astro v3 restructure, Laravel 12.x)
  • scripts/validate-config.py bug fixes

🐛 Notable Bug Fixes

| Fix | Issue |
|-----|-------|
| --enhance-workflow flag forwarding in create command | workflows were silently ignored |
| LOCAL enhancement blocked for root/Docker users | fixes #286, #289 |
| %APPDATA% config paths on Windows | fixes #283 |
| Bracket characters in llms.txt URLs (IPv6 parse error) | fixes #284 |
| Unified config categories not found in validate-config.py | multi-source configs always failed |


📊 Stats

| Metric | v3.0.0 | v3.1.0 |
|--------|--------|--------|
| Tests passing | 1,852 | 2,280+ |
| Enhancement workflow presets | 0 | 65 |
| Production configs | 178 | 178 (all reviewed) |
| CLI entry points | 22 | 23 (workflows) |
| Platforms supported | 16 | 16 |


📦 Installation

pip install skill-seekers==3.1.0
# or
pip install --upgrade skill-seekers

🐳 Docker

docker pull yusufk/skill-seekers:3.1.0
docker pull yusufk/skill-seekers:latest

# MCP server
docker pull yusufk/skill-seekers-mcp:3.1.0

Full Changelog: https://github.com/yusufkaraaslan/Skill_Seekers/compare/v3.0.0...v3.1.0

v3.0.0 Breaking risk
Notable features
  • 16 platform adaptors (RAG, AI platforms, coding assistants, Markdown)
  • 26 MCP tools covering config generation, scraping, packaging, source management, splitting and vector DB exports
  • Cloud storage support for AWS S3, Google Cloud Storage, Azure Blob with upload/download/list/presigned URL features
Full changelog

[3.0.0] - 2026-02-10

🚀 "Universal Intelligence Platform" - Major Release

Theme: Transform any documentation into structured knowledge for any AI system.

This is our biggest release ever! v3.0.0 establishes Skill Seekers as the universal documentation preprocessor for the entire AI ecosystem - from RAG pipelines to AI coding assistants to Claude skills.

Highlights

  • 🚀 16 platform adaptors (up from 4 in v2.x)
  • 🛠️ 26 MCP tools (up from 9)
  • 1,852 tests passing (up from 700+)
  • ☁️ Cloud storage support (S3, GCS, Azure)
  • 🔄 CI/CD ready (GitHub Action + Docker)
  • 📦 12 example projects for every integration
  • 📚 18 integration guides complete

Added - Platform Adaptors (16 Total)

RAG & Vector Databases (8)

  • LangChain (--format langchain) - Output LangChain Document objects
  • LlamaIndex (--format llama-index) - Output LlamaIndex TextNode objects
  • Chroma (--format chroma) - Direct ChromaDB integration
  • FAISS (--format faiss) - Facebook AI Similarity Search
  • Haystack (--format haystack) - Deepset Haystack pipelines
  • Qdrant (--format qdrant) - Qdrant vector database
  • Weaviate (--format weaviate) - Weaviate vector search
  • Pinecone-ready (--target markdown) - Markdown format ready for Pinecone

AI Platforms (3)

  • Claude (--target claude) - Claude AI skills (ZIP + YAML)
  • Gemini (--target gemini) - Google Gemini skills (tar.gz)
  • OpenAI (--target openai) - OpenAI ChatGPT (ZIP + Vector Store)

AI Coding Assistants (4)

  • Cursor (--target claude + .cursorrules) - Cursor IDE integration
  • Windsurf (--target claude + .windsurfrules) - Windsurf/Codeium
  • Cline (--target claude + .clinerules) - VS Code extension
  • Continue.dev (--target claude) - Universal IDE support

Generic (1)

  • Markdown (--target markdown) - Generic ZIP export

Added - MCP Tools (26 Total)

Config Tools (3)

  • generate_config - Generate scraping configuration
  • list_configs - List available preset configs
  • validate_config - Validate config JSON structure

Scraping Tools (8)

  • estimate_pages - Estimate page count before scraping
  • scrape_docs - Scrape documentation websites
  • scrape_github - Scrape GitHub repositories
  • scrape_pdf - Extract from PDF files
  • scrape_codebase - Analyze local codebases
  • detect_patterns - Detect design patterns in code
  • extract_test_examples - Extract usage examples from tests
  • build_how_to_guides - Build how-to guides from code

Packaging Tools (4)

  • package_skill - Package skill for target platform
  • upload_skill - Upload to LLM platform
  • enhance_skill - AI-powered enhancement
  • install_skill - One-command complete workflow

Source Tools (5)

  • fetch_config - Fetch config from remote source
  • submit_config - Submit config for approval
  • add_config_source - Add Git config source
  • list_config_sources - List config sources
  • remove_config_source - Remove config source

Splitting Tools (2)

  • split_config - Split large configs
  • generate_router - Generate router skills

Vector DB Tools (4)

  • export_to_weaviate - Export to Weaviate
  • export_to_chroma - Export to ChromaDB
  • export_to_faiss - Export to FAISS
  • export_to_qdrant - Export to Qdrant

Added - Cloud Storage

Upload skills directly to cloud storage:

  • AWS S3 - skill-seekers cloud upload --provider s3 --bucket my-bucket
  • Google Cloud Storage - skill-seekers cloud upload --provider gcs --bucket my-bucket
  • Azure Blob Storage - skill-seekers cloud upload --provider azure --container my-container

Features:

  • Upload/download directories
  • List files with metadata
  • Check file existence
  • Generate presigned URLs
  • Cloud-agnostic interface

Added - CI/CD Support

GitHub Action

- uses: skill-seekers/action@v1
  with:
    config: configs/react.json
    format: langchain

Features:

  • Auto-update on doc changes
  • Matrix builds for multiple frameworks
  • Scheduled updates
  • Caching for faster runs

Docker

docker run -v $(pwd):/data skill-seekers:latest scrape --config /data/config.json

Added - Production Infrastructure

  • Helm Charts - Kubernetes deployment
  • Docker Compose - Local vector DB stack
  • Monitoring - Sentry integration, sync monitoring
  • Benchmarking - Performance testing framework

Added - 12 Example Projects

Complete working examples for every integration:

  1. langchain-rag-pipeline - React docs → LangChain → Chroma
  2. llama-index-query-engine - Vue docs → LlamaIndex
  3. pinecone-upsert - Documentation → Pinecone
  4. chroma-example - Full ChromaDB workflow
  5. faiss-example - FAISS index building
  6. haystack-pipeline - Haystack RAG pipeline
  7. qdrant-example - Qdrant vector DB
  8. weaviate-example - Weaviate integration
  9. cursor-react-skill - React skill for Cursor
  10. windsurf-fastapi-context - FastAPI for Windsurf
  11. cline-django-assistant - Django assistant for Cline
  12. continue-dev-universal - Universal IDE context

Quality Metrics

  • 1,852 tests across 100 test files
  • 58,512 lines of Python code
  • 80+ documentation files
  • 100% test coverage for critical paths
  • CI/CD on every commit

Fixed

URL Conversion Bug with Anchor Fragments (Issue #277)

  • Critical Bug Fix: Fixed 404 errors when scraping documentation with anchor links
    • Problem: URLs with anchor fragments (e.g., #synchronous-initialization) were malformed
      • Incorrect: https://example.com/docs/api#method/index.html.md
      • Correct: https://example.com/docs/api/index.html.md
    • Root Cause: _convert_to_md_urls() didn't strip anchor fragments before appending /index.html.md
    • Solution: Parse URLs with urllib.parse to remove fragments and deduplicate base URLs
    • Impact: Prevents duplicate requests for the same page with different anchors
    • Additional Fix: Changed .md detection from ".md" in url to url.endswith('.md')
      • Prevents false matches on URLs like /cmd-line or /AMD-processors
  • Test Coverage: 12 comprehensive tests covering all edge cases
    • Anchor fragment stripping
    • Deduplication of multiple anchors on same URL
    • Query parameter preservation
    • Trailing slash handling
    • Real-world MikroORM case validation
    • 54/54 tests passing (42 existing + 12 new)
  • Reported by: @devjones via Issue #277
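
The fix described above (strip the fragment, deduplicate, and test the suffix with endswith) can be sketched with urllib.parse. This is not the project's _convert_to_md_urls(); the function name and the choice to keep the query string after the appended path are assumptions.

```python
from urllib.parse import urlsplit, urlunsplit


def convert_to_md_urls(urls):
    """Map page URLs to their /index.html.md form, one entry per page."""
    seen = set()
    result = []
    for url in urls:
        parts = urlsplit(url)
        path = parts.path
        if not path.endswith(".md"):  # endswith, so /cmd-line is not a match
            path = path.rstrip("/") + "/index.html.md"
        # Rebuild without the fragment; the query string is kept as-is.
        md_url = urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))
        if md_url not in seen:
            seen.add(md_url)
            result.append(md_url)
    return result
```

Two anchors on the same page, such as /docs/api#method and /docs/api#other, collapse to a single https://example.com/docs/api/index.html.md entry.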

Added

Extended Language Detection (NEW)

  • 7 New Programming Languages: Dart, Scala, SCSS, SASS, Elixir, Lua, Perl
    • Pattern-based detection with confidence scoring (0.6-0.8+ thresholds)
    • 70 regex patterns prioritizing unique identifiers (weight 5)
    • Framework-specific patterns:
      • Dart: Flutter widgets (StatelessWidget, StatefulWidget, Widget build())
      • Scala: Pattern matching (case class, trait, match {})
      • SCSS: Preprocessor features ($variables, @mixin, @include, @extend)
      • SASS: Indented syntax (=mixin, +include, $variables)
      • Elixir: Functional patterns (defmodule, def ... do, pipe operator |>)
      • Lua: Game scripting (local, repeat...until, ~=, elseif)
      • Perl: Text processing (my $, use strict, sub, chomp, regex =~)
    • Comprehensive test coverage: 7 new tests, 30/30 passing (100%)
    • False positive prevention: Unique identifiers (weight 5) + confidence thresholds
    • No regressions: All existing language detection tests still pass
    • Total language support: Now 27+ programming languages
    • Credit: Contributed by @PaawanBarach via PR #275

Multi-Agent Support for Local Enhancement (NEW)

  • Multiple Coding Agent Support: Choose your preferred local coding agent for SKILL.md enhancement
    • Claude Code (default): Claude Code CLI with --dangerously-skip-permissions
    • Codex CLI: OpenAI Codex CLI with --full-auto and --skip-git-repo-check
    • Copilot CLI: GitHub Copilot CLI (gh copilot chat)
    • OpenCode CLI: OpenCode CLI
    • Custom agents: Use any CLI tool with --agent custom --agent-cmd "command {prompt_file}"
  • CLI Arguments: New flags for agent selection
    • --agent: Choose agent (claude, codex, copilot, opencode, custom)
    • --agent-cmd: Override command template for custom agents
  • Environment Variables: CI/CD friendly configuration
    • SKILL_SEEKER_AGENT: Default agent to use
    • SKILL_SEEKER_AGENT_CMD: Default command template for custom agents
  • Security First: Custom command validation
    • Blocks dangerous shell characters (;, &, |, $, `, \n, \r)
    • Validates executable exists in PATH
    • Safe parsing with shlex.split()
  • Dual Input Modes: Supports both file-based and stdin-based agents
    • File-based: Uses {prompt_file} placeholder (Claude, custom agents)
    • Stdin-based: Pipes prompt via stdin (Codex CLI)
  • Backward Compatible: Claude Code remains the default, no breaking changes
  • Comprehensive Tests: 13 new tests covering all agent types and security validation
  • Agent Normalization: Smart alias handling (e.g., "claude-code" → "claude")
  • Credit: Contributed by @rovo79 (Robert Dean) via PR #270
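
The three validation steps listed under "Security First" can be sketched as follows. The function name and error messages are hypothetical; the blocked characters, the PATH check, and the use of shlex.split() come from the release notes.

```python
import shlex
import shutil

DANGEROUS_CHARS = (";", "&", "|", "$", "`", "\n", "\r")


def validate_agent_cmd(template: str) -> list:
    """Return the parsed command, or raise ValueError if it looks unsafe."""
    for ch in DANGEROUS_CHARS:
        if ch in template:
            raise ValueError(f"dangerous character {ch!r} in agent command")
    argv = shlex.split(template)  # raises ValueError on unbalanced quotes
    if not argv:
        raise ValueError("agent command is empty")
    if shutil.which(argv[0]) is None:
        raise ValueError(f"executable not found in PATH: {argv[0]}")
    return argv
```

Rejecting shell metacharacters before parsing means the {prompt_file} placeholder survives as a plain argument and the command never needs to go through a shell.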

C3.10: Signal Flow Analysis for Godot Projects (NEW)

  • Complete Signal Flow Analysis System: Analyze event-driven architectures in Godot game projects

    • Signal declaration extraction (signal keyword detection)
    • Connection mapping (.connect() calls with targets and methods)
    • Emission tracking (.emit() and emit_signal() calls)
    • 208 signals, 634 connections, and 298 emissions detected in test project (Cosmic Idler)
    • Signal density metrics (signals per file)
    • Event chain detection (signals triggering other signals)
    • Output: signal_flow.json, signal_flow.mmd (Mermaid diagram), signal_reference.md
  • Signal Pattern Detection: Three major patterns identified

    • EventBus Pattern (0.90 confidence): Centralized signal hub in autoload
    • Observer Pattern (0.85 confidence): Multi-observer signals (3+ listeners)
    • Event Chains (0.80 confidence): Cascading signal propagation
  • Signal-Based How-To Guides (C3.10.1): AI-generated usage guides

    • Step-by-step guides (Connect → Emit → Handle)
    • Real code examples from project
    • Common usage locations
    • Parameter documentation
    • Output: signal_how_to_guides.md (10 guides for Cosmic Idler)
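
The three extraction steps above (declarations, connections, emissions) can be approximated with regular expressions. The patterns below are a simplified sketch written for this note; the project's actual patterns are not shown in the release notes and will cover more GDScript forms.

```python
import re

# Simplified patterns; real GDScript has more forms than these cover.
SIGNAL_DECL = re.compile(r"^[ \t]*signal[ \t]+(\w+)", re.MULTILINE)
SIGNAL_CONNECT = re.compile(r"(\w+)\.connect\(")
SIGNAL_EMIT = re.compile(r"(\w+)\.emit\(|emit_signal\(\s*[\"'](\w+)[\"']")

SAMPLE = '''
signal health_changed(new_value)
signal died

func _ready():
    health_changed.connect(_on_health_changed)

func take_damage(amount):
    health_changed.emit(health - amount)
    emit_signal("died")
'''


def analyze_signals(source: str) -> dict:
    """Collect declared, connected, and emitted signal names from one file."""
    # SIGNAL_EMIT has two groups; exactly one is non-empty per match.
    emissions = [a or b for a, b in SIGNAL_EMIT.findall(source)]
    return {
        "declared": SIGNAL_DECL.findall(source),
        "connected": SIGNAL_CONNECT.findall(source),
        "emitted": emissions,
    }
```

Running analyze_signals(SAMPLE) reports two declared signals, one connection, and two emissions, which is the raw data that per-file density metrics would be built from.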

Godot Game Engine Support

  • Comprehensive Godot File Type Support: Full analysis of Godot 4.x projects

    • GDScript (.gd): 265 files analyzed in test project
    • Scene files (.tscn): 118 scene files
    • Resource files (.tres): 38 resource files
    • Shader files (.gdshader, .gdshaderinc): 9 shader files
    • C# integration: Phantom Camera addon (13 files)
  • GDScript Language Support: Complete GDScript parsing with regex-based extraction

    • Dependency extraction: preload(), load(), extends patterns
    • Test framework detection: GUT, gdUnit4, WAT
    • Test file patterns: test_*.gd, *_test.gd
    • Signal syntax: signal, .connect(), .emit()
    • Export decorators: @export, @onready
    • Test decorators: @test (gdUnit4)
  • Game Engine Framework Detection: Improved detection for Unity, Unreal, Godot

    • Godot markers: project.godot, .godot directory, .tscn, .tres, .gd files
    • Unity markers: Assembly-CSharp.csproj, UnityEngine.dll, ProjectSettings/ProjectVersion.txt
    • Unreal markers: .uproject, Source/, Config/DefaultEngine.ini
    • Fixed false positive Unity detection (was using generic "Assets" keyword)
  • GDScript Test Extraction: Extract usage examples from Godot test files

    • 396 test cases extracted from 20 GUT test files in test project
    • Patterns: instantiation (preload().new(), load().new()), assertions (assert_eq, assert_true), signals
    • GUT framework: extends GutTest, func test_*(), add_child_autofree()
    • Test categories: instantiation, assertions, signal connections, setup/teardown
    • Real code examples from production test files
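
Regex-based dependency extraction for the preload(), load(), and extends patterns named above might look like this sketch. The function name and patterns are assumptions; only path-style extends is handled, and class-name forms such as extends Node are ignored.

```python
import re

PRELOAD_OR_LOAD = re.compile(r"\b(?:preload|load)\(\s*[\"']([^\"']+)[\"']\s*\)")
EXTENDS_PATH = re.compile(r"^[ \t]*extends[ \t]+[\"']([^\"']+)[\"']", re.MULTILINE)


def extract_gdscript_deps(source: str, file_path: str) -> list:
    """Return resource paths a GDScript file depends on, without duplicates."""
    found = EXTENDS_PATH.findall(source) + PRELOAD_OR_LOAD.findall(source)
    deps = []
    for target in found:
        # Skip self-references so a file never depends on itself.
        if target != file_path and target not in deps:
            deps.append(target)
    return deps
```

For a file res://player.gd that extends "res://actor.gd" and preloads "res://bullet.tscn", the result is those two paths in that order.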

C3.9: Project Documentation Extraction

  • Markdown Documentation Extraction: Automatically extracts and categorizes all .md files from projects
    • Smart categorization by folder/filename (overview, architecture, guides, workflows, features, etc.)
    • Processing depth control: surface (raw copy), deep (parse+summarize), full (AI-enhanced)
    • AI enhancement (level 2+) adds topic extraction and cross-references
    • New "📖 Project Documentation" section in SKILL.md
    • Output to references/documentation/ organized by category
    • Default ON, use --skip-docs to disable
    • 15 new tests for documentation extraction features

Granular AI Enhancement Control

  • --enhance-level Flag: Fine-grained control over AI enhancement (0-3)
    • Level 0: No AI enhancement (default)
    • Level 1: SKILL.md enhancement only (fast, high value)
    • Level 2: SKILL.md + Architecture + Config + Documentation
    • Level 3: Full enhancement (patterns, tests, config, architecture, docs)
  • Config Integration: default_enhance_level setting in ~/.config/skill-seekers/config.json
  • MCP Support: All MCP tools updated with enhance_level parameter
  • Independent from --comprehensive: Enhancement level is separate from feature depth

C# Language Support

  • C# Test Example Extraction: Full support for C# test frameworks
    • Language alias mapping (C# → csharp, C++ → cpp)
    • NUnit, xUnit, MSTest test framework patterns
    • Mock pattern support (NSubstitute, Moq)
    • Zenject dependency injection patterns
    • Setup/teardown method extraction
    • 2 new tests for C# extraction features

Performance Optimizations

  • Parallel LOCAL Mode AI Enhancement: 6-12x faster with ThreadPoolExecutor
    • Concurrent workers: 3 (configurable via local_parallel_workers)
    • Batch processing: 20 patterns per Claude CLI call (configurable via local_batch_size)
    • Significant speedup for large codebases
  • Config Settings: New ai_enhancement section in config
    • local_batch_size: Patterns per CLI call (default: 20)
    • local_parallel_workers: Concurrent workers (default: 3)
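
The batching and worker settings above can be sketched with ThreadPoolExecutor. The enhance_batch callable stands in for one CLI call and is hypothetical; the defaults of 20 patterns per batch and 3 workers come from the release notes.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def enhance_in_parallel(patterns, enhance_batch, batch_size=20, workers=3):
    """Run enhance_batch over batches concurrently, preserving input order."""
    batches = chunked(patterns, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in submission order, not completion order.
        results = pool.map(enhance_batch, batches)
        return [item for batch in results for item in batch]
```

Threads suit this workload because each batch spends its time waiting on an external process rather than running Python code.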

UX Improvements

  • Auto-Enhancement: SKILL.md automatically enhanced when using --enhance or --comprehensive

    • No need for separate skill-seekers enhance command
    • Seamless one-command workflow
    • 10-minute timeout for large codebases
    • Graceful fallback with retry instructions on failure
  • LOCAL Mode Fallback: All AI enhancements now fall back to LOCAL mode when no API key is set

    • Applies to: pattern enhancement (C3.1), test examples (C3.2), architecture (C3.7)
    • Uses Claude Code CLI instead of failing silently
    • Better UX: "Using LOCAL mode (Claude Code CLI)" instead of "AI disabled"
  • Support for custom Claude-compatible API endpoints via ANTHROPIC_BASE_URL environment variable

  • Compatibility with GLM-4.7 and other Claude-compatible APIs across all AI enhancement features

Changed

  • All AI enhancement modules now respect ANTHROPIC_BASE_URL for custom endpoints
  • Updated documentation with GLM-4.7 configuration examples
  • Rewritten LOCAL mode in config_enhancer.py to use Claude CLI properly with explicit output file paths
  • Updated MCP scrape_codebase_tool with skip_docs and enhance_level parameters
  • Updated CLAUDE.md with C3.9 documentation extraction feature
  • Increased default batch size from 5 to 20 patterns for LOCAL mode

Fixed

  • C# Test Extraction: Fixed "Language C# not supported" error with language alias mapping
  • Config Type Field Mismatch: Fixed KeyError in config_enhancer.py by supporting both "type" and "config_type" fields
  • LocalSkillEnhancer Import: Fixed incorrect import and method call in main.py (SkillEnhancer → LocalSkillEnhancer)
  • Code Quality: Fixed 4 critical linter errors (unused imports, variables, arguments, import sorting)

Godot Game Engine Fixes

  • GDScript Dependency Extraction: Fixed 265+ "Syntax error in *.gd" warnings (commit 3e6c448)

    • GDScript files were incorrectly routed to Python AST parser
    • Created dedicated _extract_gdscript_imports() with regex patterns
    • Now correctly parses preload(), load(), extends patterns
    • Result: 377 dependencies extracted with 0 warnings
  • Framework Detection False Positive: Fixed Unity detection on Godot projects (commit 50b28fe)

    • Was detecting "Unity" due to generic "Assets" keyword in comments
    • Changed Unity markers to specific files: Assembly-CSharp.csproj, UnityEngine.dll, Library/
    • Now correctly detects Godot via project.godot, .godot directory
  • Circular Dependencies: Fixed self-referential cycles (commit 50b28fe)

    • 3 self-loop warnings (files depending on themselves)
    • Added target != file_path check in dependency graph builder
    • Result: 0 circular dependencies detected
  • GDScript Test Discovery: Fixed 0 test files found in Godot projects (commit 50b28fe)

    • Added GDScript test patterns: test_*.gd, *_test.gd
    • Added GDScript to LANGUAGE_MAP
    • Result: 32 test files discovered (20 GUT files with 396 tests)
  • GDScript Test Extraction: Fixed "Language GDScript not supported" warning (commit c826690)

    • Added GDScript regex patterns to PATTERNS dictionary
    • Patterns: instantiation (preload().new()), assertions (assert_eq), signals (.connect())
    • Result: 22 test examples extracted successfully
  • Config Extractor Array Handling: Fixed JSON/YAML array parsing (commit fca0951)

    • Error: 'list' object has no attribute 'items' on root-level arrays
    • Added isinstance checks for dict/list/primitive at root
    • Result: No JSON array errors, save.json parsed correctly
  • Progress Indicators: Fixed missing progress for small batches (commit eec37f5)

    • Progress only shown every 5 batches, invisible for small jobs
    • Modified condition to always show for batches < 10
    • Result: "Progress: 1/2 batches completed" now visible
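
The root-level array fix above comes down to checking the node type before calling .items(). A minimal sketch, assuming the extractor flattens parsed JSON/YAML into dotted paths (the function name and path format are hypothetical):

```python
def flatten_config(node, prefix=""):
    """Flatten parsed JSON/YAML into {path: value} for any root type."""
    flat = {}
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{prefix}.{key}" if prefix else str(key)
            flat.update(flatten_config(value, child))
    elif isinstance(node, list):
        # A list has no .items(); index into it instead.
        for index, value in enumerate(node):
            flat.update(flatten_config(value, f"{prefix}[{index}]"))
    else:
        # Primitive (str, int, float, bool, None) at the root or a leaf.
        flat[prefix or "<root>"] = node
    return flat
```

A file whose top level is an array, such as [{"name": "slot1"}, {"name": "slot2"}], now flattens to [0].name and [1].name instead of raising.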

Tests

  • GDScript Test Extraction Test: Added comprehensive test case for GDScript GUT/gdUnit4 framework
    • Tests player instantiation with preload() and load()
    • Tests signal connections and emissions
    • Tests gdUnit4 @test annotation syntax
    • Tests game state management patterns
    • 4 test functions with 60+ lines of GDScript code
    • Validates extraction of instantiations, assertions, and signal patterns

Removed

  • Removed client-specific documentation files from repository

v2.9.0 New feature
Notable features
  • Complete Signal Flow Analysis for Godot projects (declaration extraction, connection mapping, emission tracking, density metrics, event chain detection) with JSON/Mermaid/Markdown outputs
  • Signal Pattern Detection identifying EventBus, Observer, and Event Chain patterns with confidence scores
  • Full GDScript language support including dependency extraction (`preload`, `load`, `extends`), test framework detection (GUT, gdUnit4), export decorators, and parameter documentation
Full changelog

🎮 Game Development Release - Godot Engine Support

This release adds comprehensive support for Godot game engine projects with industry-leading signal flow analysis and complete GDScript language support.

🎮 Added

C3.10: Signal Flow Analysis for Godot Projects ⭐ NEW

  • Complete Signal Flow Analysis System: Analyze event-driven architectures in Godot game projects

    • Signal declaration extraction (signal keyword detection)
    • Connection mapping (.connect() calls with targets and methods)
    • Emission tracking (.emit() and emit_signal() calls)
    • Real-world results: 208 signals, 634 connections, 298 emissions detected in Cosmic Idler test project
    • Signal density metrics (0.78 signals/file)
    • Event chain detection (signals triggering other signals)
    • Output: signal_flow.json (374KB), signal_flow.mmd (Mermaid diagram), signal_reference.md (34KB)
  • Signal Pattern Detection: Three major patterns identified with confidence scoring

    • EventBus Pattern (0.90 confidence): Centralized signal hub in autoload
    • Observer Pattern (0.85 confidence): Multi-observer signals (3+ listeners, theme_changed: 21 connections)
    • Event Chains (0.80 confidence): Cascading signal propagation
  • Signal-Based How-To Guides (C3.10.1): AI-generated usage guides

    • Step-by-step guides (Connect → Emit → Handle)
    • Real code examples from project
    • Common usage locations with file references
    • Parameter documentation
    • Output: signal_how_to_guides.md (10 guides generated for Cosmic Idler)
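A rough sketch of the extraction steps listed above (declarations, connections, emissions, and the 3+ listener Observer heuristic). The regexes, the function name `analyze_signals`, and the sample GDScript are assumptions made for illustration; the real signal_flow_analyzer.py is not shown in these notes and may work differently.

```python
import re
from collections import Counter

# Illustrative patterns only; not the project's actual regexes.
SIGNAL_DECL = re.compile(r"^\s*signal\s+(\w+)", re.MULTILINE)
CONNECT_CALL = re.compile(r"(\w+)\.connect\(")
EMIT_CALL = re.compile(r"(\w+)\.emit\(|emit_signal\(\s*[\"'](\w+)[\"']")


def analyze_signals(source: str, observer_threshold: int = 3) -> dict:
    """Count signal declarations, connections, and emissions in GDScript text."""
    declared = SIGNAL_DECL.findall(source)
    connections = Counter(CONNECT_CALL.findall(source))
    # Each match fills exactly one group: `.emit(` style or `emit_signal(` style.
    emissions = Counter(new or old for new, old in EMIT_CALL.findall(source))
    # Signals with several listeners are flagged as Observer pattern candidates.
    observers = sorted(
        name for name, count in connections.items() if count >= observer_threshold
    )
    return {
        "declared": declared,
        "connections": dict(connections),
        "emissions": dict(emissions),
        "observer_candidates": observers,
    }


SAMPLE = '''
signal theme_changed(name)
signal health_changed

func _ready():
    theme_changed.connect(_on_theme_a)
    theme_changed.connect(_on_theme_b)
    theme_changed.connect(_on_theme_c)
    health_changed.connect(_on_health)

func hit():
    health_changed.emit(10)
    emit_signal("theme_changed", "dark")
'''

report = analyze_signals(SAMPLE)
print(report["observer_candidates"])
```

A regex pass like this cannot resolve which object a signal belongs to, so it is best read as a counting heuristic rather than a full parser.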

Complete Godot Game Engine Support

  • Comprehensive Godot File Type Support: Full analysis of Godot 4.x projects

    • GDScript (.gd): 265 files analyzed in test project (59.8% of codebase)
    • Scene files (.tscn): 118 scene files (26.6%)
    • Resource files (.tres): 38 resource files (8.6%)
    • Shader files (.gdshader, .gdshaderinc): 9 shader files (2.0%)
    • C# integration: Phantom Camera addon (13 files, 2.9%)
  • GDScript Language Support: Complete GDScript parsing with regex-based extraction

    • Dependency extraction: preload(), load(), extends patterns
    • Test framework detection: GUT, gdUnit4, WAT
    • Test file patterns: test_*.gd, *_test.gd
    • Signal syntax: signal, .connect(), .emit()
    • Export decorators: @export, @onready
    • Test decorators: @test (gdUnit4)
    • 377 dependencies extracted with 0 syntax errors
  • Game Engine Framework Detection: Improved detection for Unity, Unreal, Godot

    • Godot markers: project.godot, .godot directory, .tscn, .tres, .gd files
    • Unity markers: Assembly-CSharp.csproj, UnityEngine.dll, ProjectSettings/ProjectVersion.txt
    • Unreal markers: .uproject, Source/, Config/DefaultEngine.ini
    • Fixed false positive Unity detection (was using generic "Assets" keyword)
    • Priority-based detection (game engines detected before web frameworks)
  • GDScript Test Extraction: Extract usage examples from Godot test files

    • 396 test cases extracted from 20 GUT test files in Cosmic Idler test project
    • Patterns: instantiation (preload().new(), load().new()), assertions (assert_eq, assert_true), signals
    • GUT framework: extends GutTest, func test_*(), add_child_autofree()
    • Test categories: instantiation, assertions, signal connections, setup/teardown
    • Real code examples from production test files
    • 22 high-quality test examples extracted
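The regex-based dependency extraction mentioned above (preload(), load(), extends) might look roughly like the sketch below. The patterns and the function name `extract_gdscript_dependencies` are hypothetical; the project's own `_extract_gdscript_imports()` is not reproduced here.

```python
import re

# Illustrative patterns; assumptions, not the project's actual regexes.
PRELOAD_OR_LOAD = re.compile(r"\b(?:preload|load)\(\s*[\"']([^\"']+)[\"']\s*\)")
EXTENDS = re.compile(r"^\s*extends\s+(?:[\"']([^\"']+)[\"']|(\w+))", re.MULTILINE)


def extract_gdscript_dependencies(source: str) -> list:
    """Return resource paths and base classes referenced by a GDScript file."""
    deps = PRELOAD_OR_LOAD.findall(source)
    for path, class_name in EXTENDS.findall(source):
        # `extends` takes either a quoted script path or a bare class name.
        deps.append(path or class_name)
    return deps


SAMPLE = '''
extends "res://actors/base_actor.gd"

const Bullet = preload("res://weapons/bullet.tscn")

func _ready():
    var hud = load('res://ui/hud.tscn')
'''

print(extract_gdscript_dependencies(SAMPLE))
```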

🐛 Fixed

Godot-Specific Bug Fixes

  • GDScript Dependency Extraction (commit 3e6c448): Fixed 265+ "Syntax error in *.gd" warnings

    • GDScript files were incorrectly routed to Python AST parser
    • Created dedicated _extract_gdscript_imports() with regex patterns
    • Now correctly parses preload(), load(), extends patterns
    • Result: 377 dependencies extracted with 0 warnings
  • Framework Detection False Positive (commit 50b28fe): Fixed Unity detection on Godot projects

    • Was detecting "Unity" due to generic "Assets" keyword in comments
    • Changed Unity markers to specific files: Assembly-CSharp.csproj, UnityEngine.dll, Library/
    • Now correctly detects Godot via project.godot, .godot directory
  • Circular Dependencies (commit 50b28fe): Fixed self-referential cycles

    • 3 self-loop warnings (files depending on themselves)
    • Added target != file_path check in dependency graph builder
    • Result: 0 circular dependencies detected
  • GDScript Test Discovery (commit 50b28fe): Fixed 0 test files found in Godot projects

    • Added GDScript test patterns: test_*.gd, *_test.gd
    • Added GDScript to LANGUAGE_MAP
    • Result: 32 test files discovered (20 GUT files with 396 tests)
  • GDScript Test Extraction (commit c826690): Fixed "Language GDScript not supported" warning

    • Added GDScript regex patterns to PATTERNS dictionary
    • Patterns: instantiation (preload().new()), assertions (assert_eq), signals (.connect())
    • Result: 22 test examples extracted successfully
  • Config Extractor Array Handling (commit fca0951): Fixed JSON/YAML array parsing

    • Error: 'list' object has no attribute 'items' on root-level arrays
    • Added isinstance checks for dict/list/primitive at root
    • Result: No JSON array errors, save.json parsed correctly
  • Progress Indicators (commit eec37f5): Fixed missing progress for small batches

    • Progress only shown every 5 batches, invisible for small jobs
    • Modified condition to always show for batches < 10
    • Result: "Progress: 1/2 batches completed" now visible
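The root-level array fix in the Config Extractor entry above comes down to checking the parsed value's type before iterating. A minimal sketch, assuming a hypothetical `flatten_config` helper (the extractor's real function names and output shape are not given in these notes):

```python
import json
from typing import Any


def flatten_config(data: Any, prefix: str = "") -> dict:
    """Flatten parsed JSON/YAML into dotted keys, accepting any root type."""
    if isinstance(data, dict):
        pairs = list(data.items())
    elif isinstance(data, list):
        # A list has no .items(); use the element index as the key instead.
        pairs = [(str(i), item) for i, item in enumerate(data)]
    else:
        # Primitive value (str, int, bool, None, ...).
        return {prefix or "value": data}

    flat = {}
    for key, value in pairs:
        path = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten_config(value, path))
    return flat


save_data = json.loads('[{"slot": 1, "name": "alpha"}, {"slot": 2, "name": "beta"}]')
print(flatten_config(save_data))
```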

🧪 Tests

  • GDScript Test Extraction Test: Added comprehensive test case for GDScript GUT/gdUnit4 framework
    • Tests player instantiation with preload() and load()
    • Tests signal connections and emissions
    • Tests gdUnit4 @test annotation syntax
    • Tests game state management patterns
    • 4 test functions with 60+ lines of GDScript code
    • Validates extraction of instantiations, assertions, and signal patterns

📊 Quality Metrics (Cosmic Idler Test Project)

  • SKILL.md Quality: 9/10 rating (31KB, 1,030 lines)
  • File Coverage: 98% (443/452 files analyzed)
  • Signal Analysis: 208 signals, 634 connections, 298 emissions
  • Test Coverage: 32 test files discovered, 22 examples extracted
  • Dependency Graph: 377 dependencies, 0 circular cycles
  • Language Breakdown: GDScript 59.8%, Scenes 26.6%, Resources 8.6%, Shaders 2.0%

📝 Files Changed

  • 1 new file: signal_flow_analyzer.py (489 lines)
  • 15 modified files: Core analyzers, test extractors, dependency analyzers
  • +1,574 additions, -157 deletions

🎯 Use Cases

This release is perfect for:

  • 🎮 Godot game developers wanting to understand signal architectures
  • 📚 Teams documenting Godot projects for AI assistants (Claude, ChatGPT, Gemini)
  • 🔍 Code reviewers analyzing event-driven patterns in games
  • 🎓 Game development educators creating learning materials
  • 🤖 AI agents needing deep understanding of Godot codebases

🙏 Thanks

Special thanks to the Godot community and Cosmic Idler project for providing an excellent test case for validating all features!


Full Changelog: https://github.com/yusufkaraaslan/Skill_Seekers/compare/v2.8.0...v2.9.0

v2.8.0 Breaking risk
⚠ Upgrade required
  • Default AI enhancement level changed to 0; existing workflows using automatic enhancement must set `--enhance-level` or update config.
  • Configuration file now expects a `default_enhance_level` key under `ai_enhancement` section.
Notable features
  • Markdown Documentation Extraction (C3.9) with categorization and AI‑enhanced output
  • Granular `--enhance-level` flag (0‑3) for fine‑grained AI enhancement control
  • C# test example extraction supporting NUnit, xUnit, MSTest, NSubstitute, Moq
Full changelog

[2.8.0] - 2026-02-01

🚀 Major Feature Release - Enhanced Code Analysis & Documentation

This release brings powerful new code analysis features, performance optimizations, and international API support. Special thanks to all our contributors who made this release possible!

Added

C3.9: Project Documentation Extraction

  • Markdown Documentation Extraction: Automatically extracts and categorizes all .md files from projects
    • Smart categorization by folder/filename (overview, architecture, guides, workflows, features, etc.)
    • Processing depth control: surface (raw copy), deep (parse+summarize), full (AI-enhanced)
    • AI enhancement (level 2+) adds topic extraction and cross-references
    • New "📖 Project Documentation" section in SKILL.md
    • Output to references/documentation/ organized by category
    • Default ON, use --skip-docs to disable
    • 15 new tests for documentation extraction features

Granular AI Enhancement Control

  • --enhance-level Flag: Fine-grained control over AI enhancement (0-3)
    • Level 0: No AI enhancement (default)
    • Level 1: SKILL.md enhancement only (fast, high value)
    • Level 2: SKILL.md + Architecture + Config + Documentation
    • Level 3: Full enhancement (patterns, tests, config, architecture, docs)
  • Config Integration: default_enhance_level setting in ~/.config/skill-seekers/config.json
  • MCP Support: All MCP tools updated with enhance_level parameter
  • Independent from --comprehensive: Enhancement level is separate from feature depth
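One way to picture how the flag and the config default could interact is sketched below. The function name `resolve_enhance_level` is hypothetical, and the precedence shown (explicit flag first, then `default_enhance_level` under `ai_enhancement`, then 0) is an assumption; only the key name, the 0-3 range, and the default of 0 come from the notes above.

```python
from typing import Optional

# Summary of the levels as described in the release notes.
ENHANCE_LEVELS = {
    0: "No AI enhancement",
    1: "SKILL.md enhancement only",
    2: "SKILL.md + Architecture + Config + Documentation",
    3: "Full enhancement",
}


def resolve_enhance_level(cli_level: Optional[int], config: dict) -> int:
    """Pick the effective level: CLI flag, else config default, else 0."""
    if cli_level is not None:
        level = cli_level
    else:
        # Assumed layout: {"ai_enhancement": {"default_enhance_level": N}}
        level = config.get("ai_enhancement", {}).get("default_enhance_level", 0)
    if level not in ENHANCE_LEVELS:
        raise ValueError(f"enhance level must be 0-3, got {level!r}")
    return level


config = {"ai_enhancement": {"default_enhance_level": 1}}
print(ENHANCE_LEVELS[resolve_enhance_level(None, config)])
```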

C# Language Support

  • C# Test Example Extraction: Full support for C# test frameworks
    • Language alias mapping (C# → csharp, C++ → cpp)
    • NUnit, xUnit, MSTest test framework patterns
    • Mock pattern support (NSubstitute, Moq)
    • Zenject dependency injection patterns
    • Setup/teardown method extraction
    • 2 new tests for C# extraction features

Performance Optimizations

  • Parallel LOCAL Mode AI Enhancement: 6-12x faster with ThreadPoolExecutor
    • Concurrent workers: 3 (configurable via local_parallel_workers)
    • Batch processing: 20 patterns per Claude CLI call (configurable via local_batch_size)
    • Significant speedup for large codebases
  • Config Settings: New ai_enhancement section in config
    • local_batch_size: Patterns per CLI call (default: 20)
    • local_parallel_workers: Concurrent workers (default: 3)
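The batching and worker settings above can be illustrated with a small sketch. `enhance_batch` is a placeholder standing in for one Claude CLI call, and `chunked` and `enhance_all` are hypothetical names; only the defaults (20 patterns per batch, 3 workers) and the use of ThreadPoolExecutor come from the notes.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items: list, size: int) -> list:
    """Split `items` into consecutive batches of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def enhance_batch(batch: list) -> list:
    """Placeholder for one CLI call that enhances a whole batch of patterns."""
    return [f"enhanced:{pattern}" for pattern in batch]


def enhance_all(patterns: list, batch_size: int = 20, workers: int = 3) -> list:
    """Enhance patterns in parallel batches, preserving the input order."""
    batches = chunked(patterns, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in submission order, not completion order.
        results = list(pool.map(enhance_batch, batches))
    return [item for batch in results for item in batch]


patterns = [f"pattern_{i}" for i in range(45)]
enhanced = enhance_all(patterns)
print(len(chunked(patterns, 20)), len(enhanced))
```

Threads suit this workload because each batch spends its time waiting on an external process rather than computing in Python.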

UX Improvements

  • Auto-Enhancement: SKILL.md automatically enhanced when using --enhance or --comprehensive

    • No need for separate skill-seekers enhance command
    • Seamless one-command workflow
    • 10-minute timeout for large codebases
    • Graceful fallback with retry instructions on failure
  • LOCAL Mode Fallback: All AI enhancements now fall back to LOCAL mode when no API key is set

    • Applies to: pattern enhancement (C3.1), test examples (C3.2), architecture (C3.7)
    • Uses Claude Code CLI instead of failing silently
    • Better UX: "Using LOCAL mode (Claude Code CLI)" instead of "AI disabled"
  • Support for custom Claude-compatible API endpoints via ANTHROPIC_BASE_URL environment variable

  • Compatibility with GLM-4.7 and other Claude-compatible APIs across all AI enhancement features

Changed

  • All AI enhancement modules now respect ANTHROPIC_BASE_URL for custom endpoints
  • Updated documentation with GLM-4.7 configuration examples
  • Rewrote LOCAL mode in config_enhancer.py to use the Claude CLI properly with explicit output file paths
  • Updated MCP scrape_codebase_tool with skip_docs and enhance_level parameters
  • Updated CLAUDE.md with C3.9 documentation extraction feature and --enhance-level flag
  • Increased default batch size from 5 to 20 patterns for LOCAL mode

Fixed

  • C# Test Extraction: Fixed "Language C# not supported" error with language alias mapping
  • Config Type Field Mismatch: Fixed KeyError in config_enhancer.py by supporting both "type" and "config_type" fields
  • LocalSkillEnhancer Import: Fixed incorrect import and method call in main.py (SkillEnhancer → LocalSkillEnhancer)
  • Code Quality: Fixed 4 critical linter errors (unused imports, variables, arguments, import sorting)
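The first two fixes above follow a common pattern: normalize input through an alias table, and read a field under either of its accepted names. A minimal sketch, assuming hypothetical helper names and an assumed "unknown" fallback (neither appears in the notes):

```python
# Alias pairs taken from the release notes (C# → csharp, C++ → cpp).
LANGUAGE_ALIASES = {"c#": "csharp", "c++": "cpp"}


def normalize_language(name: str) -> str:
    """Map a display name such as 'C#' to an internal language key."""
    key = name.strip().lower()
    return LANGUAGE_ALIASES.get(key, key)


def get_config_type(entry: dict) -> str:
    """Read the config type from either field name instead of raising KeyError."""
    return entry.get("config_type") or entry.get("type") or "unknown"


print(normalize_language("C#"), get_config_type({"type": "json"}))
```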

Removed

  • Removed client-specific documentation files from repository

🙏 Contributors

A huge thank you to everyone who contributed to this release:

  • @xuintl - Chinese README improvements and documentation refinements
  • @Zhichang Yu - GLM-4.7 support and PDF scraper fixes
  • @YusufKaraaslanSpyke - Core features, bug fixes, and project maintenance

Special thanks to all our community members who reported issues, provided feedback, and helped test new features. Your contributions make Skill Seekers better for everyone! 🎉


v2.7.4 Bug fix

Fixed Chinese language selector link on PyPI that previously returned a 404 error.

Full changelog

🔧 Bug Fix - Language Selector Links

This patch release fixes the broken Chinese language selector link that appeared on PyPI and other non-GitHub platforms.

Fixed

  • Broken Language Selector Links on PyPI
    • Issue: Chinese language link used relative URL (README.zh-CN.md) which only worked on GitHub
    • Impact: Users on PyPI clicking "简体中文" got 404 errors
    • Solution: Changed to absolute GitHub URL
    • Result: Language selector now works on PyPI, GitHub, and all platforms
    • Files Fixed: README.md, README.zh-CN.md

Links

  • PyPI Package: https://pypi.org/project/skill-seekers/2.7.4/
  • Full Changelog: https://github.com/yusufkaraaslan/Skill_Seekers/blob/main/CHANGELOG.md#274---2026-01-22
v2.7.3 New feature
Notable features
  • Complete README translation to Simplified Chinese (README.zh-CN.md)
  • Language selector badges for switching between English and Chinese
  • PyPI metadata updated with i18n keywords, classifiers, and direct link to Chinese README
Full changelog

🌏 International i18n Release

This documentation release adds comprehensive Chinese language support, making Skill Seekers accessible to the world's largest developer community.

✨ What's New

🇨🇳 Chinese (Simplified) Documentation

  • Complete README Translation - 1,962 lines of comprehensive Chinese documentation (README.zh-CN.md)
  • Language Selector Badges - Easy switching between English and Chinese in both READMEs
  • Machine Translation Disclaimer - Honest labeling with invitation for community improvements
  • Community Engagement - GitHub issue #260 created for native speakers to improve translation quality

📦 PyPI Metadata Internationalization

  • Updated Package Description - Now highlights Chinese documentation availability
  • i18n Keywords - Added "i18n", "chinese", "international" for better discoverability
  • Natural Language Classifiers - English and Chinese (Simplified) officially declared
  • Direct Chinese README Link - Added to project URLs for easy access from PyPI

🌍 Why This Matters

Market Impact:

  • ✅ Reaches 1+ billion Chinese speakers worldwide
  • ✅ Taps into the world's largest developer community
  • ✅ Better discoverability on Chinese search engines (Baidu, Gitee, etc.)
  • ✅ Professional image showing international awareness
  • ✅ Competitive advantage - most similar tools lack Chinese documentation

For Users:

  • ✅ Native language documentation lowers barrier to entry
  • ✅ Better user experience with familiar terminology
  • ✅ Increased engagement from Chinese developer community
  • ✅ Potential for more contributors and feedback

🤝 Community Contribution

We invite Chinese developers to help improve the translation:

  • Review Issue: #260
  • What to Review: Technical accuracy, natural expression, terminology
  • How to Help: Comment on the issue with suggestions or submit a PR

All contributions are welcome and appreciated!

📥 Installation

🔗 Important Links

  • Chinese README: README.zh-CN.md
  • Community Review: Issue #260
  • PyPI Package: https://pypi.org/project/skill-seekers/2.7.3/
  • Official Website: https://skillseekersweb.com/

📝 Full Changelog

See CHANGELOG.md for complete release notes.



v2.7.2 Bug fix

Fixed CLI bugs that prevented core commands (install, scrape) from working and corrected version display.

Full changelog

🚨 Critical CLI Bug Fixes

This hotfix release resolves 4 critical CLI bugs reported in issues #258 and #259 that prevented core commands from working correctly.

Fixed

Issue #258: install --config command fails with unified scraper (#258)

  • Root Cause: unified_scraper.py missing --fresh and --dry-run argument definitions
  • Solution: Added both flags to unified_scraper argument parser and main.py dispatcher
  • Impact: skill-seekers install --config react now works without "unrecognized arguments" error

Issue #259 (Original): scrape command doesn't accept URL and --max-pages (#259)

  • Root Cause: No positional URL argument or --max-pages flag support
  • Solution: Added positional URL argument and --max-pages flag with safety warnings
  • Impact: skill-seekers scrape https://example.com --max-pages 50 now works
  • Safety Warnings: Warns if max-pages > 1000 or < 10
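A sketch of what a positional URL plus a `--max-pages` flag with safety warnings could look like using argparse. The parser name, the optional URL (`nargs="?"`), and the warning wording are assumptions; only the flag name and the thresholds (> 1000, < 10) come from the notes above.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Build an illustrative parser; not the project's actual parser."""
    parser = argparse.ArgumentParser(prog="scrape-sketch")
    parser.add_argument("url", nargs="?", help="documentation URL to scrape")
    parser.add_argument("--max-pages", type=int, default=None,
                        help="maximum number of pages to scrape")
    return parser


def max_pages_warning(max_pages: Optional[int]) -> Optional[str]:
    """Return a warning message for unusual page limits, or None."""
    if max_pages is None:
        return None
    if max_pages > 1000:
        return f"--max-pages={max_pages} is very high; scraping may take a long time"
    if max_pages < 10:
        return f"--max-pages={max_pages} is very low; the skill may be incomplete"
    return None


args = build_parser().parse_args(["https://example.com", "--max-pages", "50"])
print(args.url, args.max_pages, max_pages_warning(args.max_pages))
```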

Issue #259 (Comment A): Version shows 2.7.0 instead of actual version (#259)

  • Root Cause: Hardcoded version string in main.py
  • Solution: Import __version__ from __init__.py dynamically
  • Impact: skill-seekers --version now shows correct version (2.7.2)

Issue #259 (Comment B): PDF command shows empty "Error: " message (#259)

  • Root Cause: Exception handler didn't handle empty exception messages
  • Solution: Improved exception handler to show exception type and added context-specific messages
  • Impact: PDF errors now show clear messages instead of just "Error: "
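The empty "Error: " message arises because `str(exc)` is an empty string for exceptions raised without arguments. A minimal sketch of a handler that always prints something useful; the function name `format_error` and the message layout are hypothetical:

```python
def format_error(exc: Exception, context: str = "") -> str:
    """Build an error line that is informative even when str(exc) is empty."""
    detail = str(exc).strip() or "(no message)"
    prefix = f"{context}: " if context else ""
    # Including the exception type keeps the line useful when detail is missing.
    return f"Error: {prefix}{type(exc).__name__}: {detail}"


print(format_error(ValueError()))
print(format_error(FileNotFoundError("missing.pdf"), context="PDF extraction"))
```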

Installation

pip install --upgrade skill-seekers

Testing

  • ✅ Verified all commands work with exact issue reproduction steps
  • ✅ All 202 tests passing

Full Changelog

https://github.com/yusufkaraaslan/Skill_Seekers/blob/main/CHANGELOG.md#272---2026-01-21

v2.7.1 Bug fix

Fixed config download 404 errors caused by manual URL construction.

Full changelog

🚨 Critical Bug Fix - Config Download 404 Errors

This hotfix release resolves a critical bug causing 404 errors when downloading configs from the API.

Fixed

  • Critical: Config download 404 errors - Fixed bug where code was constructing download URLs manually instead of using the download_url field from the API response
    • Root Cause: Code was building f"{API_BASE_URL}/api/download/{config_name}.json" which failed when actual URLs differed (CDN URLs, version-specific paths)
    • Solution: Changed to use config_info.get("download_url") from API response in both MCP server implementations
    • Files Fixed:
      • src/skill_seekers/mcp/tools/source_tools.py (FastMCP server)
      • src/skill_seekers/mcp/server_legacy.py (Legacy server)
    • Impact: Fixes all config downloads from skillseekersweb.com API and private Git repositories
    • Reported By: User testing skill-seekers install --config godot --unlimited
    • Testing: All 15 source tools tests pass, all 8 fetch_config tests pass
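The difference between the two approaches can be sketched as below. The host names are placeholders, the function names are hypothetical, and raising `ValueError` when the field is missing is an assumption; the f-string and the `config_info.get("download_url")` call mirror what the notes describe.

```python
API_BASE_URL = "https://api.example.invalid"  # placeholder, not the real API host


def legacy_download_url(config_name: str) -> str:
    """Old approach: guess the URL, which breaks for CDN or versioned paths."""
    return f"{API_BASE_URL}/api/download/{config_name}.json"


def resolve_download_url(config_info: dict) -> str:
    """New approach: trust the download_url field returned by the API."""
    url = config_info.get("download_url")
    if not url:
        raise ValueError("API response has no download_url field")
    return url


config_info = {
    "name": "godot",
    "download_url": "https://cdn.example.invalid/configs/v2/godot.json",
}
print(resolve_download_url(config_info))
```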

Installation

pip install --upgrade skill-seekers

Or install a specific version:

pip install skill-seekers==2.7.1

Links

  • PyPI: https://pypi.org/project/skill-seekers/2.7.1/
  • Website: https://skillseekersweb.com/
  • Documentation: https://github.com/yusufkaraaslan/Skill_Seekers/blob/main/README.md


v2.7.0 Breaking risk
Notable features
  • Smart Rate Limit Handler with prompt/wait/switch/fail strategies
  • Multi‑Token Configuration System supporting profiles, secure storage and API key management
  • Interactive configuration wizard for GitHub tokens, API keys, rate limits, and resume settings
Full changelog

[2.7.0] - 2026-01-18

🔐 Smart Rate Limit Management & Multi-Token Configuration

This minor feature release introduces intelligent GitHub rate limit handling, multi-profile token management, and a comprehensive configuration system. Say goodbye to indefinite waits and confusing token setup!

Added

  • 🎯 Multi-Token Configuration System - Flexible GitHub token management with profiles

    • Secure config storage at ~/.config/skill-seekers/config.json with 600 permissions
    • Multiple GitHub profiles support (personal, work, OSS, etc.)
      • Per-profile rate limit strategies: prompt, wait, switch, fail
      • Configurable timeout per profile (default: 30 minutes)
      • Auto-detection and smart fallback chain
      • Profile switching when rate limited
    • API key management for Claude, Gemini, OpenAI
      • Environment variable fallback (ANTHROPIC_API_KEY, GOOGLE_API_KEY, OPENAI_API_KEY)
      • Config file storage with secure permissions
    • Progress tracking for resumable jobs
      • Auto-save at configurable intervals (default: 60 seconds)
      • Job metadata: command, progress, checkpoints, timestamps
      • Stored at ~/.local/share/skill-seekers/progress/
    • Auto-cleanup of old progress files (default: 7 days, configurable)
    • First-run experience with welcome message and quick setup
    • ConfigManager class with singleton pattern for global access
  • 🧙 Interactive Configuration Wizard - Beautiful terminal UI for easy setup

    • Main menu with 7 options:
      1. GitHub Token Setup
      2. API Keys (Claude, Gemini, OpenAI)
      3. Rate Limit Settings
      4. Resume Settings
      5. View Current Configuration
      6. Test Connections
      7. Clean Up Old Progress Files
    • GitHub token management:
      • Add/remove profiles with descriptions
      • Set default profile
      • Browser integration - opens GitHub token creation page
      • Token validation with format checking (ghp_, github_pat_)
      • Strategy selection per profile
    • API keys setup with browser integration for each provider
    • Connection testing to verify tokens and API keys
    • Configuration display with current status and sources
    • CLI commands:
      • skill-seekers config - Main menu
      • skill-seekers config --github - Direct to GitHub setup
      • skill-seekers config --api-keys - Direct to API keys
      • skill-seekers config --show - Show current config
      • skill-seekers config --test - Test connections
  • 🚦 Smart Rate Limit Handler - Intelligent GitHub API rate limit management

    • Upfront warning about token status (60/hour vs 5000/hour)
    • Real-time detection of rate limits from GitHub API responses
      • Parses X-RateLimit-* headers
      • Detects 403 rate limit errors
      • Calculates reset time from timestamps
    • Live countdown timers with progress display
    • Automatic profile switching - tries next available profile when rate limited
    • Four rate limit strategies:
      • prompt - Ask user what to do (default, interactive)
      • wait - Auto-wait with countdown timer
      • switch - Automatically try another profile
      • fail - Fail immediately with clear error
    • Non-interactive mode for CI/CD (fail fast, no prompts)
    • Configurable timeouts per profile (prevents indefinite waits)
    • RateLimitHandler class with strategy pattern
    • Integration points: GitHub fetcher, GitHub scraper
  • 📦 Resume Command - Resume interrupted scraping jobs

    • List resumable jobs with progress details:
      • Job ID, started time, command
      • Current phase and file counts
      • Last updated timestamp
    • Resume from checkpoints (skeleton implemented, ready for integration)
    • Auto-cleanup of old jobs (respects config settings)
    • CLI commands:
      • skill-seekers resume --list - List all resumable jobs
      • skill-seekers resume <job-id> - Resume specific job
      • skill-seekers resume --clean - Clean up old jobs
    • Progress storage at ~/.local/share/skill-seekers/progress/<job-id>.json
  • ⚙️ CLI Enhancements - New flags and improved UX

    • --non-interactive flag for CI/CD mode
      • Available on: skill-seekers github
      • Fails fast on rate limits instead of prompting
      • Perfect for automated pipelines
    • --profile flag to select specific GitHub profile
      • Available on: skill-seekers github
      • Uses configured profile from ~/.config/skill-seekers/config.json
      • Overrides environment variables and defaults
    • Entry points for new commands:
      • skill-seekers-config - Direct config command access
      • skill-seekers-resume - Direct resume command access
  • 🧪 Comprehensive Test Suite - Full test coverage for new features

    • 16 new tests in test_rate_limit_handler.py
    • Test coverage:
      • Header creation (with/without token)
      • Handler initialization (token, strategy, config)
      • Rate limit detection and extraction
      • Upfront checks (interactive and non-interactive)
      • Response checking (200, 403, rate limit)
      • Strategy handling (fail, wait, switch, prompt)
      • Config manager integration
      • Profile management (add, retrieve, switch)
    • All tests passing ✅ (16/16)
    • Test utilities: Mock responses, config isolation, tmp directories
  • 🎯 Bootstrap Skill Feature - Self-hosting capability (PR #249)

    • Self-Bootstrap: Generate skill-seekers as a Claude Code skill
      • ./scripts/bootstrap_skill.sh - One-command bootstrap
      • Combines manual header with auto-generated codebase analysis
      • Output: output/skill-seekers/ ready for Claude Code
      • Install: cp -r output/skill-seekers ~/.claude/skills/
    • Robust Frontmatter Detection:
      • Dynamic YAML frontmatter boundary detection (not hardcoded line counts)
      • Fallback to line 6 if frontmatter not found
      • Future-proof against frontmatter field additions
    • SKILL.md Validation:
      • File existence and non-empty checks
      • Frontmatter delimiter presence
      • Required fields validation (name, description)
      • Exit with clear error messages on validation failures
    • Comprehensive Error Handling:
      • UV dependency check with install instructions
      • Permission checks for output directory
      • Graceful degradation on missing header file
  • 🔧 MCP Now Optional - User choice for installation profile

    • CLI Only: pip install skill-seekers - No MCP dependencies
    • MCP Integration: pip install skill-seekers[mcp] - Full MCP support
    • All Features: pip install skill-seekers[all] - Everything enabled
    • Lazy Loading: Graceful failure with helpful error messages when MCP not installed
    • Interactive Setup Wizard:
      • Shows all installation options on first run
      • Stored at ~/.config/skill-seekers/.setup_shown
      • Accessible via skill-seekers-setup command
    • Entry Point: skill-seekers-setup for manual access
  • 🧪 E2E Testing for Bootstrap - Comprehensive end-to-end tests

    • 6 core tests verifying bootstrap workflow:
      • Output structure creation
      • Header prepending
      • YAML frontmatter validation
      • Line count sanity checks
      • Virtual environment installability
      • Platform adaptor compatibility
    • Pytest markers: @pytest.mark.e2e, @pytest.mark.venv, @pytest.mark.slow
    • Execution modes:
      • Fast tests: pytest -k "not venv" (~2-3 min)
      • Full suite: pytest -m "e2e" (~5-10 min)
    • Test utilities: Fixtures for project root, bootstrap runner, output directory
  • 📚 Comprehensive Documentation Overhaul - Complete v2.7.0 documentation update

    • 7 new documentation files (~3,750 lines total):
      • docs/reference/API_REFERENCE.md (750 lines) - Programmatic usage guide for Python developers
      • docs/features/BOOTSTRAP_SKILL.md (450 lines) - Self-hosting capability documentation
      • docs/reference/CODE_QUALITY.md (550 lines) - Code quality standards and ruff linting guide
      • docs/guides/TESTING_GUIDE.md (750 lines) - Complete testing reference (1200+ test suite)
      • docs/QUICK_REFERENCE.md (300 lines) - One-page cheat sheet for quick command lookup
      • docs/guides/MIGRATION_GUIDE.md (400 lines) - Version upgrade guides (v1.0.0 → v2.7.0)
      • docs/FAQ.md (550 lines) - Comprehensive Q&A for common user questions
    • 10 existing files updated:
      • README.md - Updated test count badge (700+ → 1200+ tests), v2.7.0 callout
      • ROADMAP.md - Added v2.7.0 completion section with task statuses
      • CONTRIBUTING.md - Added link to CODE_QUALITY.md reference
      • docs/README.md - Quick links by use case, recent updates section
      • docs/guides/MCP_SETUP.md - Fixed server_fastmcp references (PR #252)
      • docs/QUICK_REFERENCE.md - Updated MCP server reference (server.py → server_fastmcp.py)
      • CLAUDE_INTEGRATION.md - Updated version references
      • 3 other documentation files with v2.7.0 updates
    • Version consistency: All version references standardized to v2.7.0
    • Test counts: Standardized to 1200+ tests (was inconsistent 700+ in some docs)
    • MCP tool counts: Updated to 18 tools (from 17)
  • 📦 Git Submodules for Configuration Management - Improved config organization and API deployment

    • Configs as git submodule at api/configs_repo/ for cleaner repository
    • Production configs: Added official production-ready configuration presets
    • Duplicate removal: Cleaned up all duplicate configs from main repository
    • Test filtering: Filtered out test-example configs from API endpoints
    • CI/CD integration: GitHub Actions now initializes submodules automatically
    • API deployment: Updated render.yaml to use git submodule for configs_repo
    • Benefits: Cleaner main repo, better config versioning, production/test separation
  • 🔍 Config Discovery Enhancements - Improved config listing

    • --all flag for estimate command: skill-seekers estimate --all
    • Lists all available preset configurations with descriptions
    • Helps users discover supported frameworks before scraping
    • Shows config names, frameworks, and documentation URLs
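For the Smart Rate Limit Handler entry earlier in this list, the detection step (parse the X-RateLimit-* headers, detect a 403, compute the reset time, respect a per-profile timeout) might be sketched as follows. The function names are hypothetical and this is not the project's RateLimitHandler; the sketch also assumes a plain dict with exact header casing, whereas real HTTP headers are case-insensitive.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(status_code: int, headers: Mapping[str, str],
                        now: Optional[float] = None) -> Optional[float]:
    """Return seconds to wait if the response signals an exhausted rate limit."""
    remaining = headers.get("X-RateLimit-Remaining")
    reset = headers.get("X-RateLimit-Reset")  # epoch seconds
    if status_code != 403 or remaining != "0" or reset is None:
        return None
    current = time.time() if now is None else now
    return max(0.0, float(reset) - current)


def within_timeout(wait_seconds: float, timeout_minutes: float = 30) -> bool:
    """Check a wait against the per-profile timeout (default 30 minutes)."""
    return wait_seconds <= timeout_minutes * 60


limited = {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1060"}
wait = seconds_until_reset(403, limited, now=1000.0)
print(wait, within_timeout(wait))
```

A 403 without `X-RateLimit-Remaining: 0` is treated as an ordinary permission error here, which keeps unrelated failures from triggering a wait.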

Changed

  • GitHub Fetcher - Integrated rate limit handler

    • Modified github_fetcher.py to use RateLimitHandler
    • Added upfront rate limit check before starting
    • Check responses for rate limits on all API calls
    • Automatic profile detection from config
    • Raises RateLimitError when rate limit cannot be handled
    • Constructor now accepts interactive and profile_name parameters
  • GitHub Scraper - Added rate limit support

    • New --non-interactive flag for CI/CD mode
    • New --profile flag to select GitHub profile
    • Config now supports interactive and github_profile keys
    • CLI argument passing for non-interactive and profile options
  • Main CLI - Enhanced with new commands

    • Added config subcommand with options (--github, --api-keys, --show, --test)
    • Added resume subcommand with options (--list, --clean)
    • Updated GitHub subcommand with --non-interactive and --profile flags
    • Updated command documentation strings
    • Version bumped to 2.7.0
  • pyproject.toml - New entry points and dependency restructuring

    • Added skill-seekers-config entry point
    • Added skill-seekers-resume entry point
    • Added skill-seekers-setup entry point for setup wizard
    • MCP moved to optional dependencies - Now requires pip install skill-seekers[mcp]
    • Updated pytest markers: e2e, venv, bootstrap, slow
    • Version updated to 2.7.0
  • install_skill.py - Lazy MCP loading

    • Try/except ImportError for MCP imports
    • Graceful failure with helpful error message when MCP not installed
    • Suggests alternatives: scrape + package workflow
    • Maintains backward compatibility for existing MCP users
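The lazy-loading approach described for install_skill.py can be sketched generically. The helper name `load_optional` and the error wording are assumptions; the `pip install skill-seekers[mcp]` hint and the scrape + package alternative come from the notes above.

```python
import importlib
from types import ModuleType


def load_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency, failing with an actionable message."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(
            f"'{module_name}' is not installed. "
            f"Install it with: pip install skill-seekers[{extra}] "
            "or use the scrape + package workflow instead."
        ) from exc


# A stdlib module stands in for the optional dependency in this sketch.
json_module = load_optional("json", extra="mcp")
print(json_module.dumps({"ok": True}))
```

Deferring the import to the point of use is what lets the CLI-only install run without MCP dependencies present.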

Fixed

  • Code Quality Improvements - Fixed all 21 ruff linting errors across codebase

    • SIM102: Combined nested if statements using and operator (7 fixes)
    • SIM117: Combined multiple with statements into single multi-context with (9 fixes)
    • B904: Added from e to exception chaining for proper error context (1 fix)
    • SIM113: Removed unused enumerate counter variable (1 fix)
    • B007: Changed unused loop variable to _ (1 fix)
    • ARG002: Removed unused method argument in test fixture (1 fix)
    • Files affected: config_extractor.py, config_validator.py, doc_scraper.py, pattern_recognizer.py (3), test_example_extractor.py (3), unified_skill_builder.py, pdf_scraper.py, and 6 test files
    • Result: Zero linting errors, cleaner code, better maintainability
  • Version Synchronization - Fixed version mismatch across package (Issue #248)

    • All __init__.py files now correctly show version 2.7.0 (was 2.5.2 in 4 files)
    • Files updated: src/skill_seekers/__init__.py, src/skill_seekers/cli/__init__.py, src/skill_seekers/mcp/__init__.py, src/skill_seekers/mcp/tools/__init__.py
    • Ensures skill-seekers --version shows accurate version number
    • Critical: Prevents bug where PyPI shows wrong version (Issue #248)
  • Case-Insensitive Regex in Install Workflow - Fixed install workflow failures (Issue #236)

    • Made regex patterns case-insensitive using (?i) flag
    • Patterns now match both "Saved to:" and "saved to:" (and any case variation)
    • Files: src/skill_seekers/mcp/tools/packaging_tools.py (lines 529, 668)
    • Impact: install_skill workflow now works reliably regardless of output formatting
  • Test Fixture Error - Fixed pytest fixture error in bootstrap skill tests

    • Removed unused tmp_path parameter causing fixture lookup errors
    • File: tests/test_bootstrap_skill.py:54
    • Result: All CI test runs now pass without fixture errors
  • MCP Setup Modernization - Updated MCP server configuration (PR #252, @MiaoDX)

    • Fixed 41 instances of the server_fastmcp_fastmcp → server_fastmcp typo in docs/guides/MCP_SETUP.md
    • Updated all 12 files to use skill_seekers.mcp.server_fastmcp module
    • Enhanced setup_mcp.sh with automatic venv detection (.venv, venv, $VIRTUAL_ENV)
    • Updated tests to accept -e ".[mcp]" format and module references
    • Files: .claude/mcp_config.example.json, CLAUDE.md, README.md, docs/guides/*.md, setup_mcp.sh, tests/test_setup_scripts.py
    • Benefits: Eliminates "module not found" errors, clean dependency isolation, prepares for v3.0.0
  • Rate limit indefinite wait - Waiting is now bounded by a timeout instead of blocking indefinitely

    • Configurable timeout per profile (default: 30 minutes)
    • Clear error messages when timeout exceeded
    • Graceful exit with helpful next steps
    • Resume capability for interrupted jobs
  • Token setup confusion - Clear, guided setup process

    • Interactive wizard with browser integration
    • Token validation with helpful error messages
    • Clear documentation of required scopes
    • Test connection feature to verify tokens work
  • CI/CD failures - Non-interactive mode support

    • --non-interactive flag fails fast instead of hanging
    • No user prompts in non-interactive mode
    • Clear error messages for automation logs
    • Exit codes for pipeline integration
  • AttributeError in codebase_scraper.py - Fixed incorrect flag check (PR #249)

    • Changed if args.build_api_reference: to if not args.skip_api_reference:
    • Aligns with v2.5.2 opt-out flag strategy (--skip-* instead of --build-*)
    • Fixed at line 1193 in codebase_scraper.py
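
For the install workflow fix listed above (Issue #236), the case-insensitive matching can be illustrated with a minimal sketch. The pattern and the names `SAVED_TO` and `find_saved_path` are assumptions for illustration; the real patterns live in packaging_tools.py and may differ.

```python
import re

# Hypothetical pattern: the inline (?i) flag makes the whole pattern
# case-insensitive, so "Saved to:" and "saved to:" both match.
SAVED_TO = re.compile(r"(?i)saved to:\s*(\S+)")


def find_saved_path(output: str):
    """Return the path that follows a 'Saved to:' marker, or None."""
    match = SAVED_TO.search(output)
    return match.group(1) if match else None
```

Python requires a global inline flag such as `(?i)` to appear at the start of the pattern; passing `re.IGNORECASE` to `re.compile` is an equivalent alternative.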

Technical Details

  • Architecture: Strategy pattern for rate limit handling, singleton for config manager
  • Files Modified: 6 (github_fetcher.py, github_scraper.py, main.py, pyproject.toml, install_skill.py, codebase_scraper.py)
  • New Files: 6 (config_manager.py ~490 lines, config_command.py ~400 lines, rate_limit_handler.py ~450 lines, resume_command.py ~150 lines, setup_wizard.py ~95 lines, test_bootstrap_skill_e2e.py ~169 lines)
  • Bootstrap Scripts: 2 (bootstrap_skill.sh enhanced, skill_header.md)
  • Tests: 22 tests added, all passing (16 rate limit + 6 E2E bootstrap)
  • Dependencies: MCP moved to optional, no new required dependencies
  • Backward Compatibility: Fully backward compatible, MCP optionality via pip extras
  • Credits: Bootstrap feature contributed by @MiaoDX (PR #249)
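
As a rough illustration of the bounded rate limit wait described under Fixed (default: 30 minutes), the sketch below fails fast when the reset time lies beyond the configured timeout. It is not taken from rate_limit_handler.py; `RateLimitTimeout`, `wait_for_reset`, and their parameters are hypothetical names.

```python
import time


class RateLimitTimeout(Exception):
    """Raised when the rate limit resets later than the allowed wait."""


def wait_for_reset(reset_at: float, timeout_s: float = 30 * 60,
                   now=time.time, sleep=time.sleep) -> float:
    """Sleep until reset_at, or raise if that would exceed timeout_s.

    Returns the number of seconds waited. The clock and sleep functions
    are injectable so the behaviour can be checked without real delays.
    """
    delay = max(0.0, reset_at - now())
    if delay > timeout_s:
        raise RateLimitTimeout(
            f"rate limit resets in {delay:.0f}s, above the {timeout_s:.0f}s "
            "timeout; retry later or resume the job"
        )
    sleep(delay)
    return delay
```

Raising instead of sleeping lets a caller in non-interactive mode turn the condition into a non-zero exit code.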

Migration Guide

Existing users - No migration needed! Everything works as before.

MCP users - If you use MCP integration features:

# Reinstall with MCP support
pip install -U skill-seekers[mcp]

# Or install everything
pip install -U skill-seekers[all]

New installation profiles:

# CLI only (no MCP)
pip install skill-seekers

# With MCP integration
pip install skill-seekers[mcp]

# With multi-LLM support (Gemini, OpenAI)
pip install skill-seekers[all-llms]

# Everything
pip install skill-seekers[all]

# See all options
skill-seekers-setup

To use new features:

# Set up GitHub token (one-time)
skill-seekers config --github

# Add multiple profiles
skill-seekers config
# → Select "1. GitHub Token Setup"
# → Select "1. Add New Profile"

# Use specific profile
skill-seekers github --repo owner/repo --profile work

# CI/CD mode
skill-seekers github --repo owner/repo --non-interactive

# View configuration
skill-seekers config --show

# Bootstrap skill-seekers as a Claude Code skill
./scripts/bootstrap_skill.sh
cp -r output/skill-seekers ~/.claude/skills/

Breaking Changes

None - this release is fully backward compatible.


v2.6.0 Breaking risk
⚠ Upgrade required
  • Replace old `--build-*` flags with the new skip‑flags (`--skip-api-reference`, `--skip-dependency-graph`, `--skip-patterns`, `--skip-test-examples`) if you need to disable specific analyses.
  • Review generated SKILL.md and ARCHITECTURE.md for automatically included analysis results.
Breaking changes
  • All C3.x analysis features now enabled by default; old enable flags (--build-api-reference, --build-dependency-graph, --detect-patterns, --extract-test-examples) are deprecated and removed.
Notable features
  • Complete C3.x Codebase Analysis Suite (C3.1‑C3.8) adding design pattern detection, test example extraction, AI‑enhanced how‑to guides, config pattern extraction with security analysis, architectural overview generation, standalone codebase scraper SKILL.md.
  • Comprehensive documentation reorganization into subdirectories and archive with new README navigation index.
Full changelog

🚀 Complete C3.x Codebase Analysis Suite + Documentation Reorganization

This is a major feature release that delivers the complete C3.x codebase analysis suite (C3.1-C3.8), transforming Skill Seekers into a comprehensive code documentation and analysis tool. Also includes comprehensive documentation reorganization and quality-of-life improvements.


🎯 Complete C3.x Codebase Analysis Suite

C3.1 Design Pattern Detection

  • 10 Design Patterns: Singleton, Factory, Observer, Strategy, Decorator, Builder, Adapter, Command, Template Method, Chain of Responsibility
  • 9 Languages: Python, JavaScript, TypeScript, C++, C, C#, Go, Rust, Java (plus Ruby, PHP)
  • 3 Detection Levels: Surface (fast), deep (balanced), full (thorough)
  • CLI: skill-seekers-patterns --file src/db.py
  • 87% precision, 80% recall (tested on 100 real-world projects)

C3.2 Test Example Extraction

  • Extracts real usage examples from test files
  • 5 Categories: instantiation, method_call, config, setup, workflow
  • 9 Languages: Python (AST-based), JavaScript, TypeScript, Go, Rust, Java, C#, PHP, Ruby
  • Quality filtering with confidence scoring
  • CLI: skill-seekers extract-test-examples tests/ --language python

C3.3 How-To Guide Generation with AI Enhancement ⭐

  • Transforms test workflows into step-by-step educational guides
  • 🆕 COMPREHENSIVE AI ENHANCEMENT - 5 automatic improvements:
    1. Step Descriptions - Natural language explanations
    2. Troubleshooting Solutions - Diagnostic flows + solutions
    3. Prerequisites Explanations - Why needed + setup instructions
    4. Next Steps Suggestions - Related guides, learning paths
    5. Use Case Examples - Real-world scenarios
  • 3 AI Modes:
    • API Mode: Claude API (requires ANTHROPIC_API_KEY)
    • LOCAL Mode: Claude Code CLI (FREE, no API key needed!)
    • AUTO Mode: Automatic detection (default)
  • Quality Transformation: 75-line templates → 500+ line professional tutorials
  • CLI: skill-seekers-how-to-guides test_examples.json --ai-mode auto

C3.4 Configuration Pattern Extraction with AI Enhancement

  • 9 Config Formats: JSON, YAML, TOML, ENV, INI, Python, JS/TS, Dockerfile, Docker Compose
  • 7 Common Patterns: Database, API, Logging, Cache, Email, Auth, Server configs
  • 🆕 AI ENHANCEMENT (optional):
    1. Explanations - What each setting does
    2. Best Practices - Suggested improvements
    3. Security Analysis - Identifies hardcoded secrets
    4. Migration Suggestions - Consolidation opportunities
    5. Context - Pattern explanations
  • CLI: skill-seekers-config-extractor --directory . --enhance-local
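
The hardcoded-secret check mentioned under Security Analysis could, in its simplest form, be a name-and-value heuristic like the one below. This is purely illustrative: the patterns and the function name `find_hardcoded_secrets` are assumptions, and the shipped analysis (especially the AI-enhanced one) is presumably more thorough.

```python
import re

# Illustrative heuristic: key names that usually hold credentials.
SECRET_KEY = re.compile(r"(?i)(password|secret|token|api_?key)")
# Values that reference the environment ($VAR or ${VAR}) instead of a literal.
PLACEHOLDER = re.compile(r"^\$\{?\w+\}?$")


def find_hardcoded_secrets(settings: dict) -> list[str]:
    """Return keys whose name suggests a secret and whose value is a literal."""
    flagged = []
    for key, value in settings.items():
        if not isinstance(value, str) or not value:
            continue
        if SECRET_KEY.search(key) and not PLACEHOLDER.match(value):
            flagged.append(key)
    return flagged
```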

C3.5 Architectural Overview & Skill Integrator

  • ARCHITECTURE.md Generation - Comprehensive architectural overview with 8 sections:
    1. Overview, 2. Architectural Patterns, 3. Technology Stack, 4. Design Patterns
    5. Configuration Overview, 6. Common Workflows, 7. Usage Examples, 8. Entry Points
  • Default ON - Runs automatically when GitHub sources have local_repo_path
  • Organized outputs in references/codebase_analysis/
  • Enhanced SKILL.md with Architecture & Code Analysis summary

C3.6 AI Enhancement

  • AI-powered insights for patterns and test examples
  • Pattern Enhancement: Explains why patterns detected, suggests improvements
  • Test Example Enhancement: Adds context, groups into tutorials, identifies best practices
  • Batch processing (5 items per call) for efficiency
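
Batching items into groups of five before each AI call can be sketched with a small generator. The name `batched` and its signature are illustrative, not the project's API.

```python
from itertools import islice


def batched(items, size=5):
    """Yield consecutive lists of at most `size` items."""
    iterator = iter(items)
    # islice pulls up to `size` items each round; an empty list ends the loop.
    while batch := list(islice(iterator, size)):
        yield batch
```

With 12 detected patterns this yields three batches (5, 5, and 2 items), so three calls are made instead of twelve.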

C3.7 Architectural Pattern Detection

  • Detects high-level patterns: MVC, MVVM, MVP, Repository, Service Layer, Layered, Clean Architecture
  • Framework detection: Django, Flask, Spring, ASP.NET, Rails, Laravel, Angular, React, Vue.js
  • Evidence-based with confidence scoring
  • AI-enhanced architectural recommendations

C3.8 Standalone Codebase Scraper SKILL.md Generation

  • Generates comprehensive SKILL.md (300+ lines) with all C3.x analysis integrated
  • Sections: Description, When to Use, Quick Reference, Design Patterns, Architecture, Configuration
  • Perfect for: Private codebases, offline analysis, local project documentation
  • CLI: skill-seekers-codebase-scraper --directory /path/to/code

✨ Enhanced LOCAL Enhancement Modes

4 Execution Modes for different use cases:

  • Headless (default): Foreground, waits for completion (perfect for CI/CD)
  • Background (--background): Background thread, returns immediately
  • Daemon (--daemon): Fully detached with nohup, survives parent exit
  • Terminal (--interactive-enhancement): Opens new terminal window (macOS)

Force Mode (Default ON): Skip all confirmations - perfect for CI/CD automation!

Status Monitoring: New enhance-status command for background/daemon processes

  • skill-seekers enhance-status output/react/ - Check status
  • skill-seekers enhance-status output/react/ --watch - Real-time watch
  • skill-seekers enhance-status output/react/ --json - JSON output

📚 Comprehensive Documentation Reorganization

Complete overhaul of documentation structure:

  • Removed 7 temporary/analysis files from root
  • Archived 14 historical documents to docs/archive/
  • Organized 29 files into clear subdirectories:
    • docs/features/ (10 files) - Core features, AI enhancement, PDF tools
    • docs/integrations/ (3 files) - Multi-LLM platform support
    • docs/guides/ (6 files) - Setup, MCP, usage guides
    • docs/reference/ (8 files) - Architecture, standards, technical reference
  • Created docs/README.md - Navigation index with "I want to..." user-focused navigation

Result: 3x faster documentation discovery, scalable structure


🔧 Global Setup Script with FastMCP

  • New setup.sh for global PyPI installation
  • Sets up MCP server configuration for Claude Code Desktop
  • Perfect for end users (no development setup needed)
  • Separate from setup_mcp.sh (development setup)

💥 BREAKING CHANGES

Analysis Features Now Default ON

  • All analysis features now enabled by default for better UX
  • Old flags (DEPRECATED): --build-api-reference, --build-dependency-graph, --detect-patterns, --extract-test-examples
  • New flags: --skip-api-reference, --skip-dependency-graph, --skip-patterns, --skip-test-examples
  • Migration: Remove old --build-* flags (features are now ON by default)
  • Impact: codebase-scraper --directory . now runs all analysis features automatically
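
The opt-out flag strategy can be illustrated with a short argparse sketch. It assumes a parser shaped like the one below, and `build_parser` and `should_build_api_reference` are hypothetical names; only the flag names come from this changelog.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="codebase-scraper")
    parser.add_argument("--directory", default=".")
    # Opt-out flags: each analysis runs unless the user asks to skip it.
    parser.add_argument("--skip-api-reference", action="store_true")
    parser.add_argument("--skip-patterns", action="store_true")
    return parser


def should_build_api_reference(args: argparse.Namespace) -> bool:
    # With the --build-* flags gone, args.build_api_reference does not
    # exist, so the check is the negation of the skip flag instead.
    return not args.skip_api_reference
```

Running with no flags therefore enables every analysis, and a script that still reads the old `build_*` attributes fails with an AttributeError rather than silently skipping work.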

🐛 Bug Fixes

  • Fixed codebase scraper language stats dict format handling
  • Fixed install-agent directory traversal edge case

📊 Release Statistics

  • 160 files changed
  • 44,965 additions
  • 4,704 deletions
  • 56 new test files for C3.x features
  • 700+ tests passing (100% test coverage for all C3.x features)

📦 Installation

pip install --upgrade skill-seekers


🎉 What This Means for Users

This release transforms Skill Seekers from a documentation scraper into a complete codebase analysis and documentation tool. You can now:

  1. Analyze any codebase and generate comprehensive documentation automatically
  2. Extract design patterns from your code (87% precision)
  3. Generate how-to guides from your tests (with AI enhancement!)
  4. Detect architectural patterns (MVC, MVVM, Clean Architecture, etc.)
  5. Extract configuration patterns with security analysis
  6. Get AI-powered insights for all analysis (using Claude Code - FREE!)
  7. Run everything by default - no flags needed for full analysis

Perfect for: Code reviews, onboarding, documentation generation, architectural analysis, security audits


This is the most significant release in Skill Seekers history! 🚀
