Release history
skill-seekers/Skill_Seekers releases
Transform 17 source types (docs, GitHub repos, PDFs, videos, Jupyter, Confluence, Notion, Slack/Discord) into AI-ready skills and RAG knowledge. 35 MCP tools for scraping, packaging, enhancing, and exporting to vector databases (Weaviate, Chroma, FAISS, Qdrant). Supports 16+ target platforms.
- IBM Bob packaging target via `--target bob`
- GitHub scraper filters: issue state, labels, and since date
- Per-issue Markdown files for GitHub issues
Full changelog
[3.6.0] - 2026-05-03
Theme: Quality-of-life release — packaging targets, GitHub issue workflow, codebase analysis fixes, and source detection hardening.
Added
- IBM Bob packaging target: new `--target bob` adaptor and agent install support for IBM's Bob agent platform (#366)
- GitHub issue filtering: `--github-issue-state`, `--github-issue-labels`, and `--github-issue-since` filters in the GitHub scraper for narrowing which issues are pulled (#367)
- Per-issue files: GitHub scraper now writes one Markdown file per issue instead of a single bundle, improving navigation and downstream chunking (#367)
- Pinecone frontmatter: Pinecone vector exports now include consistent YAML frontmatter for metadata round-tripping (#367)
Fixed
- Unified scraper now generates `codebase_analysis/` index: local sources were producing C3.x outputs with broken SKILL.md links; the unified skill builder now wires up the index and resolves links correctly (#362, #376)
- Guides fallback fires correctly: `unified_skill_builder` was emitting a truthy placeholder for empty guides which suppressed the fallback content; placeholder removed (#364, #375)
- HTML URLs no longer treated as local files: `source_detector` now checks for `http(s)://` before falling through to the local-path branch, fixing false-positive routing (#373)
- PDF extracted images appear in markdown: `pdf_scraper` now inserts references for images extracted from PDFs so they render in the generated SKILL.md (#369)
- C3.x output for local sources: `unified` command was skipping the C3.x analysis pipeline for local codebase sources; now emits the full pattern/test/guide/config/router output (#363, #372)
- Language filter passed to C3.x clone analysis: repos cloned for analysis now respect `--languages` instead of analyzing every file (fixes #361, #370)
- Unity vs Unreal detection: Unity projects with C# imports were being misidentified as Unreal; detection now keys on C# import patterns (fixes #365, #368)
- `max_pages` default changed from 500 to -1 (unlimited)
- Hardcoded magic numbers removed from `constants.py`, which now reads `defaults.json`
- Centralized `defaults.json` config as single source of truth for all default values
- Low‑signal code snippet filtering via `_is_low_signal_code_snippet()`
- Pattern description normalization with `_normalize_pattern_description()`
Full changelog
[3.5.1] - 2026-04-12
Added
- Centralized `defaults.json` config: single source of truth for all default values (`rate_limit`, `max_pages`, `workers`, `async_mode`, enhancement, analysis, RAG settings). New `defaults.py` loader module. All 15+ files that previously hardcoded defaults now read from this file (#356)
- Low-signal code snippet filtering: `_is_low_signal_code_snippet()` filters junk patterns like bare `True`, `options`, single identifiers from quick references (#360)
- Pattern description normalization: `_normalize_pattern_description()` cleans boilerplate prefixes and truncates to first meaningful sentence (#360)
- Example language priority ranking: `_example_language_priority()` ranks Python > Bash > JSON > etc. for SKILL.md examples (#360)
- `checkpoint_exists()` method on `DocToSkillConverter`: was called but never defined (#360)
- Unified config source normalization: `DocToSkillConverter.__init__` merges fields from `sources[0]` into flat config for compatibility (#360)
- `display_name` support in SKILL.md generation: produces cleaner titles and slugs (#360)
- New tests: `test_doc_scraper_entrypoint.py` (regression for `_run_scraping`), quick-reference quality tests, docs-only compatibility tests, nested reference coverage tests (#360)
Changed
- `max_pages` default is now unlimited (-1): the scraper fetches all pages unless the user explicitly sets `--max-pages`. Previously defaulted to 500 (#356)
- `--no-rate-limit` flag now works: was defined in CLI arguments but never consumed by `ExecutionContext` (#356)
- `constants.py` reads from `defaults.json`: no longer contains hardcoded magic numbers (#356)
- `ExecutionContext.ScrapingSettings`: `rate_limit` and `max_pages` now use real defaults instead of `None`, preventing None-poisoning downstream (#356)
- SKILL.md frontmatter cleanup: empty `doc_version:` and `version:` fields are now omitted; placeholder sections removed (#360)
- Enhancement routing through platform adaptors instead of importing nonexistent `enhance_skill_md` helper (#360)
- `quality_metrics.py` uses `rglob` for nested reference directories in unified skills (#360)
Fixed
- `TypeError: '>' not supported between instances of 'NoneType' and 'int'`: `rate_limit` defaulted to `None` in `ExecutionContext`, which flowed through `config.get("rate_limit", DEFAULT)` (`dict.get` returns `None` when the key exists with value `None`, ignoring the fallback). Fixed in `doc_scraper.py` (sync + async paths), `estimate_pages.py`, and `sync_config.py` (#356, #359)
- `discover_urls()` loop never executed with unlimited `max_pages`: `len(discovered) < -1` is always False. Added unlimited mode guard (#356)
- `converter.scrape()` called nonexistent method in `_run_scraping()`: changed to `converter.scrape_all()` (#360)
- None-safety for BeautifulSoup attributes: `link["href"]`, `sitemap.text`, `meta_desc["content"]` guarded against None XML text nodes (#360)
- Python 3.10 compatibility: backslash in f-string in `quality_metrics.py` not supported before 3.12 (#360)
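
The two `max_pages`/`rate_limit` fixes above come down to standard Python behaviour, which the following minimal sketch reproduces. The helper names `get_or_default` and `under_page_limit` and the value of `DEFAULT_RATE_LIMIT` are assumptions made for illustration; they are not taken from the project's code.

```python
DEFAULT_RATE_LIMIT = 0.5  # hypothetical default, for illustration only


def get_or_default(config, key, default):
    """Return config[key], falling back when the key is missing OR set to None."""
    value = config.get(key)
    return default if value is None else value


def under_page_limit(discovered_count, max_pages):
    """True while more pages may be discovered; -1 is treated as unlimited."""
    return max_pages == -1 or discovered_count < max_pages


config = {"rate_limit": None}

# dict.get only uses its fallback when the key is absent, so this stays None.
naive = config.get("rate_limit", DEFAULT_RATE_LIMIT)

# An explicit None check restores the default.
safe = get_or_default(config, "rate_limit", DEFAULT_RATE_LIMIT)
```

Comparing `naive > 0` would raise the `TypeError` quoted above, while `safe` is a usable number. Likewise, a bare `len(discovered) < -1` can never be true, which is why an explicit unlimited check is needed before the comparison.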
- All content extraction features (pattern detection, test examples, how‑to guides, config extraction, router generation) are now enabled by default; no opt‑in required
- Dynamic routing via `_build_argv()` replaces manual argument forwarding and adds 7 previously missing CLI flags
- Renamed `claude-enhanced` merge mode to `ai-enhanced` (backward‑compatible alias retained)
- Removed hardcoded Claude references across the codebase
- Removed GitHub API analysis limit of 50 files and config extraction limit of 100 files
- Removed command injection vulnerability from cloned repo script execution
- Replaced `git add -A` with targeted staging in marketplace publisher
- Cleared auth tokens from cached `.git/config` after clone
- Grand Unification: single `create` command for 18 source types with auto‑detection and direct converters
- Agent‑agnostic `AgentClient` abstraction supporting Claude, Kimi, Codex, Copilot, OpenCode, and custom agents via API‑key detection
- Headless browser rendering (`--browser` flag) using Playwright to handle JavaScript SPAs
Full changelog
[3.5.0] - 2026-04-09
Theme: Grand Unification — one command, one interface, direct converters. Agent-agnostic architecture, marketplace pipeline, smart SPA discovery, all content extraction enabled by default. 80+ files changed across the codebase.
Added
- Grand Unification: unified `create` command as single entry point for all 18 source types with auto-detection, direct converter invocation, and centralized enhancement (#346)
- Agent-agnostic `AgentClient` abstraction: all 5 enhancers now support Claude, Kimi, Codex, Copilot, OpenCode, and custom agents via a unified interface. Auto-detects agent from API keys instead of hardcoding (#336)
- Kimi CLI integration with stdin piping and output parsing (#336)
- `MarketplacePublisher`: publish skills to Claude Code plugin marketplace repos (#336)
- `MarketplaceManager`: register and manage marketplace repositories (#336)
- `ConfigPublisher`: push configs to registered config source repos (#336)
- `push_config` MCP tool for automated config publishing (#336)
- Smart SPA discovery engine: three-layer discovery (sitemap.xml, llms.txt, SPA nav rendering) (#336)
- `"browser": true` config support for JavaScript SPA sites with browser renderer timeout defaults (60s, domcontentloaded) (#336)
- Dynamic routing via `_build_argv()`: replaced manual arg forwarding with dynamic forwarder, added 7 missing CLI flags (#336)
- Kotlin language support for codebase analysis: full C3.x pipeline support, including AST parsing (classes, objects, functions, data/sealed classes, extension functions, coroutines), dependency extraction, design pattern recognition (object declaration→Singleton, companion object→Factory, sealed class→Strategy), test example extraction (JUnit, Kotest, MockK, Spek), language detection patterns, config detection (build.gradle.kts), and extension maps across all analyzers (#287)
- Headless browser rendering (`--browser` flag): uses Playwright to render JavaScript SPA sites (React, Vue, etc.) that return empty HTML shells. Auto-installs Chromium on first use. Optional dep: `pip install "skill-seekers[browser]"` (#321)
- `skill-seekers doctor` command: 8 diagnostic checks (Python version, package install, git, core/optional deps, API keys, MCP server, output dir) with pass/warn/fail status and `--verbose` flag (#316)
- Prompt injection check workflow: bundled `prompt-injection-check` workflow scans scraped content for injection patterns (role assumption, instruction overrides, delimiter injection, hidden instructions). Added as first stage in `default` and `security-focus` workflows. Flags suspicious content without removing it (#324)
- Codex CLI plugin manifest (`.codex-plugin/plugin.json`) for OpenAI Codex integration (#350)
- 6 behavioral UML diagrams: 3 sequence (create pipeline, GitHub+C3.x flow, MCP invocation), 2 activity (source detection, enhancement pipeline), 1 component (runtime dependencies with interface contracts)
- 134 new tests: `test_agent_client.py`, `test_config_publisher.py`, `_build_argv` tests. Total: 3194 passed, 39 expected skips (#336)
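
To illustrate what "dynamic forwarding" of CLI arguments can look like, here is a minimal sketch that rebuilds an argv list from a parsed `argparse.Namespace`. The function `build_argv` below is a hypothetical stand-in written for this example; the project's actual `_build_argv()` may differ in signature and behaviour.

```python
import argparse


def build_argv(args, skip=()):
    """Rebuild a flat argv list from a Namespace (illustrative sketch only)."""
    argv = []
    for name, value in sorted(vars(args).items()):
        # Unset options and disabled boolean flags are not forwarded.
        if name in skip or value is None or value is False:
            continue
        flag = "--" + name.replace("_", "-")
        if value is True:
            argv.append(flag)                      # store_true style flag
        elif isinstance(value, (list, tuple)):
            argv.append(flag)                      # nargs-style option
            argv.extend(str(item) for item in value)
        else:
            argv.extend([flag, str(value)])
    return argv


ns = argparse.Namespace(
    browser=True, max_pages=50, name=None, languages=["python", "kotlin"]
)
forwarded = build_argv(ns)
```

Because the list is derived from whatever the parser produced, a newly added flag is forwarded without a second, hand-maintained mapping, which is the kind of omission manual forwarding tends to cause.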
Changed
- All content extraction features enabled by default: pattern detection, test examples, how-to guides, config extraction, and router generation no longer require explicit opt-in
- Renamed `claude-enhanced` merge mode to `ai-enhanced`: backward compatibility alias kept (#336)
- Removed 118+ hardcoded Claude references across 60+ files (#336)
- Refactored 5 enhancers to use `AgentClient` abstraction (#336)
- Removed 50-file GitHub API analysis limit (#336)
- Removed 100-file config extraction limit (#336)
- Fixed unified scraper default `max_pages` from 100 to 500 (#336)
- Centralized enhancement timeouts to 45min default with unlimited support (#336)
- Excluded slow MCP/e2e tests from CI coverage step to prevent timeout
Fixed
- `glob('*.md')` replaced with `rglob('*.md')` in all adaptors: fixes packaging when skills are in nested directories (#349)
- `scraped_data` list-vs-dict bug in conflict detection (#336)
- `base_url` passthrough to doc scraper subprocess (#336)
- URL filtering now uses base directory correctly (#336)
- C3.x analysis data loss (#336)
- `--enhance-level` flag not passed correctly (#336)
- `guide_enhancer` method rename: `_call_claude_api` renamed to `_call_ai` (#336)
- 11 pre-existing test failures fixed (#336)
- Per-file language detection in GitHub scraper (#336)
- GitHub language detection crashes with `TypeError` when API response contains non-integer metadata keys (e.g., `"url"`): now filters to integer values only (#322)
- C3.x codebase analysis crashes with `TypeError`: `_run_c3_analysis()` and `_analyze_c3x()` passed removed `enhance_with_ai`/`ai_mode` kwargs to `analyze_codebase()` instead of `enhance_level` (#323)
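
The `glob` versus `rglob` fix above can be reproduced with the standard library alone. This sketch builds a throwaway directory whose layout (`SKILL.md` plus `references/github/index.md`) is an assumption chosen for illustration, then compares the two calls.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "SKILL.md").write_text("# skill\n")

    # A nested reference directory, as a unified skill might contain.
    nested = root / "references" / "github"
    nested.mkdir(parents=True)
    (nested / "index.md").write_text("# index\n")

    # glob("*.md") only matches files directly inside root.
    flat = sorted(p.relative_to(root).as_posix() for p in root.glob("*.md"))

    # rglob("*.md") walks subdirectories as well.
    deep = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.md"))
```

Here `flat` contains only the top-level file, while `deep` also picks up the nested index, which is the file a non-recursive packaging step would silently drop.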
Security
- Removed command injection via cloned repo script execution (#336)
- Replaced `git add -A` with targeted staging in marketplace publisher (#336)
- Clear auth tokens from cached `.git/config` after clone (#336)
- Use `defusedxml` for sitemap XML parsing (XXE protection) (#336)
- Path traversal validation for config names (#336)
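
As a rough idea of what path traversal validation for a config name involves, the sketch below accepts only names built from a conservative character set and rejects anything containing `..`. The function name `is_safe_config_name` and the exact allow-list are assumptions for illustration, not the project's implementation.

```python
import re

# Hypothetical allow-list: must start with a letter or digit, then
# letters, digits, dot, underscore, or hyphen. No path separators.
_SAFE_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def is_safe_config_name(name):
    """Reject names that could escape the directory configs are stored in."""
    if ".." in name:
        return False
    return _SAFE_NAME.fullmatch(name) is not None
```

With this check, `react-v18.2` is accepted, while `../secrets` and `configs/react` are rejected because they contain a parent-directory reference or a path separator.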
- 8 new LLM platform adaptors (OpenCode, Kimi, DeepSeek, Qwen, OpenRouter, Together AI, Fireworks AI) bringing total to 12
- 7 new CLI agent install paths (roo, cline, aider, bolt, kilo, continue, kimi-code) raising count to 18
- OpenCode skill tools: auto‑splitter and bi‑directional converter
Full changelog
What's New in v3.4.0
Theme: 8 new LLM platform adaptors (12 total), 7 new CLI agent paths (18 total), OpenCode skill tools, SPA site detection, 8 bug fixes, and full UML architecture documentation.
Platform Expansion: 5 → 12 LLM Targets
| New Platform | Flag | Base |
|---|---|---|
| OpenCode | --target opencode | Directory-based, dual YAML |
| Kimi | --target kimi | OpenAI-compatible |
| DeepSeek | --target deepseek | OpenAI-compatible |
| Qwen | --target qwen | OpenAI-compatible |
| OpenRouter | --target openrouter | OpenAI-compatible |
| Together AI | --target together | OpenAI-compatible |
| Fireworks AI | --target fireworks | OpenAI-compatible |
All new platforms inherit from a shared OpenAI-compatible base class for consistent behavior.
Agent Expansion: 11 → 18 Install Paths
New agents: roo, cline, aider, bolt, kilo, continue, kimi-code
OpenCode Skill Tools
- Skill splitter — auto-split large docs into focused sub-skills with router
- Bi-directional converter — import/export between OpenCode and any platform format
Distribution
- Smithery manifest (`smithery.yaml`)
- GitHub Actions template for automated skill updates
- Claude Code Plugin with slash commands
Bug Fixes
- `sanitize_url()` crash on Python 3.14 strict `urlparse` (#284)
- Blind `/index.html.md` append breaking non-Docusaurus sites (#277)
- Unified scraper temp config format (#317)
- Unicode arrows breaking Windows cp1252 terminals
- CLI flags in plugin slash commands
- MiniMax adaptor improvements (#319)
- Misleading "Scraped N pages" count: now shows `(N saved, M skipped)` (#320)
- SPA site detection: warns when site requires JavaScript rendering (#320, #321)
Documentation
- Full UML architecture — 14 class diagrams synced from source code via StarUML
- StarUML HTML API reference export
- Ecosystem section linking all Skill Seekers repos
- Architecture references in README and CONTRIBUTING
- Consolidated `Docs/` into `docs/`
Test Results
2929 passed, 39 skipped, 0 failures
Install / Upgrade
pip install --upgrade skill-seekers
Full changelog: https://github.com/yusufkaraaslan/Skill_Seekers/blob/main/CHANGELOG.md
- Optional dependency groups (`[jupyter]`, `[asciidoc]`, `[pptx]`, `[confluence]`, `[notion]`, `[rss]`, `[chat]`) added; install with `pip install "skill-seekers[jupyter]"` etc. for new source types
- Config validator now checks all 17 source types (including previously missing `word` and `video`). Existing configs should be validated after upgrade.
- CLI has ten new subcommands (`jupyter`, `html`, `openapi`, `asciidoc`, `pptx`, `rss`, `manpage`, `confluence`, `notion`, `chat`) – update scripts or aliases accordingly.
- 10 new source types (Jupyter, HTML, OpenAPI/Swagger, AsciiDoc, PowerPoint, RSS/Atom, Man pages, Confluence, Notion, Slack/Discord) integrated into CLI and multi‑source configs
- Unified EPUB pipeline added to `skill-seekers` with DRM detection and TOC bug workaround
- `sync-config` subcommand that crawls navigation links, diffs against config start URLs, and optionally updates the configuration
Full changelog
[3.3.0] - 2026-03-16
Theme: 10 new source types (17 total), EPUB unified integration, sync-config command, performance optimizations, 12 README translations, and 19 bug fixes. 117 files changed, +41,588 lines since v3.2.0.
Supported Source Types (17)
| # | Type | CLI Command | Config Type | Auto-Detection |
|---|------|-------------|-------------|----------------|
| 1 | Documentation (web) | scrape / create <url> | documentation | HTTP/HTTPS URLs |
| 2 | GitHub repository | github / create owner/repo | github | owner/repo or github.com URLs |
| 3 | PDF document | pdf / create file.pdf | pdf | .pdf extension |
| 4 | Word document | word / create file.docx | word | .docx extension |
| 5 | EPUB e-book | epub / create file.epub | epub | .epub extension |
| 6 | Video | video / create <url/file> | video | YouTube/Vimeo URLs, video extensions |
| 7 | Local codebase | analyze / create ./path | local | Directory paths |
| 8 | Jupyter Notebook | jupyter / create file.ipynb | jupyter | .ipynb extension |
| 9 | Local HTML | html / create file.html | html | .html/.htm extensions |
| 10 | OpenAPI/Swagger | openapi / create spec.yaml | openapi | .yaml/.yml with OpenAPI content |
| 11 | AsciiDoc | asciidoc / create file.adoc | asciidoc | .adoc/.asciidoc extensions |
| 12 | PowerPoint | pptx / create file.pptx | pptx | .pptx extension |
| 13 | RSS/Atom feed | rss / create feed.rss | rss | .rss/.atom extensions |
| 14 | Man pages | manpage / create cmd.1 | manpage | .1–.8/.man extensions |
| 15 | Confluence wiki | confluence | confluence | API or export directory |
| 16 | Notion pages | notion | notion | API or export directory |
| 17 | Slack/Discord chat | chat | chat | Export directory or API |
Added
10 New Skill Source Types (17 total)
Skill Seekers now supports 17 source types — up from 7. Every new type is fully integrated into the CLI (skill-seekers <type>), create command auto-detection, unified multi-source configs, config validation, the MCP server, and the skill builder.
- Jupyter Notebook: `skill-seekers jupyter --notebook file.ipynb` or `skill-seekers create file.ipynb`
  - Extracts markdown cells, code cells with outputs, kernel metadata, imports, and language detection
  - Handles single files and directories of notebooks; filters `.ipynb_checkpoints`
  - Optional dependency: `pip install "skill-seekers[jupyter]"` (nbformat)
  - Entry point: `skill-seekers-jupyter`
- Local HTML: `skill-seekers html --html-path file.html` or `skill-seekers create file.html`
  - Parses HTML using BeautifulSoup with smart main content detection (`<article>`, `<main>`, `.content`, largest div)
  - Extracts headings, code blocks, tables (to markdown), images, links; converts inline HTML to markdown
  - Handles single files and directories; supports `.html`, `.htm`, `.xhtml` extensions
  - No extra dependencies (BeautifulSoup is a core dep)
- OpenAPI/Swagger: `skill-seekers openapi --spec spec.yaml` or `skill-seekers create spec.yaml`
  - Parses OpenAPI 3.0/3.1 and Swagger 2.0 specs from YAML or JSON (local files or URLs via `--spec-url`)
  - Extracts endpoints, parameters, request/response schemas, security schemes, tags
  - Resolves `$ref` references with circular reference protection; handles `allOf`/`oneOf`/`anyOf`
  - Groups endpoints by tags; generates comprehensive API reference markdown
  - Source detection sniffs YAML file content for `openapi:` or `swagger:` keys (avoids false positives on non-API YAML files)
  - Optional dependency: `pip install "skill-seekers[openapi]"` (pyyaml, already a core dep; guard added for safety)
- AsciiDoc: `skill-seekers asciidoc --asciidoc-path file.adoc` or `skill-seekers create file.adoc`
  - Regex-based parser (no external library required) with optional `asciidoc` library support
  - Extracts headings (= through =====), `[source,lang]` code blocks, `|===` tables, admonitions (NOTE/TIP/WARNING/IMPORTANT/CAUTION), and `include::` directives
  - Converts AsciiDoc formatting to markdown; handles single files and directories
  - Optional dependency: `pip install "skill-seekers[asciidoc]"` (asciidoc library for advanced rendering)
- PowerPoint (.pptx): `skill-seekers pptx --pptx file.pptx` or `skill-seekers create file.pptx`
  - Extracts slide text, speaker notes, tables, images (with alt text), and grouped shapes
  - Detects code blocks by monospace font analysis (30+ font families)
  - Groups slides into sections by layout type; handles single files and directories
  - Optional dependency: `pip install "skill-seekers[pptx]"` (python-pptx)
- RSS/Atom Feeds: `skill-seekers rss --feed-url <url>` / `--feed-path file.rss` or `skill-seekers create feed.rss`
  - Parses RSS 2.0, RSS 1.0, and Atom feeds via feedparser
  - Optionally follows article links (`--follow-links`, default on) to scrape full page content using BeautifulSoup
  - Extracts article titles, summaries, authors, dates, categories; configurable `--max-articles` (default 50)
  - Source detection matches `.rss` and `.atom` extensions (`.xml` excluded to avoid false positives)
  - Optional dependency: `pip install "skill-seekers[rss]"` (feedparser)
- Man Pages: `skill-seekers manpage --man-names git,curl` / `--man-path dir/` or `skill-seekers create git.1`
  - Extracts man pages by running `man` command via subprocess or reading `.1`–`.8`/`.man` files directly
  - Handles gzip/bzip2/xz compressed man files; strips troff/groff formatting (backspace overstriking, macros, font escapes)
  - Parses structured sections (NAME, SYNOPSIS, DESCRIPTION, OPTIONS, EXAMPLES, SEE ALSO)
  - Source detection uses basename heuristic to avoid false positives on log rotation files (e.g., `access.log.1`)
  - No external dependencies (stdlib only)
- Confluence: `skill-seekers confluence --base-url <url> --space-key <key>` or `--export-path dir/`
  - API mode: fetches pages from Confluence REST API with pagination (`atlassian-python-api`)
  - Export mode: parses Confluence HTML/XML export directories
  - Extracts page content, code/panel/info/warning macros, page hierarchy, tables
  - Optional dependency: `pip install "skill-seekers[confluence]"` (atlassian-python-api)
- Notion: `skill-seekers notion --database-id <id>` / `--page-id <id>` or `--export-path dir/`
  - API mode: fetches pages via Notion API with support for 20+ block types (paragraph, heading, code, callout, toggle, table, etc.)
  - Export mode: parses Notion Markdown/CSV export directories
  - Extracts rich text with annotations (bold, italic, code, links), 16+ property types for database entries
  - Optional dependency: `pip install "skill-seekers[notion]"` (notion-client)
- Slack/Discord Chat: `skill-seekers chat --export-path dir/` or `--token <token> --channel <channel>`
  - Slack: parses workspace JSON exports or fetches via Slack Web API (`slack_sdk`)
  - Discord: parses DiscordChatExporter JSON or fetches via Discord HTTP API
  - Extracts messages, code snippets (fenced blocks), shared URLs, threads, reactions, attachments
  - Generates per-channel summaries and topic categorization
  - Optional dependency: `pip install "skill-seekers[chat]"` (slack-sdk)
EPUB Unified Pipeline Integration
- EPUB (.epub) input support via `skill-seekers create book.epub` or `skill-seekers epub --epub book.epub`
  - Extracts chapters, metadata (Dublin Core), code blocks, images, and tables from EPUB 2 and EPUB 3 files
  - DRM detection with clear error messages (Adobe ADEPT, Apple FairPlay, Readium LCP)
  - Font obfuscation correctly identified as non-DRM
  - EPUB 3 TOC bug workaround (`ignore_ncx` option)
  - `--help-epub` flag for EPUB-specific help
  - Optional dependency: `pip install "skill-seekers[epub]"` (ebooklib)
  - 107 tests across 14 test classes
- EPUB added to unified scraper: `_scrape_epub()` method, `scraped_data["epub"]`, config validation (`_validate_epub_source`), and dry-run display. Previously EPUB worked standalone but was missing from multi-source configs.
Unified Skill Builder — Generic Merge System
- `_generic_merge()`: Priority-based section merge for any combination of source types not covered by existing pairwise synthesis (docs+github, docs+pdf, etc.). Produces YAML frontmatter + source-attributed sections.
- `_append_extra_sources()`: Appends additional source type content (e.g., Jupyter + PPTX) to pairwise-synthesized SKILL.md.
- `_generate_generic_references()`: Generates `references/<type>/index.md` for any source type, with ID resolution fallback chain.
- `_SOURCE_LABELS` dict: Human-readable labels for all 17 source types used in merge attribution.
Config Validator Expansion
- 17 source types in `VALID_SOURCE_TYPES`: All new types plus `word` and `video` now have per-type validation methods.
- `_validate_word_source()`: Validates `path` field for Word documents (was previously missing).
- `_validate_video_source()`: Validates `url`, `path`, or `playlist` field for video sources (was previously missing).
- 11 new `_validate_*_source()` methods: One for each new type with appropriate required-field checks.
Source Detection Improvements
- 7 new file extension detections in `SourceDetector.detect()`: `.ipynb`, `.html`/`.htm`, `.pptx`, `.adoc`/`.asciidoc`, `.rss`/`.atom`, `.1`–`.8`/`.man`, `.yaml`/`.yml` (with content sniffing)
- `_looks_like_openapi()`: Content sniffing for YAML files. Only classifies as OpenAPI if the file contains `openapi:` or `swagger:` key in first 20 lines (prevents false positives on docker-compose, Ansible, Kubernetes manifests, etc.)
- Man page basename heuristic: `.1`–`.8` extensions only detected as man pages if the basename has no dots (e.g., `git.1` matches but `access.log.1` does not)
- `.xml` excluded from RSS detection: Too generic; only `.rss` and `.atom` trigger RSS detection
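
The two detection heuristics described above can be sketched as follows. The functions `looks_like_openapi` and `looks_like_man_page`, the regular expression, and the treatment of `.man` files as always matching are assumptions made for this example; the project's `_looks_like_openapi()` and `SourceDetector.detect()` may be implemented differently.

```python
import re
from pathlib import Path

# Matches a top-level-looking `openapi:` or `swagger:` key, optionally quoted.
_OPENAPI_KEY = re.compile(r"^\s*[\"']?(openapi|swagger)[\"']?\s*:")


def looks_like_openapi(text, max_lines=20):
    """Sniff only the first `max_lines` lines for an OpenAPI/Swagger key."""
    return any(_OPENAPI_KEY.match(line) for line in text.splitlines()[:max_lines])


def looks_like_man_page(filename):
    """Accept `.man`, or `.1` to `.8` when the basename has no further dots."""
    path = Path(filename)
    suffix = path.suffix.lower()
    if suffix == ".man":  # assumption: .man is accepted unconditionally
        return True
    return re.fullmatch(r"\.[1-8]", suffix) is not None and "." not in path.stem
```

A docker-compose file starting with `version:` and `services:` is not classified as OpenAPI, and `access.log.1` is rejected because its basename `access.log` still contains a dot.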
MCP Server Integration
- `scrape_generic` tool: New MCP tool handles all 10 new source types via subprocess with per-type flag mapping
- `_PATH_FLAGS`/`_URL_FLAGS` dicts: Correct flag routing for each source type (e.g., jupyter→`--notebook`, html→`--html-path`, rss→`--feed-url`)
- `GENERIC_SOURCE_TYPES` tuple: Lists all 10 new types for validation
- Config validation display: `validate_config` tool now shows source details for all new types
- Tool count updated: 33 → 34 tools (scraping tools 10 → 11)
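
Per-type flag mapping, as described above, can be pictured with a small lookup table. This is only a sketch: the dictionaries below cover a handful of types using flags named elsewhere in these notes, and `build_scraper_command` is a hypothetical helper, not the MCP tool's actual code.

```python
# Subset of types for illustration; flags are the ones quoted in these notes.
PATH_FLAGS = {
    "jupyter": "--notebook",
    "html": "--html-path",
    "openapi": "--spec",
    "rss": "--feed-path",
}
URL_FLAGS = {
    "openapi": "--spec-url",
    "rss": "--feed-url",
}


def build_scraper_command(source_type, path=None, url=None):
    """Pick the scraper-specific flag instead of a generic --path/--url."""
    if path is not None and source_type in PATH_FLAGS:
        return ["skill-seekers", source_type, PATH_FLAGS[source_type], path]
    if url is not None and source_type in URL_FLAGS:
        return ["skill-seekers", source_type, URL_FLAGS[source_type], url]
    raise ValueError(f"no flag mapping for {source_type!r} with the given input")
```

The resulting list could be handed to `subprocess.run`. Raising on an unmapped combination surfaces a routing mistake immediately rather than at scraper start-up.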
CLI Wiring
- 10 new CLI subcommands: `jupyter`, `html`, `openapi`, `asciidoc`, `pptx`, `rss`, `manpage`, `confluence`, `notion`, `chat` in `COMMAND_MODULES`
- 10 new argument modules: `arguments/{jupyter,html,openapi,asciidoc,pptx,rss,manpage,confluence,notion,chat}.py` with per-type `*_ARGUMENTS` dicts
- 10 new parser modules: `parsers/{jupyter,html,openapi,asciidoc,pptx,rss,manpage,confluence,notion,chat}_parser.py` with `SubcommandParser` implementations
- `create` command routing: `_route_generic()` method for all new types with correct module names and CLI flags
- 10 new entry points in pyproject.toml: `skill-seekers-{jupyter,html,openapi,asciidoc,pptx,rss,manpage,confluence,notion,chat}`
- 7 new optional dependency groups in pyproject.toml: `[jupyter]`, `[asciidoc]`, `[pptx]`, `[confluence]`, `[notion]`, `[rss]`, `[chat]`
- `[all]` group updated: Includes all 7 new optional dependencies
Sync Config Command
- `skill-seekers sync-config`: New subcommand that crawls a docs site's navigation, diffs discovered URLs against a config's `start_urls`, and optionally writes the updated list back with `--apply` (#306)
- BFS link discovery with configurable depth (default 2), max-pages, rate-limit
- Respects `url_patterns.include`/`exclude` from config
- Supports optional `nav_seed_urls` config field
- Handles both unified (sources array) and legacy flat config formats
- MCP `sync_config` tool included
- 57 tests (39 unit + 18 E2E with local HTTP server)
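
The crawl-and-diff idea behind `sync-config` can be sketched without any network access by standing in a dictionary for the site's link graph. Everything here (`discover_urls`, `diff_start_urls`, the `get_links` callback, and the sample `site`) is hypothetical and only illustrates a depth-limited BFS followed by a set difference.

```python
from collections import deque


def discover_urls(seed_urls, get_links, max_depth=2, max_pages=500):
    """Breadth-first link discovery, bounded by depth and total page count."""
    seen = set(seed_urls)
    queue = deque((url, 0) for url in seed_urls)
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand links beyond the depth limit
        for link in get_links(url):
            if link not in seen and len(seen) < max_pages:
                seen.add(link)
                queue.append((link, depth + 1))
    return seen


def diff_start_urls(configured, discovered):
    """Return (urls to add, urls no longer found), both sorted."""
    configured, discovered = set(configured), set(discovered)
    return sorted(discovered - configured), sorted(configured - discovered)


# In-memory stand-in for a docs site's navigation links.
site = {
    "/": ["/guide", "/api"],
    "/guide": ["/guide/install"],
    "/guide/install": ["/guide/install/windows"],
    "/api": [],
}
found = discover_urls(["/"], lambda url: site.get(url, []))
added, removed = diff_start_urls(["/", "/guide", "/old"], found)
```

With the default depth of 2, `/guide/install/windows` is never reached, `added` lists the two pages missing from the configured URLs, and `removed` flags `/old` as no longer linked.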
Workflow & Documentation
- `complex-merge.yaml`: New 7-stage AI-powered workflow for complex multi-source merging (source inventory → cross-reference → conflict detection → priority merge → gap analysis → synthesis → quality check)
- AGENTS.md rewritten: Updated with all 17 source types, scraper pattern docs, project layout, and key pattern documentation
- 77 new integration tests in `test_new_source_types.py`: Source detection, config validation, generic merge, CLI wiring, validation, and create command routing
- `docs/BEST_PRACTICES.md`: Comprehensive guide for creating high-quality skills: SKILL.md structure, code examples, prerequisites, troubleshooting, quality targets, and real-world Grade F to Grade A example (#206)
- Documentation updated for 17 source types: 32 files updated across README, CLI reference, feature matrix, MCP reference, config format, API reference, unified scraping, multi-source guide, installation, quick-start, core concepts, user guide, FAQ, troubleshooting, architecture, and all Chinese (zh-CN) translations
- README translations for 10 languages (12 total) — Added Japanese (日本語), Korean (한국어), Spanish (Español), French (Français), German (Deutsch), Portuguese (Português), Turkish (Türkçe), Arabic (العربية), Hindi (हिन्दी), and Russian (Русский) README translations with language selector bar across all versions
Performance
- Pre-compiled regex and O(1) URL dedup in doc_scraper: Module-level compiled patterns, `_enqueued_urls` set for O(1) dedup, cached URL patterns, async error logging fix (#309)
- Bisect-based line indexing in code_analyzer and dependency_analyzer: O(log n) `offset_to_line()` via bisect replaces O(n) `count("\n")` across all 10 language analyzers and all import extractors
- O(n) parent class map for Python method detection: Replaces O(n²) repeated AST walks in code_analyzer
- O(1) tree traversal in github_scraper: `deque.popleft()` replaces list `pop(0)`
- Shared `build_line_index()`/`offset_to_line()` utilities in `cli/utils.py`: DRY extraction from code_analyzer and dependency_analyzer
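
The bisect-based line lookup mentioned above works roughly as follows. The function names are borrowed from the notes, but the signatures and bodies below are assumptions written for illustration rather than the code in `cli/utils.py`.

```python
import bisect


def build_line_index(text):
    """Return the character offset at which each line starts (one O(n) pass)."""
    starts = [0]
    for position, char in enumerate(text):
        if char == "\n":
            starts.append(position + 1)
    return starts


def offset_to_line(line_starts, offset):
    """Map a character offset to a 1-based line number in O(log n)."""
    return bisect.bisect_right(line_starts, offset)


source = "import os\n\nclass A:\n    pass\n"
index = build_line_index(source)
class_line = offset_to_line(index, source.index("class"))
```

The index is built once per file, after which every lookup is a binary search. The naive equivalent, `source.count("\n", 0, offset) + 1`, rescans the text on each call.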
Fixed
- Config validator missing `word` and `video` dispatch: `_validate_source()` had no `elif` branches for `word` or `video` types, silently skipping validation. Added dispatch entries and `_validate_word_source()`/`_validate_video_source()` methods.
- `openapi_scraper.py` unconditional `import yaml`: Would crash at import time if pyyaml not installed. Added `try/except ImportError` guard with `YAML_AVAILABLE` flag and `_check_yaml_deps()` helper.
- `asciidoc_scraper.py` missing standard arguments: `main()` manually defined args instead of using `add_asciidoc_arguments()`. Refactored to use shared argument definitions + added enhancement workflow integration.
- `pptx_scraper.py` missing standard arguments: Same issue. Refactored to use `add_pptx_arguments()`.
- `chat_scraper.py` missing standard arguments: Same issue. Refactored to use `add_chat_arguments()`.
- `notion_scraper.py` missing `run_workflows` call: `--enhance-workflow` flags were silently ignored. Added workflow runner integration.
- `openapi_scraper.py` return type `None`: `main()` returned `None` instead of `int`. Fixed to `return 0` on success, matching all other scrapers.
- MCP `scrape_generic_tool` flag mismatch: Was passing `--path`/`--url` as generic flags, but every scraper expects its own flag name (e.g., `--notebook`, `--html-path`, `--spec`). All 10 source types would have failed at runtime. Fixed with per-type `_PATH_FLAGS` and `_URL_FLAGS` mappings.
- Word scraper `docx_id` key mismatch: Unified scraper data dict used `docx_id` but generic reference generation looked for `word_id`. Added `word_id` alias.
- `main.py` docstring stale: Missing all 10 new commands. Updated to list all 27 commands.
- `source_detector.py` module docstring stale: Described only 5 source types. Updated to describe 14+ detected types.
- `manpage_parser.py` docstring referenced wrong file: Said `manpage_scraper.py` but actual file is `man_scraper.py`. Fixed.
- Parser registry test count: Updated expected count from 25 to 35 for 10 new parsers.
- 'Invalid IPv6 URL' error on bracket-containing URLs (#284): URLs with square brackets (e.g., `/api/[v1]/users`) discovered via BFS crawl or HTML extraction bypassed the original fix in `_clean_url()`. Added shared `sanitize_url()` utility applied at every URL ingestion point. 16 new tests.
- GitHub scraper 'list index out of range' on issue extraction (#269): PyGithub's `PaginatedList` slicing could fail on some versions or empty repos. Replaced with `itertools.islice()`.
- Release workflow version mismatch: GitHub release showed wrong version (v3.1.3 instead of v3.2.0) because no explicit release name was set and sed regex had unescaped dots. Added explicit `name`/`tag_name`, version consistency check (tag vs pyproject.toml vs package), and empty release notes fallback.
- Release workflow Python 3.10 compatibility: Version consistency check used `tomllib` (Python 3.11+). Replaced with grep/sed for 3.10 compatibility.
- `infer_categories()` "tutorial" vs "tutorials" key mismatch: Guard checked `'tutorial'` but wrote to `'tutorials'` key, risking silent overwrites in category inference.
- Flaky `test_benchmark_metadata_overhead`: Stabilized with 20 iterations, warm-up run, median averaging, and 200% threshold (was failing on CI with 5 iterations and mean).
- CI branch protection check permanently pending: Summary job was named 'All Checks Complete' but branch protection required 'Tests'. PRs were stuck as 'Expected — Waiting for status to be reported'. Renamed job to match.
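
The optional-import guard described for `openapi_scraper.py` follows a common Python pattern, sketched below. The flag name `YAML_AVAILABLE` comes from the notes; the helper `check_yaml_deps` and its error message are assumptions for illustration and may not match the project's `_check_yaml_deps()`.

```python
try:
    import yaml  # optional dependency (PyYAML)
    YAML_AVAILABLE = True
except ImportError:
    yaml = None
    YAML_AVAILABLE = False


def check_yaml_deps():
    """Fail with an actionable message only when YAML support is actually needed."""
    if not YAML_AVAILABLE:
        raise RuntimeError(
            "PyYAML is required to parse YAML specs; install it with: pip install pyyaml"
        )
```

Importing the module now succeeds whether or not PyYAML is installed, and the error is deferred to the point where a YAML spec is about to be parsed.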
- Install optional extras with `pip install skill-seekers[video]`, `skill-seekers[docx]`, `skill-seekers[pinecone]` or `skill-seekers[all]`.
- Run `skill-seekers video --setup` after upgrade to auto‑detect and configure GPU dependencies.
- Video Extraction Pipeline: CLI command `skill-seekers video --url` to scrape YouTube/local videos with transcript, OCR, panel detection, code timeline and GPU auto‑setup.
- Word Document (.docx) Support: Command `skill-seekers word --docx` converts .docx via mammoth → HTML → SKILL.md with smart code block detection.
- Pinecone Vector Database Adaptor: `skill-seekers package … --format pinecone --upload` provides full CRUD, namespace support and OpenAI/Sentence‑Transformer embedding integration.
Full changelog
v3.2.0 — Video Extraction, Word Support, Pinecone Adaptor
Theme: Video source support, Word document support, Pinecone adaptor, and quality improvements. 94 files changed, +23,500 lines since v3.1.3. 2,540 tests passing.
🎬 Video Extraction Pipeline
Complete video extraction system that converts YouTube videos and local video files into AI-consumable skills.
- `skill-seekers video --url <youtube-url>` — New CLI command for video scraping
- `skill-seekers create <youtube-url>` — Auto-detects YouTube URLs
- Transcript extraction — 3-tier fallback: YouTube API → yt-dlp → faster-whisper
- Visual OCR — Multi-engine ensemble (EasyOCR + pytesseract) for code frames
- Panel detection — Splits IDE screenshots into independent sub-sections
- Code timeline — Tracks code evolution across frames with edit history
- Two-pass AI enhancement — Cleans OCR noise using transcript context
- GPU auto-detection — `skill-seekers video --setup` detects CUDA/ROCm/CPU and installs correct PyTorch
- 197 tests covering models, metadata, transcript, visual, OCR, and CLI
📄 Word Document (.docx) Support
- `skill-seekers word --docx <file>` — Full pipeline: mammoth → HTML → sections → SKILL.md
- `skill-seekers create document.docx` — Auto-detects .docx files
- Smart code detection — Identifies monospace paragraphs as code blocks
- Install: `pip install skill-seekers[docx]`
🌲 Pinecone Vector Database Adaptor
- `skill-seekers package output/ --format pinecone --upload` — Direct Pinecone upload
- Full CRUD operations with namespace support
- OpenAI and Sentence Transformers embedding support
- Batch upsert with configurable batch sizes
- 764 tests for comprehensive coverage
🐛 Bug Fixes
- 6 OCR quality fixes — Skip webcam frames, clean IDE decorations, fix duplicate lines, filter UI junk
- 15 video pipeline fixes — Timeout handling, MCP integration, filename collisions, dependency management
- Issue #300 — Selector fallback & dry-run link discovery (ReactFlow found 20+ pages, was 1)
- Issue #301 — `setup.sh` macOS fix
- RAG chunking crash — Fixed `AttributeError: output_dir`
- Chunk overlap auto-scaling — Scales to `max(50, chunk_tokens // 10)`
- Reference file limits removed — No more caps on GitHub issues, releases, or code blocks
- See CHANGELOG.md for full details
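The chunk overlap auto-scaling rule listed above can be illustrated with a short sketch. Only the formula `max(50, chunk_tokens // 10)` comes from the release notes; the function name `scaled_overlap` is hypothetical and is not taken from the project's code.

```python
def scaled_overlap(chunk_tokens: int) -> int:
    """Return the overlap, in tokens, for a given chunk size.

    Hypothetical helper: only the formula is taken from the changelog.
    The overlap is 10% of the chunk size, with a floor of 50 tokens.
    """
    return max(50, chunk_tokens // 10)


if __name__ == "__main__":
    for size in (256, 512, 2048):
        print(size, scaled_overlap(size))
```

With this rule, small chunks keep a fixed 50-token overlap, while larger chunks scale the overlap with their size.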
📦 Install / Upgrade
pip install --upgrade skill-seekers
# With video support
pip install skill-seekers[video]
skill-seekers video --setup # Auto-detect GPU, install deps
# With Word support
pip install skill-seekers[docx]
# With Pinecone
pip install skill-seekers[pinecone]
# Everything
pip install skill-seekers[all]
Full Changelog: https://github.com/yusufkaraaslan/Skill_Seekers/blob/main/CHANGELOG.md
- --chunk-size → --chunk-tokens
- --chunk-overlap → --chunk-overlap-tokens
- --chunk → --chunk-for-rag
Full changelog
[3.1.3] - 2026-02-24
🐛 Hotfix — Explicit Chunk Flags & Argument Pipeline Cleanup
Fixed
- Issue #299: `skill-seekers package --target claude` unrecognised argument crash — `_reconstruct_argv()` in `main.py` emits default flag values back into argv when routing subcommands. `package_skill.py` had a 105-line inline argparser that used different flag names to those in `arguments/package.py`, so forwarded flags were rejected. Fixed by replacing the inline block with a call to `add_package_arguments(parser)` — the single source of truth.
Changed
- `package_skill.py` argparser refactored — Replaced ~105 lines of inline argparse duplication with a single `add_package_arguments(parser)` call. Flag names are now guaranteed consistent with `_reconstruct_argv()` output, preventing future argument-name drift.
- Explicit chunk flag names — All `--chunk-*` flags now include unit suffixes to eliminate ambiguity between RAG tokens and streaming characters:
  - `--chunk-size` (RAG tokens) → `--chunk-tokens`
  - `--chunk-overlap` (RAG tokens) → `--chunk-overlap-tokens`
  - `--chunk` (enable RAG chunking) → `--chunk-for-rag`
  - `--streaming-chunk-size` (chars) → `--streaming-chunk-chars`
  - `--streaming-overlap` (chars) → `--streaming-overlap-chars`
  - `--chunk-size` in PDF extractor (pages) → `--pdf-pages-per-chunk`
- `setup_logging()` centralized — Added `setup_logging(verbose, quiet)` to `utils.py` and removed 4 duplicate module-level `logging.basicConfig()` calls from `doc_scraper.py`, `github_scraper.py`, `codebase_scraper.py`, and `unified_scraper.py`
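A centralized logging helper of the kind described above might look like the following sketch. The signature `setup_logging(verbose, quiet)` is from the release notes, but the level mapping (quiet to WARNING, verbose to DEBUG, otherwise INFO), the format string, and the return value are assumptions made for illustration.

```python
import logging


def setup_logging(verbose: bool = False, quiet: bool = False) -> int:
    """Configure root logging once and return the chosen level.

    Assumed mapping (not confirmed by the changelog):
    quiet -> WARNING, verbose -> DEBUG, otherwise INFO.
    """
    if quiet:
        level = logging.WARNING
    elif verbose:
        level = logging.DEBUG
    else:
        level = logging.INFO
    # force=True replaces handlers left behind by earlier basicConfig() calls.
    logging.basicConfig(level=level, format="%(levelname)s: %(message)s", force=True)
    return level
```

Calling one helper from each entry point avoids the situation where whichever module is imported first decides the logging configuration for everything else.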
- pip install --upgrade skill-seekers
- docker pull yusufk/skill-seekers:latest
Full changelog
What's Changed
🐛 Critical Bug Fixes
Gemini enhancement 404 errors — The gemini-2.0-flash-exp model was retired by Google, causing all Gemini enhancement requests to fail with 404. Replaced with gemini-2.5-flash (stable GA).
skill-seekers enhance auto-detection — The documented behaviour of automatically using API mode when an API key is present was never implemented. This release fixes it:
- `ANTHROPIC_API_KEY` set → Claude API mode
- `GOOGLE_API_KEY` set → Gemini API mode
- `OPENAI_API_KEY` set → OpenAI API mode
- No key → LOCAL mode (Claude Code Max, free)
Use --mode LOCAL to force local mode even when API keys are present.
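The key-based mode selection above can be sketched as a small lookup. The environment variable names and the LOCAL fallback come from the release notes; the function name `detect_enhance_mode`, the returned mode strings, and the precedence when several keys are set are assumptions.

```python
import os
from typing import Mapping, Optional

# Order follows the list in the release notes; the real precedence when
# several keys are set at once is an assumption.
_KEY_TO_MODE = (
    ("ANTHROPIC_API_KEY", "claude"),
    ("GOOGLE_API_KEY", "gemini"),
    ("OPENAI_API_KEY", "openai"),
)


def detect_enhance_mode(
    env: Optional[Mapping[str, str]] = None,
    forced: Optional[str] = None,
) -> str:
    """Pick an enhancement mode from API keys, unless one is forced."""
    if forced:
        return forced  # e.g. the value passed via --mode
    env = os.environ if env is None else env
    for key, mode in _KEY_TO_MODE:
        if env.get(key):  # unset or empty keys are ignored
            return mode
    return "LOCAL"
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.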
create command argument forwarding — Universal flags (--dry-run, --verbose, --quiet, --name, --description) were crashing when used with GitHub, PDF, and codebase sources. All fixed. Also adds --dry-run support to skill-seekers github and skill-seekers pdf.
Upgrade
pip install --upgrade skill-seekers
docker pull yusufk/skill-seekers:latest
Full Changelog
See CHANGELOG.md for complete details.
Fixed AttributeError when creating commands with max_pages.
Full changelog
What's Changed
- fix: use getattr for max_pages in create command web routing by @YusufKaraaslanSpyke in https://github.com/yusufkaraaslan/Skill_Seekers/pull/294
- hotfix: v3.1.1 — fix create command max_pages AttributeError by @yusufkaraaslan in https://github.com/yusufkaraaslan/Skill_Seekers/pull/295
- Max page hot fix by @yusufkaraaslan in https://github.com/yusufkaraaslan/Skill_Seekers/pull/296
Full Changelog: https://github.com/yusufkaraaslan/Skill_Seekers/compare/v3.1.0...v3.1.1
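The `getattr` fix named in the pull request titles above follows a common argparse pattern, sketched here. The helper name `resolve_max_pages` and the default of 500 are hypothetical; only the use of `getattr` for `max_pages` is from the release notes.

```python
import argparse

DEFAULT_MAX_PAGES = 500  # hypothetical default, for illustration only


def resolve_max_pages(args: argparse.Namespace) -> int:
    """Read max_pages without assuming the attribute exists."""
    # args.max_pages raises AttributeError when the flag was never
    # registered on this parser; getattr() with a default does not.
    value = getattr(args, "max_pages", None)
    return DEFAULT_MAX_PAGES if value is None else value
```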
- Unified `create` command that auto‑detects source type and consolidates all workflow entry points
- 65 bundled enhancement workflow presets (e.g., security-focus, api-documentation) with management commands
Full changelog
🎯 v3.1.0 — "Unified CLI & Developer Experience"
One command for everything. 65 workflow presets. 178 production configs. 2280+ tests.
🚀 What's New
✨ Unified create Command — One command to rule them all
No more remembering which command to use. Just create with anything:
# Auto-detects source type
skill-seekers create https://docs.react.dev/ # → web scraper
skill-seekers create facebook/react # → GitHub analysis
skill-seekers create ./my-project # → local codebase
skill-seekers create tutorial.pdf # → PDF extraction
skill-seekers create configs/react.json # → multi-source unified
# Quick preset shortcut (-p)
skill-seekers create https://docs.react.dev/ -p quick
skill-seekers create facebook/react -p comprehensive
# Progressive help — no more flag overwhelm
skill-seekers create --help # 13 universal flags (clean)
skill-seekers create --help-web # web-specific options
skill-seekers create --help-github # GitHub-specific options
skill-seekers create --help-all # every flag (120+)
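As a rough illustration of how the auto-detection shown in the commands above could work, the sketch below classifies a `create` argument by shape. The rules, their order, and the name `detect_source_type` are assumptions for illustration, not the project's actual detector.

```python
import re
from pathlib import Path


def detect_source_type(source: str) -> str:
    """Classify a `create` argument. Hypothetical rules, checked in order."""
    # URLs are checked first so they are never mistaken for local paths.
    if re.match(r"^https?://", source, re.IGNORECASE):
        return "web"
    suffix = Path(source).suffix.lower()
    if suffix == ".pdf":
        return "pdf"
    if suffix == ".json":
        return "unified"
    # owner/repo shorthand: exactly one slash and no leading dot.
    is_repo = re.fullmatch(r"[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+", source)
    if is_repo and not source.startswith("."):
        return "github"
    return "local"
```

Applied to the five examples above, this returns `web`, `github`, `local`, `pdf`, and `unified` respectively.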
🔧 65 Enhancement Workflow Presets
Tailor your skills for specific use cases with bundled workflow presets:
# Chain multiple workflows
skill-seekers create facebook/react \
--enhance-workflow security-focus \
--enhance-workflow api-documentation
# Manage presets
skill-seekers workflows list # Browse all 65 bundled presets
skill-seekers workflows show security-focus # Inspect a preset
skill-seekers workflows copy security-focus # Copy to user dir for customization
skill-seekers workflows add my-preset.yaml # Add custom preset
Bundled presets cover: security-focus, api-documentation, architecture-comprehensive, testing-focus, microservices-patterns, kubernetes-deployment, database-schema, mlops-pipeline, rest-api-design, graphql-schema, responsive-design, performance-optimization, accessibility-a11y and 50+ more.
⚡ Smart Enhancement Dispatcher
# Auto-detects API key or falls back to Claude Code CLI
skill-seekers enhance output/react/
# Explicit target
skill-seekers enhance output/react/ --target gemini
# Docker/root guard — clear error instead of silent failure
# (fixes #286, #289)
📄 ReStructuredText (RST) Support
Sphinx/RST documentation sites now extract content properly — class references, code blocks, tables, and cross-references are all parsed correctly.
🗃️ 178 Production Configs — All Reviewed & Enhanced
All configs in skill-seekers-configs brought to v1.1.0 quality standard:
- ✅ All `max_pages` fields removed (deprecated, defaults apply automatically)
- ✅ 5–13 categories per config, 3–6 keywords each
- ✅ Semantic selector fallback chains (`article, main, div[role='main']`)
- ✅ Outdated URLs fixed (Astro v3 restructure, Laravel 12.x)
- ✅ `scripts/validate-config.py` bug fixes
🐛 Notable Bug Fixes
| Fix | Issue / impact |
|-----|-------|
| --enhance-workflow flag forwarding in create command | workflows were silently ignored |
| LOCAL enhancement blocked for root/Docker users | fixes #286, #289 |
| %APPDATA% config paths on Windows | fixes #283 |
| Bracket characters in llms.txt URLs (IPv6 parse error) | fixes #284 |
| Unified config categories not found in validate-config.py | multi-source configs always failed |
📊 Stats
| Metric | v3.0.0 | v3.1.0 |
|--------|--------|--------|
| Tests passing | 1,852 | 2,280+ |
| Enhancement workflow presets | 0 | 65 |
| Production configs | 178 | 178 (all reviewed) |
| CLI entry points | 22 | 23 (workflows) |
| Platforms supported | 16 | 16 |
📦 Installation
pip install skill-seekers==3.1.0
# or
pip install --upgrade skill-seekers
🐳 Docker
docker pull yusufk/skill-seekers:3.1.0
docker pull yusufk/skill-seekers:latest
# MCP server
docker pull yusufk/skill-seekers-mcp:3.1.0
🔗 Links
Full Changelog: https://github.com/yusufkaraaslan/Skill_Seekers/compare/v3.0.0...v3.1.0
- 16 platform adaptors (RAG, AI platforms, coding assistants, Markdown)
- 26 MCP tools covering config generation, scraping, packaging, source management, splitting and vector DB exports
- Cloud storage support for AWS S3, Google Cloud Storage, Azure Blob with upload/download/list/presigned URL features
Full changelog
[3.0.0] - 2026-02-10
🚀 "Universal Intelligence Platform" - Major Release
Theme: Transform any documentation into structured knowledge for any AI system.
This is our biggest release ever! v3.0.0 establishes Skill Seekers as the universal documentation preprocessor for the entire AI ecosystem - from RAG pipelines to AI coding assistants to Claude skills.
Highlights
- 🚀 16 platform adaptors (up from 4 in v2.x)
- 🛠️ 26 MCP tools (up from 9)
- ✅ 1,852 tests passing (up from 700+)
- ☁️ Cloud storage support (S3, GCS, Azure)
- 🔄 CI/CD ready (GitHub Action + Docker)
- 📦 12 example projects for every integration
- 📚 18 integration guides complete
Added - Platform Adaptors (16 Total)
RAG & Vector Databases (8)
- LangChain (`--format langchain`) - Output LangChain Document objects
- LlamaIndex (`--format llama-index`) - Output LlamaIndex TextNode objects
- Chroma (`--format chroma`) - Direct ChromaDB integration
- FAISS (`--format faiss`) - Facebook AI Similarity Search
- Haystack (`--format haystack`) - Deepset Haystack pipelines
- Qdrant (`--format qdrant`) - Qdrant vector database
- Weaviate (`--format weaviate`) - Weaviate vector search
- Pinecone-ready (`--target markdown`) - Markdown format ready for Pinecone
AI Platforms (3)
- Claude (`--target claude`) - Claude AI skills (ZIP + YAML)
- Gemini (`--target gemini`) - Google Gemini skills (tar.gz)
- OpenAI (`--target openai`) - OpenAI ChatGPT (ZIP + Vector Store)
AI Coding Assistants (4)
- Cursor (`--target claude` + `.cursorrules`) - Cursor IDE integration
- Windsurf (`--target claude` + `.windsurfrules`) - Windsurf/Codeium
- Cline (`--target claude` + `.clinerules`) - VS Code extension
- Continue.dev (`--target claude`) - Universal IDE support
Generic (1)
- Markdown (`--target markdown`) - Generic ZIP export
Added - MCP Tools (26 Total)
Config Tools (3)
- `generate_config` - Generate scraping configuration
- `list_configs` - List available preset configs
- `validate_config` - Validate config JSON structure
Scraping Tools (8)
- `estimate_pages` - Estimate page count before scraping
- `scrape_docs` - Scrape documentation websites
- `scrape_github` - Scrape GitHub repositories
- `scrape_pdf` - Extract from PDF files
- `scrape_codebase` - Analyze local codebases
- `detect_patterns` - Detect design patterns in code
- `extract_test_examples` - Extract usage examples from tests
- `build_how_to_guides` - Build how-to guides from code
Packaging Tools (4)
- `package_skill` - Package skill for target platform
- `upload_skill` - Upload to LLM platform
- `enhance_skill` - AI-powered enhancement
- `install_skill` - One-command complete workflow
Source Tools (5)
- `fetch_config` - Fetch config from remote source
- `submit_config` - Submit config for approval
- `add_config_source` - Add Git config source
- `list_config_sources` - List config sources
- `remove_config_source` - Remove config source
Splitting Tools (2)
- `split_config` - Split large configs
- `generate_router` - Generate router skills
Vector DB Tools (4)
- `export_to_weaviate` - Export to Weaviate
- `export_to_chroma` - Export to ChromaDB
- `export_to_faiss` - Export to FAISS
- `export_to_qdrant` - Export to Qdrant
Added - Cloud Storage
Upload skills directly to cloud storage:
- AWS S3 - `skill-seekers cloud upload --provider s3 --bucket my-bucket`
- Google Cloud Storage - `skill-seekers cloud upload --provider gcs --bucket my-bucket`
- Azure Blob Storage - `skill-seekers cloud upload --provider azure --container my-container`
Features:
- Upload/download directories
- List files with metadata
- Check file existence
- Generate presigned URLs
- Cloud-agnostic interface
Added - CI/CD Support
GitHub Action
- uses: skill-seekers/action@v1
with:
config: configs/react.json
format: langchain
Features:
- Auto-update on doc changes
- Matrix builds for multiple frameworks
- Scheduled updates
- Caching for faster runs
Docker
docker run -v $(pwd):/data skill-seekers:latest scrape --config /data/config.json
Added - Production Infrastructure
- Helm Charts - Kubernetes deployment
- Docker Compose - Local vector DB stack
- Monitoring - Sentry integration, sync monitoring
- Benchmarking - Performance testing framework
Added - 12 Example Projects
Complete working examples for every integration:
- langchain-rag-pipeline - React docs → LangChain → Chroma
- llama-index-query-engine - Vue docs → LlamaIndex
- pinecone-upsert - Documentation → Pinecone
- chroma-example - Full ChromaDB workflow
- faiss-example - FAISS index building
- haystack-pipeline - Haystack RAG pipeline
- qdrant-example - Qdrant vector DB
- weaviate-example - Weaviate integration
- cursor-react-skill - React skill for Cursor
- windsurf-fastapi-context - FastAPI for Windsurf
- cline-django-assistant - Django assistant for Cline
- continue-dev-universal - Universal IDE context
Quality Metrics
- ✅ 1,852 tests across 100 test files
- ✅ 58,512 lines of Python code
- ✅ 80+ documentation files
- ✅ 100% test coverage for critical paths
- ✅ CI/CD on every commit
Fixed
URL Conversion Bug with Anchor Fragments (Issue #277)
- Critical Bug Fix: Fixed 404 errors when scraping documentation with anchor links
  - Problem: URLs with anchor fragments (e.g., `#synchronous-initialization`) were malformed
    - Incorrect: `https://example.com/docs/api#method/index.html.md` ❌
    - Correct: `https://example.com/docs/api/index.html.md` ✅
  - Root Cause: `_convert_to_md_urls()` didn't strip anchor fragments before appending `/index.html.md`
  - Solution: Parse URLs with `urllib.parse` to remove fragments and deduplicate base URLs
  - Impact: Prevents duplicate requests for the same page with different anchors
- Additional Fix: Changed `.md` detection from `".md" in url` to `url.endswith('.md')`
  - Prevents false matches on URLs like `/cmd-line` or `/AMD-processors`
- Test Coverage: 12 comprehensive tests covering all edge cases
- Anchor fragment stripping
- Deduplication of multiple anchors on same URL
- Query parameter preservation
- Trailing slash handling
- Real-world MikroORM case validation
- 54/54 tests passing (42 existing + 12 new)
- Reported by: @devjones via Issue #277
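The fragment handling described in this fix can be sketched with `urllib.parse`. This is a minimal illustration under stated assumptions, not the project's `_convert_to_md_urls()`: the helper names `to_md_url` and `to_md_urls` are hypothetical, and keeping the query string after the appended path is one possible reading of "query parameter preservation".

```python
from urllib.parse import urlsplit, urlunsplit


def to_md_url(url: str) -> str:
    """Drop the #fragment and point the path at its Markdown variant."""
    parts = urlsplit(url)
    path = parts.path
    # endswith() avoids false matches on paths such as /cmd-line.
    if not path.endswith(".md"):
        path = path.rstrip("/") + "/index.html.md"
    # An empty fragment component removes the anchor; the query is kept.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))


def to_md_urls(urls):
    """Convert every URL and deduplicate while preserving order."""
    return list(dict.fromkeys(to_md_url(u) for u in urls))
```

Two links to the same page with different anchors collapse into one request, which is the deduplication effect the changelog describes.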
Added
Extended Language Detection (NEW)
- 7 New Programming Languages: Dart, Scala, SCSS, SASS, Elixir, Lua, Perl
- Pattern-based detection with confidence scoring (0.6-0.8+ thresholds)
- 70 regex patterns prioritizing unique identifiers (weight 5)
- Framework-specific patterns:
  - Dart: Flutter widgets (`StatelessWidget`, `StatefulWidget`, `Widget build()`)
  - Scala: Pattern matching (`case class`, `trait`, `match {}`)
  - SCSS: Preprocessor features (`$variables`, `@mixin`, `@include`, `@extend`)
  - SASS: Indented syntax (`=mixin`, `+include`, `$variables`)
  - Elixir: Functional patterns (`defmodule`, `def ... do`, pipe operator `|>`)
  - Lua: Game scripting (`local`, `repeat...until`, `~=`, `elseif`)
  - Perl: Text processing (`my $`, `use strict`, `sub`, `chomp`, regex `=~`)
- Comprehensive test coverage: 7 new tests, 30/30 passing (100%)
- False positive prevention: Unique identifiers (weight 5) + confidence thresholds
- No regressions: All existing language detection tests still pass
- Total language support: Now 27+ programming languages
- Credit: Contributed by @PaawanBarach via PR #275
Multi-Agent Support for Local Enhancement (NEW)
- Multiple Coding Agent Support: Choose your preferred local coding agent for SKILL.md enhancement
  - Claude Code (default): Claude Code CLI with `--dangerously-skip-permissions`
  - Codex CLI: OpenAI Codex CLI with `--full-auto` and `--skip-git-repo-check`
  - Copilot CLI: GitHub Copilot CLI (`gh copilot chat`)
  - OpenCode CLI: OpenCode CLI
  - Custom agents: Use any CLI tool with `--agent custom --agent-cmd "command {prompt_file}"`
- CLI Arguments: New flags for agent selection
  - `--agent`: Choose agent (claude, codex, copilot, opencode, custom)
  - `--agent-cmd`: Override command template for custom agents
- Environment Variables: CI/CD friendly configuration
  - `SKILL_SEEKER_AGENT`: Default agent to use
  - `SKILL_SEEKER_AGENT_CMD`: Default command template for custom agents
- Security First: Custom command validation
  - Blocks dangerous shell characters (`;`, `&`, `|`, `$`, `` ` ``, `\n`, `\r`)
  - Validates executable exists in PATH
  - Safe parsing with `shlex.split()`
- Dual Input Modes: Supports both file-based and stdin-based agents
  - File-based: Uses `{prompt_file}` placeholder (Claude, custom agents)
  - Stdin-based: Pipes prompt via stdin (Codex CLI)
- Backward Compatible: Claude Code remains the default, no breaking changes
- Comprehensive Tests: 13 new tests covering all agent types and security validation
- Agent Normalization: Smart alias handling (e.g., "claude-code" → "claude")
- Credit: Contributed by @rovo79 (Robert Dean) via PR #270
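The three custom-command checks listed under "Security First" can be combined as in the sketch below. The blocked characters, the PATH check, and `shlex.split()` are from the release notes; the function name `validate_agent_cmd`, the error messages, and the order of the checks are assumptions.

```python
import shlex
import shutil

# Characters the changelog lists as blocked in custom agent commands.
DANGEROUS_CHARS = (";", "&", "|", "$", "`", "\n", "\r")


def validate_agent_cmd(template: str) -> list:
    """Return the parsed argv for a command template, or raise ValueError."""
    for ch in DANGEROUS_CHARS:
        if ch in template:
            raise ValueError(f"dangerous character {ch!r} in agent command")
    argv = shlex.split(template)  # shell-style parsing without running a shell
    if not argv:
        raise ValueError("agent command is empty")
    if shutil.which(argv[0]) is None:
        raise ValueError(f"executable not found in PATH: {argv[0]}")
    return argv
```

Because the result is an argv list, it can be passed to `subprocess.run()` without `shell=True`, so the rejected characters never reach a shell.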
C3.10: Signal Flow Analysis for Godot Projects (NEW)
- Complete Signal Flow Analysis System: Analyze event-driven architectures in Godot game projects
  - Signal declaration extraction (`signal` keyword detection)
  - Connection mapping (`.connect()` calls with targets and methods)
  - Emission tracking (`.emit()` and `emit_signal()` calls)
  - 208 signals, 634 connections, and 298 emissions detected in test project (Cosmic Idler)
  - Signal density metrics (signals per file)
  - Event chain detection (signals triggering other signals)
  - Output: `signal_flow.json`, `signal_flow.mmd` (Mermaid diagram), `signal_reference.md`
- Signal Pattern Detection: Three major patterns identified
  - EventBus Pattern (0.90 confidence): Centralized signal hub in autoload
  - Observer Pattern (0.85 confidence): Multi-observer signals (3+ listeners)
  - Event Chains (0.80 confidence): Cascading signal propagation
- Signal-Based How-To Guides (C3.10.1): AI-generated usage guides
  - Step-by-step guides (Connect → Emit → Handle)
  - Real code examples from project
  - Common usage locations
  - Parameter documentation
  - Output: `signal_how_to_guides.md` (10 guides for Cosmic Idler)
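The declaration, connection, and emission extraction described above is regex-based according to these notes. The following sketch shows what such a scan could look like; the patterns and the name `scan_signals` are illustrative assumptions and are much simpler than a real analyzer would need to be.

```python
import re

# Hypothetical patterns, written for illustration only.
SIGNAL_DECL = re.compile(r"^[ \t]*signal[ \t]+(\w+)", re.MULTILINE)
SIGNAL_CONNECT = re.compile(r"(\w+)\.connect\(")
SIGNAL_EMIT = re.compile(r"(\w+)\.emit\(|emit_signal\(\s*[\"'](\w+)[\"']")


def scan_signals(source: str) -> dict:
    """Collect declared, connected, and emitted signal names from GDScript."""
    declared = SIGNAL_DECL.findall(source)
    connected = SIGNAL_CONNECT.findall(source)
    # Each emit match fills exactly one of the two capture groups.
    emitted = [new_style or old_style for new_style, old_style in SIGNAL_EMIT.findall(source)]
    return {"declared": declared, "connected": connected, "emitted": emitted}


SAMPLE = '''
signal health_changed(new_value)
signal died

func _ready():
    health_changed.connect(_on_health_changed)

func hit():
    health_changed.emit(10)
    emit_signal("died")
'''

if __name__ == "__main__":
    print(scan_signals(SAMPLE))
```

Counting the entries per file gives the kind of signal density metric mentioned in the list above.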
Godot Game Engine Support
- Comprehensive Godot File Type Support: Full analysis of Godot 4.x projects
  - GDScript (.gd): 265 files analyzed in test project
  - Scene files (.tscn): 118 scene files
  - Resource files (.tres): 38 resource files
  - Shader files (.gdshader, .gdshaderinc): 9 shader files
  - C# integration: Phantom Camera addon (13 files)
- GDScript Language Support: Complete GDScript parsing with regex-based extraction
  - Dependency extraction: `preload()`, `load()`, `extends` patterns
  - Test framework detection: GUT, gdUnit4, WAT
  - Test file patterns: `test_*.gd`, `*_test.gd`
  - Signal syntax: `signal`, `.connect()`, `.emit()`
  - Export decorators: `@export`, `@onready`
  - Test decorators: `@test` (gdUnit4)
- Game Engine Framework Detection: Improved detection for Unity, Unreal, Godot
  - Godot markers: `project.godot`, `.godot` directory, `.tscn`, `.tres`, `.gd` files
  - Unity markers: `Assembly-CSharp.csproj`, `UnityEngine.dll`, `ProjectSettings/ProjectVersion.txt`
  - Unreal markers: `.uproject`, `Source/`, `Config/DefaultEngine.ini`
  - Fixed false positive Unity detection (was using generic "Assets" keyword)
- GDScript Test Extraction: Extract usage examples from Godot test files
  - 396 test cases extracted from 20 GUT test files in test project
  - Patterns: instantiation (`preload().new()`, `load().new()`), assertions (`assert_eq`, `assert_true`), signals
  - GUT framework: `extends GutTest`, `func test_*()`, `add_child_autofree()`
  - Test categories: instantiation, assertions, signal connections, setup/teardown
  - Real code examples from production test files
C3.9: Project Documentation Extraction
- Markdown Documentation Extraction: Automatically extracts and categorizes all `.md` files from projects
  - Smart categorization by folder/filename (overview, architecture, guides, workflows, features, etc.)
  - Processing depth control: `surface` (raw copy), `deep` (parse+summarize), `full` (AI-enhanced)
  - AI enhancement (level 2+) adds topic extraction and cross-references
  - New "📖 Project Documentation" section in SKILL.md
  - Output to `references/documentation/` organized by category
  - Default ON, use `--skip-docs` to disable
  - 15 new tests for documentation extraction features
Granular AI Enhancement Control
- `--enhance-level` Flag: Fine-grained control over AI enhancement (0-3)
  - Level 0: No AI enhancement (default)
  - Level 1: SKILL.md enhancement only (fast, high value)
  - Level 2: SKILL.md + Architecture + Config + Documentation
  - Level 3: Full enhancement (patterns, tests, config, architecture, docs)
- Config Integration: `default_enhance_level` setting in `~/.config/skill-seekers/config.json`
- MCP Support: All MCP tools updated with `enhance_level` parameter
- Independent from `--comprehensive`: Enhancement level is separate from feature depth
C# Language Support
- C# Test Example Extraction: Full support for C# test frameworks
- Language alias mapping (C# → csharp, C++ → cpp)
- NUnit, xUnit, MSTest test framework patterns
- Mock pattern support (NSubstitute, Moq)
- Zenject dependency injection patterns
- Setup/teardown method extraction
- 2 new tests for C# extraction features
Performance Optimizations
- Parallel LOCAL Mode AI Enhancement: 6-12x faster with ThreadPoolExecutor
  - Concurrent workers: 3 (configurable via `local_parallel_workers`)
  - Batch processing: 20 patterns per Claude CLI call (configurable via `local_batch_size`)
  - Significant speedup for large codebases
- Config Settings: New `ai_enhancement` section in config
  - `local_batch_size`: Patterns per CLI call (default: 20)
  - `local_parallel_workers`: Concurrent workers (default: 3)
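The batching and worker settings above map naturally onto `concurrent.futures`. The sketch below shows one way to combine them; the defaults of 20 and 3 are from the release notes, while the function names and the `enhance_batch` callback (standing in for one CLI call per batch) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def enhance_in_parallel(patterns, enhance_batch, batch_size=20, workers=3):
    """Run enhance_batch over batches concurrently and flatten the results.

    enhance_batch is a caller-supplied function that takes one batch (a list)
    and returns a list of enhanced items.
    """
    batches = chunked(list(patterns), batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in batch order even though batches overlap in time.
        results = pool.map(enhance_batch, batches)
        return [item for batch in results for item in batch]
```

Threads suit this workload because each batch spends most of its time waiting on an external process rather than running Python code.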
UX Improvements
- Auto-Enhancement: SKILL.md automatically enhanced when using `--enhance` or `--comprehensive`
  - No need for separate `skill-seekers enhance` command
  - Seamless one-command workflow
  - 10-minute timeout for large codebases
  - Graceful fallback with retry instructions on failure
- LOCAL Mode Fallback: All AI enhancements now fall back to LOCAL mode when no API key is set
  - Applies to: pattern enhancement (C3.1), test examples (C3.2), architecture (C3.7)
  - Uses Claude Code CLI instead of failing silently
  - Better UX: "Using LOCAL mode (Claude Code CLI)" instead of "AI disabled"
- Support for custom Claude-compatible API endpoints via `ANTHROPIC_BASE_URL` environment variable
- Compatibility with GLM-4.7 and other Claude-compatible APIs across all AI enhancement features
Changed
- All AI enhancement modules now respect `ANTHROPIC_BASE_URL` for custom endpoints
- Updated documentation with GLM-4.7 configuration examples
- Rewritten LOCAL mode in `config_enhancer.py` to use Claude CLI properly with explicit output file paths
- Updated MCP `scrape_codebase_tool` with `skip_docs` and `enhance_level` parameters
- Updated CLAUDE.md with C3.9 documentation extraction feature
- Increased default batch size from 5 to 20 patterns for LOCAL mode
Fixed
- C# Test Extraction: Fixed "Language C# not supported" error with language alias mapping
- Config Type Field Mismatch: Fixed KeyError in `config_enhancer.py` by supporting both "type" and "config_type" fields
- LocalSkillEnhancer Import: Fixed incorrect import and method call in `main.py` (SkillEnhancer → LocalSkillEnhancer)
- Code Quality: Fixed 4 critical linter errors (unused imports, variables, arguments, import sorting)
Godot Game Engine Fixes
- GDScript Dependency Extraction: Fixed 265+ "Syntax error in *.gd" warnings (commit 3e6c448)
  - GDScript files were incorrectly routed to Python AST parser
  - Created dedicated `_extract_gdscript_imports()` with regex patterns
  - Now correctly parses `preload()`, `load()`, `extends` patterns
  - Result: 377 dependencies extracted with 0 warnings
- Framework Detection False Positive: Fixed Unity detection on Godot projects (commit 50b28fe)
  - Was detecting "Unity" due to generic "Assets" keyword in comments
  - Changed Unity markers to specific files: `Assembly-CSharp.csproj`, `UnityEngine.dll`, `Library/`
  - Now correctly detects Godot via `project.godot`, `.godot` directory
- Circular Dependencies: Fixed self-referential cycles (commit 50b28fe)
  - 3 self-loop warnings (files depending on themselves)
  - Added `target != file_path` check in dependency graph builder
  - Result: 0 circular dependencies detected
- GDScript Test Discovery: Fixed 0 test files found in Godot projects (commit 50b28fe)
  - Added GDScript test patterns: `test_*.gd`, `*_test.gd`
  - Added GDScript to LANGUAGE_MAP
  - Result: 32 test files discovered (20 GUT files with 396 tests)
- GDScript Test Extraction: Fixed "Language GDScript not supported" warning (commit c826690)
  - Added GDScript regex patterns to PATTERNS dictionary
  - Patterns: instantiation (`preload().new()`), assertions (`assert_eq`), signals (`.connect()`)
  - Result: 22 test examples extracted successfully
- Config Extractor Array Handling: Fixed JSON/YAML array parsing (commit fca0951)
  - Error: `'list' object has no attribute 'items'` on root-level arrays
  - Added isinstance checks for dict/list/primitive at root
  - Result: No JSON array errors, save.json parsed correctly
- Progress Indicators: Fixed missing progress for small batches (commit eec37f5)
  - Progress only shown every 5 batches, invisible for small jobs
  - Modified condition to always show for batches < 10
  - Result: "Progress: 1/2 batches completed" now visible
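The root-level array fix in the config extractor item above comes down to checking the node type before calling `.items()`. A minimal sketch of that idea follows; the function name `flatten_config` and the dotted-path output format are assumptions made for illustration, not the project's code.

```python
def flatten_config(node, prefix=""):
    """Flatten nested JSON/YAML data into {path: value} pairs.

    Handles a dict, a list, or a primitive at any level, including the
    root, so a top-level array no longer triggers an AttributeError.
    """
    flat = {}
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            flat.update(flatten_config(value, path))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            flat.update(flatten_config(value, f"{prefix}[{index}]"))
    else:
        flat[prefix or "<root>"] = node
    return flat
```

Calling `.items()` unconditionally works only when every config file has an object at the top level, which is why a file with a top-level array exposed the bug.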
Tests
- GDScript Test Extraction Test: Added comprehensive test case for GDScript GUT/gdUnit4 framework
  - Tests player instantiation with `preload()` and `load()`
  - Tests signal connections and emissions
  - Tests gdUnit4 `@test` annotation syntax
  - Tests game state management patterns
  - 4 test functions with 60+ lines of GDScript code
  - Validates extraction of instantiations, assertions, and signal patterns
Removed
- Removed client-specific documentation files from repository
- Complete Signal Flow Analysis for Godot projects (declaration extraction, connection mapping, emission tracking, density metrics, event chain detection) with JSON/Mermaid/Markdown outputs
- Signal Pattern Detection identifying EventBus, Observer, and Event Chain patterns with confidence scores
- Full GDScript language support including dependency extraction (`preload`, `load`, `extends`), test framework detection (GUT, gdUnit4), export decorators, and parameter documentation
Full changelog
🎮 Game Development Release - Godot Engine Support
This release adds comprehensive support for Godot game engine projects with industry-leading signal flow analysis and complete GDScript language support.
🎮 Added
C3.10: Signal Flow Analysis for Godot Projects ⭐ NEW
- Complete Signal Flow Analysis System: Analyze event-driven architectures in Godot game projects
  - Signal declaration extraction (`signal` keyword detection)
  - Connection mapping (`.connect()` calls with targets and methods)
  - Emission tracking (`.emit()` and `emit_signal()` calls)
  - Real-world results: 208 signals, 634 connections, 298 emissions detected in Cosmic Idler test project
  - Signal density metrics (0.78 signals/file)
  - Event chain detection (signals triggering other signals)
  - Output: `signal_flow.json` (374KB), `signal_flow.mmd` (Mermaid diagram), `signal_reference.md` (34KB)
- Signal Pattern Detection: Three major patterns identified with confidence scoring
  - EventBus Pattern (0.90 confidence): Centralized signal hub in autoload
  - Observer Pattern (0.85 confidence): Multi-observer signals (3+ listeners, theme_changed: 21 connections)
  - Event Chains (0.80 confidence): Cascading signal propagation
- Signal-Based How-To Guides (C3.10.1): AI-generated usage guides
  - Step-by-step guides (Connect → Emit → Handle)
  - Real code examples from project
  - Common usage locations with file references
  - Parameter documentation
  - Output: `signal_how_to_guides.md` (10 guides generated for Cosmic Idler)
Complete Godot Game Engine Support
-
Comprehensive Godot File Type Support: Full analysis of Godot 4.x projects
- GDScript (.gd): 265 files analyzed in test project (59.8% of codebase)
- Scene files (.tscn): 118 scene files (26.6%)
- Resource files (.tres): 38 resource files (8.6%)
- Shader files (.gdshader, .gdshaderinc): 9 shader files (2.0%)
- C# integration: Phantom Camera addon (13 files, 2.9%)
-
GDScript Language Support: Complete GDScript parsing with regex-based extraction
- Dependency extraction:
preload(),load(),extendspatterns - Test framework detection: GUT, gdUnit4, WAT
- Test file patterns:
test_*.gd,*_test.gd - Signal syntax:
signal,.connect(),.emit() - Export decorators:
@export,@onready - Test decorators:
@test(gdUnit4) - 377 dependencies extracted with 0 syntax errors
- Dependency extraction:
-
Game Engine Framework Detection: Improved detection for Unity, Unreal, Godot
- Godot markers:
project.godot,.godotdirectory,.tscn,.tres,.gdfiles - Unity markers:
Assembly-CSharp.csproj,UnityEngine.dll,ProjectSettings/ProjectVersion.txt - Unreal markers:
.uproject,Source/,Config/DefaultEngine.ini - Fixed false positive Unity detection (was using generic "Assets" keyword)
- Priority-based detection (game engines detected before web frameworks)
- Godot markers:
-
GDScript Test Extraction: Extract usage examples from Godot test files
- 396 test cases extracted from 20 GUT test files in Cosmic Idler test project
- Patterns: instantiation (`preload().new()`, `load().new()`), assertions (`assert_eq`, `assert_true`), signals
- GUT framework: `extends GutTest`, `func test_*()`, `add_child_autofree()`
- Test categories: instantiation, assertions, signal connections, setup/teardown
- Real code examples from production test files
- 22 high-quality test examples extracted
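The regex-based dependency extraction described under GDScript Language Support could look roughly like the sketch below. The patterns and the function name are assumptions made for illustration; the project's actual regexes may differ, and this version only captures string paths (an `extends Node` class name is deliberately ignored).

```python
import re

# Hypothetical patterns; the project's real regexes may differ.
PRELOAD_OR_LOAD = re.compile(r'\b(?:preload|load)\(\s*["\']([^"\']+)["\']\s*\)')
EXTENDS_PATH = re.compile(r'^\s*extends\s+["\']([^"\']+)["\']', re.MULTILINE)


def extract_gdscript_imports(source):
    """Collect resource paths referenced via preload(), load(), or extends "path"."""
    deps = PRELOAD_OR_LOAD.findall(source)   # preload("...") and load("...")
    deps += EXTENDS_PATH.findall(source)     # extends "res://..." (path form only)
    return deps
```

Because GDScript is not valid Python, a text-level scan like this avoids the syntax errors that an AST parser would raise on `.gd` files.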
🐛 Fixed
Godot-Specific Bug Fixes
-
GDScript Dependency Extraction (commit 3e6c448): Fixed 265+ "Syntax error in *.gd" warnings
- GDScript files were incorrectly routed to Python AST parser
- Created dedicated `_extract_gdscript_imports()` with regex patterns
- Now correctly parses `preload()`, `load()`, `extends` patterns
- Result: 377 dependencies extracted with 0 warnings
-
Framework Detection False Positive (commit 50b28fe): Fixed Unity detection on Godot projects
- Was detecting "Unity" due to generic "Assets" keyword in comments
- Changed Unity markers to specific files: `Assembly-CSharp.csproj`, `UnityEngine.dll`, `Library/`
- Now correctly detects Godot via `project.godot`, `.godot` directory
-
Circular Dependencies (commit 50b28fe): Fixed self-referential cycles
- 3 self-loop warnings (files depending on themselves)
- Added `target != file_path` check in dependency graph builder
- Result: 0 circular dependencies detected
-
GDScript Test Discovery (commit 50b28fe): Fixed 0 test files found in Godot projects
- Added GDScript test patterns: `test_*.gd`, `*_test.gd`
- Added GDScript to LANGUAGE_MAP
- Result: 32 test files discovered (20 GUT files with 396 tests)
-
GDScript Test Extraction (commit c826690): Fixed "Language GDScript not supported" warning
- Added GDScript regex patterns to PATTERNS dictionary
- Patterns: instantiation (`preload().new()`), assertions (`assert_eq`), signals (`.connect()`)
- Result: 22 test examples extracted successfully
-
Config Extractor Array Handling (commit fca0951): Fixed JSON/YAML array parsing
- Error: `'list' object has no attribute 'items'` on root-level arrays
- Added isinstance checks for dict/list/primitive at root
- Result: No JSON array errors, save.json parsed correctly
-
Progress Indicators (commit eec37f5): Fixed missing progress for small batches
- Progress only shown every 5 batches, invisible for small jobs
- Modified condition to always show for batches < 10
- Result: "Progress: 1/2 batches completed" now visible
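The Config Extractor Array Handling fix above comes down to checking the type of the parsed root before iterating. The following is a minimal sketch of that idea, not the project's code: `flatten_config` and the dotted-key output format are hypothetical.

```python
import json


def flatten_config(value, prefix=""):
    """Flatten parsed JSON/YAML into {dotted_key: primitive}, whatever the root type."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        # A list has no .items(); index its elements instead.
        items = ((str(i), v) for i, v in enumerate(value))
    else:
        # Primitive at this level: record it under the key built so far.
        return {prefix or "<root>": value}

    flat = {}
    for key, child in items:
        child_prefix = f"{prefix}.{key}" if prefix else key
        flat.update(flatten_config(child, child_prefix))
    return flat


# A root-level array, the case that previously raised the 'items' error.
print(flatten_config(json.loads('[{"slot": 1}, {"slot": 2}]')))
# → {'0.slot': 1, '1.slot': 2}
```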
🧪 Tests
- GDScript Test Extraction Test: Added comprehensive test case for GDScript GUT/gdUnit4 framework
- Tests player instantiation with `preload()` and `load()`
- Tests signal connections and emissions
- Tests gdUnit4 `@test` annotation syntax
- Tests game state management patterns
- 4 test functions with 60+ lines of GDScript code
- Validates extraction of instantiations, assertions, and signal patterns
📊 Quality Metrics (Cosmic Idler Test Project)
- SKILL.md Quality: 9/10 rating (31KB, 1,030 lines)
- File Coverage: 98% (443/452 files analyzed)
- Signal Analysis: 208 signals, 634 connections, 298 emissions
- Test Coverage: 32 test files discovered, 22 examples extracted
- Dependency Graph: 377 dependencies, 0 circular cycles
- Language Breakdown: GDScript 59.8%, Scenes 26.6%, Resources 8.6%, Shaders 2.0%
📝 Files Changed
- 1 new file: `signal_flow_analyzer.py` (489 lines)
- 15 modified files: Core analyzers, test extractors, dependency analyzers
- +1,574 additions, -157 deletions
🎯 Use Cases
This release is perfect for:
- 🎮 Godot game developers wanting to understand signal architectures
- 📚 Teams documenting Godot projects for AI assistants (Claude, ChatGPT, Gemini)
- 🔍 Code reviewers analyzing event-driven patterns in games
- 🎓 Game development educators creating learning materials
- 🤖 AI agents needing deep understanding of Godot codebases
🙏 Thanks
Special thanks to the Godot community and Cosmic Idler project for providing an excellent test case for validating all features!
Full Changelog: https://github.com/yusufkaraaslan/Skill_Seekers/compare/v2.8.0...v2.9.0
- Default AI enhancement level changed to 0; existing workflows using automatic enhancement must set `--enhance-level` or update config.
- Configuration file now expects a `default_enhance_level` key under `ai_enhancement` section.
- Markdown Documentation Extraction (C3.9) with categorization and AI‑enhanced output
- Granular `--enhance-level` flag (0‑3) for fine‑grained AI enhancement control
- C# test example extraction supporting NUnit, xUnit, MSTest, NSubstitute, Moq
Full changelog
[2.8.0] - 2026-02-01
🚀 Major Feature Release - Enhanced Code Analysis & Documentation
This release brings powerful new code analysis features, performance optimizations, and international API support. Special thanks to all our contributors who made this release possible!
Added
C3.9: Project Documentation Extraction
- Markdown Documentation Extraction: Automatically extracts and categorizes all `.md` files from projects
- Smart categorization by folder/filename (overview, architecture, guides, workflows, features, etc.)
- Processing depth control: `surface` (raw copy), `deep` (parse+summarize), `full` (AI-enhanced)
- AI enhancement (level 2+) adds topic extraction and cross-references
- New "📖 Project Documentation" section in SKILL.md
- Output to `references/documentation/` organized by category
- Default ON, use `--skip-docs` to disable
- 15 new tests for documentation extraction features
Granular AI Enhancement Control
- `--enhance-level` Flag: Fine-grained control over AI enhancement (0-3)
- Level 0: No AI enhancement (default)
- Level 1: SKILL.md enhancement only (fast, high value)
- Level 2: SKILL.md + Architecture + Config + Documentation
- Level 3: Full enhancement (patterns, tests, config, architecture, docs)
- Config Integration: `default_enhance_level` setting in `~/.config/skill-seekers/config.json`
- MCP Support: All MCP tools updated with `enhance_level` parameter
- Independent from `--comprehensive`: Enhancement level is separate from feature depth
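One way to read the interaction between the `--enhance-level` flag and the `default_enhance_level` config key is sketched below. The precedence shown (explicit flag, then config file, then 0) is an assumption, as is the helper name `resolve_enhance_level`; only the key names, the file location, and the 0-3 range come from the notes above.

```python
import json
from pathlib import Path

CONFIG_PATH = Path.home() / ".config" / "skill-seekers" / "config.json"


def resolve_enhance_level(cli_level=None, config_path=CONFIG_PATH):
    """Pick the enhancement level: explicit flag first, then config, then 0."""
    if cli_level is not None:
        level = cli_level
    else:
        try:
            config = json.loads(Path(config_path).read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            config = {}  # missing or unreadable config falls back to the default
        if not isinstance(config, dict):
            config = {}
        level = config.get("ai_enhancement", {}).get("default_enhance_level", 0)
    if level not in (0, 1, 2, 3):
        raise ValueError(f"enhance level must be 0-3, got {level!r}")
    return level
```

With no flag and no config file this returns 0, which matches the new default of no AI enhancement.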
C# Language Support
- C# Test Example Extraction: Full support for C# test frameworks
- Language alias mapping (C# → csharp, C++ → cpp)
- NUnit, xUnit, MSTest test framework patterns
- Mock pattern support (NSubstitute, Moq)
- Zenject dependency injection patterns
- Setup/teardown method extraction
- 2 new tests for C# extraction features
Performance Optimizations
- Parallel LOCAL Mode AI Enhancement: 6-12x faster with ThreadPoolExecutor
- Concurrent workers: 3 (configurable via `local_parallel_workers`)
- Batch processing: 20 patterns per Claude CLI call (configurable via `local_batch_size`)
- Significant speedup for large codebases
- Config Settings: New `ai_enhancement` section in config
- `local_batch_size`: Patterns per CLI call (default: 20)
- `local_parallel_workers`: Concurrent workers (default: 3)
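The batching and worker settings above map naturally onto `concurrent.futures.ThreadPoolExecutor`. The sketch below shows the general shape under stated assumptions: `enhance_batch` stands in for whatever performs one Claude CLI call, and both function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def enhance_in_parallel(patterns, enhance_batch, batch_size=20, workers=3):
    """Run `enhance_batch` over batches concurrently, preserving input order."""
    batches = chunked(patterns, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(enhance_batch, batches)  # map() yields in submission order
        return [item for batch_result in results for item in batch_result]
```

Threads suit this workload because each batch spends its time waiting on an external process rather than running Python code.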
UX Improvements
-
Auto-Enhancement: SKILL.md automatically enhanced when using `--enhance` or `--comprehensive`
- No need for separate `skill-seekers enhance` command
- Seamless one-command workflow
- 10-minute timeout for large codebases
- Graceful fallback with retry instructions on failure
-
LOCAL Mode Fallback: All AI enhancements now fall back to LOCAL mode when no API key is set
- Applies to: pattern enhancement (C3.1), test examples (C3.2), architecture (C3.7)
- Uses Claude Code CLI instead of failing silently
- Better UX: "Using LOCAL mode (Claude Code CLI)" instead of "AI disabled"
-
Support for custom Claude-compatible API endpoints via `ANTHROPIC_BASE_URL` environment variable
-
Compatibility with GLM-4.7 and other Claude-compatible APIs across all AI enhancement features
Changed
- All AI enhancement modules now respect `ANTHROPIC_BASE_URL` for custom endpoints
- Updated documentation with GLM-4.7 configuration examples
- Rewrote LOCAL mode in `config_enhancer.py` to use Claude CLI properly with explicit output file paths
- Updated MCP `scrape_codebase_tool` with `skip_docs` and `enhance_level` parameters
- Updated CLAUDE.md with C3.9 documentation extraction feature and `--enhance-level` flag
- Increased default batch size from 5 to 20 patterns for LOCAL mode
Fixed
- C# Test Extraction: Fixed "Language C# not supported" error with language alias mapping
- Config Type Field Mismatch: Fixed KeyError in `config_enhancer.py` by supporting both "type" and "config_type" fields
- LocalSkillEnhancer Import: Fixed incorrect import and method call in `main.py` (SkillEnhancer → LocalSkillEnhancer)
- Code Quality: Fixed 4 critical linter errors (unused imports, variables, arguments, import sorting)
Removed
- Removed client-specific documentation files from repository
🙏 Contributors
A huge thank you to everyone who contributed to this release:
- @xuintl - Chinese README improvements and documentation refinements
- @Zhichang Yu - GLM-4.7 support and PDF scraper fixes
- @YusufKaraaslanSpyke - Core features, bug fixes, and project maintenance
Special thanks to all our community members who reported issues, provided feedback, and helped test new features. Your contributions make Skill Seekers better for everyone! 🎉
Fixed Chinese language selector link on PyPI that previously returned a 404 error.
Full changelog
🔧 Bug Fix - Language Selector Links
This patch release fixes the broken Chinese language selector link that appeared on PyPI and other non-GitHub platforms.
Fixed
- Broken Language Selector Links on PyPI
- Issue: Chinese language link used relative URL (`README.zh-CN.md`) which only worked on GitHub
- Impact: Users on PyPI clicking "简体中文" got 404 errors
- Solution: Changed to absolute GitHub URL
- Result: Language selector now works on PyPI, GitHub, and all platforms
- Files Fixed: `README.md`, `README.zh-CN.md`
Links
- PyPI Package: https://pypi.org/project/skill-seekers/2.7.4/
- Full Changelog: https://github.com/yusufkaraaslan/Skill_Seekers/blob/main/CHANGELOG.md#274---2026-01-22
- Complete README translation to Simplified Chinese (README.zh-CN.md)
- Language selector badges for switching between English and Chinese
- PyPI metadata updated with i18n keywords, classifiers, and direct link to Chinese README
Full changelog
🌏 International i18n Release
This documentation release adds comprehensive Chinese language support, making Skill Seekers accessible to the world's largest developer community.
✨ What's New
🇨🇳 Chinese (Simplified) Documentation
- Complete README Translation - 1,962 lines of comprehensive Chinese documentation (README.zh-CN.md)
- Language Selector Badges - Easy switching between English and Chinese in both READMEs
- Machine Translation Disclaimer - Honest labeling with invitation for community improvements
- Community Engagement - GitHub issue #260 created for native speakers to improve translation quality
📦 PyPI Metadata Internationalization
- Updated Package Description - Now highlights Chinese documentation availability
- i18n Keywords - Added "i18n", "chinese", "international" for better discoverability
- Natural Language Classifiers - English and Chinese (Simplified) officially declared
- Direct Chinese README Link - Added to project URLs for easy access from PyPI
🌍 Why This Matters
Market Impact:
- ✅ Reaches 1+ billion Chinese speakers worldwide
- ✅ Taps into the world's largest developer community
- ✅ Better discoverability on Chinese search engines (Baidu, Gitee, etc.)
- ✅ Professional image showing international awareness
- ✅ Competitive advantage - most similar tools lack Chinese documentation
For Users:
- ✅ Native language documentation lowers barrier to entry
- ✅ Better user experience with familiar terminology
- ✅ Increased engagement from Chinese developer community
- ✅ Potential for more contributors and feedback
🤝 Community Contribution
We invite Chinese developers to help improve the translation:
- Review Issue: #260
- What to Review: Technical accuracy, natural expression, terminology
- How to Help: Comment on the issue with suggestions or submit a PR
All contributions are welcome and appreciated!
📥 Installation
🔗 Important Links
- Chinese README: README.zh-CN.md
- Community Review: Issue #260
- PyPI Package: https://pypi.org/project/skill-seekers/2.7.3/
- Official Website: https://skillseekersweb.com/
📝 Full Changelog
See CHANGELOG.md for complete release notes.
Fixed CLI bugs that prevented core commands (install, scrape) from working and corrected version display.
Full changelog
🚨 Critical CLI Bug Fixes
This hotfix release resolves 4 critical CLI bugs reported in issues #258 and #259 that prevented core commands from working correctly.
Fixed
Issue #258: install --config command fails with unified scraper (#258)
- Root Cause: `unified_scraper.py` missing `--fresh` and `--dry-run` argument definitions
- Solution: Added both flags to unified_scraper argument parser and main.py dispatcher
- Impact: `skill-seekers install --config react` now works without "unrecognized arguments" error
Issue #259 (Original): scrape command doesn't accept URL and --max-pages (#259)
- Root Cause: No positional URL argument or `--max-pages` flag support
- Solution: Added positional URL argument and `--max-pages` flag with safety warnings
- Impact: `skill-seekers scrape https://example.com --max-pages 50` now works
- Safety Warnings: Warns if max-pages > 1000 or < 10
Issue #259 (Comment A): Version shows 2.7.0 instead of actual version (#259)
- Root Cause: Hardcoded version string in main.py
- Solution: Import `__version__` from `__init__.py` dynamically
- Impact: `skill-seekers --version` now shows correct version (2.7.2)
Issue #259 (Comment B): PDF command shows empty "Error: " message (#259)
- Root Cause: Exception handler didn't handle empty exception messages
- Solution: Improved exception handler to show exception type and added context-specific messages
- Impact: PDF errors now show clear messages instead of just "Error: "
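The empty "Error: " message in the PDF fix above is what `str(exc)` produces for an exception raised without arguments. A minimal sketch of a handler that always has something to show follows; `format_error` and its message wording are hypothetical, not the project's implementation.

```python
def format_error(exc, context=""):
    """Build a user-facing message that is never just "Error: "."""
    # str(exc) is empty for exceptions raised without arguments.
    detail = str(exc).strip() or "(no details provided)"
    prefix = f"{context}: " if context else ""
    return f"Error: {prefix}{type(exc).__name__}: {detail}"
```

Including the exception type means even a bare `raise ValueError()` yields a message the user can search for or report.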
Installation
pip install --upgrade skill-seekers
Testing
- ✅ Verified all commands work with exact issue reproduction steps
- ✅ All 202 tests passing
Full Changelog
https://github.com/yusufkaraaslan/Skill_Seekers/blob/main/CHANGELOG.md#272---2026-01-21
Fixed config download 404 errors caused by manual URL construction.
Full changelog
🚨 Critical Bug Fix - Config Download 404 Errors
This hotfix release resolves a critical bug causing 404 errors when downloading configs from the API.
Fixed
- Critical: Config download 404 errors - Fixed bug where code was constructing download URLs manually instead of using the `download_url` field from the API response
- Root Cause: Code was building `f"{API_BASE_URL}/api/download/{config_name}.json"` which failed when actual URLs differed (CDN URLs, version-specific paths)
- Solution: Changed to use `config_info.get("download_url")` from API response in both MCP server implementations
- Files Fixed:
- `src/skill_seekers/mcp/tools/source_tools.py` (FastMCP server)
- `src/skill_seekers/mcp/server_legacy.py` (Legacy server)
- Impact: Fixes all config downloads from skillseekersweb.com API and private Git repositories
- Reported By: User testing `skill-seekers install --config godot --unlimited`
- Testing: All 15 source tools tests pass, all 8 fetch_config tests pass
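The fix above amounts to trusting the URL the API reports instead of rebuilding it. A hedged sketch of that approach is shown here; `resolve_download_url` is a hypothetical helper, and the `name` field used in the error message is an assumption about the response shape.

```python
def resolve_download_url(config_info):
    """Return the download URL reported by the API, or raise if it is absent."""
    url = config_info.get("download_url")
    if not url:
        # "name" is an assumed field, used here only to make the error readable.
        name = config_info.get("name", "<unknown>")
        raise ValueError(f"API response for config {name!r} has no download_url")
    return url
```

Failing loudly when the field is missing is a deliberate choice in this sketch: silently falling back to a constructed URL would reintroduce the 404 behavior the release fixed.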
Installation
pip install --upgrade skill-seekers
Or install a specific version:
pip install skill-seekers==2.7.1
Links
- PyPI: https://pypi.org/project/skill-seekers/2.7.1/
- Website: https://skillseekersweb.com/
- Documentation: https://github.com/yusufkaraaslan/Skill_Seekers/blob/main/README.md
🤖 Generated with Claude Code
Co-Authored-By: Claude Sonnet 4.5 [email protected]
- Smart Rate Limit Handler with prompt/wait/switch/fail strategies
- Multi‑Token Configuration System supporting profiles, secure storage and API key management
- Interactive configuration wizard for GitHub tokens, API keys, rate limits, and resume settings
Full changelog
[2.7.0] - 2026-01-18
🔐 Smart Rate Limit Management & Multi-Token Configuration
This minor feature release introduces intelligent GitHub rate limit handling, multi-profile token management, and comprehensive configuration system. Say goodbye to indefinite waits and confusing token setup!
Added
-
🎯 Multi-Token Configuration System - Flexible GitHub token management with profiles
- Secure config storage at `~/.config/skill-seekers/config.json` with 600 permissions
- Multiple GitHub profiles support (personal, work, OSS, etc.)
- Per-profile rate limit strategies: `prompt`, `wait`, `switch`, `fail`
- Configurable timeout per profile (default: 30 minutes)
- Auto-detection and smart fallback chain
- Profile switching when rate limited
- API key management for Claude, Gemini, OpenAI
- Environment variable fallback (ANTHROPIC_API_KEY, GOOGLE_API_KEY, OPENAI_API_KEY)
- Config file storage with secure permissions
- Progress tracking for resumable jobs
- Auto-save at configurable intervals (default: 60 seconds)
- Job metadata: command, progress, checkpoints, timestamps
- Stored at `~/.local/share/skill-seekers/progress/`
- Auto-cleanup of old progress files (default: 7 days, configurable)
- First-run experience with welcome message and quick setup
- ConfigManager class with singleton pattern for global access
-
🧙 Interactive Configuration Wizard - Beautiful terminal UI for easy setup
- Main menu with 7 options:
- GitHub Token Setup
- API Keys (Claude, Gemini, OpenAI)
- Rate Limit Settings
- Resume Settings
- View Current Configuration
- Test Connections
- Clean Up Old Progress Files
- GitHub token management:
- Add/remove profiles with descriptions
- Set default profile
- Browser integration - opens GitHub token creation page
- Token validation with format checking (ghp_, github_pat_)
- Strategy selection per profile
- API keys setup with browser integration for each provider
- Connection testing to verify tokens and API keys
- Configuration display with current status and sources
- CLI commands:
- `skill-seekers config` - Main menu
- `skill-seekers config --github` - Direct to GitHub setup
- `skill-seekers config --api-keys` - Direct to API keys
- `skill-seekers config --show` - Show current config
- `skill-seekers config --test` - Test connections
-
🚦 Smart Rate Limit Handler - Intelligent GitHub API rate limit management
- Upfront warning about token status (60/hour vs 5000/hour)
- Real-time detection of rate limits from GitHub API responses
- Parses X-RateLimit-* headers
- Detects 403 rate limit errors
- Calculates reset time from timestamps
- Live countdown timers with progress display
- Automatic profile switching - tries next available profile when rate limited
- Four rate limit strategies:
- `prompt` - Ask user what to do (default, interactive)
- `wait` - Auto-wait with countdown timer
- `switch` - Automatically try another profile
- `fail` - Fail immediately with clear error
- Non-interactive mode for CI/CD (fail fast, no prompts)
- Configurable timeouts per profile (prevents indefinite waits)
- RateLimitHandler class with strategy pattern
- Integration points: GitHub fetcher, GitHub scraper
-
📦 Resume Command - Resume interrupted scraping jobs
- List resumable jobs with progress details:
- Job ID, started time, command
- Current phase and file counts
- Last updated timestamp
- Resume from checkpoints (skeleton implemented, ready for integration)
- Auto-cleanup of old jobs (respects config settings)
- CLI commands:
- `skill-seekers resume --list` - List all resumable jobs
- `skill-seekers resume <job-id>` - Resume specific job
- `skill-seekers resume --clean` - Clean up old jobs
- Progress storage at `~/.local/share/skill-seekers/progress/<job-id>.json`
-
⚙️ CLI Enhancements - New flags and improved UX
- `--non-interactive` flag for CI/CD mode
- Available on: `skill-seekers github`
- Fails fast on rate limits instead of prompting
- Perfect for automated pipelines
- `--profile` flag to select specific GitHub profile
- Available on: `skill-seekers github`
- Uses configured profile from `~/.config/skill-seekers/config.json`
- Overrides environment variables and defaults
- Entry points for new commands:
- `skill-seekers-config` - Direct config command access
- `skill-seekers-resume` - Direct resume command access
-
🧪 Comprehensive Test Suite - Full test coverage for new features
- 16 new tests in `test_rate_limit_handler.py`
- Test coverage:
- Header creation (with/without token)
- Handler initialization (token, strategy, config)
- Rate limit detection and extraction
- Upfront checks (interactive and non-interactive)
- Response checking (200, 403, rate limit)
- Strategy handling (fail, wait, switch, prompt)
- Config manager integration
- Profile management (add, retrieve, switch)
- All tests passing ✅ (16/16)
- Test utilities: Mock responses, config isolation, tmp directories
-
🎯 Bootstrap Skill Feature - Self-hosting capability (PR #249)
- Self-Bootstrap: Generate skill-seekers as a Claude Code skill
- `./scripts/bootstrap_skill.sh` - One-command bootstrap
- Combines manual header with auto-generated codebase analysis
- Output: `output/skill-seekers/` ready for Claude Code
- Install: `cp -r output/skill-seekers ~/.claude/skills/`
- Robust Frontmatter Detection:
- Dynamic YAML frontmatter boundary detection (not hardcoded line counts)
- Fallback to line 6 if frontmatter not found
- Future-proof against frontmatter field additions
- SKILL.md Validation:
- File existence and non-empty checks
- Frontmatter delimiter presence
- Required fields validation (name, description)
- Exit with clear error messages on validation failures
- Comprehensive Error Handling:
- UV dependency check with install instructions
- Permission checks for output directory
- Graceful degradation on missing header file
-
🔧 MCP Now Optional - User choice for installation profile
- CLI Only: `pip install skill-seekers` - No MCP dependencies
- MCP Integration: `pip install skill-seekers[mcp]` - Full MCP support
- All Features: `pip install skill-seekers[all]` - Everything enabled
- Lazy Loading: Graceful failure with helpful error messages when MCP not installed
- Interactive Setup Wizard:
- Shows all installation options on first run
- Stored at `~/.config/skill-seekers/.setup_shown`
- Accessible via `skill-seekers-setup` command
- Entry Point: `skill-seekers-setup` for manual access
-
🧪 E2E Testing for Bootstrap - Comprehensive end-to-end tests
- 6 core tests verifying bootstrap workflow:
- Output structure creation
- Header prepending
- YAML frontmatter validation
- Line count sanity checks
- Virtual environment installability
- Platform adaptor compatibility
- Pytest markers: @pytest.mark.e2e, @pytest.mark.venv, @pytest.mark.slow
- Execution modes:
- Fast tests: `pytest -k "not venv"` (~2-3 min)
- Full suite: `pytest -m "e2e"` (~5-10 min)
- Test utilities: Fixtures for project root, bootstrap runner, output directory
-
📚 Comprehensive Documentation Overhaul - Complete v2.7.0 documentation update
- 7 new documentation files (~3,750 lines total):
- `docs/reference/API_REFERENCE.md` (750 lines) - Programmatic usage guide for Python developers
- `docs/features/BOOTSTRAP_SKILL.md` (450 lines) - Self-hosting capability documentation
- `docs/reference/CODE_QUALITY.md` (550 lines) - Code quality standards and ruff linting guide
- `docs/guides/TESTING_GUIDE.md` (750 lines) - Complete testing reference (1200+ test suite)
- `docs/QUICK_REFERENCE.md` (300 lines) - One-page cheat sheet for quick command lookup
- `docs/guides/MIGRATION_GUIDE.md` (400 lines) - Version upgrade guides (v1.0.0 → v2.7.0)
- `docs/FAQ.md` (550 lines) - Comprehensive Q&A for common user questions
- 10 existing files updated:
- `README.md` - Updated test count badge (700+ → 1200+ tests), v2.7.0 callout
- `ROADMAP.md` - Added v2.7.0 completion section with task statuses
- `CONTRIBUTING.md` - Added link to CODE_QUALITY.md reference
- `docs/README.md` - Quick links by use case, recent updates section
- `docs/guides/MCP_SETUP.md` - Fixed server_fastmcp references (PR #252)
- `docs/QUICK_REFERENCE.md` - Updated MCP server reference (server.py → server_fastmcp.py)
- `CLAUDE_INTEGRATION.md` - Updated version references
- 3 other documentation files with v2.7.0 updates
- Version consistency: All version references standardized to v2.7.0
- Test counts: Standardized to 1200+ tests (was inconsistent 700+ in some docs)
- MCP tool counts: Updated to 18 tools (from 17)
-
📦 Git Submodules for Configuration Management - Improved config organization and API deployment
- Configs as git submodule at `api/configs_repo/` for cleaner repository
- Production configs: Added official production-ready configuration presets
- Duplicate removal: Cleaned up all duplicate configs from main repository
- Test filtering: Filtered out test-example configs from API endpoints
- CI/CD integration: GitHub Actions now initializes submodules automatically
- API deployment: Updated render.yaml to use git submodule for configs_repo
- Benefits: Cleaner main repo, better config versioning, production/test separation
-
🔍 Config Discovery Enhancements - Improved config listing
- `--all` flag for estimate command: `skill-seekers estimate --all`
- Lists all available preset configurations with descriptions
- Helps users discover supported frameworks before scraping
- Shows config names, frameworks, and documentation URLs
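As an illustration of the Smart Rate Limit Handler described earlier in this list, the sketch below reads GitHub's `X-RateLimit-Remaining` and `X-RateLimit-Reset` headers and decides whether a `wait` strategy fits inside the per-profile timeout. The function names and return shape are hypothetical and do not reflect the actual `RateLimitHandler` class.

```python
import time


def parse_rate_limit(status_code, headers, now=None):
    """Summarize GitHub rate limit state from a response's status and headers."""
    now = time.time() if now is None else now
    lowered = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    remaining = int(lowered.get("x-ratelimit-remaining", 1))
    reset_at = int(lowered.get("x-ratelimit-reset", 0))   # epoch seconds
    limited = status_code == 403 and remaining == 0
    wait_seconds = max(0, reset_at - int(now)) if limited else 0
    return {"limited": limited, "remaining": remaining, "wait_seconds": wait_seconds}


def should_wait(info, timeout_minutes=30):
    """Under a `wait` strategy, only wait if the reset falls within the timeout."""
    return info["limited"] and info["wait_seconds"] <= timeout_minutes * 60
```

Comparing the computed wait against the timeout up front is what turns an indefinite wait into a fast, explainable failure.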
Changed
-
GitHub Fetcher - Integrated rate limit handler
- Modified `github_fetcher.py` to use `RateLimitHandler`
- Added upfront rate limit check before starting
- Checks responses for rate limits on all API calls
- Automatic profile detection from config
- Raises `RateLimitError` when rate limit cannot be handled
- Constructor now accepts `interactive` and `profile_name` parameters
-
GitHub Scraper - Added rate limit support
- New `--non-interactive` flag for CI/CD mode
- New `--profile` flag to select GitHub profile
- Config now supports `interactive` and `github_profile` keys
- CLI argument passing for non-interactive and profile options
-
Main CLI - Enhanced with new commands
- Added `config` subcommand with options (--github, --api-keys, --show, --test)
- Added `resume` subcommand with options (--list, --clean)
- Updated GitHub subcommand with --non-interactive and --profile flags
- Updated command documentation strings
- Version bumped to 2.7.0
-
pyproject.toml - New entry points and dependency restructuring
- Added `skill-seekers-config` entry point
- Added `skill-seekers-resume` entry point
- Added `skill-seekers-setup` entry point for setup wizard
- MCP moved to optional dependencies - Now requires `pip install skill-seekers[mcp]`
- Updated pytest markers: e2e, venv, bootstrap, slow
- Version updated to 2.7.0
-
install_skill.py - Lazy MCP loading
- Try/except ImportError for MCP imports
- Graceful failure with helpful error message when MCP not installed
- Suggests alternatives: scrape + package workflow
- Maintains backward compatibility for existing MCP users
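The lazy MCP loading described for install_skill.py follows a common optional-dependency pattern: attempt the import, and on `ImportError` explain which pip extra provides it. The helper below is a generic sketch of that pattern, not the project's code; `load_optional` is a hypothetical name.

```python
import importlib


def load_optional(module_name, extra):
    """Import an optional dependency, or exit with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Chain the original error so the root cause stays visible in tracebacks.
        raise SystemExit(
            f"'{module_name}' is not installed. "
            f"Install it with: pip install skill-seekers[{extra}]"
        ) from exc
```

Deferring the import to the point of use is what lets a CLI-only install run every command that does not need MCP.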
Fixed
-
Code Quality Improvements - Fixed all 21 ruff linting errors across codebase
- SIM102: Combined nested if statements using `and` operator (7 fixes)
- SIM117: Combined multiple `with` statements into single multi-context `with` (9 fixes)
- B904: Added `from e` to exception chaining for proper error context (1 fix)
- SIM113: Removed unused enumerate counter variable (1 fix)
- B007: Changed unused loop variable to `_` (1 fix)
- ARG002: Removed unused method argument in test fixture (1 fix)
- Files affected: config_extractor.py, config_validator.py, doc_scraper.py, pattern_recognizer.py (3), test_example_extractor.py (3), unified_skill_builder.py, pdf_scraper.py, and 6 test files
- Result: Zero linting errors, cleaner code, better maintainability
-
Version Synchronization - Fixed version mismatch across package (Issue #248)
- All `__init__.py` files now correctly show version 2.7.0 (was 2.5.2 in 4 files)
- Files updated: `src/skill_seekers/__init__.py`, `src/skill_seekers/cli/__init__.py`, `src/skill_seekers/mcp/__init__.py`, `src/skill_seekers/mcp/tools/__init__.py`
- Ensures `skill-seekers --version` shows accurate version number
- Critical: Prevents bug where PyPI shows wrong version (Issue #248)
-
Case-Insensitive Regex in Install Workflow - Fixed install workflow failures (Issue #236)
- Made regex patterns case-insensitive using `(?i)` flag
- Patterns now match both "Saved to:" and "saved to:" (and any case variation)
- Files: `src/skill_seekers/mcp/tools/packaging_tools.py` (lines 529, 668)
- Impact: install_skill workflow now works reliably regardless of output formatting
-
Test Fixture Error - Fixed pytest fixture error in bootstrap skill tests
- Removed unused `tmp_path` parameter causing fixture lookup errors
- File: `tests/test_bootstrap_skill.py:54`
- Result: All CI test runs now pass without fixture errors
-
MCP Setup Modernization - Updated MCP server configuration (PR #252, @MiaoDX)
- Fixed 41 instances of `server_fastmcp_fastmcp` → `server_fastmcp` typo in docs/guides/MCP_SETUP.md
- Updated all 12 files to use `skill_seekers.mcp.server_fastmcp` module
- Enhanced setup_mcp.sh with automatic venv detection (.venv, venv, $VIRTUAL_ENV)
- Updated tests to accept `-e ".[mcp]"` format and module references
- Files: .claude/mcp_config.example.json, CLAUDE.md, README.md, docs/guides/*.md, setup_mcp.sh, tests/test_setup_scripts.py
- Benefits: Eliminates "module not found" errors, clean dependency isolation, prepares for v3.0.0
-
Rate limit indefinite wait - No more infinite waiting
- Configurable timeout per profile (default: 30 minutes)
- Clear error messages when timeout exceeded
- Graceful exit with helpful next steps
- Resume capability for interrupted jobs
-
Token setup confusion - Clear, guided setup process
- Interactive wizard with browser integration
- Token validation with helpful error messages
- Clear documentation of required scopes
- Test connection feature to verify tokens work
-
CI/CD failures - Non-interactive mode support
- `--non-interactive` flag fails fast instead of hanging
- No user prompts in non-interactive mode
- Clear error messages for automation logs
- Exit codes for pipeline integration
-
AttributeError in codebase_scraper.py - Fixed incorrect flag check (PR #249)
- Changed `if args.build_api_reference:` to `if not args.skip_api_reference:`
- Aligns with v2.5.2 opt-out flag strategy (--skip-* instead of --build-*)
- Fixed at line 1193 in codebase_scraper.py
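The case-insensitive regex fix in the install workflow (Issue #236) can be illustrated with Python's inline `(?i)` flag. The exact pattern used in `packaging_tools.py` is not shown in these notes, so the one below is an assumption, and `find_saved_path` is a hypothetical helper.

```python
import re

# (?i) makes the whole pattern case-insensitive; the exact pattern is an assumption.
SAVED_TO = re.compile(r"(?i)saved to:\s*(\S+)")


def find_saved_path(output):
    """Return the path following a "Saved to:" marker, or None if absent."""
    match = SAVED_TO.search(output)
    return match.group(1) if match else None
```

Note that `\S+` stops at the first whitespace, so this sketch would truncate paths containing spaces; a real implementation would need a pattern suited to the tool's actual output format.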
Technical Details
- Architecture: Strategy pattern for rate limit handling, singleton for config manager
- Files Modified: 6 (github_fetcher.py, github_scraper.py, main.py, pyproject.toml, install_skill.py, codebase_scraper.py)
- New Files: 6 (config_manager.py ~490 lines, config_command.py ~400 lines, rate_limit_handler.py ~450 lines, resume_command.py ~150 lines, setup_wizard.py ~95 lines, test_bootstrap_skill_e2e.py ~169 lines)
- Bootstrap Scripts: 2 (bootstrap_skill.sh enhanced, skill_header.md)
- Tests: 22 tests added, all passing (16 rate limit + 6 E2E bootstrap)
- Dependencies: MCP moved to optional, no new required dependencies
- Backward Compatibility: Fully backward compatible, MCP optionality via pip extras
- Credits: Bootstrap feature contributed by @MiaoDX (PR #249)
Migration Guide
Existing users - No migration needed! Everything works as before.
MCP users - If you use MCP integration features:
# Reinstall with MCP support
pip install -U skill-seekers[mcp]
# Or install everything
pip install -U skill-seekers[all]
New installation profiles:
# CLI only (no MCP)
pip install skill-seekers
# With MCP integration
pip install skill-seekers[mcp]
# With multi-LLM support (Gemini, OpenAI)
pip install skill-seekers[all-llms]
# Everything
pip install skill-seekers[all]
# See all options
skill-seekers-setup
To use new features:
# Set up GitHub token (one-time)
skill-seekers config --github
# Add multiple profiles
skill-seekers config
# → Select "1. GitHub Token Setup"
# → Select "1. Add New Profile"
# Use specific profile
skill-seekers github --repo owner/repo --profile work
# CI/CD mode
skill-seekers github --repo owner/repo --non-interactive
# View configuration
skill-seekers config --show
# Bootstrap skill-seekers as a Claude Code skill
./scripts/bootstrap_skill.sh
cp -r output/skill-seekers ~/.claude/skills/
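The `--non-interactive` CI/CD mode used above is described as failing fast with a clear message and an exit code instead of prompting. A minimal sketch of that idea follows; `handle_rate_limit`, `EXIT_RATE_LIMITED`, and the message text are hypothetical and are not taken from the skill-seekers source.

```python
import sys

# Hypothetical exit code for "rate limited"; the real CLI may use other values.
EXIT_RATE_LIMITED = 3


def handle_rate_limit(non_interactive: bool, ask=input) -> int:
    """Return an exit code: 0 to continue, non-zero to abort."""
    if non_interactive:
        # Never prompt in automation: log a clear message and fail fast.
        print("error: GitHub rate limit reached (non-interactive mode)", file=sys.stderr)
        return EXIT_RATE_LIMITED
    answer = ask("Rate limit reached. Wait and retry? [y/N] ")
    return 0 if answer.strip().lower() == "y" else EXIT_RATE_LIMITED
```

A pipeline wrapper could pass the returned value to `sys.exit()` so the job status reflects the failure.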
Breaking Changes
None - this release is fully backward compatible.
- Remove old `--build-*` flags from existing commands; use the new skip flags (`--skip-api-reference`, `--skip-dependency-graph`, `--skip-patterns`, `--skip-test-examples`) only if you need to disable specific analyses.
- Review generated SKILL.md and ARCHITECTURE.md for automatically included analysis results.
- All C3.x analysis features are now enabled by default; old enable flags (--build-api-reference, --build-dependency-graph, --detect-patterns, --extract-test-examples) are deprecated and removed.
- Complete C3.x Codebase Analysis Suite (C3.1-C3.8), adding design pattern detection, test example extraction, AI-enhanced how-to guides, config pattern extraction with security analysis, architectural overview generation, and standalone codebase scraper SKILL.md generation.
- Comprehensive documentation reorganization into subdirectories and archive with new README navigation index.
Full changelog
🚀 Complete C3.x Codebase Analysis Suite + Documentation Reorganization
This is a major feature release that delivers the complete C3.x codebase analysis suite (C3.1-C3.8), transforming Skill Seekers into a comprehensive code documentation and analysis tool. Also includes comprehensive documentation reorganization and quality-of-life improvements.
🎯 Complete C3.x Codebase Analysis Suite
C3.1 Design Pattern Detection
- 10 Design Patterns: Singleton, Factory, Observer, Strategy, Decorator, Builder, Adapter, Command, Template Method, Chain of Responsibility
- 9 Languages: Python, JavaScript, TypeScript, C++, C, C#, Go, Rust, Java (plus Ruby, PHP)
- 3 Detection Levels: Surface (fast), deep (balanced), full (thorough)
- CLI: `skill-seekers-patterns --file src/db.py`
- 87% precision, 80% recall (tested on 100 real-world projects)
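To give a feel for what a surface-level check might look like, here is a small sketch that flags Singleton-like Python classes with the standard `ast` module. The heuristic (a class that overrides `__new__` and holds an `instance`-like class attribute) and the function name `find_singletons` are assumptions for illustration, not the detector shipped in this release.

```python
import ast


def find_singletons(source: str) -> list[str]:
    """Return names of classes that look like Singletons (surface-level check)."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        # Evidence 1: the class overrides __new__.
        defines_new = any(
            isinstance(item, ast.FunctionDef) and item.name == "__new__"
            for item in node.body
        )
        # Evidence 2: a class-level attribute whose name contains "instance".
        has_instance_attr = any(
            isinstance(item, ast.Assign)
            and any(
                isinstance(target, ast.Name) and "instance" in target.id.lower()
                for target in item.targets
            )
            for item in node.body
        )
        if defines_new and has_instance_attr:
            hits.append(node.name)
    return hits
```

A deeper detection level would presumably look at how `__new__` uses the attribute rather than relying on names alone.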
C3.2 Test Example Extraction
- Extracts real usage examples from test files
- 5 Categories: instantiation, method_call, config, setup, workflow
- 9 Languages: Python (AST-based), JavaScript, TypeScript, Go, Rust, Java, C#, PHP, Ruby
- Quality filtering with confidence scoring
- CLI: `skill-seekers extract-test-examples tests/ --language python`
C3.3 How-To Guide Generation with AI Enhancement ⭐
- Transforms test workflows into step-by-step educational guides
- 🆕 COMPREHENSIVE AI ENHANCEMENT - 5 automatic improvements:
- Step Descriptions - Natural language explanations
- Troubleshooting Solutions - Diagnostic flows + solutions
- Prerequisites Explanations - Why needed + setup instructions
- Next Steps Suggestions - Related guides, learning paths
- Use Case Examples - Real-world scenarios
- 3 AI Modes:
- API Mode: Claude API (requires ANTHROPIC_API_KEY)
- LOCAL Mode: Claude Code CLI (FREE, no API key needed!)
- AUTO Mode: Automatic detection (default)
- Quality Transformation: 75-line templates → 500+ line professional tutorials
- CLI: `skill-seekers-how-to-guides test_examples.json --ai-mode auto`
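The notes do not say how AUTO mode chooses between API and LOCAL. One plausible reading, sketched here purely as an assumption, is that it picks API mode when `ANTHROPIC_API_KEY` is set and falls back to LOCAL otherwise. The function name `resolve_ai_mode` and the lowercase mode strings are hypothetical.

```python
import os
from typing import Mapping, Optional


def resolve_ai_mode(requested: str = "auto", env: Optional[Mapping[str, str]] = None) -> str:
    """Map a requested mode to the mode that would actually run."""
    env = os.environ if env is None else env
    if requested != "auto":
        # An explicit choice always wins.
        return requested
    # Assumption: AUTO prefers the API when a key is present, else uses LOCAL.
    return "api" if env.get("ANTHROPIC_API_KEY") else "local"
```

Passing `env` explicitly keeps the function easy to test without touching the real environment.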
C3.4 Configuration Pattern Extraction with AI Enhancement
- 9 Config Formats: JSON, YAML, TOML, ENV, INI, Python, JS/TS, Dockerfile, Docker Compose
- 7 Common Patterns: Database, API, Logging, Cache, Email, Auth, Server configs
- 🆕 AI ENHANCEMENT (optional):
- Explanations - What each setting does
- Best Practices - Suggested improvements
- Security Analysis - Identifies hardcoded secrets
- Migration Suggestions - Consolidation opportunities
- Context - Pattern explanations
- CLI: `skill-seekers-config-extractor --directory . --enhance-local`
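The security analysis item (identifying hardcoded secrets) could look something like the following for a JSON config. The key-name pattern, the `${...}` placeholder rule, and the function name `find_hardcoded_secrets` are illustrative assumptions, not the extractor's real rules.

```python
import json
import re

# Hypothetical heuristic: key names that usually hold credentials.
SECRET_KEY = re.compile(r"(password|secret|token|api[_-]?key)", re.IGNORECASE)


def find_hardcoded_secrets(config_text: str) -> list[str]:
    """Return dotted paths of string values whose key name looks like a secret."""
    findings = []

    def walk(node, path):
        if isinstance(node, dict):
            for key, value in node.items():
                child = f"{path}.{key}" if path else key
                if isinstance(value, str) and value and SECRET_KEY.search(key):
                    # Values like "${DB_PASSWORD}" are references, not literals.
                    if not value.startswith("${"):
                        findings.append(child)
                walk(value, child)
        elif isinstance(node, list):
            for index, item in enumerate(node):
                walk(item, f"{path}[{index}]")

    walk(json.loads(config_text), "")
    return findings
```

For a config with a literal `database.password` and an `api.token` set to `${API_TOKEN}`, only the first path is reported.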
C3.5 Architectural Overview & Skill Integrator
- ARCHITECTURE.md Generation - Comprehensive architectural overview with 8 sections:
- Sections: 1. Overview, 2. Architectural Patterns, 3. Technology Stack, 4. Design Patterns, 5. Configuration Overview, 6. Common Workflows, 7. Usage Examples, 8. Entry Points
- Default ON - Runs automatically when GitHub sources have `local_repo_path`
- Organized outputs in `references/codebase_analysis/`
- Enhanced SKILL.md with Architecture & Code Analysis summary
C3.6 AI Enhancement
- AI-powered insights for patterns and test examples
- Pattern Enhancement: Explains why patterns were detected, suggests improvements
- Test Example Enhancement: Adds context, groups into tutorials, identifies best practices
- Batch processing (5 items per call) for efficiency
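Batching five items per call means that, for example, 12 detected patterns would need 3 AI calls (5 + 5 + 2) rather than 12. A generic helper for that grouping might look like this; the name `batched` and its signature are illustrative and not taken from the project.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], size: int = 5) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start : start + size]
```

Each yielded slice would then be sent as one enhancement request.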
C3.7 Architectural Pattern Detection
- Detects high-level patterns: MVC, MVVM, MVP, Repository, Service Layer, Layered, Clean Architecture
- Framework detection: Django, Flask, Spring, ASP.NET, Rails, Laravel, Angular, React, Vue.js
- Evidence-based with confidence scoring
- AI-enhanced architectural recommendations
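Evidence-based detection with confidence scoring can be pictured as counting how many expected signals a codebase shows. The sketch below uses directory names as the only evidence. The `EVIDENCE` table, the scoring formula, and the function name `score_patterns` are assumptions for illustration, and the real detector likely weighs many more signals.

```python
from pathlib import PurePosixPath

# Hypothetical evidence table: directory names that hint at each pattern.
EVIDENCE = {
    "MVC": {"models", "views", "controllers"},
    "Repository": {"repositories"},
    "Service Layer": {"services"},
}


def score_patterns(paths: list[str]) -> dict[str, float]:
    """Confidence = fraction of a pattern's expected directories that were seen."""
    # Collect every directory component (the final part is the file name).
    seen = {part.lower() for p in paths for part in PurePosixPath(p).parts[:-1]}
    scores = {}
    for pattern, expected in EVIDENCE.items():
        found = expected & seen
        if found:
            scores[pattern] = round(len(found) / len(expected), 2)
    return scores
```

A project with `models/` and `views/` but no `controllers/` would score 0.67 for MVC under this scheme.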
C3.8 Standalone Codebase Scraper SKILL.md Generation
- Generates comprehensive SKILL.md (300+ lines) with all C3.x analysis integrated
- Sections: Description, When to Use, Quick Reference, Design Patterns, Architecture, Configuration
- Perfect for: Private codebases, offline analysis, local project documentation
- CLI: `skill-seekers-codebase-scraper --directory /path/to/code`
✨ Enhanced LOCAL Enhancement Modes
4 Execution Modes for different use cases:
- Headless (default): Foreground, waits for completion (perfect for CI/CD)
- Background (`--background`): Background thread, returns immediately
- Daemon (`--daemon`): Fully detached with `nohup`, survives parent exit
- Terminal (`--interactive-enhancement`): Opens new terminal window (macOS)
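The difference between the first three modes can be sketched with the standard library. This is a simplified illustration, not the project's implementation: the function names are hypothetical, and `start_new_session=True` (POSIX only) stands in for the `nohup`-based detachment mentioned above.

```python
import subprocess
import threading
from typing import Callable, Sequence


def run_headless(job: Callable[[], None]) -> None:
    # Foreground: the caller blocks until the job finishes.
    job()


def run_background(job: Callable[[], None]) -> threading.Thread:
    # Background thread: returns immediately; the job dies with the parent process.
    thread = threading.Thread(target=job, daemon=True)
    thread.start()
    return thread


def run_detached(command: Sequence[str]) -> int:
    # Detached process in its own session, so it can outlive the parent.
    proc = subprocess.Popen(
        command,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,
    )
    return proc.pid
```

The trade-off is visibility versus durability: headless gives the caller the result directly, while a detached process needs a separate status check.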
Force Mode (Default ON): Skip all confirmations - perfect for CI/CD automation!
Status Monitoring: New enhance-status command for background/daemon processes
- `skill-seekers enhance-status output/react/` - Check status
- `skill-seekers enhance-status output/react/ --watch` - Real-time watch
- `skill-seekers enhance-status output/react/ --json` - JSON output
📚 Comprehensive Documentation Reorganization
Complete overhaul of documentation structure:
- Removed 7 temporary/analysis files from root
- Archived 14 historical documents to `docs/archive/`
- Organized 29 files into clear subdirectories:
- `docs/features/` (10 files) - Core features, AI enhancement, PDF tools
- `docs/integrations/` (3 files) - Multi-LLM platform support
- `docs/guides/` (6 files) - Setup, MCP, usage guides
- `docs/reference/` (8 files) - Architecture, standards, technical reference
- Created `docs/README.md` - Navigation index with "I want to..." user-focused navigation
Result: 3x faster documentation discovery, scalable structure
🔧 Global Setup Script with FastMCP
- New `setup.sh` for global PyPI installation
- Sets up MCP server configuration for Claude Code Desktop
- Perfect for end users (no development setup needed)
- Separate from `setup_mcp.sh` (development setup)
💥 BREAKING CHANGES
Analysis Features Now Default ON
- All analysis features now enabled by default for better UX
- Old flags (DEPRECATED): `--build-api-reference`, `--build-dependency-graph`, `--detect-patterns`, `--extract-test-examples`
- New flags: `--skip-api-reference`, `--skip-dependency-graph`, `--skip-patterns`, `--skip-test-examples`
- Migration: Remove old `--build-*` flags (features are now ON by default)
- Impact: `codebase-scraper --directory .` now runs all analysis features automatically
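For scripts that build command lines programmatically, the migration step amounts to dropping the four old opt-in flags. A small helper along these lines could do it; `migrate_argv` is a hypothetical name, and the flag list is copied from the notes above.

```python
# Old opt-in flags named in the release notes; features are now on by default.
DEPRECATED_FLAGS = {
    "--build-api-reference",
    "--build-dependency-graph",
    "--detect-patterns",
    "--extract-test-examples",
}


def migrate_argv(argv: list[str]) -> list[str]:
    """Drop deprecated opt-in flags from a command line, keeping everything else."""
    return [arg for arg in argv if arg not in DEPRECATED_FLAGS]
```

The new `--skip-*` flags pass through untouched, so a migrated command keeps any explicit opt-outs.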
🐛 Bug Fixes
- Fixed codebase scraper language stats dict format handling
- Fixed install-agent directory traversal edge case
📊 Release Statistics
- 160 files changed
- 44,965 additions
- 4,704 deletions
- 56 new test files for C3.x features
- 700+ tests passing (100% test coverage for all C3.x features)
📦 Installation
pip install --upgrade skill-seekers
🔗 Links
- PyPI Package: https://pypi.org/project/skill-seekers/2.6.0/
- Documentation: https://github.com/yusufkaraaslan/Skill_Seekers#readme
- Full Changelog: https://github.com/yusufkaraaslan/Skill_Seekers/blob/main/CHANGELOG.md#260---2026-01-13
- Feature Docs:
🎉 What This Means for Users
This release transforms Skill Seekers from a documentation scraper into a complete codebase analysis and documentation tool. You can now:
- Analyze any codebase and generate comprehensive documentation automatically
- Extract design patterns from your code (87% precision)
- Generate how-to guides from your tests (with AI enhancement!)
- Detect architectural patterns (MVC, MVVM, Clean Architecture, etc.)
- Extract configuration patterns with security analysis
- Get AI-powered insights for all analysis (using Claude Code - FREE!)
- Run everything by default - no flags needed for full analysis
Perfect for: Code reviews, onboarding, documentation generation, architectural analysis, security audits
This is the most significant release in Skill Seekers history! 🚀