Currently, site2pdf does not support feeding it a list of known URLs directly. However, I believe detecting sub-pages from a sitemap.xml file could work well. Thank you for the question!
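For what it's worth, here is a rough sketch of what the sitemap idea could look like. This is not something site2pdf does today; the function name `urls_from_sitemap` is hypothetical, and it assumes a plain `<urlset>` sitemap in the standard sitemaps.org namespace (a sitemap index file that points to other sitemaps would need an extra step).

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol for <urlset> documents.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text: str) -> list[str]:
    """Return the <loc> values of every <url> entry in a sitemap document.

    Hypothetical helper for illustration only; not part of site2pdf.
    """
    root = ET.fromstring(xml_text)
    urls = []
    # Each page is listed as <url><loc>...</loc></url> under the root <urlset>.
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        if loc.text and loc.text.strip():
            urls.append(loc.text.strip())
    return urls
```

The resulting list could then be handed to whatever does the per-page rendering, in place of crawling links from the main page.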
Looks like GPT4All[1] and AnythingLLM[2] are worth exploring. There's also the closed-source macOS app RecurseChat[3,4] which appeared on HN a few months ago[5].
Exploring local solutions like GPT4All and AnythingLLM sounds promising. I'll also look into RecurseChat on macOS. Thanks again for the suggestions and for sharing the insights! Another tool that might be worth considering is Dify.
It can be used with the following endpoint, but it's not particularly good:
export ANTHROPIC_BASE_URL=https://dashscope-intl.aliyuncs.com/api/v2/apps/claude-code-...
By the way, I'm benchmarking these comparisons: https://github.com/laiso/ts-bench/blob/main/src/agents/build...