feat: add fastCRW tool block #5025
PR Summary: Low Risk
Overview: New ... Wiring is additive across block/tool registries, ...
Reviewed by Cursor Bugbot for commit 056eac2.
Cursor Bugbot has reviewed your changes and found 3 potential issues.
Reviewed by Cursor Bugbot for commit 2964aed.
      pages: [],
      total: 0,
    },
  }
Crawl job id wrong path
High Severity
The crawl create handler sets jobId from the top-level id on the JSON body, but fastCRW’s documented POST /v1/crawl response puts the job id under the nested data object. jobId stays undefined, so status polling hits /v1/crawl/undefined and the crawl operation fails end-to-end.
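A minimal sketch of the fix this comment points toward, assuming the create response carries the job id under `data.id` as the comment describes. The `CrawlCreateResponse` interface and the `extractJobId` helper are hypothetical names for illustration, not code from this PR.

```typescript
// Hypothetical shape of the POST /v1/crawl response body (an assumption
// based on the review comment, not on the fastCRW docs themselves).
interface CrawlCreateResponse {
  success?: boolean
  id?: string
  data?: { id?: string }
  error?: string
}

// Reads the job id from the nested `data` object, falls back to a
// top-level `id`, and fails fast when neither is present so that
// polling never targets /v1/crawl/undefined.
function extractJobId(body: CrawlCreateResponse): string {
  const jobId = body.data?.id ?? body.id
  if (!jobId) {
    throw new Error(body.error ?? 'Crawl job creation returned no job id')
  }
  return jobId
}
```

Keeping the top-level fallback is a design choice for compatibility with servers that return the id at the root; it can be dropped if only the nested form is ever returned.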
      formats: params.formats || ['markdown'],
      onlyMainContent: params.onlyMainContent || false,
    },
  }
Crawl sends maxPages not limit
Medium Severity
The crawl request body sends maxPages, while fastCRW’s Firecrawl-compatible POST /v1/crawl expects limit for the page cap. The block’s Max Pages value is ignored and the service falls back to its default crawl size.
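One way to map the block's Max Pages value onto the field the comment says the endpoint expects is sketched below. `CrawlParams`, `CrawlRequestBody`, and `buildCrawlBody` are hypothetical names, and the `scrapeOptions` wrapper is an assumption borrowed from Firecrawl's request shape rather than something shown in this diff.

```typescript
// Hypothetical tool parameters, as the block might pass them in.
interface CrawlParams {
  url: string
  maxPages?: number
  formats?: string[]
  onlyMainContent?: boolean
}

// Hypothetical request body; `limit` is the page cap named in the review.
interface CrawlRequestBody {
  url: string
  limit?: number
  scrapeOptions: { formats: string[]; onlyMainContent: boolean }
}

function buildCrawlBody(params: CrawlParams): CrawlRequestBody {
  const body: CrawlRequestBody = {
    url: params.url,
    scrapeOptions: {
      formats: params.formats ?? ['markdown'],
      onlyMainContent: params.onlyMainContent ?? false,
    },
  }
  // Send the cap as `limit`, and only when the user actually set one,
  // so the service default applies otherwise.
  if (params.maxPages !== undefined) {
    body.limit = params.maxPages
  }
  return body
}
```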
Greptile Summary
This PR adds fastCRW as a new tool block (scrape / crawl / map / search), mirroring the existing Firecrawl block. The integration is additive-only: new files under
Confidence Score: 4/5
The change is purely additive and isolated to new files; no existing functionality is modified. The three tools with hardcoded success responses will silently swallow API-level errors, but they won't cause data corruption or affect other blocks.
Three of the four new tools (scrape, search, crawl) always return success: true. In apps/sim/tools/crw/scrape.ts, apps/sim/tools/crw/search.ts, and apps/sim/tools/crw/crawl.ts, the transformResponse functions need to check data.success before reporting a successful result.
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[CrwBlock - crw.ts] -->|operation=scrape| B[crw_scrape tool]
A -->|operation=search| C[crw_search tool]
A -->|operation=crawl| D[crw_crawl tool]
A -->|operation=map| E[crw_map tool]
B --> F["POST /v1/scrape\n(fastcrw.com/api)"]
C --> G["POST /v1/search\n(fastcrw.com/api)"]
D --> H["POST /v1/crawl\n(fastcrw.com/api)"]
E --> I["POST /v1/map\n(fastcrw.com/api)"]
D -->|async job| J[postProcess polling loop]
J --> K["GET /v1/crawl/{jobId}"]
K -->|completed| L[Return pages + total]
K -->|failed| M[Return error]
K -->|timeout| N[Return timeout error]
B --> O[transformResponse - always success:true]
C --> P[transformResponse - always success:true]
E --> Q[transformResponse - checks data.success]
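The polling branch of the flowchart (completed, failed, timeout) could be sketched roughly as follows. This is not the PR's postProcess code: `pollCrawl`, the `CrawlStatus` shape, and the `'scraping'` in-progress status are assumptions, and the status fetcher is injected so the loop can run without a network.

```typescript
// Hypothetical status payloads; the real GET /v1/crawl/{jobId} body may differ.
type CrawlStatus =
  | { status: 'completed'; pages: unknown[]; total: number }
  | { status: 'failed'; error?: string }
  | { status: 'scraping' }

type PollResult =
  | { success: true; output: { pages: unknown[]; total: number } }
  | { success: false; error: string }

async function pollCrawl(
  jobId: string,
  getStatus: (jobId: string) => Promise<CrawlStatus>, // e.g. wraps GET /v1/crawl/{jobId}
  opts: { intervalMs: number; timeoutMs: number },
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<PollResult> {
  const deadline = Date.now() + opts.timeoutMs
  while (Date.now() < deadline) {
    const s = await getStatus(jobId)
    if (s.status === 'completed') {
      return { success: true, output: { pages: s.pages, total: s.total } }
    }
    if (s.status === 'failed') {
      return { success: false, error: s.error ?? 'Crawl job failed' }
    }
    // Still running: wait before asking again.
    await sleep(opts.intervalMs)
  }
  return { success: false, error: `Crawl job ${jobId} timed out after ${opts.timeoutMs}ms` }
}
```

Injecting `getStatus` and `sleep` keeps the three exit paths easy to exercise in error-path tests.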
    const result = data.data ?? data

    return {
      success: true,
      output: {
        markdown: result.markdown,
        html: result.html,
        metadata: result.metadata,
      },
    }
  },

  outputs: {
Scrape/search always report success: true regardless of API error body
Both scrape.ts and search.ts hardcode success: true in transformResponse. The map.ts counterpart correctly propagates data.success. When the fastCRW API returns HTTP 200 with { success: false, error: "…" } (e.g., invalid URL or auth error), the scrape and search tools will still emit success: true with undefined output fields, masking the failure from downstream blocks. map.ts shows the correct pattern: return success: data.success and reflect it in the output envelope.
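A rough sketch of the pattern described here, written as a pure function over the parsed JSON body so it can be tested without a `Response` object. `ScrapeApiBody` and `toScrapeResult` are hypothetical names, the fallback error message is invented, and the sketch assumes failures are always signalled with an explicit `success: false`. It also drops the `data.data ?? data` fallback from the diff for brevity.

```typescript
// Hypothetical parsed body of POST /v1/scrape.
interface ScrapeApiBody {
  success?: boolean
  error?: string
  data?: {
    markdown?: string
    html?: string
    metadata?: Record<string, unknown>
  }
}

function toScrapeResult(body: ScrapeApiBody) {
  // An HTTP 200 can still carry an API-level failure; surface it
  // instead of reporting success with undefined output fields.
  if (body.success === false) {
    return {
      success: false as const,
      error: body.error ?? 'fastCRW scrape failed',
      output: {},
    }
  }
  const result = body.data ?? {}
  return {
    success: true as const,
    output: {
      markdown: result.markdown,
      html: result.html,
      metadata: result.metadata,
    },
  }
}
```

Inside a tool definition this would be called as `toScrapeResult(await response.json())` from `transformResponse`.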
  transformResponse: async (response: Response) => {
    const data = await response.json()

    return {
      success: true,
      output: {
        data: data.data,
      },
    }
  },
Search always reports success: true on API-level failures
Same issue as scrape.ts — transformResponse always returns success: true without checking data.success. The map.ts tool in this same PR correctly checks data.success. If the search API returns { success: false, error: "…" } with HTTP 200, downstream blocks see a successful result with data: undefined rather than a proper error.
scrape, search, and crawl transformResponse hardcoded success: true, masking HTTP 200 responses with { success: false, error }. They now reflect data.success and surface the error, matching map.ts. Crawl additionally fails fast when job creation has no id, preventing a poll loop against /v1/crawl/undefined. Adds error-path tests.


What
Adds fastCRW as a tool block (scrape / crawl / map / search), mirroring the existing Firecrawl block.
Why
fastCRW is a Firecrawl-API-compatible web engine in a single ~8MB binary — self-host free or managed cloud. Flat pricing (1 credit = 1 page; no 4x stealth surcharge, no billed-on-failure) and free anti-bot stealth — a drop-in alternative to the Firecrawl block for Sim workflows.
Changes (additive only)
- apps/sim/tools/crw/: scrape/crawl/map/search + types (mirrors tools/firecrawl/).
- apps/sim/blocks/blocks/crw.ts, registered in blocks/registry.ts, tools/registry.ts, and integrations.json (every place Firecrawl is registered).
Config
CRW_API_KEY from https://fastcrw.com/dashboard (free tier); base URL overridable for self-host.
Happy to adjust. I maintain the integration and can provide free credits.