feat: add ref (commit SHA) to search and symbol navigation results#1307
feat: add ref (commit SHA) to search and symbol navigation results#1307Amresh-01 wants to merge 1 commit into
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (11)
🚧 Files skipped from review as they are similar to previous changes (10)
WalkthroughThis PR propagates git ref/commit SHA information across the system and standardizes repository path handling. It updates public API schemas to include ref fields, normalizes all repository path derivation to use POSIX path joining, fixes file:// URL-to-filesystem path conversion, and threads the ref value through search responses, CodeNav APIs, and symbol tools with comprehensive tests. ChangesGit Ref Propagation and Path Normalization
🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly Related PRs
Suggested Reviewers
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/backend/src/zoekt.ts (1)
17-33:⚠️ Potential issue | 🟠 Major | ⚡ Quick winUse
execFile/spawnargs instead of shell-stringexecto prevent command injection.Line 32 still executes a shell command string assembled from dynamic inputs (Lines 23, 27, 28). Quoting alone is not a complete boundary if inputs contain shell-significant characters. Please pass arguments as an array via
execFile(orspawn) to eliminate shell parsing.Suggested fix
-import { exec } from "child_process"; +import { execFile } from "child_process"; @@ - const command = [ - 'zoekt-git-index', - '-allow_missing_branches', - `-index "${INDEX_CACHE_DIR}"`, - `-max_trigram_count ${settings.maxTrigramCount}`, - `-file_limit ${settings.maxFileSize}`, - `-branches "${revisions.join(',')}"`, - `-tenant_id ${repo.orgId}`, - `-repo_id ${repo.id}`, - `-shard_prefix_override ${shardPrefix}`, - ...largeFileGlobPatterns.map((pattern) => `-large_file "${pattern}"`), - `"${repoPath}"` - ].join(' '); + const args = [ + '-allow_missing_branches', + '-index', INDEX_CACHE_DIR, + '-max_trigram_count', String(settings.maxTrigramCount), + '-file_limit', String(settings.maxFileSize), + '-branches', revisions.join(','), + '-tenant_id', String(repo.orgId), + '-repo_id', String(repo.id), + '-shard_prefix_override', shardPrefix, + ...largeFileGlobPatterns.flatMap((pattern) => ['-large_file', pattern]), + repoPath, + ]; @@ - exec(command, { signal }, (error, stdout, stderr) => { + execFile('zoekt-git-index', args, { signal }, (error, stdout, stderr) => {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/backend/src/zoekt.ts` around lines 17 - 33, The code currently builds a shell command string from dynamic inputs (INDEX_CACHE_DIR, settings.maxTrigramCount, settings.maxFileSize, revisions, repo.orgId, repo.id, shardPrefix, largeFileGlobPatterns, repoPath) and calls exec, which risks command injection; change to use execFile or spawn by constructing an args array instead of a single command string (e.g., args = ['-allow_missing_branches', '-index', INDEX_CACHE_DIR, '-max_trigram_count', String(settings.maxTrigramCount), '-file_limit', String(settings.maxFileSize), '-branches', revisions.join(','), '-tenant_id', String(repo.orgId), '-repo_id', String(repo.id), '-shard_prefix_override', shardPrefix, ...largeFileGlobPatterns.flatMap(p => ['-large_file', p]), repoPath]) and call execFile('zoekt-git-index', args, { signal }, (error, stdout, stderr) => ...) so all dynamic values are passed as argv items (no shell parsing) and preserve the same callback behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@packages/backend/src/zoekt.ts`:
- Around line 17-33: The code currently builds a shell command string from
dynamic inputs (INDEX_CACHE_DIR, settings.maxTrigramCount, settings.maxFileSize,
revisions, repo.orgId, repo.id, shardPrefix, largeFileGlobPatterns, repoPath)
and calls exec, which risks command injection; change to use execFile or spawn
by constructing an args array instead of a single command string (e.g., args =
['-allow_missing_branches', '-index', INDEX_CACHE_DIR, '-max_trigram_count',
String(settings.maxTrigramCount), '-file_limit', String(settings.maxFileSize),
'-branches', revisions.join(','), '-tenant_id', String(repo.orgId), '-repo_id',
String(repo.id), '-shard_prefix_override', shardPrefix,
...largeFileGlobPatterns.flatMap(p => ['-large_file', p]), repoPath]) and call
execFile('zoekt-git-index', args, { signal }, (error, stdout, stderr) => ...) so
all dynamic values are passed as argv items (no shell parsing) and preserve the
same callback behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: cdcfd05c-c5ce-42b8-ae9d-1c3e82eaf561
📒 Files selected for processing (11)
docs/api-reference/sourcebot-public.openapi.jsonpackages/backend/src/repoCompileUtils.tspackages/backend/src/zoekt.tspackages/shared/src/utils.tspackages/web/src/features/codeNav/api.test.tspackages/web/src/features/codeNav/api.tspackages/web/src/features/codeNav/types.tspackages/web/src/features/search/types.tspackages/web/src/features/search/zoektSearcher.tspackages/web/src/features/tools/findSymbolDefinitions.tspackages/web/src/features/tools/findSymbolReferences.ts
cbbe225 to
22700d9
Compare
Fixes : #1173
[FR] Add ref to search results
#1173
Description
This PR addresses issue #1173 by returning the git
ref(commit SHA) in search results, symbol definitions, and symbol references. This allows clients and AI agents to fetch stable file references at the exact matching commit via/api/sourcelater.Additionally, this PR resolves local repository setup and indexing compatibility bugs on Windows OS when paths contain spaces.
Key Changes
1. Schema Enhancements
refas an optional string in search (searchFileSchema) and symbol navigation (findRelatedSymbolsResponseSchema) schemas.sourcebot-public.openapi.json) to reflect the newreffield.2. Result Mapping
versionfield (which stores the commit SHA) to thereffield in search response parsing.refcleanly through the definitions and references endpoints.3. Agent Tools Update
findSymbolDefinitionsandfindSymbolReferencesagent tools to usefile.refas their revision instead of hardcoding"HEAD".4. Windows Compatibility Bug Fixes
fileURLToPathwith a fallback mechanism inrepoCompileUtils.tsandshared/src/utils.tsto prevent path parsing errors (such asPath /C:/... does not exist) on Windows.-indexpath and repository source path inzoekt-git-indexCLI commands to support directories containing spaces (e.g.Open Source).path.joincalls topath.posix.jointo enforce forward-slash/URI separators consistently across all operating systems.Verification & Testing
yarn test). All 884 tests passed successfully./api/find_referencesreturn the correct commit SHA (ref) in the JSON output:Summary by CodeRabbit
New Features
Bug Fixes
Tests