ENG-1906 Add Content.content_type and backfill rows#1154
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
PR size/scope checkThis PR is over our review-size guideline.
Please split this into smaller PRs unless there is a clear reason the changes need to land together. If keeping it as one PR, please add a brief justification covering:
|
|
Updates to Preview Branch (eng-1906-add-contentcontent_type-and-backfill-rows) ↗︎
Tasks are run on every commit but only new migration files are pushed.
View logs for this Workflow Run ↗︎. |
b8cce16 to
3f58bcb
Compare
3f58bcb to
e85f933
Compare
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e85f9334c2
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| db_content.space_id, | ||
| db_content.last_modified, | ||
| db_content.part_of_id, | ||
| db_content.content_type |
There was a problem hiding this comment.
Default content_type before inserting content
When a content payload omits content_type (the Roam content converter in this PR still does), db_content.content_type remains NULL. Because this INSERT explicitly lists the new NOT NULL column and passes that NULL, PostgreSQL will not apply the column default, so upsert_content fails before creating or updating the row. Set a fallback on db_content (or use DEFAULT/omit the column) before this VALUES entry.
Useful? React with 👍 / 👎.
| document.metadata := '{}'; | ||
| END IF; | ||
| IF document.content_type IS NULL THEN | ||
| document.content_type = 'text/plain'; |
There was a problem hiding this comment.
Keep document defaults platform-specific
When upsert_documents is called without content_type, this fallback sets every new or updated document to text/plain regardless of the space platform. That disagrees with the migration backfill and upsert_content's inline-document path, which classify Roam documents as application/roam+json and Obsidian documents as text/obsidian+markdown; repeated upserts can overwrite the correct persisted type with text/plain. Use the selected v_platform default or preserve the existing value when the input omits it.
Useful? React with 👍 / 👎.
mdroidian
left a comment
There was a problem hiding this comment.
See codex's comments, probably worth addressing.
Also I’m going to approve this PR, but I want to note one concern around application/roam+json.
Right now this PR appears to backfill Roam Document.content_type values to application/roam+json, but we are not actually storing Roam JSON payloads yet. For the immediate v0 push/pull path between Obsidian and Roam, the stored content will be plain text/markdown-like content, not native Roam JSON.
If Document.content_type is intended to mean “native source document format,” then application/roam+json may make sense as a future-facing label. But if it is intended to describe the actual persisted payload, this seems premature and potentially misleading until we store the original Roam JSON/atJSON alongside the text representation.
I do not think this needs to block the PR, but I would like us to clarify that semantic before relying on content_type for behavior.
|
It is absolutely intended to store the persisted payload. I put roam+json as the database default thinking this was imminent; and knowing we would set the value to roam+markdown explicitly in the meantime. I don't mind setting roam+markdown as the default, but changing defaults is a database migration, I was trying to avoid that. But we can do that as well. Let me know what you prefer. |
|
Thanks, that clarifies it. If For this PR, I think the cleanest path is to use the current persisted format as the value: probably I’m not too worried about adding/changing a migration here since this PR is already introducing the column and backfill. I’d rather get the semantic right now than have downstream code start treating |
|
Yes, text/roam+markdown is what I'm using in the sync. I'll update defaults accordingly |
https://linear.app/discourse-graphs/issue/ENG-1906/add-contentcontent-type-and-backfill-rows
https://www.loom.com/share/9162523288284401b9679c7b743a726c