Should org-roam's treatment of all IDs as nodes be opt-in?

I was thinking about this recently. Org-roam is an Org extension, while IDs are a native Org feature. Other Org extensions, including org-attach (which is part of Org upstream) and org-caldav, use IDs in very different ways.

This causes annoying conflicts for anyone who wants Org files to serve multiple purposes and be managed by multiple extensions: org-attach and org-caldav users end up with many extraneous org-roam nodes if they use those extensions on files in their org-roam directory. In particular, org-caldav generates an ID for every entry it syncs in a file in order to track sync state with a CalDAV server, which can lead to hundreds of extra IDs. At the same time, it’s nice to have org-roam nodes in the same file as event headings managed by org-caldav. I do this frequently in Org files for academic coursework.

What if we redefine a node to be “any file or headline with an ID property and a non-nil ROAM property”? That way, IDs created outside of org-roam wouldn’t pollute org-roam, even if those IDs are in your org-roam directory.

This change seems pretty low-impact to me: if you have an ID you already have a properties drawer, so one more property doesn’t really affect legibility, especially if you’re already folding or hiding the drawer. If some people still find this undesirable, we could have some sort of setting, like org-roam-use-all-id-nodes, and default it to t for backwards compatibility until a v3 release. That way, those of us who want to use IDs for more than org-roam could set this variable to nil, while everyone else could continue with the current system.

Migration should be trivial. Since all ID nodes are currently org-roam nodes, you would simply have org-roam add the ROAM property with value t to every existing org-roam node. ROAM_EXCLUDE could also be deprecated.
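A rough sketch of what that migration could look like as user-level code, in case it helps the discussion. The function name my/org-roam-mark-existing-nodes is hypothetical, and this is not how org-roam itself would necessarily implement it. It assumes the org-roam database is in sync with the files on disk (so the stored node positions are accurate) and that the ROAM property name from the proposal above is used. Back up your files before trying anything like this.

```emacs-lisp
(require 'org)
(require 'org-roam)

(defun my/org-roam-mark-existing-nodes ()
  "Add a ROAM property with value \"t\" to every node in the org-roam database.
Hypothetical one-off migration helper, not part of org-roam."
  (interactive)
  ;; Visit nodes from the highest buffer position down, so that inserting
  ;; a property line never shifts the stored position of a node that has
  ;; not been visited yet in the same file.
  (dolist (node (sort (org-roam-node-list)
                      (lambda (a b)
                        (> (org-roam-node-point a)
                           (org-roam-node-point b)))))
    (with-current-buffer (find-file-noselect (org-roam-node-file node))
      (org-with-wide-buffer
       (goto-char (org-roam-node-point node))
       ;; `org-entry-put' adds the property to the entry's existing drawer.
       (org-entry-put (point) "ROAM" "t"))
      ;; Note: this also writes any other unsaved changes in the buffer.
      (save-buffer))))
```

Run it once with M-x my/org-roam-mark-existing-nodes before switching to a stricter node definition, so that existing nodes don't disappear from the database.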

Creating new nodes would have to take place via an org-roam wrapper function, rather than directly calling an org-id function, but this doesn’t seem like a big issue. Again, IDs are generally useful beyond org-roam, so it seems presumptuous to assume that all ID nodes are automatically org-roam nodes. And org-roam nodes seem like they should generally be a deliberate thing, so I don’t think it’s crazy to expect users to manually indicate nodes via an org-roam function or capture template accordingly.

As an aside, with SQLite support in Emacs 29, Org upstream could track ID nodes in a database much like org-roam does already. Bastien has expressed interest in this. org-roam (and other extensions using ID links, for that matter) could theoretically depend on this database rather than having to maintain its own, but that’s obviously a ways out and would depend on coordination between Org and org-roam development efforts.


I think you can already do this via the user option org-roam-db-node-include-function. Something like this:

(defun my/org-roam-include-only-roam-prop ()
  "Return non-nil when the entry at point has a ROAM property."
  ;; `org-entry-get' returns the property value as a string, or nil if unset.
  (org-entry-get (point) "ROAM"))

(setq org-roam-db-node-include-function #'my/org-roam-include-only-roam-prop)

You can then modify your capture template to include the ROAM property with a value of t.

This is viable, I guess, but you would also have to hook or advise any org-roam function that creates nodes to add this property.
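For what it's worth, here is a minimal sketch of the hook route. It assumes that org-roam-capture-new-node-hook runs with point on the newly created node, as its docstring describes, and it only covers nodes created through org-roam capture. The my/ function names are hypothetical, and the second command is just a thin wrapper around org-id-get-create for marking an existing heading by hand.

```emacs-lisp
(require 'org)
(require 'org-id)

(defun my/org-roam-set-roam-property ()
  "Set the ROAM property to \"t\" on the entry at point."
  (org-entry-put (point) "ROAM" "t"))

;; Add the hook only after org-roam has loaded, so that the hook
;; variable's default value is not clobbered by an early `add-hook'.
(with-eval-after-load 'org-roam
  (add-hook 'org-roam-capture-new-node-hook
            #'my/org-roam-set-roam-property))

(defun my/org-roam-make-node-at-point ()
  "Give the entry at point an ID and mark it with the ROAM property.
Hypothetical replacement for calling `org-id-get-create' directly."
  (interactive)
  (org-id-get-create)
  (my/org-roam-set-roam-property))
```

Anything that creates IDs without going through capture or the wrapper command (org-attach, org-caldav, plain org-id-get-create) would leave the ROAM property unset, which is the point.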

That’s probably possible with some more work as well, but I’m also trying to open a discussion about defaults. Personally, I think org-roam’s default node definition is a bit too general, considering how IDs are already used in the Org ecosystem, and I think it’s likely to become more of an issue if Org adopts SQLite and Org and its extensions begin using IDs more heavily.
