Org-roam major redesign

I’m making an extremely large breaking change to Org-roam moving forward. Org-roam was developed at a time when the file was the lowest denomination for a note. Fast forward to today, and we now support headlines as well. The current implementation is a mess: there are special code paths for handling file links and ID links, and it is getting difficult for me to make changes without accidentally breaking something else.

There is a strong need to rebuild Org-roam’s basic mechanisms. I think this is the only reasonable path forward, allowing me to trim a great deal of technical debt and increase development speed. As the primary maintainer of the project, I’m finding it hard to wrap my head around all these different features that I don’t use, which are supported in awkward ways. I think the proposed change is much easier to reason about, both as a developer and as a user, and will result in a higher-quality product overall.

What is likely going to happen is that I will release a final tag on Org-roam v1, and start working on the new Org-roam in a separate dev branch. That version will be tagged v2. Those who wish to continue using Org-roam v1 will have to pin the repo to v1.

The proposal

I term the lowest denomination we have in Org-roam a node. A node is defined as follows:

A node is any headline or top level file with an ID.

Nodes link to other nodes using ID links. Nodes also have an implicit hierarchical structure, from the levels in the Org file. Nodes can link to many other nodes. Org-ref links etc. will continue to be supported.

Here we enforce usage of Org IDs. This is simplifying for many reasons: all links within Org-roam are ID links, and we no longer have to deal with handling different kinds of path links and path breakages on file changes such as renames and deletions. We also no longer have to handle files and headlines differently within the schema, which has been a recurring pain point.
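To make the node definition concrete, here is a minimal Python sketch that scans Org text and lists every file or headline carrying an ID. It is only an illustration, not Org-roam's implementation: the function name `find_nodes` and the sample IDs are hypothetical, and it simplifies Org syntax by treating any `:ID:` line inside a `:PROPERTIES:` drawer as belonging to the most recent headline (or to the file if no headline has been seen yet).

```python
import re

# A headline is one or more stars followed by whitespace and a title.
HEADLINE = re.compile(r"^(\*+)\s+(.*)$")
# An ID property line inside a property drawer.
ID_PROP = re.compile(r"^\s*:ID:\s+(\S+)\s*$", re.IGNORECASE)


def find_nodes(org_text):
    """Return (level, title, id) for each file or headline that has an ID.

    Level 0 stands for the file itself. Headlines without an ID are
    not nodes and are skipped.
    """
    nodes = []
    level, title = 0, "(file)"
    in_drawer = False
    for line in org_text.splitlines():
        m = HEADLINE.match(line)
        if m:
            # A new headline starts a new candidate node.
            level, title = len(m.group(1)), m.group(2).strip()
            in_drawer = False
            continue
        stripped = line.strip().upper()
        if stripped == ":PROPERTIES:":
            in_drawer = True
        elif stripped == ":END:":
            in_drawer = False
        elif in_drawer:
            m = ID_PROP.match(line)
            if m:
                nodes.append((level, title, m.group(1)))
    return nodes


# Hypothetical sample file: the file and one headline have IDs.
SAMPLE = """\
:PROPERTIES:
:ID: file-id-1
:END:
#+title: Projects

* Reading list
:PROPERTIES:
:ID: headline-id-2
:END:
** No ID here
Some text.
"""

for node in find_nodes(SAMPLE):
    print(node)
```

In this sample the file and "Reading list" are nodes, while "No ID here" is not, because it has no ID.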

The migration strategy

It should be simple to write an elisp script that adds IDs to everything, and converts existing file links.
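The real migration would be an elisp script; the Python sketch below only illustrates the two steps under stated assumptions. The helper names `ensure_file_id` and `convert_file_links` are hypothetical, file links are assumed to take the plain `[[file:path]]` or `[[file:path][description]]` form, and the path-to-ID table is assumed to have been built beforehand. Links with a search option (`::...`) or an unknown target are left untouched.

```python
import re
import uuid

# Matches [[file:path]] and [[file:path][description]].
FILE_LINK = re.compile(r"\[\[file:([^\]]+)\](\[[^\]]*\])?\]")


def ensure_file_id(org_text, new_id=None):
    """Return (text, id), adding a file-level ID if the file has none."""
    if org_text.startswith(":PROPERTIES:\n"):
        drawer, _, _ = org_text.partition(":END:")
        m = re.search(r"^:ID:\s+(\S+)$", drawer, re.MULTILINE)
        if m:
            return org_text, m.group(1)  # already has an ID
        # Existing drawer without an ID: add the ID as its first property.
        new_id = new_id or str(uuid.uuid4())
        text = org_text.replace(
            ":PROPERTIES:\n", f":PROPERTIES:\n:ID: {new_id}\n", 1
        )
        return text, new_id
    # No file-level drawer: prepend one.
    new_id = new_id or str(uuid.uuid4())
    return f":PROPERTIES:\n:ID: {new_id}\n:END:\n{org_text}", new_id


def convert_file_links(org_text, ids_by_path):
    """Rewrite file links to ID links for every path found in ids_by_path."""
    def repl(m):
        path, desc = m.group(1), m.group(2) or ""
        node_id = ids_by_path.get(path)
        if node_id is None:
            return m.group(0)  # unknown target: keep the link as it is
        return f"[[id:{node_id}]{desc}]"
    return FILE_LINK.sub(repl, org_text)
```

A run over a directory would first call `ensure_file_id` on every file to build the path-to-ID table, then call `convert_file_links` on each file with that table.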

Looking very much forward to v2 and beyond :rocket:

This is brilliant, thank you.

One question: you write “A node is any headline or top level file with an ID.” Is there a reason why the headline is the lowest (most granular) node level? E.g., why can’t plain list items or quote blocks etc. be nodes? Is it because Org-id only supports UUIDs at the headline level?

I love Org-roam for all the usual reasons (built on Emacs, local storage, plain text), but one thing I miss about Roam Research is its hyper-granular block-level architecture, especially when it comes to transclusion. (I’ve also been playing around with @nobiot’s amazing Org-transclusion work, and I have to assume it would make his life easier if one could add UUIDs at a more granular level than just headlines.)

Anyway, amazing work all!

Let me quickly jump in to comment on Org-transclusion, and leave the UUID question for Jethro.

Firstly, thank you for your kind words, and commenting with my work in mind.

My intuition is that Org-roam supporting UUID at the block-level and other more granular levels than headlines is unlikely to get me/Org-transclusion anything extra.

[Edit: fixed the problem the exchange with Jethro made clear to me]
Blocks, tables, and lists that are named (#+name keyword) can be transcluded using Org Mode’s standard links ([[file:path/to/file.org::name]]).

I am indebted to Org-roam for the idea of authoring Org-transclusion; it is intended to work alongside Org-roam. But it is a standalone package by design, and, at the moment, I do not intend to make it dependent on Org-roam.

I will strive to make the writing process for users of both packages easy (because I’m one of them). I think that this sort of “interconnectivity” comes from sticking to their common foundation; that is, Org Mode without much reliance on specialised features.

This is an interesting thought. But how then would you reference a plain list item? It’s possible to make blocks a node too, but they will have to have an ID attached to them.

One idea is to use #+name: UUID like I show above.
Org Mode can then use the link [[file:path/to/file.org::UUID]] – that’s what I do for the block quote, table, and list examples above.

‘file:projects.org::some words’ (text search)

I don’t think this works for plain list items, but blocks, definitely.

Oh, yes, you’re right. I see now the hole in this approach: the transcluded elements don’t end where they should (they end at the next headline, block, etc.). Will need to fix this… Thanks!

[Edit: I think I have fixed the problem in Org-transclusion for the moment; I think this is a separate topic from the original intent of Jethro’s post]

This seems to me like a good path forward. Looking forward to seeing what will come out!

Regarding translating file links and id-links, I played around with this when evaluating gkroam a while back.

Translation functions back and forth between file and ID-links can be found here:

(Note that the functions work between normal file links and ID-links as well, even though the gist mentions “gkroam”)

I really like this idea. This is something I’ve enforced in all my notes, and with the assumption that every note has an ID, I can write generic functions that work with any note (file or heading). Some of them have been extracted into the vulpea library. Having it structured this way helps me so much. I think making IDs mandatory in the core would also enable many cool features.

So looking forward! Please let me know if there is something I can help you with.

One of the reasons is, as you said, that org-id supports IDs only at the heading level and the file level (the latter starting with one of the latest releases of Org mode).

But another reason is performance. I tried using Org-roam with a relatively big file (1k headings) and the experience was awful. I think Org-roam really shines when using lots of small Org files, as opposed to another popular approach in Org mode: a few huge files.

That being said, I agree with @jethro that it’s an interesting thought.

This is going off on a tangent from the original intent (sorry!). But it looks like in theory you could use the #+name: UUID idea for plain-list elements.

See the excerpt below. org-element seems to correctly take the #+name for the plain-list element (the element is plain-list with the correct :end position and :name).

The reason why my code copies more than it should is an error in my code. [Edit: I think I patched the problem in Org-transclusion for now]

(plain-list (:type unordered :begin 2198 :end 2242 :contents-begin 2211 :contents-end 2241 :structure ((2211 0 "- " nil nil nil 2225) (2225 0 "- " nil nil nil 2241)) :post-blank 1 :post-affiliated 2211 :name "list" :parent nil))

I agree entirely with the many-small-files approach. But even within a relatively small file it can be useful to link to something lower than the headline level. But if Org-id doesn’t support that I guess it’s a limitation. I asked this question about Org-id on the Org mailing list but didn’t get an answer.

One thing I would additionally like to point out is that the simplification also makes daemonizing the DB building much easier, which would mean removing any performance problems.

You could use dedicated targets, which can be put anywhere – it’s what I do in plain org-files.

Hey, would v2 break my current setup if I don’t use links to headers?

No, but your files will need IDs too, and your links need to be to the file IDs.

So that is a yes. How does one convert?

I am not an Org-roam maintainer, so I’m not sure whether a migration solution will be baked into Org-roam (in my opinion, it should be). But if it isn’t, just ask me and I will share a script that will migrate your notes.

This sounds great! This won’t impact [[roam:]] links, right?