Since I finally got more comfortable with Elisp and Org-roam lately, I decided to revisit my config to make some improvements.
A big part of this work focuses on how I organize my notes (metadata, subfolders) and query the database for various purposes. The problem with being more comfortable with Elisp is that I now have many ways to achieve the same results. More options mean I need to base my design choices on considerations other than my ability to code, so I can think more about convenience and performance.
As I was (re)writing my different finders (i.e. filtered variations of `org-roam-node-find`), I started to question my decisions and practices, mostly from a database/query performance standpoint. I have not experienced delays or performance issues so far, but I would like to start my Org-roam journey on a sound footing so I do not have to redo the most structural parts of my setup.
My doubts mostly stem from my ignorance of some of the underlying technologies and code I am using, so I thought more knowledgeable users could help me separate reasonable assumptions from misconceptions and maybe even a touch of magical thinking.
It is also possible that my worries about slowing down my ’notes finders’ by using a poorly written FILTER-FN are simply baseless and that writing efficient filters will only have a marginal effect (or none at all). I really do not know.
Please help debunk or confirm my ’beliefs’:
- The fewer filters, the better.
This assumption of mine is the one I deem the most realistic, so I hope I am not gonna get disappointed on that one.
I usually do not use more than three criteria to filter my notes, most of the time two. I feel that if a node is tested against n criteria, it increases the complexity, and hence the computing time, n-fold. True, false, or kind of?
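One way to check this belief empirically rather than by intuition is to time filters with a different number of criteria over the same list. The sketch below does not touch Org-roam at all: it assumes that fake nodes (plain plists with hypothetical `:tags` and `:level` keys) are a good enough stand-in, and every `my/...` name is made up for the illustration.

```elisp
(require 'cl-lib)
(require 'benchmark)

;; Hypothetical stand-in for Org-roam nodes: plists with :tags and :level.
(defun my/make-fake-nodes (n)
  "Return a list of N fake nodes, for timing purposes only."
  (cl-loop for i below n
           collect (list :tags (if (zerop (% i 10))
                                   '("Politics" "Religion")
                                 '("Politics"))
                         :level (% i 3))))

(defun my/filter-one (node)
  "Filter with a single criterion."
  (member "Politics" (plist-get node :tags)))

(defun my/filter-three (node)
  "Filter with three criteria combined with `and'."
  (and (member "Politics" (plist-get node :tags))
       (member "Religion" (plist-get node :tags))
       (= 0 (plist-get node :level))))

(defvar my/fake-nodes (my/make-fake-nodes 10000)
  "Ten thousand fake nodes used by the timings below.")

;; Each form returns (ELAPSED-SECONDS GC-COUNT GC-SECONDS) for 10 runs.
(benchmark-run 10 (cl-remove-if-not #'my/filter-one my/fake-nodes))
(benchmark-run 10 (cl-remove-if-not #'my/filter-three my/fake-nodes))
```

Comparing the two elapsed times gives a rough idea of what each extra criterion costs on this kind of data. It says nothing about the time spent querying the database or building the completion candidates, which would have to be measured separately.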
- Not all criteria are equal.
That one has some logic in common with the previous one. For example, I have the preconceived idea that testing a node property against a list of strings using `(member prop string-list)` will take `(length string-list)` times longer than testing it against a single string using `(string= string prop)`.
Is there any truth to that? If yes, how dramatic would the difference be in a situation like the example above?
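To make the `member` question concrete, here is a small sketch that counts how many string comparisons a linear search performs. It uses `cl-member` with a counting `:test` function as a stand-in for the built-in `member` (which compares with `equal` and cannot be instrumented this way), so treat it as a model of the search rather than a measurement of the real thing. `my/count-comparisons` is a hypothetical name.

```elisp
(require 'cl-lib)

(defun my/count-comparisons (prop string-list)
  "Return how many comparisons a linear search for PROP in STRING-LIST takes."
  (let ((count 0))
    ;; `cl-member' walks the list and stops at the first match,
    ;; calling the :test function once per element visited.
    (cl-member prop string-list
               :test (lambda (a b)
                       (setq count (1+ count))
                       (string= a b)))
    count))

(my/count-comparisons "c" '("a" "b" "c" "d")) ; => 3, stops at the match
(my/count-comparisons "z" '("a" "b" "c" "d")) ; => 4, whole list scanned
```

So the number of comparisons is at most `(length string-list)`, and fewer when the match sits near the front of the list. Whether that difference is noticeable in a finder is a separate question that only timing can answer.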
- Order matters.
When writing a filter function, I always end up reordering my condition tests in some kind of reverse ’funneling’ sequence.
I basically start with the criterion that has the narrowest pool of candidates (i.e. the highest power of elimination), and work my way down. It is hard to illustrate, because it depends on the content and organization of one’s notes. But let’s imagine I write a lot about politics, and very little about religion. If I write the following function:
(org-roam-node-find other-window nil
                    (lambda (node)
                      (and
                       (member "Politics" (org-roam-node-tags node))
                       (member "Religion" (org-roam-node-tags node)))))
I feel it will be faster if I swap the two tests: since the `Religion` tag is rare in my zk, more nodes will be eliminated before being tested against the more popular `Politics` tag, which in my mind means less code execution before yielding results.
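For reference, the swapped version I have in mind would look something like the sketch below. The helper name `my/religion-and-politics-p` is hypothetical; `org-roam-node-tags` and `org-roam-node-find` are the same Org-roam functions used in the snippet above. I pass nil for the first two arguments here, and I also read the tags once into a local variable, which is a separate tweak from the reordering itself.

```elisp
(require 'org-roam)

;; Hypothetical helper: same two tests as before, rarer tag first.
(defun my/religion-and-politics-p (node)
  "Return non-nil if NODE is tagged with both Religion and Politics."
  (let ((tags (org-roam-node-tags node))) ; read the tags only once
    (and (member "Religion" tags)         ; rare tag: rejects most nodes
         (member "Politics" tags))))      ; only reached for the few left

(org-roam-node-find nil nil #'my/religion-and-politics-p)
```

Both orderings return the same set of nodes; only the number of `member` calls per node differs.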
This idea of mine is also reinforced by the Info page for the `and` ’special form’, as the doc calls it, which says that evaluation stops as soon as a condition returns nil.
> Eval args until one of them yields nil, then return nil.
>
> The remaining args are not evalled at all.
> If no arg yields nil, return the last arg’s value.
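The short-circuit behavior described in that docstring can be observed directly with a small trace. This is a minimal sketch on plain tag lists, not on real Org-roam nodes, and all `my/...` names are hypothetical.

```elisp
(defvar my/calls nil
  "Names of the predicates called so far, most recent first.")

(defun my/rare-p (tags)
  "Record the call, then test TAGS for the rare tag."
  (push 'rare my/calls)
  (member "Religion" tags))

(defun my/common-p (tags)
  "Record the call, then test TAGS for the common tag."
  (push 'common my/calls)
  (member "Politics" tags))

(defun my/trace-filter (tags)
  "Return the predicates that `and' actually evaluated for TAGS, in order."
  (setq my/calls nil)
  (and (my/rare-p tags) (my/common-p tags))
  (reverse my/calls))

(my/trace-filter '("Politics"))            ; => (rare)
(my/trace-filter '("Politics" "Religion")) ; => (rare common)
```

When the first test fails, the second predicate is never called, which matches the docstring. How much time that saves per node is exactly what I am unsure about.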
Thanks for sharing your knowledge/experience and helping me understand what is happening behind the scenes a bit better. Hopefully it can be useful to future readers facing the same kind of dilemmas and doubts as well.