Improving org-roam-format-template (next to make org-roam-node-find fast)

dmg · May 25, 2024, 8:31pm

I was profiling the code of org-roam-node-find and the biggest bottleneck (after the improvements to org-roam-node-list) is the processing, for each node, of the template (it is using around 80% of the total time of finding a node with my org-roam).

The main issue is that the template (eg: “${title} ${filename}”) is parsed as many times as nodes exist (using a regular expression). If this processing was done once only, it will certainly improve the performance of org-roam-node-find.

This will require to refactor the code of this function and org-roam-node–format-entry and rewrite org-roam-node-read–to-candidate

akashp · May 27, 2024, 6:15pm

Controlled experiment on Display Format Generation with and without GC running for node size 20000
for an example display template – all other processes have been controlled for,

Format Generation Result

| 0.8807215509999999 | 0 | 0.0 |
| 0.9324620770000001 | 0 | 0.0 |
| 0.9284624810000001 | 0 | 0.0 |
|        0.925824983 | 0 | 0.0 |
|        0.926202192 | 0 | 0.0 |
|        0.939134124 | 0 | 0.0 |
|        0.928057464 | 0 | 0.0 |
| 0.9281281800000001 | 0 | 0.0 |
| 0.9267680730000001 | 0 | 0.0 |
|        0.927419631 | 0 | 0.0 |
| 0.9294667249999999 | 0 | 0.0 |

| 1.7701753569999998 | 11 | 0.9080397510000005 |
| 1.8481772189999999 | 12 | 0.9911701650000015 |
|        1.765555101 | 11 | 0.9101836789999993 |
| 1.8435445910000001 | 12 | 0.9885689709999994 |
|        1.848470344 | 12 | 0.9940336959999989 |
|        1.769367922 | 11 | 0.9133971390000006 |
|        1.849403378 | 12 | 0.9926142330000012 |
| 1.7668457899999999 | 11 | 0.9117966240000008 |
|        1.842279775 | 12 | 0.9882231329999982 |
|        1.761206654 | 11 | 0.9065899379999998 |
|        1.843822947 | 12 | 0.9877636340000002 |

Let me know if you make any headway in reducing the number off temporary variables in the process.

https://hastebin.skyra.pw/isisibupuh.org

akashp · May 27, 2024, 8:35pm

I attempted to totally bypass the functions and construct the display candidates directly -

|        0.512200102 | 0 | 0.0 |
| 0.5243189109999999 | 0 | 0.0 |
| 0.5256468750000001 | 0 | 0.0 |
|        0.527694037 | 0 | 0.0 |
|         0.52837358 | 0 | 0.0 |
| 0.5306920380000001 | 0 | 0.0 |
|         0.52618145 | 0 | 0.0 |
|        0.526668523 | 0 | 0.0 |
|        0.526865157 | 0 | 0.0 |
|        0.532609853 | 0 | 0.0 |
|        0.528698882 | 0 | 0.0 |

|        1.309958374 | 2 |  0.8066643420000048 |
|        0.902891209 | 1 |         0.405633967 |
|        0.892309232 | 1 | 0.40388118600000666 |
|        0.897282229 | 1 |  0.4023565109999936 |
|        0.896623804 | 1 |   0.401657111999981 |
|        1.302014031 | 2 |  0.8049902160000215 |
| 0.8949897739999999 | 1 |   0.404243699999995 |
|        0.890800676 | 1 |  0.4025110569999981 |
|         0.89391253 | 1 |   0.402396901000003 |
|        1.301704781 | 2 |  0.8076995979999992 |
| 0.9027907079999999 | 1 |  0.4049892609999972 |

My conclusion is that - the functions themselves don’t take that much time - they generate a lot of garbage sure, but even if we eliminate them – the display format templates themselves will drag down the performance,
This is the impact of using two display templates , directories and title. Using only one has significant performance boosts

If we then do our calculations to make estimate for the functions in between - and estimate for their impact - it is negligible in the scale.

(defun +org-roam-node-read--completions (&rest args)
  "Generate completion candidates for Org-roam nodes, optionally filtered and sorted."
  (let* ((template (org-roam-node--process-display-format org-roam-node-display-template))
	 (width (1- (if (bufferp (current-buffer))
				      (window-width) (frame-width)))))
    
    (mapcar 
     
     (lambda (node)
       (let ((candidate-main (+org-roam-node-read--to-candidate node width)))
	 (cons (propertize candidate-main 'node node) node)))
     
     (++org-roam-node-list))))


(defun +org-roam-node-read--to-candidate (node width)
  (concat (test/directory node) " " (test/hierarchy node)))


(defun test/directory (node)
  (if-let ((dirs (file-name-directory (file-relative-name (org-roam-node-file node) org-roam-directory))))
      (format "(%s)" (car (split-string dirs "/")))
    ""))

(defun test/hierarchy (node)
  (let ((level (org-roam-node-level node)))
    (concat
     (when (> level 0) (concat (propertize (org-roam-node-file-title node) 'face 'italic) " / "))
     (when (> level 1) (concat (string-join (org-roam-node-olp node) " > ") " > "))
    (propertize  (org-roam-node-title node) 'face 'bold))))

dmg · May 28, 2024, 8:56pm

I spent time trying to understand where the actual bottle necks. These are my results.

gist.github.com

https://gist.github.com/dmgerman/fbf8e657918f6bd76d0bda2a396616c1

retrieve-fields.org

* formatting fields for presentation

- The way the nodes are processed for the template is very slow.
- To test how slow, I replaced the code and returned the value of the field
  without any extra manipulation

  See below

- With this changes, the take home:

This file has been truncated. show original

basically, retrieving and formatting the fields for the template are the most expensive actions.

akashp · May 29, 2024, 1:03am

Excellent results - I think your experiments also corroborate my finding that any function that comes from the files.el are disgustingly slow.

Elimination thereof skyrockets performance. I think if you collect your findings more concisely you may improve the existing implementation and reduce the function times from about 0.4 sec to 0.1. This could be a significant improvement -

mainly I want to focus on this part that it should be made common knowledge among org-roam users to not use any function that comes from the files.el - so to give up about finding directory information for nodes without implementing a cache in place.

For my case - I have eliminated the entire middle circuit of generation of display template from user set options so many of these functions have been totally retired for my use case.

I think you have made significant improvement - made the middle of the circuit 4 times better nevertheless - I would if possible like a confirmation on the impact of just ripping from the code any function that comes from the said file especially here

github.com

org-roam/org-roam/blob/8667e441876cd2583fbf7282a65796ea149f0e5f/org-roam-node.el#L604C4-L606C83


      
          (when (and (equal field-name "file")
                     field-value)
            (setq field-value (file-relative-name field-value org-roam-directory)))

Edit I just saw your message

Excellent ! Consider doing a PR with your findings.

akashp · May 29, 2024, 2:54pm

@nobiot maybe I can bring your attention to here – removing this function from files.elfile will speed up performance – I dont understand what this line is supposed to do here – but it creates a huge latency – GC reduction as proposed by @dmg will have the most significant impact here – this is the most expensive protocol between the trio of functions that make up the read protocol –

nobiot · May 29, 2024, 5:53pm

So you are talking about this 3-line when form in the function org-roam-node--format-entry?

         (when (and (equal field-name "file")
                    field-value)
           (setq field-value (file-relative-name field-value org-roam-directory)))

It’s a bit surprising that this form affects performance – I think it’s used to construct each candidate displayed in org-roam-node-find based on user option org-roam-node--format-entry. So… If you did’t have ${file} as part of the user option (usually, you would not), I would have thought the form would not be evaluated and hence no performance implication…

dmg · May 29, 2024, 6:36pm

hi nobiot,

you are right:

org-roam-node-display-template is a variable defined in ‘org-roam-node.el’.

Its value is "${title} ${tags} ${file} ${todo}"
Original value was "${title}"

So my template is the one that triggers that.

To be honest, I didn’t look into the original template. If that is the case, the “default” query for the database does not need to do all those joins!

it brings an interesting point: a major rewrite would be:

the query should retrieve only what it needs (plus the id) and the completion mechanism should be used this data only (plus perhaps an ID, if one wants to dynamically show the candidates contents)

akashp · May 30, 2024, 3:16am

Yes, thanks for your clarification @nobiot - I have been led to believe that I need to speak more clearly, because I feel I can never get my intent across

By default it doesn’t you are correct - I discovered when I was testing a more direct route to generate the display candidate that if I used any function coming from the files.el file the performance will degrade very fast. So in the condition it gets evaluated – it shall manifest itself as a noticeable lag.

I want to ask you a final opinion - do you think a hash table here would benefit the larger community? It will have to be reset by some mechanism when display candidate format is changed – but if say we make the node struct the key and the main candidate format as the value – then we can improve read latency easily.

Between the trio of functions - the latency here will be noticed by users first. Can you think of some reasons why a hash table by default will be a bad idea – or whether it should be a user set feature – I want to ask you this because over time collective analysis has shown that trivially memoizing this part has the greatest impact for medium sized knowledge bases.

Thanks fr your time
Best.

Topic		Replies	Views
Improving performance of node-find et. al Development	31	1043	July 28, 2024
Rewriting org-roam-node-list for speed (it is not sqlite) Development	90	805	August 8, 2024
Preconceived Ideas & org-roam-node-find Performance UX & UI	5	107	September 10, 2024
Org-roam-node-read speedup without cache - Guide Guides	2	120	June 28, 2024
How to make org-roam-node-find faster? Requests	4	718	December 31, 2022

Improving org-roam-format-template (next to make org-roam-node-find fast)

Format Generation Result

Edit I just saw your message

Related topics