Rewriting org-roam-node-list for speed (it is not sqlite)

dmg · May 19, 2024, 7:08pm

Please see attached profiler report. This is consistent in my runs: 80% of the time in org-roam-node-list is spent transforming the SQL result into a list of nodes. I think there is something wrong with the code, and there might be an O(n^2) algorithm inside the macro, but more investigation is required.

I was also playing with SQLite. The query used by org-roam-node-list can be simplified by using group_concat(distinct attr). That might remove two GROUP BY levels from the query.
Another issue: the query aggregates the aliases into one attribute, but the code below it (the cl-loop) undoes that. Removing this redundancy would simplify the code and eliminate one GROUP BY.
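To make the group_concat(distinct attr) idea concrete, here is a minimal sketch against a toy in-memory database. It assumes Emacs 29 or later built with SQLite support (`sqlite-open`, `sqlite-execute`, `sqlite-select`); the three-table schema and the `my/` function name are made up for illustration and are much simpler than org-roam's real schema. As far as I know, SQLite has traditionally accepted DISTINCT only in the one-argument form of group_concat, so the sketch uses the default comma separator rather than the space separator of the real query.

```emacs-lisp
(require 'cl-lib)

(defun my/group-concat-distinct-demo ()
  "Return (TAGS ALIASES) for one toy node, using group_concat(DISTINCT ...).
Hypothetical demo; needs Emacs 29+ built with SQLite support."
  (let ((db (sqlite-open)))             ; no FILE argument: in-memory database
    (unwind-protect
        (progn
          (sqlite-execute db "CREATE TABLE nodes (id TEXT PRIMARY KEY)")
          (sqlite-execute db "CREATE TABLE tags (node_id TEXT, tag TEXT)")
          (sqlite-execute db "CREATE TABLE aliases (node_id TEXT, alias TEXT)")
          (sqlite-execute db "INSERT INTO nodes VALUES ('n1')")
          (sqlite-execute db "INSERT INTO tags VALUES ('n1','a'), ('n1','b')")
          (sqlite-execute
           db "INSERT INTO aliases VALUES ('n1','x'), ('n1','y'), ('n1','z')")
          ;; The two LEFT JOINs produce 2 x 3 = 6 rows for n1.  DISTINCT
          ;; collapses the repeated values inside a single GROUP BY.
          (car (sqlite-select
                db
                "SELECT group_concat(DISTINCT tags.tag),
                        group_concat(DISTINCT aliases.alias)
                 FROM nodes
                 LEFT JOIN tags ON tags.node_id = nodes.id
                 LEFT JOIN aliases ON aliases.node_id = nodes.id
                 GROUP BY nodes.id")))
      (sqlite-close db))))
```

The order of values inside each concatenated string is not guaranteed, so any caller should treat the result as a set. Whether this form is actually faster than the nested GROUP BY query would need to be measured on a real database.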

Another 3% (the “+if” below) is the sort of the results. I reckon that the DB can return the results sorted, so this function will have less work to do (and run faster), at least for the most common case (ordering by the title of the node).

How to read the profiler: 10% of the overall recorded time was spent inside this function; 2% of that was the db query, and the rest was the let after the query. In my DB, this function returns around 1k nodes. I was using the memoized version of org-roam-node-read--to-candidate.

dmg · May 19, 2024, 7:50pm

ok, reading the documentation of cl-loop, yes, it is using an n^2 algorithm.

Can anybody comment on whether this code is equivalent to the old one? It seems to work for me.

The time spent building the list of nodes is the same as the time reading from the database (I have 1000 nodes).

For my database, the bottleneck (when using this code and memoizing org-roam-node-read--to-candidate) is now consult (slightly more than 50% of the time). These two improvements feel good enough for my current database.
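On the equivalence question, a small self-contained check may help. The sketch below assumes the rewrite swaps a `cl-loop ... append` accumulation for `mapcan` (a later profiler comment in this thread mentions `mapcan`); the `my/` helpers are hypothetical stand-ins for the per-row work, not code from the gist.

```emacs-lisp
(require 'cl-lib)

(defun my/expand (n)
  "Return a fresh two-element list built from N (stand-in for per-row work)."
  (list n (* 10 n)))

(defun my/with-cl-loop (xs)
  "Concatenate the expansions of XS using the `append' clause of `cl-loop'."
  (cl-loop for x in xs append (my/expand x)))

(defun my/with-mapcan (xs)
  "Concatenate the expansions of XS using `mapcan'.
`mapcan' splices destructively, so the mapped function must return
freshly allocated lists, as `my/expand' does."
  (mapcan #'my/expand xs))
```

Both functions return the same list for the same input as long as the mapped function returns fresh lists. Whether one is asymptotically faster depends on how `cl-loop` expands the `append` clause in your Emacs version, so timing both with `benchmark-run` on a large input is the safest way to check.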

gist.github.com

https://gist.github.com/dmgerman/53e266502e4d53294d1e8be46af93b4f

improved org-roam-node-list.el

(defun org-roam-node-list ()
  (let ((rows (org-roam-db-query
               "SELECT
  id,
  file,
  filetitle,
  \"level\",
  todo,
  pos,
  priority ,

This file has been truncated.

dmg · May 21, 2024, 6:49am

Here are my tests of the performance of the function. I think the big difference is garbage collection. The first set of results uses the new function (almost no garbage collection). The second uses the old function.

#+begin_src emacs-lisp   :exports both
(cl-loop
 for i from 1 to 10
 collect (benchmark-run 1 (org-roam-node-read--completions)))
#+end_src

#+RESULTS:
|            0.164094 | 0 |                 0.0 |
|            0.140535 | 0 |                 0.0 |
|            0.139243 | 0 |                 0.0 |
|            0.149014 | 0 |                 0.0 |
|            0.150503 | 0 |                 0.0 |
|            0.140875 | 0 |                 0.0 |
|            0.350733 | 1 | 0.21093799999999874 |
| 0.13880699999999999 | 0 |                 0.0 |
|            0.138218 | 0 |                 0.0 |
|            0.144497 | 0 |                 0.0 |


#+begin_src emacs-lisp   :exports both
(cl-loop
 for i from 1 to 10
 collect (benchmark-run 1 (org-roam-node-read--completions)))
#+end_src

#+RESULTS:
|            0.198443 | 0 |                 0.0 |
| 0.42432200000000003 | 1 | 0.24116800000000183 |
|            0.185615 | 0 |                 0.0 |
|            0.411156 | 1 | 0.23619899999999916 |
|            0.174655 | 0 |                 0.0 |
|            0.546205 | 1 |  0.3707629999999966 |
|            0.173224 | 0 |                 0.0 |
|            0.555526 | 1 | 0.37898799999999966 |
|            0.174533 | 0 |                 0.0 |
|            0.173881 | 0 |                 0.0 |
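Each row in the tables above is the list returned by `benchmark-run`: elapsed seconds, number of garbage collections, and seconds spent in garbage collection. A small helper can make the comparison easier to read; this is only a sketch, and `my/summarize-runs` is a hypothetical name.

```emacs-lisp
(require 'cl-lib)

(defun my/summarize-runs (runs)
  "Summarize RUNS, a list of (ELAPSED GC-COUNT GC-ELAPSED) lists.
Return a plist with the mean elapsed time, the total number of GCs,
and the mean elapsed time with GC time subtracted."
  (let ((n (float (length runs)))
        (elapsed (cl-reduce #'+ runs :key #'car))
        (gc-count (cl-reduce #'+ runs :key #'cadr))
        (gc-time (cl-reduce #'+ runs :key (lambda (run) (nth 2 run)))))
    (list :mean-elapsed (/ elapsed n)
          :total-gcs gc-count
          :mean-without-gc (/ (- elapsed gc-time) n))))
```

It can be applied directly to the value of the `cl-loop ... collect (benchmark-run ...)` forms used in this post. Comparing `:mean-without-gc` between the old and new function separates the cost of the code itself from the cost of collecting its garbage.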

akashp · May 21, 2024, 9:44am

For 20,000 nodes

Test 
(cl-loop
 for i from 1 to 20
 collect (benchmark-run 1 (org-roam-node-list)))

Some improvement is noticeable in post-processing, but it is not significant. The bottleneck is emacsql querying the db.

CONTROL
#+RESULTS:
| 0.7299019400000001 | 4 |  0.3237409144639969 |
|        0.719273009 | 4 | 0.31644689105451107 |
| 0.7238439560000001 | 4 | 0.31617216393351555 |
|        0.721819844 | 4 |   0.316596582531929 |
|        0.723913723 | 4 |   0.317101800814271 |
|         0.72303185 | 4 |  0.3165578916668892 |
|        0.720873412 | 4 |   0.317128237336874 |
| 0.7200310830000001 | 4 |  0.3174141813069582 |
|        0.723670793 | 4 |  0.3173398170620203 |
| 0.7225935680000001 | 4 | 0.31758482195436954 |
| 0.7237227390000001 | 4 |   0.318003011867404 |
|         0.72211695 | 4 |  0.3177121039479971 |
|        0.723423175 | 4 | 0.31752581521868706 |
|        0.724030364 | 4 |  0.3179106302559376 |
|        0.724301683 | 4 | 0.31938767433166504 |
|        0.728445038 | 4 | 0.31795258820056915 |
| 0.7267083280000001 | 4 |   0.320640679448843 |
|        0.760352068 | 4 | 0.35379437915980816 |
| 0.7580344330000001 | 4 | 0.35416010953485966 |
|        0.727544302 | 4 | 0.31856883503496647 |

EXPERIMENT
#+RESULTS:
| 0.7292170410000001 | 4 |  0.3274015989154577 |
|          0.6765667 | 3 | 0.27632732316851616 |
|        0.678893372 | 3 |  0.2820032276213169 |
|        0.687574553 | 3 |  0.2887851446866989 |
|        0.686951109 | 3 | 0.29012931138277054 |
| 0.6911566100000001 | 3 | 0.29237726889550686 |
|        0.694173444 | 3 |  0.2917533665895462 |
|        0.689142251 | 3 | 0.29213809221982956 |
|        0.693177771 | 3 |  0.2920475658029318 |
|        0.691042293 | 3 |   0.291819978505373 |
| 0.6912445389999999 | 3 | 0.29204029589891434 |
| 0.6907293179999999 | 3 |   0.291503781452775 |
|        0.691926034 | 3 | 0.29254630766808987 |
|        0.690904596 | 3 | 0.29187605902552605 |
|        0.692919889 | 3 | 0.29172540083527565 |
|        0.693844989 | 3 |  0.2930472865700722 |
| 0.6927477240000001 | 3 |  0.2928554993122816 |
|        0.694191575 | 3 |  0.2936169896274805 |
|        0.694379735 | 3 | 0.29261351004242897 |
|        0.692311003 | 3 | 0.29310736805200577 |

#+begin_src elisp
(cl-loop
 for i from 1 to 20
 collect (benchmark-run 1 (org-roam-db-query
		 "SELECT
    id,
    file,
    filetitle,
    \"level\",
    todo,
    pos,
    priority ,
    scheduled ,
    deadline ,
    title,
    properties ,
    olp,
    atime,
    mtime,
    '(' || group_concat(tags, ' ') || ')' as tags,
    aliases,
    refs
  FROM
    (
    SELECT
      id,
      file,
      filetitle,
      \"level\",
      todo,
      pos,
      priority ,
      scheduled ,
      deadline ,
      title,
      properties ,
      olp,
      atime,
      mtime,
      tags,
      '(' || group_concat(aliases, ' ') || ')' as aliases,
      refs
    FROM
      (
      SELECT
	nodes.id as id,
	nodes.file as file,
	nodes.\"level\" as \"level\",
	nodes.todo as todo,
	nodes.pos as pos,
	nodes.priority as priority,
	nodes.scheduled as scheduled,
	nodes.deadline as deadline,
	nodes.title as title,
	nodes.properties as properties,
	nodes.olp as olp,
	files.atime as atime,
	files.mtime as mtime,
	files.title as filetitle,
	tags.tag as tags,
	aliases.alias as aliases,
	'(' || group_concat(RTRIM (refs.\"type\", '\"') || ':' || LTRIM(refs.ref, '\"'), ' ') || ')' as refs
      FROM nodes
      LEFT JOIN files ON files.file = nodes.file
      LEFT JOIN tags ON tags.node_id = nodes.id
      LEFT JOIN aliases ON aliases.node_id = nodes.id
      LEFT JOIN refs ON refs.node_id = nodes.id
      GROUP BY nodes.id, tags.tag, aliases.alias )
    GROUP BY id, tags )
  GROUP BY id")))
#+end_src

#+RESULTS:
|        0.600206946 | 3 |  0.2811295632272959 |
|        0.612772136 | 3 |  0.2911699693650007 |
| 0.6061969460000001 | 3 | 0.28548470325767994 |
|        0.610293451 | 3 | 0.28799888491630554 |
|        0.604368607 | 3 |  0.2836131118237972 |
|        0.617678633 | 3 | 0.29066314920783043 |
|         0.60434602 | 3 |  0.2837772723287344 |
|        0.613693533 | 3 | 0.29190177842974663 |
|        0.606856127 | 3 |   0.287681695073843 |
|        0.610696312 | 3 | 0.28828025236725807 |
|        0.608461516 | 3 |  0.2850720416754484 |
|        0.618411659 | 3 |   0.293755479156971 |
|        0.606569084 | 3 | 0.28513619117438793 |
|        0.613769477 | 3 |  0.2913442775607109 |
|        0.614647936 | 3 |  0.2909548692405224 |
|        0.613321754 | 3 |  0.2889191582798958 |
|        0.608215155 | 3 | 0.28532721288502216 |
| 0.6192423260000001 | 3 | 0.29310990683734417 |
|        0.623140292 | 3 |  0.2861739285290241 |
|        0.626372126 | 3 | 0.29102059081196785 |

akashp · May 21, 2024, 9:46am

#!/bin/bash

time sqlite3 /home/akash/Desktop/test2/test/org-roam.db <<EOF | wc -l

SELECT
  id,
  file,
  filetitle,
  "level",
  todo,
  pos,
  priority ,
  scheduled ,
  deadline ,
  title,
  properties ,
  olp,
  atime,
  mtime,
  '(' || group_concat(tags, ' ') || ')' as tags,
  aliases,
  refs
FROM
  (
  SELECT
    id,
    file,
    filetitle,
    "level",
    todo,
    pos,
    priority ,
    scheduled ,
    deadline ,
    title,
    properties ,
    olp,
    atime,
    mtime,
    tags,
    '(' || group_concat(aliases, ' ') || ')' as aliases,
    refs
  FROM
    (
    SELECT
      nodes.id as id,
      nodes.file as file,
      nodes."level" as "level",
      nodes.todo as todo,
      nodes.pos as pos,
      nodes.priority as priority,
      nodes.scheduled as scheduled,
      nodes.deadline as deadline,
      nodes.title as title,
      nodes.properties as properties,
      nodes.olp as olp,
      files.atime as atime,
      files.mtime as mtime,
      files.title as filetitle,
      tags.tag as tags,
      aliases.alias as aliases,
      '(' || group_concat(RTRIM (refs."type", '"') || ':' || LTRIM(refs.ref, '"'), ' ') || ')' as refs
    FROM nodes
    LEFT JOIN files ON files.file = nodes.file
    LEFT JOIN tags ON tags.node_id = nodes.id
    LEFT JOIN aliases ON aliases.node_id = nodes.id
    LEFT JOIN refs ON refs.node_id = nodes.id
    GROUP BY nodes.id, tags.tag, aliases.alias )
  GROUP BY id, tags )
GROUP BY id
EOF

./testdb
20001

real	0m0.233s
user	0m0.181s
sys	0m0.056s

Notice the difference in querying the db from sqlite directly and then through emacsql in the above result. That is the significant part in my opinion.

akashp · May 21, 2024, 9:56am

In profiler I get this

CONTROL

[image: Screenshot_2024-05-21_15-23-04]

YOUR FN

The gap between org-roam-node-list and org-roam-db-query does not differ significantly between the two cases
after accounting for mapcan, which shows up as additional load.

vherrmann · May 21, 2024, 10:06am

In my tests there were no relevant time differences either (3000 nodes). Further I have not seen a change in the semantics.

dmg · May 21, 2024, 5:04pm

If your database is synthetic, would you mind sharing it? That would make it easier to compare results. I think we are noticing that garbage collection uses about half of the time when the data is large.
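One way to check how much of a measurement is garbage collection is to run the same form with GC effectively switched off and compare. The macro below is a rough sketch under that assumption; `my/benchmark-no-gc` is a hypothetical name, and raising `gc-cons-threshold` this far is only sensible for short experiments.

```emacs-lisp
(require 'benchmark)

(defmacro my/benchmark-no-gc (&rest body)
  "Run BODY once under `benchmark-run' with GC effectively disabled.
Collect garbage first so that every run starts from a similar heap state."
  `(progn
     (garbage-collect)
     (let ((gc-cons-threshold most-positive-fixnum)
           (gc-cons-percentage 1.0))
       (benchmark-run 1 ,@body))))
```

For example, `(my/benchmark-no-gc (org-roam-node-list))` next to a plain `(benchmark-run 1 (org-roam-node-list))` shows how much of the elapsed time disappears when no collection happens during the call.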

dmg · May 21, 2024, 5:32pm

I was tempted to blame org-roam-db-query, but in my tests it runs very fast and only triggers GC once in 10 runs:

gist.github.com

https://gist.github.com/dmgerman/868f9ce838a9dbdb6c0a6d369f8f770b

test-org-roam-db-query.el

(defun dmg-test-function ()
  (org-roam-db-query
               "SELECT
  id,
  file,
  filetitle,
  \"level\",
  todo,

  pos,

This file has been truncated.

If we could create the node without creating almost 50 variables per node, that might reduce the need for GC.

For example, we could make sure the list returned by SQL has exactly the parameters, in the same order, that org-roam-node-create uses, and then use apply to call it with the list.

That would obviate the need to create each independent value and each parameter.

Perhaps even just calling org-roam-node-create with a list of parameters without their names (because they have to be reordered) would reduce the amount of memory needed.

#+RESULTS:
|             0.041108 | 0 |                0.0 |
|             0.028866 | 0 |                0.0 |
|             0.025647 | 0 |                0.0 |
|             0.024124 | 0 |                0.0 |
|             0.024285 | 0 |                0.0 |
|             0.024206 | 0 |                0.0 |
|             0.024036 | 0 |                0.0 |
|             0.024063 | 0 |                0.0 |
|             0.024332 | 0 |                0.0 |
|             0.024406 | 0 |                0.0 |
|             0.024261 | 0 |                0.0 |
|             0.024249 | 0 |                0.0 |
|             0.024303 | 0 |                0.0 |
|             0.227907 | 1 | 0.2036719999999974 |
| 0.024589999999999997 | 0 |                0.0 |
|             0.023835 | 0 |                0.0 |
|             0.023943 | 0 |                0.0 |
|             0.023985 | 0 |                0.0 |
| 0.024034999999999997 | 0 |                0.0 |
|             0.024061 | 0 |                0.0 |

akashp · May 21, 2024, 7:56pm

Would you want to generate a synthetic org roam db?

gist.github.com

https://gist.github.com/akashpal-21/1e0488e75703ea7eb29bd588e11cfc17

gistfile1.txt

#!/bin/bash

echo "Enter the number of files to generate:"
read num_files

# Check if the input is a positive integer
if [[ ! $num_files =~ ^[1-9][0-9]*$ ]]; then
    echo "Invalid input. Please enter a positive integer."
    exit 1
fi

This file has been truncated.

gistfile2.txt

((nil . ((org-roam-directory . "/home/user/Desktop/test/")
         (org-roam-db-location . "/home/user/Desktop/test/org-roam.db"))))

I don't know how to share non-text files across the internet trivially.

Please generate a high number of nodes and stress test and post results if possible. See if your results hold asymptotically.

dmg · May 21, 2024, 8:22pm

Please see this gist:

https://gist.github.com/dmgerman/f862c1c78d2a870af1bf622add837fa4

I am creating a new constructor for org-roam-node that does not require named parameters. It seems to improve performance significantly: creating 50k nodes is slightly more than 10 times faster.

akashp · May 21, 2024, 8:56pm

 CONTROL
| 0.28171295300000004 | 2 |         0.084284362 |
|         0.277870468 | 2 | 0.08414801599999855 |
|         0.277171279 | 2 |  0.0839628440000002 |
|         0.277248302 | 2 |  0.0834364480000005 |
|         0.276339267 | 2 | 0.08321827099999979 |
|         0.277154009 | 2 | 0.08335580900000039 |
|         0.275812979 | 2 | 0.08348886599999972 |
|         0.276216001 | 2 | 0.08317939199999991 |
| 0.27705607299999996 | 2 |  0.0832222490000003 |
|           0.2771496 | 2 | 0.08391891600000001 |

EXPERIMENT
|         0.132725327 | 2 | 0.08886539900000123 |
|         0.123759457 | 2 |  0.0830589789999987 |
| 0.12593211399999998 | 2 | 0.08323572800000001 |
|         0.126485794 | 2 | 0.08447631300000147 |
|         0.124876214 | 2 | 0.08367610299999839 |
|         0.126569966 | 2 | 0.08447158600000115 |
| 0.12523775599999998 | 2 | 0.08350987599999904 |
|         0.124439221 | 2 | 0.08326421699999997 |
|         0.124985401 | 2 |  0.0839019710000013 |
|         0.124684379 | 2 | 0.08347416400000007 |

Results are statistically significant; this seems promising for reducing redundant processing.
Will this show up when generating the node list? As far as I understand, you want to structure the data better to increase efficiency.

I am totally lost on data structures and can only go by intuition, but this is what I feel you are trying to do.

akashp · May 21, 2024, 8:57pm

But still my Control case is much lower than yours - why are we seeing such big deviations?

Interestingly control worsens after running experiment – consistently.

Second Control Run after Running Experiment
|        1.004602551 | 11 |  0.3946154679999996 |
|        0.996442979 | 11 |  0.3934522349999998 |
| 1.0004184539999998 | 11 |   0.393173667000001 |
|         0.99922416 | 11 | 0.39315555700000004 |
|        1.040827695 | 12 | 0.43161453899999813 |
|        1.001624114 | 11 | 0.39341975399999995 |
| 0.9980252949999999 | 11 | 0.39309455400000104 |
|        0.999549576 | 11 |  0.3929026740000019 |
|        0.999298959 | 11 | 0.39272235999999694 |
|        1.001026039 | 11 | 0.39427720500000163 |

Experiment Results consistent
| 0.15062167999999998 | 3 | 0.10843308800000173 |
| 0.11248324800000001 | 2 | 0.07184136599999746 |
| 0.14853618999999998 | 3 | 0.10716503200000105 |
|         0.112866889 | 2 | 0.07138863200000145 |
|         0.147375123 | 3 | 0.10682175999999899 |
| 0.11207946099999999 | 2 | 0.07141477500000093 |
| 0.14798034599999998 | 3 | 0.10689594799999824 |
|         0.112714898 | 2 | 0.07139104700000232 |
|         0.147596864 | 3 | 0.10656844300000046 |
|         0.112206112 | 2 |  0.0710386399999976 |

Control after restart again
|          0.30494312 | 3 | 0.10776294900000138 |
| 0.26504791499999997 | 2 | 0.07142495899999801 |
| 0.30040536700000003 | 3 | 0.10709436100000147 |
|         0.266023406 | 2 |  0.0713700519999989 |
| 0.30048368400000003 | 3 |  0.1071907670000023 |
|          0.26618901 | 2 | 0.07152417999999727 |
| 0.30043515299999995 | 3 | 0.10767216400000024 |
|         0.264418572 | 2 | 0.07140399700000089 |
|         0.265141299 | 2 | 0.07154106799999838 |
| 0.30027506299999995 | 3 | 0.10734223800000109 |

akashp · May 21, 2024, 9:34pm

Modified the node list fn to use your constructor; the results are very promising.

|        0.653262737 | 4 |  0.3129922920000041 |
|        0.654214992 | 4 | 0.30418103499999916 |
|        0.663630402 | 4 |  0.3086082529999956 |
|        0.659400501 | 4 |  0.3094828970000023 |
|        0.647324661 | 4 |  0.3043669179999995 |
|        0.657838998 | 4 |  0.3077575419999974 |
|        0.651547078 | 4 |  0.3046992610000032 |
|        0.656753804 | 4 | 0.31173734099999706 |
|        0.650651777 | 4 |  0.3028202170000043 |
|        0.646700192 | 4 |  0.3039561339999963 |
|        0.649505127 | 4 |  0.3049423660000059 |
|        0.644238977 | 4 |  0.3048566369999932 |
| 0.6451174289999999 | 4 | 0.30427459600000617 |
| 0.6490824089999999 | 4 |  0.3064676730000002 |
|        0.647599587 | 4 |  0.3042002400000001 |
| 0.6463355559999999 | 4 |  0.3051224279999971 |
|        0.653356245 | 4 |  0.3057291749999962 |
|        0.644165108 | 4 | 0.30516703199999995 |
| 0.6454056819999999 | 4 | 0.30434682900000354 |
| 0.6457858339999999 | 4 |  0.3060034439999981 |

(defun dmg-test-function ()
      (let ((rows (org-roam-db-query
		   "SELECT
      id,
      file,
      filetitle,
      \"level\",
      todo,
      pos,
      priority ,
      scheduled ,
      deadline ,
      title,
      properties ,
      olp,
      atime,
      mtime,
      '(' || group_concat(tags, ' ') || ')' as tags,
      aliases,
      refs
    FROM
      (
      SELECT
	id,
	file,
	filetitle,
	\"level\",
	todo,
	pos,
	priority ,
	scheduled ,
	deadline ,
	title,
	properties ,
	olp,
	atime,
	mtime,
	tags,
	'(' || group_concat(aliases, ' ') || ')' as aliases,
	refs
      FROM
	(
	SELECT
	  nodes.id as id,
	  nodes.file as file,
	  nodes.\"level\" as \"level\",
	  nodes.todo as todo,
	  nodes.pos as pos,
	  nodes.priority as priority,
	  nodes.scheduled as scheduled,
	  nodes.deadline as deadline,
	  nodes.title as title,
	  nodes.properties as properties,
	  nodes.olp as olp,
	  files.atime as atime,
	  files.mtime as mtime,
	  files.title as filetitle,
	  tags.tag as tags,
	  aliases.alias as aliases,
	  '(' || group_concat(RTRIM (refs.\"type\", '\"') || ':' || LTRIM(refs.ref, '\"'), ' ') || ')' as refs
	FROM nodes
	LEFT JOIN files ON files.file = nodes.file
	LEFT JOIN tags ON tags.node_id = nodes.id
	LEFT JOIN aliases ON aliases.node_id = nodes.id
	LEFT JOIN refs ON refs.node_id = nodes.id
	GROUP BY nodes.id, tags.tag, aliases.alias )
      GROUP BY id, tags )
    GROUP BY id
    --order by id, file
    "

    )))
    (cl-loop for row in rows do
	     (apply 'org-roam-node-create-from-db row))))

dmg · May 21, 2024, 9:45pm

The extra constructor I added to the org-roam-node does not alter the behaviour of any existing code.
It simply adds a new constructor that does not take named parameters. Named parameters are very good to avoid
errors, but they also incur a cost in their processing.

The big improvement is in avoiding garbage collection.
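To see the difference between the two kinds of constructor in isolation, here is a minimal sketch with a toy struct. `my-point` and both constructor names are hypothetical and stand in for org-roam-node; the comparison function only returns the raw `benchmark-run` results and makes no claim about which side wins.

```emacs-lisp
(require 'cl-lib)
(require 'benchmark)

(cl-defstruct (my-point
               (:constructor my-point-create)            ; keyword arguments
               (:constructor my-point-create+ (x y z)))  ; positional arguments
  x y z)

(defun my/compare-constructors (n)
  "Build N structs with each constructor.
Return (KEYWORD-RESULT POSITIONAL-RESULT), each a `benchmark-run' list."
  (let ((row '(1 2 3)))
    (list (benchmark-run 1
            (dotimes (_ n)
              (my-point-create :x (nth 0 row) :y (nth 1 row) :z (nth 2 row))))
          (benchmark-run 1
            (dotimes (_ n)
              (apply #'my-point-create+ row))))))
```

Both constructors fill the same slots, so existing accessors keep working. The measured gap may depend on whether the calling code is byte-compiled or interpreted, so it is worth running `(my/compare-constructors 50000)` in the same way the real function will be run.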

I’ll try the synthetic database tonight or tomorrow.

dmg · May 22, 2024, 12:10am

Good question. I am running emacs

GNU Emacs 29.3 (build 1, aarch64-apple-darwin21.6.0, NS appkit-2113.60 Version 12.6.6 (Build 21G646)) of 2024-03-24

using a mac air M1 with 16 gigs of memory.

akashp · May 22, 2024, 3:06am

I wanted to ask: how are we making slots inside the struct?

Without the slots we can never access any data through our accessor functions.
I couldn't get any accessor function to return the value of its slot with your constructor.

dmg · May 22, 2024, 3:47am

The new constructor takes care of that. If you look at the definition, it expects the parameters to come in a certain order (as compared to the by-name definition, which expects every parameter to be named).

See the :constructor option of cl-defstruct.

A struct can have zero or more constructors.

For example:

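(cl-defstruct
    (person
     (:constructor my-person)   ; keyword constructor, replaces make-person
     (:constructor new-person   ; positional constructor
                   (first-name sex &optional (age 0))))

  first-name age sex)

;; Positional arguments bind in arglist order (first-name, sex, age),
;; so this call sets sex to 12 and age to "male".
(setq p1 (new-person "jim" 12 "male"))
(setq p2 (my-person :first-name "jim" :age 12 :sex "male"))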

(format "Person [%S] [%S]" p1 p2)

would output

"Person [#s(person \"jim\" \"male\" 12)] [#s(person \"jim\" 12 \"male\")]"

#+begin_src emacs-lisp   :exports both :results verbatim
(setq a-node
  (apply 'org-roam-node-create-from-db
          '("the-title"  "the-id" "the-filename" "d" "e" 
                          "f" "g" "h" "i" "j" 
                          "k" "l" "m" "n" "o"
                          (123 235 989) "q"  ; "r"
                         ; "s" "t"
                          )))
(list (org-roam-node-title a-node) (org-roam-node-file a-node))
#+end_src

#+RESULTS:
#+begin_example
("the-title" "the-filename")
#+end_example

akashp · May 23, 2024, 2:25am

OK, I had to make myself literate in the Common Lisp library.

One good thing about your tests is that they have allowed me to understand the data types and how they are used.

I think you have defined your constructor function wrongly: we have to define the arg list in the same order we get from the query, which is conveniently provided by the pcase destructuring binding.

Here's the full corrected code:

(cl-defstruct (org-roam-node (:constructor org-roam-node-create)
                             (:constructor org-roam-node-create+
                                           (id &optional file file-title level todo pos priority scheduled deadline
					       title properties olp atime mtime tags aliases refs))
                             (:copier nil))
  "A heading or top level file with an assigned ID property."
  file file-title file-hash file-atime file-mtime
  id level point todo priority scheduled deadline title properties olp
  tags aliases refs)

(defun test/org-roam-node-list ()
  "Return all nodes stored in the database as a list of `org-roam-node's."
  (let ((rows (org-roam-db-query
               "SELECT
  id,
  file,
  filetitle,
  \"level\",
  todo,
  pos,
  priority ,
  scheduled ,
  deadline ,
  title,
  properties ,
  olp,
  atime,
  mtime,
  '(' || group_concat(tags, ' ') || ')' as tags,
  aliases,
  refs
FROM
  (
  SELECT
    id,
    file,
    filetitle,
    \"level\",
    todo,
    pos,
    priority ,
    scheduled ,
    deadline ,
    title,
    properties ,
    olp,
    atime,
    mtime,
    tags,
    '(' || group_concat(aliases, ' ') || ')' as aliases,
    refs
  FROM
    (
    SELECT
      nodes.id as id,
      nodes.file as file,
      nodes.\"level\" as \"level\",
      nodes.todo as todo,
      nodes.pos as pos,
      nodes.priority as priority,
      nodes.scheduled as scheduled,
      nodes.deadline as deadline,
      nodes.title as title,
      nodes.properties as properties,
      nodes.olp as olp,
      files.atime as atime,
      files.mtime as mtime,
      files.title as filetitle,
      tags.tag as tags,
      aliases.alias as aliases,
      '(' || group_concat(RTRIM (refs.\"type\", '\"') || ':' || LTRIM(refs.ref, '\"'), ' ') || ')' as refs
    FROM nodes
    LEFT JOIN files ON files.file = nodes.file
    LEFT JOIN tags ON tags.node_id = nodes.id
    LEFT JOIN aliases ON aliases.node_id = nodes.id
    LEFT JOIN refs ON refs.node_id = nodes.id
    GROUP BY nodes.id, tags.tag, aliases.alias )
  GROUP BY id, tags )
GROUP BY id")))
    (cl-loop for row in rows
             collect (apply #'org-roam-node-create+ row))))

(advice-add 'org-roam-node-list :override #'test/org-roam-node-list)

What do we lose?

Aliases - they would not work as intended.

EDIT

Sorry, we cannot take the names directly from the pcase destructuring binding.

(cl-defstruct (org-roam-node (:constructor org-roam-node-create)
                             (:constructor org-roam-node-create+
                                           (id &optional file file-title level todo point priority scheduled deadline
					       title properties olp file-atime file-mtime tags aliases refs))
                             (:copier nil))
  "A heading or top level file with an assigned ID property."
  file file-title file-hash file-atime file-mtime
  id level point todo priority scheduled deadline title properties olp
  tags aliases refs)

We need to account for the name changes done inside that function.
Overall, this should be our new constructor definition.

akashp · May 23, 2024, 2:29am

It seems to me the point of doing the rigmarole and using &key parameters in the original implementation is to allow for aliases.
Without processing for aliases we gain about +0.1 seconds for the case of 20000 nodes in my test.
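For reference, alias handling can be kept with a positional constructor by creating one node per title and alias. The sketch below uses a toy three-slot struct; `my-node`, `my-node-create+` and `my/rows-to-nodes` are hypothetical names, and the row shape (ID TITLE ALIASES) is a simplification of what the real query returns.

```emacs-lisp
(require 'cl-lib)

(cl-defstruct (my-node (:constructor my-node-create+ (id title aliases)))
  id title aliases)

(defun my/rows-to-nodes (rows)
  "Turn ROWS of (ID TITLE ALIASES) into nodes, one per title and alias.
Each alias yields an extra node whose title slot holds the alias."
  (mapcan (lambda (row)
            (cl-destructuring-bind (id title aliases) row
              ;; `mapcar' returns a fresh list, so `mapcan' can splice it.
              (mapcar (lambda (name) (my-node-create+ id name aliases))
                      (cons title aliases))))
          rows))
```

A row with two aliases produces three nodes that share the same id, so a completion list built from the result can match on either the title or an alias. The extra allocations per alias are the cost that the alias-free version above avoids.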
