What does it feel like to work with 10,000 notes in Org-roam: Benchmarking Org-roam’s search methods
Intro
I got inspiration from this exchange in Org-roam Discourse on “Performance Testing”.
I went ahead, downloaded the 10,000 markdown files, and took Org-roam (with Md-roam as a companion) for a quick spin.
Here is my quick write-up, along with a short video demo, to share my impressions with the community here. The write-up reads more like lab notes than a finished paper (I have never been a scientist, so that is just my imagination).
Thank you, @cobblepot, for the link to the 10,000 markdown files
Summary
- I find Org-roam good and workable with 10,000 notes.
- For the org-roam--list-files-xx functions, I observe a significant performance difference between elisp and rg (two different search methods that work on Windows). I suspect it would affect the performance of org-roam-insert and org-roam-find-file, but I did not confirm this myself.
- elisp starts out faster, but its performance degrades linearly with the number of files. rg is nearly flat: about 5.1 s for 10 runs at 100, 1,000, and 2,000 files, and a little over 6 s at 10,000. rg starts to be more attractive somewhere between 2,000 and 3,000 files. elisp does not seem feasible at the 10,000 mark.
- The first DB build for 10,000 files takes 6 to 7 minutes on my machine. You would normally build your DB gradually over time and only occasionally need to rebuild it from scratch, so this should not be a problem.
- After the initial build, launching Org-roam takes about a minute (org-roam-db-build-cache needs this time). That might be once a day when you start working on your PC, or once every few weeks if you don’t turn it off. Not bad for a knowledge base of 10,000 notes that helps you do more and better knowledge work.
- Inserting and searching for an existing note is OK; see my 2-minute YouTube video to get an impression yourself.
There are obvious limitations to my tests.
- I only looked at the number of files; I did not create a large number of backlinks among them. To prepare such data, I would need some programming support to add links to the 10,000 files. I do not know the SQL and DB architecture well enough to assess whether to expect a big difference in performance with exponentially more backlinks.
- In addition, I didn’t test graph capabilities, and I didn’t have notes in subdirectories of org-roam-directory. I do not expect subdirectories to change performance much, but I could be wrong.
Environment
I do not know at what point machine power starts to play a significant role, but here are the basic specifications of mine:
- Processor: Intel Core i7-8650U 1.90 GHz
- RAM: 16.0 GB
Method
It’s a rather rudimentary set-up with easy manual execution.
- Download the 10,000 markdown files from here
- Unzip, then copy 100, 1,000, and 2,000 files to separate folders
- Change the configuration for org-roam-directory to point to the respective folder for each case
- Run the benchmark with the interactive function benchmark as:
(benchmark 10 '(org-roam--list-files-xx "full/path/to/org-roam-directory"))
(I learned about benchmark from GitHub user siawyoung in this exchange on an Org-roam PR. Thanks!)
The full path seems to be required on Windows, as ~/ does not seem to expand in this form. It also seems that on Windows I need to use \\ as the separator in the path.
- Run the benchmark again and record the elapsed time for both the first and second runs.
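As a minimal sketch of the path handling described above, the built-in expand-file-name can turn ~/ into a full path before it is handed to a list-files function. The helper my/time-calls and the folder name are hypothetical, and where ~ points on Windows depends on the HOME setting, so the expanded path may not match the C:\\Users\\... paths used in the results.

```elisp
;; Hypothetical helper, not part of Org-roam: call FN with ARGS REPEATS
;; times and return the total elapsed wall-clock time in seconds.
(defun my/time-calls (repeats fn &rest args)
  "Return the seconds taken to call FN with ARGS REPEATS times."
  (let ((start (float-time)))
    (dotimes (_ repeats)
      (apply fn args))
    (- (float-time) start)))

;; expand-file-name resolves ~/ to a full path; on Windows the result uses
;; forward slashes, which Emacs accepts.  The folder name is an assumption.
(let ((dir (expand-file-name "~/100-markdown")))
  ;; Only run when Org-roam's internal function is actually loaded.
  (when (fboundp 'org-roam--list-files-elisp)
    (message "elisp: %.3fs"
             (my/time-calls 10 #'org-roam--list-files-elisp dir))))
```

Unlike benchmark, this helper does not report garbage-collection time, so it is only a rough stand-in for the figures recorded below.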
Results
100 notes
(benchmark 10 '(org-roam--list-files-elisp "C:\\Users\\nobiot\\100-markdown"))
Elapsed time: 0.400056s
Elapsed time: 0.264749s
(benchmark 10 '(org-roam--list-files-rg "C:\\Users\\nobiot\\scoop\\shims\\rg.exe" "C:\\Users\\nobiot\\100-markdown"))
Elapsed time: 5.150758s
Elapsed time: 5.137724s
1,000 notes
(benchmark 10 '(org-roam--list-files-elisp "C:\\Users\\nobiot\\1000-markdown"))
Elapsed time: 2.274566s (0.079555s in 1 GCs)
Elapsed time: 2.486280s
(benchmark 10 '(org-roam--list-files-rg "C:\\Users\\nobiot\\scoop\\shims\\rg.exe" "C:\\Users\\nobiot\\1000-markdown"))
Elapsed time: 5.174479s
Elapsed time: 5.167108s
2,000 notes
(benchmark 10 '(org-roam--list-files-elisp "C:\\Users\\nobiot\\2000-markdown"))
Elapsed time: 4.788072s (0.069892s in 1 GCs)
Elapsed time: 4.725148s (0.075500s in 1 GCs)
(benchmark 10 '(org-roam--list-files-rg "C:\\Users\\nobiot\\scoop\\shims\\rg.exe" "C:\\Users\\nobiot\\2000-markdown"))
Elapsed time: 5.151170s
Elapsed time: 5.095258s
10,000 notes
(benchmark 10 '(org-roam--list-files-elisp "C:\\Users\\nobiot\\10000-markdown"))
Elapsed time: 22.899702s (0.419321s in 6 GCs)
Elapsed time: 22.441639s (0.408953s in 6 GCs)
(benchmark 10 '(org-roam--list-files-rg "C:\\Users\\nobiot\\scoop\\shims\\rg.exe" "C:\\Users\\nobiot\\10000-markdown"))
Elapsed time: 6.148009s (0.067745s in 1 GCs)
Elapsed time: 6.456070s (0.162346s in 2 GCs)
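To make the crossover point concrete, here is a rough back-of-the-envelope estimate using only the second-run timings at 2,000 files. It assumes that elisp time grows in proportion to the file count and that rg time stays constant; both are simplifications, and the helper name my/estimate-crossover is hypothetical.

```elisp
;; Hypothetical helper: estimate the file count at which elisp becomes as
;; slow as rg, assuming elisp time is proportional to the number of files
;; and rg time stays constant.
(defun my/estimate-crossover (elisp-secs rg-secs n-files)
  "Estimate the crossover file count from timings measured at N-FILES."
  (let ((elisp-per-file (/ (float elisp-secs) n-files)))
    (/ rg-secs elisp-per-file)))

;; Second-run totals for 10 repetitions at 2,000 files, from the results above.
(message "Estimated crossover: %.0f files"
         (my/estimate-crossover 4.725148 5.095258 2000))
```

The estimate lands a little above 2,000 files. Note also that every figure above is the total for 10 repetitions, so a single call at 10,000 files takes roughly 2.2 s with elisp and 0.6 s with rg.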
org-roam-db-build-cache
Build the DB file from scratch (no DB file exists).
I ran this only once; I am not going to wait that long again for org-roam-db-build-cache.
(benchmark 1 '(org-roam-db-build-cache))
(org-roam) files: 10000, links: 0, tags: 0, titles: 10000, refs: 0, deleted: 0
Elapsed time: 376.709222s (28.662395s in 521 GCs)
Once the DB is built, run org-roam-db-build-cache again. This happens every time you relaunch Emacs and Org-roam, even if no notes have changed since you shut Emacs down.
(benchmark 1 '(org-roam-db-build-cache))
(org-roam) files: 0, links: 0, tags: 0, titles: 0, refs: 0, deleted: 0
Elapsed time: 47.147790s (0.913966s in 18 GCs)
Video demo to share impression
My video demo uses org-roam--list-files-rg instead of the default org-roam--list-files-elisp.
Limitation of this performance testing
- Few backlinks between notes. If you have many backlinks between notes, you might potentially have an exponentially higher volume of data. I am not sure how this translates to the perceived performance of the system.
- Related to the backlinks, I didn’t test graph capabilities.
- I had only a flat directory. I don’t necessarily expect it to be much different, but the use of subdirectories might influence performance.
Coda
It was fun. It also feels to me like a useful exercise. Hope you can take away something useful from my report and video, too.
I don’t know how to add backlinks easily. Perhaps someone in the community can go beyond my tests here and see where that takes us.
Luhmann is said to have had 90,000 slip-notes (source, and referenced here). I don’t think I’ll ever reach the 10,000 mark, but it’s good to imagine what it could be like. It seems Org-roam can continue to be your good companion there.
Happy note taking.