What does it feel like to work with 10,000 notes in Org-roam: Benchmarking Org-roam’s search methods
I got inspiration from this exchange in Org-roam Discourse on “Performance Testing”.
I went ahead, downloaded the 10,000 markdown files, and took Org-roam (with Md-roam as a companion) for a quick spin.
Here is my quick write-up, along with a short video demo, to share my impressions with the community here. The write-up reads more like lab notes than a finished paper (I have never been a scientist, so that is just my imagination).
Thank you, @cobblepot, for the link to the 10,000 markdown files.
I find it good and workable with 10,000 notes.
For the org-roam--list-files-xx functions, I observe a significant performance difference between elisp and rg (two different search methods that work on Windows). I suspect this would affect the performance of org-roam-find-file, but I did not confirm this myself.
- elisp starts out faster, but its elapsed time then grows roughly linearly with the number of files.
- rg stays roughly flat across 100, 1,000, and 10,000 files.
- rg starts to be more attractive somewhere between 2,000 and 3,000 files.
- elisp does not seem feasible at the 10,000 mark.
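To make the "between 2,000 and 3,000 files" estimate a little more concrete, here is a minimal sketch that draws a straight line through the 2,000-file and 10,000-file timings of each method and solves for the point where the two lines meet. The numbers are the first-run totals from the results further down in this post; the names `my/elisp-times`, `my/rg-times`, `my/line-through`, and `my/crossover` are made up for this illustration. Treating both curves as straight lines between two measurements is an assumption, so the result is only a rough indication.

```elisp
;; First-run totals (seconds for 10 calls), copied from the results in this post.
(defconst my/elisp-times '((2000 . 4.788072) (10000 . 22.899702)))
(defconst my/rg-times    '((2000 . 5.151170) (10000 . 6.148009)))

(defun my/line-through (points)
  "Return (SLOPE . INTERCEPT) of the straight line through the two POINTS.
POINTS is an alist of two (FILES . SECONDS) pairs."
  (let* ((x1 (float (car (nth 0 points))))
         (y1 (cdr (nth 0 points)))
         (x2 (float (car (nth 1 points))))
         (y2 (cdr (nth 1 points)))
         (slope (/ (- y2 y1) (- x2 x1))))
    (cons slope (- y1 (* slope x1)))))

(defun my/crossover (points-a points-b)
  "Return the file count at which the lines through POINTS-A and POINTS-B meet."
  (let ((a (my/line-through points-a))
        (b (my/line-through points-b)))
    ;; Solve  slope-a * n + intercept-a = slope-b * n + intercept-b  for n.
    (/ (- (cdr b) (cdr a))
       (- (car a) (car b)))))

;; Evaluate this to get the estimated break-even number of files.
(my/crossover my/elisp-times my/rg-times)
```

With these inputs the lines meet at roughly 2,170 files, which sits inside the range quoted above. Only four directory sizes were measured, so the real break-even point on another machine could well differ.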
The first DB build for 10,000 files takes 6–7 minutes on my machine. You only need to do this once in a while: you build up your DB over time and rarely need to rebuild it from scratch, so it should not be a problem.
After the initial build, launching Org-roam takes a little under a minute, about 47 seconds in my run (org-roam-db-build-cache needs this time). That might be once a day when you start working on your PC, or once every few weeks if you don’t turn it off. Not bad for a knowledge base of 10,000 notes that helps you produce more and better knowledge work.
Inserting and searching for an existing note is OK. See my 2-minute YouTube video to get an impression yourself.
There are obvious limitations to my tests.
I only looked at the number of files; I did not create a large number of backlinks among them. To prepare such data, I would need some programming support to add links to the 10,000 files. I do not know the SQL and DB architecture well enough to judge whether to expect a big difference in performance with exponentially more backlinks.
In addition, I didn’t test the graph capabilities, and I didn’t have notes in subdirectories of org-roam-directory. I do not expect subdirectories to change performance much, but I could be wrong.
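On the backlink limitation: one possible way to generate test data is to append a few random links to every file. The sketch below does that in Emacs Lisp. It is not something I ran for the numbers in this post. The function name `my/add-random-links` is made up, and two things are assumptions: that the notes end in `.md`, and that a `[[file-base-name]]` wiki-style link is what your Md-roam set-up recognises as a link. Please check the link syntax for your version, and run this only on a copy of the files, since it modifies them in place.

```elisp
(defun my/add-random-links (dir links-per-file)
  "Append LINKS-PER-FILE wiki-style links to every .md file in DIR.
Each link targets the base name of a randomly chosen file in DIR
\(a file may end up linking to itself).  Return the number of files
modified.  ASSUMPTION: [[name]] is a link syntax your set-up understands."
  (let* ((files (directory-files (expand-file-name dir) t "\\.md\\'"))
         (names (vconcat (mapcar #'file-name-base files)))
         (count (length names))
         (coding-system-for-write 'utf-8))
    (dolist (file files)
      (let ((links '()))
        (dotimes (_ links-per-file)
          (push (format "[[%s]]" (aref names (random count))) links))
        ;; A string as the first argument writes that string; the fourth
        ;; argument t appends to the end of FILE instead of overwriting it.
        (write-region (concat "\n" (mapconcat #'identity links "\n") "\n")
                      nil file t 'silent)))
    (length files)))

;; Hypothetical usage, on a copy of the 10,000 files:
;; (my/add-random-links "~/10000-markdown-copy" 5)
```

After adding links this way, the DB would need to be rebuilt before any timing comparison means anything.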
I do not know where the machine power starts to play a significant role, but here are basic characteristics of mine:
- Processor: Intel Core i7-8650U 1.90 GHz
- RAM: 16.0 GB
It’s a rather rudimentary set-up, and the execution is manual and easy.
- Download the 10,000 markdown files from here.
- Unzip them, and copy 100, 1,000, and 2,000 files to separate folders.
- Change the configuration for org-roam-directory to point to the respective folder for each case.
- Run the benchmark with the interactive function benchmark:

      (benchmark 10 '(org-roam--list-files-xx "full/path/to/org-roam-directory"))

  (I learned about benchmark from GitHub user siawyoung in this exchange on an Org-roam PR. Thanks!)
  The full path seems to be required on Windows, as ~/ does not seem to expand in this form. It also seems that on Windows I need to use \\ as the path separator.
- Run the benchmark again and record the elapsed time for both the first and second repeats.
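The steps above could be scripted roughly as follows. This is a sketch, not what I actually ran (I did everything by hand). The names `my/make-sample-dir` and `my/time-lister` are made up, and the `.md` extension is an assumption about the downloaded files. `expand-file-name` is a standard Emacs function that turns `~/...` into an absolute path, which may avoid typing the full path; where `~` points on Windows depends on how HOME is set. For timing I use a plain `float-time` loop rather than the `benchmark` command, so the figures will not match its output exactly.

```elisp
(defun my/make-sample-dir (source-dir target-dir n)
  "Copy the first N .md files from SOURCE-DIR into TARGET-DIR.
TARGET-DIR is created if needed.  Return the number of files copied."
  (let ((files (directory-files (expand-file-name source-dir) t "\\.md\\'"))
        (target (file-name-as-directory (expand-file-name target-dir)))
        (copied 0))
    (make-directory target t)
    (while (and files (< copied n))
      ;; TARGET ends in a slash, so the file keeps its own name inside it.
      (copy-file (car files) target t)
      (setq files (cdr files)
            copied (1+ copied)))
    copied))

(defun my/time-lister (lister dir &optional repetitions)
  "Call LISTER on DIR REPETITIONS times (default 10).
LISTER is any function taking one absolute directory name.
Return the total elapsed wall-clock time in seconds."
  (let ((abs-dir (expand-file-name dir))
        (n (or repetitions 10))
        (start (float-time)))
    (dotimes (_ n)
      (funcall lister abs-dir))
    (- (float-time) start)))

;; Hypothetical usage (needs Org-roam loaded for the last two forms):
;; (my/make-sample-dir "~/10000-markdown" "~/100-markdown" 100)
;; (my/time-lister #'org-roam--list-files-elisp "~/100-markdown")
;; (my/time-lister (lambda (dir)
;;                   (org-roam--list-files-rg (executable-find "rg") dir))
;;                 "~/100-markdown")
```

The rg variant is wrapped in a lambda because, as in my runs, that function takes the path to the rg executable as its first argument.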
    (benchmark 10 '(org-roam--list-files-elisp "C:\\Users\\nobiot\\100-markdown"))
    Elapsed time: 0.400056s
    Elapsed time: 0.264749s
    (benchmark 10 '(org-roam--list-files-rg "C:\\Users\\nobiot\\scoop\\shims\\rg.exe" "C:\\Users\\nobiot\\100-markdown"))
    Elapsed time: 5.150758s
    Elapsed time: 5.137724s

    (benchmark 10 '(org-roam--list-files-elisp "C:\\Users\\nobiot\\1000-markdown"))
    Elapsed time: 2.274566s (0.079555s in 1 GCs)
    Elapsed time: 2.486280s
    (benchmark 10 '(org-roam--list-files-rg "C:\\Users\\nobiot\\scoop\\shims\\rg.exe" "C:\\Users\\nobiot\\1000-markdown"))
    Elapsed time: 5.174479s
    Elapsed time: 5.167108s

    (benchmark 10 '(org-roam--list-files-elisp "C:\\Users\\nobiot\\2000-markdown"))
    Elapsed time: 4.788072s (0.069892s in 1 GCs)
    Elapsed time: 4.725148s (0.075500s in 1 GCs)
    (benchmark 10 '(org-roam--list-files-rg "C:\\Users\\nobiot\\scoop\\shims\\rg.exe" "C:\\Users\\nobiot\\2000-markdown"))
    Elapsed time: 5.151170s
    Elapsed time: 5.095258s

    (benchmark 10 '(org-roam--list-files-elisp "C:\\Users\\nobiot\\10000-markdown"))
    Elapsed time: 22.899702s (0.419321s in 6 GCs)
    Elapsed time: 22.441639s (0.408953s in 6 GCs)
    (benchmark 10 '(org-roam--list-files-rg "C:\\Users\\nobiot\\scoop\\shims\\rg.exe" "C:\\Users\\nobiot\\10000-markdown"))
    Elapsed time: 6.148009s (0.067745s in 1 GCs)
    Elapsed time: 6.456070s (0.162346s in 2 GCs)
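Note that `benchmark` prints the time for all repetitions together, so each figure above covers 10 calls. A small sketch to convert the first-run totals into milliseconds per single call (the names `my/reported-totals` and `my/per-call-ms` are made up for this illustration):

```elisp
(defconst my/reported-totals
  '((100   0.400056  5.150758)
    (1000  2.274566  5.174479)
    (2000  4.788072  5.151170)
    (10000 22.899702 6.148009))
  "Rows of (FILES ELISP-TOTAL RG-TOTAL).
Totals are first-run seconds for 10 calls, copied from this post.")

(defun my/per-call-ms (total repetitions)
  "Convert TOTAL seconds over REPETITIONS calls to milliseconds per call."
  (/ (* 1000.0 total) repetitions))

;; One row per directory size: (FILES ELISP-MS RG-MS).
(mapcar (lambda (row)
          (list (nth 0 row)
                (my/per-call-ms (nth 1 row) 10)
                (my/per-call-ms (nth 2 row) 10)))
        my/reported-totals)
```

Read this way, a single listing call costs about 40 ms with elisp against about 515 ms with rg at 100 files, and about 2.3 s against about 0.6 s at 10,000 files.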
Build the db file from scratch (no db file exists).
I ran this only once, as I was not going to wait that long a second time.

    (benchmark 1 '(org-roam-db-build-cache))
    (org-roam) files: 10000, links: 0, tags: 0, titles: 10000, refs: 0, deleted: 0
    Elapsed time: 376.709222s (28.662395s in 521 GCs)
Once the db is built, run org-roam-db-build-cache again. This happens every time you re-launch Emacs and Org-roam, even if the notes have not changed between shutting Emacs down and restarting it.

    (benchmark 1 '(org-roam-db-build-cache))
    (org-roam) files: 0, links: 0, tags: 0, titles: 0, refs: 0, deleted: 0
    Elapsed time: 47.147790s (0.913966s in 18 GCs)
Video demo to share impression
My video demo uses org-roam--list-files-rg, instead of the default
Limitation of this performance testing
Few backlinks between notes
If you have multiple backlinks between notes, you might potentially have an exponentially higher volume of data. I am not sure how this translates into the perceived performance of the system.
Related to the backlinks, I didn’t test graph capabilities.
I had only a flat directory. I don’t necessarily expect it to be much different, but the use of subdirectories might influence performance.
It was fun. It also feels to me like a useful exercise. Hope you can take away something useful from my report and video, too.
I don’t know how I can add backlinks easily. Perhaps someone in the community can go beyond my tests here and see where that takes us.
Luhmann is said to have had 90,000 slip-notes (source, and referenced here). I don’t think I’ll ever reach the 10,000 mark, but it’s good to imagine what it could be like. It seems Org-roam can continue to be your good companion there.
Happy note taking.