Org-roam v2 is an amazing project and brought org-roam to a new level of sophistication. Still, I never made the switch. As a reminder, the crucial difference between the two was the move from files as the main unit of interaction to “nodes”, which are org headlines with an org-id attached to them. This allowed for impressive graphs and more thorough linking to headlines (and even to show backlinks for nodes) but it never clicked with me and so I continued to use v1 (as mentioned a few times here).
This worked fine for me until recently, when some weird emacsql problem broke the interaction between Emacs 29 and org-roam v1 on a new (old) Intel Mac from work, which I wanted to use for field research (instead of my newer and expensive MBP). Enter orgrr:
Orgrr is an almost feature-complete replica of the core functionality of org-roam v1, built using ripgrep (rg), a lot of regex and hashtables. It does recognize alternative note titles (#+roam_alias) and tags (#+roam_tags) as introduced by org-roam v1. Orgrr currently only works with org-files (i.e. files ending in .org).
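To illustrate the general idea of reading titles, aliases and tags with regex into a hashtable, here is a minimal Python sketch. It is not orgrr's actual code (orgrr is written in elisp and uses rg for the search step). The function names `parse_header` and `build_index` are made up for this example, and it assumes that multi-word aliases and tags are wrapped in double quotes.

```python
import re
import shlex

# Matches header keywords such as "#+title: My note" at the start of a line.
# [ \t]* (rather than \s*) keeps the match from running into the next line.
KEYWORD_RE = re.compile(
    r"^#\+(title|roam_alias|roam_tags):[ \t]*(.*)$",
    re.IGNORECASE | re.MULTILINE,
)


def split_values(value):
    """Split a keyword value, keeping double-quoted multi-word items together."""
    try:
        return shlex.split(value)
    except ValueError:
        # Unbalanced quotes or a stray apostrophe: fall back to whitespace.
        return value.split()


def parse_header(text):
    """Return the title, aliases and tags found in one org file's text."""
    meta = {"title": None, "aliases": [], "tags": []}
    for key, value in KEYWORD_RE.findall(text):
        key = key.lower()
        value = value.strip()
        if key == "title":
            meta["title"] = value
        elif key == "roam_alias":
            meta["aliases"].extend(split_values(value))
        else:  # roam_tags
            meta["tags"].extend(split_values(value))
    return meta


def build_index(files):
    """Map every title and alias to its filename.

    `files` is a dict of filename -> file contents (an assumption made
    here so the sketch runs without touching the disk).
    """
    index = {}
    for filename, text in files.items():
        meta = parse_header(text)
        for name in [meta["title"], *meta["aliases"]]:
            if name:
                index[name] = filename
    return index
```

Looking up a note by title or alias is then a single dictionary access, which is why no database is needed for this step.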
In part I wrote this to learn more elisp (just so you know where I’m coming from). I’ve been using orgrr for many hours in the past few days and it has worked for me (I fixed a few remaining smaller shortcomings). If you also miss the first version of org-roam or prefer not to use databases (or want to reduce dependencies), then this might be something for you.
PS: For full-text search I strongly recommend deadgrep, which also uses rg.
The GitHub traffic page for orgrr shows that folks are still being sent there from this post. On the occasion of the two-year anniversary of the project, here are some notes on how orgrr has evolved:
Most of these changes have addressed personal needs of a social scientist (me). As I am certainly biased, take my words with a grain of salt - for me orgrr is the best tool to write, structure and analyze thousands of plain-text notes. I love the independence/freedom that writing this package myself provides me.
Very fascinating. Will try it out someday. I was looking forward to writing a minimal system from scratch for myself. Interested to know the pros/cons of going the database route versus on-the-fly regexp. Does it scale well? Is it possible to parse various information programmatically? One thing that intimidates me is having to write regexps continuously to get regular data.
Will check out your work. Thanks for the ping.
Thanks! For most use-cases I use rg and regexp to put the data from org files into local hashtables, which are very fast. Even without caching it scales pretty well, I think:
Parsing org files and filenames (as unique identifiers) is pretty straightforward because of their regular nature. In comparison, parsing HTML is just horror (for example, website2org is 90% regexp and pattern matching).
Yes and no. It is somewhat similar in that we both keep the database in RAM only. His is in SQL+hashtables, mine is hashtables only. And his package does collect a lot more metadata (including tasks), individually parsing every org-mode file. Mine uses ripgrep, is a bit older and so I had more time to work on all the eventual quirks and issues. I think org-node - the basis for this project - is already pretty mature.
@laotang, orgrr shows links, both forward and backward, to the second degree (2-hop links). It also shows sequential links (Luhmann’s Folgezettel).
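As a rough illustration of what second-degree links mean, here is a minimal Python sketch. It is not how orgrr computes them. It assumes notes link to each other with `[[file:name.org]]`-style links, and it defines "2-hop" as every note reachable in exactly two steps when following links in either direction. The function names are hypothetical.

```python
import re
from collections import defaultdict

# Captures the target of links such as [[file:b.org][Description]].
LINK_RE = re.compile(r"\[\[file:([^\]]+\.org)\]")


def forward_links(files):
    """files: dict of filename -> text. Returns filename -> set of link targets."""
    return {name: set(LINK_RE.findall(text)) for name, text in files.items()}


def backlinks(forward):
    """Invert the forward-link table: target -> set of notes linking to it."""
    back = defaultdict(set)
    for source, targets in forward.items():
        for target in targets:
            back[target].add(source)
    return back


def two_hop(note, forward, back):
    """Notes two steps away from `note`, ignoring link direction."""
    first = forward.get(note, set()) | back.get(note, set())
    second = set()
    for neighbour in first:
        second |= forward.get(neighbour, set()) | back.get(neighbour, set())
    # Drop the note itself and everything already directly linked.
    return second - first - {note}
```

With notes a, b, c and d, where a and d both link to b and b links to c, the 2-hop links of a are c (via b's forward link) and d (via b's backlink).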
What is your experience with both?
I am thinking that backlinks are not as useful as many of us thought they would be initially. But 2-hop links may deliver what backlinks once “promised”. I am curious about your real experience as a sociologist (I suspect you mainly do qualitative research, which I am also interested in, rather than quantitative).
For sequential links, I have also started to find them useful as the number of notes has grown. They help me keep track of different trains of thought: I can pick up my thought where I left it a few months before. Do you have a similar experience?
Honestly, I just use org-roam as a file system for all my org files. I end up creating one major node and then writing almost all the information relating to its subtopics in a linear, sequential manner. For me, sequence encodes information that is lost when writing small nodes that briefly talk about a topic and then link to others. Printing all this sequential information as a single file is also a hassle, and that matters to me. Links do not encode the same information for me. But your zettel sequencing may be exactly what I need to move to a one-small-node-per-file paradigm, coupled with native project support for printing, something that mirrors org-transclusion.
Definitely very alluring features, but they will require me to migrate quite a bit. It would definitely be worthwhile to hear about your experience in using zettel to encode information about sequences in this way.
In short, for me both are more useful than backlinks to retrieve the “known-knowns”, “known-unknowns” and the “unknown-unknowns”. Sequential notes especially are useful to create trains of thought that mix facts and ideas.
I recently finished reading Doto 2024 - A System for Writing, which is a good addition to Ahrens 2017 - How to take smart notes. If I want to see how the book and its ideas have entered my notes, I use backlinks:
If, however, I want to learn about a specific topic, I now almost always use show-multiverse, which combines a view of the sequence and related notes (and was an idea suggested to me by a user):
The drawback is that in order for sequences to be useful all notes require Zettel No values. It is a telling sign that I have added these numbers to almost all of my 4000+ old notes.
On backlinks I agree, see above. I am a political scientist working mostly on local politics in rural China. Orgrr is the QDA software I use to analyse primary sources (and secondary sources, of course). To give you one example, I constantly monitor Chinese newspapers for specific terms to be informed about developments on the ground. This now completely takes place in Emacs: Elfeed > cuckoo-search > website2org > orgrr.
I created the compile-sequence function for exactly this, see below. There is also an option to remove the level 1 headings linking to the notes. So one could use this to write a paper or book.
Thank you for sharing the detail; very insightful! QDA = Qualitative Data Analysis, I believe.
I want to read it too… Didn’t know about it.
I have taken a different approach for implementation. My observation is that notes in the sequential relationships form an ordered rooted tree (a mathematical notion borrowed from Jens Getreu's blog, Set up a Zettelkasten with Tp-Note; it’s stricter than file systems because a node=file can only belong to one parent, so there are no symbolic links, which would allow a node=file to belong to more than one parent). With this, I went ahead and implemented my own hierarchical representation using the built-in hierarchy library and custom header data. I wanted to work with plaintext .txt files.
I figured that the “Zettel number” was not essential but the structure is (an ordered rooted tree). I am not sure if it saved any effort (you had to add Zettel numbers; I needed to add the note’s ID directly in the header=meta data section), but at least I didn’t need to deal with generating a number in a certain format.
I have come to differentiate the following three types of links (My libraries in the brackets).
Dictionary (Ten=glossary.el)
Line of thinking (Tei)
Association (Ren, Roku, (Org-roam))
I have decided that, at least to me, they are distinct types and do not overlap, conceptually and functionally. And for each type, I have implemented my own, small program.
This is a screenshot of Ten, showing the tree on the left and one of the notes on the right.
This is something I want to look into in the next step… It has never occurred to me to display both a line of thinking (#2) and associations (#3) in the same buffer. Thank you for the inspiration.
This is a fascinating system, thanks for all the details! Looking at the metadata of the note on the right hand side: what part of this decides on the position of a note?
I also separate different data points. Interviews and newspaper articles go in their own directory (=container in orgrr parlance). A first analysis (loosely based on Grounded Theory) takes place without necessarily referencing ideas from secondary literature. I have designed all functions that deal with relationships to be either limited to the current folder or to search across all active orgrr folders.
I am thinking about a function to manually stack notes. Doto has written a lot about this. I also was pretty jealous about the simplicity of Ryan Holiday’s notecard system.
It’s the arrow => followed by an ID (timestamp in my case). Let me explain.
For this little system of mine (Tei 綴, in the Japanese pronunciation; I am adding this because you’d know the original Chinese way), the position is simply determined by the parent knowing what notes are its children. A root does not have a parent; it is the “theme” or the beginning of a certain train of thought. It’s a simple rooted ordered tree: root -> children -> grandchildren -> .... Children do not know their parent (singular, because a note can have only one parent). But as the tree knows the entire structure, I can traverse a theme branch and navigate from a child to its parent and vice versa. See this example.
The current note “What are we liking?” is under the root named “Note-taking” and has four child notes (each => followed by an ID. You see the file names but I have improved this aspect so that I only need the ID now). Each child can have 0, 1, or more notes.
To place a new note in an existing part of the tree, I simply write => followed by its ID in the parent manually, or create the note from within the parent with the C-u tei-new-note command.
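The parent-knows-its-children idea can be sketched in a few lines of Python. This is only an illustration, not the Tei implementation (which is elisp on top of hierarchy.el). It assumes each child is declared on its own line as `=>` followed by an ID, and the function names and the short IDs in the usage example are made up.

```python
import re

# A child declaration: a line starting with "=>" followed by the child's ID.
CHILD_RE = re.compile(r"^=>[ \t]*(\S+)", re.MULTILINE)


def build_tree(notes):
    """notes: dict of note ID -> note text.

    Returns (children, parent): the ordered child list of every note and
    the derived child -> parent table. Raises ValueError if a note is
    claimed by two parents, since that would make the structure a graph
    rather than a rooted tree.
    """
    children = {}
    parent = {}
    for note_id, text in notes.items():
        kids = CHILD_RE.findall(text)  # order in the file = order in the tree
        children[note_id] = kids
        for kid in kids:
            if kid in parent:
                raise ValueError(
                    f"{kid} has two parents: {parent[kid]} and {note_id}"
                )
            parent[kid] = note_id
    return children, parent


def path_from_root(note_id, parent):
    """Walk up the parent table and return the path root -> ... -> note_id."""
    path = [note_id]
    while path[-1] in parent:
        nxt = parent[path[-1]]
        if nxt in path:  # guard against accidental cycles
            raise ValueError(f"cycle detected at {nxt}")
        path.append(nxt)
    return list(reversed(path))


notes = {
    "root": "Note-taking\n=> n1\n=> n2\n",
    "n1": "What are we liking?\n=> n3\n",
    "n2": "",
    "n3": "",
}
children, parent = build_tree(notes)
print(path_from_root("n3", parent))
```

Although only parents store the relationship, the derived `parent` table is what allows navigating from a child back up to its root.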
I do not know yet whether hierarchy.el and my implementation scale to thousands of notes, but I am happy and will go further with this.
The CCP has simplified the character a bit (缀) but the meaning (“stitching together”) is probably the same. I love how, despite the differences in execution - your system is parent-based, mine children-based - the logic and the results are so similar. The limitations, if I understand correctly, are also the same: there is only one true path from the root down to children and grandchildren.
I think so, and I look at it as the positive property of the structure, not as a limitation. If I understand the math correctly with the aid of ChatGPT, in an ordered rooted tree, a child can belong to only one parent. If a child can belong to multiple parents, the structure will be a network (a graph). I wanted the structure so that there is only one path from a root to a given note (and back to the root).
This is what I have been experimenting with: a clear conceptual and functional separation among these three kinds of “links”:
Dictionary (definition-references, like Xref but in note-taking)
I agree with this differentiation into three different types of links (and notes, by extension). The difference here might be that I try to have all three in orgrr.
The thing orgrr may still be missing is something like org-transclusion - a manual synthesis of notes. But then again, why not just use that great project ;).