Is there a simple, scalable way to do full text search?

How do I do full-text search through org-roam files? Things I’ve looked into or tried:

  • org-roam manual suggests Deft, which I have not tried because it doesn’t scale well (according to what I read)
  • org-roam manual also suggests NotDeft. This looks complicated.
  • this thread looks very interesting but is above my head.
  • I used to use org-search-view, and it reliably reported the org entries in which the search string was found. But it works on org-agenda-files, and it seems like a bad idea to include all roam files as org-agenda-files.

I would think this is critical functionality. Pulling up a note based on its title and tags works some of the time, but what are people doing when they just want to search (possibly by regex) for a string among thousands of org-roam files? I guess I’m hoping that something this fundamental already exists as a MELPA package.

EDIT: Solved

I use ripgrep (the rg command) and consult-ripgrep.

Deft really does not scale well. For a while I used helm-org-rifle, which has a great UI - but also does not scale well. Now I am using deadgrep set to my org-roam directory, which is just awesome. I have about 1800 notes, including several long Chinese legal documents and it really flies.

This is one of the areas where being a plain text format truly shines. You can recursively search for any regex inside all the directory’s files using the grep tool.
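As a rough illustration of the point above, here is a minimal shell sketch of a recursive regex search restricted to .org files. The function name roam_grep and the default directory ~/org-roam are assumptions made for the example, not anything org-roam defines, and --include assumes a grep that supports it (GNU and BSD grep both do).

```shell
# Hypothetical helper: search every .org file under a notes directory for a regex.
# Usage: roam_grep REGEX [DIR]
roam_grep() {
    pattern=$1
    dir=${2:-$HOME/org-roam}   # assumed default location; adjust to your setup
    # -r recurse into subdirectories, -n show line numbers,
    # -i ignore case, -E treat the pattern as an extended regex
    grep -rniE --include='*.org' -e "$pattern" "$dir"
}
```

Calling `roam_grep 'contract|treaty' ~/notes` would print each matching line as file:line:text, which is the same information rgrep collects into its results buffer.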

In Emacs, I use two commands for this, both based on grep. The first is the rgrep command. It searches the roam directory recursively and collects every match of the requested regex in a new output buffer. You can then open each match and see it in context. The great thing is that, since this creates a new buffer, you can even keep the search results for later. The other tool I use is counsel-rg (as I am an Ivy user), which uses ripgrep (an excellent Rust rewrite of grep) as its backend and shows the output in the minibuffer. You then choose the match you want, and you can narrow the results down as it searches interactively while you type. I am pretty sure other completion frameworks have this as well.

I typically use rgrep when looking for multiple instances (or every instance) of a word inside the directory, as it gives me a buffer I can keep and come back to, and counsel-rg when I need a one-off search or want to build the regex as I go and watch the results show up.

consult-ripgrep shows the list of matches, but I can’t preview the content the way I can in the org-notes-search results list.
In other words:

  1. In consult-ripgrep, I can see the list of search results and open one of them, but then I lose the result list. If the content is not what I want, I have to search again and browse the results from the beginning.
  2. In org-notes-search, I can preview the content in the search results, then choose the right one to open.
    But org-notes-search only works for the org directory, which is a problem if the roam directory is somewhere else.

So I wonder what the better way is. Here are some ideas:

  1. Set the org-roam directory in the org directory
  2. Can we use org-notes-search for the org-roam?

By the way, rg is not as simple as the options above when searching for more than one keyword.
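For the multi-keyword case, one option is to fall back on plain grep. The sketch below lists the .org files that contain every keyword given (case-insensitive, matched as literal strings). The function name roam_grep_all is made up for illustration, and it runs one grep per keyword per file, so it favors simplicity over speed.

```shell
# Hypothetical helper: list .org files that contain ALL of the given keywords.
# Usage: roam_grep_all DIR KEYWORD...
roam_grep_all() {
    dir=$1
    shift
    # Walk every .org file under DIR (assumes file names contain no newlines).
    find "$dir" -type f -name '*.org' | while IFS= read -r f; do
        ok=1
        for kw in "$@"; do
            # -q quiet, -i ignore case, -F treat the keyword as a literal string
            grep -qiF -e "$kw" "$f" || { ok=0; break; }
        done
        # Print the file only if no keyword was missing.
        if [ "$ok" -eq 1 ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

For example, `roam_grep_all ~/notes contract law` prints only the notes that mention both words, in any order and on any line.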

I had overlooked helm, which is pretty easy to use: M-x "helm/project-search". Use Tab to preview.

Sorry I haven’t been keeping up with this group – yes, there’s a simple, scalable way to do full text search: https://github.com/zot/microfts

The elisp package automatically indexes your org-mode files as you visit them and updates the index when you save them.