A demo of AI for linking, writing, and thinking with org-roam. Should we build org-roam-ai?

lgmoneda · December 15, 2022, 3:12pm

For the last two weeks I’ve been tweaking a tool that recommends possible connections for a given node or text passage, and I wonder if we should start working on an “org-roam-ai” package to manipulate org-roam data using AI.

My pain point is that with the company package, I get auto-complete recommendations based on node titles, so I need to type the title of another node precisely. I’ve been creating more and more nodes with longer titles, which give more meaningful access to their content when looking at a graph but are harder to auto-complete.

To solve it, I’ve put together a very precarious implementation of semantic search (with the help of ChatGPT!). It is usable enough to let me test it and decide whether to invest in it. It shows the 20 most similar nodes plus a 2D representation that shows how they relate to each other, with color indicating the similarity ordering.
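For readers who want the gist of the ranking step, here is a minimal sketch. It is not the code from this post: it assumes each node has already been turned into a vector and ranks nodes by cosine similarity to the query vector. The names `cosine_similarity`, `most_similar_nodes`, and the toy 3-dimensional vectors are made up for illustration; in a real setup the vectors would come from an embedding model, such as one loaded through sentence-transformers.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def most_similar_nodes(query_vector, node_vectors, top_k=20):
    """Return the top_k (title, score) pairs, most similar first."""
    scored = [
        (title, cosine_similarity(query_vector, vector))
        for title, vector in node_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy "embeddings"; real ones have hundreds of dimensions.
node_vectors = {
    "WordPiece tokenization": [0.9, 0.1, 0.0],
    "Model performance monitoring": [0.1, 0.9, 0.1],
    "Gradient boosting": [0.0, 0.3, 0.9],
}
query_vector = [0.8, 0.2, 0.0]

ranking = most_similar_nodes(query_vector, node_vectors, top_k=2)
for title, score in ranking:
    print(f"{score:.3f}  {title}")
```

With real embeddings, the same function would be called with `top_k=20` to get a list like the one described above.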

I have been using it for three things:

To find a node when I remember the subject but not the title. Prompt: “a preprocessing step for a text that splits the words into tokens based in a vocabulary built by fragments occurrence” (I was indeed thinking of WordPiece).

To look through my notes for references to read, remember, or think about on a particular subject. Prompt: “The performance of my machine learning model is degrading”

To help me write/think/connect to other nodes. Start writing, select a sentence, and search:

[Screenshot: Screen Shot 2022-12-15 at 10.57.18]

I’ve been using open-source language models through the sentence-transformers library.

It aligns with the topic “AI generated node connections using semantic similarity estimation”.

I asked GPT-3 to write a list of features for an “org-roam-ai” package:

Automatically generate a graph of related documents based on natural language processing (NLP) analysis.
Suggest relevant documents and nodes to link to while writing in org-mode.
Automatically detect keywords and create links to other documents.
Cross-document/node search with full-text search capabilities.
Automatically tag documents and nodes with relevant topics.
Automatically suggest new topics for documents, nodes, and notes.
Automatically generate summary cards for documents and nodes.

They all look exciting, but I’m not sure they are possible with open-source models.

So I’d like to hear from the org-roam community: should we build org-roam-ai? I’m a Data Scientist, so my Software Engineering skills are limited, and I’m open to working with more experienced developers or simply helping design the interface with these models.

AuroraDraco · December 17, 2022, 10:35am

Hmm, this sounds interesting. I am very far from being able to help develop something like this, but I would definitely be happy to help with testing.

In some cases you do want to do all the work on your own, especially when first writing something, but I could see a use case for this in the retrieval of notes.

lgmoneda · December 17, 2022, 12:50pm

I see it more as grabbing your references before writing, while writing, or after writing the first version.

I’ve been using paragraphs from articles I wrote as input, and it is great that it recommends concepts I cite in the following 1-3 paragraphs, as well as others I didn’t cite but could have. It is more about augmenting my ability to put my references together than writing for me.

AuroraDraco · December 17, 2022, 1:05pm

Yeah, I see.

It’s definitely interesting as a concept and I would love to see a complete implementation of it in org-roam.

laotang · December 18, 2022, 10:26am

Sounds really interesting! Have you looked at org-similarity? So far, it only works on a per-document basis - but does so surprisingly well for my 2.5K+ notes.

AuroraDraco · December 18, 2022, 8:17pm

Oh, org-similarity looks interesting. I will definitely give it a try!

laotang · December 18, 2022, 8:44pm

I recently put this together to show the results of org-similarity in a side-window (like org-roam v1). It makes org-similarity even more useful for me (and now looks a tiny bit like DevonThink). Use at will.

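(defun lt/org-similarity-sidebuffer ()
  "Show the org-similarity results for the current file in a side-window."
  (interactive)
  (unless buffer-file-name
    (user-error "Current buffer is not visiting a file"))
  ;; Build the org-similarity command line; paths are expanded and
  ;; shell-quoted so that spaces and \"~\" do not break the call.
  (let* ((command (format "python3 %s -i %s -d %s -l %s -n %s %s"
                          (shell-quote-argument
                           (expand-file-name "assets/org-similarity.py"
                                             org-similarity-root))
                          (shell-quote-argument buffer-file-name)
                          (shell-quote-argument
                           (expand-file-name org-similarity-directory))
                          org-similarity-language
                          org-similarity-number-of-documents
                          (if org-similarity-show-scores "--score" "")))
         ;; Keep the output in a local binding instead of a global variable.
         (results (shell-command-to-string command)))
    (with-output-to-temp-buffer "*Similarity Results*"
      (princ results))
    ;; Switch the results buffer to org-mode so the links are clickable.
    (with-current-buffer "*Similarity Results*"
      (org-mode))))

;; Display the results buffer in a side-window on the right.
;; The buffer name is a regexp here, so the asterisks are escaped.
(add-to-list 'display-buffer-alist
             '("\\*Similarity Results\\*"
               (display-buffer-in-side-window)
               (inhibit-same-window . t)
               (side . right)
               (window-width . 0.33)))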

The screenshot shows the note “AI and the competitive struggle of nations” on the left and the org-similarity results for that buffer on the very right.

AuroraDraco · December 18, 2022, 8:51pm

I was actually just trying to get it to insert into a separate buffer, as I didn’t like the default behaviour of inserting into the current buffer, so you saved me like 30 mins of setting this up myself.

And since you seem to have played with this more, do you also have a workaround for the inserted links showing :PROPERTIES: (which is the first line of a typical org-roam file) or am I going to have to hack that one in myself?

laotang · December 18, 2022, 9:00pm

You’re welcome.

Sorry, no workaround. I am still on v1 and don’t use org-id.

AuroraDraco · December 18, 2022, 9:06pm

I see, no prob

lgmoneda · December 19, 2022, 7:00pm

I did not know about it! Thanks for pointing it out.
One difference is that I’m using a Large Language Model to represent the input and the nodes as vectors, which should provide better results than the org-similarity approach. The other is that I’m working at org-roam node granularity instead of whole documents. In theory, org-similarity could swap its current approach for an LLM-based one.
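To illustrate what node granularity means compared with whole documents, here is a rough sketch that splits the text of one org file into per-heading chunks, each of which could then be embedded separately. It is an assumption-laden simplification, not the tool’s code: it treats every heading as a node, whereas org-roam only treats files and headings that carry an ID property as nodes. The function name `split_into_heading_chunks` and the sample text are hypothetical.

```python
import re

# An org heading is one or more asterisks, whitespace, then the title.
HEADING = re.compile(r"^(\*+)\s+(.*)$")


def split_into_heading_chunks(org_text):
    """Return a list of (heading title, body text) pairs for one org file.

    Text before the first heading is ignored in this sketch.
    """
    chunks = []
    title, body = None, []
    for line in org_text.splitlines():
        match = HEADING.match(line)
        if match:
            # A new heading closes the previous chunk, if there was one.
            if title is not None:
                chunks.append((title, "\n".join(body).strip()))
            title, body = match.group(2), []
        elif title is not None:
            body.append(line)
    if title is not None:
        chunks.append((title, "\n".join(body).strip()))
    return chunks


sample = """#+title: Tokenization
* WordPiece
Splits words into sub-word units.
** Vocabulary
Built from fragment frequency.
* BPE
Merges frequent pairs.
"""

chunks = split_into_heading_chunks(sample)
for title, body in chunks:
    print(f"{title}: {body}")
```

A document-level tool would embed the whole file once; the chunked version yields one vector per heading, so a search can land on the specific section that matches.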

I want to keep expanding it on top of what org-roam offers in terms of data (nodes, links, tags), but I understand many people will find org-similarity covering most of their usage.

One thing we have in common: the org-similarity author and I are both Brazilian.

suliveevil · December 19, 2022, 7:31pm

Brilliant idea!

When I was using Obsidian, there were two plugins I liked most:

Find the paths between two notes:

Find similar notes:

Org-roam deserves a more powerful tool!

scotto · December 20, 2022, 5:05pm

I’m quite interested in this too! Besides looking at my notes, I’d love it if it looked at the stack of a thousand data science pdfs that I’ve collected over the years.

Relatedly, here’s ChatGPT in Emacs, writing elisp: ChatGPT in Emacs

misha · February 28, 2023, 12:39pm

This looks amazing. Although my coding skills are limited, I would love to help build this.

lgmoneda · April 11, 2023, 12:10pm

Here’s a blog post with directions to replicate it.

lgmoneda · April 15, 2023, 5:49pm

And now a post about Q&A with sources over org-roam notes.

laotang · May 3, 2023, 6:28pm

I’m currently writing a new function to better surface related notes for my minimalist org-roam v1 clone (see also here). The function collects all notes related to a given note to the second degree - the backlinks for the backlinks and the outgoing links mentioned by outgoing links. To use the image of a family, it considers all parents and grandparents as well as all children and grandchildren of a note. All links to a specific note are counted and the resulting list is ranked by frequency.
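The counting idea described above can be sketched roughly as follows. This is not the orgrr function itself, just a small illustration under the assumption that the link graph is available as a dictionary mapping each note to the notes it links to. The name `related_notes` and the example graph are made up.

```python
from collections import Counter, defaultdict


def related_notes(note, outgoing):
    """Rank notes within two link steps of `note` by how often they occur.

    `outgoing` maps a note title to the list of titles it links to.
    """
    # Invert the graph once to get backlinks (who links to whom).
    backlinks = defaultdict(list)
    for source, targets in outgoing.items():
        for target in targets:
            backlinks[target].append(source)

    children = outgoing.get(note, [])
    parents = backlinks.get(note, [])

    counts = Counter()
    counts.update(children)  # outgoing links
    counts.update(parents)   # backlinks
    for child in children:   # outgoing links of outgoing links
        counts.update(outgoing.get(child, []))
    for parent in parents:   # backlinks of backlinks
        counts.update(backlinks.get(parent, []))

    counts.pop(note, None)   # the note is not related to itself
    # Most frequent first; ties are broken alphabetically for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical link graph: A links to B and C, E links to A, and so on.
outgoing = {
    "A": ["B", "C"],
    "B": ["C", "D"],
    "C": ["D"],
    "E": ["A"],
    "F": ["E"],
}

ranking = related_notes("A", outgoing)
for title, count in ranking:
    print(count, title)
```

Here C and D rank highest because they are reached by more than one path from A, which is the kind of signal a plain backlink list does not show.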

The results are really interesting (at least for me). This is a note and its backlinks:

This is the same note with org-similarity in the side-window:

And this is the same note with orgrr-related-notes (the function teased above, not yet released):
