Linking multiple urls to an org roam file

Summary

Like org-roam’s tags/- tags ::, but for urls. Refer to tags vs - tags :: for more context.

Goal

The idea is to link multiple urls to multiple org-roam files.
These can then be searched for and opened in bulk.
org-roam-protocol-capture-open-ref can be used for searching. This means that if I have navigated to a page in the browser and execute javascript to note the url, I can figure out whether I have taken a note on that page before.

Current features

org-roam’s key property can reliably link a url to an org-roam file. Its current limitation is that it doesn’t work with multiple urls, nor does it allow the same key in more than one file. The latter is to be expected because, as I understand it, it was designed for citations, where uniqueness of keys is a necessary feature.

Implementation idea

Introduce another org-roam property specifically for urls. It would function like the org-roam tag property, but with font locking support, org-open-at-point and so on for urls. This will take some work to implement.

In parallel to that, leverage org-roam’s backlinks and introduce a drawer that holds file links to files that hold urls in their org-roam key property.

The latter workflow can be enhanced by functions that make it more convenient for the user:

  • org-roam-open-key-url
    opens the key property of a file. This should be a link (can check while opening).
  • org-roam-open-resources-url
    opens selected urls of resource files linked inside a file. If the resources are urls themselves, it opens those. The idea is to painlessly open multiple urls linked to a file that are hidden behind a file layer.
  • org-roam-open-resources-url-at-point
    same as the above, but directly opens the url of the file under point.
  • org-roam-insert-multiple-resources
    @459 introduces the concept. The idea is to make it convenient to place multiple urls for this workflow. Other tools can be used that transform urls into org-roam file links. The vision is to make this workflow as easy as selecting a couple of urls and running a function that transforms them into file links to files containing each of those urls as a key.

Discussion

This feature request contains my thought process on this feature.

The tradeoffs between the two approaches mirror the discussion of tags vs - tags ::.

The first approach has the added advantage that there won’t be stub files linking to urls, while the latter leverages already existing org-roam functionality.

As a meta comment, I would like to give org-roam an understanding that some of my links are in fact tags. This could be leveraged to add these semantics to the backlink buffer using org’s new drawer support for files. This means that the backlink buffer would show the name of the drawer in which it found the links.


I moved away from the original idea of creating files for each url and using #+roam-key, when I found that org-roam already parses and places all links, including urls in the database. This means that extending org-roam to support linking multiple urls to a file only requires a few other pieces.

I have a mock version of this implemented. Ended up calling it org-roam-protocol-open-url. Borrows code from the implementation of org-roam-find-file and other org roam utilities.

It looks over all the urls linked within all org roam files and checks if the url in question is one of them. No need for an explicit location within a file.
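
As a rough mental model of that lookup, here is a toy sketch in JavaScript. It is not the actual Emacs Lisp, which queries the org-roam database. It assumes, as the script's query suggests, that link targets are stored without the `http:`/`https:` scheme. The `linkIndex` data and the function names are made up for illustration.

```javascript
// Toy model of the lookup. The real script queries the org-roam links table;
// here the "database" is a hypothetical in-memory list of {from, to} rows.
const linkIndex = [
  { from: "notes.org", to: "//github.com/jeremy-compostella/org-msg" },
  { from: "email.org", to: "//github.com/jeremy-compostella/org-msg" },
  { from: "other.org", to: "//example.org/page" },
];

// Same idea as the script's s-replace-regexp call: drop "http:" or "https:".
function stripScheme(url) {
  return url.replace(/https?:/g, "");
}

// Return every file that contains a link to the given url.
function filesReferencing(url, index) {
  const target = stripScheme(url);
  return index.filter((row) => row.to === target).map((row) => row.from);
}

console.log(
  filesReferencing("https://github.com/jeremy-compostella/org-msg", linkIndex)
);
```

Because the match is on the link target alone, the url can sit anywhere in a file; no dedicated property or drawer is needed.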

I have some further utilities to create a blacklist of files that shouldn’t have their links (urls) extracted, which you can find on my repo.

explanation

running

 javascript:location.href =
    'org-protocol://roam-url?template=w&ref='
    + encodeURIComponent(location.href)
    + '&title='
    + encodeURIComponent(document.title);

checks if the url is referenced in any org roam file; if so, it lists those files for selection. If there is only one such file, it opens it directly. If there is none, it uses org-roam-protocol-open-ref to handle the url.

 javascript:location.href =
    'org-protocol://roam-url?template=w&ref='
    + encodeURIComponent(location.href)
    + '&title='
    + encodeURIComponent(document.title) + '&check=';

With &check= appended, the difference is that if a matching file isn’t found, it does nothing. This is a good way to figure out whether a file that references the url exists.
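
Since the two bookmarklets differ only in the trailing `check` parameter, both can be generated from one helper. This is a small sketch; `roamUrl` is a name made up here and is not part of the script.

```javascript
// Build the org-protocol url used by the bookmarklets above.
// `roamUrl` is a hypothetical helper; pass check = true for the check-only variant.
function roamUrl(href, title, check = false) {
  let url =
    "org-protocol://roam-url?template=w&ref=" +
    encodeURIComponent(href) +
    "&title=" +
    encodeURIComponent(title);
  if (check) {
    // Per the handler's docstring, only the presence of `check` matters,
    // not its value, so it is left empty.
    url += "&check=";
  }
  return url;
}

console.log(roamUrl("https://example.org/a b", "Example page"));
console.log(roamUrl("https://example.org/a b", "Example page", true));
```

In a browser, the equivalent of the first bookmarklet would be `location.href = roamUrl(location.href, document.title);`.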

my setup

My setup for Surfingkeys.

// capture the current page, or open the note(s) that already reference it
mapkey('cw', 'org protocol capture website', function() {
  location.href =
    'org-protocol://roam-url?template=w&ref='
    + encodeURIComponent(location.href)
    + '&title='
    + encodeURIComponent(document.title);
});

// only check whether the current page is already referenced
mapkey('cc', 'org protocol capture check website', function() {
  location.href =
    'org-protocol://roam-url?template=w&ref='
    + encodeURIComponent(location.href)
    + '&title='
    + encodeURIComponent(document.title) + '&check=';
});

installation

script

quick setup

Place the following in a file that is run when Emacs starts up.

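;;; Code:
(require 'cl-lib)   ; cl-defun
(require 'subr-x)   ; if-let, string-join
(require 'seq)      ; seq-sort-by
(require 's)        ; s-replace-regexp, s-trim, s-downcase, ...
(require 'org-protocol)
(require 'org-roam-protocol)
(require 'org-roam)
;;;; Functions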

(defun org-roam--get-url-title-path-completions (url)
  "Return an alist of completions for files that link to URL.
The car is the displayed title for completion, and the cdr is a
plist holding the path to the file and its title."
  (let* ((url-path (s-replace-regexp "http[s]?:"  "" url))
         (rows (org-roam-db-query  [:select [files:file titles:title tags:tags files:meta] :from titles
                                 :left :join tags
                                 :on (= titles:file tags:file)
                                 :left :join files
                                 :on (= titles:file files:file)
                                 :left :join links
                                 :on (= files:file links:from)
                                 :where (= links:to $s1)
                                ] url-path))
         completions)
    (setq rows (seq-sort-by (lambda (x)
                              (plist-get (nth 3 x) :mtime))
                            #'time-less-p
                            rows))
    (dolist (row rows completions)
      (pcase-let ((`(,file-path ,title ,tags) row))
        (let ((k (org-roam--prepend-tag-string title tags))
              (v (list :path file-path :title title)))
          (push (cons k v) completions))))))


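(defun org-roam--prepend-url-place (props title tags)
  "Return a multi-line candidate string for a link location.
PROPS is the link's property plist (:point, :outline, :content);
TITLE and TAGS describe the file that contains the link."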
  (concat (org-roam--prepend-tag-string title tags) " :" (number-to-string (plist-get props :point)) ":"
                         "\n"
                         "* "
                         (if-let ((outline (plist-get props :outline)))
                             (string-join outline " > ")
                           "Top")
                         "\n"
                          "=> " (s-trim (s-replace "\n" " "
                                             (plist-get props :content)))
                         "\n\n"
                         ))

(defun org-roam--get-url-place-title-path-completions (url)
  "Return an alist of completions for each place that links to URL.
The car is the displayed candidate for completion, and the cdr is
a plist holding the path to the file, its title and the link's point."
  (let* ((url-path (s-replace-regexp "http[s]?:"  "" url))
         (rows (org-roam-db-query  [:select [links:properties files:file titles:title tags:tags files:meta] :from links
                                  :left :join titles
                                 :on (= links:from titles:file)
                                 :left :join tags
                                 :on (= titles:file tags:file)
                                 :left :join files
                                 :on (= titles:file files:file)
                                 :where (= links:to $s1)
                                 :order-by (asc links:from)
                                ] url-path))
         completions)
    ;; sort by point in file
    (setq rows (seq-sort-by (lambda (x)
                              (plist-get (nth 0 x) :point))
                            #'<
                            rows))
    ;; then by file modification time; the sort is stable, so links
    ;; within one file stay ordered by point
    (setq rows (seq-sort-by (lambda (x)
                              (plist-get (nth 4 x) :mtime))
                            #'time-less-p
                            rows))
    (dolist (row rows completions)
      (pcase-let ((`(,props ,file-path ,title ,tags) row))
        (let ((k (org-roam--prepend-url-place props title tags))
              (v (list :path file-path :title title :point (plist-get props :point))))
          (push (cons k v) completions))))
    ))

(cl-defun org-roam-completion--completing-read-url (prompt choices &key
                                                       require-match initial-input
                                                       action)
  "Present a PROMPT with CHOICES and optional INITIAL-INPUT.
If REQUIRE-MATCH is t, the user must select one of the CHOICES.
Return user choice."
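  ;; NOTE: this overrides `org-roam-completion-system' globally, so the
  ;; helm branch below is always the one used.
  (setq org-roam-completion-system 'helm)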
  (let (res)
    (setq res
          (cond
           ((eq org-roam-completion-system 'ido)
            (let ((candidates (mapcar #'car choices)))
              (ido-completing-read prompt candidates nil require-match initial-input)))
           ((eq org-roam-completion-system 'default)
            (completing-read prompt choices nil require-match initial-input))
           ((eq org-roam-completion-system 'ivy)
            (if (fboundp 'ivy-read)
                (ivy-read prompt choices
                          :initial-input initial-input
                          :require-match require-match
                          :action (prog1 action
                                    (setq action nil))
                          :caller 'org-roam--completing-read)
              (user-error "Please install ivy from \
https://github.com/abo-abo/swiper")))
           ((eq org-roam-completion-system 'helm)
            (unless (and (fboundp 'helm)
                         (fboundp 'helm-make-source))
              (user-error "Please install helm from \
https://github.com/emacs-helm/helm"))
            (let ((source (helm-make-source prompt 'helm-source-sync
                            :candidates (mapcar #'car choices)
                            :multiline t
                            :filtered-candidate-transformer
                            (and (not require-match)
                                 #'org-roam-completion--helm-candidate-transformer)))
                  (buf (concat "*org-roam "
                               (s-downcase (s-chop-suffix ":" (s-trim prompt)))
                               "*")))
              (or (helm :sources source
                        :action (if action
                                    (prog1 action
                                      (setq action nil))
                                  #'identity)
                        :prompt prompt
                        :input initial-input
                        :buffer buf)
                  (keyboard-quit))))))
    (if action
        (funcall action res)
      res)))

(defun org-roam-find-file-url (&optional initial-prompt completions filter-fn no-confirm setup-fn)
  "Find and open an Org-roam file.
  INITIAL-PROMPT is the initial title prompt.
  COMPLETIONS is a list of completions to be used instead of
  `org-roam--get-title-path-completions`.
  FILTER-FN is the name of a function to apply on the candidates
  which takes as its argument an alist of path-completions.  See
  `org-roam--get-title-path-completions' for details.
  If NO-CONFIRM, assume that the user does not want to modify the initial prompt.
  SETUP-FN, if non-nil, is called with no arguments just before prompting."
  (interactive)
  (unless org-roam-mode (org-roam-mode))
  (let* ((completions (funcall (or filter-fn #'identity)
                               completions))
         (title-with-tags (cl-case (length completions)
                            (0 nil)
                            (1 (caar completions))
                            (t (if no-confirm
                                   initial-prompt
                                 (when setup-fn (funcall setup-fn))
                                 (org-roam-completion--completing-read-url
                                  "File: " completions
                                  :initial-input initial-prompt)))))
         (res (cdr (assoc title-with-tags completions)))
         (file-path  (plist-get res :path))
         (point  (plist-get res :point)))
    (if file-path
        (progn (find-file file-path) (goto-char point) '(t))
      nil)
    ))
(defun org-roam-protocol-open-url (info)
  "Process an org-protocol://roam-url?ref= style url with INFO.
It checks, opens, searches or creates a note with the given ref.
When check is present in the url, no matter what it is set to,
only check whether a file referencing the url exists; if none
does, do not open or create anything.

    javascript:location.href = \\='org-protocol://roam-url?template=r&ref=\\=' + \\
          encodeURIComponent(location.href) + \\='&title=\\=' + \\
          encodeURIComponent(document.title) + \\='&body=\\=' + \\
          encodeURIComponent(window.getSelection()) + \\='&check=\\='"
  ;; Bind locally instead of leaking globals through `setq'.
  (let* ((ref (plist-get info :ref))
         (check (plist-get info :check))
         (opened-file
          (org-roam-find-file-url
           nil
           (org-roam--get-url-place-title-path-completions ref)
           nil nil
           ;; Bring Emacs to the front before prompting.
           (lambda ()
             (x-focus-frame nil)
             (raise-frame)
             (select-frame-set-input-focus (selected-frame))))))
    (unless (or check opened-file)
      (org-roam-protocol-open-ref info))))

(push '("org-roam-url"  :protocol "roam-url"   :function org-roam-protocol-open-url)
      org-protocol-protocol-alist)
(provide 'org-roam-protocol-url)

repo

You can also get it from my repo.
It is on the my-latest branch of natask/org-roam, or just the multi-url branch.

Then set it up with MELPA/Quelpa.

(org-roam
 :location (recipe :fetcher github
                   :repo "natask/org-roam"
                   :branch "my-latest" ; or "multi-url"
                   :files ("*.el")))

configuration

Then place the following alongside your normal org-roam initialization.

(use-package org-roam-protocol-url
  :after org-protocol)

extensions

I would like to write something that promotes (maybe even demotes) urls to their own file with a key. It is not a high priority for me though.

I have already done some work on opening the url key of a file.

It is also possible to extend this to other types of links.


My main use case for linking multiple urls to a file is that I have a habit of reading information from multiple sites and then curating a summary into a single file. I then keep the urls I curated from inside that file as links. When I stumble onto such a site in the future, I would like to quickly know whether I have curated from it.

As a practical example, I ended up stumbling on https://github.com/jeremy-compostella/org-msg and wanting to take notes on it. I checked if the url is within my org directory through this script, and it turns out that I had mentioned it in one of my files as a link some time ago. Without grepping my whole directory, I wouldn’t have been able to stumble into that info. Goes to show that it is adding positive value to my workflow.


@savnk,
This practical use case sounds useful.
If I may ask some questions to understand it better…
What does the note with multiple URLs look like with the enhancement implementation that you did?
How does it present the multiple URLs to you so that you see you have a curated note including the org-msg URL in the past, compared to normal Org-roam?

@nobiot
The note should contain links, either in org style or bare.
For example, if I have the following file,

#+title: notes
I think [[https://org-roam.discourse.group/t/linking-multiple-urls-to-an-org-roam-file/777][Linking multiple urls to an org roam file - Development - Org-roam]] is interesting.

it may have something to do with https://org-roam.discourse.group/t/how-to-use-roam-tags-and-or-tags/190.

If I run org-roam-protocol-open-url on How to use ROAM_TAGS and/or tags? or Linking multiple urls to an org roam file, the note and the link’s position in the note will be listed as one of the options. If there is only one such location in the whole org-roam directory, the file containing it is opened at that location.
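
Together with the `check` parameter of the bookmarklet, this behaviour boils down to a small decision table. The sketch below restates it in JavaScript purely for illustration. The function name and the action labels are invented here; the actual logic lives in the Emacs Lisp script.

```javascript
// Decide what to do with the locations found for a url.
// `matches` is a list of {file, point} locations; `check` mirrors the
// bookmarklet's check parameter. Names and return shape are hypothetical.
function decideAction(matches, check) {
  if (matches.length === 1) {
    // A single location: open the file there directly.
    return { action: "open", target: matches[0] };
  }
  if (matches.length > 1) {
    // Several locations: let the user pick one.
    return { action: "select", candidates: matches };
  }
  // No match: the check variant does nothing, otherwise fall back to
  // the regular org-roam-protocol-open-ref handling.
  return { action: check ? "nothing" : "open-ref" };
}

const hit = { file: "notes.org", point: 42 };
console.log(decideAction([hit], false).action);
console.log(decideAction([hit, { file: "other.org", point: 7 }], false).action);
console.log(decideAction([], true).action);
console.log(decideAction([], false).action);
```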

The number one thing on my implementation list is url matching. For some websites, I would like to see child paths as well as ones that match the parent. Sometimes the opposite is also useful.
For example, this website assigns a child path to each comment, i.e. 5th comment or 2nd comment.
The current implementation doesn’t match this url with its 5th comment, nor vice versa.
roam-keys also suffers from this issue.
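
One possible shape for that matching, sketched in JavaScript. It assumes that "child" means the same host with a path that extends the parent's path segment by segment. None of this is implemented in the script; `pathSegments`, `isChildUrl` and `isRelatedUrl` are hypothetical names.

```javascript
// Sketch only: parent/child url matching is not implemented in the script.

// "/t/some-topic/777/5" -> ["t", "some-topic", "777", "5"]
function pathSegments(url) {
  return new URL(url).pathname.split("/").filter(Boolean);
}

// True when `child` is on the same host as `parent` and its path extends
// the parent's path by at least one whole segment.
function isChildUrl(parent, child) {
  if (new URL(parent).host !== new URL(child).host) return false;
  const p = pathSegments(parent);
  const c = pathSegments(child);
  return c.length > p.length && p.every((seg, i) => seg === c[i]);
}

// Match exactly, or in either parent/child direction.
function isRelatedUrl(a, b) {
  return a === b || isChildUrl(a, b) || isChildUrl(b, a);
}

const topic =
  "https://org-roam.discourse.group/t/linking-multiple-urls-to-an-org-roam-file/777";
console.log(isChildUrl(topic, topic + "/5"));   // is the 5th comment a child of the topic?
console.log(isRelatedUrl(topic + "/5", topic)); // does it also match in the other direction?
```

Comparing whole segments rather than raw string prefixes keeps a path ending in `/777` from matching one ending in `/7770`.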

With commit #1140, which shows urls in the backlink buffer, it has really become tags/- tags ::.

Oh, cool!

So… Would I be correct if I said that it’s a plug-in to org-roam-protocol that adds functionality to:

  • automatically look up the Org-roam database to
  • check if you already have notes referencing the URL
  • when you use a bookmarklet to copy the reference/URL to an Org-roam note?

Yes. Think of it as an org-roam backlink buffer, but for urls, activated through org-roam-protocol.


I have cleaned it up and placed it in another repo, named org-roam-url.

Installation for Spacemacs (MELPA): place the following in your init file.

(org-roam-url
 :location (recipe :fetcher github
                   :repo "natask/org-roam-url"
                   :files ("*.el")))