Org-roam-dailies search files and export headings with specified tag

I’ve been using org-roam-dailies to keep track of things I do on a daily basis. I work as a researcher and I also teach courses at a university. Every couple of years the board asks for a report on the things I’ve done over the last period, which include supervising students, participating in and/or organizing congresses and workshops, serving as a jury member for PhD theses, reviewing articles, writing my own research articles, teaching courses, etc. But this is just a part of what I write in my org-roam-dailies, as I also write other, more personal stuff in there.

The headings related to things I need to report later to the university board are tagged with the “:report:” tag. Now I would like to know if there’s a way (built-in or otherwise) to automatically run a search over a given period of time in my org-roam-dailies “journal” for all headings (and their contents) tagged “:report:”, so that all that material is copied to an org file, sorted chronologically as it appears in my org-roam-dailies files.

This would be a really useful feature as I would have the report almost already written for me (and by me!) based on the entries I’ve already written in my journal.

I would very much appreciate any help and/or pointers on this matter.

Thanks in advance for all your help.

You could use org-roam-ql (I am the author of the package). If that is not something you’d be interested in, you should be able to use something like ripgrep or deft to filter the files/headings you are interested in.

Thanks @ahmed-shariff for your quick answer and for what seems to be a very useful package. Can you give me a hint as to how I could achieve that with your package?

Thanks again for your help.

Ok, I installed your package and managed to launch a search for “(tags (“report”))”, but I get an empty buffer. Could it be that my org-roam-dailies files are not searched? How can I tell org-roam-ql which files to search?

Try using (tags "report").

If it’s a file org-roam sees, you should be able to see it with org-roam-ql.

Thanks, but I got the same empty list of results. I tried launching a search for title (for an existing title) and that works ok. Just to clarify, would this be able to find tags entered at headings with the syntax :tag: ?

Thanks again for your help and patience.

That is odd. Are you installing from melpa or melpa-stable?

Yes, anything with that syntax at the end of the heading would be considered a tag.

Uh, I’m not quite sure how to check that. M-x list-packages gives me this information:

Package org-roam-ql is available.

     Status: Available from melpa -- Install
    Archive: melpa
    Version: 20240501.1929
     Commit: 55344b185cdaa79f4a36d04f67aa5f6a67b94fc4
    Summary: Interface to query and view results from org-roam
   Requires: emacs-28, org-roam-2.2.0, s-1.12.0, magit-section-3.3.0, transient-0.4, org-super-agenda-1.2
    Website: https://github.com/ahmed-shariff/org-roam-ql
 Maintainer: Shariff AM Faleel
     Author: Shariff AM Faleel

Inspired by org-ql, this package provides an interface to easily
query and view the results from an org-roam database.

Just to check: I’m using org-roam-ql-search, that is correct, right?

That is being installed from melpa.

Yes, you would be using org-roam-ql-search

Ok I think I have a hint from this experiment I just made.

If I’m on an org-roam-dailies file in which I added a filetag with #+filetags: :thistag:, this is actually found by org-roam-ql-search with a query given by (tags "thistag").

BUT if a heading on that same org-roam-dailies file ends in :anothertag:, then this is NOT found by an org-roam-ql-search query with (tags "anothertag").

So I’m guessing this is not the right tool for the job… or am I using it wrong? This is not criticism, the tool seems pretty useful and impressive, I’m just trying to understand whether it’s me using it wrong.

Thanks again!

Ah, I think I know what the issue is. The heading on which you have the tag is not an org-roam node, i.e., only headings/files which have the “ID” property are considered nodes in org-roam. If you are trying to filter based on headings that are not nodes, the existing predicates in org-roam-ql won’t help. That being said, you can write one easily if you want - I have an example here: add query option for full-text search · Issue #6 · ahmed-shariff/org-roam-ql · GitHub. This uses ripgrep to search the body of the text. You could use a pattern like \\*+ .* :report: with that example to get the results you are expecting.
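To check whether this is what is happening in a given daily file, something like the following sketch could help. It lists the headings tagged :report: in the current buffer that have no ID property, which are the ones org-roam would not treat as nodes. It relies on Org's own `org-map-entries` rather than on org-roam-ql, and `my/report-headings-without-id` is a made-up name for illustration.

```elisp
(require 'org)

;; Hypothetical helper, not part of org-roam or org-roam-ql.
(defun my/report-headings-without-id ()
  "Return titles of headings tagged :report: in this buffer that lack an ID."
  (let ((org-use-tag-inheritance nil)) ; only tags set on the heading itself
    (delq nil
          (org-map-entries
           (lambda ()
             ;; Keep the bare heading text when there is no ID property.
             (unless (org-entry-get nil "ID")
               (org-get-heading t t t t)))
           "report"))))
```

Calling it with `M-: (my/report-headings-without-id)` in a daily file shows which tagged headings a node-based query cannot see.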

Beyond that, there are also tools like org-ql, which is excellent, but may suffer from performance issues depending on your setup. As I mentioned before, tools like ripgrep and deft are also well suited for this.
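For the ripgrep route, a minimal sketch could look like this. It assumes `rg` is on your PATH and Emacs 28 or later (for `process-lines-ignore-status`). The names `my/rg-report-args` and `my/rg-report-headings` are hypothetical, and the pattern is a slightly looser variant of the one suggested in this thread, so that headings carrying several tags such as :teaching:report: also match.

```elisp
;; Hypothetical helpers; only the ripgrep flags themselves are real.
(defun my/rg-report-args (dir)
  "Return the ripgrep argument list for finding :report: headings under DIR."
  (list "--line-number"      ; prefix each match with its line number
        "--no-heading"       ; one FILE:LINE:TEXT record per match
        "--sort" "path"      ; date-named dailies then come out in order
        "--glob" "*.org"     ; only look at Org files
        "-e" "^\\*+ .*:report:"
        (expand-file-name dir)))

(defun my/rg-report-headings (dir)
  "Return ripgrep's FILE:LINE:HEADING matches for :report: headings in DIR."
  ;; rg exits with status 1 when nothing matches, hence `-ignore-status'.
  (apply #'process-lines-ignore-status "rg" (my/rg-report-args dir)))
```

This only locates the headings. Copying their contents into a report file still needs Org-aware code.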

Well, that explains the behavior that I observe.

Thanks a lot for your help. I’ll try what you propose.

And thanks again for this package, it is certainly very useful.

;; Code

;; Section 0 // Helper Functions
(defun custom/org-get-entry-headline ()
  "Return the text of the current entry, excluding any child headings.
Return \"\" if the entry has no body text."

  ;; This could be simplified with higher-level Org functions, but it is
  ;; written from first principles to keep full control over the process.

  (let* ((current-element (org-element-at-point))
	 (entry-text (save-excursion
			 (if (eq (org-element-type current-element) 'headline)
			     (goto-char (org-element-property :begin current-element))
			   (outline-previous-heading))
			 (forward-line 1)
			 (if (and
			      (eq (org-element-type (org-element-at-point)) 'headline)
			      (not (looking-at-p "^$")))
			     ""
			   (progn
			     (let ((start (point)))
			       (outline-next-visible-heading 1)
			       (when (outline-on-heading-p t)
				 (backward-char))

			       ;; `buffer-substring-no-properties' strips only Emacs text
			       ;; properties (faces etc.), not the Org :PROPERTIES: drawer,
			       ;; so a drawer directly under the heading (and its :ID:) is
			       ;; still part of the returned text.  Removing it to avoid
			       ;; ID clashes is left for future work.
			       (buffer-substring-no-properties start (point))))))))
    entry-text))



(defun custom/within-date-range-p (date begin-date end-date)
  "Return non-nil if DATE is within BEGIN-DATE and END-DATE, inclusive.
All dates are \"YYYY-MM-DD\" strings, which sort chronologically when
compared as strings.  A nil BEGIN-DATE or END-DATE leaves that side of
the range open."
  (and
   ;; Lower bound (inclusive), skipped when BEGIN-DATE is nil.
   (or (not begin-date) (not (string< date begin-date)))
   ;; Upper bound (inclusive), skipped when END-DATE is nil.
   (or (not end-date) (not (string< end-date date)))))



;; test the filter function during debug
;; (custom/within-date-range-p "2024-01-03"
;; 			    "2024-01-01"
;; 			    "2024-01-04")



;; Section 1 // Main Function(s)
(defun org-extract-reports-from-journal (journal-dir reports-file &optional subtree begin-date end-date)
  "Extract reports from org files in JOURNAL-DIR and save them in REPORTS-FILE.
Optionally extract the SUBTREE with :report: tag.
Filter files based on the date range specified by BEGIN-DATE and END-DATE."
  (interactive "DDirectory containing journal files: \nFReports file to save: \nP")

  ;; Create the reports file with title if it doesn't exist
  (unless (file-exists-p reports-file)
    (with-temp-file reports-file
      (insert "#+title: Reports\n\n")))

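  ;; Skip dotfiles and files beginning with # (autosaves).  The pattern is
  ;; anchored at both ends because it is matched against the bare file name;
  ;; unanchored, it would also match names such as ".hidden.org".
  (let ((file-pattern "\\`[^.#].*\\.org\\'"))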
    (dolist (file (directory-files-recursively journal-dir file-pattern))
      (let ((file-date (file-name-base file)))

	;; optional date filtering logic:
	;; process every file when no dates are given, otherwise only files
	;; whose base name (a date) falls within the range;
	;; also make sure the file is actually readable
	(when (and 
	       (or (and (not begin-date) (not end-date))
		   (custom/within-date-range-p file-date begin-date end-date))
	       (file-readable-p file))

	  ;; file process logic
	  ;; search for headlines with tag :report: and get various information about it
	  ;; while determining the text-body determine if optional SUBTREE is t
	  (with-current-buffer (find-file-noselect file)
            (goto-char (point-min))
            (while (re-search-forward "^\\*+\\s-+.*:report:" nil t)
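              ;; `org-get-title' returns nil for a file without #+title, which
              ;; would make `insert' fail below, so fall back to the file name.
              (let* ((file-title (or (org-get-title) file-date))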
                     (headline-text (org-get-heading t t t t))
                     (tags (org-get-tags))
                     (body-text
		      ;; if SUBTREE is non-nil use `org-get-entry', otherwise use our
		      ;; custom function `custom/org-get-entry-headline', which only
		      ;; retrieves the entry's own text, without child headings
                      (if subtree
                          (org-get-entry)
                        (custom/org-get-entry-headline))))

		;; optional SUBTREE formatting logic
		;; Remove :PROPERTIES: drawers -- a coarse implementation whose only
		;; purpose is to drop :ID: properties so that duplicated IDs do not
		;; clash; requires future work.  To keep the drawers, remove the
		;; first of the two `setq' forms below.  The match is non-greedy so
		;; that each drawer is removed on its own and the text between two
		;; drawers is left intact.  `custom/org-get-entry-headline' does not
		;; remove drawers.
		(when subtree
                  (setq body-text (replace-regexp-in-string ":PROPERTIES:[^\000]*?:END:" "" body-text))
                  (setq body-text (replace-regexp-in-string "\\`\\s-+\\|\\s-+\\'" "" body-text))
                  (with-temp-buffer
                    (org-mode)
                    (insert body-text)
                    (goto-char (point-min))
                    (when (outline-next-heading)
		      ;; demote until the heading is at level 3 or deeper,
		      ;; since levels 1 and 2 are used for the title (the date here)
		      ;; and for the headline being processed
                      (while (< (org-current-level) 3) (org-demote-subtree)))
                    (setq body-text (buffer-string))))

		;; output file formatting logic
		(with-current-buffer (find-file-noselect reports-file)
                  (goto-char (point-max))
                  (insert (make-string 1 ?*) " " file-title "\n")
                  (insert (make-string 2 ?*) " " headline-text)
                  (when tags
                    (insert " " (org-make-tag-string tags))
                    (org-align-tags))
                  (when body-text
                    (insert
                     (concat "\n" (when subtree "\n") body-text)))
                  (insert "\n\n"))))

	    ;; clean up - kill the FILE buffer being processed as part of the loop
            (kill-buffer))))))

  ;; clean up - save & kill the REPORTS-FILE buffer
  (with-current-buffer (find-file-noselect reports-file)
    (save-buffer)
    (kill-buffer))

  (message "Reports extraction completed."))
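
The interactive form of `org-extract-reports-from-journal` prompts only for the directory, the reports file and the prefix argument, so BEGIN-DATE and END-DATE can only be supplied when calling it from Lisp. A possible wrapper that also prompts for the period is sketched below. Everything prefixed with `my/` is hypothetical, the two default paths are placeholders to adapt, and the date check assumes the daily files are named with zero-padded YYYY-MM-DD dates, since the filter compares dates as plain strings.

```elisp
(require 'subr-x) ; `string-trim' and `string-empty-p' on older Emacs versions

(defvar my/dailies-directory "~/org-roam/daily/"
  "Hypothetical location of the org-roam-dailies files.")

(defvar my/reports-file "~/org-roam/reports.org"
  "Hypothetical file that collects the extracted entries.")

(defun my/normalize-date (string)
  "Return STRING if it is a zero-padded YYYY-MM-DD date, nil if it is blank.
Signal a `user-error' for anything else."
  (cond
   ((or (null string) (string-empty-p (string-trim string))) nil)
   ((string-match-p "\\`[0-9]\\{4\\}-[0-9]\\{2\\}-[0-9]\\{2\\}\\'" string)
    string)
   (t (user-error "Not a YYYY-MM-DD date: %s" string))))

(defun my/extract-reports-for-period (begin-date end-date &optional subtree)
  "Extract :report: entries dated between BEGIN-DATE and END-DATE.
Leave a date blank to keep that side of the range open.  With a prefix
argument, copy whole subtrees (SUBTREE non-nil)."
  (interactive
   (list (read-string "Begin date (YYYY-MM-DD, blank for none): ")
         (read-string "End date (YYYY-MM-DD, blank for none): ")
         current-prefix-arg))
  ;; Defined in the code above; resolved when the command is run.
  (org-extract-reports-from-journal
   (expand-file-name my/dailies-directory)
   (expand-file-name my/reports-file)
   subtree
   (my/normalize-date begin-date)
   (my/normalize-date end-date)))
```

With both dates left blank, every readable Org file in the directory is processed.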