Guide: Bibliography system with org-roam-bibtex and org-noter integration

NOTE: Some of the code and text may look misaligned on this website; this is only a display bug. If you copy the code and paste it into your text editor, the alignment will resolve itself!

Introduction

When I tried to set up org-roam-bibtex, I quickly found that there was no “good” guide that would help me set up a bibliography system with sane defaults and then get out of my way so that I could get on with my life. Many guides on the internet are badly outdated and use deprecated variables that have not existed since 2021. The system is also confusing to newcomers, since it requires multiple packages and the user is expected to go through several manuals before anything works. All in all, it took me 2 days of reading through these manuals to set the system up. In this guide I want to present the distilled essence of the whole, so that a new user can set up a working system in a matter of minutes (not days) and get on with the actual research without having to read the manuals (unless they wish to customise something or run into a problem).

We will set up org-roam-bibtex with two backends - org-ref and helm-bibtex.

1.) helm-bibtex is a front end to bibtex-completion. It lets us work with our .bib file globally: calling helm-bibtex gives us a dedicated buffer from which we can manage our citations easily. We can add new entries, edit entries, insert citations into our documents, and much more.

Note: ivy-bibtex is a more minimal alternative, but we will not be setting it up here!

2.) org-ref lets us work with citations of the form cite:&citekey and makes it easy to produce pdfs. It also gives us a new export backend under C-c C-e r, from which we can export to html with these citations. Further, pressing <Ret> on a citation opens a small window at the bottom of the buffer, which makes working with citations in the buffer very easy.

3.) Finally org-roam-bibtex will allow us to integrate our bibliography system with our org-roam system.

4.) We will also set up org-noter which will allow us to work with pdf annotations on an .org file and integrate it with the system.

Basic Setup

The system requires,

i. A .bib file where we will store our entries (./references/master.bib in this guide)
ii. A folder to keep our documents related to an entry - being pdf(s), such as the paper related to the entry and so on (./references/documents/ in this guide)
iii. A folder to keep our notes related to the entry and the document(s) (./references/notes/ in this guide)

For convenience, this guide assumes that the value of org-roam-directory is ~/roam.


                                ╭── documents
                                │
roam ───────── references ──────┼── notes ───────── notes-template.org*
                                │
                                ╰── master.bib

*for org-noter integration, explained below				

FIG_01: Diagram explaining the directory and file structure used
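
The layout in FIG_01 can also be created from inside Emacs. The following is only a minimal sketch, assuming the ~/roam location used throughout this guide; evaluate it once (for example with M-: or in the *scratch* buffer) rather than keeping it in your init file.

```elisp
;; Sketch: create the folders and an empty master.bib used in this guide.
;; The root path is the guide's assumption; adjust it to your org-roam-directory.
(let ((root (expand-file-name "~/roam/references/")))
  ;; The second argument t also creates missing parent directories.
  (make-directory (expand-file-name "documents" root) t)
  (make-directory (expand-file-name "notes" root) t)
  ;; Create master.bib only if it does not exist, so nothing is overwritten.
  (let ((bib (expand-file-name "master.bib" root)))
    (unless (file-exists-p bib)
      (write-region "" nil bib))))
```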

Installation

Ensure the following are installed (or install them from MELPA with M-x package-install <Ret>):

  1. helm-bibtex
  2. org-ref
  3. org-roam-bibtex
  4. org-noter
    4.1 pdf-tools*

*pdf-tools makes the experience of working with pdfs inside emacs better.
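
If you prefer to install the packages from your init file instead of interactively, something along these lines should work. It is a sketch that assumes you use the built-in package.el and the standard MELPA archive URL; users of other package managers should use their own mechanism.

```elisp
;; Sketch: install the packages listed above with package.el.
(require 'package)
(add-to-list 'package-archives '("melpa" . "https://melpa.org/packages/") t)
(package-initialize)

;; Fetch the archive contents once if they have never been downloaded.
(unless package-archive-contents
  (package-refresh-contents))

;; Install only what is missing.
(dolist (pkg '(helm-bibtex org-ref org-roam-bibtex org-noter pdf-tools))
  (unless (package-installed-p pkg)
    (package-install pkg)))
```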

Configuration

Set the following in your init file:

We do not need to configure org-ref separately: since V3 (2021) it shares its variables with helm-bibtex (bibtex-completion)!

helm-bibtex (bibtex-completion)

;; IMP: Ensure 'latexmk' installed as a system package!
;; see also: http://www.jonathanleroux.org/bibtex-mode.html
(setq bibtex-completion-bibliography '("~/roam/references/master.bib"))  ; location of .bib file containing bibliography entries
(setq bibtex-completion-find-additional-pdfs t)                          ; support for multiple pdfs for one %citekey
(setq bibtex-completion-pdf-field "File")                                ; in bib entry, file = {/path/to/file.pdf} could be set to locate the accompanying file
                                                                         ;; for multiple files use, file = {:/path/to/file0.pdf:PDF;:/path/to/file1.pdf:PDF}
(setq bibtex-completion-library-path '("~/roam/references/documents/"))  ; in this dir, %citekey-name(s).pdf would automatically attach pdf(s) to %citekey
                                                                         ;; if only !exist "file" field in bib entry
(setq bibtex-completion-notes-path "~/roam/references/notes/")           ; dir to keep notes for the pdfs

;; BEGIN: Change insert citation (<f3>) behaviour of helm-bibtex for org-mode 
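(defun custom/bibtex-completion-format-citation-org (keys)
  "Format KEYS as org-ref citations, e.g. \"cite:&key1, cite:&key2\"."
  ;; Built-in mapconcat avoids depending on the s and dash libraries here.
  (mapconcat (lambda (key) (format "cite:&%s" key)) keys ", "))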

(setq bibtex-completion-format-citation-functions
      '((org-mode      . custom/bibtex-completion-format-citation-org)
	(latex-mode    . bibtex-completion-format-citation-cite)
	(markdown-mode . bibtex-completion-format-citation-pandoc-citeproc)
	(python-mode   . bibtex-completion-format-citation-sphinxcontrib-bibtex)
	(rst-mode      . bibtex-completion-format-citation-sphinxcontrib-bibtex)
	(default       . bibtex-completion-format-citation-default))
      )
;; END: Change insert citation (<f3>) behaviour of helm-bibtex for org-mode

;; Customisations for 'bibtex-generate-autokey'.
;; Press C-c C-c (bibtex-clean-entry) on a bib entry w/o %citekey to
;; automatically insert a %citekey based on its metadata.
;; Use M-x crossref-add-bibtex-entry <ret> to add an entry from
;; https://www.crossref.org/
(setq bibtex-autokey-year-length 4
      bibtex-autokey-name-year-separator "-"
      bibtex-autokey-year-title-separator "-"
      bibtex-autokey-titleword-separator "-"
      bibtex-autokey-titlewords 2
      bibtex-autokey-titlewords-stretch 1
      bibtex-autokey-titleword-length 5)

org-roam-bibtex

If we use org-roam for .org files exclusively, we can just set this variable and forget about it. We do not need to enable org-roam-bibtex-mode separately when working with the .org files of org-roam; the mode basically toggles this variable!


;(setq bibtex-completion-edit-notes-function 'bibtex-completion-edit-notes-default) ; default to org-ref for notes
(setq bibtex-completion-edit-notes-function 'orb-bibtex-completion-edit-note) ; use org-roam-capture-templates for notes
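
One thing to watch for: the function named above lives in the org-roam-bibtex package, so that package has to be loaded before the function is first called. Whether your installed version autoloads it is something I am not certain of, so the following is only a defensive sketch that loads the package explicitly once bibtex-completion is available.

```elisp
;; Sketch: make sure org-roam-bibtex is loaded before its note function is used.
;; Assumption: the function may not be autoloaded in your installed version.
(with-eval-after-load 'bibtex-completion
  (require 'org-roam-bibtex)
  (setq bibtex-completion-edit-notes-function
        #'orb-bibtex-completion-edit-note))
```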

org-noter integration

  1. Configuration for pdf-tools:

IMP: After installing pdf-tools, do M-x pdf-tools-install <Ret> and put this line in your init file:


(pdf-loader-install) ; use PDFView in place of Doc View

  2. org-noter integration with org-roam and org-roam-bibtex

(setq org-noter-notes-search-path '("/home/USER/roam/references/notes/")) ; V IMPORTANT: SET FULL PATH!

(setq orb-preformat-keywords '("citekey" "title" "url" "author-or-editor" "keywords" "file") ; customisation for notes, org-noter integration
      orb-process-file-keyword t
      orb-attached-file-extensions '("pdf"))
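
As an alternative to typing out /home/USER/ by hand, the full path can be computed. This is a small sketch that assumes the notes folder from this guide; expand-file-name turns the ~ into your actual home directory, so the variable still receives a full path.

```elisp
;; Sketch: build the absolute notes path instead of hardcoding the user name.
(setq org-noter-notes-search-path
      (list (expand-file-name "~/roam/references/notes/")))
```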

Create an org-roam-capture-template. Add only the template to your existing org-roam-capture-templates variable; do not copy and paste this as-is!!


(setq org-roam-capture-templates
      '(
        ("b" "bibliography notes" plain             ; Org-noter integration
         (file "~/roam/references/notes/notes-template.org")
	 :target (file+head "references/notes/${citekey}.org"
			    "#+title: ${title}\n")
	 :empty-lines 1)
	)
      )

For example, my full setting looks something like this; the templates also contain some extra export ‘options’ I deem sane!

(setq org-roam-capture-templates                    ; Org-roam capture templates
      '(
	("d" "default" plain
	 "%?"
	 :target (file+head "%<%Y%m%d%H%M%S>-${slug}.org"
			    "#+title: ${title}\n#+options: author:nil, date:nil, toc:nil, num:5, H:5, html-postamble:nil\n")
	 :empty-lines 1
	 :unnarrowed t)
	("b" "bibliography notes" plain             ; Org-noter integration
	 (file "~/roam/references/notes/notes-template.org")
	 :target (file+head "references/notes/${citekey}.org"
			    "#+title: ${title}\n#+options: author:nil, date:nil, toc:nil, num:5, H:5, html-postamble:nil\n")
	 :empty-lines 1)
	)
      )
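
If you already have templates and do not want to retype the whole variable, the "b" template can be appended on its own. This is a sketch under the assumption that org-roam is loaded through the org-roam feature and that no other template already uses the key "b".

```elisp
;; Sketch: append only the bibliography template to the existing list.
(with-eval-after-load 'org-roam
  (add-to-list 'org-roam-capture-templates
               '("b" "bibliography notes" plain
                 (file "~/roam/references/notes/notes-template.org")
                 :target (file+head "references/notes/${citekey}.org"
                                    "#+title: ${title}\n")
                 :empty-lines 1)
               t)) ; the final t appends instead of prepending
```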
  

Other configurations

Create a file ~/roam/references/notes/notes-template.org and add the following:


- tags ::
- keywords :: %^{keywords}

* %^{title}
:PROPERTIES:
:Custom_ID: %^{citekey}
:URL: %^{url}
:AUTHOR: %^{author-or-editor}
:NOTER_DOCUMENT: %^{file}  
:NOTER_PAGE:              
:END:

Create another file ~/roam/references/master.bib and prepend the following:


  Manual: https://www.jonathanleroux.org/bibtex-mode.html
 ---------------------------------------------------------
 
+---------------Keybindings-----------------+---------Particulars--------------+
| C-c C-e C-a   bibtex-Article              | Article:Journ,Mag,Newsp,Periodi. |
| C-c C-e C-b   bibtex-InBook               | Section inside Book              |
|  or (C-c C-e I)                           |                                  |
| C-c C-e b     bibtex-Book                 | Book                             |
| C-c C-e B     bibtex-Booklet              | Bound but lacking Publshr/Inst.  |
| C-c C-e C-c   bibtex-InCollection         | Article in a Collection          |
|  or (C-c C-e i)                           |                                  |
| C-c C-e C-p   bibtex-InProceedings        | A Conference Paper               |
|  or (C-c C-e C-i)                         |                                  |
| C-c C-e p     bibtex-Proceedings          | The whole Conference proceedings |
| C-c C-e m     bibtex-MastersThesis        | Thesis for Grad lvl              |
| C-c C-e P     bibtex-PhdThesis            | Thesis for Phd level             |
| C-c C-e C-m   bibtex-Manual               | Technical Manual                 |
| C-c C-e C-t   bibtex-TechReport           | Inst.Report,WhitePaper,WorkPaper |
| C-c C-e C-u   bibtex-Unpublished          | Accepted/Submitted/Not Publ.     |
|                                           | (Include 'note' to specify which)|
| C-c C-e M     bibtex-Misc                 | Web pages,Slides,Personal notes  |
| C-c C-e M-p   bibtex-Preamble             | Formatter code for .bbl file     |
| C-c C-e C-s   bibtex-String               | Define Variables inside entry    |
+-------------------------------------------+----------------------------------+

Finally! We add just one keybinding, and everything will fall into place:


(global-set-key (kbd "C-c f r") 'helm-bibtex) ; keybinding 

I use the "f" key as an org-roam modifier, and "r" seems natural for references; use whatever works for you. The important point is that we shouldn’t need more than one keybinding. One is more than enough!

Misc optional configurations

Writing technical documents requires us to write in paragraphs, whereas org mode by default is intended to be used as an outliner. To get around this, it is useful to make org-export preserve line breaks. There are two ways to achieve this: we can add \n:t to #+options: as a document-specific setting, or we can set

(setq org-export-preserve-breaks t)

in our init file for this to work globally in all .org files.

Usage

When we are writing a document, we can press C-c f r to bring up a dedicated buffer listing our bibtex entries; pressing <Tab> on an entry shows all the available actions. Pressing <f3>, for example, inserts a citation for the selected entry and closes the buffer; pressing <Ret> (the default action, <f1>) opens the associated pdf if one exists and the url otherwise; and so on.

Pressing C-o on the buffer will change selection to the “Fallback Options”, the first one of which is to include a new entry to our bibtex file.

When a cite:&citekey exists in a document, pressing <Ret> on it shows a mini buffer where we can do most of what we need: open the associated pdf(s), “find” notes, go to the url, change the citation, remove it, and much more. This mini buffer belongs to ‘org-ref’.

Notes created of the associated documents are also compatible with org-noter.

Example of a .bib entry for an article


@article{cohen1963,
  author   = "P. J. Cohen",
  title    = "The independence of the continuum hypothesis",
  journal  = "Proceedings of the National Academy of Sciences",
  year     = 1963,
  volume   = "50",
  number   = "6",
  pages    = "1143--1148",
  url      = {https://localhost:1},
  file     = {path/to/pdf.pdf},
}

Note: if we include the file field, bibtex-completion will not look for a file in ./references/documents/. The documentation says that it should fall back to that folder if the given file does not exist, but for some reason, when the file doesn’t exist, Emacs opens / in Dired. So use one or the other, not both.

Documents kept in ./references/documents/ must be named citekey-name.pdf. In our example, they may be:

  1. cohen1963.pdf
  2. cohen1963-original.pdf, cohen1963-whatever.pdf and so on, all of these documents will be automatically connected to citekey:cohen1963.
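
To see which files in the documents folder follow this naming convention for a given citekey, a small helper can be used. This is only an illustration of the convention: my/pdfs-for-citekey is a hypothetical name, and the function does its own file matching rather than calling bibtex-completion's lookup.

```elisp
;; Sketch: list files named CITEKEY.pdf or CITEKEY-<anything>.pdf.
;; `my/pdfs-for-citekey' is a hypothetical helper, not part of any package.
(defun my/pdfs-for-citekey (citekey)
  "Return the full paths of pdfs in the documents folder matching CITEKEY."
  (directory-files (expand-file-name "~/roam/references/documents/")
                   t  ; return absolute file names
                   (concat "\\`" (regexp-quote citekey)
                           "\\(-.*\\)?\\.pdf\\'")))
```

Calling (my/pdfs-for-citekey "cohen1963") would return cohen1963.pdf and cohen1963-original.pdf if both exist in the folder.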

Finally, very important: in every .org file where we include citations, we MUST include two lines at the end for the export to pdf C-c C-e l p to work correctly, namely:

#+latex: \bibliographystyle{unsrt}
#+latex: \bibliography{references/master} 

For other bibliography styles click here
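
Since these two lines are needed in every document, they can be inserted with a small command. The following is a sketch; my/insert-latex-bibliography-footer is a hypothetical name, and the style and .bib path are simply the ones used in this guide.

```elisp
;; Sketch: append the two LaTeX bibliography lines to the end of the buffer.
;; `my/insert-latex-bibliography-footer' is a hypothetical command name.
(defun my/insert-latex-bibliography-footer ()
  "Append the LaTeX bibliography lines used in this guide to the buffer."
  (interactive)
  (save-excursion
    (goto-char (point-max))
    ;; Start on a fresh line if the buffer does not end with a newline.
    (unless (bolp) (insert "\n"))
    (insert "\n#+latex: \\bibliographystyle{unsrt}\n"
            "#+latex: \\bibliography{references/master}\n")))
```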

When we are exporting to html from C-c C-e r h, only inclusion of

bibliography:/home/USER/roam/references/master.bib

at the end of the document is needed. Notice that for latex export we can get away with a relative file path without the .bib extension, but for html we must use the full file path along with the .bib extension!

Further Notes

  1. Installing latexmk will suffice for exporting: Emacs is configured to detect it and adjust its export process automatically, so you don’t need to fiddle with org-latex-pdf-process. If you do not have it installed as a system package, you’ll inevitably run into problems and need to read the manuals and scour support forums for answers.

  2. org-noter integration is non-commutative: the intended workflow is to first create the note from org-roam and only then open the pdf and do M-x org-noter inside it. To create a note, either press <Ret> on an inserted citation and then n; or call helm-bibtex (C-c f r), select the entry and press <f8> (edit notes). If, on the other hand, the note is created from org-noter, then when org-roam first visits the file it will fail to create the #+title and will attach :PROPERTIES: to the first heading created by org-noter instead of to the whole page. This is probably not what you want.

  3. M-x orb-insert-link inserts a link to the notes page of a citation, similar to org-roam-node-insert. It could be bound to a key for some workflows; I have it bound to
    C-c f C-r i.
    Also, M-x orb-note-actions, when called on a notes page, offers options such as “Open Pdf”, “Add Pdf”, “yank citekey”, “Go to bibtex entry” and so on; pressing <Tab> will show all the options available. It can also be bound to a key, e.g. C-c f C-r n.

  4. This guide was written for vanilla Emacs on a GNU/Linux system. I have not used Windows for many years, and I have never used a Mac or any other Emacs distribution, so I do not know how to make it work on those systems, or whether this guide is 1-1 compatible with them.
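
The optional bindings mentioned in note 3 could be set up like this. It is a sketch using the same key sequences as in the text; pick whatever keys suit your own setup.

```elisp
;; Sketch: optional bindings for the two org-roam-bibtex commands in note 3.
(global-set-key (kbd "C-c f C-r i") #'orb-insert-link)
(global-set-key (kbd "C-c f C-r n") #'orb-note-actions)
```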

Conclusion

I think that is all. I have tried my best to make this error free; if any error has inadvertently crept in, please let me know. I hope you have a productive time doing research. Thank you.

Checked it quickly. Looks great! Thank you for sharing. Will try to set it up myself one of these days. At the moment I’m on a Windows machine; will let you know if it works.

Just skimmed through it. Looks very detailed and complete. Thanks for taking your time writing it.

One question I have with these bibliography and citation management systems is this: How do you guys handle the online resources like blog posts, videos, podcasts, in your bibliography management? For example, many academic papers have proper short-hand names, like in your example, “cohen1963”. Now, the online resources that one might want to take notes from and manage using a bibliography system are without such shorthand names. So, how do these resources come into play in your bibliography management systems?

Hi Chuck, cohen1963 is the citekey. It is internal to the system and will never show up in your exported document; it is only for our own referencing before export. What actually shows up in your export depends on the bibliography style that was selected, and is generated automatically from the metadata in the entry itself. The styles you are referring to resemble either alpha or apalike; please see this for possible bibliography styles - I posted this link earlier in the main body also.

This document has some guidelines on when to use what bibtex entries, it even lists what to do when you want to cite a conversation you had with someone :smile:

I also included a table in the main body which when prepended to the .bib file will help in choosing the right format, it lists the keybindings to insert a format and possible situations when one might want to use it.

For websites we can use the @misc{ } format. To insert it in your .bib file, do C-c C-e M and Emacs will insert a template. In this template you’ll find many fields with OPT before them; this is Emacs’ way of saying OPTional. In Misc, all the fields are OPTional when inserted. Simply populate the fields you need (look in this document for an explanation of the various fields), then do C-c C-c with the cursor anywhere inside the entry. Emacs will run bibtex-clean-entry, which removes all the fields you have not populated and strips OPT from the ones you have!

Emacs would also call bibtex-generate-autokey to automatically insert a citekey based on the metadata, I have also supplied the code to make this autogenerated citekey have certain formatting, but in general this would follow “lastname-date-some_chars_from_title”. All we need is for this citekey to be unique, remember it wouldn’t show up in our final document!

Now for websites, we do not have dates of creation, we have date accessed, but we won’t be inserting that in our year field, instead we may use the howpublished= field to insert both the url and the date accessed. But, since we won’t be having a year entry, bibtex-generate-autokey will fail to create a citekey, we may simply manually create one!

I am including some screenshots to explain this process with an example:

Step 0: We insert the bibtex format for misc. using C-c C-e M (Bottom Left Window)

Step 1: We fill up all the fields we need, we may also manually add more fields

Step 2: We press C-c C-c and Emacs calls bibtex-clean-entry; a citekey is created if the entry has a Title, Date and Author. If any of them is missing, as here, we create one manually.

Step 3: We Insert the citation in our document and finish our writing.

Step 4: The final document would look like this, using apalike bibliography style
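
For those who add many web pages, the manual steps above could be wrapped in a small command. This is only a sketch: my/insert-web-misc-entry is a hypothetical name, the citekey is typed by hand as described, and wrapping the link in \url assumes your LaTeX setup loads a package that provides it (such as url or hyperref).

```elisp
;; Sketch: insert a @misc entry for a web page at point in the .bib file.
;; `my/insert-web-misc-entry' is a hypothetical command, not part of bibtex-mode.
(defun my/insert-web-misc-entry (citekey title url)
  "Insert a @misc entry with CITEKEY, TITLE and URL plus today's access date."
  (interactive "sCitekey: \nsTitle: \nsURL: ")
  ;; howpublished carries both the url and the date accessed, as in the text.
  (insert (format (concat "@misc{%s,\n"
                          "  title        = {%s},\n"
                          "  howpublished = {\\url{%s}, accessed %s},\n"
                          "}\n")
                  citekey title url
                  (format-time-string "%Y-%m-%d"))))
```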

I hope it cleared your doubt.

Yeah, I just saw that in the document you provided. Fascinating.

Thanks for this extremely helpful manual: it did help me to establish a smooth integration between org-roam-bibtex and org-noter.
Just one additional remark: in order for org-noter to edit a pdf within the previously created “org-roam-capture-template”, the following workflow is required in my case. I wonder if this is also the common sequence of commands for everyone.

  1. M-x org-noter-enable-org-roam-integration inside the pdf (without this command, org-noter prompts the creation of an entirely new document, which breaks the integration with org-roam)
  2. M-x org-noter inside the pdf

Hi, I stated in Further Notes (2) that the integration is non-commutative: first the note should be created outside the pdf, then subsequent visits can be made from within pdfs just fine.

Org-noter has its own integration with org-roam, and the last time I tried it, it conflicted with org-roam-bibtex’s method, so I don’t know how well they will synchronise. I would not use org-noter’s org-roam integration with org-roam-bibtex.

If the file is created and the search-path is correct it /should/ jump you back in the file

Yes, this is the sequence I’m following. Search path lands in the correct folder too. Yet, for some reason, subsequent visits from pdf do not land at the previously created note, unless I invoke M-x org-noter-enable-org-roam-integration. This is totally fine for me, just adds one minor step.

one thing I want to ask, are you keeping the name of the document = name of the org file = citation key?

If not this could cause a problem if I remember correctly…

I followed the format that you suggested in the “org-roam-capture-template”, whereby the name of the document is the citation key ${citekey}.org

name of the document being the pdf file too? If everything works then dw. Just curious because it works on my end without using the command you suggested.

The name of the file is ${citekey}.org, whereas the document title (which is also the title of the org-roam node) is ${title}.
I’ll experiment a bit by changing the name of the org-roam note to the name of the pdf, i.e. ${title} or by specifying the name of the pdf in advance to be the ${citekey}. I’ll report back in case this works.

In short, they aren’t the same, which must be the root of the problem. Thanks for the helpful feedback!

Glad we could diagnose it. Mainly, org-noter’s org-roam integration is built around the workflow of keeping notes on pdfs, whereas org-roam-bibtex’s workflow is built around a three-way link between a citation entry, a pdf document, and an accompanying note. So if they don’t line up, things go awry. With org-noter’s org-roam integration you could keep the name of the notes file the same as the name of the document, but this would break the three-way link in org-roam-bibtex somewhere down the line. I cannot recall exactly what problems I faced, so I resorted to keeping at least one copy of the pdf under the same name as the citekey.

So for things to work smoothly, this must be the case: citekey.pdf and citekey.org (with a $title of your choice, of course).

There are inbuilt ways to attach the pdf such that the name is automatically resolved. For example: try <F10> in the Helm menu.

also see orb-note-actions
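
A quick way to check whether a given citekey has both files in place is a helper like the one below. It is a sketch with a hypothetical name, my/citekey-files-status, and it assumes the folder layout from the guide above.

```elisp
;; Sketch: check that citekey.pdf and citekey.org both exist.
;; `my/citekey-files-status' is a hypothetical helper, not part of any package.
(defun my/citekey-files-status (citekey)
  "Return a plist telling whether CITEKEY.pdf and CITEKEY.org exist."
  (let ((pdf  (expand-file-name (concat citekey ".pdf")
                                "~/roam/references/documents/"))
        (note (expand-file-name (concat citekey ".org")
                                "~/roam/references/notes/")))
    (list :pdf (file-exists-p pdf) :note (file-exists-p note))))
```

If either value comes back nil for a citekey, the names are not lined up in the way described above.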
