I can’t stop thinking about using org-roam for full-stack bibliographic management, so I thought I’d share the idea. I don’t think I have all the skills necessary to build this myself, so I’m hoping someone will want to pick it up, or collaborate with me on it.
The way I see it, maintaining org-roam bibliographic notes can be a bit kludgey at times, especially if you’re like me, and use org-roam, org-roam-bibtex, org-ref, helm-bibtex, bibtex-completion, and more. Here are some problems with that stack, as I see them:
Problems
- That’s a big software stack, with lots of moving parts to keep track of.
- Helm-bibtex and its siblings need to parse all your bibtex files each time you try to insert a citation. That can get pretty time-intensive, if you have a large bibtex file, like I do.
- Org-roam notes reference citations (cite:stanley2021), which point to entries in bibtex files (@article{stanley2021...), which are what ultimately contain the bibliographic metadata. But this means that you’re maintaining notes for bibliographic entries in two places, ultimately. This can get messy.
- Org-ref is slowly being replaced with org-cite, the built-in citation mechanism for Org-mode.
Proposed Solution
What if, instead of trying to Frankenstein together a bibliographic manager by combining org-roam, org-ref, org-roam-bibtex, helm-bibtex, and others, you could just use org-roam as your bibliographic manager? That way, not only would you simplify your software stack, but you could simplify your notes, too, by avoiding BibTeX altogether.
The pieces are almost all there. Org-bibtex, which is already included in Org, provides templates for org-based bibliographic metadata storage, using PROPERTIES drawers.
Here’s an example:
* A multi-language computing environment for literate programming and reproducible research :babel:
:PROPERTIES:
:BTYPE: article
:AUTHOR: Eric Schulte and Dan Davidson and Tom Dye and Carsten Dominik
:JOURNAL: Journal of Statistical Software
:VOLUME: 46
:NUMBER: 3
:YEAR: 2012
:MONTH: January
:CUSTOM_ID: schulte2012babel
:END:
Some annotation about babel.
This isn’t too far off from what an org-roam node looks like, as generated by org-roam-bibtex.
If this were your canonical format for bibliographic metadata, it would allow you to do some sophisticated org-ql queries for finding, say, all entries from 2021. It would also allow you to leverage hierarchical org-roam nodes to represent articles in edited collections: the collection itself could be the parent heading, and the collection’s articles could be subheadings.
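To make the querying idea a bit more concrete, here’s a rough sketch in Python rather than Emacs Lisp, just so it can be run on its own. It is not org-ql and not org-roam’s API; the function names (parse_entries, entries_from_year) are made up for illustration, and the "collection" and "chapter" entries are placeholder data I invented to show the parent/child idea. It reads headings with PROPERTIES drawers, remembers each heading’s parent, and filters by YEAR.

```python
import re

# Sample Org text. The first two entries are invented placeholders that show
# a collection with a chapter under it; the third is shortened from the
# example above.
ORG_TEXT = """\
* An edited collection (placeholder)
:PROPERTIES:
:BTYPE: book
:YEAR: 2021
:CUSTOM_ID: example2021collection
:END:
** A chapter inside the collection (placeholder)
:PROPERTIES:
:BTYPE: incollection
:YEAR: 2021
:CUSTOM_ID: author2021chapter
:END:
* A multi-language computing environment for literate programming and reproducible research :babel:
:PROPERTIES:
:BTYPE: article
:YEAR: 2012
:CUSTOM_ID: schulte2012babel
:END:
Some annotation about babel.
"""

# "* Title :tag1:tag2:" -> stars, title, optional tag string
HEADING = re.compile(r"^(\*+)\s+(.*?)(?:\s+(:[\w@:]+:))?\s*$")
# ":KEY: value" inside a property drawer
PROPERTY = re.compile(r"^:([A-Za-z_]+):\s*(.*)$")


def parse_entries(text):
    """Return one dict per heading, with its properties and parent heading."""
    entries, stack, in_drawer = [], [], False
    for raw in text.splitlines():
        line = raw.strip()
        if line == ":PROPERTIES:":
            in_drawer = True
        elif line == ":END:":
            in_drawer = False
        elif in_drawer and entries:
            m = PROPERTY.match(line)
            if m:
                entries[-1]["props"][m.group(1).upper()] = m.group(2).strip()
        else:
            m = HEADING.match(raw)
            if m:
                level = len(m.group(1))
                # Drop headings at the same or deeper level; what remains on
                # top of the stack is the parent of this heading.
                while stack and stack[-1]["level"] >= level:
                    stack.pop()
                entry = {
                    "title": m.group(2),
                    "level": level,
                    "tags": m.group(3).strip(":").split(":") if m.group(3) else [],
                    "parent": stack[-1] if stack else None,
                    "props": {},
                }
                entries.append(entry)
                stack.append(entry)
    return entries


def entries_from_year(entries, year):
    """All entries whose YEAR property equals the given year."""
    return [e for e in entries if e["props"].get("YEAR") == str(year)]


if __name__ == "__main__":
    for e in entries_from_year(parse_entries(ORG_TEXT), 2021):
        parent = e["parent"]["props"]["CUSTOM_ID"] if e["parent"] else "(top level)"
        print(e["props"]["CUSTOM_ID"], "under", parent)
```

In a real implementation, I’d expect org-ql or org-roam’s own database to do this work; the point is only that the metadata is already structured enough to query once it lives in property drawers.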
What would be needed
Here’s what I imagine something like this would need:
- Importer functions that could take ISBNs or DOIs, or even plain-text queries, and convert them to org-roam nodes, containing bibliographic metadata gleaned from some REST API. It could just be a light rewrite of org-ref’s isbn-to-bibtex function, and other functions like that.
- Functions to cache bibliographic metadata in org-roam’s database, whenever it scans for changes.
- Functions for helm, ivy, or whatever other system, to select bibliographic items from org-roam nodes, based on that cached metadata, like titles, authors, years, and so on.
- A backend for org-cite’s citeproc, that can look up bibliographic data from the org-roam database, rather than looking for it in a bibtex file. That way, when exporting an org file containing citations, it can generate the Works Cited page automagically, using org-roam data.
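Here’s a minimal sketch of the first three pieces, again in Python so it runs standalone. Everything in it is an assumption on my part: it skips the actual REST call and starts from a record shaped like CSL-JSON (the kind of thing a DOI lookup service might return), the function names are hypothetical, and the SQLite table is a made-up schema, not org-roam’s real one. It turns the record into an org-bibtex-style heading, caches a few fields, and searches the cache instead of re-parsing any files.

```python
import sqlite3

# A CSL-JSON-shaped record, filled in by hand from the example entry above.
# A real importer would get something like this from a metadata service.
RECORD = {
    "type": "article-journal",
    "title": "A multi-language computing environment for literate "
             "programming and reproducible research",
    "author": [
        {"given": "Eric", "family": "Schulte"},
        {"given": "Dan", "family": "Davidson"},
        {"given": "Tom", "family": "Dye"},
        {"given": "Carsten", "family": "Dominik"},
    ],
    "container-title": "Journal of Statistical Software",
    "volume": "46",
    "issue": "3",
    "issued": {"date-parts": [[2012, 1]]},
}

# Partial, hypothetical mapping from CSL types to BibTeX-style entry types.
CSL_TO_BTYPE = {"article-journal": "article", "book": "book",
                "chapter": "incollection"}


def csl_to_props(record, key):
    """Flatten a CSL-JSON-shaped record into org-bibtex-style properties."""
    authors = " and ".join(
        f"{a.get('given', '')} {a.get('family', '')}".strip()
        for a in record.get("author", [])
    )
    date_parts = record.get("issued", {}).get("date-parts", [[]])[0]
    return {
        "BTYPE": CSL_TO_BTYPE.get(record.get("type"), "misc"),
        "AUTHOR": authors,
        "JOURNAL": record.get("container-title", ""),
        "VOLUME": str(record.get("volume", "")),
        "NUMBER": str(record.get("issue", "")),
        "YEAR": str(date_parts[0]) if date_parts else "",
        "CUSTOM_ID": key,
    }


def csl_to_org(record, key):
    """Render the record as an Org heading with a PROPERTIES drawer."""
    lines = [f"* {record['title']}", ":PROPERTIES:"]
    lines += [f":{name}: {value}"
              for name, value in csl_to_props(record, key).items() if value]
    lines.append(":END:")
    return "\n".join(lines)


def cache_entry(conn, record, key):
    """Store the fields a completion UI would search on (made-up schema)."""
    conn.execute("CREATE TABLE IF NOT EXISTS refs "
                 "(key TEXT PRIMARY KEY, title TEXT, author TEXT, year TEXT)")
    props = csl_to_props(record, key)
    conn.execute("INSERT OR REPLACE INTO refs VALUES (?, ?, ?, ?)",
                 (key, record["title"], props["AUTHOR"], props["YEAR"]))


def search(conn, term):
    """Find cached entries whose title or author contains the term."""
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT key, title FROM refs "
        "WHERE title LIKE ? OR author LIKE ? ORDER BY key",
        (pattern, pattern),
    )
    return rows.fetchall()


if __name__ == "__main__":
    print(csl_to_org(RECORD, "schulte2012babel"))
    conn = sqlite3.connect(":memory:")
    cache_entry(conn, RECORD, "schulte2012babel")
    print(search(conn, "schulte"))
```

Since org-roam already keeps a SQLite database of nodes, the cache half of this would presumably become extra columns or a table there, and the selection function would feed helm, ivy, or whatever completion front end you prefer.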
Let me know what you think, and whether you think this might deserve further thought and effort.