Too many URL links

I have an org file listing all the papers submitted to an academic conference. It contains an org table that looks like this:

|    # | Title & CVPR page                                                                                                                                   | arXiv |
|------+-----------------------------------------------------------------------------------------------------------------------------------------------------+-------|
|      | <50>                                                                                                                                                |       |
|    1 | [[https://openaccess.thecvf.com/content/CVPR2021/html/Liu_Invertible_Denoising_Network_A_Light_Solution_for_Real_Noise_Removal_CVPR_2021_paper.html][Invertible Denoising Network: A Light Solution for Real Noise Removal]]                                                                               | [[http://arxiv.org/abs/2104.10546][arXiv]] |
|    2 | [[https://openaccess.thecvf.com/content/CVPR2021/html/Wu_Greedy_Hierarchical_Variational_Autoencoders_for_Large-Scale_Video_Prediction_CVPR_2021_paper.html][Greedy Hierarchical Variational Autoencoders for Large-Scale Video Prediction]]                                                                       |       |
|    3 | [[https://openaccess.thecvf.com/content/CVPR2021/html/Pony_Over-the-Air_Adversarial_Flickering_Attacks_Against_Video_Recognition_Networks_CVPR_2021_paper.html][Over-the-Air Adversarial Flickering Attacks Against Video Recognition Networks]]                                                                      | [[http://arxiv.org/abs/2002.05123][arXiv]] |
|    4 | [[https://openaccess.thecvf.com/content/CVPR2021/html/Feng_Encoder_Fusion_Network_With_Co-Attention_Embedding_for_Referring_Image_Segmentation_CVPR_2021_paper.html][Encoder Fusion Network With Co-Attention Embedding for Referring Image Segmentation]]                                                                 | [[http://arxiv.org/abs/2105.01839][arXiv]] |

There are 1659 rows in this table. When I save the file, Emacs locks up for around a minute while org-roam-db-update-file adds a row to the links table for every one of those http links.

I was kind of expecting there to be a configurable variable to turn off capturing http type links at all – but from the implementation of org-roam-db-update-file, which has

          (org-roam-db-map-links
           (list #'org-roam-db-insert-link)))))))

and the implementation of org-roam-db-map-links which is

(defun org-roam-db-map-links (fns)
  "Run FNS over all links in the current buffer."
  (org-with-point-at 1
    (org-element-map (org-element-parse-buffer) 'link
      (lambda (link)
        (dolist (fn fns)
          (funcall fn link))))))

(BTW what language should I be putting on the Markdown code block so elisp formats correctly? I forget. I tried a few things like lisp, elisp, emacs-lisp, but none worked)

I am not seeing any easily configurable way to just ignore URLs while still collecting org-mode style links.

I don’t find the http links to be particularly helpful – do others? I am thinking of ways to just disable their collection across the board. They clutter up org-roam-graphs, and I don’t use them for organization in any way that I’m aware of.

Greg

I don’t use URLs in my notes but I believe storing http(s) links is for roam_refs. In this post I attempted to describe what they do. It was for V1 but I think it still applies to V2 – please correct me if I’m mistaken.

Re how to ignore http links for caching, org-roam-db-map-links is called like this.

            (org-roam-db-map-links
             info
             (list #'org-roam-db-insert-link))

Perhaps you could simply override org-roam-db-insert-link so that it ignores links of type “http” (or whatever other types you use for the web). I believe the type variable in that function can be used to single out ID links, so the same check should work for web links.
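
A minimal sketch of that idea, using advice rather than redefining the function. It assumes org-roam-db-insert-link receives the link element as its first argument, which is what the `(funcall fn link)` in the org-roam-db-map-links code quoted above suggests. The name my/org-roam-skip-web-links is made up for this example. I have not tested it against a large file, and if http(s) links really are needed for roam_refs, skipping them may also remove those backlinks.

```emacs-lisp
(require 'org-element)

;; Hypothetical helper: the name is invented for this sketch.
;; Assumption: LINK is the org-element link object that
;; org-roam-db-map-links passes to each function in FNS.
(defun my/org-roam-skip-web-links (orig-fn link &rest args)
  "Call ORIG-FN on LINK unless LINK is an http or https link."
  (unless (member (org-element-property :type link) '("http" "https"))
    (apply orig-fn link args)))

;; Wrap the existing function instead of overriding its body,
;; so ID and file links are still inserted as before.
(advice-add 'org-roam-db-insert-link :around #'my/org-roam-skip-web-links)
```

To go back to the default behaviour, evaluate `(advice-remove 'org-roam-db-insert-link #'my/org-roam-skip-web-links)`. Note that the buffer is still parsed in full, so this only avoids the database inserts, not the parsing time.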