bims: 2022 statement

This is the fourth state of bims statement (sobs). Sobs comes out every year on the 30th of January. It commemorates the first sober meeting between Gavin and Thomas on 30th January 2017. This statement contains a part by Gavin and a part by Thomas.

Gavin writes:

In 2021 Biomed News continued to grow. The number of Bimsers has increased, but not to the number we had hoped for. At the time of writing, we had 84 active reports that contained 5987 issues in total. We define active reports as those that are less than 6 weeks behind the latest issue, so this number can fluctuate as Bimsers catch up. Reports lagging by 6 weeks or more are not shown on our Reports page, but are included again once they catch up. So, if you have got behind, don’t fret: you can catch up and your report will be visible again. Recruiting more Bimsers remains a priority to increase our visibility.
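The visibility rule can be illustrated with a small Python sketch. It assumes that the lag is measured between the date of the last issue a Bimser has completed and the date of the latest released issue. That reading, and every name in the sketch, is an assumption made for illustration, not a description of how the Reports page is actually implemented.

```python
from datetime import date, timedelta

# Six-week window taken from the text; the constant name is hypothetical.
ACTIVE_WINDOW = timedelta(weeks=6)


def is_active(last_done_issue: date, latest_issue: date) -> bool:
    """Return True if the report lags the latest issue by less than six weeks.

    Assumption: lag = date of latest released issue - date of last completed issue.
    """
    return latest_issue - last_done_issue < ACTIVE_WINDOW


# Three weeks behind: still shown.
print(is_active(date(2022, 1, 2), date(2022, 1, 23)))
# Exactly six weeks behind: hidden until the Bimser catches up.
print(is_active(date(2021, 12, 12), date(2022, 1, 23)))
```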

We presented Biomed News at two online events and were nominated for an award. The nomination was for an Open Publishing Award organized by the Coko Foundation. Thank you for taking the time to nominate us. As a result, we presented at the Open Publishing Fest in November. Avinash Mukkala, from the University of Toronto, joined us to provide a testimonial of his experience using Biomed News. Thank you, Avinash! Unfortunately, we did not win the award. In July, I presented at the Mobilizing Biological Computable Knowledge annual meeting. These online events attracted some interest but did not result in much follow-up activity.

There are some interesting developments among other tools that use smart learning for biomedical research literature. One of the largest platforms, Meta, will be discontinued in early 2022. It was run by the Chan Zuckerberg Initiative, which decided to close the platform in order to focus on biomedical research data. Biomed News aims to continue to exist and to provide a free platform for users to discover biomedical literature and share it openly.

As 2022 begins, we hope to recruit more Bimsers and spread the word about Biomed News. Hopefully we will be able to see each other face to face. Wishing you the best for 2022.

Thomas writes:

A lot of technical details have changed. I know this is not obvious because the user interface has stayed the same.

My work has touched three areas. I am reviewing them in reverse chronological order, starting with the most recent development, because it is the most important for the future.

  1. email subscriptions to report issues (one month)
  2. learning and issue release (five months)
  3. work on the issue timing (two months)

(1) email subscriptions to report issues
In December 2021, I started to work on the new items poster, nitpo. This is software to disseminate report issues via email. I had written a funding proposal, and when it was accepted I started work on a project called sebeg. It will implement a substantial part of nitpo. Sebeg’s aim is to have an early version of nitpo ready for full deployment in the second half of 2022. Then readers will be able to subscribe to report issues via email. On the report homepages, we will feature a subscription form. If a reader subscribes to several reports, nitpo will automatically hide duplicate papers. Thus readers will be able to look at several reports without being distracted by overlap between them. In the longer run, I plan to have a system where readers can manage a portfolio of reports that they subscribe to. But this is not in the works for now because I need to have the software ready this year.
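To make the idea of hiding duplicates concrete, here is a minimal Python sketch. It is not nitpo code. It assumes that each report issue is an ordered list of paper identifiers and that a paper is shown only in the first subscribed report that contains it. The function name, the report names and the identifiers are all made up for illustration.

```python
def hide_duplicates(subscribed_issues):
    """Drop papers that already appeared in an earlier report of the same mailing.

    subscribed_issues: list of (report_name, [paper_id, ...]) pairs, in the
    order the reader would see them. Assumption: first occurrence wins.
    """
    seen = set()
    result = []
    for report_name, paper_ids in subscribed_issues:
        fresh = []
        for paper_id in paper_ids:
            if paper_id not in seen:
                seen.add(paper_id)
                fresh.append(paper_id)
        result.append((report_name, fresh))
    return result


# Hypothetical reader subscribed to two overlapping reports.
mailing = hide_duplicates([
    ("report-x", ["paper-1", "paper-2"]),
    ("report-y", ["paper-2", "paper-3"]),
])
print(mailing)
```

In this sketch "paper-2" is kept in report-x and hidden from report-y, so the reader sees each paper once.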

(2) learning and issue release
I have rewritten the entire learning software from scratch. The main problem with the old software was the slow release of report issues. It would take us about 7 minutes of wall-clock time on our server to presort a single report. This would limit the number of reports we could release without getting ridiculously late. The new software is faster at this job. This was the main reason why I wrote it. The new software also features a new approach to presorting the first issue of a report. The new approach requires fewer seed PubMed papers. Previously I recommended ten. Yes, I essentially plucked that number out of thin air. The new approach can run with just one PubMed paper. But that paper needs to be carefully chosen to be exactly on the topic, in the sense that it mentions as much of the topic’s jargon as possible.

I put the new software into production for the 2021-06-20 issue. Around that time I also changed the way the release is produced. I continue to release on Sunday at midnight UTC. But I prepare for the release as soon as I have the PubMed data that is required for it. With the small number of reports we have at this time, I manage to have all report issues already presorted by midnight UTC. Reports are still released in the order in which they appear on the reports page. But now it takes ninety seconds rather than seven hours for the release to complete.

(3) work on issue timing
Since the start of NEP in 1998, issues have, in general, been made on a weekly basis. Since the introduction of the Yanabino protocol, NEP has been able to stick to a strictly weekly regime. The only exceptions are the NEP holidays. Bims inherited this way of working. Once a week, on Sunday morning, it is time for a release. I call that point in time “release time”. Ernad creates an issue. The issue contains all new papers in PubMed. It is released to selectors. Let us call this the strictly weekly release approach. At release time, all report issues are presorted. That means that the issue is sorted for each report, based on the most recent machine learning model available for that report. This presort is only done once. If a selector does several issues of the same report one immediately after the other, say in an effort to catch up, then the presorting of the issue the selector is working on is out of date, because new information about the selector’s preferences is available from the issue that the selector has just completed.

In early 2021 I started to make exceptions to the strictly weekly approach. I call such an exception a “bremse”. There are two bremses, the “Laia bremse” and the “Camila bremse”.

The “Laia bremse” is named after Laia Caja Puigsubira. It says that a report cannot have more than six outstanding report issues. If it has six issues outstanding, no further issues are created at release time. The release time notification informs the selector about the lack of a new issue. When the selector has cleared the backlog completely, new report issues are automatically created, but only up to six report issues at a time.
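The Laia bremse can be sketched in a few lines of Python. This is only an illustration of the rule as stated here, not ernad's actual code. The function names, the constant name and the notion of a count of "withheld" issues are assumptions introduced for the sketch.

```python
# Limit taken from the text; the name is hypothetical.
MAX_OUTSTANDING = 6


def issues_created_at_release(outstanding: int) -> int:
    """At release time, create one new issue unless six are already outstanding."""
    return 0 if outstanding >= MAX_OUTSTANDING else 1


def issues_created_after_backlog_cleared(withheld: int) -> int:
    """Once the backlog is empty, create withheld issues, at most six at a time.

    Assumption: 'withheld' counts the weekly issues that were skipped while
    the report was held back by the bremse.
    """
    return min(withheld, MAX_OUTSTANDING)


print(issues_created_at_release(5))              # a sixth issue is still created
print(issues_created_at_release(6))              # bremse applies, nothing created
print(issues_created_after_backlog_cleared(9))   # only six of the nine for now
```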

The “Camila bremse” is named after Camila Kehl Dias. It says that for reports that are less than six issues old, no new report issue is created unless all previous report issues have been done. You may ask what the reason for this is. Well, for recent reports, the incremental informational value of a new issue is much larger than for older reports. And the time to compute a new model is much shorter for a recent report. Thus a selector only has to wait a few minutes from the creation of one report issue to the next report issue becoming available. Both arguments speak in favour of creating this more restrictive bremse for new reports.
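As a rough Python sketch, the Camila bremse is a check on two counts. Again, this is an illustration under assumptions, not ernad's code: the function and constant names are hypothetical, and a report's age is taken to be the number of issues created for it so far.

```python
# Threshold taken from the text; the name is hypothetical.
YOUNG_REPORT_ISSUES = 6


def camila_allows_new_issue(issues_so_far: int, outstanding: int) -> bool:
    """Return True if the Camila bremse permits creating a new report issue.

    Assumption: issues_so_far is the number of issues already created for the
    report, and outstanding is how many of them the selector has not done yet.
    """
    if issues_so_far < YOUNG_REPORT_ISSUES:
        # Young report: every earlier issue must be done first.
        return outstanding == 0
    # Older report: this bremse no longer applies.
    return True


print(camila_allows_new_issue(issues_so_far=3, outstanding=1))   # held back
print(camila_allows_new_issue(issues_so_far=3, outstanding=0))   # next issue created
print(camila_allows_new_issue(issues_so_far=10, outstanding=2))  # not a young report
```

For reports past the six-issue mark, only the limit of six outstanding issues from the Laia bremse remains in force.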

Implementing the bremses took quite some time, because the strictly weekly approach is deeply ingrained in the way ernad works.

Both write:

We will continue to focus our work on recruitment, on fund-raising to support the system, and on technical improvements to the platform. Once again, we thank you for your support, for the time you spend keeping your reports up to date, and for your comments to improve the system. Wishing everyone a successful 2022.