Our onboarding process, that ensures that packages contributed by the community undergo a transparent, constructive, non adversarial and open review process, involves a lot of work from many actors: authors, reviewers and editors; but how much work? Managing the effort involved in the peer-review process is a major part of ensuring its sustainability and quality. In this post, we’ll take a look at the effort put in by participants in the review process, and also learn something about exploring GitHub data along the way....
On March the 17th I had the honor to give a keynote talk about rOpenSci’s package onboarding system at the satRday conference in Cape Town, entitled “Our package reviews in review: introducing and analyzing rOpenSci onboarding system”. You can watch its recording, skim through the corresponding slides or… read this series! What is rOpenSci onboarding? rOpenSci’s suite of packages is partly contributed by staff members and partly contributed by community members, which means the suite stems from a great diversity of skills and experience of developers....
Our onboarding reviews, that ensure that packages contributed by the community undergo a transparent, constructive, non adversarial and open review process, take place in the issue tracker of a GitHub repository. Development of the packages we onboard also takes place in the open, most often in GitHub repositories. Therefore, when wanting to get data about our onboarding system for giving a data-driven overview, my mission was to extract data from GitHub and git repositories, and to put it into nice rectangles (as defined by Jenny Bryan) ready for analysis....
The Apache Tika parser is like the Babel fish in Douglas Adam’s book, “The Hitchhikers’ Guide to the Galaxy” 1. The Babel fish translates any natural language to any other. Although Tika does not yet translate natural language, it starts to tame the tower of babel of digital document formats. As the Babel fish allowed a person to understand Vogon poetry, Tika allows an analyst to extract text and objects from Microsoft Word....
library(tidyverse) library(monkeylearn) This is a story (mostly) about how I started contributing to the rOpenSci package monkeylearn. I can’t promise any life flipturning upside down, but there will be a small discussion about git best practices which is almost as good 🤓. The tl;dr here is nothing novel but is something I wish I’d experienced firsthand sooner. That is, that tinkering with and improving on the code others have written is more rewarding for you and more valuable to others when you contribute it back to the original source....