Projects
R Packages
- tmuxr. An R package for managing tmux and interacting with the processes it runs. It features a pipeable API with which you can create, control, and capture tmux sessions, windows, and panes.
- rexpect. An R package that allows you to automate interactions with programs that expose a text terminal interface. The API is inspired by the original Expect tool by Don Libes. Programs are optionally run inside a Docker container. Sessions can be recorded using asciinema.
- knitractive. An R package that provides a knitr engine which allows you to simulate interactive sessions (e.g., Python console, Bash shell) across multiple code chunks. Interactive sessions are run inside a tmux session through the tmuxr and rexpect packages.
- raylibr. An R package that wraps Raylib, a simple and easy-to-use library to enjoy videogames programming. Features real-time 2D & 3D graphics, keyboard & mouse interactivity, music, sound effects, and shaders. Presented at NYR Conference 2022.
- rush. Run R expressions, create ggplot2 visualizations, and install R packages directly from the shell.
Python Packages
- scikit-sos. A Python implementation of the Stochastic Outlier Selection (SOS) algorithm. The algorithm is covered in Chapter 4 of my PhD thesis. SOS is also available in the PyOD package.
- sample-stream. Filter lines from standard input according to some probability, with a given delay, and for a certain duration.
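The filtering idea behind sample-stream can be illustrated with a minimal sketch. This is not the package's actual code or command-line interface; the function name `sample_lines` and its parameters are hypothetical and chosen only for illustration. It assumes that each line is kept with a fixed probability, that the delay is applied after each emitted line, and that the duration caps the total running time.

```python
import random
import time
from typing import Iterable, Iterator, Optional


def sample_lines(
    lines: Iterable[str],
    probability: float = 1.0,
    delay: float = 0.0,
    duration: Optional[float] = None,
    rng: Optional[random.Random] = None,
) -> Iterator[str]:
    """Yield each line with the given probability (hypothetical helper).

    delay:    seconds to pause after each emitted line (assumption).
    duration: stop once this many seconds have elapsed (assumption).
    """
    rng = rng or random.Random()
    start = time.monotonic()
    for line in lines:
        # Stop reading once the time budget is used up.
        if duration is not None and time.monotonic() - start >= duration:
            break
        # rng.random() is in [0, 1), so 1.0 keeps everything and 0.0 keeps nothing.
        if rng.random() < probability:
            yield line
            if delay > 0:
                time.sleep(delay)
```

To filter standard input, one could pass `sys.stdin` as `lines` and write each yielded line to standard output.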
Miscellaneous Projects and Scripts
- Data Science Toolbox. A batteries-included Docker image for polyglot data scientists. Based on Packer, Ansible, and Docker. Includes Python, R, many packages, and command-line tools such as jq, xmlstarlet, parallel, and xsv.
- dsutils. A collection of command-line tools for working with data.
- Embrace the Command Line. Archive of my three-week online course Embrace the Command Line.
- tidytree and tidynaivebayes. Understandable but slow implementations in R of a Decision Tree classifier and a Naive Bayes classifier, respectively.
- cache.R. Cache the result of an expression in R. The discussion is at least as interesting as the code itself.
- itermkeymap.py. Generate iTerm Key Mappings with Python. Discussed in Scripting iTerm Key Mappings.
Various Contributions
- pola-rs/polars. Polars is a DataFrame interface on top of an OLAP Query Engine implemented in Rust. I contributed a fix to file globbing.
- pola-rs/tpch. Runs the TPC-standardised benchmark suite to evaluate the performance of Polars, Pandas, Dask, DuckDB, and Spark. I implemented a Python script that creates a dot plot of the results using Plotnine.
- has2k1/plotnine. Plotnine is an implementation of a grammar of graphics in Python based on ggplot2. I made a small fix that allowed the alignment of text to be based on data. This was needed for the blog post Plotnine: Grammar of Graphics for Python.
- wireservice/csvkit. A suite of utilities for converting to and working with CSV, the king of tabular file formats. I extended csvsql such that it can execute SQL queries directly on CSV files. Mentioned in Data Science at the Command Line.
- takluyver/bash_kernel. A Jupyter kernel for Bash. I added the ability to show inline images. More information in the blog post IBash Notebook.
- ohmyzsh/ohmyzsh. Oh My Zsh is an open source, community-driven framework for managing your zsh configuration. I contributed the jump plugin, which allows you to easily jump around the file system.
- r-lib/pkgdown. Easily generate a static website for an R package. I added a fix to ensure that example code inside \dontshow{} is not skipped.
- rstudio/concept-maps. A collection of mental models used in introductory data science lessons. I added a concept map for the pipe operator (%>%) that I created as part of the RStudio Instructor Training.
- hadley/r4ds. Contains the source of the book R for Data Science by Hadley Wickham, Mine Çetinkaya-Rundel, and Garrett Grolemund. I fixed a typo because I was curious to see how the process of editing an open source book works. This minuscule contribution still got me mentioned in the acknowledgments.
- jehiah/json2csv. A command-line tool, written in Go, that converts a stream of newline-separated JSON data to CSV format. I added support for nested fields.
© 2013–2024 Jeroen Janssens