I founded Data Science Workshops a little over 5 years ago. Since then I’ve been able to help hundreds of researchers, developers, and analysts with Python, R, Unix, data science, and machine learning.
It’s been a fantastic ride, but I’m ready to switch gears.
I’m now actively looking for a role as a Machine Learning Engineer or a Developer Relations Engineer. If you have any pointers, I’d very much appreciate it if you dropped me a line. Also, don’t hesitate to reach out if you have any questions about Data Science Workshops or me. Thanks.
August 3, 2022
About
Jeroen Janssens is an independent data science consultant and certified instructor. He enjoys visualizing data, implementing machine learning models, and building solutions using Python, R, JavaScript, and Bash. He’s passionate about helping and teaching others to do such things.
Jeroen runs Data Science Workshops, a training and coaching firm that organizes open enrollment workshops, in-company courses, inspiration sessions, hackathons, and meetups. Clients include Amazon, Apple, eHealth Africa, KPN, Schiphol Airport, The New York Times, and T-Mobile.
Previously, he was an assistant professor at Jheronimus Academy of Data Science and a data scientist at Elsevier in Amsterdam and various startups in New York City. He is the author of Data Science at the Command Line, published by O’Reilly Media. Jeroen holds a PhD in machine learning from Tilburg University and an MSc in artificial intelligence from Maastricht University.
He lives with his wife and two kids in Rotterdam, the Netherlands. For more information about Jeroen’s experience and education, download his CV.
Contact
Jeroen is available to provide consulting and training in the areas of data science, data engineering, and machine learning. He’s also available to speak at private and public events. If you would like to know more about his services, fees, and availability, then please email Jeroen. You can also find him on Twitter, GitHub, and LinkedIn.
Projects
Embrace the Command Line. My three-week cohort-based course. Next cohort starts on September 12, 2022.
Data Science NL. A community with over 3,000 members dedicated to sharing knowledge about the exciting multidisciplinary field of data science.
Data Science Toolbox. A batteries-included Docker image for polyglot data scientists.
raylibr. An R package that wraps Raylib, a simple and easy-to-use library to enjoy videogames programming.
tmuxr. An R package for managing tmux and interacting with the processes it runs.
scikit-sos. A Python implementation of the Stochastic Outlier Selection algorithm.
sample. Filter lines from standard input according to some probability, with a given delay, and for a certain duration.
Talks
R and Raylib: All Fun and Games‽ NYR Conference. New York, NY. June 9, 2022.
How Researchers and Developers Can Benefit from the Command Line. Live online master class. March 8, 2022.
Set Your R Code Free; Turn It Into a Command-Line Tool. NYR Conference. September 9, 2021.
Visualizing High-Dimensional Data with Python. O’Reilly Live Training. August 17, 2020.
Scalable Anomaly Detection With Spark and SOS. Strata Data Conference. New York, NY. September 26, 2019.
50 Reasons to Learn the Shell for Doing Data Science. Strata Data Conference. New York, NY. September 13, 2018.
Data Science with Unix Power Tools. NLUUG Spring Conference. Utrecht, the Netherlands. May 15, 2018.
Everybody Can Knit With {knitractive}. amst-R-dam. Amsterdam, the Netherlands. March 1, 2018.
Create Interactive Maps in Seconds with R and Leaflet. Strata Data Conference. London, UK. May 24, 2017.
The Polyglot Data Scientist. New York Open Statistical Programming Meetup. New York, NY. June 23, 2016.
The Polyglot Data Scientist. Strata + Hadoop World. London, UK. June 2, 2016.
Vowpal Wabbit: The Essence of Speed in Machine Learning. Strata + Hadoop World. San Jose, CA. March 31, 2016.
Poor Man’s Parallel Pipelines. Strata + Hadoop World. London, UK. May 7, 2015.
Data Science Toolbox and the Importance of Reproducible Research. Strata + Hadoop World. Barcelona, Spain. November 20, 2014.
Predicting at the Command Line. 1st International Conference on Predictive APIs and Apps. Barcelona, Spain. November 17, 2014.
Building a Data Science Toolbox. Data Science London Meetup. London, UK. April 10, 2014.
Obtaining, Scrubbing, and Exploring Data at the Command Line. New York Open Statistical Programming Meetup. New York, NY. January 29, 2014.
Sudo Make Me a Visualization! Strata Ignite. New York, NY. October 28, 2013.
Algorithms for Outlier Selection and One-Class Classification. NYC Machine Learning. New York, NY. November 21, 2013.
Publications
Scrape HTML Elements Across Paginated Content in R and Rvest. A useful helper function along with some examples. November 5, 2021.
Data Science at the Command Line, second edition. Foreword by Tim O’Reilly. Published by O’Reilly Media. August 17, 2021.
Data Science from the Shell. The command line is a great environment for inspecting a dataset, automating data science tasks, and more. Hit the ground running with this playlist. June 12, 2020.
Plotnine: Grammar of Graphics for Python. A translation of the visualisation chapters from “R for Data Science” to Python using Plotnine and Pandas. December 11, 2019.
IBash Notebook‽ A Bash kernel for Jupyter Notebook. Now with inline images. February 19, 2015.
Lean, Mean Data Science Machine. A virtual environment that enables you to get up and running quickly. December 7, 2013.
Stochastic Outlier Selection. An algorithm for detecting anomalous patterns. Includes a demo and a Python implementation. November 24, 2013.
7 Command-Line Tools for Data Science. Obtain, scrub, and explore data with jq, json2csv, csvkit, scrape, xml2json, sample, and Rio. September 19, 2013.
Quickly Navigate your Filesystem from the Command Line. Bookmark and jump to important directories using symbolic links. August 16, 2013.
Outlier Selection and One-Class Classification. PhD thesis. Supervised by Eric Postma and Jaap van den Herik. Tilburg University, June 11, 2013.
Ranking Images on Semantic Attributes using Human Computation. Computational Social Science and the Wisdom of Crowds (NIPS 2010). Whistler, Canada. October 8, 2010.
Outlier Detection with One-Class Classifiers from ML and KDD. International Conference on Machine Learning and Applications. Miami, FL. December 13, 2009.
Media Mentions
SDS 531: Data Science at the Command Line. Interview with Jon Krohn for the SuperDataScience Podcast. December 14, 2021.
Five reasons why researchers should learn to love the command line. I spoke with Jeffrey Perkel at Nature about the benefits of the Unix command line. February 2, 2021.
Interview met Transavia, DPD en Data Science Workshops (Dutch). Studio Data at Big Data Expo. Utrecht, the Netherlands. September 19, 2019.
SE-Radio Episode 315: Jeroen Janssens on Tools for Data Science. Interview with Felienne Hermans for Software Engineering Radio. January 23, 2018.
Anomalies, Concerts, and the Command Line. Data Science Weekly interviews Jeroen Janssens. May 18, 2015.