Convert CSV to Vowpal Wabbit’s Input Format

Blog article by Jeroen Janssens.
Mar 29, 2016 • 2 min read.
I’ve created a Python script called csv2vw which, as the name implies, converts CSV data to Vowpal Wabbit’s input format. csv2vw is available on GitHub in my dsutils repository.
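
To give a feel for the target format: a basic Vowpal Wabbit example line is a label, a pipe character, and then space-separated features, where numeric features are written as name:value. The following is a minimal sketch of that conversion for a single CSV row. It is not the code of csv2vw itself; the function name `row_to_vw` and the choice to write non-numeric columns as a single `name=value` indicator feature are assumptions made for illustration, and the sketch assumes column names and values contain no spaces, colons, or pipes.

```python
import csv
import io


def row_to_vw(row, label_column):
    """Turn one CSV row (a dict) into one line of Vowpal Wabbit input.

    Hypothetical helper for illustration; not taken from csv2vw.
    """
    label = row[label_column]
    features = []
    for name, value in row.items():
        if name == label_column:
            continue
        try:
            # Numeric columns become name:value pairs.
            features.append(f"{name}:{float(value):g}")
        except ValueError:
            # Assumption: non-numeric columns become one indicator feature.
            features.append(f"{name}={value}")
    return f"{label} | {' '.join(features)}"


# A tiny in-memory CSV file with a header and one data row.
sample = io.StringIO("sepal_length,sepal_width,species\n5.1,3.5,1\n")
lines = [row_to_vw(row, "species") for row in csv.DictReader(sample)]
print(lines[0])
```

Here the label column is left as is, which corresponds to the simplest of the usage examples in this post.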

Here are some examples to give you an idea of what it can do:
Leave label values as is:
$ csv2vw spam.csv --label target
Relabel values ‘ham’ to 0 and ‘spam’ to 1:
$ csv2vw spam.csv --label target --classes ham,spam
Relabel values ‘ham’ to -1 and ‘spam’ to +1 (needed for logistic loss):
$ csv2vw spam.csv --label target --classes ham,spam --minus-plus-one
Relabel first label value to 0, second to 1, and ignore the rest:
$ csv2vw iris.csv -lspecies --auto-relabel --ignore-extra-classes
Relabel first label value to 1, second to 2, and so on:
$ < iris.csv csv2vw -lspecies --multiclass --auto-relabel
Relabel ‘versicolor’ to 1, ‘virginica’ to 2, and ‘setosa’ to 3:
$ < iris.csv csv2vw -lspecies --multiclass -cversicolor,virginica,setosa
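
The relabeling schemes in these examples boil down to a mapping from class names to numbers. Below is a rough sketch of how such a mapping could be built from an ordered list of classes. The function `build_label_map` and its parameters are hypothetical names that mirror the options above; this is not how csv2vw is necessarily implemented.

```python
def build_label_map(classes, multiclass=False, minus_plus_one=False):
    """Map class names to numeric labels (hypothetical helper).

    multiclass:      first class -> 1, second -> 2, and so on
    minus_plus_one:  first class -> -1, second -> 1
    default:         first class -> 0, second -> 1
    """
    if multiclass:
        return {name: i for i, name in enumerate(classes, start=1)}
    if len(classes) != 2:
        raise ValueError("binary relabeling needs exactly two classes")
    targets = (-1, 1) if minus_plus_one else (0, 1)
    return dict(zip(classes, targets))


print(build_label_map(["ham", "spam"]))
print(build_label_map(["ham", "spam"], minus_plus_one=True))
print(build_label_map(["versicolor", "virginica", "setosa"], multiclass=True))
```

The order of the class list decides which number each class receives, which is why the last example above can put ‘setosa’ last even if it appears first in the data.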
Note that csv2vw does not support namespaces.
— Jeroen
© 2013–2025 Jeroen Janssens