I'm having trouble using factor in ggplot for python. Here is the R code that I want to re-create in python.
library('ggplot2')
head(iris)
ggplot(iris, aes(Sepal.Length,Sepal.Width)) + geom_point(aes(colour=factor(Species)))
In python I do the following:
from ggplot import *
from sklearn.datasets import load_iris
iris = load_iris()
ggplot(iris.data, aes(iris.data[:, x_index], iris.data[:, y_index])) \
+ geom_point(aes(colour = iris.target)) + xlab('x axis') + ylab('y axis')
I keep getting errors. I believe this has to do with the factor part. I can't set to a factor and then allow it to plot. Any help would be appreciated.
I'm not sure what load_iris() produces, but I suspect that it is not a pandas DataFrame. You probably want
iris_df = pandas.DataFrame(iris.data, columns=iris.feature_names); iris_df["flower_types"] = iris.target
Afterwards you should be able to use the column names (iris_df.columns) to specify the aes and colour. You don't need factor if iris_df["flower_types"] is of dtype string: ggplot automatically uses strings as if they were factors (unfortunately you can't reorder them :-()
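To make that concrete, here is a minimal sketch of the DataFrame approach. Some of it is my own assumption rather than anything from the question: the short column names (sepal_length and so on) and the "species" column are made up for illustration, the integer targets are mapped to names through iris.target_names so the column holds strings, and plot_iris is a hypothetical helper. It assumes the yhat ggplot package is installed, and uses the color spelling from that package's examples. I have not checked it against every ggplot version.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()

# Illustrative column names (an assumption); iris.feature_names holds the
# originals, e.g. "sepal length (cm)", which are awkward to type inside aes().
columns = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
iris_df = pd.DataFrame(iris.data, columns=columns)

# Map the integer class codes (0, 1, 2) to species names so the column
# holds strings, which ggplot is said to treat like a factor.
iris_df["species"] = [str(iris.target_names[code]) for code in iris.target]


def plot_iris(df):
    """Hypothetical helper: scatter plot coloured by species.

    Assumes the yhat `ggplot` package is installed; imported lazily so the
    DataFrame code above works without it.
    """
    from ggplot import ggplot, aes, geom_point, xlab, ylab

    return (
        ggplot(aes(x="sepal_length", y="sepal_width", color="species"), data=df)
        + geom_point()
        + xlab("x axis")
        + ylab("y axis")
    )
```

With ggplot installed, print(plot_iris(iris_df)) should render the figure. The point is that aes gets column names as strings and the data comes from the DataFrame, instead of passing numpy slices straight into aes.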