8. Load Data into H2O JVM
1. Iris dataset
a. 150 rows x 5 columns
b. Sepal Width, Sepal Length, Petal Width, Petal Length, and Species (Virginica, Setosa, Versicolor)
2. Methods
a. h2o.upload_file
b. h2o.import_frame
c. h2o.H2OFrame
9. Load Data into H2O JVM
my laptop: /Users/ece/0xdata/h2o-dev/smalldata/iris/iris.csv
Python
>>> import h2o
>>> h2o.init()
>>> iris_H2OFrame = h2o.upload_file("/Users/ece/0xdata/h2o-dev/smalldata/iris/iris.csv")
>>> iris_H2OFrame = h2o.import_frame("/Users/ece/0xdata/h2o-dev/smalldata/iris/iris.csv")
H2O JVM
ip=localhost, port=54321
10. Load Data into H2O JVM
my laptop: /Users/ece/0xdata/h2o-dev/smalldata/iris/iris.csv
Python
>>> import h2o
>>> h2o.init(ip="172.16.2.181", port=54321)
>>>
>>> # upload_file: the path is on the Python client (my laptop)
>>> iris_H2OFrame = h2o.upload_file("/Users/ece/0xdata/h2o-dev/smalldata/iris/iris.csv")
>>>
>>> # import_frame: the path is on the machine running the H2O JVM (server room)
>>> iris_H2OFrame = h2o.import_frame("/home/eric/iris.csv")
H2O JVM
ip=172.16.2.181, port=54321
server room: /home/eric/iris.csv
13. Model-Building
1. H2O K-means
a. h2o_model = h2o.kmeans(x=iris_H2OFrame[:,0:4], k=3)
b. h2o_model.centers()
2. Scikit Learn
a. from sklearn.cluster import KMeans
b. sk_model = KMeans(n_clusters=3)
c. sk_model.fit(iris_DataFrame.iloc[:,0:4])
d. sk_model.cluster_centers_
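Both calls above fit k cluster centers to the numeric columns. As a rough illustration of what "centers" means, here is a minimal pure-Python sketch of the standard Lloyd iteration (assign each point to its nearest center, then move each center to the mean of its points). It is not the H2O or scikit-learn implementation. The function names, the toy points, and the farthest-first initialization are assumptions chosen to keep the example deterministic; both libraries use their own initialization schemes.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_centers(points, k):
    """Hypothetical deterministic init: start at the first point, then
    repeatedly add the point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centers))
        )
    return centers


def kmeans(points, k, max_iterations=100):
    """Lloyd's algorithm: alternate assignment and mean-update steps."""
    centers = farthest_first_centers(points, k)
    for _ in range(max_iterations):
        # Assignment step: group each point with its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its members
        # (an empty cluster keeps its previous center).
        new_centers = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:  # no center moved, so we have converged
            break
        centers = new_centers
    return centers


# Toy data (not Iris): three well-separated groups of 2-D points.
toy_points = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.2),
    (5.0, 5.0), (5.2, 4.8), (4.8, 5.2),
    (9.0, 1.0), (9.2, 0.8), (8.8, 1.2),
]
centers = sorted(kmeans(toy_points, k=3))
print(centers)  # approximately [(1.0, 1.0), (5.0, 5.0), (9.0, 1.0)]
```

The centers are sorted before printing because cluster order is arbitrary. The same caveat applies when comparing h2o_model.centers() with sk_model.cluster_centers_: the rows may describe the same clusters in a different order.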