K-means in Spark realization task plan
K-means in Spark realization task plan
K-means algorithm is provided by Spark
(in examples/src/main/scala/org/apache/spark/examples/mllib/KMeanExample.scala)
DRAM: 192GB (12x 16GB DDR4 2666)
Intel Optane DC Persistent Memory: 1.4TB (12x 126GB)
- Optane configuration:
(1) AppDirect Mode
Optane configuration:
ipmctl create -goal PersistentMemoryType=AppDirectNotInterleaved
Expose persistent memory to application:
ipmctl create -namesapce (Fsdax mode)
mkdir /mnt/pmemdir
mkfs.ext4 /dev/pmem{X.Y}
mount -o dax /dev/pmem{X.Y} /mnt/pmemdir
(2) DRAM-Only
Optane configuration: ipmctl create -goal MemoryMode=100
- Spark configuration
Workload:
Data Size:
K-means(Iteration times):
K-means(K value):