K-means in Spark realization task plan

K-means algorithm is provided by Spark

(in examples/src/main/scala/org/apache/spark/examples/mllib/KMeanExample.scala)

DRAM: 192GB (12x 16GB DDR4 2666)

Intel Optane DC Persistent Memory: 1.4TB (12x 126GB)

  1. Optane configuration:

(1) AppDirect Mode

Optane configuration:

ipmctl create -goal PersistentMemoryType=AppDirectNotInterleaved

Expose persistent memory to application:

ipmctl create -namesapce (Fsdax mode)

mkdir /mnt/pmemdir

mkfs.ext4 /dev/pmem{X.Y}

mount -o dax /dev/pmem{X.Y} /mnt/pmemdir

(2) DRAM-Only

Optane configuration: ipmctl create -goal MemoryMode=100

  1. Spark configuration

Workload:

Data Size:

K-means(Iteration times):

K-means(K value):