I have a few questions about a Lustre filesystem setup: 1 MDS/MDT on the same machine, 12 OSTs configured with LVM across 2 OSSs in total, and 4 Lustre clients running Hadoop (1 namenode and 3 datanodes). Hadoop uses Lustre instead of HDFS.

Question: I created the OSTs on LVM volumes instead of directly on physical hard disks. How will this affect my wordcount example running on 1 namenode and 3 datanodes? Say the wordcount takes 30 minutes to finish on 18 GB of plain data: would using physical hard disks directly reduce that time?

Question: I would like to use another dataset, such as the Wikipedia dump, instead of the simple wordcount input. How should I put http://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2 into Lustre?
With HDFS, I simply loaded the file with the -copyFromLocal command. Please suggest the equivalent for Lustre.
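
For reference, the HDFS load step I mean looks roughly like the sketch below. The function name `load_into_hdfs` and the `HADOOP_BIN` variable are just illustrative names I made up (the variable lets the command be dry-run with `echo`), and the paths in the usage line are placeholders, not my real layout.

```shell
#!/bin/sh
# Sketch of the HDFS load step, not a Lustre solution.
# HADOOP_BIN is a hypothetical override; it defaults to the "hadoop" launcher.
HADOOP_BIN="${HADOOP_BIN:-hadoop}"

# Copy one local file into an HDFS directory using -copyFromLocal.
load_into_hdfs() {
    local_src="$1"   # file on the local disk (placeholder path)
    hdfs_dest="$2"   # destination directory in HDFS (placeholder path)
    "$HADOOP_BIN" fs -copyFromLocal "$local_src" "$hdfs_dest"
}
```

Usage would be something like `load_into_hdfs /tmp/enwiki-latest-pages-articles.xml.bz2 /user/hadoop/input` after downloading the dump to the local disk. What I am looking for is the corresponding step when Lustre replaces HDFS.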