To put it simply, here is my finding:
I ran wordcount on a small input (a few MB) with Hadoop running over
Lustre under several conditions, and in every condition the result was
the same (the job completed in about 23 seconds). It looks as though
striping over a single OST is no different from striping over all 6
OSTs on each OSS, at least at this input size. A quick way to
double-check the stripe layout being compared is included just below.
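For reference, one way to confirm the layouts being compared (a minimal
sketch; it assumes /mnt/lustre is the Lustre mount and /mnt/lustre/ebook
holds the wordcount input, as in the runs below):
[code]
# Default striping that new files created in the input directory inherit
lfs getstripe -d /mnt/lustre/ebook

# Actual layout of the existing input files: stripe count and the OSTs
# that hold their objects
lfs getstripe /mnt/lustre/ebook
[/code]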
Condition 1: 1 MDS, 2 OSS/OST, 1 NameNode, 1 DataNode, stripe count 1
(only 1 OST used)
[root@lustreclient1 hadoop]# bin/hadoop jar hadoop-examples-1.1.1.jar
wordcount /mnt/lustre/ebook /mnt/lustre/result1-ebook
13/03/26 12:14:12 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/03/26 12:14:13 INFO input.FileInputFormat: Total input paths to process : 2
13/03/26 12:14:13 WARN snappy.LoadSnappy: Snappy native library not loaded
13/03/26 12:14:13 INFO mapred.JobClient: Running job: job_201303261206_0003
13/03/26 12:14:14 INFO mapred.JobClient: map 0% reduce 0%
13/03/26 12:14:25 INFO mapred.JobClient: map 50% reduce 0%
13/03/26 12:14:26 INFO mapred.JobClient: map 100% reduce 0%
13/03/26 12:14:33 INFO mapred.JobClient: map 100% reduce 33%
13/03/26 12:14:35 INFO mapred.JobClient: map 100% reduce 100%
13/03/26 12:14:37 INFO mapred.JobClient: Job complete: job_201303261206_0003
13/03/26 12:14:37 INFO mapred.JobClient: Counters: 27
13/03/26 12:14:37 INFO mapred.JobClient: Job Counters
13/03/26 12:14:37 INFO mapred.JobClient: Launched reduce tasks=1
13/03/26 12:14:37 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=13455
13/03/26 12:14:37 INFO mapred.JobClient: Total time spent by all
reduces waiting after reserving slots (ms)=0
13/03/26 12:14:37 INFO mapred.JobClient: Total time spent by all
maps waiting after reserving slots (ms)=0
13/03/26 12:14:37 INFO mapred.JobClient: Rack-local map tasks=2
13/03/26 12:14:37 INFO mapred.JobClient: Launched map tasks=2
13/03/26 12:14:37 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=10339
13/03/26 12:14:37 INFO mapred.JobClient: File Output Format Counters
13/03/26 12:14:37 INFO mapred.JobClient: Bytes Written=474862
13/03/26 12:14:37 INFO mapred.JobClient: FileSystemCounters
13/03/26 12:14:37 INFO mapred.JobClient: FILE_BYTES_READ=2787742
13/03/26 12:14:37 INFO mapred.JobClient: FILE_BYTES_WRITTEN=1999630
13/03/26 12:14:37 INFO mapred.JobClient: File Input Format Counters
13/03/26 12:14:37 INFO mapred.JobClient: Bytes Read=2053173
13/03/26 12:14:37 INFO mapred.JobClient: Map-Reduce Framework
13/03/26 12:14:37 INFO mapred.JobClient: Map output materialized
bytes=734165
13/03/26 12:14:37 INFO mapred.JobClient: Map input records=44865
13/03/26 12:14:37 INFO mapred.JobClient: Reduce shuffle bytes=734165
13/03/26 12:14:37 INFO mapred.JobClient: Spilled Records=102042
13/03/26 12:14:37 INFO mapred.JobClient: Map output bytes=3473556
13/03/26 12:14:37 INFO mapred.JobClient: CPU time spent (ms)=10320
13/03/26 12:14:37 INFO mapred.JobClient: Total committed heap
usage (bytes)=563216384
13/03/26 12:14:37 INFO mapred.JobClient: Combine input records=361111
13/03/26 12:14:37 INFO mapred.JobClient: SPLIT_RAW_BYTES=186
13/03/26 12:14:37 INFO mapred.JobClient: Reduce input records=51021
13/03/26 12:14:37 INFO mapred.JobClient: Reduce input groups=44464
13/03/26 12:14:37 INFO mapred.JobClient: Combine output records=51021
13/03/26 12:14:37 INFO mapred.JobClient: Physical memory (bytes)
snapshot=555302912
13/03/26 12:14:37 INFO mapred.JobClient: Reduce output records=44464
13/03/26 12:14:37 INFO mapred.JobClient: Virtual memory (bytes)
snapshot=2588442624
13/03/26 12:14:37 INFO mapred.JobClient: Map output records=361111
Condition 2: 1 MDS, 2 OSS/OST, 1 NameNode, 2 DataNodes
13/03/26 12:11:00 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/03/26 12:11:01 INFO input.FileInputFormat: Total input paths to process : 2
13/03/26 12:11:01 WARN snappy.LoadSnappy: Snappy native library not loaded
13/03/26 12:11:01 INFO mapred.JobClient: Running job: job_201303261206_0002
13/03/26 12:11:02 INFO mapred.JobClient: map 0% reduce 0%
13/03/26 12:11:12 INFO mapred.JobClient: map 50% reduce 0%
13/03/26 12:11:14 INFO mapred.JobClient: map 100% reduce 0%
13/03/26 12:11:20 INFO mapred.JobClient: map 100% reduce 33%
13/03/26 12:11:23 INFO mapred.JobClient: map 100% reduce 100%
13/03/26 12:11:24 INFO mapred.JobClient: Job complete: job_201303261206_0002
13/03/26 12:11:24 INFO mapred.JobClient: Counters: 27
13/03/26 12:11:24 INFO mapred.JobClient: Job Counters
13/03/26 12:11:24 INFO mapred.JobClient: Launched reduce tasks=1
13/03/26 12:11:24 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=13627
13/03/26 12:11:24 INFO mapred.JobClient: Total time spent by all
reduces waiting after reserving slots (ms)=0
13/03/26 12:11:24 INFO mapred.JobClient: Total time spent by all
maps waiting after reserving slots (ms)=0
13/03/26 12:11:24 INFO mapred.JobClient: Rack-local map tasks=2
13/03/26 12:11:24 INFO mapred.JobClient: Launched map tasks=2
13/03/26 12:11:24 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=10191
13/03/26 12:11:24 INFO mapred.JobClient: File Output Format Counters
13/03/26 12:11:24 INFO mapred.JobClient: Bytes Written=474862
13/03/26 12:11:24 INFO mapred.JobClient: FileSystemCounters
13/03/26 12:11:24 INFO mapred.JobClient: FILE_BYTES_READ=2787742
13/03/26 12:11:24 INFO mapred.JobClient: FILE_BYTES_WRITTEN=1999627
13/03/26 12:11:24 INFO mapred.JobClient: File Input Format Counters
13/03/26 12:11:24 INFO mapred.JobClient: Bytes Read=2053173
13/03/26 12:11:24 INFO mapred.JobClient: Map-Reduce Framework
13/03/26 12:11:24 INFO mapred.JobClient: Map output materialized
bytes=734165
13/03/26 12:11:24 INFO mapred.JobClient: Map input records=44865
13/03/26 12:11:24 INFO mapred.JobClient: Reduce shuffle bytes=734165
13/03/26 12:11:24 INFO mapred.JobClient: Spilled Records=102042
13/03/26 12:11:24 INFO mapred.JobClient: Map output bytes=3473556
13/03/26 12:11:24 INFO mapred.JobClient: CPU time spent (ms)=11050
13/03/26 12:11:24 INFO mapred.JobClient: Total committed heap
usage (bytes)=563281920
13/03/26 12:11:24 INFO mapred.JobClient: Combine input records=361111
13/03/26 12:11:24 INFO mapred.JobClient: SPLIT_RAW_BYTES=186
13/03/26 12:11:24 INFO mapred.JobClient: Reduce input records=51021
13/03/26 12:11:24 INFO mapred.JobClient: Reduce input groups=44464
13/03/26 12:11:24 INFO mapred.JobClient: Combine output records=51021
13/03/26 12:11:24 INFO mapred.JobClient: Physical memory (bytes)
snapshot=551596032
13/03/26 12:11:24 INFO mapred.JobClient: Reduce output records=44464
13/03/26 12:11:24 INFO mapred.JobClient: Virtual memory (bytes)
snapshot=2572816384
13/03/26 12:11:24 INFO mapred.JobClient: Map output records=361111
Condition 3: 1 MDS, 2 OSS/OST, 1 NameNode, 2 DataNodes, stripe count -1
(all OSTs used)
[root@lustreclient1 hadoop]# bin/hadoop jar hadoop-examples-1.1.1.jar
wordcount /mnt/lustre/ebook /mnt/lustre/result-stripAllOSS-1M2Sl
13/03/26 14:55:03 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/03/26 14:55:03 INFO input.FileInputFormat: Total input paths to process : 2
13/03/26 14:55:03 WARN snappy.LoadSnappy: Snappy native library not loaded
13/03/26 14:55:03 INFO mapred.JobClient: Running job: job_201303261454_0001
13/03/26 14:55:04 INFO mapred.JobClient: map 0% reduce 0%
13/03/26 14:55:16 INFO mapred.JobClient: map 50% reduce 0%
13/03/26 14:55:18 INFO mapred.JobClient: map 100% reduce 0%
13/03/26 14:55:24 INFO mapred.JobClient: map 100% reduce 33%
13/03/26 14:55:27 INFO mapred.JobClient: map 100% reduce 100%
13/03/26 14:55:28 INFO mapred.JobClient: Job complete: job_201303261454_0001
13/03/26 14:55:28 INFO mapred.JobClient: Counters: 27
13/03/26 14:55:28 INFO mapred.JobClient: Job Counters
13/03/26 14:55:28 INFO mapred.JobClient: Launched reduce tasks=1
13/03/26 14:55:28 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=13908
13/03/26 14:55:28 INFO mapred.JobClient: Total time spent by all
reduces waiting after reserving slots (ms)=0
13/03/26 14:55:28 INFO mapred.JobClient: Total time spent by all
maps waiting after reserving slots (ms)=0
13/03/26 14:55:28 INFO mapred.JobClient: Rack-local map tasks=2
13/03/26 14:55:28 INFO mapred.JobClient: Launched map tasks=2
13/03/26 14:55:28 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=10760
13/03/26 14:55:28 INFO mapred.JobClient: File Output Format Counters
13/03/26 14:55:28 INFO mapred.JobClient: Bytes Written=474862
13/03/26 14:55:28 INFO mapred.JobClient: FileSystemCounters
13/03/26 14:55:28 INFO mapred.JobClient: FILE_BYTES_READ=2787742
13/03/26 14:55:28 INFO mapred.JobClient: FILE_BYTES_WRITTEN=1999663
13/03/26 14:55:28 INFO mapred.JobClient: File Input Format Counters
13/03/26 14:55:28 INFO mapred.JobClient: Bytes Read=2053173
13/03/26 14:55:28 INFO mapred.JobClient: Map-Reduce Framework
13/03/26 14:55:28 INFO mapred.JobClient: Map output materialized
bytes=734165
13/03/26 14:55:28 INFO mapred.JobClient: Map input records=44865
13/03/26 14:55:28 INFO mapred.JobClient: Reduce shuffle bytes=734165
13/03/26 14:55:28 INFO mapred.JobClient: Spilled Records=102042
13/03/26 14:55:28 INFO mapred.JobClient: Map output bytes=3473556
13/03/26 14:55:28 INFO mapred.JobClient: CPU time spent (ms)=11120
13/03/26 14:55:28 INFO mapred.JobClient: Total committed heap
usage (bytes)=552206336
13/03/26 14:55:28 INFO mapred.JobClient: Combine input records=361111
13/03/26 14:55:28 INFO mapred.JobClient: SPLIT_RAW_BYTES=186
13/03/26 14:55:28 INFO mapred.JobClient: Reduce input records=51021
13/03/26 14:55:28 INFO mapred.JobClient: Reduce input groups=44464
13/03/26 14:55:28 INFO mapred.JobClient: Combine output records=51021
13/03/26 14:55:28 INFO mapred.JobClient: Physical memory (bytes)
snapshot=551116800
13/03/26 14:55:28 INFO mapred.JobClient: Reduce output records=44464
13/03/26 14:55:28 INFO mapred.JobClient: Virtual memory (bytes)
snapshot=2570797056
13/03/26 14:55:28 INFO mapred.JobClient: Map output records=361111
[root@lustreclient1 hadoop]#
Condition 4: 1 MDS, 2 OSS/OST, 1 NameNode, 1 DataNode, stripe count -1
(all OSTs used)
[root@lustreclient1 hadoop]# bin/hadoop jar hadoop-examples-1.1.1.jar
wordcount /mnt/lustre/ebook /mnt/lustre/result-stripAllOSS-1M1S
13/03/26 14:51:21 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/03/26 14:51:22 INFO input.FileInputFormat: Total input paths to process : 2
13/03/26 14:51:22 WARN snappy.LoadSnappy: Snappy native library not loaded
13/03/26 14:51:22 INFO mapred.JobClient: Running job: job_201303261449_0001
13/03/26 14:51:23 INFO mapred.JobClient: map 0% reduce 0%
13/03/26 14:51:33 INFO mapred.JobClient: map 50% reduce 0%
13/03/26 14:51:36 INFO mapred.JobClient: map 100% reduce 0%
13/03/26 14:51:41 INFO mapred.JobClient: map 100% reduce 33%
13/03/26 14:51:44 INFO mapred.JobClient: map 100% reduce 100%
13/03/26 14:51:46 INFO mapred.JobClient: Job complete: job_201303261449_0001
13/03/26 14:51:46 INFO mapred.JobClient: Counters: 27
13/03/26 14:51:46 INFO mapred.JobClient: Job Counters
13/03/26 14:51:46 INFO mapred.JobClient: Launched reduce tasks=1
13/03/26 14:51:46 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=13223
13/03/26 14:51:46 INFO mapred.JobClient: Total time spent by all
reduces waiting after reserving slots (ms)=0
13/03/26 14:51:46 INFO mapred.JobClient: Total time spent by all
maps waiting after reserving slots (ms)=0
13/03/26 14:51:46 INFO mapred.JobClient: Rack-local map tasks=2
13/03/26 14:51:46 INFO mapred.JobClient: Launched map tasks=2
13/03/26 14:51:46 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=10408
13/03/26 14:51:46 INFO mapred.JobClient: File Output Format Counters
13/03/26 14:51:46 INFO mapred.JobClient: Bytes Written=474862
13/03/26 14:51:46 INFO mapred.JobClient: FileSystemCounters
13/03/26 14:51:46 INFO mapred.JobClient: FILE_BYTES_READ=2787742
13/03/26 14:51:46 INFO mapred.JobClient: FILE_BYTES_WRITTEN=1999660
13/03/26 14:51:46 INFO mapred.JobClient: File Input Format Counters
13/03/26 14:51:46 INFO mapred.JobClient: Bytes Read=2053173
13/03/26 14:51:46 INFO mapred.JobClient: Map-Reduce Framework
13/03/26 14:51:46 INFO mapred.JobClient: Map output materialized
bytes=734165
13/03/26 14:51:46 INFO mapred.JobClient: Map input records=44865
13/03/26 14:51:46 INFO mapred.JobClient: Reduce shuffle bytes=734165
13/03/26 14:51:46 INFO mapred.JobClient: Spilled Records=102042
13/03/26 14:51:46 INFO mapred.JobClient: Map output bytes=3473556
13/03/26 14:51:46 INFO mapred.JobClient: CPU time spent (ms)=10080
13/03/26 14:51:46 INFO mapred.JobClient: Total committed heap
usage (bytes)=549650432
13/03/26 14:51:46 INFO mapred.JobClient: Combine input records=361111
13/03/26 14:51:46 INFO mapred.JobClient: SPLIT_RAW_BYTES=186
13/03/26 14:51:46 INFO mapred.JobClient: Reduce input records=51021
13/03/26 14:51:46 INFO mapred.JobClient: Reduce input groups=44464
13/03/26 14:51:46 INFO mapred.JobClient: Combine output records=51021
13/03/26 14:51:46 INFO mapred.JobClient: Physical memory (bytes)
snapshot=555511808
13/03/26 14:51:46 INFO mapred.JobClient: Reduce output records=44464
13/03/26 14:51:46 INFO mapred.JobClient: Virtual memory (bytes)
snapshot=2567835648
13/03/26 14:51:46 INFO mapred.JobClient: Map output records=361111
http://paste.ubuntu.com/5648898/
My question is: am I seeing identical results simply because the input
is too small for Lustre striping to make any measurable difference in
performance?
Please suggest.
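One follow-up I intend to try, sketched roughly below (the directory
names are only examples; it assumes the ebook files sit in
/mnt/lustre/ebook), is to build a much larger input so that any striping
effect has a chance to show up:
[code]
# New input directory whose files will be striped across all OSTs
mkdir /mnt/lustre/ebook-big
lfs setstripe -c -1 /mnt/lustre/ebook-big

# Replicate the existing ebooks until the input is a GB or more
for i in $(seq 1 500); do
    cat /mnt/lustre/ebook/* >> /mnt/lustre/ebook-big/input.txt
done

# Re-run wordcount against the larger input
bin/hadoop jar hadoop-examples-1.1.1.jar wordcount \
    /mnt/lustre/ebook-big /mnt/lustre/result-big
[/code]
If the run times then start to differ between stripe count 1 and -1, it
would confirm that the MB-sized test was simply too small to exercise
the OSTs.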
On Tue, Mar 26, 2013 at 11:33 AM, linux freaker <linuxfreaker(a)gmail.com> wrote:
Hi,
I added a new Lustre client (lustreclient3) and tried to run the same
wordcount example. It is stuck at map 100% and reduce 17% and is not
proceeding any further. I just wanted to see whether the total time
would come down if I added one more node (is that expected?).
To reduce the total map and reduce time, should I add an OSS or another
Lustre client? Please confirm.
Here is the output while I run adding one more lustre client:
Hadoop job_201303261112_0001 on lustreclient1
User: root
Job Name: word count
Job File:
file:/mnt/lustre/hadoop_tmp/lustrecient1/mapred/staging/root/.staging/job_201303261112_0001/job.xml
Submit Host: lustreclient1
Submit Host Address: 10.94.214.188
Job-ACLs: All users are allowed
Job Setup: Successful
Status: Running
Started at: Tue Mar 26 11:14:59 IST 2013
Running for: 4mins, 15sec
Job Cleanup: Pending
Kind     % Complete   Num Tasks   Pending   Running   Complete   Killed   Failed/Killed Task Attempts
map      100.00%      17          0         0         17         0        0 / 0
reduce   17.64%       1           0         1         0          0        0 / 0
Counter Map Reduce Total
Job Counters SLOTS_MILLIS_MAPS 0 0 76,817
Launched reduce tasks 0 0 1
Rack-local map tasks 0 0 17
Launched map tasks 0 0 17
File Input Format Counters Bytes Read 18,546,894 0 18,546,894
FileSystemCounters FILE_BYTES_READ 18,566,121 0 18,566,121
FILE_BYTES_WRITTEN 6,859,813 18,799 6,878,612
Map-Reduce Framework Map output materialized bytes 6,539,696 0 6,539,696
Map input records 410,391 0 410,391
Reduce shuffle bytes 0 3,603,036 3,603,036
Spilled Records 456,036 0 456,036
Map output bytes 31,477,059 0 31,477,059
CPU time spent (ms) 60,550 6,410 66,960
Total committed heap usage (bytes) 3,018,391,552 35,848,192 3,054,239,744
Combine input records 3,281,716 0 3,281,716
SPLIT_RAW_BYTES 1,612 0 1,612
Reduce input records 0 0 0
Reduce input groups 0 0 0
Combine output records 456,036 0 456,036
Physical memory (bytes) snapshot 3,665,850,368 101,208,064 3,767,058,432
Reduce output records 0 0 0
Virtual memory (bytes) snapshot 14,500,483,072 872,202,240 15,372,685,312
Map output records 3,281,716 0 3,281,716
The issue is that reduce is stuck at 17% while map is at 100%.
Any idea what could be causing this?
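For reference, here is what I plan to check first on my side, as a rough
sketch (it assumes the stock Hadoop 1.1.1 layout with logs under
$HADOOP_HOME/logs, and my hostnames lustreclient1/2/3), since in my
understanding a reduce that stalls early in the shuffle is often a
tasktracker connectivity or hostname-resolution problem rather than a
Lustre one:
[code]
# Confirm the expected daemons are running on every node
jps        # JobTracker (+ TaskTracker) on the master, TaskTracker on each client

# Make sure all nodes resolve each other's hostnames the same way
cat /etc/hosts
ping -c 1 lustreclient2
ping -c 1 lustreclient3

# Watch the tasktracker log on the node running the stuck reduce attempt
tail -f logs/hadoop-*-tasktracker-*.log

# List the job state from the command line
bin/hadoop job -list
[/code]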
On Mon, Mar 25, 2013 at 8:28 PM, Diep, Minh <minh.diep(a)intel.com> wrote:
> That's great!
>
> Since you have input and output from /mnt/lustre (assuming that's lustre
> mount point). At the very least you are using lustre on data. If you have
> set hadoop.tmp.dir on /mnt/lustre/..., then you are using all lustre for
> hadoop.
>
> As for striping, if you want to use all OSTs, then do lfs setstripe -c -1
> /mnt/lustre. You need to remove all directories under /mnt/lustre and
> restart hadoop.
>
> Thanks
> -Minh
>
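My reading of the striping step above, as a rough sequence to try on my
side (the paths assume hadoop.tmp.dir for the two clients lives under
/mnt/lustre/hadoop_tmp as in the configs below, and the input has to be
copied in again afterwards since existing files keep their old layout;
/tmp/ebooks is just where my local copy happens to be):
[code]
bin/stop-mapred.sh                      # stop MapReduce first
lfs setstripe -c -1 /mnt/lustre         # new files now stripe across all OSTs
rm -rf /mnt/lustre/hadoop_tmp /mnt/lustre/ebook
mkdir -p /mnt/lustre/hadoop_tmp/lustrecient1 /mnt/lustre/hadoop_tmp/lustrecient2
cp -r /tmp/ebooks /mnt/lustre/ebook     # re-copy the input so it picks up the new striping
bin/start-mapred.sh
[/code]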
> On 3/24/13 11:33 AM, "linux freaker" <linuxfreaker(a)gmail.com> wrote:
>
>>Wow !!! I am able to run hadoop over lustre.
>>I started tasktracker at datanode.
>>This is the complete output now:
>>
>># bin/hadoop jar hadoop-examples-1.1.1.jar wordcount
>>
>>/mnt/lustre/ebook /mnt/lustre/ebook-result12
>>13/03/21 06:49:00 INFO util.NativeCodeLoader: Loaded the native-hadoop
>>library
>>13/03/21 06:49:00 INFO input.FileInputFormat: Total input paths to
>>process : 17
>>13/03/21 06:49:00 WARN snappy.LoadSnappy: Snappy native library not loaded
>>13/03/21 06:49:00 INFO mapred.JobClient: Running job:
>>job_201303191951_0004
>>13/03/21 06:49:01 INFO mapred.JobClient: map 0% reduce 0%
>>13/03/21 06:49:41 INFO mapred.JobClient: map 5% reduce 0%
>>13/03/21 06:49:43 INFO mapred.JobClient: map 11% reduce 0%
>>13/03/21 06:49:46 INFO mapred.JobClient: map 17% reduce 0%
>>13/03/21 06:49:48 INFO mapred.JobClient: map 23% reduce 0%
>>13/03/21 06:49:50 INFO mapred.JobClient: map 23% reduce 7%
>>13/03/21 06:49:51 INFO mapred.JobClient: map 29% reduce 7%
>>13/03/21 06:49:53 INFO mapred.JobClient: map 35% reduce 7%
>>13/03/21 06:49:56 INFO mapred.JobClient: map 41% reduce 7%
>>13/03/21 06:49:58 INFO mapred.JobClient: map 47% reduce 7%
>>13/03/21 06:49:59 INFO mapred.JobClient: map 47% reduce 11%
>>13/03/21 06:50:00 INFO mapred.JobClient: map 52% reduce 11%
>>13/03/21 06:50:02 INFO mapred.JobClient: map 58% reduce 15%
>>13/03/21 06:50:04 INFO mapred.JobClient: map 64% reduce 15%
>>13/03/21 06:50:05 INFO mapred.JobClient: map 64% reduce 19%
>>13/03/21 06:50:06 INFO mapred.JobClient: map 70% reduce 19%
>>13/03/21 06:50:08 INFO mapred.JobClient: map 76% reduce 19%
>>13/03/21 06:50:10 INFO mapred.JobClient: map 82% reduce 19%
>>13/03/21 06:50:12 INFO mapred.JobClient: map 88% reduce 19%
>>13/03/21 06:50:14 INFO mapred.JobClient: map 94% reduce 19%
>>13/03/21 06:50:15 INFO mapred.JobClient: map 94% reduce 29%
>>13/03/21 06:50:16 INFO mapred.JobClient: map 100% reduce 29%
>>13/03/21 06:50:23 INFO mapred.JobClient: map 100% reduce 100%
>>13/03/21 06:50:25 INFO mapred.JobClient: Job complete:
>>job_201303191951_0004
>>13/03/21 06:50:25 INFO mapred.JobClient: Counters: 27
>>13/03/21 06:50:25 INFO mapred.JobClient: Job Counters
>>13/03/21 06:50:25 INFO mapred.JobClient: Launched reduce tasks=1
>>13/03/21 06:50:25 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=79405
>>13/03/21 06:50:25 INFO mapred.JobClient: Total time spent by all
>>reduces waiting after reserving slots (ms)=0
>>13/03/21 06:50:25 INFO mapred.JobClient: Total time spent by all
>>maps waiting after reserving slots (ms)=0
>>13/03/21 06:50:25 INFO mapred.JobClient: Rack-local map tasks=17
>>13/03/21 06:50:25 INFO mapred.JobClient: Launched map tasks=17
>>13/03/21 06:50:25 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=41959
>>13/03/21 06:50:25 INFO mapred.JobClient: File Output Format Counters
>>13/03/21 06:50:25 INFO mapred.JobClient: Bytes Written=511158
>>13/03/21 06:50:25 INFO mapred.JobClient: FileSystemCounters
>>13/03/21 06:50:25 INFO mapred.JobClient: FILE_BYTES_READ=25105721
>>13/03/21 06:50:25 INFO mapred.JobClient: FILE_BYTES_WRITTEN=13929334
>>13/03/21 06:50:25 INFO mapred.JobClient: File Input Format Counters
>>13/03/21 06:50:25 INFO mapred.JobClient: Bytes Read=18546894
>>13/03/21 06:50:25 INFO mapred.JobClient: Map-Reduce Framework
>>13/03/21 06:50:25 INFO mapred.JobClient: Map output materialized
>>bytes=6539696
>>13/03/21 06:50:25 INFO mapred.JobClient: Map input records=410391
>>13/03/21 06:50:25 INFO mapred.JobClient: Reduce shuffle bytes=6539696
>>13/03/21 06:50:25 INFO mapred.JobClient: Spilled Records=912072
>>13/03/21 06:50:25 INFO mapred.JobClient: Map output bytes=31477059
>>13/03/21 06:50:25 INFO mapred.JobClient: CPU time spent (ms)=70850
>>13/03/21 06:50:25 INFO mapred.JobClient: Total committed heap
>>usage (bytes)=2643394560
>>13/03/21 06:50:25 INFO mapred.JobClient: Combine input records=3281716
>>13/03/21 06:50:25 INFO mapred.JobClient: SPLIT_RAW_BYTES=1612
>>13/03/21 06:50:25 INFO mapred.JobClient: Reduce input records=456036
>>13/03/21 06:50:25 INFO mapred.JobClient: Reduce input groups=44464
>>13/03/21 06:50:25 INFO mapred.JobClient: Combine output records=456036
>>13/03/21 06:50:25 INFO mapred.JobClient: Physical memory (bytes)
>>snapshot=3475795968
>>13/03/21 06:50:25 INFO mapred.JobClient: Reduce output records=44464
>>13/03/21 06:50:25 INFO mapred.JobClient: Virtual memory (bytes)
>>snapshot=15414870016
>>13/03/21 06:50:25 INFO mapred.JobClient: Map output records=3281716
>>
>>I wonder if the output really confirms it is running on Lustre and not
>>HDFS (just to be specific).
>>
>>Also, I am trying to see how I can reduce the total execution time.
>>Will I need to do anything with data striping?
>>
>>
>>On Sat, Mar 23, 2013 at 10:44 PM, linux freaker <linuxfreaker(a)gmail.com>
>>wrote:
>>> You said that I just need to start the mapred service on the master. When
>>> I run the jps command I get the following output:
>>>
>>> [root@lustreclient1 hadoop]# jps
>>> 18295 Jps
>>> 15561 JobTracker
>>>
>>> What about lustreclient2 (the datanode)? Do I need to run:
>>> bin/start-all.sh
>>>
>>> As of now, when I run jps it shows no output, which means nothing is
>>> running on the datanode end.
>>>
>>> Please suggest.
>>>
>>> On 3/20/13, linux freaker <linuxfreaker(a)gmail.com> wrote:
>>>> I am going to test it today.
>>>> Thanks for confirming.
>>>>
>>>> On 3/20/13, Diep, Minh <minh.diep(a)intel.com> wrote:
>>>>> It would be /mnt/lustre/<some data dir>
>>>>>
>>>>> On 3/19/13 9:48 AM, "linux freaker" <linuxfreaker(a)gmail.com> wrote:
>>>>>
>>>>>>Thanks.
>>>>>>One more query - I can see that when we run the wordcount example in the
>>>>>>case of HDFS, we simply used to run the command as:
>>>>>>
>>>>>>
>>>>>>$bin/hadoop jar hadoop*examples*.jar wordcount /user/hduser/ebooks
>>>>>>/user/hduser/ebooks-output
>>>>>>
>>>>>>Before which we used to copy Local data into HDFS as shown:
>>>>>>
>>>>>>bin/hadoop dfs -copyFromLocal /tmp/ebooks /user/hduser/ebooks.
>>>>>>
>>>>>>If I want to run the wordcount example in the case of Lustre, what would
>>>>>>be the right approach?
>>>>>>
>>>>>>Please suggest.
>>>>>>
>>>>>>On Tue, Mar 19, 2013 at 9:40 PM, Diep, Minh <minh.diep(a)intel.com> wrote:
>>>>>>> Yes, that is correct.
>>>>>>>
>>>>>>> On 3/19/13 9:05 AM, "linux freaker" <linuxfreaker(a)gmail.com> wrote:
>>>>>>>
>>>>>>>>Hi,
>>>>>>>>
>>>>>>>>Thanks for the quick response.
>>>>>>>>All I understand is:
>>>>>>>>
>>>>>>>>
>>>>>>>>Master Node (NameNode)
>>>>>>>>=====================
>>>>>>>>
>>>>>>>>File: conf/core-site.xml
>>>>>>>>
>>>>>>>><property>
>>>>>>>><name>fs.default.name</name>
>>>>>>>><value>file:///</value>
>>>>>>>></property>
>>>>>>>><property>
>>>>>>>>
>>>>>>>><name>fs.file.impl</name>
>>>>>>>><value>org.apache.hadoop.fs.LocalFileSystem</value>
>>>>>>>>
>>>>>>>><name>hadoop.tmp.dir</name>
>>>>>>>><value>/mnt/lustre/hadoop_tmp/lustrecient1</value>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>File: mapred-site.xml
>>>>>>>>
>>>>>>>>
>>>>>>>><name>mapred.job.tracker</name>
>>>>>>>><value>lustreclient1:9101</value>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>Slave Nodes(DataNodes)
>>>>>>>>======================
>>>>>>>>
>>>>>>>>File: conf/core-site.xml
>>>>>>>>
>>>>>>>><property>
>>>>>>>><name>fs.default.name</name>
>>>>>>>><value>file:///</value>
>>>>>>>></property>
>>>>>>>><property>
>>>>>>>>
>>>>>>>><name>fs.file.impl</name>
>>>>>>>><value>org.apache.hadoop.fs.LocalFileSystem</value>
>>>>>>>>
>>>>>>>><name>hadoop.tmp.dir</name>
>>>>>>>><value>/mnt/lustre/hadoop_tmp/lustrecient2</value>
>>>>>>>>
>>>>>>>>File:mapred-site.xml
>>>>>>>>
>>>>>>>>In mapred-site.xml:
>>>>>>>><name>mapred.job.tracker</name>
>>>>>>>><value>lustreclient1:9101</value> <== Is it correct?
>>>>>>>>
>>>>>>>>Please confirm if the entry is correct?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>On Tue, Mar 19, 2013 at 8:30 PM, Diep, Minh <minh.diep(a)intel.com> wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I would suggest you set this instead.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> <name>fs.default.name</name>
>>>>>>>>> <value>file:///</value>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> <name>fs.file.impl</name>
>>>>>>>>> <value>org.apache.hadoop.fs.LocalFileSystem</value>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> We set different paths to hadoop.tmp.dir on every node since they are
>>>>>>>>> sharing the same space.
>>>>>>>>> On master
>>>>>>>>> <name>hadoop.tmp.dir</name>
>>>>>>>>> <value>/mnt/lustre/hadoop_tmp/lustrecient1</value>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On slave
>>>>>>>>> <value>/mnt/lustre/hadoop_tmp/lustrecient2</value>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> In mapred-site.xml:
>>>>>>>>> <name>mapred.job.tracker</name>
>>>>>>>>> <value>client1:9101</value>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On master, don't start hdfs since you are using lustre. Start mapred only.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> HTH
>>>>>>>>> -Minh
>>>>>>>>>
>>>>>>>>> On 3/19/13 4:48 AM, "linux freaker" <linuxfreaker(a)gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>>Hello,
>>>>>>>>>>
>>>>>>>>>>I am on the verge of setting up Hadoop over Lustre (replacing HDFS).
>>>>>>>>>>I have 1 MDS, 2 OSS/OST and 2 Lustre Client.
>>>>>>>>>>My MDS shows:
>>>>>>>>>>
>>>>>>>>>>[code]
>>>>>>>>>>[root@MDS ~]# lctl list_nids
>>>>>>>>>>10.84.214.185@tcp
>>>>>>>>>>[/code]
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>Lustre Client shows:
>>>>>>>>>>[code]
>>>>>>>>>>[root@lustreclient1 ~]# lfs df -h
>>>>>>>>>>UUID                   bytes        Used   Available  Use%  Mounted on
>>>>>>>>>>lustre-MDT0000_UUID     4.5G      274.3M        3.9G    6%  /mnt/lustre[MDT:0]
>>>>>>>>>>lustre-OST0000_UUID     5.9G      276.1M        5.3G    5%  /mnt/lustre[OST:0]
>>>>>>>>>>lustre-OST0001_UUID     5.9G      276.1M        5.3G    5%  /mnt/lustre[OST:1]
>>>>>>>>>>lustre-OST0002_UUID     5.9G      276.1M        5.3G    5%  /mnt/lustre[OST:2]
>>>>>>>>>>lustre-OST0003_UUID     5.9G      276.1M        5.3G    5%  /mnt/lustre[OST:3]
>>>>>>>>>>lustre-OST0004_UUID     5.9G      276.1M        5.3G    5%  /mnt/lustre[OST:4]
>>>>>>>>>>lustre-OST0005_UUID     5.9G      276.1M        5.3G    5%  /mnt/lustre[OST:5]
>>>>>>>>>>lustre-OST0006_UUID     5.9G      276.1M        5.3G    5%  /mnt/lustre[OST:6]
>>>>>>>>>>lustre-OST0007_UUID     5.9G      276.1M        5.3G    5%  /mnt/lustre[OST:7]
>>>>>>>>>>lustre-OST0008_UUID     5.9G      276.1M        5.3G    5%  /mnt/lustre[OST:8]
>>>>>>>>>>lustre-OST0009_UUID     5.9G      276.1M        5.3G    5%  /mnt/lustre[OST:9]
>>>>>>>>>>lustre-OST000a_UUID     5.9G      276.1M        5.3G    5%  /mnt/lustre[OST:10]
>>>>>>>>>>lustre-OST000b_UUID     5.9G      276.1M        5.3G    5%  /mnt/lustre[OST:11]
>>>>>>>>>>
>>>>>>>>>>filesystem summary:    70.9G        3.2G       64.0G    5%  /mnt/lustre
>>>>>>>>>>[/code]
>>>>>>>>>>
>>>>>>>>>>Now I have installed Hadoop on the two Lustre clients (leaving the MDS
>>>>>>>>>>and OSS untouched).
>>>>>>>>>>
>>>>>>>>>>My core-site.xml shows:
>>>>>>>>>>
>>>>>>>>>>[code]
>>>>>>>>>><property>
>>>>>>>>>><name>fs.default.name</name>
>>>>>>>>>><value>file:///mnt/lustre</value>
>>>>>>>>>></property>
>>>>>>>>>><property>
>>>>>>>>>><name>mapred.system.dir</name>
>>>>>>>>>><value>${fs.default.name}/hadoop_tmp/mapred/system</value>
>>>>>>>>>><description>The shared directory where MapReduce stores control
>>>>>>>>>>files.</description>
>>>>>>>>>></property>
>>>>>>>>>>[/code]
>>>>>>>>>>
>>>>>>>>>>My conf/masters shows
>>>>>>>>>>[code]
>>>>>>>>>>lustreclient1
>>>>>>>>>>[/code]
>>>>>>>>>>
>>>>>>>>>>My conf/slaves shows:
>>>>>>>>>>
>>>>>>>>>>[code]
>>>>>>>>>>lustreclient1
>>>>>>>>>>lustreclient2
>>>>>>>>>>[/code]
>>>>>>>>>>
>>>>>>>>>>I have no idea if I need any further configuration file changes.
>>>>>>>>>>
>>>>>>>>>>Do I need just the above configuration?
>>>>>>>>>>What about hdfs-site.xml and mapred-site.xml?
>>>>>>>>>>_______________________________________________
>>>>>>>>>>HPDD-discuss mailing list
>>>>>>>>>>HPDD-discuss(a)lists.01.org
>>>>>>>>>>https://lists.01.org/mailman/listinfo/hpdd-discuss
>>>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>
>