Wednesday, March 28, 2012

mapper.py and reducer.py: Python code for Hadoop streaming

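The code is shown in the image below: mapper.py is on the left and reducer.py is on the right.
My data file contains: 你好我是中文中文
On my own machine, running cat data|python mapper.py|sort|python reducer.py works without problems,
but the same scripts fail when run under Hadoop.
The hadoop command I ran: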
hadoop jar /usr/lib/hadoop/contrib/streaming/hadoop-streaming-0.20.2-cdh3u3.jar -file mapper.py -mapper mapper.py \
  -file reducer.py -reducer reducer.py -input /user/stayhigh/ -output $4

The error output is shown below:

12/03/18 10:56:50 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
packageJobJar: [mapper.py, reducer.py, /var/lib/hadoop-0.20/cache/stayhigh/hadoop-unjar8981423723230921443/] [] /tmp/streamjob2748145161211328089.jar tmpDir=null
12/03/18 10:56:50 WARN snappy.LoadSnappy: Snappy native library is available
12/03/18 10:56:50 INFO util.NativeCodeLoader: Loaded the native-hadoop library
12/03/18 10:56:50 INFO snappy.LoadSnappy: Snappy native library loaded
12/03/18 10:56:50 INFO mapred.FileInputFormat: Total input paths to process : 1
12/03/18 10:56:51 INFO streaming.StreamJob: getLocalDirs(): [/var/lib/hadoop-0.20/cache/stayhigh/mapred/local]
12/03/18 10:56:51 INFO streaming.StreamJob: Running job: job_201203121725_0761
12/03/18 10:56:51 INFO streaming.StreamJob: To kill this job, run:
12/03/18 10:56:51 INFO streaming.StreamJob: /usr/lib/hadoop-0.20/bin/hadoop job  -Dmapred.job.tracker=192.168.11.100:8021 -kill job_201203121725_0761
12/03/18 10:56:51 INFO streaming.StreamJob: Tracking URL: http://hadoop:50030/jobdetails.jsp?jobid=job_201203121725_0761
12/03/18 10:56:52 INFO streaming.StreamJob:  map 0%  reduce 0%
12/03/18 10:56:54 INFO streaming.StreamJob:  map 50%  reduce 0%
12/03/18 10:57:02 INFO streaming.StreamJob:  map 50%  reduce 17%
12/03/18 10:57:12 INFO streaming.StreamJob:  map 100%  reduce 100%
12/03/18 10:57:12 INFO streaming.StreamJob: To kill this job, run:
12/03/18 10:57:12 INFO streaming.StreamJob: /usr/lib/hadoop-0.20/bin/hadoop job  -Dmapred.job.tracker=192.168.11.100:8021 -kill job_201203121725_0761
12/03/18 10:57:12 INFO streaming.StreamJob: Tracking URL: http://hadoop:50030/jobdetails.jsp?jobid=job_201203121725_0761
12/03/18 10:57:12 ERROR streaming.StreamJob: Job not successful. Error: NA
12/03/18 10:57:12 INFO streaming.StreamJob: killJob...
Streaming Command Failed!
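
The original mapper.py and reducer.py exist only in the image, so the following is a hypothetical sketch rather than the code that failed. It assumes a per-character count and reproduces the local `cat data | mapper | sort | reducer` pipeline inside one Python 3 process, using the tab-separated key/value records that Hadoop streaming passes between stages. The function names `map_chars`, `reduce_counts`, and `run_local` are made up for illustration.

```python
from itertools import groupby


def map_chars(lines):
    """Mapper step: emit one 'char<TAB>1' record per non-space character."""
    for line in lines:
        for ch in line.strip():
            if not ch.isspace():
                yield "%s\t1" % ch


def reduce_counts(sorted_records):
    """Reducer step: sum the counts of adjacent records that share a key."""
    pairs = (record.split("\t", 1) for record in sorted_records)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield key, sum(int(value) for _, value in group)


def run_local(lines):
    """In-process stand-in for `cat data | mapper | sort | reducer`."""
    return dict(reduce_counts(sorted(map_chars(lines))))


counts = run_local(["你好我是中文中文"])
print(counts["中"], counts["文"], counts["你"])  # prints: 2 2 1
```

The log above only reports "Error: NA", so the real cause is not visible here. When a script works locally but fails under streaming, things worth checking include whether the script has a shebang line and execute permission (or is invoked as `-mapper "python mapper.py"`), and whether the Python on the cluster nodes handles the UTF-8 input the same way as the local one.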

Wednesday, March 21, 2012

Differences between bound method, classmethod, and staticmethod

                  bound method     classmethod     staticmethod
via class name    Foo.method(ff)   Foo.method()    Foo.method()
via instance      ff.method()      ff.method()     ff.method()


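Differences between classmethod and staticmethod:
classmethod:  the first parameter is bound to the class object.
              It is conventionally named cls and can be used to access class variables.
staticmethod: no first parameter is bound, so it receives neither the class object nor the instance.

For a more detailed comparison, see this article: http://caterpillar.onlyfun.net/Gossip/Python/StaticClassMethod.html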
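
As a small illustration of the point about cls and class variables, here is a sketch with a hypothetical `Counter` class (not from the original post). The classmethod reads a class variable through cls, while the staticmethod receives no implicit argument at all.

```python
class Counter(object):
    created = 0  # class variable shared by all instances

    def __init__(self):
        Counter.created += 1

    @classmethod
    def how_many(cls):
        # cls is bound automatically to the class, so class variables are reachable
        return cls.created

    @staticmethod
    def describe():
        # no automatic first argument: neither the class nor an instance is passed in
        return "counts instances"


Counter()
Counter()
print(Counter.how_many())  # prints: 2
print(Counter.describe())  # prints: counts instances
```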

ff = Foo()
ff is the instance
Foo is the class name



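# -*- coding: utf-8 -*-
# classmethod_and_staticmethod.py


class Foo(object):
    def test(self):  # bound method: receives the instance as self
        print("object")

    @classmethod
    def test2(cls):  # class method: receives the class as cls
        print("class")

    @staticmethod
    def test3():  # static method: receives no implicit first argument
        print("static")


ff = Foo()  # Foo is the class name, ff is the instance name

# Call a bound method => ClassName.method(instance) or instance.method()
Foo.test(ff)
ff.test()
# Call a classmethod => ClassName.method() or instance.method()
Foo.test2()
ff.test2()
# Call a staticmethod => ClassName.method() or instance.method()
Foo.test3()
ff.test3()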



# If a subclass of Foo overrides the parent's classmethod, staticmethod, or
# bound method, the subclass's version is the one that gets called, and the
# classmethod receives the subclass's class object.
class Foo2(Foo):
    def test(self):
        print("boundmethod overwrite")

    @classmethod
    def test2(clz):
        print(clz)
        print("classmethod overwrite")

    @staticmethod
    def test3():
        print("staticmethod overwrite")


foo2 = Foo2()
foo2.test()   # shows "boundmethod overwrite"
foo2.test2()  # shows the Foo2 class object, then "classmethod overwrite"
foo2.test3()  # shows "staticmethod overwrite"
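
The subclass's class object is passed to a classmethod even when the subclass does not override it. A minimal sketch, using hypothetical `Base` and `Child` classes that are not part of the original post:

```python
class Base(object):
    @classmethod
    def name(cls):
        # cls is whichever class the method was called through
        return cls.__name__

    @classmethod
    def make(cls):
        # calling cls() builds an instance of the calling class, not always Base
        return cls()


class Child(Base):
    pass  # inherits both classmethods without overriding them


print(Base.name())                  # prints: Base
print(Child.name())                 # prints: Child
print(type(Child.make()).__name__)  # prints: Child
```

This is why classmethods are often used as alternative constructors: the inherited method still creates the right subclass.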

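Python decorators: treating functions as objects

A Python decorator is a function that modifies another function (it should not be confused with a descriptor, which is a separate language feature).
When we have a function and want to change what it does, the most direct way is to edit the function's implementation.
But doing that is often a lot of work, which is where the idea of a decorator comes from.
If we treat the function as an object, we can pass it into a decorator, which wraps it and returns a new function, so the behavior changes without touching the original body.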




decorator: a function that modifies a function




def decof1(inputf):
    # A decorator must return a function; this one wraps inputf's result.
    return lambda: str(inputf()) + " :decorated"


@decof1
def f1():
    return 1


def f2():  # not decorated, for comparison
    return 2


# The @decof1 line is equivalent to the statement 'f1 = decof1(f1)'
print(f1())  # prints "1 :decorated"
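
The lambda-based decorator only works for functions that take no arguments, and the wrapped function loses its original name. A slightly more general sketch, assuming the same " :decorated" suffix, forwards any arguments and uses `functools.wraps` from the standard library to keep the function's metadata. The names `tagged` and `add` are made up for illustration.

```python
import functools


def tagged(func):
    @functools.wraps(func)  # copy __name__ and __doc__ from func onto wrapper
    def wrapper(*args, **kwargs):
        # forward every argument to the original function, then decorate the result
        return str(func(*args, **kwargs)) + " :decorated"
    return wrapper


@tagged
def add(a, b):
    """Return a + b."""
    return a + b


print(add(1, 2))     # prints: 3 :decorated
print(add.__name__)  # prints: add
```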