Thursday, December 29, 2011

The Python PIL module

Image processing with the Python Imaging Library

author: Yung-Yu Chen (yungyuc) http://blog.seety.org/everydaywork/ <yyc@seety.org>
copyright: Copyright 2006, all rights reserved

1   Processing images and graphics data

Last time we discussed writing graphical user interface programs. This time we turn to processing graphics (images) themselves, and the discussion will focus on the Python Imaging Library (PIL).
PIL is the best-known image processing package for Python. It is made up of many different modules and offers a wide range of processing features, letting us manipulate images from simple Python programs. Using a library like PIL helps us concentrate on the image processing task itself instead of getting lost in the underlying algorithms.
Because image processing involves a large amount of numerical computation, many PIL modules are written in C for better performance. As users, of course, we do not have to worry about that; we can simply use them.

1.1   What PIL can do for you

PIL provides (but is not limited to) the following capabilities:
  • Reading and writing dozens of image file formats. Common formats such as JPEG, PNG, BMP, GIF and TIFF are all supported. PIL also handles bilevel (black and white), grayscale, palette-based, RGB true color, RGB true color with transparency, CMYK and several other image modes. The coverage is quite complete.
  • Basic image operations: cropping, shifting, rotating, resizing, transposing, cut and paste, and so on.
  • Image enhancement: brightness, color, contrast and sharpness.
  • Color processing.
  • A dozen or so filters. This is of course nowhere near what professional effects packages such as Photoshop® or GIMP® offer, but PIL's filters can be used inside Python programs, which makes batch processing possible.
  • Drawing on images: points, lines, surfaces, geometric shapes, fills, text and so on.
Next, we will work through Python/PIL image processing step by step.

2   Converting image file formats

There are plenty of image processing programs on the market, and the task people most often use them for is probably converting between file formats; it is the most basic feature of image processing software, and of course PIL supports it too.
Suppose we have a JPEG file named sample01.jpg. The following code loads it into Python:
>>> import Image
>>> im = Image.open( "sample01.jpg" )
The im object produced by Image.open() is an Image object. We can use its attributes to query information about the file:
>>> print im.format, im.size, im.mode
JPEG (2288, 1712) RGB
The format string is stored in the format attribute, the dimensions in the size attribute, and the (palette) mode in the mode attribute. From the output above we can see that we have indeed read a JPEG file, 2288 pixels wide and 1712 pixels high, in RGB true color mode.
Now that the image has been loaded into Python, processing it is easy. Using the save() method of the Image class, we can write the file out in any format PIL supports:
>>> im.save( "fileout.png" )
If the image is large, this takes a little while. The Image.save() method determines the output format automatically from the file extension (the open() function we used a moment ago does the same).
save() can also be told the format explicitly. In the following example we set the output format to JPEG:
>>> im.save( "fileout.png", "JPEG" )
In that case the file extension does not matter.
When only one or two files need to be processed, the Python interpreter is perfectly adequate. But to process a large batch of files, say converting a whole directory of JPEG files to PNG, it is more convenient to write a script, for example:
#!/usr/bin/env python

from glob import glob
from os.path import splitext
import Image

jpglist = glob( "python_imaging_pix/*.[jJ][pP][gG]" )

for jpg in jpglist:
    im = Image.open(jpg)
    png = splitext(jpg)[0]+".png"
    im.save(png)
    print png

Run this script in a directory that holds *.jpg or *.JPG files and it converts every JPEG file into a PNG file:
$ ./convertdir.py
file0001.png
file0002.png
.
.
file9999.png
Since PIL detects common file formats from the file name, we usually do not specify the format when saving.
However, save() offers different optional parameters depending on the file format. For JPEG it accepts quality (an integer from 1 to 100, default 75), optimize (a boolean) and progression (a boolean). In the following example we save the JPEG file with a quality of 100:
>>> im.save( "quality100.jpg", quality=100 )
Tip
PIL can also write EPS (Encapsulated PostScript). TeX users can use PIL to easily convert images to EPS for the TeX compiler.
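For example, such a conversion might look like the following minimal sketch (figure.jpg is just a placeholder file name; any RGB or grayscale image PIL can open would do):
>>> im = Image.open( "figure.jpg" )
>>> im.save( "figure.eps" )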

3   Resizing images and making thumbnails

Now that we understand basic format conversion, let us look at how to change the dimensions of an image. PIL provides the resize() method on Image objects to scale an image. Using our sample01.jpg file as an example:
>>> im = Image.open( "sample01.jpg" )
>>> print im.size
(2288, 1712)
>>> width = 400
>>> ratio = float(width)/im.size[0]
>>> height = int(im.size[1]*ratio)
>>> nim = im.resize( (width, height), Image.BILINEAR )
>>> print nim.size
(400, 299)
>>> nim.save( "resized.jpg" )
Then we get a smaller resized.jpg:
python_imaging_pix/resized.jpg
The resize() method returns a new Image object, so the old Image is left untouched. resize() takes two arguments: the first specifies the new size as a two-element tuple giving the width and the height of the image; the second, which may be omitted, selects the interpolation used for the resampling. The default is Image.NEAREST (nearest neighbour); here we ask for the better-quality Image.BILINEAR.
resize() can both enlarge and shrink an image, and both the width and the height must always be passed in. The code above first fixes the width of the new image, then computes the new height from the aspect ratio of the original, and finally passes both values to resize(). From this it follows that resize() allows non-proportional scaling:
>>> width = 400
>>> height = 100
>>> nim2 = im.resize( (width, height), Image.BILINEAR )
>>> nim2.save( "resize2wide.jpg" )
and we get an oddly shaped thumbnail:
python_imaging_pix/resize2wide.jpg
We are free to choose any dimensions for the new image.
Another common operation is rotation; the rotate() method rotates an image. It takes two arguments: the first is an angle in degrees, counter-clockwise, and the second, which may be omitted, is again the interpolation method:
>>> nim3 = nim.rotate( 45, Image.BILINEAR )
>>> nim3.save( "rotated.jpg" )
rotate() does not change the dimensions of the image, so you will see that:
python_imaging_pix/rotated.jpg
black borders appear. If we want the dimensions to change along with the rotation, we have to use the transpose() method instead:
>>> nim4 = nim.transpose( Image.ROTATE_90 )
>>> nim4.save( "transposed90.jpg" )
The result is:
python_imaging_pix/transposed90.jpg
transpose() accepts one of five arguments: Image.FLIP_LEFT_RIGHT, Image.FLIP_TOP_BOTTOM, Image.ROTATE_90, Image.ROTATE_180 and Image.ROTATE_270; the last three rotate counter-clockwise. rotate() interpolates the pixel data, whereas transpose() merely rearranges it, so it takes no interpolation argument and does not degrade image quality.
Scaling and rotation are the two most common operations, and of the two, making thumbnails is probably the most frequent job; PIL provides a convenient thumbnail() method for exactly that. thumbnail() modifies the Image object in place, so it can be faster than resize() and uses less memory. It takes no interpolation argument and can only shrink an image, never enlarge it. It is used like this:
>>> im = Image.open( "sample01.jpg" )
>>> im.thumbnail( (400,100) )
>>> im.save( "thumbnail.jpg" )
>>> print im.size
(133, 100)
thumbnail() interprets the size argument differently from resize(): resize() lets us scale non-proportionally, whereas thumbnail() only shrinks proportionally, fitting the image within the given bounds (the more restrictive dimension wins). That is why the code above turns thumbnail.jpg into a small 133*100 image:
python_imaging_pix/thumbnail.jpg
With these operations we can easily carry out image management tasks.

4   Modifying image content

Besides changing an image's dimensions, PIL also gives us the ability to change the image content itself. That way we are not limited to managing images; we can go one step further and use programs to turn the content of an image into whatever we want.
We start with pasting:
>>> baseim = Image.open( "resized.jpg" )
>>> floatim = Image.open( "thumbnail.jpg" )
>>> baseim.paste( floatim, (150, 50) )
>>> baseim.save( "pasted.jpg" )
Here the paste() method pastes the thumbnail.jpg we made earlier into resized.jpg:
python_imaging_pix/pasted.jpg
Used this way, paste() requires two arguments: first the Image to paste, second the position to paste it at. The position can be specified in three ways:
  • None: no position or size is given, and paste() assumes the pasted Image has exactly the same size as the target Image.
  • (left, upper): a two-element tuple. paste() aligns the upper-left corner of the pasted Image with the given position.
  • (left, upper, right, lower): a four-element tuple. paste() aligns the upper-left corner as well as the lower-right corner. In practice this is the same as the previous form, because paste() requires the pasted image to match the given box exactly, so there is only one possible (right, lower). A sketch of this form follows below.
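As a minimal sketch of the four-element form, reusing the baseim and floatim objects opened above (pasted_box.jpg is just an illustrative output name):
>>> w, h = floatim.size
>>> baseim.paste( floatim, (150, 50, 150 + w, 50 + h) )
>>> baseim.save( "pasted_box.jpg" )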
Besides pasting, we can also crop the content of an image:
>>> im = Image.open( "sample01.jpg" )
>>> nim = im.crop( (700, 300, 1500, 1300) )
>>> nim.thumbnail( (400,400) )
>>> nim.save( "croped.jpg" )
(Because the cropped image is still a bit large, we thumbnail it once more.) The result is:
python_imaging_pix/croped.jpg
The box argument to crop() specifies the left, upper, right and lower bounds of the rectangle to cut out.
Beyond cutting and pasting, PIL can also apply built-in filters for special effects. The filters live in the ImageFilter module, which must be imported before use:
>>> import ImageFilter
As an example, let us apply the BLUR filter 20 times to the "No Riding" sign we just cropped, to see what a PIL filter does:
>>> im = Image.open( "croped.jpg" )
>>> nim = im
>>> for i in range(20): nim = nim.filter( ImageFilter.BLUR )
...
>>> nim.save( "blured.jpg" )
You probably cannot tell it says "No Riding" any more:
python_imaging_pix/blured.jpg
The basic syntax for applying a filter is:
newim = im.filter( ImageFilter.FILTERNAME )
where FILTERNAME is one of the filters PIL currently supports: BLUR, CONTOUR, DETAIL, EDGE_ENHANCE, EDGE_ENHANCE_MORE, EMBOSS, FIND_EDGES, SMOOTH, SMOOTH_MORE and SHARPEN. We will not walk through them one by one, but I encourage you to try each of them yourself; one quick way to do that is sketched below.
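As an illustrative sketch (the output file names here are made up), the following applies every filter once to croped.jpg and saves one image per filter:
>>> import Image, ImageFilter
>>> im = Image.open( "croped.jpg" )
>>> for name in ( "BLUR", "CONTOUR", "DETAIL", "EDGE_ENHANCE", "EDGE_ENHANCE_MORE",
...               "EMBOSS", "FIND_EDGES", "SMOOTH", "SMOOTH_MORE", "SHARPEN" ):
...     im.filter( getattr(ImageFilter, name) ).save( "filtered_%s.jpg" % name.lower() )
...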
Filters let us apply the same effect to a whole class of images. Of course, image effects usually need very fine tuning, and automated processing can only achieve rough results; but since PIL provides them, our automated pipelines have a few more tools at their disposal.

5   Creating new images with PIL

Besides editing existing images, creating new images from scratch is also an important task. PIL's ImageDraw module gives us the ability to draw image content. Before using ImageDraw, we first create a blank image:
>>> import ImageDraw
>>> im = Image.new( "RGB", (400,300) )
>>> draw = ImageDraw.Draw( im )
The draw object created at the end is an ImageDraw object that provides the various drawing methods. For geometric shapes, the draw object offers arc() (arcs), chord() (chords), line() (line segments), ellipse() (ellipses), point() (points), rectangle() (rectangles) and polygon() (polygons). We are not going to discuss geometric drawing in detail; these methods should be intuitive to most readers (a small sketch follows the tip below).
Tip
You can run pydoc ImageDraw.ImageDraw.<<methodname>> at the command line to look up the documentation of any of the methods above (<<methodname>>), for example pydoc ImageDraw.ImageDraw.line.
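For completeness, a minimal sketch that draws a few shapes with the default pen on the blank image created above could look like this (geometry.jpg is just an illustrative output name):
>>> draw.rectangle( (50, 50, 350, 250) )
>>> draw.ellipse( (100, 100, 300, 200) )
>>> draw.line( (0, 0, 399, 299) )
>>> im.save( "geometry.jpg" )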
What we want to introduce here is not geometric shapes but text drawing. We need one more module, ImageFont, and an example shows how to "write" with PIL:
>>> import Image, ImageDraw, ImageFont
>>> font = ImageFont.truetype( \
... "/usr/share/fonts/truetype/freefont/FreeMono.ttf", 24 )
>>> im = Image.new( "RGB", (400,300) )
>>> draw = ImageDraw.Draw( im )
>>> draw.text( (20,20), "TEXT", font=font )
>>> im.save( "text.jpg" )
This writes "TEXT" in large white letters on a black background:
python_imaging_pix/text.jpg
Let us go through what we just did. First we used ImageFont's truetype() function to create a TrueType font at 24 points; the first argument of truetype() is the path of the font file, and the second is the point size. Then we created the image object and the draw object as before. The text itself is drawn with the draw object's text() method, which takes two arguments, the upper-left corner of the text and the string, plus an optional font argument to select the font (if it is omitted, the default font is used).
Before version 1.1.4, PIL could only use bitmap fonts. The addition of TrueType vector font support is great news for anyone who wants to "write". With a bitmap font, changing the text size means switching to a different font file, but TrueType has no such limitation. If we want two strings in different sizes, this is all it takes:
>>> largefont = ImageFont.truetype( \
... "/usr/share/fonts/truetype/freefont/FreeMono.ttf", 48 )
>>> smallfont = ImageFont.truetype( \
... "/usr/share/fonts/truetype/freefont/FreeMono.ttf", 24 )
>>> im = Image.new( "RBG", (400,300) )
>>> draw = ImageDraw.Draw( im )
>>> draw.text( (20,20), "SmallTEXT", font=smallfont )
>>> draw.text( (20,120), "LargeTEXT", font=largefont )
>>> im.save( "multitext.jpg" )
The result:
python_imaging_pix/multitext.jpg
That is how text images are created in PIL.
Finally, let us see how to change the colour used when drawing shapes (or text); the pen colour is changed through the ink attribute of the draw object:
>>> draw.ink = 0 + 255*256 + 0*256*256
This sets the pen to green. The ink value must be an integer computed from the RGB components of the colour. A few more examples of ink values:
  • for red, the ink value is 255(R) + 0(G)*256 + 0(B)*256*256
  • for blue, the ink value is 0(R) + 0(G)*256 + 255(B)*256*256
  • for cyan, the ink value is 0(R) + 255(G)*256 + 255(B)*256*256
The ink you set affects all subsequent drawing operations.
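If you set ink values often, a tiny helper such as the hypothetical rgb_to_ink() below keeps the arithmetic in one place; it simply implements the formula described above:
>>> def rgb_to_ink( r, g, b ):
...     # pack an (R, G, B) triple into the integer expected by draw.ink
...     return r + g*256 + b*256*256
...
>>> draw.ink = rgb_to_ink( 255, 255, 0 )   # a yellow pen for subsequent drawing calls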

6   Conclusion

This article introduced the handy PIL package, which lets us write image processing programs in Python. We discussed file format handling, resizing and content editing, and finally showed how to create an image from scratch.
For web applications, generating simple images on the fly is a particularly useful capability that can make up for what HTML and CSS lack. Using PIL for batch image processing also saves a great deal of manual work. I trust readers will discover the productivity it offers.
In the next installment we will start looking at web programming in Python.

For details, see the official PIL documentation:
http://www.pythonware.com/library/pil/handbook/imagedraw.htm

Sunday, December 25, 2011

Python Encryption Examples


Friday, December 9, 2011

The Python logging module and Logging 101

Quick summary:
logger
handler
filter
formatter
Learning the Python logging module (2011-06-01 12:31)
The Python logging module has four main components:
logger: the logger class; an application usually records log messages by calling the API it provides.
handler: processes log records and sends (or saves) them to different destinations.
filter: filters log records.
formatter: formats the log output.

Below is a simple script that tries out each of these components:
import logging

# Create two loggers
LOG1 = logging.getLogger('a.b.c')
LOG2 = logging.getLogger('d.e')

# Create a handler object
console = logging.FileHandler('/home/dwapp/joe.wangh/test/logging/test.log', 'a')

# Set the output format of the log messages
formatter = logging.Formatter('%(name)s %(asctime)s %(levelname)s %(message)s')
console.setFormatter(formatter)

# Set up a filter; multiple filters may be attached, and a record that fails any one of them is not output
filter = logging.Filter('a.b')
#console.addFilter(filter)

# Attach the handler to both loggers
LOG1.addHandler(console)
LOG2.addHandler(console)

# Set the levels; logging.INFO here means only messages at INFO level or above are output
LOG1.setLevel(logging.INFO)
LOG2.setLevel(logging.DEBUG)

# Emit some log messages
LOG1.debug('debug')
LOG1.info('info')
LOG1.warning('warning')
LOG1.error('error')
LOG1.critical('critical')

LOG2.debug('debug')
LOG2.info('info')
LOG2.warning('warning')
LOG2.error('error')
LOG2.critical('critical')
Run it and look at the result:
dwapp@pttest1:/home/dwapp/joe.wangh/test/logging>python t1.py
dwapp@pttest1:/home/dwapp/joe.wangh/test/logging>cat test.log
a.b.c 2010-11-24 18:53:20,160 INFO info
a.b.c 2010-11-24 18:53:20,183 WARNING warning
a.b.c 2010-11-24 18:53:20,183 ERROR error
a.b.c 2010-11-24 18:53:20,183 CRITICAL critical
d.e 2010-11-24 18:53:20,183 DEBUG debug
d.e 2010-11-24 18:53:20,183 INFO info
d.e 2010-11-24 18:53:20,184 WARNING warning
d.e 2010-11-24 18:53:20,184 ERROR error
d.e 2010-11-24 18:53:20,184 CRITICAL critical
Uncomment the #console.addFilter(filter) line and run it again:
dwapp@pttest1:/home/dwapp/joe.wangh/test/logging>python t1.py
dwapp@pttest1:/home/dwapp/joe.wangh/test/logging>
dwapp@pttest1:/home/dwapp/joe.wangh/test/logging>cat test.log
a.b.c 2010-11-24 18:54:33,264 INFO info
a.b.c 2010-11-24 18:54:33,287 WARNING warning
a.b.c 2010-11-24 18:54:33,287 ERROR error
a.b.c 2010-11-24 18:54:33,287 CRITICAL critical
We can see that this time all the messages from loggers whose names begin with d.e were filtered out.
Below is a reference for the fields a Formatter understands:
%(name)s             Name of the logger
%(levelno)s          Numeric logging level
%(levelname)s        Text logging level
%(pathname)s         Full pathname of the module issuing the logging call (may be unavailable)
%(filename)s         Filename of the module issuing the logging call
%(module)s           Module name of the module issuing the logging call
%(funcName)s         Name of the function issuing the logging call
%(lineno)d           Line number of the statement issuing the logging call
%(created)f          Time of the call, as a UNIX floating-point timestamp
%(relativeCreated)d  Milliseconds elapsed since the logging module was loaded when the record was emitted
%(asctime)s          Time of the call as a string; the default format is "2003-07-08 16:49:45,896" (the part after the comma is milliseconds)
%(thread)d           Thread ID (may be unavailable)
%(threadName)s       Thread name (may be unavailable)
%(process)d          Process ID (may be unavailable)
%(message)s          The message supplied by the user
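As a small illustrative sketch (the logger name, format string and message are invented for the example), a Formatter combining several of these fields can be attached to a handler like this:
import logging

handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter(
    '%(asctime)s %(name)s %(levelname)s %(filename)s:%(lineno)d %(message)s'))

log = logging.getLogger('demo')
log.addHandler(handler)
log.setLevel(logging.DEBUG)
log.warning('disk usage at %d%%', 91)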



Python Logging 101


Introduction

The Merriam-Webster dictionary definition of the verb to log is:
To make a note or record of : enter details of or about in a log
At its most basic, a log is a list of events which may be of interest to someone in the future. In the context of software development, a log is a list of events of interest which occur during execution of an application program. (I use the word events as a synonym for happenings, rather than in the sense of mouse movements or keystrokes.) You can define an event quite simply, by asking the questions
  • What happened?
  • When did it happen?
  • Where did it happen (in which area of the application)?
  • How important is it?
    The first three of these are objective, but the last is subjective. Lots of things happen during program execution, but only some of these are worth recording in a log. Generally, the developers of an application are best placed to decide which events are worth logging, and of their relative importance in the scheme of things.
    Typically, logs are read by developers (when tracking down bugs), system administrators (as part of routine monitoring of systems), support desk staff (when dealing with particular support issues), and even end users (when they see an error message). Events that are of interest to one audience may not be interesting to another. The audience might even be dispersed across different geographical locations. A good logging system caters for all of these audiences by allowing:
    • Developers to easily record events - what happened, when, where and how important it is.
    • Flexible dissemination of event information to wherever an interested reader (whether human or another program) can read it.
    • Flexibility of configuration, so that recording of events can be made conditional on where they occurred and how important they are, without changing the application code.
    Basic logging of errors to text files and system logs is an old technique, but not very flexible. In this post, I introduce a logging system for the Python programming language. This system, while it borrows ideas from other systems, is not a port of anything but an independent implementation for use by Python developers. The logging package has been part of Python since Python 2.3 (released in 2003).
    I'll cover each of the above points - recording events, disseminating events and configuring the system.

    Recording events

    Recalling that an event was defined above in terms of four dimensions, the following table shows how these dimensions are provided to the logging API.
    What happened: This is passed in using a formatting string with optional arguments.
    When it happened: This is not passed in explicitly. The module assumes that the time you called the logging API was the time of the event.
    Where it happened: This is specified in terms of a logging channel or logger name (defined below).
    How important it is: This is specified in terms of an integer level (defined below).

    Loggers

    A logging channel indicates an area of an application. How an area is defined is up to the developer. Since an application can have any number of areas, logging channels are identified by a unique string. Application areas can be nested (e.g. an area of input processing might include sub-areas read CSV files, read XLS files and read Gnumeric files). To cater for this nesting, channel names are organized into a namespace hierarchy where levels are separated by periods, much like Java or Python package namespaces. In the above instance, channel names might be "input" for the upper level, and "input.csv", "input.xls" and "input.gnu" for the sub-levels. Having determined the relevant application area, an application obtains a reference to a logger object, which is specific to a particular channel name:
    logger = logging.getLogger("input.xls")
    You can call the above code from any module or function, and it will always return a reference to the same logger. This avoids the need to pass logger references between functions. If this is the first call specifying a particular channel name, then a new logger is created and returned to you; the same logger is returned on subsequent calls with that channel name. The namespace hierarchy maps onto an equivalent hierarchy of loggers. The logger named "input.csv" would have a parent logger named "input". There is no requirement to instantiate all the loggers implied by a particular namespace - the system does that as and when needed. At the top of the hierarchy is a root logger which is created automatically, and used like any other logger. Once you have obtained a reference to a logger, you are almost ready to start logging.
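    A minimal sketch of this behaviour (the channel names are just the examples from the text):
    import logging

    a = logging.getLogger("input.xls")
    b = logging.getLogger("input.xls")
    assert a is b                        # the same channel name always returns the same logger

    parent = logging.getLogger("input")  # created on demand; the parent of "input.xls"
    root = logging.getLogger()           # the root logger sits at the top of the hierarchy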

    Levels

    By default, there are five levels of importance associated with logging events. Experience has shown that having more levels is unhelpful, since the choice of which level to assign to an event becomes subjective. The five levels are DEBUG, INFO, WARNING, ERROR and CRITICAL. Their significance is described in the following table.
    DEBUG: Detailed information, of no interest when everything is working well but invaluable when diagnosing problems.
    INFO: Affirmations that things are working as expected, e.g. "service has started" or "indexing run complete". Often ignored.
    WARNING: There may be a problem in the near future, and this gives advance warning of it. But the application is able to proceed normally.
    ERROR: The application has been unable to proceed as expected, due to the problem being logged.
    CRITICAL: This is a serious error, and some kind of application meltdown might be imminent.
    The above categories will cater for most scenarios, though the module allows an application developer to define their own custom levels if they really need to. Using the logging module couldn't be simpler. In any module which uses logging, you simply include the statement
    import logging
    and you can then log away:
    inputLogger = logging.getLogger("input")
    csvLogger = logging.getLogger("input.csv")
    csvLogger.debug("Trying to read file '%s'" % filename)
    csvLogger.warning("File '%s' contains no data", filename)
    csvLogger.error("File '%s': unexpected end of file at line %d, offset %d", filename, lineno, offset)
    csvLogger.critical("File '%s': too large, not enough memory, amount used = %d", filename, memused)
    If you are handling an exception, you can provide for a stack trace to be inserted into the log, as indicated by the three equivalent statements in the following listing. When the logger sees the exc_info argument, it treats it as an indication that traceback information is desirable.
    logger.exception("Error reading file '%s' at offset %d", filename, offset)
    logger.error("Error reading file '%s' at offset %d", filename, offset, exc_info=1)
    logger.log(logging.ERROR, "Error reading file '%s' at offset %d", filename, offset, exc_info=1)

    Disseminating events

    So far, so good. We've got hold of some loggers, and told them about various events at various levels of importance. What happens to those logged events? That's where handlers come in. Recall that a key component of a good logging system is getting the events to somewhere where interested parties can read them. In today's heterogeneous computing environments, where NT rubs shoulders with UNIX and multiple applications need to work together to deliver complex requirements, there are often requirements for event log information to be available on machines other than where the events were generated.
    Handlers allow flexible dissemination of logging events. To see how flexible, the following shows a list of the handlers currently provided with Python logging.
    StreamHandler: Used to write to an output stream, typically sys.stdout or sys.stderr.
    FileHandler: Inherits from StreamHandler to allow writing to a disk file.
    RotatingFileHandler: Used for logging to a set of files, switching from one file to the next when the current file reaches a certain size.
    TimedRotatingFileHandler: Used for logging to a set of files, switching from one file to the next at specified times.
    SocketHandler: Used to send the events, via a socket, to a remote server listening on a TCP port.
    DatagramHandler: Similar to SocketHandler, except that UDP sockets are used. There's less overhead but less reliability.
    SMTPHandler: Used to send the events to designated e-mail addresses.
    SysLogHandler: Used to send the events to a UNIX/Linux syslog.
    NTEventLogHandler: Used to send the events to an NT event log.
    HTTPHandler: Used to post the events to a Web server.
    MemoryHandler: Used to buffer events in memory until a trigger is received, at which point the events are sent to another handler to deal with.
    NullHandler: Used in library code which uses logging to avoid misconfiguration messages when used in an application which doesn't configure logging.
    These classes inherit from the base classes Handler and BufferingHandler, depending on whether they need to deal with events one at a time or whether they process events in batches.
    Handlers are associated with loggers in a flexible way. A logger can have handlers associated with it directly by means of the addHandler method (with a corresponding removeHandler method for dissociation, if required). But not every logger needs to be associated with a handler. This is because whenever a logging event is passed to a logger, the logger and all of its parents are searched for handlers, and ALL these handlers are asked to handle the event. In order to get output from the logging system, all that is needed is that one or more handlers be associated with the root logger, and all loggers will automatically use those handlers.
    If no logger has any handlers, the system will indicate this to sys.stderr (as it's assumed to be a misconfiguration) and then keep quiet.
    If it is desired for a particular logger that handler search does not propagate to its parents, then the propagate attribute of the logger can be set to 0. If this is done, the search for handlers will stop at that logger and not continue upwards through the hierarchy.
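    A minimal sketch of these rules (the handler choices and file name are illustrative only):
    import logging, sys

    console = logging.StreamHandler(sys.stderr)
    logging.getLogger().addHandler(console)       # one handler on the root serves every logger

    csv_logger = logging.getLogger("input.csv")
    csv_logger.propagate = 0                      # stop the upward handler search at this logger
    csv_logger.addHandler(logging.FileHandler("input_csv.log"))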

    A Handler usage scenario

    Suppose you want some events to be brought to the attention of developers, others to operations staff, and yet others to the user. Then one way you could go about this is:
    • Configure an SMTPHandler with a developer email address, configured to pass ERROR events or worse.
    • Configure an SMTPHandler with a support desk email address, configured to pass CRITICAL events or worse.
    • Configure a FileHandler to pass all DEBUG events.
    • Configure your loggers however you want. For example, in production, you can set the topmost logger's level to WARNING, and you'll never see DEBUG messages anywhere. If you find there's a problem with a specific area of the application, you can set the logger for that area to send DEBUG events. Then you will see all events with levels >= DEBUG from that area, but only events with levels >= WARNING from the other areas.
    SMTPHandlers are generally used to notify people about problems which need urgent handling. If there's no urgency, then it's appropriate just to configure FileHandlers. It's not uncommon to have one log file for errors only and another log file which also includes DEBUG and INFO events.
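    A sketch of the scenario above might look like the following (the mail host, addresses and file name are placeholders, not working values):
    import logging, logging.handlers

    root = logging.getLogger()
    root.setLevel(logging.DEBUG)                  # let the handlers do the final filtering

    dev_mail = logging.handlers.SMTPHandler("mailhost.example.com", "app@example.com",
                                            ["devs@example.com"], "Application error")
    dev_mail.setLevel(logging.ERROR)              # developers: ERROR or worse

    desk_mail = logging.handlers.SMTPHandler("mailhost.example.com", "app@example.com",
                                             ["support@example.com"], "Application meltdown")
    desk_mail.setLevel(logging.CRITICAL)          # support desk: CRITICAL or worse

    debug_log = logging.FileHandler("debug.log")  # everything, including DEBUG
    debug_log.setLevel(logging.DEBUG)

    for h in (dev_mail, desk_mail, debug_log):
        root.addHandler(h)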

    Configuring the level of output

    Handlers and loggers are all very well, but we need to control the verbosity of output generated by the logging system. When a developer writes an application, they should add logging calls for every conceivable event an audience may be interested in. After all, if they don't do this, those events would never be captured. During application testing, most of these events will probably be logged at some point. However, when the application is shipped, suppose that a bug is reported. No one wants to wade through reams of detail looking for the key pointers to the problem - so the verbosity has to be turned right down across the application. As developers home in on the problem, they would like to be able to turn the verbosity up and down selectively in particular areas, to see what is happening. They should be able to do this without changing the application code in any way.
    The most basic form of verbosity control is provided by setting a threshold level on loggers, handlers or both. Both Logger and Handler have a setLevel method which takes a level and associates it with the logger or handler. When a logging call is made, if the logger's threshold is above the level of the call, no event is actually generated. For example, if a logger's threshold is set to ERROR, then debug(), info() and warning() calls on that logger do not generate any events, but error() and critical() do. Similarly, if a handler's threshold is set above the level of an event which is passed to it for handling, the handler ignores the event.
    From a performance point of view, it is better to apply thresholds at the logger level than the handler level. However, there may be times when levels must be applied at the handler level to get the desired effect.
    There is also an overall, high-level filter which can be applied to all loggers at one stroke. The module contains a function disable() which takes a level argument and acts as a threshold for all loggers. This setting is checked before the logger's own level setting.
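    A brief sketch of the thresholds described above (the logger name is illustrative):
    import logging

    handler = logging.StreamHandler()
    handler.setLevel(logging.WARNING)        # the handler drops anything below WARNING

    log = logging.getLogger("input")
    log.addHandler(handler)
    log.setLevel(logging.ERROR)              # the logger drops anything below ERROR

    log.warning("ignored: below the logger threshold")
    log.error("emitted: passes both thresholds")

    logging.disable(logging.CRITICAL)        # module-wide: only levels above CRITICAL now pass
    log.critical("ignored: globally disabled")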

    Filters

    If simple level-based filtering of the kind described above is not enough, then you can instantiate Filter objects and associate them with both loggers and handlers. A filter object has a filter() method which is passed an event and which returns a value indicating whether the event is to be processed. Multiple filters can be associated with loggers and handlers, which both inherit from a base class Filterer having addFilter(), removeFilter() and filter() methods. The Filterer.filter() method calls the filter() method on all attached filters until all have seen the event, or until one of the filters rejects the event. If the event comes through unscathed, it is processed.
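    For instance, a custom filter might look like this sketch (the class and logger names are invented for illustration):
    import logging

    class SkipParseNoise(logging.Filter):
        """Illustrative filter: drop any record whose message mentions 'parse'."""
        def filter(self, record):
            return "parse" not in record.getMessage()

    log = logging.getLogger("input.csv")
    log.addFilter(SkipParseNoise())                 # filters can sit on loggers...
    handler = logging.StreamHandler()
    handler.addFilter(logging.Filter("input"))      # ...and on handlers
    log.addHandler(handler)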

    LogRecords

    In the discussion so far, we have talked in general terms about events. But how is an event represented? In this system, events are described by LogRecord instances. The LogRecord class has minimal functionality, acting as a repository for all the information of interest in an event; its single method getMessage() is intended to provide a hook for converting the message and arguments (passed to it at creation time) into a string representation of the event.
    The main information passed in a logging call is the level, a message and arguments for use with that message. All of this information is held in the LogRecord. There is also additional information which the system generates automatically, and all of the information in a LogRecord can appear in the final output from the system. Here's a list of the information currently held in a LogRecord:
    • Logger name (logging channel - e.g. "input.csv")
    • Event level (e.g. DEBUG)
    • Event level text (e.g. "DEBUG")
    • Pathname of source file from where the logging call was issued
    • Filename of source file from where the logging call was issued
    • Line number in source file where the logging call was issued
    • Name of function from which logging call was issued
    • Time when the event (LogRecord) was created (as seconds after the epoch)
    • Millisecond portion of the creation time
    • Creation time relative to when the logging module was loaded (typically, application startup time)
    • Thread id (available if threading is available on your platform)
    • Process id

    Message Objects

    In the preceding discussion and examples, it has been assumed that the message passed when logging the event is a string. However, this is not the only possibility. You can pass an arbitrary object as a message, and its __str__() method will be called when needed to convert it to a string representation. In fact, if you want to, you can avoid computing a string representation altogether - for example, the SocketHandler emits an event by pickling it and sending it over the wire.
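    A minimal sketch of a non-string message object (the class is invented for illustration):
    import logging

    class DelayedReport:
        """__str__ is only called if the event actually needs a string representation."""
        def __init__(self, rows):
            self.rows = rows
        def __str__(self):
            return "report covering %d rows" % len(self.rows)

    logging.basicConfig()
    logging.getLogger("input").warning(DelayedReport(range(1000)))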

    Event Processing

    Let's assume that an event has been generated, and needs to be processed. What do I mean by processing? All that's left to do is to format the event appropriately for the target audience, and then output it to the relevant sink.
    In almost all cases, this means formatting the event into some form of text string. The actual formatting is done at the last moment, to avoid unnecessary processing. Formatting is controlled through the use of Formatter objects, which are associated with handlers using the setFormatter() method.

    Controlling Output Formats

    Formatters control the formatting of an event into text. There are two base classes - Formatter (which works on single events) and BufferingFormatter (which works on a set of events, and includes the ability to add header and trailer text). Formatters use Python's powerful % operator (similar to C's sprintf) to format the output flexibly.
    Formatters know how a LogRecord is laid out - they know the field names. If you specify a format string such as "%(asctime)s %(levelname)-5s %(message)s", this would output the time, level of the logging event and the user's message (itself obtained by evaluating msg % args where msg and args were specified by the user).
    Formatters are initialized with a format string and an optional date/time format string. The latter is used as an argument to the standard library strftime function. Formatters try to avoid needless processing - for example, the creation time is formatted into text using the date format string only if the main format string contains "%(asctime)s". If exception information is required and available, it is formatted using the standard traceback module and appended to the formatted string. It's also cached for subsequent format operations.

    Optimization

    Formatting of message arguments is deferred until it cannot be avoided. However, computing the arguments passed to the logging method can also be expensive, and you may want to avoid doing it if the logger will just throw away your event. To decide what to do, you can call the isEnabledFor method which takes a level argument and returns true if the event would be created by the Logger for that level of call. You can write code like this:
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug("Message with %s%s", expensive_func1(), expensive_func2())
    so that if the logger's threshold is set above DEBUG, the calls to expensive_func1 and expensive_func2 are never made.
    There are other optimizations which can be made for specific applications which need more precise control over what logging information is collected. Here's a list of things you can do to avoid processing during logging which you don't need:
    • You don't want information about where calls were made from: set logging._srcfile to None.
    • You don't want threading information in the log: set logging.logThreads to 0.
    • You don't want process information in the log: set logging.logProcesses to 0.
    Also note that the core logging module only includes the basic handlers. If you don't import logging.handlers and logging.config, they stay out of your way.

    Threading

    The logging module doesn't use threads itself, but it ensures that it is thread-safe by using a reentrant lock (threading.RLock) to serialize access to internal data structures, as well as to I/O handlers. Each handler instance gets a reentrant lock, and each operation to emit an event is bracketed with a lock acquisition and release.

    Convenience Functions

    For casual users of the logging system, or Python novices, it may be too much trouble to create loggers in a namespace hierarchy. For such modes of use, the module defines module level functions debug(), info(), warning(), error() and critical() which delegate to the root logger. If no handler has been configured for this logger, then basicConfig() is called to attach a handler to the root logger. This means that the very simplest use of the logging module is as indicated:
    import logging
    logging.debug("Here's some %s information about %s", "debugging", "something")
    logging.info("Here's some %s", "information")
    logging.warn("This is your first %s", "warning")
    logging.error("To %s is human", "err")
    logging.critical("The situation is getting %s", "critical")
    logging.exception("Please add a %s to this message", "stack traceback")

    What's with the config stuff?

    The configuration system in the logging.config is a very basic implementation of a one-shot configuration, not likely to be of much help in sophisticated usage but providing some value for novice and/or casual users. If you don't want to use it, don't - the entire logging system can be programmatically configured, and if you want you can have your own configuration file format and load that using your own code, making logging API calls to actually do the configing of loggers, handlers etc. The implementation uses ConfigParser (in order to avoid having yet another configuration meta-format) and so the format is more verbose than it could be.