python word 1_Python word | 学步园
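The test environment here is Windows XP, Office 2007, Python 2.5.2, and pywin32 build 213. The approach is to call the Office API directly through the win32com interface. The advantages are simplicity and good compatibility: anything Office can handle, Python can handle too, and the output is identical to what Word's "Save As" produces.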
#!/usr/bin/env python
#coding=utf-8
from win32com import client as wc

# Start (or attach to) Word through COM.
word = wc.Dispatch('Word.Application')
doc = word.Documents.Open('d:/labs/math.doc')
# 8 is the numeric value of wdFormatHTML.
doc.SaveAs('d:/labs/math.html', 8)
doc.Close()
word.Quit()
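If Open or SaveAs raises in the script above, Word is never told to quit and a hidden WINWORD.EXE process can be left running. The following is a minimal sketch of a more defensive variant, not the author's code. The function name convert_document and its app parameter are hypothetical; app exists only so that the control flow can be exercised without Word installed.

```python
import os


def convert_document(src, dst, file_format, app=None):
    """Open src in Word and save it to dst in the given WdSaveFormat value.

    `app` is a hypothetical injection point: pass an existing
    Word.Application object, or leave it as None to create one via COM.
    The application is always quit when the function returns.
    """
    if app is None:
        # Imported lazily so this module still loads where pywin32 is absent.
        from win32com import client as wc
        app = wc.Dispatch('Word.Application')
    try:
        # Word resolves relative paths against its own working directory,
        # so absolute paths are passed.
        doc = app.Documents.Open(os.path.abspath(src))
        try:
            doc.SaveAs(os.path.abspath(dst), file_format)
        finally:
            # Close the document even if saving failed.
            doc.Close()
    finally:
        # Shut Word down even if opening or saving failed.
        app.Quit()
```

With pywin32 and Word available, convert_document('d:/labs/math.doc', 'd:/labs/math.html', 8) would perform the same conversion as the script above.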
The key line is doc.SaveAs('d:/labs/math.html', 8). Many articles online write it as doc.SaveAs('d:/labs/math.html', win32com.client.constants.wdFormatHTML), which fails immediately with:
AttributeError: class Constants has no attribute ‘wdFormatHTML’
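This error arises because win32com.client.constants is only populated once the generated (makepy) wrapper for the Word type library has been loaded, and a plain Dispatch call does not load it. One possible workaround, sketched here under that assumption, is to look the name up and fall back to the numeric value used in this article. The helper name resolve_constant is hypothetical.

```python
def resolve_constant(name, fallback):
    """Return win32com.client.constants.<name> if available, else fallback."""
    try:
        from win32com.client import constants
        return getattr(constants, name)
    except (ImportError, AttributeError):
        # pywin32 is missing, or no makepy-generated type library is loaded.
        return fallback


# 8 is the value this article uses for wdFormatHTML.
WD_FORMAT_HTML = resolve_constant('wdFormatHTML', 8)
```

Alternatively, creating the application with win32com.client.gencache.EnsureDispatch('Word.Application') instead of Dispatch generates and loads the wrapper, after which the named constants should resolve.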
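You can also use the code above to convert a Word file into any other format that Office 2007 supports (for example, to convert a Word file to PDF, change 8 to 17). The full list of file format constants supported by Office 2007 is: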
wdFormatDocument = 0
wdFormatDocument97 = 0
wdFormatDocumentDefault = 16
wdFormatDOSText = 4
wdFormatDOSTextLineBreaks = 5
wdFormatEncodedText = 7
wdFormatFilteredHTML = 10
wdFormatFlatXML = 19
wdFormatFlatXMLMacroEnabled = 20
wdFormatFlatXMLTemplate = 21
wdFormatFlatXMLTemplateMacroEnabled = 22
wdFormatHTML = 8
wdFormatPDF = 17
wdFormatRTF = 6
wdFormatTemplate = 1
wdFormatTemplate97 = 1
wdFormatText = 2
wdFormatTextLineBreaks = 3
wdFormatUnicodeText = 7
wdFormatWebArchive = 9
wdFormatXML = 11
wdFormatXMLDocument = 12
wdFormatXMLDocumentMacroEnabled = 13
wdFormatXMLTemplate = 14
wdFormatXMLTemplateMacroEnabled = 15
wdFormatXPS = 18
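The names map to the corresponding file formats in the obvious way. Office 2003 may not support all of these formats. There are two formats to choose from when converting Word to HTML: wdFormatHTML and wdFormatFilteredHTML (values 8 and 10). The difference is that with wdFormatHTML, formulas and other OLE objects in the Word file are stored as WMF images, whereas with wdFormatFilteredHTML the formula images are stored as GIF. The HTML generated with wdFormatFilteredHTML is also visibly much cleaner than that generated with wdFormatHTML.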
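To avoid scattering magic numbers through a script, the format value can be derived from the output file name. The sketch below uses values from the table above; the mapping from file extensions to constants and the names WD_FORMAT_BY_EXTENSION and save_format_for are assumptions made for illustration.

```python
import os

# Values copied from the WdSaveFormat table above; the extension
# associations are an assumption of this sketch.
WD_FORMAT_BY_EXTENSION = {
    '.doc': 0,    # wdFormatDocument
    '.txt': 2,    # wdFormatText
    '.rtf': 6,    # wdFormatRTF
    '.mht': 9,    # wdFormatWebArchive
    '.docx': 12,  # wdFormatXMLDocument
    '.pdf': 17,   # wdFormatPDF
    '.xps': 18,   # wdFormatXPS
}


def save_format_for(path, filtered_html=True):
    """Pick a WdSaveFormat value from the extension of the output path."""
    ext = os.path.splitext(path)[1].lower()
    if ext in ('.htm', '.html'):
        # 10 = wdFormatFilteredHTML (cleaner markup), 8 = wdFormatHTML.
        return 10 if filtered_html else 8
    try:
        return WD_FORMAT_BY_EXTENSION[ext]
    except KeyError:
        raise ValueError('no save format known for %r' % ext)
```

The result can be passed as the second argument of doc.SaveAs, for example doc.SaveAs(path, save_format_for(path)).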
You can also call the Office API through COM from any other language, such as PHP.
=========================================
Writing COM with Python
Thursday, September 3, 2009, 07:01 PM
Python supports both calling COM objects (client COM) and writing COM components (server COM).
1. COM client example (playing music with Windows Media Player)
from win32com.client import Dispatch
mp = Dispatch("WMPlayer.OCX")
tune = mp.newMedia("C:/WINDOWS/system32/oobe/images/title.wma")
mp.currentPlaylist.appendItem(tune)
mp.controls.play()
2. COM server example
class PythonUtilities:
    _public_methods_ = ['SplitString']
    _reg_progid_ = "PythonDemos.Utilities"
    # NEVER copy the following ID
    # Use "print pythoncom.CreateGuid()" to make a new one.
    _reg_clsid_ = "{41E24E95-D45A-11D2-852C-204C4F4F5020}"

    def SplitString(self, val, item=None):
        import string
        if item != None: item = str(item)
        return string.split(str(val), item)

# Add code so that when this script is run by
# Python.exe, it self-registers.
if __name__ == '__main__':
    print "Registering COM server"
    import win32com.server.register
    win32com.server.register.UseCommandLine(PythonUtilities)
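The server listing targets Python 2: string.split and the print statement no longer exist in Python 3. A minimal sketch of how the same class might look on Python 3 follows. The all-zero CLSID is a placeholder that must be replaced with a value from pythoncom.CreateGuid(), and the register wrapper function is a hypothetical convenience, not part of pywin32.

```python
class PythonUtilities:
    _public_methods_ = ['SplitString']
    _reg_progid_ = "PythonDemos.Utilities"
    # Placeholder only: generate your own with pythoncom.CreateGuid().
    _reg_clsid_ = "{00000000-0000-0000-0000-000000000000}"

    def SplitString(self, val, item=None):
        # str.split replaces the Python 2 string.split function;
        # a separator of None splits on runs of whitespace.
        if item is not None:
            item = str(item)
        return str(val).split(item)


def register():
    """Register or unregister the server based on command-line options."""
    # Imported lazily so the class can be used where pywin32 is absent.
    import win32com.server.register
    win32com.server.register.UseCommandLine(PythonUtilities)
```

Calling register() from the script's entry point would play the same role as the self-registration code in the original listing.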
- Registering/unregistering the COM server
Command-Line Option | Description
(none) | The default is to register the COM objects.
--unregister | Unregisters the objects. This removes all references to the objects from the Windows registry.
--debug | Registers the COM servers in debug mode.
--quiet | Register (or unregister) the object quietly (i.e., don't report success).
- Using the COM object
It can be tried out from the Python interactive prompt:
>>> import win32com.client
>>> s = win32com.client.Dispatch("PythonDemos.Utilities")
>>> s.SplitString("a,b,c", ",")
((u'a', u'a,b,c'),)
>>>
3. How the Python COM server works
The registry reveals how the Python COM server is implemented:
Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\CLSID\{41E24E95-D45A-11D2-852C-204C4F4F5020}]
@="PythonDemos.Utilities"

[HKEY_CLASSES_ROOT\CLSID\{41E24E95-D45A-11D2-852C-204C4F4F5020}\Debugging]
@="0"

[HKEY_CLASSES_ROOT\CLSID\{41E24E95-D45A-11D2-852C-204C4F4F5020}\Implemented Categories]

[HKEY_CLASSES_ROOT\CLSID\{41E24E95-D45A-11D2-852C-204C4F4F5020}\Implemented Categories\{B3EF80D0-68E2-11D0-A689-00C04FD658FF}]

[HKEY_CLASSES_ROOT\CLSID\{41E24E95-D45A-11D2-852C-204C4F4F5020}\InprocServer32]
@="pythoncom25.dll"
"ThreadingModel"="both"

[HKEY_CLASSES_ROOT\CLSID\{41E24E95-D45A-11D2-852C-204C4F4F5020}\LocalServer32]
@="D:\\usr\\Python\\pythonw.exe \"D:\\usr\\Python\\lib\\site-packages\\win32com\\server\\localserver.py\" {41E24E95-D45A-11D2-852C-204C4F4F5020}"

[HKEY_CLASSES_ROOT\CLSID\{41E24E95-D45A-11D2-852C-204C4F4F5020}\ProgID]
@="PythonDemos.Utilities"

[HKEY_CLASSES_ROOT\CLSID\{41E24E95-D45A-11D2-852C-204C4F4F5020}\PythonCOM]
@="PythonDemos.PythonUtilities"

[HKEY_CLASSES_ROOT\CLSID\{41E24E95-D45A-11D2-852C-204C4F4F5020}\PythonCOMPath]
@="D:\\"
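The in-process server is implemented by pythoncom25.dll.
The local server is implemented by localserver.py.
The location of the Python source for the COM object is recorded in the PythonCOMPath and PythonCOM keys.
4. Usage problems
When calling the COM object from PHP or C, for example: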
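The same keys can be inspected from Python instead of the registry editor. The sketch below assumes Python 3 on Windows, where the standard library module is named winreg; the helper names clsid_subkey and read_default_value are hypothetical.

```python
def clsid_subkey(clsid, name=None):
    """Build the path of a key under HKEY_CLASSES_ROOT for a given CLSID."""
    parts = ['CLSID', clsid]
    if name:
        parts.append(name)
    return '\\'.join(parts)


def read_default_value(subkey):
    """Return the default (@) value of a key under HKEY_CLASSES_ROOT."""
    # Windows-only standard library module; imported lazily so the
    # path helper above can still be used on other platforms.
    import winreg
    return winreg.QueryValue(winreg.HKEY_CLASSES_ROOT, subkey)
```

On a machine where the server is registered, read_default_value(clsid_subkey('{41E24E95-D45A-11D2-852C-204C4F4F5020}', 'InprocServer32')) would be expected to return the DLL name shown in the export above.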
<?php
$com = new COM("PythonDemos.Utilities");
$rs = $com->SplitString("a b c");
foreach ($rs as $r)
    echo $r . "\n";
?>
errors such as the following may occur:
pythoncom error: PythonCOM Server - The 'win32com.server.policy' module could not be loaded.
: No module named server.policy
pythoncom error: CPyFactory::CreateInstance failed to create instance.
(80004005)
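There are two ways to fix this:
a. Set the environment variable PYTHONHOME = D:/usr/Python
Also, when embedding Python in C++, if importing a module fails with 'import site' failed; use -v for traceback, setting this variable can fix that as well.
b. Build exe/dll executables for the COM server. The setup.py code is as follows: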
from distutils.core import setup
import py2exe
import sys
import shutil

# Remove the build tree. ALWAYS do that!
shutil.rmtree("build", ignore_errors=True)

# List of modules to exclude from the executable
excludes = ["pywin", "pywin.debugger", "pywin.debugger.dbgcon",
            "pywin.dialogs", "pywin.dialogs.list"]

# List of modules to include in the executable
includes = ["win32com.server"]

# ModuleFinder can't handle runtime changes to __path__, but win32com uses them
try:
    # if this doesn't work, try import modulefinder
    import py2exe.mf as modulefinder
    import win32com
    for p in win32com.__path__[1:]:
        modulefinder.AddPackagePath("win32com", p)
    for extra in ["win32com.shell", "win32com.server"]:  # ,"win32com.mapi"
        __import__(extra)
        m = sys.modules[extra]
        for p in m.__path__[1:]:
            modulefinder.AddPackagePath(extra, p)
except ImportError:
    # no build path setup, no worries.
    pass

# Set up py2exe with all the options
setup(
    options={"py2exe": {"compressed": 2,
                        "optimize": 2,
                        # "bundle_files": 1,
                        "dist_dir": "COMDist",
                        "excludes": excludes,
                        "includes": includes}},
    # The lib directory contains everything except the executables and the python dll.
    # Can include a subdirectory name.
    zipfile=None,
    com_server=['PythonDemos'],  # the module file name!
)
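To check whether the "No module named server.policy" error from this section comes from the interpreter's search path, a small diagnostic can report what the interpreter sees. This is a sketch only; can_import and describe_environment are hypothetical helper names, and the script should be run with the same PYTHONHOME that the COM host process will see.

```python
import importlib.util
import os
import sys


def can_import(module_name):
    """Return True if module_name can be located on the current sys.path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # ImportError is raised when a parent package (e.g. win32com)
        # is itself missing.
        return False


def describe_environment():
    """Collect the settings relevant to the PYTHONHOME fix."""
    return {
        'prefix': sys.prefix,
        'PYTHONHOME': os.environ.get('PYTHONHOME'),
        'policy_found': can_import('win32com.server.policy'),
    }
```

If policy_found is False while pywin32 is installed, the interpreter is probably resolving its library directory from the wrong location, which is the situation that setting PYTHONHOME addresses.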