A Python Program for Converting Simplified Chinese to Traditional Chinese

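I needed this for my natural language processing final project, so I spent some time writing a program that converts Simplified Chinese to Traditional Chinese.

It relies on a function someone else wrote, so the program itself is very simple.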

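The steps are:

1. Download the library that provides the Simplified-to-Traditional conversion function (link here)
2. Extract the archive and change into its directory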

cd jianfan-0.0.1/

3. python setup.py build

rosfuerte@rosfuerte-K53SM:~/project/nlp/jianfan-0.0.1$ python setup.py build
running build
running build_py
creating build
creating build/lib.linux-x86_64-2.7
creating build/lib.linux-x86_64-2.7/jianfan
copying jianfan/__init__.py -> build/lib.linux-x86_64-2.7/jianfan
copying jianfan/charsets.py -> build/lib.linux-x86_64-2.7/jianfan

4. sudo python setup.py install

rosfuerte@rosfuerte-K53SM:~/project/nlp/jianfan-0.0.1$ sudo python setup.py install
running install
Checking .pth file support in /usr/local/lib/python2.7/dist-packages/
/usr/bin/python -E -c pass
TEST PASSED: /usr/local/lib/python2.7/dist-packages/ appears to support .pth files
running bdist_egg
running egg_info
writing jianfan.egg-info/PKG-INFO
writing top-level names to jianfan.egg-info/top_level.txt
writing dependency_links to jianfan.egg-info/dependency_links.txt
writing jianfan.egg-info/PKG-INFO
writing top-level names to jianfan.egg-info/top_level.txt
writing dependency_links to jianfan.egg-info/dependency_links.txt
reading manifest file 'jianfan.egg-info/SOURCES.txt'
writing manifest file 'jianfan.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/egg
creating build/bdist.linux-x86_64/egg/jianfan
copying build/lib.linux-x86_64-2.7/jianfan/__init__.py -> build/bdist.linux-x86_64/egg/jianfan
copying build/lib.linux-x86_64-2.7/jianfan/charsets.py -> build/bdist.linux-x86_64/egg/jianfan
byte-compiling build/bdist.linux-x86_64/egg/jianfan/__init__.py to __init__.pyc
byte-compiling build/bdist.linux-x86_64/egg/jianfan/charsets.py to charsets.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying jianfan.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying jianfan.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying jianfan.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying jianfan.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
zip_safe flag not set; analyzing archive contents...
creating dist
creating 'dist/jianfan-0.0.1-py2.7.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing jianfan-0.0.1-py2.7.egg
Copying jianfan-0.0.1-py2.7.egg to /usr/local/lib/python2.7/dist-packages
Adding jianfan 0.0.1 to easy-install.pth file

Installed /usr/local/lib/python2.7/dist-packages/jianfan-0.0.1-py2.7.egg
Processing dependencies for jianfan==0.0.1
Finished processing dependencies for jianfan==0.0.1
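
Before moving on, it can be worth confirming that the interpreter actually sees the newly installed package. The snippet below is only a minimal sketch: `is_importable` is a hypothetical helper I made up for illustration, not something jianfan provides. It simply tries the import and reports the result.

```python
def is_importable(module_name):
    """Return True if this interpreter can import module_name."""
    try:
        __import__(module_name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # "jianfan" is the package name shown in the install log above.
    if is_importable("jianfan"):
        print("jianfan is installed")
    else:
        print("jianfan was not found; try the install step again")
```

If the check fails, the most likely cause is that the `python` running the check is not the one that ran `setup.py install`.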

5. After that you can happily import it. Here is the small program I wrote, for reference:

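# Python 2 script, matching the python2.7 install above
import sys
from jianfan import jtof

jian_file = sys.argv[1]  # input: Simplified Chinese text file
fan_file = sys.argv[2]   # output: Traditional Chinese text file

# Read the whole Simplified Chinese file
with open(jian_file, 'r') as fj:
    original_content = fj.read()

# Convert Simplified to Traditional
translated_content = jtof(original_content)

# Encode the converted text as UTF-8 bytes before writing
unicoded = unicode(translated_content).encode('utf-8')

# print appends a trailing newline, as in the original version
with open(fan_file, 'w') as ff:
    print >> ff, unicoded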

Usage: python jian_to_fan.py [Simplified Chinese input file] [Traditional Chinese output file]
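
For anyone who wants to try the read, convert, write flow without installing anything, here is a rough sketch that makes the UTF-8 handling explicit. Everything in it is an assumption for illustration: `convert_file`, `demo_jtof`, and `DEMO_TABLE` are hypothetical names, and the three-character table is a stand-in I typed by hand, not jianfan's real data or algorithm. The converter is passed in as a parameter, so `jtof` could be dropped in if it accepts and returns Unicode text in your environment, which I have not verified here.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# Tiny illustrative mapping; NOT jianfan's real character table.
DEMO_TABLE = {u"简": u"簡", u"体": u"體", u"转": u"轉"}


def demo_jtof(text):
    """Stand-in converter: map each character, leave unknown ones unchanged."""
    return u"".join(DEMO_TABLE.get(ch, ch) for ch in text)


def convert_file(src_path, dst_path, convert=demo_jtof):
    """Read src_path as UTF-8, convert the text, write it to dst_path as UTF-8."""
    with io.open(src_path, "r", encoding="utf-8") as src:
        text = src.read()
    with io.open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(convert(text))


if __name__ == "__main__":
    # Small self-contained demo using temporary files.
    work_dir = tempfile.mkdtemp()
    jian_path = os.path.join(work_dir, "jian.txt")
    fan_path = os.path.join(work_dir, "fan.txt")
    with io.open(jian_path, "w", encoding="utf-8") as f:
        f.write(u"简体转繁体\n")
    convert_file(jian_path, fan_path)
    with io.open(fan_path, "r", encoding="utf-8") as f:
        print(f.read())
```

Using `io.open` with an explicit encoding keeps the text as Unicode from read to write, so there is no need to encode character by character.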
