python-nltk
- NLTK -- the Natural Language Toolkit -- is a suite of open source Python modules, data sets and tutorials supporting research and development in Natural Language Processing.
- Use Python to process natural language
On openSUSE, NLTK can be installed with one-click install:
http://software.opensuse.org/search?q=nltk&baseproject=openSUSE%3A11.4&lang=zh_TW&exclude_debug=true
http://www.nltk.org/
For engineering calculations, first run from math import *
Getting Started with NLTK
- >>> from __future__ import division
- Makes / return a floating-point result (needed on Python 2, where / between two integers truncates)
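A minimal sketch of the difference between true division and floor division, using the same counts that appear later in these notes. The variable names are illustrative only. On Python 3, / is already true division; on Python 2, the plain / line gives 1.2 only after the __future__ import.

```python
# Sketch: true division vs. floor division on two integer counts.
total_tokens = 6     # e.g. len() of a six-word token list
distinct_tokens = 5  # e.g. len(set()) of the same list

# True division: 1.2 on Python 3, or on Python 2 with the __future__ import.
true_ratio = total_tokens / distinct_tokens

# Floor division: always drops the fractional part.
floor_ratio = total_tokens // distinct_tokens

# Converting one operand to float gives a float result on any version.
explicit_ratio = float(total_tokens) / distinct_tokens

print(true_ratio, floor_ratio, explicit_ratio)
```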
>>> import nltk
>>> nltk.download()
- Download the Book package under the Collections tab
- It is downloaded to ~/nltk_data
http://www.nltk.org/book
>>> from nltk.book import *
This says "from NLTK's book module, load all items."
- After exiting Python, every new Python session needs to run:
import nltk
from nltk.book import *
- .concordance()
- text1.concordance("monstrous")
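.concordance() shows each occurrence of a word together with its surrounding context. The idea can be sketched in plain Python on a small token list. simple_concordance is a hypothetical helper written for these notes, not NLTK's implementation, and both the case-insensitive match and the width parameter are assumptions of this sketch.

```python
def simple_concordance(tokens, word, width=2):
    """Return each occurrence of `word` with up to `width` tokens of context per side."""
    lines = []
    target = word.lower()
    for i, token in enumerate(tokens):
        if token.lower() == target:
            left = tokens[max(0, i - width):i]    # context before the match
            right = tokens[i + 1:i + 1 + width]   # context after the match
            lines.append(' '.join(left + [token] + right))
    return lines


stt = ['I', 'want', 'to', 'go', 'to', 'school']
for line in simple_concordance(stt, 'to'):
    print(line)
```

With nltk.book loaded, text1.concordance("monstrous") does the same kind of lookup over the whole text.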
- len()
- >>> stt=['I', 'want', 'to', 'go', 'home', '.']
>>> len(stt)
6
- set()
- >>> stt=['I', 'want', 'to', 'go', 'to','school']
>>> set(stt)
set(['I', 'to', 'school', 'go', 'want'])
- >>> len(stt)/len(set(stt))
1.2
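The ratio len(stt)/len(set(stt)) is the average number of times each distinct word is used. A small sketch wrapping it in a function follows. The name tokens_per_type is made up for these notes, and returning 0.0 for an empty list is an assumption of the sketch.

```python
def tokens_per_type(tokens):
    """Average number of occurrences per distinct token."""
    distinct = set(tokens)
    if not distinct:
        return 0.0  # avoid dividing by zero on an empty list
    # float() keeps the result fractional even on Python 2 without the __future__ import.
    return float(len(tokens)) / len(distinct)


stt = ['I', 'want', 'to', 'go', 'to', 'school']
print(tokens_per_type(stt))
```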
- sorted()
- >>> sorted(set(stt))
['I', 'go', 'school', 'to', 'want']
- .count()
- >>> st='I want to go home.'
>>> st
'I want to go home.'
>>> st.count('t')
2
>>> stt
['I', 'want', 'to', 'go', 'to', 'school']
>>> stt.count('to')
2
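.count() answers one question at a time. To count every word in one pass, the standard library's collections.Counter can be used, as in this minimal sketch on the same list:

```python
from collections import Counter

stt = ['I', 'want', 'to', 'go', 'to', 'school']

counts = Counter(stt)            # maps each token to its number of occurrences
to_count = counts['to']          # same value as stt.count('to')
home_count = counts['home']      # a missing token counts as 0, no error
top = counts.most_common(1)      # the most frequent token with its count

print(to_count, home_count, top)
```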
- .index()
- >>> stt
['I', 'want', 'to', 'go', 'to', 'school']
>>> stt.index('to')
2
>>> stt.index('I')
0
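.index() returns only the position of the first match, and it raises ValueError when the item is not in the list. A short sketch of both cases, where using -1 as the "not found" value is just a choice made here:

```python
stt = ['I', 'want', 'to', 'go', 'to', 'school']

# All positions of 'to', not only the first one.
all_positions = [i for i, token in enumerate(stt) if token == 'to']

# .index() raises ValueError for a missing item, so guard the call.
try:
    missing = stt.index('home')
except ValueError:
    missing = -1

print(all_positions, missing)
```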
- indexing and slicing
- >>> stt
['I', 'want', 'to', 'go', 'to', 'school']
>>> stt[1]
'want'
>>> stt[1:3]
['want', 'to']
>>> stt[1:]
['want', 'to', 'go', 'to', 'school']
>>> stt[:3]
['I', 'want', 'to']
- 1:3 takes only indices 1 and 2
1: takes from index 1 to the end
:3 takes from index 0 through 2
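The same slice notation also accepts negative indices, which count from the end, and an optional step. A few more cases on the same list, as a sketch:

```python
stt = ['I', 'want', 'to', 'go', 'to', 'school']

last_word = stt[-1]        # negative index: last item
last_two = stt[-2:]        # from the second-to-last item to the end
every_other = stt[::2]     # step of 2: indices 0, 2, 4
middle = stt[1:3]          # stop index is excluded, so len(middle) == 3 - 1

print(last_word, last_two, every_other, middle)
```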
- .split()
- >>> st='I want to go home.'
>>> st
'I want to go home.'
>>> st.split()
['I', 'want', 'to', 'go', 'home.']
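Note that .split() only splits on whitespace, so the period stays attached as 'home.', unlike the hand-written list used for len() above. A rough regex-based sketch that separates punctuation follows. It is only an illustration and not NLTK's tokenizer.

```python
import re

st = 'I want to go home.'

# \w+ matches a run of word characters; [^\w\s] matches one punctuation mark.
tokens = re.findall(r"\w+|[^\w\s]", st)

print(st.split())
print(tokens)
```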
- .join()
- >>> stt
['I', 'want', 'to', 'go', 'to', 'school']
>>> ''.join(stt)
'Iwanttogotoschool'
>>> ' '.join(stt)
'I want to go to school'
>>> ','.join(stt)
'I,want,to,go,to,school'
- The separator string can be specified
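.join() is the inverse of .split() when the separator is a single space. It only accepts strings, so other values have to be converted first. A minimal sketch:

```python
stt = ['I', 'want', 'to', 'go', 'to', 'school']

sentence = ' '.join(stt)          # list of words -> one string
round_trip = sentence.split()     # back to the original list

numbers = [1, 2, 3]
# join() raises TypeError on non-strings, so convert each item with str().
csv_line = ','.join(str(n) for n in numbers)

print(sentence)
print(round_trip == stt)
print(csv_line)
```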