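Need help parsing a file in Java

I'm currently working on a small data structures project, and I'm trying to get data on universities across the country and then do some data processing with it. I found the data here: http://archive.ics.uci.edu/ml/machine-learning-databases/university/university.data
The problem with this data is (quoting from the site): "It is a LISP readable file with a few relevant functions at the end of the data file." I plan to save this data as a .txt file.
The file looks a bit like: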
(def-instance Adelphi
(state newyork)
(control private)
(no-of-students thous:5-10)
(male:female ratio:30:70)
(student:faculty ratio:15:1)
(sat verbal 500)
(sat math 475)
(expenses thous$:7-10)
(percent-financial-aid 60)
(no-applicants thous:4-7)
(percent-admittance 70)
(percent-enrolled 40)
(academics scale:1-5 2)
(social scale:1-5 2)
(quality-of-life scale:1-5 2)
(academic-emphasis business-administration)
(academic-emphasis biology))
(def-instance Arizona-State
(state arizona)
(control state)
(no-of-students thous:20+)
(male:female ratio:50:50)
(student:faculty ratio:20:1)
(sat verbal 450)
(sat math 500)
(expenses thous$:4-7)
(percent-financial-aid 50)
(no-applicants thous:17+)
(percent-admittance 80)
(percent-enrolled 60)
(academics scale:1-5 3)
(social scale:1-5 4)
(quality-of-life scale:1-5 5)
(academic-emphasis business-education)
(academic-emphasis engineering)
(academic-emphasis accounting)
(academic-emphasis fine-arts))
......
The end of the file:
(dfx def-instance (l)
(tlet (instance (car l) f-list (cdr l))
(cond ((or (null instance) (consp instance))
(msg t instance " is not a valid instance name (must be an atom)"))
(t (make:event instance)
(push instance !instances)
(:= (get instance 'features)
(tfor (f in f-list)
(when (cond ((or (atom f) (null (cdr f)))
(msg t f " is not a valid feature "
"(must be a 2 or 3 item list)") nil)
((consp (car f))
(msg t (car f) " is not a valid feature "
"name (must be an atom)") nil)
((and (cddr f) (consp (cadr f)))
(msg t (cadr f) " is not a valid feature "
"role (must be an atom)") nil)
(t t)))
(save (cond ((equal (length f) 3)
(make:feature (car f) (cadr f) (caddr f)))
(t (make:feature (car f) 'value (cadr f)))))))
instance))))
(set-if !instances nil)
(dex run-uniq-colleges (l n)
(tfor (sc in l)
(when (cond ((ge (length *events-added*) n))
((not (get sc 'duplicate))
(run-instance sc)
~ (remprop sc 'features)
nil)
(t (remprop sc 'features) nil)))
(stop)))
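The data I'm most interested in are the number of students, the academic emphasis, and the school name.
Any help is greatly appreciated.
How should I go about this? – Brendan 2011-04-06 21:00:06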
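
One possible approach, sketched under a few assumptions rather than offered as a definitive parser: read the file line by line, start a new record at each `(def-instance Name` line, pick out the `(no-of-students ...)` and `(academic-emphasis ...)` feature lines, and treat a line ending in `))` as the end of the record, so the LISP functions at the end of the file are skipped. The names `UniversityParser`, `University`, and `parse` are hypothetical, and the sketch assumes every record in the full file follows the one-feature-per-line layout of the excerpt above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UniversityParser {

    /** One parsed record, holding only the three fields of interest. */
    static final class University {
        final String name;
        String students;                                  // raw value such as "thous:5-10"; null if absent
        final List<String> emphases = new ArrayList<>();  // zero or more academic-emphasis values

        University(String name) { this.name = name; }

        @Override
        public String toString() {
            return name + " | students=" + students + " | emphases=" + emphases;
        }
    }

    // A record opens with "(def-instance Name" (assumed to sit on one line).
    private static final Pattern INSTANCE = Pattern.compile("\\(def-instance\\s+(\\S+)");
    // A feature line looks like "(key value)", optionally followed by the ")" that closes the record.
    private static final Pattern FEATURE = Pattern.compile("\\((\\S+)\\s+(.*?)\\)+");

    static List<University> parse(List<String> lines) {
        List<University> result = new ArrayList<>();
        University current = null;
        for (String raw : lines) {
            String line = raw.trim();
            Matcher start = INSTANCE.matcher(line);
            if (start.lookingAt()) {
                current = new University(start.group(1));
                result.add(current);
                continue;
            }
            if (current == null) {
                continue; // outside any record, e.g. the LISP functions at the end of the file
            }
            Matcher feature = FEATURE.matcher(line);
            if (feature.matches()) {
                String key = feature.group(1);
                String value = feature.group(2);
                if (key.equals("no-of-students")) {
                    current.students = value;
                } else if (key.equals("academic-emphasis")) {
                    current.emphases.add(value);
                }
            }
            if (line.endsWith("))")) {
                current = null; // assumption: "))" closes the current record
            }
        }
        return result;
    }
}
```

With the data saved locally, the lines could be loaded with `Files.readAllLines(Paths.get("university.txt"))` (the file name is an assumption) and passed to `UniversityParser.parse`. The student count is kept as the raw string (for example `thous:5-10`) because the file stores it as a range, so turning it into numbers would be a separate step.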