2014-02-07

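Storing multiple attributes in Python

I am planning to write a script that scans a list of websites and returns their WHOIS data. A WHOIS lookup returns multiple attributes, such as the domain name, creation date, expiration date, and so on.

My question is: what is the best way to go about storing this data? I was thinking of creating an object called "Site" that holds all the different attributes. Would that even be a proper use of OOP in Python? If so, could you give an example of what it would look like?

Thanks so much for your help!

Edit: here is what I have so far: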

# Server scan test
# Not sure if using Python yet, but it should be so simple it won't matter
import whois


class Scanner(object):
    def __init__(self, arg):
        super(Scanner, self).__init__()
        self.arg = arg

    def site(self, creation_date, domain_name, emails, expiration_date,
             name_servers, referral_url, registrar, status,
             updated_date, whois_server):
        # Store every WHOIS field on the instance
        self.creation_date = creation_date
        self.domain_name = domain_name
        self.emails = emails
        self.expiration_date = expiration_date
        self.name_servers = name_servers
        self.referral_url = referral_url
        self.registrar = registrar
        self.status = status
        self.updated_date = updated_date
        self.whois_server = whois_server


dummies = ['ttt.com', 'uuu.com', 'aaa.com']
infoArray = {}  # maps each domain to the raw WHOIS response text
for i in dummies:
    w = whois.whois(i)
    infoArray[i] = w.text

@Back2Basics It isn't. This is for work. We are moving to a new host, so we need all the data. I'll post the code I have so far – PintSizeSlash3r

Answers


I would use a dictionary to store the data.


So would I, but the problem is that a site isn't just one key and one value. Each one has a key and about 10 values – PintSizeSlash3r


A dictionary can hold all kinds of data, including lists. You can also put dictionaries inside dictionaries :-)
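
To make the dictionary-in-a-dictionary idea concrete, here is a minimal sketch. The registrar names, dates, and name servers are made-up sample values, and `expiring_before` is a hypothetical helper, not part of any WHOIS library; a real script would fill the inner dictionaries from its lookups.

```python
# Hypothetical sample records keyed by domain; each value is itself a
# dictionary of attributes, and one attribute holds a list.
records = {
    "ttt.com": {
        "registrar": "Example Registrar, Inc.",
        "creation_date": "2001-05-14",
        "expiration_date": "2025-05-14",
        "name_servers": ["ns1.example.net", "ns2.example.net"],
    },
    "uuu.com": {
        "registrar": "Another Registrar LLC",
        "creation_date": "2010-11-02",
        "expiration_date": "2024-11-02",
        "name_servers": ["ns1.example.org"],
    },
}


def expiring_before(records, cutoff):
    """Return the domains whose ISO-formatted expiration date is before cutoff."""
    return sorted(
        domain
        for domain, info in records.items()
        if info["expiration_date"] < cutoff
    )


first_server = records["ttt.com"]["name_servers"][0]
soon = expiring_before(records, "2025-01-01")
```

Keeping the dates as ISO strings (`YYYY-MM-DD`) is an assumption made here so that plain string comparison orders them correctly.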


If you want to persist Python objects, you could try the shelve module.

Here is the example from the documentation:

import shelve

d = shelve.open(filename)  # open -- file may get suffix added by low-level
                           # library

d[key] = data              # store data at key (overwrites old data if
                           # using an existing key)
data = d[key]              # retrieve a COPY of data at key (raise KeyError
                           # if no such key)
del d[key]                 # delete data stored at key (raises KeyError
                           # if no such key)
flag = key in d            # true if the key exists
klist = list(d.keys())     # a list of all existing keys (slow!)
d.close()                  # close it
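
Applied to the WHOIS case, that could look roughly like the sketch below. The record contents and the `whois_cache` file name are assumptions for illustration, and the shelf is written to a temporary directory so the example leaves nothing behind in the working directory.

```python
import os
import shelve
import tempfile

# Hypothetical WHOIS attributes for one domain.
record = {
    "registrar": "Example Registrar, Inc.",
    "expiration_date": "2025-05-14",
    "name_servers": ["ns1.example.net", "ns2.example.net"],
}

# Assumed location for the shelf file; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "whois_cache")

# First session: store the record under its domain name.
with shelve.open(path) as db:  # the context manager closes the shelf
    db["ttt.com"] = record

# Later session: reopen the shelf and read the record back.
with shelve.open(path) as db:
    restored = db["ttt.com"]
    known = "ttt.com" in db
    missing = "uuu.com" in db
```

Because shelve pickles the values, nested lists and dictionaries come back intact, which fits the "one key, about 10 values" shape described in the question.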

This sounds like pywhois all over again.

Its base entry class is a good example. It looks like this:

import re


class WhoisEntry(object):
    """Base class for parsing Whois entries.
    """
    # regular expressions to extract domain data from whois profile
    # child classes will override this
    _regex = {
        'domain_name':     r'Domain Name:\s?(.+)',
        'registrar':       r'Registrar:\s?(.+)',
        'whois_server':    r'Whois Server:\s?(.+)',
        'referral_url':    r'Referral URL:\s?(.+)',  # http url of whois_server
        'updated_date':    r'Updated Date:\s?(.+)',
        'creation_date':   r'Creation Date:\s?(.+)',
        'expiration_date': r'Expiration Date:\s?(.+)',
        'name_servers':    r'Name Server:\s?(.+)',  # list of name servers
        'status':          r'Status:\s?(.+)',  # list of statuses
        'emails':          r'[\w.-]+@[\w.-]+\.[\w]{2,4}',  # list of email addresses
    }

    def __init__(self, domain, text, regex=None):
        self.domain = domain
        self.text = text
        if regex is not None:
            self._regex = regex

    def __getattr__(self, attr):
        """The first time an attribute is called it will be calculated here.
        The attribute is then set to be accessed directly by subsequent calls.
        """
        whois_regex = self._regex.get(attr)
        if whois_regex:
            setattr(self, attr, re.findall(whois_regex, self.text))
            return getattr(self, attr)
        else:
            raise KeyError('Unknown attribute: %s' % attr)

    def __str__(self):
        """Print all whois properties of domain
        """
        return '\n'.join('%s: %s' % (attr, str(getattr(self, attr))) for attr in self.attrs())

    def attrs(self):
        """Return list of attributes that can be extracted for this domain
        """
        return sorted(self._regex.keys())


    @staticmethod
    def load(domain, text):
        """Given whois output in ``text``, return an instance of ``WhoisEntry`` that represents its parsed contents.
        """
        # PywhoisError and the Whois* subclasses are defined elsewhere in pywhois
        if text.strip() == 'No whois server is known for this kind of object.':
            raise PywhoisError(text)

        if '.com' in domain:
            return WhoisCom(domain, text)
        elif '.net' in domain:
            return WhoisNet(domain, text)
        elif '.org' in domain:
            return WhoisOrg(domain, text)
        elif '.ru' in domain:
            return WhoisRu(domain, text)
        elif '.name' in domain:
            return WhoisName(domain, text)
        elif '.us' in domain:
            return WhoisUs(domain, text)
        elif '.me' in domain:
            return WhoisMe(domain, text)
        elif '.uk' in domain:
            return WhoisUk(domain, text)
        else:
            return WhoisEntry(domain, text)
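
The interesting part of that class is the lazy lookup in `__getattr__`: a field is parsed only the first time it is read, then cached on the instance. A stripped-down sketch of just that pattern follows. `LazyRecord` and the sample text are hypothetical stand-ins, not pywhois code, and unlike the class above this sketch raises `AttributeError` rather than `KeyError`, so that `hasattr()` behaves normally.

```python
import re


class LazyRecord(object):
    """Hypothetical stand-in for WhoisEntry: parse a field on first access."""

    _regex = {
        "registrar": r"Registrar:\s?(.+)",
        "name_servers": r"Name Server:\s?(.+)",
    }

    def __init__(self, text):
        self.text = text

    def __getattr__(self, attr):
        # Python calls __getattr__ only when normal lookup fails, so this
        # runs once per field: before the value has been cached below.
        pattern = self._regex.get(attr)
        if pattern is None:
            raise AttributeError(attr)
        value = re.findall(pattern, self.text)
        setattr(self, attr, value)  # cache on the instance
        return value


# Made-up WHOIS-style text for illustration.
sample = (
    "Registrar: Example Registrar, Inc.\n"
    "Name Server: ns1.example.net\n"
    "Name Server: ns2.example.net\n"
)

entry = LazyRecord(sample)
cached_before = "name_servers" in vars(entry)  # not parsed yet
servers = entry.name_servers                   # triggers the regex search
cached_after = "name_servers" in vars(entry)   # now stored on the instance
```

Since `re.findall` returns a list, every field comes back as a list, even when there is only one match.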

Edit: since I can't comment on Sven's answer: you can easily handle the storage in a dictionary, like this:

scanner = Scanner(None)  # Scanner.__init__ requires one argument
scanner.emails = '[email protected]'
scanner.expiration_date = 'Tomorrow'
scan_data_dict = scanner.__dict__  # the instance attributes as a dictionary
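
For the "Site" object the question asks about, one possible shape is a small data class, sketched below. It assumes Python 3.7 or later for `dataclasses`; the field selection and sample values are illustrative assumptions, not something the question specifies.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Site:
    """Hypothetical container for the WHOIS attributes of one domain."""
    domain_name: str
    registrar: str = ""
    creation_date: str = ""
    expiration_date: str = ""
    name_servers: List[str] = field(default_factory=list)


# Made-up sample values; a real script would take them from the lookup.
site = Site(
    domain_name="ttt.com",
    registrar="Example Registrar, Inc.",
    expiration_date="2025-05-14",
    name_servers=["ns1.example.net"],
)

# The object converts to a plain dictionary when one is needed ...
site_dict = asdict(site)

# ... and objects can still be collected in a dictionary keyed by domain.
sites = {site.domain_name: site}
```

This keeps attribute access (`site.registrar`) for readability while `asdict` covers the dictionary-style storage suggested in the other answers.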