2012-11-20 26 views
1

Using a WebView, I want to find the links in the loaded page. When is the DOM ready, and how do I enumerate its elements?

-(void)webView:(WebView *)sender didFinishLoadForFrame:(WebFrame *)frame { 
    DOMDocument *myDOMDocument = [[self.webview mainFrame] DOMDocument]; 

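This looks like a good starting point, but I find the WebScriptObject class reference rather cryptic. Obviously, I don't want to evaluate some JavaScript to get the links. I want to read the DOM directly.

How can I find out which nodes in the DOM are links, and get the addresses they point to?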

Answers

2

Finding images

See walkNodeTree at http://cocoadev.com/wiki/DOMCore

A complete sample that walks the DOMNodes, finds the image nodes, reads their src attributes, and makes NSImages from them:

@implementation DDAppDelegate 

- (void)applicationDidFinishLaunching:(NSNotification *)aNotification { 
    [self.webview.mainFrame loadRequest:[NSURLRequest requestWithURL:[NSURL URLWithString:@"http://dominik.pich.info/Home.html"]]]; 
} 

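-(void)webView:(WebView *)sender didFinishLoadForFrame:(WebFrame *)frame { 
    // This delegate method fires once per frame; only handle the main frame. 
    if (frame != [sender mainFrame]) return; 
    DOMDocument *myDOMDocument = [[self.webview mainFrame] DOMDocument]; 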

    NSMutableArray *imgs = [NSMutableArray array]; 
    [self walkNodeTree:myDOMDocument imgsCollected:imgs]; 

    //bad code, demo 
    NSMutableArray *nsImages = [NSMutableArray array]; 
    for (DOMNode *img in imgs) { 
     for(int i = 0; i < img.attributes.length; i++) { 
      DOMNode *attr = [img.attributes item:i]; 
      NSLog(@"%@", attr.nodeName); 
      if([attr.nodeName.lowercaseString isEqualToString:@"src"]) { 
       NSString *urlstring = [attr nodeValue]; 
       NSURL *url = [NSURL URLWithString:urlstring relativeToURL:[NSURL URLWithString:@"http://dominik.pich.info/"]]; 
       NSImage *nsimg = [[NSImage alloc] initWithContentsOfURL:url]; 
       if(nsimg) 
        [nsImages addObject:nsimg]; 
      } 
     } 
    } 

    NSLog(@"%@", nsImages); 
} 

- (void)walkNodeTree:(DOMNode*)parent imgsCollected:(NSMutableArray*)imgs { 
    DOMNodeList *nodeList = [parent childNodes]; 
    unsigned i, length = [nodeList length]; 
    for (i = 0; i < length; i++) { 
     DOMNode *node = [nodeList item:i]; 

     NSLog(@"%@", node.nodeName); 
     if([node.nodeName.lowercaseString isEqualToString:@"img"]) { 
      [imgs addObject:node]; 
     } 
     else { 
      //recurse 
      [self walkNodeTree:node imgsCollected:imgs]; 
     } 
    } 
} 
@end 
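
The question asks about links rather than images. As a minimal sketch (not part of the original answer), the same approach can be pointed at anchors: `getElementsByTagName:` returns every `<a>` element, and each `DOMHTMLAnchorElement` exposes its target through `href`. The helper name `DDLinkStringsInFrame` is hypothetical, and the code assumes the legacy WebKit `WebView` API used above.

```objc
#import <WebKit/WebKit.h>

// Hypothetical helper (not from the original answer): returns the href
// strings of all <a> elements in the given frame's DOM document.
static NSArray *DDLinkStringsInFrame(WebFrame *frame) {
    DOMDocument *document = [frame DOMDocument];
    DOMNodeList *anchors = [document getElementsByTagName:@"a"];
    NSMutableArray *links = [NSMutableArray array];

    for (unsigned i = 0; i < anchors.length; i++) {
        DOMNode *node = [anchors item:i];
        // Skip anything that is not an HTML anchor element.
        if (![node isKindOfClass:[DOMHTMLAnchorElement class]]) continue;

        NSString *href = [(DOMHTMLAnchorElement *)node href];
        // Anchors without a usable href (for example named anchors) are ignored.
        if (href.length > 0) [links addObject:href];
    }
    return links;
}
```

It could be called from `webView:didFinishLoadForFrame:` once the main frame has finished loading, for example `NSLog(@"%@", DDLinkStringsInFrame([sender mainFrame]));`.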
1

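I have been using XPath; see the XPath Introduction.

I pass the HTML from the URL to an NSXMLDocument and then get the values I want with NSXMLNode's nodesForXPath:error:

In this case I use the main frame's URL, but any valid URL should work.

Both NSXML classes seem to have no trouble parsing HTML just as they do XML.

There are plenty of examples of XPath query string syntax that you can search for, and I found it very easy to drill down into the DOM tree once you know what the HTML tags and class syntax are.

Here I have used a very simple a href query over the whole page.

But I have also included a commented-out example to show a bit more.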

-(void)applicationDidFinishLaunching:(NSNotification *)aNotification 
{ 
    [theWebView setFrameLoadDelegate:self]; 

    NSURL* fileURL = [NSURL URLWithString:@"http://example.com"]; 

    NSURLRequest* request = [NSURLRequest requestWithURL:fileURL]; 
    [[theWebView mainFrame] loadRequest:request]; 
} 

-(void)webView:(WebView *)sender didFinishLoadForFrame:(WebFrame *)frame { 
    NSError *err_p = nil; 

    NSXMLDocument * xmlDoc = [[NSXMLDocument alloc] initWithContentsOfURL:[NSURL URLWithString:[theWebView mainFrameURL]] 
                    options:(NSXMLNodePreserveWhitespace| 
                      NSXMLNodePreserveCDATA) 
                    error:&err_p]; 

    if (xmlDoc == nil) { 

     xmlDoc = [[NSXMLDocument alloc] initWithContentsOfURL:[NSURL URLWithString:[theWebView mainFrameURL]] 
                 options:NSXMLDocumentTidyXML 
                 error:&err_p]; 

    } 

    NSError *error2 = nil;

    NSString *xpathQueryTRTest = @"//a"; //-- query string for all <a> tags
    //-- for example 2 -- NSString *xpathQueryTRTest = @"//div/p[1]"; //-- query string for the first <p> child of every <div>
    NSArray *newItemsNodesTRTEST = [xmlDoc nodesForXPath:xpathQueryTRTest error:&error2]; //-- XPath node results returned in an array

    [xmlDoc release];

    if (newItemsNodesTRTEST == nil) //-- nil means the query failed (or the document never loaded)
    {
        if (error2)
            [[NSAlert alertWithError:error2] runModal];
        return;
    }

    for (NSXMLElement *node in newItemsNodesTRTEST) //-- walk the nodes in the array
    {
        NSLog(@"\nThe Node = %@\nThe node href value = %@", node, [[node attributeForName:@"href"] stringValue]);
        //-- for example 2 -- NSLog(@"\nThe Node value = %@\n", [node stringValue]);
    }
}
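
As a variation on the above, the query can select the href attributes themselves with `//a/@href`, and the document can be built from HTML already in memory instead of fetching the URL a second time. This is only a sketch under assumptions: the helper name `HrefsInHTMLString` is hypothetical, the code assumes ARC (unlike the manual `release` above), and it uses `NSXMLDocumentTidyHTML` so that loosely formed HTML is cleaned up before parsing.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original answer): parses an HTML string
// and returns the href values of its <a> elements, or nil on failure.
// Assumes ARC.
static NSArray *HrefsInHTMLString(NSString *html, NSError **error) {
    // Tidy the HTML into well-formed XML while parsing.
    NSXMLDocument *doc = [[NSXMLDocument alloc] initWithXMLString:html
                                                          options:NSXMLDocumentTidyHTML
                                                            error:error];
    if (doc == nil) return nil;

    // Select the href attribute nodes directly instead of the <a> elements.
    NSArray *attributes = [doc nodesForXPath:@"//a/@href" error:error];
    if (attributes == nil) return nil;

    NSMutableArray *hrefs = [NSMutableArray array];
    for (NSXMLNode *attribute in attributes) {
        NSString *value = [attribute stringValue];
        if (value) [hrefs addObject:value];
    }
    return hrefs;
}
```

The returned strings are whatever the page wrote in its href attributes, so relative links would still need resolving against the page URL, for example with `+[NSURL URLWithString:relativeToURL:]`.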
0

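Both answers are Mac-only and do not apply to iOS. If you landed on this page looking for an iOS solution, take a look at this tutorial, which basically uses the hpple library to traverse the DOM nodes. The rest is simple.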