Modify a .webarchive from within Cocoa and write it out again

Developer https://www.devze.com 2023-02-15 09:29 Source: the web
I have access to a .webarchive file. So far I have managed to create a WebArchive object from the file (using PyObjC). I wish to modify some elements in the DOM tree and write the modified data back out.

Given a WebArchive, I guess I need access to the root of its DOM tree (the webarchive is a single web page, with no links).

Does anyone have an idea how to do this in Cocoa? Thank you.


Possible solution (not tested yet):

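# Load a .webarchive into an off-screen WebView (PyObjC, legacy WebKit API)
from Foundation import NSData
from WebKit import WebArchive, WebView

# Read the raw archive bytes and wrap them in a WebArchive object
data = NSData.dataWithContentsOfFile_("/tmp/x.webarchive")
archive = WebArchive.alloc().initWithData_(data)

# The frame is ((x, y), (width, height)); the frame name "foo" is arbitrary
webview = WebView.alloc().initWithFrame_frameName_groupName_(((100, 100), (100, 100)), "foo", None)
main_frame = webview.mainFrame()
main_frame.loadArchive_(archive)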


Your code to load the WebArchive into the WebView looks correct (I'm not very familiar with PyObjC). Modifying the DOM is pretty easy using the DOM methods from the WebKit API (see its documentation). The tricky bit is writing the modifications back to the WebArchive once you've changed the DOM. Simply saving a new WebArchive won't work because it won't preserve your modifications, so you need to write the new source yourself. Here's some code that will do that (the WebView is webview, the original WebArchive is located at archivePath, and the modified version is written back to the same path):

// Get the string representation of the current DOM tree
NSString *sourceString = [(DOMHTMLElement *)[[[webview mainFrame] DOMDocument] documentElement] outerHTML];
NSData *sourceData = [sourceString dataUsingEncoding:NSUTF8StringEncoding];

// Load the archive from disk into a dictionary (it's a plist)
NSURL *archiveURL = [NSURL fileURLWithPath:archivePath];
NSMutableDictionary *archive = [[NSMutableDictionary alloc] initWithContentsOfURL:archiveURL];
// Replace the main HTML. Nested containers loaded this way are documented
// as immutable, so edit a mutable copy and put it back.
NSMutableDictionary *mainResource = [[archive objectForKey:@"WebMainResource"] mutableCopy];
[mainResource setObject:sourceData forKey:@"WebResourceData"];
// The data above is UTF-8, so record that encoding alongside it
[mainResource setObject:@"UTF-8" forKey:@"WebResourceTextEncodingName"];
[archive setObject:mainResource forKey:@"WebMainResource"];
// Write the plist back out in binary format
NSData *data = [NSPropertyListSerialization dataWithPropertyList:archive format:NSPropertyListBinaryFormat_v1_0 options:0 error:NULL];
[data writeToURL:archiveURL atomically:YES];

This is a bit of a hack because it relies on the internal structure of the archive format, which is undocumented, but I think you can pretty safely assume it won't change drastically.
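
Since the question was asked from Python, here is a rough sketch of the same plist edit using only the standard library's plistlib, with no WebKit involved. It relies on the same undocumented layout as the code above: the WebMainResource and WebResourceData keys come from that code, while WebResourceTextEncodingName is an assumption about what the archive contains (the sketch falls back to UTF-8 if it is missing). The function name rewrite_main_html is made up for illustration, and the transform callback stands in for whatever change you want to make to the HTML source; it edits text, not a real DOM.

```python
import plistlib
from pathlib import Path


def rewrite_main_html(archive_path, transform):
    """Apply transform(html: str) -> str to the main resource of a .webarchive.

    Hypothetical helper: it depends on the undocumented plist layout
    (WebMainResource / WebResourceData) described in the answer above.
    """
    path = Path(archive_path)
    with path.open("rb") as f:
        # plistlib detects binary vs. XML plists automatically
        archive = plistlib.load(f)

    main = archive["WebMainResource"]
    # Assumption: the archive records its text encoding; otherwise use UTF-8
    encoding = main.get("WebResourceTextEncodingName", "utf-8")
    html = main["WebResourceData"].decode(encoding)

    # Store the modified source back as raw bytes in the same encoding
    main["WebResourceData"] = transform(html).encode(encoding)

    with path.open("wb") as f:
        plistlib.dump(archive, f, fmt=plistlib.FMT_BINARY)
```

A call would look something like rewrite_main_html("/tmp/x.webarchive", lambda html: html.replace("old", "new")). Unlike the WebView route, nothing is rendered or parsed here, so this only suits changes you can express as edits to the HTML text.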
