View the innards of a .ppt file?

Developer https://www.devze.com 2023-03-27 11:18 Source: Internet
I need to figure out what is going on inside a client's .ppt files. What is a good way to get started?

My eventual hope is to convert it to HTML. But if I just export the .ppt to HTML, I get a lot of images (as opposed to text), which is not a Good Thing.

EDIT: software that automatically converts .ppt to HTML would be terrific, provided that it preserves as much information as possible in text format. If that doesn't exist, the next best thing would be to understand the innards of the .ppt and write my own code to do a partial conversion.

EDIT: I used OfficeConvert as recommended by Michiel Leenaars. It got me text all right. My 50-page, 8MB test file turned into 40MB of text. The fact that I got text is good. The fact that the amount went way up is moving in the wrong direction. And there is an awful lot of repetition in there. The word "style" appeared 410815 times; the word "draw" appeared 351229 times.


I think a safe way would be to use OfficeConvert, which drives Microsoft Office programmatically to convert the file to ODF. Run it with /? to get help. There are some dependencies (see below).

Then use a good ODF library like lpod to look inside it.

You can view some interesting code examples here.
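
If you only want a quick look before committing to a library, here is a minimal sketch in plain Python (standard library only, not lpod). It relies on the fact that an ODF presentation (.odp) is a zip archive whose slide content lives in content.xml, and it keeps only the paragraph text per slide, ignoring the style and drawing markup that inflates the converted output. The function name slide_texts is made up for illustration, and it assumes the converted file is a regular, unencrypted .odp.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces defined by the OpenDocument format.
DRAW_NS = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def slide_texts(odp_file):
    """Return [(slide name, [paragraph text, ...]), ...] for an .odp file.

    odp_file may be a path or a binary file-like object.
    (Hypothetical helper for illustration; not part of any library.)
    """
    with zipfile.ZipFile(odp_file) as archive:
        root = ET.fromstring(archive.read("content.xml"))

    slides = []
    # Each slide is a <draw:page> element.
    for page in root.iter(f"{{{DRAW_NS}}}page"):
        name = page.get(f"{{{DRAW_NS}}}name", "")
        paragraphs = []
        # Text lives in <text:p> elements, possibly split across <text:span>s.
        for para in page.iter(f"{{{TEXT_NS}}}p"):
            text = "".join(para.itertext()).strip()
            if text:  # skip empty paragraphs
                paragraphs.append(text)
        slides.append((name, paragraphs))
    return slides
```

Calling slide_texts("converted.odp") (the file name is just an example) gives you one entry per slide, which is an easy starting point for writing your own HTML.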


Dependencies:

  • Microsoft .NET Framework Version 2.0 Redistributable Package (x86)
  • Primary Interop Assemblies for Office 2007 or Office 2010 (whichever you are using).


I like the Aspose products. (I'm not associated with them other than as a customer.) I've used the PPT one specifically to write code that pokes around in the insides of a PPT. Overkill if you just want to convert it to HTML, but invaluable for the sorts of things I use it for.


If you know Java, Apache has the POI project, which lets you look at the innards of a PPT file. You could get all the info you want out of the file (images, text) and then convert it to HTML however you like.

It's free too.
