Reading character by character from MS Word

Source: https://www.devze.com, 2023-04-04 03:01 (reposted from the web)
In my program I need to read a PDF file character by character and store every word in a database. I wasn't sure whether that was possible, so I decided to convert the PDF file to an MS Word file with a converter and read from that file instead.

Now I still don't know how to read an MS Word file character by character. I'm using C++/MFC in my program.

Sample code would help me a lot, and I'd be very thankful.


Check out IFilter. http://msdn.microsoft.com/en-us/library/ms691105%28v=vs.85%29.aspx

It's a COM interface for extracting text from files (each file extension has its own DLL, and COM returns the appropriate one for the file you need).

An example in C#: http://www.codeproject.com/KB/cs/IFilter.aspx, or http://www.codeproject.com/KB/string/pdf2text.aspx (I've used it in native C++, but I don't have a code example).

Note that for PDF you might need to download the PDF IFilter: http://www.adobe.com/support/downloads/detail.jsp?ftpID=2611

Good Luck!


If you can convert the source file and you only need the characters, then make it a plain text file and read it using std::ifstream.

To get more sophisticated information from an MS Word file, you should use Office Automation. There are good links in the answers to the following question:

Creating, opening and printing a word file from C++
