I am using Solr to index RSS feeds, with DataImportHandler parsing the feed URLs and indexing the items. I have also implemented a web service that takes a URL, creates a thumbnail image, and stores it in a local directory.
So here is what I want to do: after each URL is parsed, I want to send an HTTP request to the web service with that URL. ScriptTransformer seemed the way to go, and here is how my data-config.xml file looks:
<dataConfig>
<script> <![CDATA[ function sendURLRequest(row){
var url = new java.net.URL("http://***********/GenerateThumbnail?url=http://money.cnn.com/2011/07/20/news/economy/debt_ceiling_deal/index.htm?cnn=yes");
url.openConnection().connect();
return row; } ]]>
</script>
<dataSource type="JdbcDataSource" name="dbSource" driver="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost/solr_sources" user="root" password="******"/>
<document>
<entity name="rssFeedItems" rootEntity="false" dataSource="dbSource" query="select url from rss_feeds">
<entity name="rssFeeds" dataSource="urlSource" url="${rssFeedItems.url}" transformer="script:sendURLRequest" processor="XPathEntityProcessor" forEach="/rss/channel/item">
<field column="title" xpath="/rss/channel/item/title"/>
<field column="link" xpath="/rss/channel/item/link" />
<field column="description" xpath="/rss/channel/item/description" />
<field column="date_published" xpath="/rss/channel/item/pubDate"/>
</entity>
</entity>
.................
................
As you can see from the data-config file, I am currently testing whether this works by hard-coding a dummy URL.
url.openConnection().connect(); should make the HTTP request, but the image is not generated.
I see no compile errors. I tried the example script that prints a message:
var v = new java.lang.Runnable() {
run: function() { print('********************PRINTING************************'); }
}
v.run();
And it worked.
I even played around with the function names to force some compile errors, and it did throw them, which suggests the script is able to create objects of the URL and URLConnection classes.
Any suggestions?
I think you need to do more than just connect() to the URL to issue an HTTP GET. Maybe try:
var url = new java.net.URL("http://***********/GenerateThumbnail?url=http://money.cnn.com/2011/07/20/news/economy/debt_ceiling_deal/index.htm?cnn=yes");
var connection = url.openConnection();
connection.connect();
connection.getContent();
return row;
I was curious, so I ran a small experiment and found that url.openConnection().connect() on its own never produced a request on my test server. It wasn't until I called getContent() that the client actually issued the HTTP request. Perhaps for HTTP the Java URL library sees no need to send anything at connect time and defers the request until the data is asked for (as opposed to a URL for something like an FTP address, where a stateful connection is needed up front).
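If you want to reproduce that experiment, here is a minimal sketch in plain Java rather than in the ScriptTransformer script. It assumes a JDK that ships the built-in com.sun.net.httpserver package. The class name ConnectVersusRequest, the probe() helper, the local stand-in server, and the example.com page URL are all made up for illustration; only the /GenerateThumbnail path and the url parameter come from the question. It counts how many requests the server has handled after connect() and again after getResponseCode(), which, like getContent(), forces the request to be sent. It also URL-encodes the nested page address, since the hard-coded test URL passes a second "?" unencoded inside the query string.

```java
import com.sun.net.httpserver.HttpServer;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.net.URLEncoder;
import java.util.concurrent.atomic.AtomicInteger;

public class ConnectVersusRequest {

    /**
     * Returns a two-element array:
     * [0] requests the server had handled after connect(),
     * [1] requests the server had handled after getResponseCode().
     */
    public static int[] probe() throws Exception {
        AtomicInteger requestsSeen = new AtomicInteger();

        // Hypothetical stand-in for the thumbnail service; port 0 picks any free port.
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/GenerateThumbnail", exchange -> {
            requestsSeen.incrementAndGet();        // count the request before replying
            exchange.sendResponseHeaders(204, -1); // 204 No Content, no response body
            exchange.close();
        });
        server.start();

        try {
            // Encode the page URL so its own "?" and "=" cannot be confused
            // with the outer query string. The page address is a placeholder.
            String page = URLEncoder.encode("http://example.com/story?cnn=yes", "UTF-8");
            URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort()
                    + "/GenerateThumbnail?url=" + page);
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();

            // connect() alone does not send the GET request.
            connection.connect();
            Thread.sleep(200); // give the server a moment in case anything did arrive
            int afterConnect = requestsSeen.get();

            // Asking for the response forces the request to be written and answered.
            connection.getResponseCode();
            int afterResponse = requestsSeen.get();

            connection.disconnect();
            return new int[] { afterConnect, afterResponse };
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws Exception {
        int[] counts = probe();
        System.out.println("requests seen after connect(): " + counts[0]);
        System.out.println("requests seen after getResponseCode(): " + counts[1]);
    }
}
```

In the transformer script the same idea would amount to calling something that reads the response, such as connection.getContent() or connection.getResponseCode(), before returning the row.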