I've read the YQL guide, I keep reviewing http://developer.yahoo.com/yql/guide/yql-o...entables-paging, and I have been looking at a few examples, but I'm still pretty confused about how YQL paging works.
The problem that I am trying to tackle is creating a YQL open data table for the Mozilla labs Jetpack Gallery's jetpacks pages http://jetpackgallery.mozillalabs.com/jetpacks
You flip through the pages of jetpacks with the ?page query variable and there is an order_by query variable. You can only see 10 results per page.
Questions:
- Should I use or ?
- How do I specify the query parameter that indicates the page? In this case it is the 'page' query parameter.
- I am assuming I should use:
<urls><url>http://jetpackgallery.mozillalabs.com/jetpacks</url></urls>
Is this correct?
- In the execute element, will I need to extract the details for each jetpack on the page? If so, how would I organize that for the response.object?
Can anyone provide some help? Or perhaps point to a data table that I can look at as a reference? Or better documentation on how paging works?
Firstly, you should be looking at the paging model. (Your link got compressed above, so I'm just putting it here.)
When you use paging with no <execute></execute> block specified, the paging values are inserted into the query string of the URL specified in <url></url>. Just play with the Flickr Photo Search Example; you have to run it in the Console with Diagnostics turned on to see the changes in the URL. The id attribute names the query parameter that receives the number. Just to illustrate here, the paging portion looks like this:
<paging model="page">
<start id="page" default="0" />
<pagesize id="per_page" max="250" />
<total default="10" />
</paging>
For example, querying
select * from flickr.photos.search(10,20) where has_geo="true"
The URL used was http://api.flickr.com/services/rest/?method=flickr.photos.search&has_geo=true&page=1&per_page=30. As you can see, it actually requested page=1 but asked for per_page=30, and then internally discarded the first 10 results so that you get an offset of 10 and a total of 20 results.
YQL did this because the model selected is page.
Another example, if you attempt to do this:
select * from flickr.photos.search(249,2) where has_geo="true"
YQL will retrieve both ...&page=1&per_page=250 and ...&page=2&per_page=250 (I've shortened the URLs for illustration), as expected, to get the results.
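To make the arithmetic behind those two queries concrete, here is a minimal sketch that reproduces the observed requests. It is inferred only from the two Flickr examples above and is not YQL's actual implementation; the function name planPageRequests and the fields it returns are hypothetical, and pages are assumed to be 1-based as in the observed URLs.

```javascript
// Sketch only: inferred from the two observed Flickr requests, not from YQL's source.
// planPageRequests and its result fields are hypothetical names.
// Assumes count > 0 and 1-based page numbers.
function planPageRequests(offset, count, maxPageSize) {
  // Request enough items per page to cover offset + count, capped at the table's max.
  var perPage = Math.min(offset + count, maxPageSize);
  // First and last page that contain items in the range [offset, offset + count).
  var firstPage = Math.floor(offset / perPage) + 1;
  var lastPage = Math.ceil((offset + count) / perPage);
  var pages = [];
  for (var p = firstPage; p <= lastPage; p++) {
    pages.push(p);
  }
  return {
    perPage: perPage,
    pages: pages,
    // Items to discard from the front of the combined pages.
    skip: offset - (firstPage - 1) * perPage,
    // Items to keep after skipping.
    take: count
  };
}

// flickr.photos.search(10,20): one request, page=1&per_page=30, skip 10, keep 20.
console.log(JSON.stringify(planPageRequests(10, 20, 250)));
// flickr.photos.search(249,2): two requests, page=1 and page=2 with per_page=250.
console.log(JSON.stringify(planPageRequests(249, 2, 250)));
```

The point is that with the page model the remote service only ever sees page and per_page; the offset is applied afterwards by dropping the leading results.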
The paging variables are also defined in the global scope if you use JavaScript in the <execute></execute> section. You can see this being used in the flickr.photos.astro OpenData Table.
I guess that should answer the question for you, since I see that on GitHub, you have been working on how to extract the pages using XPath.
For your case you should have something like:
<paging model="page">
<start id="page" default="1" />
<pagesize id="per_page" max="10" />
<total default="10" />
</paging>
The per_page value would be in your internal query, but it's used by YQL to determine the queries needed. Then in your JavaScript you could probably do something like:
y.query(
"select * from html where url=@url",
{url: "http://jetpackgallery.mozillalabs.com/jetpacks?page=" + page}
);
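Since the question also mentions an order_by query variable, the URL construction could be pulled into a small helper. This is only a sketch: buildJetpacksUrl is a hypothetical name, only the page and order_by parameter names come from the question, and the accepted order_by values are not known here.

```javascript
// Hypothetical helper; only the 'page' and 'order_by' parameter names come from the question.
function buildJetpacksUrl(page, orderBy) {
  var url = "http://jetpackgallery.mozillalabs.com/jetpacks?page=" + encodeURIComponent(page);
  // order_by is optional; its accepted values are an assumption left to the caller.
  if (orderBy) {
    url += "&order_by=" + encodeURIComponent(orderBy);
  }
  return url;
}

// Inside <execute>, the result would be passed to y.query as in the snippet above:
// y.query("select * from html where url=@url", {url: buildJetpacksUrl(page)});
console.log(buildJetpacksUrl(2));
```

Encoding the values keeps the URL valid if an order_by value ever contains spaces or other reserved characters.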