Programmatic access to Amazon Wishlist? [duplicate]

Developer https://www.devze.com 2023-02-02 01:57 Source: web
This question already has answers here: Scraping data to Google Sheets from a website that uses JavaScript (2 answers) Closed 27 days ago.

The community is reviewing whether to reopen this question as of 23 days ago.

Amazon recently changed their APIs, and it seems there is now no way to access my Wish List on Amazon programmatically through them. Does anybody know a way to do it besides screen scraping? Maybe some third-party service (I don't mind working with only public data)?


For screen scraping, the compact layout style might be helpful: http://bililite.com/blog/2010/10/31/hacking-my-way-to-an-amazon-wishlist-widget/

Update

I did some hacking of my own in Google Spreadsheets and managed to get two basic implementations working.

Using Google Apps Scripts:

Type your wishlist ID into cell A1. Copy and paste the following into a Google Apps Script (Tools > Scripts > Scripts Editor), and run the getWishlist function:

function getWishlist() {
  var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheets()[0];
  var wishlistId = sheet.getRange('a1').getValue();
  // Fetch the compact layout, which is easier to parse.
  var response = UrlFetchApp.fetch("http://www.amazon.co.uk/registry/wishlist/" + wishlistId + "?layout=compact").getContentText();
  // match[1] is the item index, match[2] is the ASIN.
  var asinRegex = /name="item\.(\d+)\.(?:[A-Z0-9]+).([A-Z0-9]+)/g;
  var match; // declared locally so it does not leak into the global scope
  while ((match = asinRegex.exec(response)) !== null) {
    var rowIndex = Number(match[1]) + 2; // item 0 goes on row 2, below the ID in A1
    var asin = match[2];
    var offers = UrlFetchApp.fetch("http://www.amazon.co.uk/gp/offer-listing/" + asin).getContentText();
    // No g flag and no greedy (.+): each capture stops at the first "<".
    setRow(sheet, rowIndex, asin,
           getFirstMatch(/class="producttitle">([^<]+)</, offers),
           getFirstMatch(/class="price">([^<]+)</, offers));
  }
  Browser.msgBox("Finished");
}

// Return the first capture group, or "Unknown" when the pattern does not match.
function getFirstMatch(regex, text) {
  var match = regex.exec(text);
  return (match == null) ? "Unknown" : match[1];
}

// Write ASIN, title and price into columns A, B and C of the given row.
function setRow(sheet, index, a, b, c) {
  sheet.getRange('a' + index).setValue(a);
  sheet.getRange('b' + index).setValue(b);
  sheet.getRange('c' + index).setValue(c);
}

NB: I'm having some problems with the regexes matching the title / price. I'm not sure why, but this shows the basic idea.
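
One plausible source of trouble with patterns like class="price">(.+)<, offered as a guess rather than a diagnosis: (.+) is greedy and runs to the last "<" on the line, and a regex with the g flag keeps its lastIndex between exec calls if the same object gets reused. Below is a minimal sketch in plain JavaScript. The sample markup is made up for illustration, and the real offer-listing HTML may well differ.

```javascript
// Hypothetical sample markup; the real Amazon HTML may differ.
var sampleHtml = '<h1 class="producttitle">Some Book</h1><span class="price">9.99</span>';

// Return the first capture group, or a fallback when nothing matches.
function getFirstMatch(regex, text, fallback) {
  regex.lastIndex = 0; // reset state in case a g-flagged regex is reused
  var match = regex.exec(text);
  return match === null ? fallback : match[1];
}

// Greedy: (.+)< backtracks only as far as the LAST "<" on the line.
var greedy = getFirstMatch(/class="producttitle">(.+)</, sampleHtml, "Unknown");

// Negated character class: the capture stops at the first "<".
var title = getFirstMatch(/class="producttitle">([^<]+)</, sampleHtml, "Unknown");
var price = getFirstMatch(/class="price">([^<]+)</, sampleHtml, "Unknown");
```

With this sample, the greedy pattern swallows everything up to the final closing tag, while the [^<]+ versions capture only the text of the element itself.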

Using Google Spreadsheet Functions

Type your wishlist ID into cell A1.

Type the following function into A2. It will populate the cell and all below it with the id strings for each item in your wishlist:

=importXML("http://www.amazon.co.uk/registry/wishlist/"&A1&"?layout=compact", "//*[starts-with(@name, 'item.')]/@name")

Type the following function into B2, which will extract the ASIN (the last 10 characters) from the id string:

=right(A2, 10)
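
For comparison, here is the same extraction sketched in JavaScript. Like =right(A2, 10), it assumes the ASIN is always the last 10 characters of the id string. The function name and the sample id are made up for illustration.

```javascript
// Mirror =right(A2, 10): take the last 10 characters of an item id string.
// Assumption: the ASIN always sits at the end of the string.
function asinFromItemName(itemName) {
  var asin = String(itemName).slice(-10);
  // Guard: an ASIN is treated here as 10 uppercase letters or digits.
  return /^[A-Z0-9]{10}$/.test(asin) ? asin : null;
}

// Hypothetical id string, not taken from a real wishlist.
var example = asinFromItemName("item.0.XXXXXXXXXXXXX.B000000000");
```

Returning null for anything that does not look like an ASIN makes a changed id format visible immediately, instead of silently passing a wrong value on to the offer-listing lookup.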

Type the following function into B3, which will fetch the offer listing for the ASIN in B2 and display the title:

=importXML("http://www.amazon.co.uk/gp/offer-listing/"&B2, "//h1")

Type the following function into B4, which will fetch the offer listing for the ASIN in B2 and display all the prices:

=concatenate(importXML("http://www.amazon.co.uk/gp/offer-listing/"&B2, "//span[@class='price']"))


A guy called Justin Scarpetti has created a really neat "API" which scrapes your wishlist and returns the data in JSON format.

This is a little API to retrieve Amazon Wish List data. There is no official API, as Amazon shut it down a couple years ago. The only way around that... screen scraping.

Amazon Wish Lister uses phpQuery (server-side CSS3 selector driven DOM API based on jQuery) to scrape Amazon's Wish List page and exports to JSON, XML, or PHP Array Object.

Perfect if you want to display your wish list on your own website.

Source: Amazon Wish Lister
