Israel has released it's budget for all to see, and there is an API to extract the data. However, I don't know how to parse it into a txt/csv format.
Here is an example link to make a call for data.
Here is the output:
[
{
"parent": [
{
"budget_id": "00",
"title": "המדינה"
}
],
"net_amount_revised": 6075053,
"year": 2003,
"title": "השכלה גבוהה",
"gross_amount_used": 5942975,
"gross_amount_revised": 5942975,
"budget_id": "0021",
"net_amount_used": 5936491,
"inflation_factor": 1.15866084989269,
"net_amount_allocated": 5861591,
"gross_amount_allocated": 5861591
},
{
"parent": [
{
"budget_id": "0021",
"title": "השכלה גבוהה"
},
{
"budget_id": "00",
"title": "המדינה"
}
],
"net_amount_revised": 5364976,
"year": 2003,
"title": "השתתפות בתקציב המוסדות להשכלה גבוהה",
"gross_amount_used": 5337585,
"gross_amount_revised": 5337584,
开发者_运维问答 "budget_id": "002102",
"net_amount_used": 5331101,
"inflation_factor": 1.15866084989269,
"net_amount_allocated": 4985915,
"gross_amount_allocated": 4985915
},
{
"parent": [
{
"budget_id": "0021",
"title": "השכלה גבוהה"
},
{
"budget_id": "00",
"title": "המדינה"
}
],
"net_amount_revised": 565495,
"year": 2003,
"title": "השתתפות בפעולות",
"gross_amount_used": 462490,
"gross_amount_revised": 462490,
"budget_id": "002103",
"net_amount_used": 462490,
"inflation_factor": 1.15866084989269,
"net_amount_allocated": 559293,
"gross_amount_allocated": 559293
},
{
"parent": [
{
"budget_id": "0021",
"title": "השכלה גבוהה"
},
{
"budget_id": "00",
"title": "המדינה"
}
],
"net_amount_revised": 0,
"year": 2003,
"title": "רזרבה להתייקרויות",
"gross_amount_used": 0,
"gross_amount_revised": null,
"budget_id": "002105",
"net_amount_used": null,
"inflation_factor": 1.15866084989269,
"net_amount_allocated": 171801,
"gross_amount_allocated": 171801
},
{
"parent": [
{
"budget_id": "0021",
"title": "השכלה גבוהה"
},
{
"budget_id": "00",
"title": "המדינה"
}
],
"net_amount_revised": 108000,
"year": 2003,
"title": "פיתוח מוסדות להשכלה גבוהה",
"gross_amount_used": 108000,
"gross_amount_revised": 108000,
"budget_id": "002106",
"net_amount_used": 108000,
"inflation_factor": 1.15866084989269,
"net_amount_allocated": 108000,
"gross_amount_allocated": 108000
},
{
"parent": [
{
"budget_id": "0021",
"title": "השכלה גבוהה"
},
{
"budget_id": "00",
"title": "המדינה"
}
],
"net_amount_revised": 23634,
"year": 2003,
"title": "תחום פעולה כללי",
"gross_amount_used": 23634,
"gross_amount_revised": 23634,
"budget_id": "002101",
"net_amount_used": 23634,
"inflation_factor": 1.15866084989269,
"net_amount_allocated": 23634,
"gross_amount_allocated": 23634
},
{
"parent": [
{
"budget_id": "0021",
"title": "השכלה גבוהה"
},
{
"budget_id": "00",
"title": "המדינה"
}
],
"net_amount_revised": 12948,
"year": 2003,
"title": "פעולות עם משרדים ומוסדות אחרים",
"gross_amount_used": 11266,
"gross_amount_revised": 11266,
"budget_id": "002104",
"net_amount_used": 11266,
"inflation_factor": 1.15866084989269,
"net_amount_allocated": 12948,
"gross_amount_allocated": 12948
}
]
What would be the way to parse this into a table format?
Thanks!
Tal
If you install the rjson package You should be able to do:
do.call( 'rbind', fromJSON( file="http://budget.yeda.us/0021?year=2003&depth=1" ) )
[edit]
Actually.. that has problems with the variable length internal parent
list, but should get you half way there
Yep, it's JSON. fromJSON
will turn it into a list for you
resp <- getURL("http://budget.yeda.us/0021?year=2003&depth=1")
library(rjson)
resp <- fromJSON(resp)
That gets you to list form. For a data frame, try:
library(plyr)
resp <- llply(resp, function(x) llply(x, function(y) ifelse(is.null(y), "NULL", y)))
budget <- data.frame()
for(i in 1:length(resp)) {
budget <- rbind.fill(budget, data.frame(resp[[i]]))
}
The nested llply
take care of some unpleasantness when creating data frames that include null values.
Looks like JSON. Try the rjson package, but some looping or tricky listy fiddling may be required.
Lunchtime now, otherwise I'd have a solution pasted in. Give the non-lunching portion of the hive mind a few minutes....
This is similar to a previous answer but also creates columns for the second budget_id
and title
fields in parent
, not just the first and is structured slightly differently operating on the parent
and the rest
separately and then putting them back together.
library(rjson)
library(plyr)
js <- fromJSON(file = "http://budget.yeda.us/0021?year=2003&depth=1")
toDF <- function(x) do.call("rbind.fill", lapply(x, as.data.frame))
Null2NA <- function(x) if (is.null(x)) NA else x
parent1 <- lapply(js, "[[", "parent")
rest <- lapply(js, function(x) lapply(x[-1], Null2NA))
cbind(toDF(parent1), toDF(rest))
精彩评论