I have an array of 10 rows by 20 columns. Each column corresponds to a data set that cannot be fitted with any sort of continuous mathematical function (it's a series of numbers derived experimentally). I would like to calculate the integral of each column between row 4 and row 8, then store the obtained result in a new array (20 rows x 1 column).
I have tried using different scipy.integrate functions (e.g. quad, trapz, ...).
The problem is that, from what I understand, scipy.integrate must be applied to functions, and I am not sure how to convert each column of my initial array into a function. As an alternative, I thought of calculating the average of each column between row 4 and row 8, then multiplying this number by 4 (i.e. 8-4=4, the x-interval) and storing the result in my final 20x1 array. The problem is...ehm...that I don't know how to calculate the average over a given range. The questions I am asking are:
- Which method is more efficient/straightforward?
- Can integrals be calculated over a data set like the one that I have described?
- How do I calculate the average over a range of rows?
Since you know only the data points, the best choice is to use trapz
(the trapezoidal approximation to the integral, based on the data points you know).
You most likely don't want to convert your data sets to functions, and with trapz
you don't need to.
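For reference, the trapezoidal rule just adds up the areas of the trapezoids between consecutive data points. A minimal sketch with made-up numbers (the arrays x and y below are illustrative only, not taken from the question), comparing a hand-written sum with NumPy's built-in:

```python
import numpy as np

# NumPy 2.0 added trapezoid as the new name for trapz;
# use whichever one this installation provides
trapezoid = getattr(np, "trapezoid", None) or np.trapz

# illustrative data points (not from the question)
x = np.array([0.0, 1.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 4.0, 0.0])

# area of each trapezoid: its width times the mean of its two heights
by_hand = np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0)

built_in = trapezoid(y, x)
print(by_hand, built_in)
```

No function is fitted anywhere: only the sampled values and their x-coordinates are used.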
So if I understand correctly, you want to do something like this:
import numpy as np

# x-coordinates for data points
x = np.array([0, 0.4, 1.6, 1.9, 2, 4, 5, 9, 10])
# some example data: 3 arbitrary data sets (sharing the same x-coordinates)
y = np.zeros([len(x), 3])
y[:,0] = 123
y[:,1] = 1 + x
y[:,2] = np.cos(x/5.)
print(y)
# compute approximations for integral(dataset, x=0..10) for datasets i=0,1,2
# (NumPy 2.0 and later also provide this function under the name np.trapezoid)
yi = np.trapz(y, x[:,np.newaxis], axis=0)
# what happens here: x[:,np.newaxis] has shape (9, 1), which broadcasts against
# y's shape (9, 3), in effect saying that the
# x-coordinates are the same for each data set
# approximations of the integrals based on the data sets
# (here we also know the exact values, so print them too)
print(yi[0], 123*10)
print(yi[1], 10 + 10*10/2.)
print(yi[2], np.sin(10./5.)*5.)
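Applied to the array described in the question, the same idea might look like the sketch below. It assumes 0-based row numbers, rows 4 to 8 inclusive, and unit spacing between rows; the array data is a hypothetical stand-in for the experimental values.

```python
import numpy as np

# NumPy 2.0 added trapezoid as the new name for trapz;
# use whichever one this installation provides
trapezoid = getattr(np, "trapezoid", None) or np.trapz

# stand-in for the experimental data: 10 rows, 20 columns (hypothetical values)
data = np.arange(200, dtype=float).reshape(10, 20)

# rows 4 to 8 inclusive (assumption: rows are numbered from 0)
segment = data[4:9]

# integrate down the rows (axis=0), assuming unit spacing between rows;
# this gives one value per column
integrals = trapezoid(segment, dx=1.0, axis=0)

# reshape to the 20 rows x 1 column layout asked for in the question
result = integrals.reshape(20, 1)
print(result.shape)
```

If the rows are not evenly spaced, pass the actual x-coordinates of rows 4 to 8 as the second argument instead of dx.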
To get the sum of the entries 4 to 8 (including both ends) in each column, use
import numpy
a = numpy.arange(200).reshape(10, 20)
a[4:9].sum(axis=0)
(The arange line is just to create an example array of the desired shape.)
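For the averaging part of the question, the same slice works with mean instead of sum. The sketch below uses random stand-in data and assumes unit spacing between rows. It also compares the "average times interval" estimate with the trapezoidal rule. The two generally differ, because the average weights all five rows equally while the trapezoidal rule gives the two end rows half weight.

```python
import numpy as np

# NumPy 2.0 added trapezoid as the new name for trapz;
# use whichever one this installation provides
trapezoid = getattr(np, "trapezoid", None) or np.trapz

# random stand-in for the experimental data (hypothetical values)
rng = np.random.default_rng(0)
a = rng.random((10, 20))

rows = a[4:9]                      # rows 4 to 8, both ends included

col_mean = rows.mean(axis=0)       # average of each column over those rows
mean_estimate = col_mean * (8 - 4) # average times the x-interval

# trapezoidal estimate with unit spacing between rows
trapz_estimate = trapezoid(rows, dx=1.0, axis=0)

print(mean_estimate[:3])
print(trapz_estimate[:3])
```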