data-processing
Data manipulating environment
I am looking for something* to aid me in manipulating and interpreting data. Data of the names, addresses and that sort.[Details]
2023-04-07 17:22 Category: Q&A
Large scale data processing HBase vs Cassandra [closed]
As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely soli[Details]
2023-04-01 10:31 Category: Q&A
correlation failure - Pearson
I want to write to datafile information about correlation as follows: *korelacja=cor(p2,d2,method="pearson",use = "complete.obs")[Details]
2023-03-31 02:18 Category: Q&A
data and software architecture for calculations from year 0 - year n
For example, our application tracks animal movements and prices for a farm. To get the current stock count the simplest solution is to have a starting number, then add up all the movement in and out u[Details]
2023-03-29 07:56 Category: Q&A
Can anyone suggest a regex pattern that matches 4 consecutive lines of text?
I am trying to parse a large data file. In the file there are groups of either 3 or 4 lines of data separated by a blank line. E.g.:[Details]
2023-03-26 02:55 Category: Q&A
What is the optimal way to process a very large (over 30GB) text file and also show progress
[newbie question] Hi, I'm working on a huge text file which is well over 30GB. I have to do some processing on each line and then write it to a db in JSON format. When I read the f[Details]
2023-03-08 20:54 Category: Q&A
Convert list of pairs to a table in shell without using awk
I have a tab-delimited list of pairs like this: apple yellow orange green apple red[Details]
2023-02-10 08:37 Category: Q&A
How do I extract data between square brackets that appear several times in a line using perl?
I have a line that contains multiple instances of square bracketed data. [data 1] junk [data 2] junk,junk [data 3] junk [data 4][Details]
2023-02-04 01:32 Category: Q&A
How do you handle timezones for data processing?
Curious how people have solved this problem... I have a series of jobs that run overnight that roll up reports based on that day's data for customers. They're now asking for timezon[Details]
2023-01-17 12:33 Category: Q&A
How should I filter this data?
I have several series of data points that need to be graphed. For each graph, some points may need to be thrown out due to error. An example is the following:[Details]
2023-01-16 20:18 Category: Q&A