I have a list of timestamps in a text file. I want to figure out the times at which the change is more than a given threshold.
Input format:
10:13:55
10:14:00
10:14:01
10:14:02
10:14:41
10:14:46
10:17:58
10:18:00
10:19:10
10:19:16
If the threshold is, say, 30 seconds, I want the output to list the cases where the change is >= 30 seconds,
e.g. 10:14:02 and 10:14:41, 10:14:46 and 10:17:58.
Solutions in bash, python or ruby would be helpful. Thanks.
I tend to use awk (with a sed filter to break your lines up) for things like that:
echo '10:13:55 10:14:00 10:14:01 10:14:02
      10:14:41 10:14:46 10:17:58 10:18:00
      10:19:10 10:19:16' |
  sed -e 's/  */ /g' -e 's/^ //' -e 's/ $//' -e 's/ /\n/g' |
  awk -F: '
    NR==1 {s=$0;s1=$1*3600+$2*60+$3}
    NR>1 {t1=$1*3600+$2*60+$3;if (t1-s1 > 30) print s" "$0;s1=t1;s=$0}
  '
outputs:
10:14:02 10:14:41
10:14:46 10:17:58
10:18:00 10:19:10
Here's how it works:
- It sets the field separator to : for easy extraction.
- When the record number is 1 (NR==1), it simply stores the time (s=$0) and the number of seconds since midnight (s1=$1*3600+$2*60+$3). This is the first baseline.
- Otherwise (NR>1), it gets the seconds since midnight (t1=$1*3600+$2*60+$3) and, if that's more than 30 seconds since the last one, it outputs the last time and this time (if (t1-s1 > 30) print s" "$0).
- Then it resets the baseline for the next line (s1=t1;s=$0).
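The steps above can also be sketched in Python, one of the languages the question asks for. This is only an illustration of the same baseline idea, not a replacement for the awk script; the function names seconds_since_midnight and find_gaps and the limit parameter are made up for this sketch, and the timestamps are the ones from the question.

```python
def seconds_since_midnight(stamp):
    """Convert an HH:MM:SS string to seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in stamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def find_gaps(stamps, limit=30):
    """Return (previous, current) pairs whose gap is more than `limit` seconds."""
    gaps = []
    baseline = None        # plays the role of s in the awk script
    baseline_secs = None   # plays the role of s1
    for stamp in stamps:
        secs = seconds_since_midnight(stamp)   # t1 in the awk script
        if baseline is not None and secs - baseline_secs > limit:
            gaps.append((baseline, stamp))
        # Reset the baseline for the next record
        baseline, baseline_secs = stamp, secs
    return gaps


stamps = ["10:13:55", "10:14:00", "10:14:01", "10:14:02", "10:14:41",
          "10:14:46", "10:17:58", "10:18:00", "10:19:10", "10:19:16"]
for previous, current in find_gaps(stamps):
    print(previous, current)
```

Like the awk version, this assumes all times fall on the same day, so a run that crosses midnight would need extra handling.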
Keep in mind the sed command is probably more complicated than it needs to be in this example: it collapses all space sequences to one space, removes them from the start and end of lines, then converts the remaining spaces into newlines so that each time sits on its own line. Depending on the input form of your data (mine is complicated since it's formatted for readability), this may not all be necessary.
Update: Since the question edit has stated that the input is one time per line, you don't need the sed part at all.
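For comparison, the whitespace clean-up that sed performs here comes for free in Python: str.split() with no arguments splits on any run of spaces or newlines and ignores leading and trailing whitespace. A small sketch, where the tokens helper and the two sample strings are made up for illustration:

```python
# The indented, several-per-line layout used in the echo example
multi_line = """10:13:55 10:14:00 10:14:01 10:14:02
      10:14:41 10:14:46 10:17:58 10:18:00
      10:19:10 10:19:16"""

# The same times, one per line, as in the edited question
one_per_line = "\n".join(multi_line.split())


def tokens(text):
    """Split text into timestamps, whatever the whitespace layout."""
    # With no arguments, split() treats any whitespace run as one separator
    return text.split()


# Both layouts give the same list of ten timestamps
print(tokens(multi_line) == tokens(one_per_line))
print(len(tokens(multi_line)))
```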
Python:
from datetime import datetime

# Read all whitespace-separated timestamps from the file
with open("times.txt") as f:
    stamps = f.read().split()

lasttime = None
for timestamp in (datetime.strptime(s, "%H:%M:%S") for s in stamps):
    # Compare each time against the previous one
    if lasttime is not None and (timestamp - lasttime).total_seconds() > 30:
        print(lasttime.time(), "and", timestamp.time())
    lasttime = timestamp
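In Python:
import datetime

data = open('filename').read()
# datetime.time objects cannot be subtracted, so parse into datetime objects
times = [datetime.datetime.strptime(x, '%H:%M:%S') for x in data.split()]
for i in range(1, len(times)):
    if times[i] - times[i-1] > datetime.timedelta(seconds=30):
        print(times[i-1].time(), times[i].time())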
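Ruby:
require 'time'

# filename is assumed to hold the path to the input file
# Parse every whitespace-separated timestamp in the file
times = File.read(filename).split.map { |t| Time.parse(t) }
# Compare each time with the one that follows it (difference is in seconds)
times.each_cons(2) do |a, b|
  puts "#{a.strftime('%T')} #{b.strftime('%T')}" if b - a > 30
end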
@OP, your algorithm just needs to iterate over each field, convert it to seconds, and compare it against its neighbour.
gawk 'BEGIN{threshold=30}
{
  for(i=1;i<=NF;i++){
    split($i,t,":")
    sec = (t[1]*3600) + (t[2]*60) + t[3]
    # compare with the previous time, which may be on an earlier line
    if ( prev != "" && (sec - prev_sec) > threshold ){
      print prev, $i
    }
    prev = $i; prev_sec = sec
  }
}' file
output:
# ./shell.sh
10:14:02 10:14:41
10:14:46 10:17:58
10:18:00 10:19:10
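One detail worth noting: the question asks for changes of >= 30 seconds, while the scripts in this thread test for strictly more than 30. The two only differ when a gap is exactly 30 seconds, which never happens in the sample data. A minimal Python sketch of the neighbour comparison with a switchable boundary follows; the neighbour_gaps name, its parameters, and the three-item sample list are made up for illustration.

```python
from datetime import datetime, timedelta


def neighbour_gaps(stamps, threshold_seconds=30, inclusive=True):
    """Return adjacent (earlier, later) pairs separated by at least the threshold.

    With inclusive=False the gap must be strictly greater than the threshold.
    """
    threshold = timedelta(seconds=threshold_seconds)
    times = [datetime.strptime(s, "%H:%M:%S") for s in stamps]
    pairs = []
    for i in range(len(times) - 1):
        gap = times[i + 1] - times[i]
        if gap > threshold or (inclusive and gap == threshold):
            pairs.append((stamps[i], stamps[i + 1]))
    return pairs


sample = ["10:00:00", "10:00:30", "10:00:45"]   # made-up boundary case
print(neighbour_gaps(sample))                   # inclusive, as the question asks
print(neighbour_gaps(sample, inclusive=False))  # strict, as in the scripts above
```

With the inclusive default the exact 30-second gap is reported; with inclusive=False nothing is.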