I have a list of dates (as datetime objects) sorted in ascending order.
I want to write a function that iterates through this list and generates another list containing the first available date for each month.
For example, suppose my sorted list contains the following data:
A = [
'2001/01/01',
'2001/01/03',
'2001/01/05',
'2001/02/04',
'2001/02/05',
'2001/03/01',
'2001/03/02',
'2001/04/10',
'2001/04/11',
'2001/04/15',
'2001/05/07',
'2001/05/12',
'2001/07/01',
'2001/07/10',
'2002/03/01',
'2002/04/01',
]
The returned list would be
B = [
'2001/01/01',
'2001/02/04',
'2001/03/01',
'2001/04/10',
'2001/05/07',
'2001/07/01',
'2002/03/01',
'2002/04/01',
]
The logic I propose would be something like this:
def extract_month_first_dates(input_list, start_date, end_date):
    # note: start_date and end_date DEFINITELY exist in the passed-in list
    prev_dates, output = [], []  # <- is this even legal?
    for curr_date in input_list:
        if curr_date < start_date or curr_date > end_date:
            continue
        # datetime objects expose .year and .month directly
        curr_month = curr_date.month
        curr_year = curr_date.year
        date_key = "{0}-{1}".format(curr_year, curr_month)
        if date_key in prev_dates:
            continue
        else:
            output.append(curr_date)
            prev_dates.append(date_key)
    return output
Any comments or suggestions? Can this be improved to be more 'Pythonic'?
>>> import itertools
>>> [min(j) for i, j in itertools.groupby(A, key=lambda x: x[:7])]
['2001/01/01', '2001/02/04', '2001/03/01', '2001/04/10', '2001/05/07', '2001/07/01', '2002/03/01', '2002/04/01']
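The one-liner above works on the string form of the dates. If the list holds actual datetime objects, as described in the question, the same groupby idea could be sketched roughly as follows. The function name `first_dates_per_month` and the sample data are made up for illustration. Because the input is already sorted, the first item of each group is the earliest date, so `next()` can stand in for `min()`.

```python
import itertools
from datetime import datetime


def first_dates_per_month(dates):
    """Return the first date of each (year, month) run in a sorted list."""
    groups = itertools.groupby(dates, key=lambda d: (d.year, d.month))
    # The input is sorted, so the first element of each group is the earliest.
    return [next(group) for _, group in groups]


# Hypothetical sample data, parsed from strings in the question's format.
sample = [datetime.strptime(s, "%Y/%m/%d")
          for s in ["2001/01/01", "2001/01/03", "2001/02/04",
                    "2001/02/05", "2002/03/01"]]
result = first_dates_per_month(sample)
print([d.strftime("%Y/%m/%d") for d in result])
```

Grouping on the `(year, month)` tuple avoids building string keys and keeps months from different years apart.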
Searching a list is an O(n) operation. Since the input is sorted, all dates of a given month are adjacent, so I think you can simply check whether the key differs from the previous one:
def extract_month_first_dates(input_list):
    output = []
    last_key = None
    for curr_date in input_list:
        date_key = curr_date.month, curr_date.year  # no string key required
        if date_key != last_key:
            output.append(curr_date)
            last_key = date_key
    return output
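The version above drops the start_date and end_date arguments from the question. If the range restriction is still needed, one possible approach (an assumption on my part, not something this answer proposes) is to slice the sorted list with `bisect` first and then apply the same last-key check. The function name `extract_month_first_dates_in_range` is hypothetical.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime


def extract_month_first_dates_in_range(input_list, start_date, end_date):
    """Hypothetical variant: first date per month within [start_date, end_date]."""
    # input_list is sorted, so binary search finds the slice boundaries.
    lo = bisect_left(input_list, start_date)
    hi = bisect_right(input_list, end_date)
    output = []
    last_key = None
    for curr_date in input_list[lo:hi]:
        date_key = (curr_date.year, curr_date.month)
        if date_key != last_key:
            output.append(curr_date)
            last_key = date_key
    return output


# Illustrative data only.
dates = [datetime(2001, 1, 1), datetime(2001, 1, 3), datetime(2001, 2, 4),
         datetime(2001, 2, 5), datetime(2001, 3, 1)]
picked = extract_month_first_dates_in_range(
    dates, datetime(2001, 1, 3), datetime(2001, 2, 5))
print(picked)
```

Note that with this sketch the "first" date of a month is the first one inside the range, so January yields 2001-01-03 here rather than 2001-01-01.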
Here is a simple, self-explanatory solution in classic Python, i.e. no itertools ;)
visited = {}
B = []
for a in A:
    month = a[:7]  # year/month prefix, e.g. '2001/01'
    if month not in visited:
        B.append(a)
        visited[month] = 1
print(B)
Output:
['2001/01/01', '2001/02/04', '2001/03/01', '2001/04/10', '2001/05/07', '2001/07/01', '2002/03/01', '2002/04/01']