How or where do I specify the output_writer filename and content type for a GAE mapreduce job? The configuration below works fine for me, but it creates a new blobstore entry with a new filename every time the job runs. I would like to specify the filename and content type so that the output is overwritten/replaced each time I run the mapreduce job.
My handler is writing out lines of text for a csv file.
mapreduce:
- name: Export a model
  mapper:
    input_reader: mapreduce.input_readers.DatastoreInputReader
    output_writer: mapreduce.output_writers.BlobstoreOutputWriter
    handler: export_model
    params:
    - name: entity_kind
      default: models.MyModel
The output_writer support is still experimental, and there's no provision for specifying output filenames yet. You can follow the example in the demo app and use indirection: attach the BlobKey of the output blob to an entity of your choice that holds your desired name.
Look for
yield StoreOutput("WordCount", filekey, output)
in main.py
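A minimal sketch of that indirection pattern: instead of renaming the blob itself, keep a record keyed by your chosen filename that points at whichever blob key the latest job run produced. The `OutputPointer` class and all names here are hypothetical, not part of the mapreduce library; on App Engine the backing store would be a datastore model (as `StoreOutput` does in the demo's main.py) rather than this in-memory dict.

```python
# Hypothetical illustration of the "indirection" approach: map a stable,
# user-chosen filename to the blob key of the most recent job output.
class OutputPointer(object):
    """Maps a stable filename to the latest blob key and content type."""

    _store = {}  # stand-in for a datastore model keyed by filename

    @classmethod
    def update(cls, filename, blob_key, content_type):
        # Overwrites the previous pointer, so rerunning the job
        # "replaces" the output from the application's point of view.
        cls._store[filename] = {"blob_key": blob_key,
                                "content_type": content_type}

    @classmethod
    def lookup(cls, filename):
        return cls._store.get(filename)


# First run produces one blob key, a rerun produces another;
# the filename the rest of the app sees never changes.
OutputPointer.update("export.csv", "blobkey-run-1", "text/csv")
OutputPointer.update("export.csv", "blobkey-run-2", "text/csv")
print(OutputPointer.lookup("export.csv")["blob_key"])  # blobkey-run-2
```

Old blobs are not deleted by this scheme; you would still need to clean up superseded blob keys yourself if storage matters.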