I'm using Eucalyptus and am considering putting HDFS and HBase on our node controllers. Would running HBase on some of our instances improve performance, or is it redundant?
It depends. As always there are three fundamental bottlenecks:
1) CPU
2) Network I/O
3) Disk I/O
If your application is currently CPU bound, or if your data would get a high cache hit rate with the extra nodes, then extra HBase nodes are useful. If your application is mostly disk bound or network bound, then extra HBase nodes won't help much (unless adding more nodes significantly improves your cache hit rate).
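As a rough illustration of that rule of thumb, here is a minimal Python sketch. The function name, the utilization inputs (fractions between 0 and 1), and both thresholds are assumptions made up for this example; they do not come from HBase or Eucalyptus, and you would need to measure the numbers yourself.

```python
def extra_hbase_nodes_likely_help(cpu_util, disk_util, net_util,
                                  cache_hit_gain=0.0,
                                  busy=0.8, min_cache_gain=0.1):
    """Heuristic only: guess whether adding HBase nodes is worthwhile.

    cpu_util, disk_util, net_util: observed utilization, 0.0 to 1.0.
    cache_hit_gain: estimated increase in cache hit rate from the
        extra nodes (0.0 to 1.0). This is an estimate you supply.
    busy, min_cache_gain: hypothetical thresholds, tune to taste.
    """
    # A significant cache hit rate improvement helps regardless of
    # which resource is currently the bottleneck.
    if cache_hit_gain >= min_cache_gain:
        return True
    # Otherwise extra nodes mainly help when CPU is the busiest
    # resource and is actually close to saturation.
    return cpu_util >= busy and cpu_util >= max(disk_util, net_util)


if __name__ == "__main__":
    print(extra_hbase_nodes_likely_help(0.95, 0.30, 0.20))  # CPU bound
    print(extra_hbase_nodes_likely_help(0.30, 0.95, 0.20))  # disk bound
    print(extra_hbase_nodes_likely_help(0.30, 0.95, 0.20, cache_hit_gain=0.2))
```

The thresholds are arbitrary; the point is only that the decision hinges on which resource is saturated and on whether the cache hit rate would change.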
In general, you want your HBase nodes to run on HDFS nodes so that they can take advantage of local data access. I would find any other setup somewhat unusual.