New in Cloudera Labs: Apache HTrace (incubating)
Through a combination of beta functionality in CDH 5.5 and new Cloudera Labs packages, you now have access to Apache HTrace for performance tracing of your HDFS-based applications. HTrace is a new Apache Incubator project that provides a bird’s-eye view of the performance of a distributed system. While log files offer a peek into important events on a specific node, and metrics answer questions about aggregate performance, HTrace can follow a specific request all the way through the cluster.

HTrace breaks requests down into sets of trace spans. Each trace span represents a length of time. A single request, such as an HDFS copyToLocal command, generates many different trace spans. Each trace span has a list of parents that lets you figure out why it was created and which larger operation it belongs to. Trace spans also have a “TracerId” that identifies the service and process they came from. Processes like the NameNode, Dat...
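
To make the span model described above more concrete, here is a minimal sketch in Python. It is not the HTrace API: the `Span` class, its field names, the `ancestry` helper, and every sample value (span IDs, descriptions, tracer IDs, timings) are hypothetical and chosen only for illustration. It models a span as a length of time with a list of parents and a tracer ID, and walks the parent links to show which larger operation a span belongs to.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Span:
    """Illustrative span record; field names are assumptions, not HTrace's."""
    span_id: str
    description: str
    tracer_id: str   # which service and process emitted the span
    begin_ms: int
    end_ms: int
    parents: List[str] = field(default_factory=list)

    @property
    def duration_ms(self) -> int:
        # A span represents a length of time.
        return self.end_ms - self.begin_ms


def ancestry(span_id: str, spans_by_id: Dict[str, Span]) -> List[str]:
    """Follow the first parent of each span up to the root.

    Returns span descriptions ordered from the root operation down to
    the requested span. Stops if a parent is missing or a cycle appears.
    """
    chain: List[str] = []
    seen = set()
    current = spans_by_id.get(span_id)
    while current is not None and current.span_id not in seen:
        seen.add(current.span_id)
        chain.append(current.description)
        parent_id = current.parents[0] if current.parents else None
        current = spans_by_id.get(parent_id) if parent_id else None
    return list(reversed(chain))


# Hypothetical spans for one copyToLocal request (made-up values).
SPANS = [
    Span("a", "copyToLocal", "FsShell/host1", 0, 120),
    Span("b", "DFSInputStream#read", "FsShell/host1", 10, 110, ["a"]),
    Span("c", "getBlockLocations", "NameNode/host2", 12, 20, ["b"]),
]
SPANS_BY_ID = {s.span_id: s for s in SPANS}

if __name__ == "__main__":
    for s in SPANS:
        path = " > ".join(ancestry(s.span_id, SPANS_BY_ID))
        print(f"{s.tracer_id:16} {s.duration_ms:4d} ms  {path}")
```

Following the parent list answers "why was this span created?", while the tracer ID answers "which process did the work?". The sketch follows only the first parent for simplicity, even though a span may list several.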