I need to measure the execution time of an Apache Spark query (on Bluemix). What I tried:
```python
import time
startTimeQuery = time.clock()
df = sqlContext.sql(query)
df.show()
endTimeQuery = time.clock()
runTimeQuery = endTimeQuery - startTimeQuery
```
Is this a good approach? The time I get seems too small compared to how long it takes for the table to actually appear.
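One likely reason the measured time looks too small: `time.clock()` measured CPU time of the driver process (it is removed in Python 3.8+), so time spent waiting on the cluster is not counted. A minimal sketch of the difference, with `time.sleep` standing in for waiting on the cluster:

```python
import time

start_wall = time.time()         # wall-clock time
start_cpu = time.process_time()  # CPU time of this process, similar to the old time.clock()

time.sleep(0.5)                  # stands in for waiting on the cluster

wall = time.time() - start_wall
cpu = time.process_time() - start_cpu
print(f"wall: {wall:.2f}s, cpu: {cpu:.2f}s")  # cpu stays near zero while wall is ~0.5s
```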
In your Bluemix notebook, go to the "Palette" on the right. Select the "Environment" panel and you will see a link to the Spark history server, where you can explore Spark's completed jobs, including their execution times.
I use `System.nanoTime` wrapped in a helper function, for example:

```scala
def time[A](f: => A) = {
  val s = System.nanoTime
  val ret = f
  println("time: " + (System.nanoTime - s) / 1e6 + "ms")
  ret
}

time {
  val df = sqlContext.sql(query)
  df.show()
}
```
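Since the question uses PySpark, the same idea can be sketched in Python as a context manager (the `timed` helper and its label are illustrative; `time.perf_counter` measures wall-clock time, unlike `time.clock`):

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label):
    # perf_counter is a monotonic wall clock, so cluster wait time is included
    start = time.perf_counter()
    yield
    print(f"{label}: {(time.perf_counter() - start) * 1000:.1f} ms")

# Usage, with sqlContext and query as in the question:
# with timed("query"):
#     df = sqlContext.sql(query)
#     df.show()
```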
Spark itself provides detailed information about each stage of your Spark job.
You can view the running application at http://<master-node-ip>:4040, or enable the history server to analyze jobs after they finish.
Learn more about the history server in Spark's monitoring documentation.
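For the history server to show anything, event logging has to be enabled before the application runs. A minimal sketch of the relevant `spark-defaults.conf` entries (the HDFS path is a placeholder; both properties must point at the same directory):

```
spark.eventLog.enabled           true
spark.eventLog.dir               hdfs:///spark-logs
spark.history.fs.logDirectory    hdfs:///spark-logs
```

The history server is then started with `sbin/start-history-server.sh` and serves its UI on port 18080 by default.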