Presto - Logging Query Stats
Logging Query Stats
While working on refactoring few DAO classes in my project, I was looking for a quick and easy way to log query stats. In Hibernate, it’s a standard feature. All you need to do is to set hibernate.generate_statistics to true.
It took a while for me to figure out how it’s done in Presto.
Presto offers a way for us to plug-in a Consumer<QueryStats> to PrestoStatement to collect QueryStats. The Presto Engine populates the Consumer object with QueryStats throughout the execution of the query.
Once the query is finished, we can iterate through the collection and log the stats. This is better than logging just the elapsed time because QueryStats gives us insight into why the particular query took ‘x’ seconds instead of simply reporting that a query took ‘x’ seconds.
private static class RecordingProgressMonitor implements Consumer < QueryStats > { | |
private final ImmutableList.Builder < QueryStats > builder = ImmutableList.builder(); | |
private boolean finished; | |
@Override | |
public synchronized void accept(QueryStats queryStats) { | |
checkState(!finished); | |
builder.add(queryStats); | |
} | |
public synchronized List < QueryStats > finish() { | |
finished = true; | |
return builder.build(); | |
} | |
} | |
// usage | |
try (Connection connection = createConnection()) { | |
try (Statement statement = connection.createStatement()) { | |
PrestoStatement prestoStatement = statement.unwrap(PrestoStatement.class); | |
RecordingProgressMonitor progressMonitor = new RecordingProgressMonitor(); | |
prestoStatement.setProgressMonitor(progressMonitor); | |
try (ResultSet rs = statement.executeQuery("<< sample query >>")) { | |
ResultSetMetaData metadata = rs.getMetaData(); | |
} | |
prestoStatement.clearProgressMonitor(); | |
// access progress monitor | |
List<QueryStats> queryStatsList = progressMonitor.finish(); | |
} | |
} | |
// results | |
// Query Stats= QueryStats[completedSplits=21,cpuTimeMillis=92,elapsedTimeMillis=44,nodes=1,peakMemoryBytes=0, | |
// processedBytes=0,processedRows=21802,queryId=20200507_054120_00481_5nch8,queued=false,queuedSplits=0, queuedTimeMillis=1, | |
// rootStage=Optional[io.prestosql.jdbc.StageStats@625732],runningSplits=0,scheduled=true,state=FINISHED, | |
// totalSplits=21,wallTimeMillis=100] | |
try (Connection newConnection = DriverManager.getConnection("jdbc:presto://localhost:5050/", "admin", ""); | |
Statement selectStatement = newConnection.createStatement();) { | |
// unwrap | |
PrestoConnection prestoConnection = newConnection.unwrap(PrestoConnection.class); | |
// override | |
prestoConnection.setSessionProperty("query_max_run_time", "1 s"); | |
String sql = "select * from tpch.sf100.orders"; | |
ResultSet resultSet = selectStatement.executeQuery(sql); | |
while (resultSet.next()) {} | |
} | |
// running this should throw an exception. | |
// java.sql.SQLException: Query failed (#20200507_055827_00492_5nch8): | |
// Query exceeded maximum time limit of 1.00s at io.prestosql.jdbc.PrestoResultSet.resultsException(PrestoResultSet.java:1894) | |
// at |
Overriding session properties
Presto requires a bunch of config files during startup. These config files control a variety of things - connector configuration, node configuration, JVM configuration, etc.
However, Presto also lets you override some of the properties at runtime (per session). These configurations can be used to tune or alter Presto’s behavior whenever required without a restart.
One of the common use cases is to limit the run time of a query. [Refer Lines 38-47]
Happy Coding!