Each task created by the Engine has a callback which will call result() on the future to get the TaskInfo for logging. This has the unfortunate side-effect of pulling the actual task result onto the client, which could incur some non-trivial data movement cost depending on the executor and app.
We might be able to avoid this by changing the TaskInfo to get passed on to child tasks, and only logged if TaskFuture.result() is called. This means that:
- A child task failing could mean we lose the TaskInfo of parent tasks.
- A parent with multiple children could have its TaskInfo logged once per child, so we'd need to check whether task IDs were already logged.
- If the app never calls result(), the TaskInfo for possibly many tasks in a chain does not get logged.
I think we can solve all of these.
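
To make the idea concrete, here is a rough sketch of the pass-along approach with de-duplication by task ID. Everything in it is hypothetical and only for illustration: `TaskResult`, `TaskLogger`, and `run_task` are made-up names, and the `TaskInfo` and `TaskFuture` shown here are simplified stand-ins, not the real classes. It only covers the duplicate-logging case; the failing-child and never-calls-result() cases are not handled.

```python
from __future__ import annotations

from concurrent.futures import Future
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass(frozen=True)
class TaskInfo:
    """Simplified stand-in for the real TaskInfo record (assumption)."""

    task_id: str
    name: str


@dataclass
class TaskResult:
    """Hypothetical envelope: a task's value plus TaskInfo for it and its ancestors."""

    value: Any
    infos: dict[str, TaskInfo] = field(default_factory=dict)


class TaskLogger:
    """Hypothetical logger that records each task ID at most once."""

    def __init__(self) -> None:
        self.logged: list[TaskInfo] = []
        self._seen: set[str] = set()

    def log(self, infos: dict[str, TaskInfo]) -> None:
        for task_id, info in infos.items():
            if task_id not in self._seen:
                self._seen.add(task_id)
                self.logged.append(info)


class TaskFuture:
    """Simplified wrapper: TaskInfo is logged only when result() is called."""

    def __init__(self, future: Future[TaskResult], logger: TaskLogger) -> None:
        self._future = future
        self._logger = logger

    def result(self, timeout: float | None = None) -> Any:
        envelope = self._future.result(timeout)
        # Logging happens here, on demand, instead of in a done-callback.
        self._logger.log(envelope.infos)
        return envelope.value


def run_task(
    task_id: str,
    name: str,
    function: Callable[..., Any],
    *parents: TaskResult,
) -> TaskResult:
    """Run function on the parents' values and carry their TaskInfo forward."""
    infos: dict[str, TaskInfo] = {}
    for parent in parents:
        infos.update(parent.infos)
    infos[task_id] = TaskInfo(task_id, name)
    value = function(*(parent.value for parent in parents))
    return TaskResult(value, infos)


# One parent with two children: the parent's TaskInfo travels with both children.
logger = TaskLogger()
parent = run_task("a", "load", lambda: 1)
children = [
    run_task("b", "increment", lambda x: x + 1, parent),
    run_task("c", "double", lambda x: x * 2, parent),
]
for child in children:
    future: Future[TaskResult] = Future()
    future.set_result(child)
    TaskFuture(future, logger).result()
```

After both children's result() calls, the logger should hold one entry each for "a", "b", and "c". The cost is that every envelope carries the TaskInfo of its whole ancestry, so long chains would grow the metadata passed between tasks.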