evaluation.webarena.run_infer
initialize_runtime
def initialize_runtime(runtime: Runtime) -> dict
Initialize the runtime for the agent.
This function is called before the runtime is used to run the agent.
complete_runtime
def complete_runtime(runtime: Runtime) -> dict[str, Any]
Complete the runtime for the agent.
This function is called after the agent has finished running. If you need to do something in the sandbox to compute the correctness metric once the agent has run, modify this function.
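The ordering of the two hooks can be illustrated with a minimal sketch. Everything here other than the two function names is an assumption made for illustration: `FakeRuntime`, its `run` method, the command strings, the returned dictionary keys, and the `run_episode` driver are hypothetical stand-ins, not the real `Runtime` API or the actual evaluation loop.

```python
from typing import Any


class FakeRuntime:
    """Hypothetical stand-in for Runtime; it only records the commands it receives."""

    def __init__(self) -> None:
        self.log: list[str] = []

    def run(self, command: str) -> str:
        # Record the command so the call order can be inspected later.
        self.log.append(command)
        return f"ran: {command}"


def initialize_runtime(runtime: FakeRuntime) -> dict[str, Any]:
    # Setup hook: executes before the agent uses the runtime.
    output = runtime.run("setup")
    return {"setup_output": output}


def complete_runtime(runtime: FakeRuntime) -> dict[str, Any]:
    # Completion hook: executes after the agent has run, e.g. to gather
    # whatever the correctness metric needs from the sandbox.
    output = runtime.run("collect-metrics")
    return {"metric_output": output}


def run_episode(runtime: FakeRuntime) -> dict[str, Any]:
    # Hypothetical driver showing the assumed call order of the two hooks.
    setup = initialize_runtime(runtime)
    runtime.run("agent-step")  # placeholder for the agent's own work
    result = complete_runtime(runtime)
    return {**setup, **result}


if __name__ == "__main__":
    rt = FakeRuntime()
    print(run_episode(rt))
    print(rt.log)
```

In this sketch the recorded log makes the lifecycle explicit: the setup command comes first, the agent's work second, and the metric collection last.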