
evaluation.benchmarks.commit0_bench.run_infer

initialize_runtime

def initialize_runtime(runtime: Runtime, instance: pd.Series)

Initialize the runtime for the agent.

This function is called before the runtime is used to run the agent.

complete_runtime

def complete_runtime(runtime: Runtime, instance: pd.Series) -> dict[str, Any]

Complete the runtime for the agent.

This function is called after the agent has finished running in the runtime. If you need to do something in the sandbox to get the correctness metric after the agent has run, modify this function.

commit0_setup

def commit0_setup(dataset: pd.DataFrame, repo_split: str) -> pd.DataFrame

Set up the Commit0 dataset based on the split type.

Arguments:

  • dataset - Full Commit0 dataset
  • repo_split - Split type ('all', 'lite', or a specific repo name)

Returns:

Filtered dataset based on split type
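
The filtering that commit0_setup describes can be illustrated with a minimal sketch. This is not the benchmark's actual implementation: the function name filter_by_split_sketch, the HYPOTHETICAL_SPLITS table, the contents of the 'lite' split, and the assumption that the dataset has a "repo" column holding "owner/name" strings are all made up for illustration.

```python
import pandas as pd

# Hypothetical split table. The real membership of the 'lite' split is not
# listed on this page, so these repo names are placeholders.
HYPOTHETICAL_SPLITS = {
    "lite": ["tinydb", "simpy"],
}


def filter_by_split_sketch(dataset: pd.DataFrame, repo_split: str) -> pd.DataFrame:
    """Return the rows of `dataset` that belong to `repo_split`.

    Assumes (not confirmed by the source) a "repo" column of "owner/name" strings.
    """
    if repo_split == "all":
        # 'all' keeps every row.
        return dataset.reset_index(drop=True)

    # A named split maps to a list of repos; anything else is treated
    # as the name of a single repo.
    wanted = HYPOTHETICAL_SPLITS.get(repo_split, [repo_split])

    # Compare on the short repo name, i.e. the part after the last "/".
    short_names = dataset["repo"].str.split("/").str[-1]
    return dataset[short_names.isin(wanted)].reset_index(drop=True)


# Small demo frame with made-up rows.
demo = pd.DataFrame(
    {
        "repo": ["org/tinydb", "org/simpy", "org/other"],
        "instance_id": ["a", "b", "c"],
    }
)
lite_rows = filter_by_split_sketch(demo, "lite")
single_repo_rows = filter_by_split_sketch(demo, "other")
```

In this sketch, a split name that is not in the table falls through to a single-repo filter, which mirrors the "specific repo name" case in the argument description.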