openhands.utils.chunk_localizer

Utilities for locating the chunks of a file that are most relevant to a query.

The primary use is finding the chunks of a file that best match a given query (e.g. an edit draft produced by the agent).

Chunk Objects

class Chunk(BaseModel)

line_range

(start_line, end_line), 1-indexed, inclusive.
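Because the range is 1-indexed and inclusive, it does not map directly onto a Python slice. The helper below is a small illustrative sketch of the conversion; `lines_in_range` is a hypothetical name and is not part of this module.

```python
def lines_in_range(text: str, line_range: tuple[int, int]) -> list[str]:
    """Return the lines covered by a 1-indexed, inclusive (start_line, end_line) pair.

    Hypothetical helper for illustration only; not part of chunk_localizer.
    """
    start_line, end_line = line_range
    # Shift the start down by one for 0-based indexing; the inclusive end
    # already equals the exclusive slice bound.
    return text.split("\n")[start_line - 1:end_line]


print(lines_in_range("a\nb\nc\nd", (2, 3)))
```

With this convention, a range of (1, 1) selects exactly the first line.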

normalized_lcs

def normalized_lcs(chunk: str, query: str) -> float

Calculate the normalized Longest Common Subsequence (LCS) between a file chunk and the query (e.g. an edit draft).

The LCS length is divided by the length of the chunk, so the score measures how much of the chunk is covered by the query.
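To make the normalization concrete, here is a minimal pure-Python sketch. It assumes the comparison is done character by character and that an empty chunk scores 0.0; neither detail is stated above, and the names `lcs_length` and `normalized_lcs_sketch` are hypothetical. The library's own implementation may differ.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the best subsequence that ends before both characters.
                curr.append(prev[j - 1] + 1)
            else:
                # Skip one character from either string, keep the better result.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def normalized_lcs_sketch(chunk: str, query: str) -> float:
    """LCS length divided by the chunk length (assumed character-level)."""
    if len(chunk) == 0:
        return 0.0  # assumption: avoid division by zero for empty chunks
    return lcs_length(chunk, query) / len(chunk)


print(normalized_lcs_sketch("abcd", "ab"))
```

Since the LCS can never be longer than the chunk, a score computed this way lies between 0 and 1, and it reaches 1 only when every character of the chunk appears, in order, in the query.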

get_top_k_chunk_matches

def get_top_k_chunk_matches(text: str,
                            query: str,
                            k: int = 3,
                            max_chunk_size: int = 100) -> list[Chunk]

Get the top k chunks in the text that match the query.

The query could be a string of draft code edits.

Arguments:

text - The text in which to search.
query - The query to match against the text.
k - The number of top chunks to return.
max_chunk_size - The maximum number of lines in a chunk.
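The arguments above suggest a split, score, and rank procedure, which could be sketched as follows. This is an illustration under stated assumptions, not the module's implementation: it assumes the text is cut into consecutive, non-overlapping blocks of at most `max_chunk_size` lines and that each block is scored with a character-level LCS normalized by chunk length. `ChunkSketch`, its `text` and `score` fields, and `top_k_chunks_sketch` are hypothetical names; a plain dataclass stands in for the `BaseModel`-based `Chunk`.

```python
from dataclasses import dataclass


@dataclass
class ChunkSketch:
    """Hypothetical stand-in for Chunk; only line_range is documented above."""
    text: str
    line_range: tuple[int, int]  # (start_line, end_line), 1-indexed, inclusive
    score: float = 0.0


def _lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def top_k_chunks_sketch(text: str, query: str, k: int = 3,
                        max_chunk_size: int = 100) -> list[ChunkSketch]:
    lines = text.split("\n")
    chunks = []
    # Assumption: consecutive, non-overlapping blocks of max_chunk_size lines.
    for start in range(0, len(lines), max_chunk_size):
        block = lines[start:start + max_chunk_size]
        chunk_text = "\n".join(block)
        score = _lcs_length(chunk_text, query) / len(chunk_text) if chunk_text else 0.0
        chunks.append(ChunkSketch(chunk_text, (start + 1, start + len(block)), score))
    # Highest score first; return at most k chunks.
    chunks.sort(key=lambda c: c.score, reverse=True)
    return chunks[:k]


best = top_k_chunks_sketch("alpha\nbeta\ngamma\ndelta", "gamma\ndelta",
                           k=1, max_chunk_size=2)
print(best[0].line_range, best[0].score)
```

In this sketch a smaller `max_chunk_size` gives finer-grained line ranges at the cost of scoring more chunks, and the function returns fewer than `k` results when the text splits into fewer than `k` chunks.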
