openhands.utils.chunk_localizer
Chunk localizer: finds the chunks of a file that are most relevant to a given query.
The query is typically an edit draft produced by the agent.
Chunk Objects
class Chunk(BaseModel)
line_range
(start_line, end_line), 1-indexed, inclusive
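To make the 1-indexed, inclusive convention concrete, the following minimal sketch extracts the lines a range refers to. The helper name `lines_in_range` is hypothetical and is not part of the module; it only illustrates how such a range maps onto a Python slice.

```python
def lines_in_range(text: str, line_range: tuple[int, int]) -> str:
    """Return the lines covered by a 1-indexed, inclusive (start_line, end_line) pair.

    Hypothetical helper for illustration only; not part of chunk_localizer.
    """
    start_line, end_line = line_range
    lines = text.split("\n")
    # A 1-indexed inclusive range [start, end] is the 0-indexed slice [start - 1, end).
    return "\n".join(lines[start_line - 1 : end_line])


sample = "a\nb\nc\nd"
print(lines_in_range(sample, (2, 3)))  # the second and third lines
```

Note that a single-line chunk has `start_line == end_line` under this convention.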
normalized_lcs
def normalized_lcs(chunk: str, query: str) -> float
Calculate the normalized Longest Common Subsequence (LCS) between a file chunk and the query (e.g. an edit draft).
The LCS length is divided by the length of the chunk, so the score measures how much of the chunk is covered by the query.
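The normalization described above can be sketched in pure Python. This is an illustration under stated assumptions, not the module's implementation: the name `normalized_lcs_sketch` is hypothetical, the comparison is character-level, and returning 0.0 for an empty chunk is an assumption made here to avoid division by zero.

```python
def normalized_lcs_sketch(chunk: str, query: str) -> float:
    """LCS length of chunk and query, divided by len(chunk).

    Hypothetical sketch; the real function may use a different LCS routine.
    """
    if not chunk:
        return 0.0  # assumption: an empty chunk scores 0

    # Classic dynamic-programming LCS, keeping only the previous row.
    prev = [0] * (len(query) + 1)
    for c in chunk:
        curr = [0]
        for j, q in enumerate(query, start=1):
            if c == q:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr

    return prev[-1] / len(chunk)


print(normalized_lcs_sketch("abcd", "ac"))
```

Because the denominator is the chunk length, the score is asymmetric: a short chunk fully contained in a long query scores 1.0, while a long chunk matched only in part by a short query scores low.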
get_top_k_chunk_matches
def get_top_k_chunk_matches(text: str,
                            query: str,
                            k: int = 3,
                            max_chunk_size: int = 100) -> list[Chunk]
Return the k chunks of the text that best match the query.
The query may be, for example, a string of draft code edits.
Arguments:
text - The text to search for the query.
query - The query to search for in the text.
k - The number of top chunks to return.
max_chunk_size - The maximum number of lines in a chunk.
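One plausible way to combine these parameters is sketched below: split the text into consecutive chunks of at most `max_chunk_size` lines, score each chunk against the query, and keep the `k` highest-scoring ones. The reference does not state how chunks are formed or how ties are ordered, so the fixed-size, non-overlapping chunking here is an assumption. `ChunkSketch`, `_lcs_ratio`, and `top_k_chunks_sketch` are hypothetical stand-ins, and a plain dataclass replaces the pydantic `Chunk` model to keep the example dependency-free.

```python
from dataclasses import dataclass


@dataclass
class ChunkSketch:
    """Hypothetical stand-in for Chunk; the field names are illustrative."""
    text: str
    line_range: tuple[int, int]  # (start_line, end_line), 1-indexed, inclusive
    score: float = 0.0


def _lcs_ratio(chunk: str, query: str) -> float:
    """Character-level LCS length divided by len(chunk); 0.0 for an empty chunk."""
    if not chunk:
        return 0.0
    prev = [0] * (len(query) + 1)
    for c in chunk:
        curr = [0]
        for j, q in enumerate(query, start=1):
            curr.append(prev[j - 1] + 1 if c == q else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1] / len(chunk)


def top_k_chunks_sketch(text: str, query: str, k: int = 3,
                        max_chunk_size: int = 100) -> list[ChunkSketch]:
    lines = text.split("\n")
    chunks = []
    # Assumption: consecutive, non-overlapping chunks of max_chunk_size lines.
    for start in range(0, len(lines), max_chunk_size):
        block = lines[start:start + max_chunk_size]
        body = "\n".join(block)
        chunks.append(
            ChunkSketch(body, (start + 1, start + len(block)), _lcs_ratio(body, query))
        )
    # Highest score first; the sort is stable, so ties keep file order.
    chunks.sort(key=lambda c: c.score, reverse=True)
    return chunks[:k]


doc = "alpha\nbeta\ngamma\ndelta"
best = top_k_chunks_sketch(doc, "gamma\ndelta", k=1, max_chunk_size=2)
print(best[0].line_range)
```

Each returned chunk carries its line range, so a caller can map a match back to its position in the original file.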