
openhands.runtime.plugins.agent_skills.file_reader.file_readers

File reader skills for the OpenHands agent.

This module provides functions that parse and extract content from several file types: PDF, DOCX, LaTeX, audio, image, video, and PowerPoint. Each function uses a library or API suited to its format and prints the extracted content or a generated description.

Functions:

  • parse_pdf(file_path: str) -> None: Parse and print content of a PDF file.
  • parse_docx(file_path: str) -> None: Parse and print content of a DOCX file.
  • parse_latex(file_path: str) -> None: Parse and print content of a LaTeX file.
  • parse_audio(file_path: str, model: str = 'whisper-1') -> None: Transcribe and print content of an audio file.
  • parse_image(file_path: str, task: str = 'Describe this image as detail as possible.') -> None: Analyze and print description of an image file.
  • parse_video(file_path: str, task: str = 'Describe this image as detail as possible.', frame_interval: int = 30) -> None: Analyze and print description of video frames.
  • parse_pptx(file_path: str) -> None: Parse and print content of a PowerPoint file.

Notes:

Some functions (parse_audio, parse_video, parse_image) require OpenAI API credentials and are only available if the necessary environment variables are set.
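The gating described in this note can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the environment variable name `OPENAI_API_KEY` and the helper `available_readers` are assumptions made for the example.

```python
import os

# Readers that need no external credentials.
ALWAYS_AVAILABLE = ['parse_pdf', 'parse_docx', 'parse_latex', 'parse_pptx']
# Readers that, per the note above, depend on OpenAI API credentials.
NEEDS_OPENAI = ['parse_audio', 'parse_image', 'parse_video']


def available_readers(env=None):
    """Return the reader names usable with the given environment mapping.

    `OPENAI_API_KEY` is an assumed variable name, used for illustration only.
    """
    env = os.environ if env is None else env
    names = list(ALWAYS_AVAILABLE)
    if env.get('OPENAI_API_KEY'):
        names += NEEDS_OPENAI
    return names
```

Passing the environment as a mapping keeps the check easy to exercise without touching the real process environment.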

parse_pdf

def parse_pdf(file_path: str) -> None

Parses the content of a PDF file and prints it.

Arguments:

  • file_path - str: The path to the file to open.
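Like the other readers in this module, parse_pdf prints its result and returns None, so a caller that needs the text as a value has to capture standard output. A minimal sketch using the standard library's `contextlib.redirect_stdout`; `capture_printed` and `fake_reader` are hypothetical names, and `fake_reader` merely stands in for a reader such as parse_pdf.

```python
import contextlib
import io


def capture_printed(reader, *args, **kwargs):
    """Call a print-based reader and return what it wrote to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        reader(*args, **kwargs)
    return buffer.getvalue()


def fake_reader(file_path: str) -> None:
    # Hypothetical stand-in for parse_pdf and friends: prints, returns None.
    print(f'[contents of {file_path}]')


text = capture_printed(fake_reader, 'report.pdf')
```

With the real module installed, the same helper could presumably be called as `capture_printed(parse_pdf, 'report.pdf')`.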

parse_docx

def parse_docx(file_path: str) -> None

Parses the content of a DOCX file and prints it.

Arguments:

  • file_path - str: The path to the file to open.

parse_latex

def parse_latex(file_path: str) -> None

Parses the content of a LaTeX file and prints it.

Arguments:

  • file_path - str: The path to the file to open.

parse_audio

def parse_audio(file_path: str, model: str = 'whisper-1') -> None

Transcribes the content of an audio file and prints the transcription.

Arguments:

  • file_path - str: The path to the audio file to transcribe.
  • model - str: The audio model to use for transcription. Defaults to 'whisper-1'.

parse_image

def parse_image(
    file_path: str,
    task: str = 'Describe this image as detail as possible.') -> None

Parses the content of an image file and prints the description.

Arguments:

  • file_path - str: The path to the file to open.
  • task - str: The task description for the API call. Defaults to 'Describe this image as detail as possible.'.

parse_video

def parse_video(
    file_path: str,
    task: str = 'Describe this image as detail as possible.',
    frame_interval: int = 30) -> None

Parses the content of a video file and prints a description of its frames.

Arguments:

  • file_path - str: The path to the video file to open.
  • task - str: The task description for the API call. Defaults to 'Describe this image as detail as possible.'.
  • frame_interval - int: The interval between frames to analyze. Defaults to 30.
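To make the effect of frame_interval concrete, the sketch below lists which frame indices would be analyzed for a video of a given length. It assumes that sampling starts at frame 0 and takes every frame_interval-th frame after that, which this page does not spell out. `sampled_frame_indices` is a hypothetical helper, not part of the module.

```python
def sampled_frame_indices(total_frames: int, frame_interval: int = 30) -> list[int]:
    """Return the indices of the frames that would be analyzed.

    Assumes sampling starts at frame 0 and steps by `frame_interval`.
    """
    if frame_interval <= 0:
        raise ValueError('frame_interval must be a positive integer')
    return list(range(0, total_frames, frame_interval))
```

Under this assumption, a 100-frame clip with the default interval yields frames 0, 30, 60, and 90. Each analyzed frame triggers one API call, so a larger interval means fewer calls.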

parse_pptx

def parse_pptx(file_path: str) -> None

Parses the content of a PowerPoint (PPTX) file and prints it.

Arguments:

  • file_path - str: The path to the file to open.
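Since each reader handles one file type, a caller may want to pick the reader from the file extension. The sketch below shows one way to do that for the document formats. The suffix-to-reader mapping and the helper `reader_name_for` are assumptions for illustration; the module itself does not define such a dispatcher.

```python
from pathlib import Path

# Assumed mapping from file suffix to the reader documented on this page.
READER_BY_SUFFIX = {
    '.pdf': 'parse_pdf',
    '.docx': 'parse_docx',
    '.tex': 'parse_latex',
    '.pptx': 'parse_pptx',
}


def reader_name_for(file_path: str):
    """Return the name of the reader for `file_path`, or None if unknown."""
    suffix = Path(file_path).suffix.lower()
    return READER_BY_SUFFIX.get(suffix)
```

Returning the function name rather than the function keeps the sketch independent of whether the module is importable. In real use the name could be looked up on the imported module with `getattr`.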