openhands.runtime.plugins.agent_skills.file_reader.file_readers
File reader skills for the OpenHands agent.
This module provides functions to parse and extract content from different file types, including PDF, DOCX, LaTeX, audio, image, video, and PowerPoint files. It uses several libraries and APIs to process these files. Each function prints the extracted content or description and returns None.
Functions:
parse_pdf(file_path: str) -> None: Parse and print content of a PDF file.
parse_docx(file_path: str) -> None: Parse and print content of a DOCX file.
parse_latex(file_path: str) -> None: Parse and print content of a LaTeX file.
parse_audio(file_path: str, model: str = 'whisper-1') -> None: Transcribe and print content of an audio file.
parse_image(file_path: str, task: str = 'Describe this image as detail as possible.') -> None: Analyze and print description of an image file.
parse_video(file_path: str, task: str = 'Describe this image as detail as possible.', frame_interval: int = 30) -> None: Analyze and print description of video frames.
parse_pptx(file_path: str) -> None: Parse and print content of a PowerPoint file.
Notes:
Some functions (parse_audio, parse_video, parse_image) require OpenAI API credentials and are only available if the necessary environment variables are set.
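Since the OpenAI-backed readers depend on environment variables, it can help to check for them before calling those functions. The following is a minimal sketch of such a check. The helper `missing_credentials` is hypothetical, and the variable name `OPENAI_API_KEY` is an assumption: this reference does not state which variables the module actually reads.

```python
import os
from typing import List, Mapping, Optional, Sequence

# Assumption: the exact variable names are not given in this reference.
# OPENAI_API_KEY is used here only as a plausible placeholder.
ASSUMED_REQUIRED_VARS = ("OPENAI_API_KEY",)


def missing_credentials(
    required: Sequence[str] = ASSUMED_REQUIRED_VARS,
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print(f"parse_audio, parse_image, parse_video may be unavailable; unset: {missing}")
    else:
        print("Credentials found; the OpenAI-backed readers may be available.")
```

Passing `env` explicitly keeps the check easy to test without modifying the real process environment.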
parse_pdf
def parse_pdf(file_path: str) -> None
Parses the content of a PDF file and prints it.
Arguments:
file_path
- str: The path to the file to open.
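Because parse_pdf, like the other readers in this module, prints its result and returns None, a caller that wants the text as a string has to capture standard output. The sketch below shows one way to do that with the standard library. `capture_printed` and `fake_parse_pdf` are hypothetical names; the stand-in only mimics the print-and-return-None behaviour and does not reproduce the real output format.

```python
import contextlib
import io
from typing import Any, Callable


def capture_printed(func: Callable[..., Any], *args: Any, **kwargs: Any) -> str:
    """Call `func` and return everything it printed to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue()


def fake_parse_pdf(file_path: str) -> None:
    # Hypothetical stand-in for parse_pdf: prints, returns None.
    print(f"contents of {file_path}")


text = capture_printed(fake_parse_pdf, "report.pdf")
```

With the real module importable, the same helper could presumably be used as `capture_printed(parse_pdf, "report.pdf")`.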
parse_docx
def parse_docx(file_path: str) -> None
Parses the content of a DOCX file and prints it.
Arguments:
file_path
- str: The path to the file to open.
parse_latex
def parse_latex(file_path: str) -> None
Parses the content of a LaTeX file and prints it.
Arguments:
file_path
- str: The path to the file to open.
parse_audio
def parse_audio(file_path: str, model: str = 'whisper-1') -> None
Transcribes the content of an audio file and prints the transcript.
Arguments:
file_path
- str: The path to the audio file to transcribe.
model
- str: The audio model to use for transcription. Defaults to 'whisper-1'.
parse_image
def parse_image(
file_path: str,
task: str = 'Describe this image as detail as possible.') -> None
Parses the content of an image file and prints the description.
Arguments:
file_path
- str: The path to the file to open.
task
- str: The task description for the API call. Defaults to 'Describe this image as detail as possible.'.
parse_video
def parse_video(file_path: str,
task: str = 'Describe this image as detail as possible.',
frame_interval: int = 30) -> None
Parses the content of a video file and prints a description of the analyzed frames.
Arguments:
file_path
- str: The path to the video file to open.
task
- str: The task description for the API call. Defaults to 'Describe this image as detail as possible.'.
frame_interval
- int: The interval between frames to analyze. Defaults to 30.
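To get a feel for how frame_interval affects the number of API calls, the sketch below lists which frame indices would be analyzed. It assumes that sampling starts at frame 0 and takes every frame_interval-th frame; this reference does not confirm the exact sampling rule, and `sampled_frame_indices` is a hypothetical helper, not part of the module.

```python
from typing import List


def sampled_frame_indices(total_frames: int, frame_interval: int = 30) -> List[int]:
    """Indices of frames analyzed, assuming every `frame_interval`-th frame from 0."""
    if frame_interval <= 0:
        raise ValueError("frame_interval must be a positive integer")
    return list(range(0, total_frames, frame_interval))


# A 100-frame clip with the default interval yields four sampled frames.
indices = sampled_frame_indices(100)  # [0, 30, 60, 90]
```

Under this assumption, a larger frame_interval means fewer frames are described, which reduces cost at the expense of temporal detail.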
parse_pptx
def parse_pptx(file_path: str) -> None
Parses the content of a PowerPoint (.pptx) file and prints it.
Arguments:
file_path
- str: The path to the file to open.
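When a caller handles mixed file types, one option is to choose the reader from the file extension. The sketch below maps extensions to the names of the four readers that need no API credentials. The mapping and the helper `pick_parser_name` are illustrative assumptions, not part of the module.

```python
from pathlib import Path
from typing import Optional

# Hypothetical mapping from file extension to the reader documented above.
PARSER_BY_EXTENSION = {
    ".pdf": "parse_pdf",
    ".docx": "parse_docx",
    ".tex": "parse_latex",
    ".pptx": "parse_pptx",
}


def pick_parser_name(file_path: str) -> Optional[str]:
    """Return the reader name for `file_path`, or None if the extension is unknown."""
    return PARSER_BY_EXTENSION.get(Path(file_path).suffix.lower())


name = pick_parser_name("slides/intro.PPTX")  # "parse_pptx"
```

Returning the function name rather than the function itself keeps this sketch runnable without the module installed; with it installed, the name could be resolved with `getattr` on the imported module.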