APIGen

Automated pipeline for generating high-quality function call datasets

  • Multi-stage data validation ensures data quality
  • Standardized JSON format improves data structure and verifiability
  • Supports large-scale data collection from multiple API sources
  • Verifies the correctness of function calls through actual API execution
  • Promotes data diversity across query styles, sampling, and APIs
  • Uses real-world APIs to keep the dataset practical and high quality

Product Details

APIGen is an automated data generation pipeline designed to produce verifiable, high-quality datasets for function-calling applications. The pipeline ensures the reliability and correctness of its data through a three-stage validation process: format checking, actual function execution, and semantic validation. It generates diverse datasets in a scalable, structured manner and verifies each generated function call by actually executing it against the API, which is crucial for improving the performance of function-calling agent models.
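
The three validation stages described above can be illustrated with a minimal Python sketch. This is not APIGen's actual code. The JSON layout (a list of objects with "name" and "arguments"), the `get_weather` function, the `API_REGISTRY` table, and every helper name are assumptions made up for illustration. The semantic stage in particular is a naive stand-in; a real pipeline would likely use a much stronger checker.

```python
import json
from typing import Any, Callable, Dict, List, Optional


def get_weather(city: str, unit: str = "celsius") -> Dict[str, Any]:
    """Hypothetical stand-in for a real API endpoint."""
    return {"city": city, "unit": unit, "temperature": 21}


# Hypothetical registry mapping function names to executable APIs.
API_REGISTRY: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def check_format(raw: str) -> Optional[List[Dict[str, Any]]]:
    """Stage 1: output must be a non-empty JSON list of {"name", "arguments"}."""
    try:
        calls = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(calls, list) or not calls:
        return None
    for call in calls:
        if not isinstance(call, dict):
            return None
        if not isinstance(call.get("name"), str):
            return None
        if not isinstance(call.get("arguments"), dict):
            return None
    return calls


def execute_calls(calls: List[Dict[str, Any]]) -> Optional[List[Any]]:
    """Stage 2: actually run every call; any failure rejects the sample."""
    results = []
    for call in calls:
        func = API_REGISTRY.get(call["name"])
        if func is None:
            return None  # unknown function name
        try:
            results.append(func(**call["arguments"]))
        except Exception:
            return None  # wrong arguments or runtime error
    return results


def check_semantics(query: str, calls: List[Dict[str, Any]]) -> bool:
    """Stage 3 (naive placeholder): each string argument must appear in the query."""
    query_lower = query.lower()
    for call in calls:
        for value in call["arguments"].values():
            if isinstance(value, str) and value.lower() not in query_lower:
                return False
    return True


def validate(query: str, raw: str) -> str:
    """Run the three stages in order and report where a sample is rejected."""
    calls = check_format(raw)
    if calls is None:
        return "rejected: format"
    if execute_calls(calls) is None:
        return "rejected: execution"
    if not check_semantics(query, calls):
        return "rejected: semantics"
    return "accepted"
```

Running the stages in this order means cheap checks filter out malformed samples before any API is called, and only samples that execute successfully reach the semantic check.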