
APIGen
Automated pipeline for generating high-quality function call datasets
- Ensures data quality through a multi-stage validation process
- Standardizes data in a JSON format, which improves structure and verifiability
- Supports large-scale data collection from multiple API sources
- Verifies the correctness of function calls through actual API execution
- Promotes data diversity across query styles, sampling, and APIs
- Uses real-world APIs to keep the dataset practical and high quality
Product Details
APIGen is an automated data generation pipeline designed to produce verifiable, high-quality datasets for function-calling applications. The pipeline ensures the reliability and correctness of its data through a three-stage validation process: format checking, actual function execution, and semantic validation. APIGen generates diverse datasets in a scalable, structured manner and verifies each generated function call by actually executing it against the API, which is crucial for improving the performance of function-calling agent models.
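The three validation stages described above can be sketched roughly as follows. This is a minimal illustration, not APIGen's actual code: the sample field names (`query`, `answers`, `name`, `arguments`), the `API_REGISTRY` with its stand-in `get_weather` function, and the keyword-based semantic check are all assumptions made for the example. In practice, the semantic stage would need a far stronger checker than the toy heuristic used here.

```python
import json
from typing import Any, Callable, Dict, List, Tuple


def get_weather(city: str) -> Dict[str, Any]:
    """Hypothetical stand-in for a real API call."""
    return {"city": city, "temp_c": 21}


# Hypothetical registry mapping function names to executable APIs.
API_REGISTRY: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def check_format(raw: str) -> Tuple[bool, Any]:
    """Stage 1: the sample must be valid JSON with the assumed schema."""
    try:
        sample = json.loads(raw)
    except json.JSONDecodeError:
        return False, "not valid JSON"
    if not isinstance(sample, dict) or not isinstance(sample.get("query"), str):
        return False, "missing query"
    calls = sample.get("answers")
    if not isinstance(calls, list) or not calls:
        return False, "missing answers"
    for call in calls:
        if not isinstance(call, dict):
            return False, "call must be an object"
        name = call.get("name")
        if not isinstance(name, str) or name not in API_REGISTRY:
            return False, "unknown function"
        if not isinstance(call.get("arguments"), dict):
            return False, "arguments must be an object"
    return True, sample


def execute_calls(calls: List[Dict[str, Any]]) -> Tuple[bool, Any]:
    """Stage 2: actually run every call; any exception rejects the sample."""
    results = []
    for call in calls:
        try:
            results.append(API_REGISTRY[call["name"]](**call["arguments"]))
        except Exception as exc:  # e.g. TypeError for wrong argument names
            return False, f"{call['name']} failed: {exc}"
    return True, results


def keyword_semantic_check(query: str, calls: List[Dict[str, Any]],
                           results: List[Any]) -> bool:
    """Stage 3 (toy heuristic): every string argument must appear in the query."""
    lowered = query.lower()
    return all(
        value.lower() in lowered
        for call in calls
        for value in call["arguments"].values()
        if isinstance(value, str)
    )


def validate(raw: str,
             semantic_check: Callable[..., bool] = keyword_semantic_check
             ) -> Tuple[bool, str]:
    """Run the three stages in order and report where a sample was rejected."""
    ok, sample = check_format(raw)
    if not ok:
        return False, "format"
    ok, results = execute_calls(sample["answers"])
    if not ok:
        return False, "execution"
    if not semantic_check(sample["query"], sample["answers"], results):
        return False, "semantic"
    return True, "passed"
```

Calling `validate` on a JSON string returns a pair: whether the sample passed, and the name of the stage that rejected it (or `"passed"`). Ordering the stages from cheapest to most expensive means malformed samples are discarded before any API is called.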