• Before data can be used as AI training datasets, descriptive information about each dataset must be identified; this information is known as metadata. Metadata can be provided in JSON or XML format and can include the following information for each type of dataset.
✔ Image metadata: Date, location, exposure, etc.
✔ Text metadata: Title, text length, creation date, etc.
✔ Audio metadata: Date, length, recorder, speaker, number of speakers, etc.
• As the above examples suggest, metadata and training data must be kept separate, and a specification document must be prepared for each so that developers can easily use them when training AI models.
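One way to picture this separation is the minimal Python sketch below, which stores JSON metadata in its own directory, apart from the data files it describes. The directory layout, file names, field keys, and sample values are all hypothetical and chosen only for illustration; the field names loosely follow the image, text, and audio examples listed above.

```python
import json
from pathlib import Path


def build_metadata() -> dict:
    """Return illustrative metadata records keyed by data file name.

    All keys and values here are assumptions, not a prescribed schema.
    """
    return {
        "image_0001.jpg": {
            "type": "image",
            "date": "2023-01-15",
            "location": "example location",
            "exposure": "1/125",
        },
        "text_0001.txt": {
            "type": "text",
            "title": "Example title",
            "text_length": 1024,
            "creation_date": "2023-01-16",
        },
        "audio_0001.wav": {
            "type": "audio",
            "date": "2023-01-17",
            "length_seconds": 42.5,
            "recorder": "example recorder",
            "speakers": ["speaker_a", "speaker_b"],
            "number_of_speakers": 2,
        },
    }


def write_dataset(root: Path, metadata: dict) -> Path:
    """Write placeholder data files and a separate metadata file under root."""
    data_dir = root / "data"          # training data lives here
    meta_dir = root / "metadata"      # metadata is kept apart from the data
    data_dir.mkdir(parents=True, exist_ok=True)
    meta_dir.mkdir(parents=True, exist_ok=True)

    # Empty placeholder files stand in for the real training data.
    for file_name in metadata:
        (data_dir / file_name).write_bytes(b"")

    meta_path = meta_dir / "metadata.json"
    meta_path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return meta_path


def load_metadata(meta_path: Path) -> dict:
    """Read the metadata file back without touching the data files."""
    return json.loads(meta_path.read_text(encoding="utf-8"))
```

With this layout, a developer can call `load_metadata` to inspect or filter records (for example, by `type`) before loading any data file. The same records could equally be serialized as XML; JSON is used here only for brevity.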