IBIA: Indian Biological Images Archive

Image Data Submission Report

Generated on: 26 May 2026


Project Accession: IBIAP_1000000002
Title: Indian major basmati paddy seed varieties images dataset
Representative Image:
Description: The dataset contains images of 10 of the 32 basmati seed varieties notified by the Government of India. The Indian basmati paddy varieties included are 1121, 1509, 1637, 1718, 1728, BAS-370, CSR 30, Type-3/Dehraduni Basmati, PB-1 and PB-6. Images of other seeds and related food items available in the household are also included. The dataset therefore contains 11 classes: ten classes, one per basmati paddy variety, and an 11th class, named “Unknown”, that contains images of a mixture of two morphologically similar paddy varieties (1121 and 1509), different pulses, other grains and related food entities. The Unknown class is useful for discriminating paddy seeds from other types of seeds and related food entities. All images were captured manually under standard conditions using an apparatus developed in-house and a tablet with a five-megapixel (5 MP) camera. In total, 3210 RGB colour images were captured in JPG format. Data pre-processing was performed to generate ready-to-use images for training and testing machine-learning models. AI-based paddy seed variety classification models have been developed using the dataset. The dataset can be used to build AI-based models for adulteration detection and for automated classification (for example, in standalone devices used at the time of rice threshing), and its classification coverage can be extended by supplementing it with images of additional basmati varieties.
Publications: https://www.sciencedirect.com/science/article/pii/S2352340920313421
Associated Codes (URL only): N/A
Funding agency: Department of Biotechnology (DBT), Government of India, India
Grant Number: BT/BI/04/001/2018 and BT/BI/25/066/2012
Ethics Statement: N/A
Any Other Information: The image dataset can be instrumental in the automatic or AI-assisted classification of paddy varieties of global economic importance. It can be used to develop more accurate classification models and other types of AI-based models, for example basmati paddy adulteration detection models and standalone devices for automatic, large-scale quality checks of basmati paddy grains at rice threshing mills. At present, images have been generated for only 10 basmati paddy varieties, whereas a total of 32 notified basmati varieties are reported in the literature. New images can therefore be generated for the remaining varieties, and the existing images can be used to train and validate new AI-based models that classify other basmati paddy varieties.
Additional File: N/A
Acknowledgments: We acknowledge Dr. Sunil Kumar Mukherjee's help in the procurement of seed samples and for useful discussions. All the authors acknowledge ICGEB for providing the necessary infrastructure and facilities for the research. We also acknowledge the financial support of the Department of Biotechnology (DBT), Government of India, through grants BT/BI/04/001/2018 and BT/BI/25/066/2012. AS acknowledges the DBT Apex Biotechnology Information Centre at the International Centre for Genetic Engineering and Biotechnology (ICGEB, India) for financial assistance. DS received a fellowship from the Council of Scientific and Industrial Research (CSIR, 09/0512(0207)/2016/EMR-1), New Delhi, India.

Sr. No. | First name | Last name | Email | Organization | Designation
1 | Arun | Sharma | bioinfo.arun@gmail.com | ICGEB, New Delhi, India | Postdoctoral Researcher
2 | Deepshikha | Satish | deepshikha8satish@gmail.com | ICGEB, New Delhi, India | Research Scholar
3 | Sushmita | Sharma | sushmita@icgeb.res.in | ICGEB, New Delhi, India | Unspecified
4 | Dinesh | Gupta | dinesh@icgeb.res.in | ICGEB, New Delhi, India | Principal Investigator

Study Accession: PPS_1000000002
Title: Indian major basmati paddy seed varieties images dataset
Imaging Type: Plant Photography (PP)
Imaging Sub-type: Not Applicable
Summary: Seeds from ten major Indian basmati paddy varieties were collected from the Indian Agricultural Research Institute (IARI), New Delhi, India. A total of 46 different types of pulses, grains and other food entities were also collected in-house to capture images of items other than paddy seeds. In addition, a mixture of two morphologically similar paddy varieties (1121 and 1509) was prepared in a separate vessel; its images, together with those of the pulses, grains and other food entities, constitute an additional class. The dataset (accessible on Mendeley) therefore comprises 11 classes: 10 basmati paddy seed varieties and one class of other grains and related food entities. An apparatus was designed to capture the images under standard conditions. A Micromax Canvas TAB P802 tablet, attached to the apparatus, was used to capture 3210 images.
Keywords: Images dataset; Basmati seeds; Deep learning; Classification; Indian basmati paddy varieties
Additional / Any Other Information: N/A
Release Date: Aug. 12, 2024
Access Licence Type: Open Access

Table 1. The sample types registered under this study are as follows:
Sample Type ID | Organism | Taxon ID | Biological Entity | Laterality | Source Tissue | Source Cell/Cell-line | Cell Organelle
PPSMT_10000000002 | Oryza sativa | 4530 | Seed | Not Applicable | N/A | N/A | N/A

The total number of samples registered under this study is: 3210

Table 3. The experiment types registered under this study are as follows:
Experiment Type ID | Instrument Name | Instrument Type | Manufacturer | Model
PPET_10000000002 | Camera | Tablet | Micromax | Micromax Canvas Tab P802


Experimental Design Summary (PPET_10000000002)
Seeds from ten major Indian basmati paddy varieties were collected from the Indian Agricultural Research Institute (IARI), New Delhi, India. A total of 46 different types of pulses, grains and other food entities were also collected in-house to capture images of items other than paddy seeds. In addition, a mixture of two morphologically similar paddy varieties (1121 and 1509) was prepared in a separate vessel; its images, together with those of the pulses, grains and other food entities, constitute an additional class. The dataset (accessible on Mendeley) therefore comprises 11 classes: 10 basmati paddy seed varieties and one class of other grains and related food entities. An apparatus was designed to capture the images under standard conditions. A Micromax Canvas TAB P802 tablet, attached to the apparatus, was used to capture 3210 images.
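
As a quick sanity check on a local copy of the dataset, the 11 classes described above can be tallied with a short script. This is only a minimal sketch: it assumes one folder per class under a root directory, and the folder names (including the shortened "Type-3") are hypothetical, since this report does not describe the on-disk layout. The function name count_images_per_class is likewise illustrative.

```python
from pathlib import Path

# The ten variety names are taken from this report; "Unknown" is the 11th class.
# ASSUMPTION: folder names are hypothetical and may differ in the actual download.
CLASSES = [
    "1121", "1509", "1637", "1718", "1728",
    "BAS-370", "CSR 30", "Type-3", "PB-1", "PB-6",
    "Unknown",
]


def count_images_per_class(root, classes=CLASSES, suffixes=(".jpg", ".jpeg")):
    """Return {class_name: number of JPG files directly inside root/class_name}."""
    root = Path(root)
    counts = {}
    for name in classes:
        class_dir = root / name
        if not class_dir.is_dir():
            # A missing folder is reported as zero images rather than raising.
            counts[name] = 0
            continue
        counts[name] = sum(
            1
            for path in class_dir.iterdir()
            if path.is_file() and path.suffix.lower() in suffixes
        )
    return counts
```

If the local copy is complete and matches the assumed layout, the per-class counts should sum to the 3210 images reported here; a different total would suggest a partial download or a different folder structure.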

Acquired Images Annotation Description (PPET_10000000002)
Seeds of ten Indian basmati paddy varieties were collected from the Indian Agricultural Research Institute (IARI), New Delhi. Of these, seeds of five varieties (1121, 1509, 1637, 1718 and 1728) were collected from the Seeds Production Unit, while seeds of the remaining five varieties (BAS-370, CSR 30, Type-3/Dehraduni Basmati, PB-1 and PB-6) were obtained from the Genetics Department, IARI, New Delhi, India. The IARI departments thus ensured the purity of the paddy seeds, and the images taken using these seeds were labelled accordingly.
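
The provenance described above can be recorded next to the labels as a simple lookup. The sketch below is illustrative only: the variety names and IARI units come from this report, while the names SOURCE_UNIT and source_unit_for are hypothetical and not part of the archive or the dataset.

```python
# Variety-to-source mapping as stated in the annotation description.
# ASSUMPTION: SOURCE_UNIT and source_unit_for are illustrative names only.
SOURCE_UNIT = {
    **dict.fromkeys(
        ["1121", "1509", "1637", "1718", "1728"],
        "Seeds Production Unit, IARI",
    ),
    **dict.fromkeys(
        ["BAS-370", "CSR 30", "Type-3/Dehraduni Basmati", "PB-1", "PB-6"],
        "Genetics Department, IARI",
    ),
}


def source_unit_for(label):
    """Return the IARI unit that supplied a variety, or None for other labels."""
    # Labels outside the ten varieties (e.g. the "Unknown" class) have no
    # single supplying unit, so None is returned for them.
    return SOURCE_UNIT.get(label)
```

For example, source_unit_for("PB-1") gives "Genetics Department, IARI", while source_unit_for("Unknown") gives None.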

The total number of experiments registered under this study is: 3210

The total number of images registered under this study is: 3210