Image Data Submission Report
Generated on: 26 May 2026
| Project Accession: | IBIAP_1000000002 |
| Title: | Indian major basmati paddy seed varieties images dataset |
| Representative Image: | |
| Description: | The dataset contains images of 10 of the 32 Indian basmati seed varieties notified by the Government of India. The basmati paddy varieties included are 1121, 1509, 1637, 1718, 1728, BAS-370, CSR 30, Type-3/Dehraduni Basmati, PB-1 and PB-6. Images of other seeds and related food entities available in the household are also included. The dataset therefore contains 11 classes: ten classes hold images of the ten basmati paddy varieties, and the 11th class, named “Unknown”, holds images of a mixture of two morphologically similar paddy varieties (1121 and 1509), different pulses, other grains and related food entities. The Unknown class is useful for discriminating paddy seeds from other types of seeds and related food entities. All images were captured manually under standard conditions using an apparatus developed in-house and a tablet with a five-megapixel (5 MP) camera. The camera was used to capture 3210 RGB colour images in JPG format. Data pre-processing was performed to generate ready-to-use images for training and testing machine learning-based models. AI-based paddy seed variety classification models have been developed using the dataset. The dataset can be used to develop AI-based models for adulteration detection and for automated classification (together with independent devices) at the time of rice threshing; supplemented with images of additional basmati varieties, it can also be used to extend the range of varieties that can be classified. |
| Publications: | https://www.sciencedirect.com/science/article/pii/S2352340920313421 |
| Associated Codes (URL only): | N/A |
| Funding agency: | Department of Biotechnology (DBT), Government of India, India |
| Grant Number: | BT/BI/04/001/2018 and BT/BI/25/066/2012 |
| Ethics Statement: | N/A |
| Any Other Information : | The image dataset can be instrumental in the automatic or AI-assisted classification of paddy varieties of global economic importance. It can be used to develop more accurate classification models as well as other types of AI-based models, for example basmati paddy adulteration detection models and independent devices for automatic, large-scale quality checks of basmati paddy grains at rice threshing mills. At present, images have been generated for only 10 basmati paddy varieties, whereas a total of 32 notified basmati varieties are reported in the literature. New images can therefore be generated for the remaining varieties, and the existing images can be used to train and validate new AI-based models that classify other basmati paddy varieties. |
| Additional File: | N/A |
| Acknowledgments: | We acknowledge Dr. Sunil Kumar Mukherjee's help in the procurement of seed samples and useful discussions. All the authors acknowledge ICGEB for providing the necessary infrastructure and facilities for the research. We also acknowledge the financial support of the Department of Biotechnology (DBT), Government of India, through grants BT/BI/04/001/2018 and BT/BI/25/066/2012. AS acknowledges the DBT Apex Biotechnology Information Centre at the International Centre for Genetic Engineering and Biotechnology (ICGEB, India) for financial assistance. DS received a fellowship from the Council of Scientific and Industrial Research (CSIR, 09/0512(0207)/2016/EMR-1), New Delhi, India. |
| Sr.No | First name | Last name | Email | Organization | Designation |
|---|---|---|---|---|---|
| 1 | Arun | Sharma | bioinfo.arun@gmail.com | ICGEB, New Delhi, India | Postdoctoral Researcher |
| 2 | Deepshikha | Satish | deepshikha8satish@gmail.com | ICGEB, New Delhi, India | Research Scholar |
| 3 | Sushmita | Sharma | sushmita@icgeb.res.in | ICGEB, New Delhi, India | Unspecified |
| 4 | Dinesh | Gupta | dinesh@icgeb.res.in | ICGEB, New Delhi, India | Principal Investigator |
| Study Accession: | PPS_1000000002 |
| Title: | Indian major basmati paddy seed varieties images dataset |
| Imaging Type: | Plant Photography (PP) |
| Imaging Sub-type: | Not Applicable |
| Summary: | Seeds from ten major Indian basmati paddy varieties were collected from the Indian Agricultural Research Institute (IARI), New Delhi, India. A total of 46 different types of pulses, grains and other food entities were also collected in-house to capture images of items other than paddy seeds. In addition, a mixture of two morphologically similar paddy varieties (1121 and 1509) was prepared in a separate vessel; its images, together with those of the pulses, grains and other food entities, constitute an additional class. Thus, the dataset (accessible on Mendeley) comprises 11 classes (10 basmati paddy seed varieties plus one class of other grains and related food entities). An apparatus was designed to capture the images under standard conditions. A Micromax Canvas Tab P802 tablet, attached to the apparatus, was used to capture 3210 images. |
| Keywords: | Images dataset; Basmati seeds; Deep learning; Classification; Indian basmati paddy varieties |
| Additional / Any Other Information: | N/A |
| Release Date: | Aug. 12, 2024 |
| Access Licence Type: | Open Access |
| Sample Type ID | Organism | Taxon ID | Biological Entity | Laterality | Source Tissue | Source Cell/Cell-line | Cell Organelle |
|---|---|---|---|---|---|---|---|
| PPSMT_10000000002 | Oryza sativa | 4530 | Seed | Not Applicable | N/A | N/A | N/A |
| Experiment Type ID | Instrument Name | Instrument Type | Manufacturer | Model |
|---|---|---|---|---|
| PPET_10000000002 | Camera | Tablet | Micromax | Micromax Canvas Tab P802 |
| Experimental Design Summary (PPET_10000000002) |
|---|
| Seeds from ten major Indian basmati paddy varieties were collected from the Indian Agricultural Research Institute (IARI), New Delhi, India. A total of 46 different types of pulses, grains and other food entities were also collected in-house to capture images of items other than paddy seeds. In addition, a mixture of two morphologically similar paddy varieties (1121 and 1509) was prepared in a separate vessel; its images, together with those of the pulses, grains and other food entities, constitute an additional class. Thus, the dataset (accessible on Mendeley) comprises 11 classes (10 basmati paddy seed varieties plus one class of other grains and related food entities). An apparatus was designed to capture the images under standard conditions. A Micromax Canvas Tab P802 tablet, attached to the apparatus, was used to capture 3210 images. |
| Acquired Images Annotation Description (PPET_10000000002) |
|---|
| Seeds of ten Indian basmati paddy varieties were collected from the Indian Agricultural Research Institute (IARI), New Delhi. Of these, five varieties (1121, 1509, 1637, 1718 and 1728) were collected from the Seeds Production Unit, while the remaining five varieties (BAS-370, CSR 30, Type-3/Dehraduni Basmati, PB-1 and PB-6) were obtained from the Genetics Department, IARI, New Delhi, India. The IARI departments thus ensured the purity of the paddy seeds, and the images taken using these seeds were labelled accordingly. |
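
Readers who download the images may want to confirm the totals stated in this report (11 classes, 3210 JPG images). The following is a minimal sketch, assuming the images have been unpacked into one subfolder per class under a single root directory. That folder layout, the function names and the example path are hypothetical and are not described in this report.

```python
from pathlib import Path

# Extensions treated as JPG images; the report states the format is JPG.
JPG_SUFFIXES = {".jpg", ".jpeg"}


def count_images_per_class(root):
    """Return {class_folder_name: number_of_JPG_files}.

    Assumes (hypothetically) one subfolder per class directly under `root`.
    """
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue
        counts[class_dir.name] = sum(
            1
            for path in class_dir.rglob("*")
            if path.is_file() and path.suffix.lower() in JPG_SUFFIXES
        )
    return counts


def check_dataset(root, expected_classes=11, expected_total=3210):
    """Compare folder and image counts with the totals given in the report."""
    counts = count_images_per_class(root)
    total = sum(counts.values())
    return {
        "n_classes": len(counts),
        "n_images": total,
        "classes_ok": len(counts) == expected_classes,
        "total_ok": total == expected_total,
    }
```

With a hypothetical download location, `check_dataset("basmati_dataset")` returns a small dictionary; if `classes_ok` or `total_ok` is false, the per-class counts from `count_images_per_class` show which folder differs from what is expected.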