Creating a BioSample
2. Go to https://submit.ncbi.nlm.nih.gov/subs/biosample/ and sign in to NCBI with your user account.
Click on the BioSample link and choose New submission.
During this step, the submitter should provide as much information as possible about the studied organism. It is difficult to edit this information after the process is complete, so users should carefully proofread all fields before submitting. The submission process progresses through seven fillable forms (presented as tabs at the top of the page). In order, these are:
a. Submitter: Provide information about the person submitting the data and the submitting organization (typically, this will be the submitter's organizational affiliation). An email address from the submitting organization's domain is required. If desired, a shared submission group can be created, allowing multiple authors to access and contribute to the submission.
b. General info: Here the user assigns a release date for the data, which can occur immediately upon submission or be delayed until publication (or until a specified future date). In addition, the user must choose between Single BioSample or Batch/Multiple BioSamples submission. If a batch submission is selected, only samples that are part of the same project should be included.
c. Sample type: Here the user chooses among ten options giving a general description of the sample type. Researchers working with non-model invertebrates should choose Invertebrate; those working with plants, whether model or non-model, should choose Plants; and those working with canonical model animals (e.g., D. melanogaster and C. elegans) or with any non-model, non-invertebrate animal should choose Model organism or animal sample. More specialized descriptors are available for metagenomes and pathogens and should be chosen where appropriate.
d. Attributes: If Single BioSample was chosen, this page is a fillable form. If Batch/Multiple BioSamples was selected, the user is prompted to download a fillable template file. In either case, the following fields are mandatory:
i. sample_name: a short, unique descriptor of the sequenced sample;
ii. organism: the scientific name of the organism to the most specific level available (standard 'Genus species' if possible);
iii. collection_date: the date when the sample was collected, from the field or lab as appropriate (in practice, the date the organism was sacrificed);
iv. geo_loc_name: the site where the specimen was collected, in the general format Country:State:City;
v. tissue: the specific tissue from which RNA was extracted for sequencing.
If any mandatory information is missing, the user should enter 'not collected', 'not applicable' or 'missing' as appropriate. Additional fields (e.g., sex, developmental stage, age, latitude/longitude of collection site) are available but not mandatory. The user is encouraged to provide as much data as possible. If Batch/Multiple BioSamples was selected, the filled template is uploaded.
See BioSample_attributes.xlsx for a completed example.
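Because entries are hard to change after submission, it may help to check a batch template locally before uploading it. The following Python sketch is one possible way to do this, not part of the NCBI tooling. It assumes the filled template has been exported as tab-delimited text with a single header row; the function name check_attributes and the sample rows are made up for illustration. It flags empty mandatory fields and repeated sample_name values.

```python
import csv
import io

# Mandatory attribute fields named in the protocol text.
MANDATORY = ["sample_name", "organism", "collection_date", "geo_loc_name", "tissue"]


def check_attributes(tsv_text):
    """Return a list of (row_number, message) problems found in a TSV table.

    Assumes a single header row followed by one sample per row.
    """
    problems = []
    seen_names = set()
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row_number, row in enumerate(reader, start=1):
        # Every mandatory field needs a value or an explicit placeholder
        # such as 'not collected'; an empty cell is reported.
        for field in MANDATORY:
            value = (row.get(field) or "").strip()
            if not value:
                problems.append((row_number, f"{field} is empty"))
        # sample_name is described as unique, so report repeats.
        name = (row.get("sample_name") or "").strip()
        if name:
            if name in seen_names:
                problems.append((row_number, f"duplicate sample_name {name!r}"))
            seen_names.add(name)
    return problems


# Hypothetical two-sample table; the values are illustrative only.
example = (
    "sample_name\torganism\tcollection_date\tgeo_loc_name\ttissue\n"
    "s1\tDaphnia pulex\t2020-05-01\tUSA:Indiana:Bloomington\twhole body\n"
    "s1\tDaphnia pulex\t\tnot collected\twhole body\n"
)
for row_number, message in check_attributes(example):
    print(f"row {row_number}: {message}")
```

An empty cell is treated as an error rather than filled in automatically, since the choice among 'not collected', 'not applicable' and 'missing' depends on why the value is absent.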
e. Overview: Here the user can review all parts of the submission and decide whether it is ready to submit. Check carefully for any errors: once the submission is complete, changes can only be made by emailing the BioSample help desk. Once satisfied with all entries, the user should click Submit. After the request is processed, the submitter will receive a confirmation email containing the BioSample ID(s) in the format SAMNxxxxx.
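For batch submissions, the confirmation email may list many BioSample IDs, and copying them by hand invites typos. The short Python sketch below pulls the IDs out of pasted email text. It assumes that the "x" characters in SAMNxxxxx stand for decimal digits; the function name extract_biosample_ids and the example message are hypothetical.

```python
import re

# Assumption: each "x" in SAMNxxxxx is a decimal digit, so an ID is
# the prefix SAMN followed by one or more digits.
ACCESSION = re.compile(r"\bSAMN\d+\b")


def extract_biosample_ids(text):
    """Return the unique SAMN IDs in text, in order of first appearance."""
    # dict.fromkeys removes repeats while keeping the original order.
    return list(dict.fromkeys(ACCESSION.findall(text)))


# Made-up message text; real confirmation emails will be worded differently.
email_body = (
    "Your samples were registered as SAMN00000001 and SAMN00000002 "
    "(SAMN00000001 is listed twice)."
)
print(extract_biosample_ids(email_body))
```

The resulting list can then be pasted into a sample sheet or lab notebook alongside the corresponding sample_name values.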