Storing a new dataset

Searching healthy and COVID-19 data for infection enhancing antibodies

Clustering healthy and COVID-19 TCR data to investigate common CDR sequence motifs

All described functions are accessible without a user account. However, registered users get additional features, such as storing private datasets and reusing previous search queries. Registration is free and simple.

Storing a new dataset

To demonstrate storing a dataset in the InterClone database, we use one of the publicly available datasets, “Wen-2020”, published by Wen et al. Because the raw data must be converted into AIRR format before InterClone can use it, we provide the prepared dataset. It consists of a zip archive containing four TSV files (one per donor) with full-length amino acid sequences and chain identifiers.
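For readers preparing their own data, the layout described above (one TSV file per donor, bundled in a zip archive) could be assembled roughly as follows. This is a minimal sketch, not InterClone's own tooling: the function `build_archive` is hypothetical, and the column names `sequence_id`, `sequence_aa`, and `locus` are borrowed from the AIRR Rearrangement schema as an assumption. The columns InterClone actually requires may differ.

```python
import csv
import io
import zipfile

# Assumed column names, taken from the AIRR Rearrangement schema.
# The columns InterClone actually expects may differ.
COLUMNS = ["sequence_id", "sequence_aa", "locus"]


def build_archive(donors, archive):
    """Write one TSV file per donor into a zip archive.

    `donors` maps a donor name to a list of row dicts keyed by COLUMNS.
    `archive` is a file path or a writable binary file-like object.
    """
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for donor, rows in donors.items():
            buffer = io.StringIO()
            writer = csv.DictWriter(
                buffer, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
            )
            writer.writeheader()
            writer.writerows(rows)
            # Each donor becomes its own member file, e.g. "donor1.tsv".
            zf.writestr(f"{donor}.tsv", buffer.getvalue())
```

Calling `build_archive({"donor1": rows_for_donor1, "donor2": rows_for_donor2}, "my-dataset.zip")` would then produce an archive with two TSV members. The donor names and the archive file name here are placeholders.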

On the InterClone web server, select the Store tool. Enter a name for the dataset (e.g. “Wen-2020”) and choose the correct Receptor Type (in this case, “BCR”) and Chain Type (in this case, “heavy”). We recommend adding tags to make the dataset easier to find later. Because this dataset comes from healthy donors, we can enter “healthy” as a tag. Then browse for the prepared zip archive and select it for upload. The filled-out form should look like this:

../_images/store-wen2020.png

After clicking on “Store dataset”, you will be redirected to the Profile page, which shows a summary of your dataset. Once the dataset has been stored in the database, its status will show as “PREPARED” and the dataset can be used for Searching and Clustering.

Please check the number of successfully processed sequences and compare it with the size of the original input. A large disparity between the two counts means that much of your input data could not be processed properly. This can happen for a number of reasons, such as unknown chain types or unusual donor species. You can contact us if you think your data is fine and should have been processed.

Note that anonymous users can only store public datasets and cannot delete them afterwards. Please consider creating a user account for advanced management of your data.
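To make the comparison between processed and submitted sequences concrete, the input size can be counted locally before checking the Profile page. The following is a minimal sketch that assumes each TSV file in the archive has a single header line followed by one sequence per row. The helper names `count_sequences` and `processed_fraction` are hypothetical and not part of InterClone.

```python
import csv
import io
import zipfile


def count_sequences(archive):
    """Return {member_name: number_of_data_rows} for every TSV in a zip archive.

    Assumes one header line per file and one sequence per following row.
    `archive` is a file path or a readable binary file-like object.
    """
    counts = {}
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if not name.lower().endswith(".tsv"):
                continue  # ignore anything that is not a TSV file
            with zf.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                reader = csv.reader(text, delimiter="\t")
                next(reader, None)  # skip the header line
                # Count non-empty rows only.
                counts[name] = sum(1 for row in reader if row)
    return counts


def processed_fraction(processed, submitted):
    """Fraction of submitted sequences that were reported as processed."""
    return processed / submitted if submitted else 0.0
```

Summing the values returned by `count_sequences` for the uploaded archive gives the size of the original input. Passing that total to `processed_fraction`, together with the processed count shown on the Profile page, gives a single number to judge the disparity by. What counts as a "large" disparity is left to the reader, since the source does not state a threshold.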