Search Datasets

The search tool finds sequences whose CDRs match those of your query within specified identity thresholds. This can help locate receptors that bind the same epitope as the query, although there is always a tradeoff between sensitivity (the fraction of true matches that are found) and precision (the fraction of hits that are true matches). The default identity threshold for each CDR is set to strike a reasonable balance, but you should adjust it as needed. Note, however, that reducing the coverage threshold below 90 may yield matches with low significance.
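
To make the idea of per-CDR identity thresholds concrete, here is a minimal Python sketch. It is not the tool's actual implementation: the position-wise identity measure, the rule that CDRs of different lengths never match, the threshold values, and the example sequences are all assumptions chosen for illustration.

```python
from typing import Dict


def percent_identity(a: str, b: str) -> float:
    """Position-wise identity in percent.

    Simplifying assumption: CDRs of different lengths score 0.0.
    A real search may align sequences instead.
    """
    if not a or len(a) != len(b):
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def is_hit(query: Dict[str, str],
           candidate: Dict[str, str],
           thresholds: Dict[str, float]) -> bool:
    """A candidate is a hit only if every CDR meets its own threshold."""
    return all(
        percent_identity(query[cdr], candidate[cdr]) >= minimum
        for cdr, minimum in thresholds.items()
    )


# Hypothetical thresholds (percent), not the tool's defaults.
thresholds = {"CDR1": 80.0, "CDR2": 80.0, "CDR3": 90.0}

# Illustrative sequences only.
query = {"CDR1": "GFTFSSYA", "CDR2": "ISGSGGST", "CDR3": "ARDRGYSSGWYFDY"}
close = {"CDR1": "GFTFSSYA", "CDR2": "ISGSGGST", "CDR3": "ARDRGYSSGWYFDV"}
far = {"CDR1": "GFTFSSYA", "CDR2": "ISGSGGST", "CDR3": "ARDLGYTSGWYFDV"}

print(is_hit(query, close, thresholds))  # one CDR3 mismatch out of 14
print(is_hit(query, far, thresholds))    # three CDR3 mismatches out of 14
```

In this sketch, raising a threshold rejects more candidates (fewer false hits, but more true matches missed), while lowering it does the opposite.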

The input is a full-length variable-region amino acid sequence. The remaining input fields are identical to those of the store tool, because your query is stored and can be reused at any time.

If your query is a TCR and you do not know the full-length sequence, you can try to assemble it from the V and J gene names and the CDR3 sequence using our assembly tool.

Next, select the datasets that you want to search. To reduce load on our server, a single search is limited to at most 200 million sequences. Click “Search” and your search should start immediately. Expect to wait a few minutes for a small- to medium-sized search (~100,000 sequences).
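
If you keep your own list of dataset sizes, a quick check like the following can tell you whether a selection fits under the 200 million sequence limit before you submit. This is only a sketch: the dataset names and sizes are hypothetical, and the server applies its own check regardless.

```python
from typing import Dict, Iterable

MAX_SEQUENCES = 200_000_000  # limit stated above


def total_selected(sizes: Dict[str, int], selected: Iterable[str]) -> int:
    """Sum the sequence counts of the selected datasets."""
    return sum(sizes[name] for name in selected)


def within_limit(sizes: Dict[str, int],
                 selected: Iterable[str],
                 limit: int = MAX_SEQUENCES) -> bool:
    """True if the selection contains no more than `limit` sequences."""
    return total_selected(sizes, selected) <= limit


# Hypothetical dataset names and sizes, for illustration only.
sizes = {
    "dataset_a": 120_000_000,
    "dataset_b": 90_000_000,
    "dataset_c": 100_000,
}

print(within_limit(sizes, ["dataset_a", "dataset_c"]))  # 120,100,000 sequences
print(within_limit(sizes, ["dataset_a", "dataset_b"]))  # 210,000,000 sequences
```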

To follow a real-world use case with inputs and results, please see the tutorial.