UniProt Database Guide — How to Search and Use Protein Databases
What is UniProt?
UniProt (Universal Protein Resource) is the world's most comprehensive protein sequence and annotation database. It contains over 250 million protein sequences from all domains of life. Our structure predictor identifies every protein it can look up by its UniProt accession, so understanding UniProt is essential for working with protein data.
Swiss-Prot vs TrEMBL
UniProt's protein knowledgebase (UniProtKB) has two main sections. Swiss-Prot contains roughly 570,000 entries that have been manually reviewed and annotated by expert curators, with annotations backed by literature references. TrEMBL contains over 250 million entries that are automatically annotated but not manually reviewed. Because Swiss-Prot annotations are of much higher quality, our tool searches Swiss-Prot first when you type a protein name.
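The "Swiss-Prot first" strategy can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not necessarily how our tool is implemented: the function names and the caller-supplied `run_search` helper are hypothetical, while the `rest.uniprot.org/uniprotkb/search` endpoint and the `reviewed:true` / `reviewed:false` query field come from UniProt's public REST API.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://rest.uniprot.org/uniprotkb/search"


def build_search_url(name, reviewed, size=1):
    """Build a UniProtKB search URL limited to Swiss-Prot or TrEMBL."""
    flag = "true" if reviewed else "false"  # true = Swiss-Prot, false = TrEMBL
    query = f"({name}) AND reviewed:{flag}"
    params = {"query": query, "format": "json", "size": size}
    return SEARCH_URL + "?" + urlencode(params)


def search_reviewed_first(name, run_search):
    """Query Swiss-Prot first and fall back to TrEMBL if nothing is found.

    run_search is a hypothetical caller-supplied function that takes a URL
    and returns a list of hits (for example, parsed from the JSON response).
    Returns (first_hit, reviewed_flag), or (None, None) when both are empty.
    """
    for reviewed in (True, False):
        hits = run_search(build_search_url(name, reviewed))
        if hits:
            return hits[0], reviewed
    return None, None
```

Passing the HTTP call in as `run_search` keeps the fallback logic separate from networking, so it can be exercised without contacting UniProt.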
Understanding UniProt IDs
Each UniProt entry has a unique accession number (like P69905 for human hemoglobin alpha). Accessions are 6 or 10 characters long and consist of uppercase letters and digits. The classic 6-character form is one letter, one digit, three letters or digits, then one digit. These IDs are stable: an entry keeps its accession from release to release, and when entries are merged or split the old accessions are kept as secondary accessions. This is the ID you enter in our structure predictor to look up any protein.
Entries also have "entry names" (like HBA_HUMAN). These are easier to read but can change between releases, so accession numbers are the preferred way to reference proteins.
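To check whether a string looks like an accession before sending it anywhere, a regular expression is enough. The pattern below is the one given in UniProt's help pages for accession numbers; it covers the 6-character form as well as the longer 10-character form. The helper name `is_accession` is hypothetical, and a match only means the string is well-formed, not that the entry exists.

```python
import re

# Accession pattern as documented in the UniProt help pages.
ACCESSION_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def is_accession(text):
    """Return True if text has the shape of a UniProt accession."""
    # Normalise whitespace and case so 'p69905 ' is accepted too.
    return ACCESSION_RE.fullmatch(text.strip().upper()) is not None
```

With this check, `is_accession("P69905")` is true, while an entry name such as `HBA_HUMAN` or a plain protein name such as `insulin` is rejected.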
What's in a UniProt entry?
Each entry contains the protein name, gene name, organism, amino acid sequence, function annotation, subcellular location, post-translational modifications, disease associations, known 3D structures (links to PDB), protein-protein interactions, and links to dozens of other databases. For our tools, we pull the protein name, organism, gene, sequence, and length directly from UniProt to display alongside the AlphaFold structure.
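Several of these fields can be read straight from an entry's FASTA record. The sketch below parses the documented UniProtKB FASTA header layout (`>db|accession|entry_name protein name OS=organism OX=taxon GN=gene ...`) and derives the length from the sequence. It is an illustration only: `parse_fasta` and the dictionary keys are hypothetical names, and our tools may obtain the same fields from a different format.

```python
import re

# UniProtKB FASTA header: >db|accession|entry_name protein OS=... OX=... [GN=...]
HEADER_RE = re.compile(
    r"^>(?P<db>sp|tr)\|(?P<accession>[^|]+)\|(?P<entry_name>\S+)\s+"
    r"(?P<protein>.+?)\s+OS=(?P<organism>.+?)\s+OX=(?P<taxon>\d+)"
    r"(?:\s+GN=(?P<gene>\S+))?"
)


def parse_fasta(text):
    """Parse a single-record UniProtKB FASTA string into a dict."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty FASTA record")
    match = HEADER_RE.match(lines[0])
    if match is None:
        raise ValueError("not a UniProtKB FASTA header")
    record = match.groupdict()
    # 'sp' marks a Swiss-Prot (reviewed) entry, 'tr' a TrEMBL entry.
    record["reviewed"] = record.pop("db") == "sp"
    # The sequence may be wrapped over several lines; join them.
    record["sequence"] = "".join(lines[1:])
    record["length"] = len(record["sequence"])
    return record
```

The `gene` key is `None` when the header has no `GN=` field, which is allowed by the header format.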
Searching effectively
On our tool, you can search by UniProt ID directly (fastest) or by protein name (we query UniProt's API to find the best match). For best results with name searches, include the organism: "human insulin" works better than just "insulin" since many organisms have insulin-like proteins.
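To see why adding the organism helps, here is one way a name search such as "human insulin" could be turned into a narrower UniProt query. This is a hedged sketch, not our tool's actual logic: the `COMMON_ORGANISMS` table and `build_name_query` are hypothetical, while `organism_id` is a query field of UniProt's search syntax and 9606 and 10090 are the NCBI taxonomy IDs for human and mouse.

```python
# Hypothetical lookup table: common organism names -> NCBI taxonomy IDs.
COMMON_ORGANISMS = {"human": 9606, "mouse": 10090}


def build_name_query(text):
    """Turn free text like 'human insulin' into a UniProt query string.

    If the first word is a known organism name, it is replaced by an
    organism_id filter; otherwise the text is passed through unchanged.
    """
    words = text.strip().split()
    if len(words) > 1 and words[0].lower() in COMMON_ORGANISMS:
        taxon_id = COMMON_ORGANISMS[words[0].lower()]
        name = " ".join(words[1:])
        return f"({name}) AND organism_id:{taxon_id}"
    return " ".join(words)
```

Here "human insulin" becomes `(insulin) AND organism_id:9606`, which excludes insulin entries from every other species, whereas plain "insulin" is sent as-is and can match entries from any organism.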