"The current pan proteome sequences are derived from the reference proteome
clusters (75% proteome similarity for Fungi and 55% proteome similarity for
Archaea and Bacteria). A reference proteome cluster is also known as a
representative proteome group (RPG) (Chen et al., 2011). An RPG contains similar
proteomes, with similarity calculated from their co-membership in UniRef50 clusters. For each
non-singleton reference proteome cluster, a pan proteome is a set of sequences
consisting of all the sequences in the reference proteome plus the
unique protein sequences found in other species or strains of the
cluster but not in the reference proteome. These additional sequences are
identified using UniRef50 membership."xsd:string
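
The construction described above can be sketched in a few lines. This is only an illustrative sketch, not the actual production pipeline: the function name `build_pan_proteome`, the accession and cluster identifiers, and the input layout are all hypothetical. It also assumes that "unique" means one additional sequence is kept per UniRef50 cluster that the reference proteome does not already cover, and that a sequence with no cluster entry is treated as its own cluster.

```python
from typing import Dict, List


def build_pan_proteome(
    reference: List[str],
    others: List[List[str]],
    uniref50: Dict[str, str],
) -> List[str]:
    """Return reference accessions plus sequences from other proteomes
    whose UniRef50 cluster is not yet represented (assumed interpretation)."""
    # Start with every sequence of the reference proteome.
    pan = list(reference)
    # Clusters already covered by the reference proteome.
    # Assumption: an accession without a cluster entry is its own cluster.
    covered = {uniref50.get(acc, acc) for acc in reference}

    # Walk the other proteomes of the cluster in the order given.
    for proteome in others:
        for acc in proteome:
            cluster = uniref50.get(acc, acc)
            if cluster not in covered:
                # First sequence seen for a cluster absent from the reference.
                covered.add(cluster)
                pan.append(acc)
    return pan


# Hypothetical toy data: accessions and cluster ids are made up.
reference = ["P1", "P2"]
others = [["Q1", "Q2"], ["R1"]]
uniref50 = {"P1": "U_A", "P2": "U_B", "Q1": "U_A", "Q2": "U_C", "R1": "U_C"}

print(build_pan_proteome(reference, others, uniref50))
```

In this toy input, `Q1` shares a cluster with `P1` and is skipped, `Q2` brings in a new cluster and is added, and `R1` is skipped because its cluster was already covered by `Q2`.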