Staphylococcus aureus asymptomatically colonises 30% of humans and in 2017 caused 20,000 associated deaths in the USA alone.1 The public health importance of the species is also reflected in the global genome sequencing effort, as S. aureus is the 2nd most sequenced species in the NCBI database.2 Historically, multilocus sequence typing (MLST) has identified numerous sequence types, which are further grouped into clonal complexes (CCs).3 Some CCs (e.g. CC8) have spread globally to cause hospital- and community-acquired infections.4 Despite the usefulness of clonal complex analysis, the growing number of genome sequences makes its use problematic, as new investigations require the re-analysis of previously processed isolates.
In this study we developed Staphylococcus aureus Lineage Typing (SALTy), which rapidly divides the species into 60 phylogenetically well-defined lineages. Firstly, the lineages were resolved through single linkage clustering of 10,000 core genome MLST (cgMLST) profiles from a core genome consisting of 1,724 genes. Lineages differed by at least 1,017 cgMLST alleles. Secondly, alleles of three core genes were identified that can uniquely define the 60 lineages. SALTy was validated on 5,000 genomes, and 99.3% (4,966/5,000) of the isolates were assigned to the correct lineage. We compared SALTy lineages to previously calculated clonal complexes from BIGSdb (n=21,173). The major complexes CC1, CC5 and CC8 each included isolates that were separated into distantly related phylogenetic clades. SALTy improves on CCs by grouping isolates congruently with phylogenetic structure.
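The single linkage clustering step described above can be illustrated with a minimal sketch. It is not the SALTy implementation: the function names, the toy five-locus profiles and the threshold value are assumptions chosen for illustration, whereas the study clustered 10,000 profiles over 1,724 loci. Two isolates fall into the same cluster whenever a chain of pairwise allele distances below the threshold connects them.

```python
from itertools import combinations


def allele_distance(a, b):
    """Count loci where two profiles carry different alleles.

    Loci with a missing call (None) in either profile are skipped.
    """
    return sum(
        1 for x, y in zip(a, b)
        if x is not None and y is not None and x != y
    )


def single_linkage(profiles, threshold):
    """Cluster isolates by single linkage using a union-find structure.

    `profiles` maps isolate name -> list of allele numbers (hypothetical data).
    Isolates are merged when their allele distance is below `threshold`.
    """
    parent = {name: name for name in profiles}

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(profiles, 2):
        if allele_distance(profiles[a], profiles[b]) < threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in profiles:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Toy profiles (illustrative only, not real cgMLST data).
profiles = {
    "iso1": [1, 1, 1, 1, 1],
    "iso2": [1, 1, 1, 1, 2],
    "iso3": [1, 1, 1, 2, 2],
    "iso4": [7, 8, 9, 7, 8],
    "iso5": [7, 8, 9, 7, 9],
}
clusters = single_linkage(profiles, threshold=2)
print(clusters)
```

In this toy example iso1 and iso3 differ at two loci, which is not below the threshold, yet they end up in the same cluster because each is within one allele of iso2. This chaining behaviour is the defining property of single linkage clustering.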
We provide SALTy as a simple typing method that defines stable phylogenetic lineages. SALTy is highly accurate and capable of analysing large amounts of S. aureus genomic data quickly. We anticipate that SALTy will aid the characterisation of S. aureus populations and the ongoing surveillance of subgroups that pose a significant threat to human health.