I'm working on a project that involves analyzing a large amount of transcriptome data. After assembling our RNA-Seq reads into contigs with Trinity, I expect to have about 10 GB of sequences in FASTA format. Because these sequences come from several hundred tissue libraries but a single species (chicken), I expect a lot of redundancy, so I'd like to cluster them and carry only one representative sequence per cluster forward into my analysis. Quite a few tools exist for this kind of clustering, and I'm wondering which you would recommend. I'll be running this on a Linux machine with 64 CPU cores and ~500 GB of RAM.
I started looking at USEARCH, but it seems I'd run into memory issues with the free 32-bit version (a 32-bit process can address at most 4 GB, well below the size of my dataset). However much I clicked around on their site, I couldn't figure out how much the 64-bit version costs or how to buy it.