As @IgnacioVazquez-Abrams said, a 16u4 is virtually identical to a 32u4 apart from having half the flash and RAM, so you can use a Leonardo or Micro core with a tiny modification to boards.txt. A 16u2 is a rather different animal, with fewer timers, far fewer pins, and no A/D converter, so you’d have to do a fair amount of editing to obtain a usable Arduino core.
That said, if you’re reasonably fluent in C++, building your own Arduino core is not very difficult, and I found it an instructive exercise. If you care about getting your project up and running as quickly as possible without getting sidetracked, however, go with the 16u4.
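For the 16u4 route, the boards.txt change could look roughly like the fragment below. This is only a sketch: the board ID `micro16u4` and its display name are made up for illustration, and the flash figure assumes a 4 KB bootloader like the Leonardo's. Copy the entire `leonardo.*` block from the AVR core's boards.txt, rename the prefix, and adjust these lines:

```
# Hypothetical board entry, copied from the leonardo.* block with a new prefix.
micro16u4.name=ATmega16U4 (Leonardo core)

# Tell the compiler which chip to target.
micro16u4.build.mcu=atmega16u4

# 16 KB flash minus an assumed 4 KB bootloader.
micro16u4.upload.maximum_size=12288

# The 16u4 has 1.25 KB of SRAM.
micro16u4.upload.maximum_data_size=1280
```

The stock Leonardo bootloader image is built for the 32u4, so you would probably need to rebuild it for the 16u4 or upload sketches with an ISP programmer instead.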