Most entity systems don't use Data Oriented Design. Note that entity system doesn't necessarily imply DOD, it just means splitting gameplay functionality into different component classes giving entities certain features and behavior. You can (very easily) create a bloated and horribly slow Object-Oriented version of an entity system. So entity systems are just a design concept. DOD is a completely orthogonal code design concept that can however be applied very easily to entity systems.
So on to your questions:
Doesn't this harm performance, as there would be cache misses (since new can allocate data anywhere from the heap)?
Yes. Well, maybe. It all depends on access patterns. But yeah, if you want to update all Position components in one go, well designed and aligned data can be far more efficient. Still, know your data and know how you use your data te be sure.
How would you implement a system in such a way that the components of each type are allocated contiguously in memory, so you avoid cache misses?
There are multiple ways to do that using preallocated memory and such, you could use multiple smaller pools for finegrained memory control or if you know how much a level is going to use, you can just allocate that much memory etcetera etcetera. This is all described in the answers the others gave.
That could be a solution, but I think you're treating a symptom, not the real problem: You want to use classes. Don't do that, if possible. It's bad for a number of reasons, among which is cache locality. There is no reason for either the entity or the components to be classes. Now I've talked about this at more length in my answer here: When/where to update components I'll give you a very quick rundown though, or just read the long answer there and skip the next two paragraphs
All the entity is is an identifier that binds together a number of components. The entity could just be a unique integer. It can also be a class with a set of pointers pointing to all component data if you need that. Still, I don't see the necessity for that. If an entity is a unique integer, you could just as easily write a helper function to get a component given an entity id.
So now to get rid of the component classes themselves. There's no actual need for that either. What you need is two things: some data stored somewhere for each component (component data) and some way to update that data (component behavior). What a component class does is combine those two, but there are other ways. For instance, creating a component system where there are managers for each component type. All component data is stored in arrays in the manager, accessed using the entity id (which is just an integer as I said above, more about that in my answer here). The behavior is one simple update() function in the manager class which updates all components it manages in one go, fully utilizing the cache locality and optimization benefits a single update loop provides.
The same thing could be done using component classes which are aligned in memory, but there are still a lot of issues with that approach, most important of which are memory/cache related. What happens when you delete a component from memory? You can't just move one of the other components into the empty space, since that would mess up all pointers pointing to that component. How big is your component class data? It may very well be about the size of one cache line (64 bytes in modern x86). If that's the case, your system will hardly be any faster than one where data is all over memory.
The biggest issue is how you use your data. How much component data do you need each update? Say you have a component with a position (vec3), velocity (vec3) and two other 4 byte variables. That's 32 bytes of data per component (assuming you don't use any virtual function crap). Now when you update your component, you may need all variables and you need to load all data. That's fine. Cache line size is 64 and your components are all packed in memory, so when you load one component, you automatically load the next one into cache. Great. Now what happens when you only need to access the position for another unrelated update loop. You load 64 bytes of data, but it's only got 2 vec3's you need. You've just wasted 40 loaded bytes of data. This is exactly what DOD is about avoiding.
If on the other hand, you had a component manager class which stores component data in number of arrays, one for positions, one for velocities and two for the other data, your cache will be utilized fully in both situations. So if you only need velocities, when it loads one, it will load 64 bytes around it into cache. That's more than five velocity vectors. When you need all data, it will do the same for the data in the other arrays.
So to summarize, I'll quote a bit from another answer I gave here
So the goal of DOD isn't using linear arrays, it's maximizing cache utilization by placing data that is accessed at (about) the same time adjacent in memory. If you nearly always access 3 variables at the same time, you don't need to create 3 arrays for each variable, you could just as easily create a struct which contains only those 3 values and create an array of these structs. The best solution always depends on the way you use the data. If your access patterns are linear, but you don't always use all variables, go for separate linear arrays. If your access patterns are more irregular but you always use all variables at the same time, put them in a struct together and create an array of those structs.
Hope this helps! Let me know if anything is unclear or if I missed something!