I have several text documents and want to build a list of the words they contain. I also need to record several properties of each word:
- The index of the word within all the words in the documents (Integer)
- The word itself (String)
- The document that the word is in (Integer)
- The topic value associated with this word (Integer)
I can think of two ways of doing this. The first is simply to create a list of tuples of the form (word, doc, topic), where the word index is given by the tuple's position in the list. The second is to create a word class with the given properties as member variables, and then build a list of objects of that class.
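To make the two options concrete, here is a rough sketch of what I have in mind. It assumes Python (the idea is not tied to it), and the sample data, the `Word` class name, and its field names are made up purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical sample data: (word, document id, topic value)
raw = [("apple", 0, 2), ("banana", 0, 1), ("apple", 1, 2)]

# Option 1: a list of plain tuples.
# The word index is implicit: it is the tuple's position in the list.
words_as_tuples = list(raw)
word, doc, topic = words_as_tuples[1]  # fields are accessed by position


# Option 2: a small class whose member variables hold the properties.
@dataclass
class Word:
    index: int  # position of the word among all words in the documents
    text: str   # the word itself
    doc: int    # the document the word is in
    topic: int  # the topic value associated with the word


words_as_objects = [
    Word(i, w, d, t) for i, (w, d, t) in enumerate(raw)
]
second = words_as_objects[1]  # fields are accessed by name, e.g. second.topic
```

In the tuple version I have to remember which position holds which property, whereas in the class version each property has a name but there is a bit more code to write.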
So my question is: which is the better solution, the list of tuples or the list of word objects? And, relatedly, in what situations is each approach preferable?