\$\begingroup\$

I have written the following code, which reads a CSV file containing a list of words and their sentiment values. A word like "abandon" may have a value of -1, while words like "progress" and "freedom" have a value of +1. The CSV file acts as a database, and the user is asked for a text file containing a speech or an essay to compare against it.

After reading all the sentiment values (ranging from -1 to +1), the code builds a histogram with five categories: "Negative", "Weakly Negative", "Neutral", "Weakly Positive", and "Positive". Each matched value is counted in the category whose range contains it.
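To make the binning concrete, here is a minimal sketch of how the five categories could be counted with `np.histogram`. The input values and the function name `category_percentages` are made up for illustration; the bin edges are the ones passed to `plot.hist` in my code.

```python
import numpy as np

LABELS = ["Negative", "Weakly Negative", "Neutral", "Weakly Positive", "Positive"]
# Six edges give five bins. Each bin is half-open [lo, hi),
# except the last one, which also includes its right edge.
EDGES = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5]


def category_percentages(values):
    """Return (counts, percentages) per category for the given sentiment values."""
    counts, _ = np.histogram(values, bins=EDGES)
    percentages = 100.0 * counts / counts.sum()
    return counts, percentages


# Hypothetical sentiment values, not taken from the real lexicon.
sample = np.array([-1.0, -0.6, -0.25, 0.0, 0.25, 0.75, 1.0])
counts, percentages = category_percentages(sample)
for label, count, pct in zip(LABELS, counts, percentages):
    print(f"{label}: {count} ({pct:.1f}%)")
```

With these edges, "Neutral" covers 0.0 up to (but not including) 0.5, and "Positive" only catches values of exactly 1.0 or above, which may or may not be the intended split.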

My questions are:

  1. Can I make this code run any faster?
  2. Can I make this code any smaller?

import numpy as np
import matplotlib.pyplot as plot

lexiconSentiment = np.genfromtxt("list_of_lexicon_value.csv", delimiter = ',', dtype = [('f0', 'S24'), ('f1', '<f8')])

userInput = input("Enter the file-name: ")

textFileIntoArray = np.genfromtxt(userInput, delimiter = ' ', dtype = 'str')

booleanValues = np.in1d(lexiconSentiment['f0'], textFileIntoArray)

listOfSentimentNumberValue = []

# the size of booleanValues
for x in range(0,booleanValues.size):
    if booleanValues[x]:
        listOfSentimentNumberValue.append(lexiconSentiment['f1'][x])
        print(lexiconSentiment['f0'][x])

xLabelDescription = ["Negative", "Weakly Negative", "Neutral", "Weakly Positive", "Positive"]

plot.xlabel("Sentiment"); plot.ylabel("Percent of Words")

plot.hist(listOfSentimentNumberValue, bins = (-1.0,-0.5,0.0,0.5,1.0,1.5),  color = 'blue', range = (0.0, 0.0), normed = True)

label_pos = [-0.75,-0.25,0.25,0.75,1.25]

plot.xticks(label_pos, xLabelDescription)
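
One thing I suspect could be both shorter and faster is the explicit loop, since the boolean array can index the structured array directly. A minimal sketch under stated assumptions: the lexicon entries and the sentence below are invented stand-ins for the CSV and the user's file, the helper name `matched_sentiments` is hypothetical, and `np.isin` is used as the newer spelling of `np.in1d`.

```python
import numpy as np


def matched_sentiments(lexicon, words):
    """Return (words, values) for lexicon entries that occur in `words`."""
    mask = np.isin(lexicon["f0"], words)  # same role as booleanValues
    return lexicon["f0"][mask], lexicon["f1"][mask]


# Hypothetical lexicon with the same field names as the genfromtxt dtype.
lexicon = np.array(
    [("abandon", -1.0), ("progress", 1.0), ("freedom", 1.0), ("doubt", -0.5)],
    dtype=[("f0", "U24"), ("f1", "<f8")],
)
# Hypothetical text, already split on spaces.
words = np.array("we value freedom and progress".split())

found_words, found_values = matched_sentiments(lexicon, words)
print(found_words)
print(found_values)
```

Like the loop, this counts each lexicon word at most once, no matter how often it appears in the text.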
\$\endgroup\$
