All Questions
14 questions
4 votes, 1 answer, 87 views
Count words in a list of titles, with some cleanup
I have a list of article titles, and I wish to count the number of occurrences of each word (and remove some words and characters). The input is in a .csv file where the titles are in column '...
8 votes, 2 answers, 10k views
String Similarity using fuzzywuzzy on big data
I have a file in which I want to check the string similarity among the names in a particular column. I use fuzzywuzzy's token sort ratio algorithm, as it is required for my use case. Here is the code, is ...
4 votes, 2 answers, 2k views
Compare two CSV files In Ruby
I wrote my first program in Ruby that compares two CSV files, but I'm sure there are more efficient ways to do it. I tried using the Ruby CSV library at first, but it was unproductive. Please let me ...
0 votes, 1 answer, 86 views
Replacing address details in colon delimited string
I'm sanitising personal data in a large database with a series of Regular Expressions. In many cases the sensitive data is in a colon-delimited field within a CSV file.
Here is a sample:
...
2 votes, 1 answer, 102 views
Splitting apart a comma separated list
I have a system that outputs a comma separated string into a single column. I need to grab the individual values out of that list. The list could have random spaces and commas that need to be ...
1 vote, 2 answers, 12k views
Replacing data in a .csv file
The code works well and does what I intend it to do. In essence, it opens a file referenced as 'resource'. This is a .csv file. It then searches for the keys in the dictionary and for each key that ...
4 votes, 3 answers, 3k views
Parsing a single CSV line into a list of strings
I've written this method to replace an older method that was much simpler, but used the regex split method and couldn't tell if a comma was in quotes/brackets/etc. and didn't read double quotes as ...
4 votes, 5 answers, 950 views
Optimising single-delimiter string tokenisation
I am trying to optimise my tokenising of tab-delimited strings:
...
6 votes, 3 answers, 25k views
Splitting and printing comma-separated values
I stumbled upon a question on SO asking how to split a comma-separated string into individual values.
Since it's been a while since I've had any good reason to write C, I'd like to ask for some ...
5 votes, 1 answer, 216 views
String modification application
Below is the working code of a semi-complete program. Its purpose is to take an input string of any type and modify it based on rules defined for each type. So in this example I pass it a string in CSV ...
2 votes, 2 answers, 1k views
Speeding up and fixing phone numbers from CSVs with Regex
I've hodgepodged together an attempt to extract all phone numbers from all CSVs in a directory, regardless of where they are and what format they're in. I want all phone numbers to be printed to a ...
4 votes, 1 answer, 258 views
Cutting strings into smaller ones based on specific criteria
I've got this largish (for me) script, and I want to see if anybody could tell me if there are any ways to improve it, both in terms of speed, amount of code and the quality of the code. I still ...
3 votes, 2 answers, 4k views
Reading in a file and performing string manipulation
In a question I answered I posted the following code:
...
23 votes, 5 answers, 2k views
Generating CSV strings for various 3rd party utilities
I'm generating CSV strings for various 3rd party utilities and this section of code gets repeated in many classes. Is there a better way to generate this string?
...