All Questions

4 votes
1 answer
87 views

Count words in a list of titles, with some cleanup

I have a list of article titles in which I wish to count the occurrences of each word (and remove some words and characters). The input is in a .csv file where the titles are in column '...
Jesper Mølgaard
8 votes
2 answers
10k views

String Similarity using fuzzywuzzy on big data

I have a file in which I want to check the string similarity among the names in a particular column. I use fuzzywuzzy's token sort ratio algorithm, as it is required for my use case. Here is the code, is ...
Rishab Oberoi
4 votes
2 answers
2k views

Compare two CSV files In Ruby

I wrote my first program in Ruby that compares two CSV files, but I'm sure there are more efficient ways to do it. I tried using the Ruby CSV library at first, but it was unproductive. Please let me ...
nybyjojo
0 votes
1 answer
86 views

Replacing address details in colon delimited string

I'm sanitising personal data in a large database with a series of Regular Expressions. In many cases the sensitive data is in a colon-delimited field within a CSV file. Here is a sample: ...
Mere Development
2 votes
1 answer
102 views

Splitting apart a comma separated list

I have a system that outputs a comma-separated string into a single column. I need to grab the individual values out of that list. The list could have random spaces and commas that need to be ...
gfrobenius
1 vote
2 answers
12k views

Replacing data in a .csv file

The code works well and does what I intend it to do. In essence, it opens a file referenced as 'resource'. This is a .csv file. It then searches for the keys in the dictionary and for each key that ...
thefragileomen
4 votes
3 answers
3k views

Parsing a single CSV line into a list of strings

I've written this method to replace an older method that was much simpler, but used the regex split method and couldn't tell if a comma was in quotes/brackets/etc. and didn't read double quotes as ...
Hanii Puppy
4 votes
5 answers
950 views

Optimising single-delimiter string tokenisation

I am trying to optimise my tokenising of tab-delimited strings: ...
PidgeyBAWK
6 votes
3 answers
25k views

Splitting and printing comma-separated values

I stumbled upon a question on SO asking how to split a comma-separated string into individual values. Since it's been a while since I've had any good reason to write C I'd like to ask for some ...
Etheryte
  • 654
5 votes
1 answer
216 views

String modification application

Below is the working code of a semi-complete program. Its purpose is to take an input string of any type and modify it based on rules defined for each type. So in this example I pass it a string in CSV ...
erotavlas
  • 201
2 votes
2 answers
1k views

Speeding up and fixing phone numbers from CSVs with Regex

I've hodgepodged together an attempt to extract all phone numbers from all CSVs in a directory, regardless of where they are and what format they're in. I want all phone numbers to be printed to a ...
Xodarap777
4 votes
1 answer
258 views

Cutting strings into smaller ones based on specific criteria

I've got this largish (for me) script, and I want to see if anybody could tell me if there are any ways to improve it in terms of speed, amount of code, and code quality. I still ...
erikfas
  • 279
3 votes
2 answers
4k views

Reading in a file and performing string manipulation

In a question I answered I posted the following code: ...
23 votes
5 answers
2k views

Generating CSV strings for various 3rd party utilities

I'm generating CSV strings for various 3rd party utilities and this section of code gets repeated in many classes. Is there a better way to generate this string? ...
Greg Buehler