All Questions
220
questions
7
votes
5answers
106 views
readable validation with python regex and simple logic
I have come up with this code for checking IP addresses, and I would like to know how clean and good it is on a scale from 1 to 10. What I care about are readability and simplicity. Please give me any feedback.
...
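One possible shape for such a check, as a minimal sketch. It assumes dotted-quad IPv4 addresses (the excerpt does not say which IP version is meant), and the helper name is_ipv4 is hypothetical. The regex only captures the four digit groups; the range check is left to plain integer logic.

```python
import re

# Assumption: dotted-quad IPv4 only; the question's own code is not shown.
IPV4_RE = re.compile(r"(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})")


def is_ipv4(text):
    """Return True if text looks like an IPv4 address with octets 0-255."""
    match = IPV4_RE.fullmatch(text)
    # The regex handles the shape; the numeric range is checked separately.
    return bool(match) and all(int(octet) <= 255 for octet in match.groups())
```

Note that this sketch accepts leading zeros such as 010.0.0.1; whether that is desirable depends on the use case.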
5
votes
2answers
932 views
Date Detection Regex in Python
I worked on a problem from Automate the Boring Stuff Chapter 7:
Write a regular expression that can detect dates in the DD/MM/YYYY
format. Assume that the days range from 01 to 31, the months range
...
3
votes
1answer
55 views
Regex with comments
The below is to parse a lisp expression (doing as much as possible in 'one go'). How does it look, and what can be improved?
...
2
votes
1answer
47 views
Matrix of Counts of Regex Hits Over a List of Strings
I have:
A dataframe with Identifiers (TermId) and Search Terms (SearchTerm).
A list of text strings (...
2
votes
1answer
60 views
Sanitize user-supplied HTML with Python and Regular Expressions
I have a product that needs to have users put content in a form that potentially contains HTML and display it back to other users. I'd like to mitigate the risk as much as possible, and I can limit ...
4
votes
3answers
302 views
use regex to separate a list of serial numbers into multiple lists with matched prefix
The question comes from How to separate a list of serial numbers into multiple lists with matched prefix? on Stack Overflow.
Input:
...
3
votes
2answers
170 views
Downloading and parsing research papers
I am trying to write a script which gets a research paper from a website by calling its API and then traverses it sentence-wise with some conditions.
The paper is accessible in XML format. I am ...
3
votes
4answers
226 views
A simplified regular expression matcher
I am working my way through some code challenges — partly to improve my problem solving, but also to improve my code quality.
I think my current (fully-functional) solution to a challenge is pretty ...
2
votes
1answer
49 views
Optimizing a script that switches function arguments in code files
I made a quick script that takes file paths (made for C files), function names and 2 indexes. The script goes over all of the files and switches the indexes of the 2 functions, e.g.:
Running the ...
8
votes
3answers
631 views
Function in Python to extract web Data
I developed this feature that I think can be improved quite a bit. The result is the desired one, but it took me many lines of code. Any idea to optimize it?
...
1
vote
1answer
224 views
Coin Flip Streaks script
I am attempting to complete the coin flip streaks problem from Automate the Boring Stuff with Python.
My code works fine but my only concern is the phrasing of the task.
Does the question want us to ...
5
votes
1answer
118 views
Calculator that finds the area under a curve
The program is meant to collect from the user:
The function under which to calculate the area
The left and right boundaries of the region
The amount and position of rectangles to use to approximate ...
4
votes
1answer
422 views
Finding Pattern Score in Python
I saw this problem in C++ on here and decided to try it in Python, which was much simpler. I've used the same problem blurb as in the link above, so they are consistent. I'm sure my code can be ...
6
votes
2answers
152 views
Separating data from string representation of objects, with added extras
Given a string representation of data, I want to extract the information into its corresponding object.
However,
If the string has "|" separators then these should be considered options and ...
3
votes
1answer
125 views
rename .html file from <title> tag with python
I have many HTML files saved on my computer. They have the same tags, like this: Rpi-Cam-Web-Interface- Page 2 - forum
and the page number changes
I want to rename file ...
1
vote
2answers
269 views
Python Regex to validate an email
I have written this regex to validate an email. It seems to work fine. Can someone advise on the regex used? Does it work properly, or can it be done in a better way?
...
2
votes
2answers
82 views
Simple text parser using regexes
I'm trying to write a simple parser using regexes. This is what I currently have; it looks really messy. Any tips on what I can change?
...
5
votes
2answers
255 views
Can it be shorter?: DNA Sequence Analyzer from CS50 written in Python
This is my first time requesting a code review. This is code I wrote in Python to create a DNA sequence analyzer. This is for the Harvard CS50 course.
I was really hoping to get some constructive ...
1
vote
1answer
758 views
Python - Password Generator & Strength Checker
I am a beginner in Python and I have attempted to create a small script/program which allows the user to do the following:
Generate a single random password
Generate a number of passwords specified ...
2
votes
0answers
40 views
Genetic sequence analyzer that reads FASTA and GenBank file formats and outputs all possible gene products
I have updated my gene sequencing program from my previous post.
That post explains what each function accomplishes.
If you need clarifications feel free to ask.
Any tips to make the code more ...
3
votes
1answer
252 views
Python - A Regex Text File Searcher by Content Project
I would like some advice on efficiency and ways to further remove redundant code.
Regex Text File Search:
enter a valid directory path, and a string to search for
makes a list containing all the ...
1
vote
3answers
2k views
Python - Making A Valid Date Checker - using Regular Expressions
Date Detection:
Write a regular expression that can detect dates in the DD/MM/YYYY format.
Assume that the days range from 01 to 31, the months range from 01
to 12, and the years range from 1000 to ...
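A minimal sketch of a pattern for the task quoted above. Days are limited to 01-31 and months to 01-12 as stated; the upper bound of the year range is cut off in the excerpt, so the sketch simply accepts any four digits, which is an assumption. The names DATE_RE and find_dates are hypothetical.

```python
import re

# Day 01-31, month 01-12; the year is any four digits (assumption: the
# excerpt's year range is truncated, so it is not enforced here).
DATE_RE = re.compile(r"\b(0[1-9]|[12]\d|3[01])/(0[1-9]|1[0-2])/(\d{4})\b")


def find_dates(text):
    """Return (day, month, year) integer tuples for each date-shaped match."""
    return [tuple(int(part) for part in m.groups())
            for m in DATE_RE.finditer(text)]
```

The regex only checks the shape of the date. Rejecting impossible dates such as 31/02 would need extra logic after the match.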
2
votes
2answers
162 views
Extracting min and max salary from string
What I want to do is extract the min/max range of salary from a text which contains either an hourly or an annual salary.
...
2
votes
2answers
158 views
Creating nested list comprehension of files starting with specific string
I have a directory with 'sets' of files that start with a state name followed by 4 or 5 digits (typically indicating year). Each 'file set' contains 3 files: a .txt, a .png, and a .jpg.
Example of ...
1
vote
2answers
51 views
Python function to find specific regex in the text of an XML document
I'm writing code that, starting from an XML file:
stores the index of child elements of a tag and the child elements as
key, values in a dictionary (function ...
1
vote
1answer
444 views
Phone Number and Email Address Extractor - Is there any way to simplify this code?
I'm very new to programming and I don't feel confident about the readability of this program.
This program gets the text you copied then extracts phone numbers and email addresses in the text. Once ...
-1
votes
1answer
77 views
Regex to remove spoken email names
I'm trying to remove email names from transcripts because they are unnecessary info that the transcript does not need.
The regex below removes email names from a spoken transcript.
Examples ...
3
votes
2answers
255 views
Regular Expressions - finding two or more vowels with a consonant before and after it, includes overlapping
Is there a way to make my code more compact or efficient?
Testcase example 1 : rabcdeefgyYhFjkIoomnpOeorteeeeet
output:
ee
Ioo
Oeo
eeeee
Testcase example 2 : baabooteen
output:
aa
oo
ee
...
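One way to get overlapping results like the ones in the test cases above is to keep the surrounding consonants out of the match with a lookbehind and a lookahead, so a consonant can close one run and open the next. This is a sketch under assumptions: the name vowel_runs is hypothetical, and y is treated as a consonant, which the first test case suggests.

```python
import re

# Consonants as character ranges (everything except a, e, i, o, u).
# Assumption: 'y' counts as a consonant.
VOWEL_RUN_RE = re.compile(
    r"(?<=[b-df-hj-np-tv-z])([aeiou]{2,})(?=[b-df-hj-np-tv-z])",
    re.IGNORECASE,
)


def vowel_runs(text):
    """Return runs of 2+ vowels that have a consonant on both sides."""
    return VOWEL_RUN_RE.findall(text)
```

Because the lookarounds do not consume characters, the "b" between "aa" and "oo" in the second test case can serve both runs.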
1
vote
2answers
109 views
Tokenizing a large document
I'm currently trying to process a corpus of a million patent text files, which contain about 10k non-unique words on average. My current data pipeline works as follows:
Load the patent texts as ...
1
vote
2answers
87 views
Cleaning HTMLs to extract text
I have written the following routine to clean raw HTML code and extract the text.
It works, but the code is not very well written. Any ideas on how to solve this faster and with fewer lines of code?
...
0
votes
0answers
50 views
Script for matching regex matches between two lists
The gist of this project is that I have two datasets:
1. dfCore, which contains station_code and station_name
2. dfMetrix, which contains station_name
the ...
6
votes
2answers
2k views
FQDN Validation
I'm new to programming, I'd like you to check my work and criticize the hell out of me. What would you do differently?
FQDN: https://en.m.wikipedia.org/wiki/Fully_qualified_domain_name
253 ...
1
vote
1answer
62 views
Match a multi line text block
I want to count the occurrences of a multi line ASCII pattern within a file.
The pattern looks like this:
a1
b2
c3
The file I want to search looks like this:
(...
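A minimal sketch of counting such a block, assuming the three pattern lines must appear as complete, consecutive lines of the file (the sample file is cut off in the excerpt, so this reading is an assumption). The names BLOCK_RE and count_blocks are hypothetical.

```python
import re

# With re.MULTILINE, ^ and $ anchor at line boundaries, so the block must
# span three whole lines: "a1", "b2", "c3".
BLOCK_RE = re.compile(r"^a1\nb2\nc3$", re.MULTILINE)


def count_blocks(text):
    """Count non-overlapping occurrences of the three-line block."""
    return len(BLOCK_RE.findall(text))
```

If the pattern may instead sit inside wider lines, a different approach would be needed.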
3
votes
2answers
70 views
Find and replace consecutive string values in a list
Suppose I have a list that holds a list of column names:
...
2
votes
1answer
509 views
Strong Password Checker (Python)
Problem
Write a program to return True, if a string is a strong password.
A password is considered strong if below conditions are all met:
It should have at least ...
5
votes
3answers
1k views
LeetCode 65: Valid Number (Python)
Problem
Validate if a given string can be interpreted as a decimal or scientific number.
Some examples:
...
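A rough regex sketch for this kind of check. The examples are cut off in the excerpt, so the exact rules here are assumptions: an optional sign, digits with an optional decimal point, an optional integer exponent, and no surrounding whitespace. The name is_number is hypothetical.

```python
import re

# Mantissa: "12", "12.", "12.5" or ".5"; exponent: "e" plus a signed integer.
NUMBER_RE = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?")


def is_number(text):
    """Return True if the whole string matches the assumed number format."""
    return NUMBER_RE.fullmatch(text) is not None
```

Using fullmatch avoids the need for explicit ^ and $ anchors in the pattern.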
5
votes
2answers
161 views
Splitting each word using longest substring that matches in a vocab/dictionary
This function takes in a string and a fixed list of vocab:
splits a string into words by spaces
for each word, check in the dictionary/vocab and find the longest matching substrings, if none matching ...
3
votes
1answer
474 views
Word frequency analysis: Python
I wrote some very rudimentary code that counts sentences and words in arbitrary text.
Code:
...
5
votes
1answer
69 views
Extract, filter and match three letters from the given arguments and predict the name
Case 1: rank1_naming
This function takes two arguments
list_proteins_pattern_available
best_match_protein_name
Objective: Extract the three letter pattern from the both arguments.
Match the ...
7
votes
4answers
1k views
Sort files in a given folders and provide as a list
I have two folders, query and subject. The following function sorts alphabetically and provides the query and subject as separate lists. The file names are mixed with numbers and I found the sort ...
3
votes
1answer
58 views
Run an external program and extract a pattern match along with the result file
The script takes two input files of protein sequences and runs an external program (installed on Linux/macOS). The result provides a text output file (example output). Identity percentage is extracted ...
1
vote
2answers
69 views
Extract three letter pattern from a given list
The code takes a list of names and extracts the three letter pattern from the names. It then returns a dictionary with the pattern as the key and the list of names as the value. The list name has a fixed naming ...
5
votes
1answer
975 views
Regex XML validator
I have written a Regex to validate XML/HTML, along with any attributes. It aims to:
Match any XML-like text
Not match any unclosed tags
Adapt for any spacing, newlines, etc.
Be as generous as ...
2
votes
1answer
54 views
HTTP scraper efficiency with multiprocessing
I built this scraper for work that will take a csv list of firewalls from our network management system and scan a given list of HTTPS ports to see if the firewalls are accepting web requests on the ...
1
vote
1answer
63 views
Short function to remove unnecessary whitespace
I have a function consisting of one line of code:
import re

def trimString(string):
    """ Remove unnecessary whitespace. """
    return re.sub(r'\s+', ' ', string).strip()
...
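For comparison, a regex-free variant is sketched below. str.split() with no arguments splits on runs of whitespace and drops leading and trailing whitespace, so joining the pieces with single spaces should give the same result for ordinary text. The names trim_regex and trim_split are hypothetical.

```python
import re


def trim_regex(text):
    """Collapse whitespace runs with a regex, then strip both ends."""
    return re.sub(r"\s+", " ", text).strip()


def trim_split(text):
    """Same idea without a regex: split on whitespace and rejoin."""
    return " ".join(text.split())
```

Which version reads better is a matter of taste; the split/join form avoids the regex import.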
3
votes
2answers
98 views
Checking the strength of a password using regexes
I'm a beginner in Python and I made a password checker. I'm following these rules:
I have to use regex
password must be at least 8 characters
must have at least one lower case character
must have at ...
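One way to keep such a checker readable is to list each rule as a small regex with a label. The sketch below covers only the two rules that are fully visible above (the remaining rules are cut off and are not guessed here); RULES and failed_rules are hypothetical names.

```python
import re

# Each entry pairs a compiled pattern with a human-readable rule label.
# Only the rules fully visible in the excerpt are included.
RULES = [
    (re.compile(r".{8,}", re.DOTALL), "at least 8 characters"),
    (re.compile(r"[a-z]"), "at least one lower case character"),
]


def failed_rules(password):
    """Return the labels of all rules the password does not satisfy."""
    return [label for pattern, label in RULES
            if not pattern.search(password)]
```

A password passes when the returned list is empty, and further rules can be added as extra entries in RULES.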
2
votes
1answer
697 views
Beautifulsoup and lxml (xpath) too slow with respect to regex when parsing HTML
I agree that using regex to parse HTML is not a good approach; in particular, I am worried about its fragility with respect to changes in the HTML.
The problem is that any alternatives are really too slow. ...