-1
votes
0answers
11 views

Trying to use Python regular expressions on a web server (via CGI); works fine in IDLE

def test_iterator(): with file as dictionary_file: for line in dictionary_file: if re.match(r"^%s$" % word_ex, line.decode('unicode_escape').encode('ascii','ignore')) is not ...
0
votes
1answer
25 views

Python regex issues

I'm trying to grab proxies from a site using Python by scanning through the page with urllib and finding proxies using regex. A proxy on the page looks something like this: <a ...
0
votes
1answer
29 views

re.compile does not work properly

I'm trying to find a tag using bs4, where the text is in the format: 'Firma: ...........'. The problem is that re.compile does not work for this at all. I can't figure out what I am doing wrong. Here is the code of ...
1
vote
2answers
41 views

Regex for name extraction on text file

I've got a plain text file containing a list of authors and abstracts and I'm trying to extract just the author names to use for network analysis. My text follows this pattern and contains 500+ ...
0
votes
1answer
27 views

Can't get a regex to handle brackets properly

Apologies for the vague title. I'm trying to get a regex that searches and OKs something like this: "Brand New Song [Demonstration]" by finding the "[Demonstration]" somewhere in the string, ...
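A minimal sketch of one way to look for a literal bracketed tag such as the one in the excerpt above. Square brackets are regex metacharacters (they open a character class), so they have to be escaped. The variable names here are illustrative assumptions, not taken from the question.

```python
import re

title = "Brand New Song [Demonstration]"

# re.escape() backslash-escapes the brackets so they are matched literally
# instead of being read as a character class.
literal_tag = re.compile(re.escape("[Demonstration]"))
has_tag = literal_tag.search(title) is not None

# A more general variant: any non-empty text enclosed in square brackets.
any_tag = re.search(r"\[[^\]]+\]", title)
tag_text = any_tag.group(0) if any_tag else None
```

`search` is used rather than `match` because the tag may appear anywhere in the string, not only at the start.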
-1
votes
2answers
43 views

Clean Messy Strings in Python

I have a messy inventory list (around 10K) to clean and I am having some problems using regular expressions in Python to achieve this. Here is a small sample of my list: product_pool=["#101 BUMP STOPPER ...
1
vote
4answers
52 views

Parse values from a block of text based on specific keys

I'm parsing some text from a source outside my control, that is not in a very convenient format. I have lines like this: Problem Category: Human Endeavors Problem Subcategory: Space ...
0
votes
1answer
55 views

Search and replace between two files with search phrase in first file: Python

File 1: $def String_to_be_searched (String to be replaced with) File 2: ..... { word { ${String to be searched} } # each line in File 2 is in this format ..... { word { ${String} } # This line ...
2
votes
3answers
28 views

parsing a path based on regex - Python

I'm trying to scan an Amazon S3 bucket, in order to search if a new version of our installer has been posted: The bucket scan returns me something like: versions/ versions/4.4.1.2/ ...
0
votes
1answer
41 views

Regular expression to parse sequence IDs

I'm having a bit of trouble with using regular expressions to extract information from flat files (just text). The files are structured as such: # ID (e.g. >YAL001C) Annotations/metadata (short ...
3
votes
3answers
60 views

Parsing of parenthesis with sed using regex

I am looking for a command in sed which transforms this input stream: dummy (key1) (key2)dummy(key3) dummy(key4)dummy dummy(key5)dummy))))dummy dummy(key6)dummy))(key7)dummy)))) into this one: ...
0
votes
3answers
47 views

Python re.split with comma and parenthesis “(” and “)” [duplicate]

so I got this code: from dosql import * import cgi import simplejson as json import re def index(req, userID): userID = cgi.escape(userID) get = doSql() rec = get.execqry("select ...
-2
votes
0answers
36 views

Taking out contents from .docx file into txt file in python [on hold]

I am trying to read the contents from .docx file and write it into another. for reading .docx file i did this import zipfile z = zipfile.ZipFile('abc.docx') data = z.read('word/document.xml') now ...
0
votes
2answers
35 views

How to fix a regex-based file refining program?

I want to refine my CSV file. I imported the fileinput, optparse, and re modules, loaded a CSV file, and set it up so that if a word doesn't exist, it is deleted. But I received a blank file. Here is my code: ...
-1
votes
1answer
27 views

Match repeating characters from set

from re import search import random while True: r = ''.join(random.choice(string.ascii_lowercase + string.ascii_uppercase + string.digits) for _ in range(random.randint(1, 100))) if ...
4
votes
2answers
46 views

Is there a way to refer to the entire matched expression in re.sub without the use of a group?

Suppose I want to prepend all occurrences of a particular expression with a character such as \. In sed, it would look like this. echo '__^^^%%%__FooBar' | sed 's/[_^%]/\\&/g' Note that the ...
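For reference, a short sketch of how the sed command above could be written with Python's `re.sub`: in a replacement template, `\g<0>` refers to the entire match, which plays the role of sed's `&`. The variable names are illustrative.

```python
import re

text = "__^^^%%%__FooBar"

# \g<0> in the replacement template stands for the whole match, so no
# capturing group is needed; the leading \\ emits one literal backslash.
escaped = re.sub(r"[_^%]", r"\\\g<0>", text)
print(escaped)
```

A function replacement such as `lambda m: "\\" + m.group(0)` would do the same thing and avoids the template escaping rules altogether.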
-2
votes
0answers
39 views

Python code quality standards: How pythonic can it go? [on hold]

So I started writing a simple dictionary parsing program that pulls out the words from a dictionary text file. It tests for the words that end in "ATH" and spits out the results to an output file. ...
-1
votes
2answers
41 views

What is the meaning of % in this python expression

Can someone explain what this regular expression means? I am looking at someone else's python code, and I just find myself curious as to what the expression is doing. I am also not certain what the ...
0
votes
1answer
50 views

multiple regex matches on one line not working

I have some HTML which I want to extract out text blocks that: begin with either # or | (pipe symbol) followed by some text and a 'ticker' in brackets followed by all text until the next match ...
-1
votes
1answer
38 views

Filter a list of tuples by pairwise pattern [on hold]

I have the following string, which is full of words and their parts of speech: [('lavadora', 'NCFS000'), ('sencilla', 'AQ0FS0'), ('facilidad', 'NCFS000'), ('casa', 'NCFS000'), ('marca', 'NCFS000'), ...
0
votes
0answers
19 views

Python NLTK RegexpParser

I am trying to break up the sentence "I am going to the market to buy vegetables and some fruits" into "I am going to the market" and "to buy vegetables and some fruits" This can be done using the ...
-2
votes
1answer
39 views

Python regexp: do not capture the word xyz between www. and .com in text data

View this Demo RegExp. I do not want to capture the word "xyz" when it appears between www. and .com. View Screen
-1
votes
5answers
52 views

Check that two characters are not adjacent in Python

In my program, I need to check a user-input equation to make sure that it's a valid equation. I got rid of any operators at the beginning or end by using myEquation[0].isdigit and ...
3
votes
2answers
69 views

Regular expression for multiple occurrences in Python

I need to parse lines having multiple language codes as below 008800002 Bruxelles-Nord$Br�ussel Nord$<deu>$Brussel Noord$<nld> 008800002 being a id Bruxelles-Nord$Br�ussel Nord$ ...
1
vote
6answers
58 views

Python regex for number with or without decimals using a dot or comma as separator?

I'm just learning regex and now I'm trying to match a number which more or less represents this: [zero or more numbers][possibly a dot or comma][zero or more numbers] No dot or comma is also okay. ...
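One possible reading of the description above, as a hedged sketch: the literal translation of "[zero or more numbers][possibly a dot or comma][zero or more numbers]" also accepts the empty string and a lone separator, so a stricter variant that requires at least one digit is shown next to it. The names `LOOSE`, `STRICT`, and `is_number` are assumptions made for illustration.

```python
import re

# Literal translation of the description; also matches "" and ".".
LOOSE = re.compile(r"\d*[.,]?\d*")

# Same shape, but the lookahead insists on at least one digit somewhere.
STRICT = re.compile(r"(?=.*\d)\d*[.,]?\d*")


def is_number(s, pattern=STRICT):
    """Return True if the whole string s matches the given pattern."""
    return pattern.fullmatch(s) is not None
```

`fullmatch` is used so that trailing input such as a second separator is rejected instead of being silently ignored.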
0
votes
3answers
36 views

Python regex capture text between “:” and “. ” (dot followed by whitespace)

I have some pieces of text like this: GAEDS030, GAEDS031, GAEDS032 : Problem reported in a https://twikiae.myweb.es /twiki/bin/view/Grid/ActFeb2011 previous entry has been observed in another disk ...
-5
votes
2answers
49 views

What is the shortest and/or most efficient regex that will match any input string?

I'm using Python's re module to filter a lot of data. I want to have a default filtering regex for when the user does not care, such that any input string will match. I think the shortest and most ...
0
votes
1answer
37 views

Preserving the order/occurrence of an adjective, noun label-id with a regular expression? [duplicate]

I'm new to Python. Could anybody help me create a regular expression given a list of strings like this: test_string = "pero pero CC tan tan RG antigua antiguo AQ0FS0 que ...
-1
votes
3answers
37 views

check if string contains special characters in python

I want to check if a password contains special characters. I have googled for a few examples but can't find one that addresses my problem. How do I do it? Here is how I am trying it so far: elif not ...
0
votes
2answers
38 views

How to match a string in python with conditional looping

I am a beginner in Python. I want to ask the user to input his first name. The name should only contain letters A-Z; if not, I want to display an error and request the user to enter the name again ...
0
votes
0answers
50 views

Python Pandas replace returns strange results

Don't know if this is a bug or a feature. Please forgive my ignorance if it is the latter. I try to use dataframe.replace to convert a str_type column to different texts. My dataframe column contains df = ...
-1
votes
2answers
43 views

Clean Dict Output

I have a dict output which has special characters which I don't need. I am trying to clean it but the code below does not seem to work all at once, i.e. I can either remove everything except numbers or ...
1
vote
1answer
22 views

Switching on regex matches

Can you suggest a nicer way to write the following: for r in replacements: m = pattern_1.match(r) if m: a.append((r,m.group(1),m.group(2),m.group(3))) continue m = ...
-6
votes
1answer
57 views

How to understand regular expression with python?

I'm new to Python. Could anybody help me create a regular expression given a list of strings?
0
votes
2answers
41 views

Python Regex: Trying to create pattern

I did my best to check the internet and stack for info but I am having trouble wrapping my head around regex for my utility. I have a string that follows this pattern: [any ...
0
votes
3answers
44 views

Extracting links with regex from source code; Python

I have a dataset of links to newspaper articles that I want to do some research on. However, the links in the dataset end with the .ece extension (which is a problem for me because of some API ...
-3
votes
2answers
39 views

How to pull all strings from a lists of lists that contain a certain character? [closed]

I have the following lists of lists and I'd like all strings within the list of lists that contain the "|" character. l = [['a','b','c|','d'],['1|','|2','3|','4'],['1|','2','3|','4','']] Results: ...
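A small sketch for the question above. No regex is needed for a single literal character; the `in` operator is enough. Because the expected result is cut off in the excerpt, both a flattened and a per-sublist version are shown, and which one is wanted is an assumption.

```python
rows = [['a', 'b', 'c|', 'd'],
        ['1|', '|2', '3|', '4'],
        ['1|', '2', '3|', '4', '']]

# Flattened: every string, from any sublist, that contains "|".
flat = [s for row in rows for s in row if "|" in s]

# Nested: the same filter, but preserving the sublist structure.
nested = [[s for s in row if "|" in s] for row in rows]
```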
-2
votes
3answers
46 views

Separating numbers from a string of which numbers are separated by a $ symbol?

I have taken a string in which numbers end with commas. The program is as shown below: import re s = 'natraj 12 dozen $100.25, camlin 10 box $1250.50,' lis = re.split('\s*\$\s*|\s*\,\s*', ...
-3
votes
1answer
45 views

Python: use of r“ ” and + token in re.sub() function

I am not able to understand what the re.sub() function does in Python. I have read the documentation and other StackOverflow posts, but none of them clearly explains the re.sub() function. Can someone ...
-2
votes
2answers
38 views

Parse repeated occurrence of +d in a line

This is the line to parse: 001000000 +3 12091992 +2 0200 +3 I have used like: Z = re.compile('(?P<stop_id>\d{9}) (?P<time_displacement>([-|+]\d{0,4})*)', flags=re.UNICODE) m = ...
1
vote
3answers
42 views

Python beautifulsoup extract value without identifier

I am facing a problem and don't know how to solve it properly. I want to extract the price (so in the first example 130€, in the second 130€). the problem is that the attributes are changing all the ...
-4
votes
0answers
71 views

How to split a string as per our need in python?

This is my input: "Vegitable 10 kg $100$ Taxi 10 kms $200$ mobile brothers $200$ clothes 2 shirts $1500.50$" And I want the output as a list: ['Vegitable 10 kg', '100 ','Taxi 10 kms ' ,' 200 ...
2
votes
2answers
51 views

Regular expression for optional fields in Python

I need to parse a line with a regular expression, with its last two parameters being optional. I am giving you an example and the expression I have written. exclaim and name are optional at the end. x ...
-2
votes
3answers
78 views

Python regular expression select “Nissan” word except between <a>…</a> or <span>…</span> tag

View on Live regex101 My regular expression pattern is [Nn]issan(?=[^<>]*<)(?!(?:(?!</?(?:a|span)[ >/])(?:.|\n))*</(?:a|span)>) I want to stop capture url inside nissan ...
-1
votes
3answers
53 views

How to make regular expression?

I have a text file which contains blocks like this. 15000 : 7072 : 25 dBm 15000 : 7073 : 23 dBm 15000 : 2551 : 18 dBm 15000 : 6102 : 24 dBm ...
0
votes
3answers
57 views

Nongreedy Regex with Repetition

I am using the following regex: ((FFD8FF).+?((FFD9)(?:(?!FFD8).)*)) I need to do the following with regex: Find FFD8FF Find the last FFD9 that comes before the next FFD8FF Stop at the last FFD9 ...
-1
votes
1answer
31 views

Splitting a comma-separated string

I have the following string, and I would like to split it to get an array of key:value pairs: color:'White', color:('White' or 'Black'),color:'YELLOW,BLACK', price: [11,12], price:{13, 14}, ...
0
votes
1answer
27 views

What regular expression to use to get a string that follows a certain word? [duplicate]

Hi, I am still a beginner and have been trying to figure out how to use a regular expression on this string: Name: Brenden Walski I want to get the value of the name or basically I want to get ...
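A hedged sketch of one common approach to the question above: anchor on the known label and capture whatever follows it in a group. The label `Name:` comes from the excerpt; the variable names and the choice to strip whitespace are assumptions.

```python
import re

line = "Name: Brenden Walski"

# Match the literal label, skip optional whitespace, then capture the
# rest of the line (".+" stops at a newline by default).
m = re.search(r"Name:\s*(.+)", line)
name = m.group(1).strip() if m else None
print(name)
```

Checking `m` before calling `group` avoids an `AttributeError` on lines that do not contain the label.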
-5
votes
2answers
31 views

String in Another String [closed]

How can you tell if a string is in another string using a regex? For example, hello is in hasdfasdeasdfasdflasdfasdflasdfasdfo, but not in haealala. (In Python.) Thanks!
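Going by the example in the question, "in" appears to mean that the letters occur in the same order but not necessarily next to each other (a subsequence). Under that assumption, one sketch is to join the escaped characters with `.*`; the function name `contains_in_order` is hypothetical.

```python
import re


def contains_in_order(needle, haystack):
    """Return True if the characters of needle appear in haystack in order."""
    # "hello" becomes the pattern "h.*e.*l.*l.*o"; re.escape keeps any
    # special characters in needle literal.
    pattern = ".*".join(re.escape(ch) for ch in needle)
    return re.search(pattern, haystack, flags=re.DOTALL) is not None
```

For example, `contains_in_order("hello", "haealala")` is False because the second string contains no "o". For a plain contiguous substring, `needle in haystack` is simpler and needs no regex.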
1
vote
2answers
57 views

How to find a textual description of emoticons, unicode characters and emoji in a string (python, perl)?

The detection and counting of emoticon icons has been addressed previously. As a follow-up on this question and the solution provided, I'd like to extend it with the ability to link the detected emoticons, ...