0
votes
5 answers
36 views

Python- Regex for dir of certain depth

I have a regex but it's not working for all cases. I need it to be able to match any case of the following within two levels of depth: If this word "test_word" is in the statement return ...
1
vote
4 answers
40 views

Python - regex for dir

I have a regex but it's not working for all cases. I need it to be able to match any case of the following: If this word "test_word" is in the statement return true What I've been using ...
0
votes
2 answers
28 views

Python regular expressions match # ner

I am trying to extract the positions using regular expressions from a file like this: 36 17.89 N, 2 51.62 W 35 51.13 N, 2 51.62 W 35 51.13 N, 2 49.14 W 36 17.89 N, 2 49.14 W 36 17.89 N, 2 ...
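A minimal sketch of one way to pull such positions out, assuming every position has the shape "degrees minutes hemisphere, degrees minutes hemisphere" seen in the excerpt. The sample string and the name `position` are illustrative; the asker's real file is not shown.

```python
import re

# Sample text copied from the excerpt; the full file is not shown there.
text = "36 17.89 N, 2 51.62 W 35 51.13 N, 2 51.62 W"

# Degrees, decimal minutes and hemisphere, first for latitude, then longitude.
position = re.compile(r"(\d+) (\d+\.\d+) ([NS]), (\d+) (\d+\.\d+) ([EW])")

# findall returns one tuple of captured strings per position found.
positions = position.findall(text)
for lat_deg, lat_min, lat_hem, lon_deg, lon_min, lon_hem in positions:
    print(lat_deg, lat_min, lat_hem, lon_deg, lon_min, lon_hem)
```

The groups come back as strings, so a caller would still convert them with `int` and `float` before doing arithmetic.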
-2
votes
1 answer
30 views

how should I extract the double-quoted part of the text in python? [on hold]

I have a csv file, in which each record is a text. I would like my code to extract only the double-quoted part of the text, find the synonym of the extracted text in the thesaurus library, and update ...
-1
votes
1 answer
22 views

How do you perform replacement of text in files and paste the replaced text at a specific position in one file

""" I am using this program to perform replacement of text in four different files and then compare this to a template and then paste the replaced text at a specific position in a ...
0
votes
2 answers
38 views

re regex find first half but keep second half of line

Folks, doing my head in trying to search this question, as I find it strange to describe briefly... I am trying to strip out unnecessary text from a bank statement eg: source: TFR 09343-9724 to ...
0
votes
0 answers
46 views

Replace placeholders from a file in another file

I am trying to create a utility which would: Extract placeholder from a .plh file placed on the file system. Create a hash map through the delimited key-value pairs. Replace the key from value ...
1
vote
6 answers
56 views

Regex issue in python

I have a regex "value=4020a345-f646-4984-a848-3f7f5cb51f21" if re.search( "value=\w*|\d*\-\w*|\d*\-\w*|\d*\-\w*|\d*\-\w*|\d*", x ): x = re.search( ...
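One likely reason the pattern in this excerpt misbehaves is that `|` has the lowest precedence in a regex, so the expression splits into separate alternatives instead of describing one hyphenated value. A hedged sketch follows, assuming the goal is to capture the lowercase hexadecimal ID after `value=` (the excerpt is truncated, so that goal is a guess; `uuid_like` is an illustrative name):

```python
import re

x = "value=4020a345-f646-4984-a848-3f7f5cb51f21"

# 8 hex digits, three groups of 4, then 12, separated by hyphens.
uuid_like = re.compile(r"value=([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})")

match = uuid_like.search(x)
value = match.group(1) if match else None
print(value)
```

Searching once and reusing the match object also avoids running the same pattern twice.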
1
vote
2 answers
29 views

Comparing multiple file items using re

Currently I have a script that finds all the lines across multiple input files that have something in the format of Matches: 500 (54.3 %) and prints out the top 10 highest matches in percentage. I ...
0
votes
1 answer
39 views

Pythonic way to strip a string

I am trying to achieve a python equivalent of the following bash command: VERSION=$( curl --silent "http://nexus:8080/nexus/service/local/lucene/search?g=com.xxx.yyy&a=zzz" | sed -n ...
0
votes
1 answer
35 views

Python difflib with regex

I would like to compare a string A with a regex R. A = u'Hi my friend, my name is Julio' R = r'Hi\s+my\s+friend,\s+my\s+name\s+is([A-Za-z]+)' At this time I can easily know if the syntax is good ...
0
votes
1 answer
29 views

Python regex not working in code

I've uploaded it here.. http://regex101.com/r/hC2eH3/1 Let email.body = > From: "FastTech" <[email protected]> > > == Ship to == > Example Name > > Shipping via ...
-1
votes
1 answer
36 views

TypeError: expected string or buffer with re.match and matchObj.group()

I keep getting the error: Traceback (most recent call last): File "ba.py", line 13, in <module> matchObj = re.match(r"^(\w+ \w+) batted (\d+) times with (\d+) hits and (\d+) runs", line) ...
0
votes
3 answers
37 views

How to write a regex that preserves the order of appearance with Python?

Given the following string, I would like to extract a tuple such that the tuple preserves the appearance of an associated id (POS tag). The order is: NCFS000, AQ0CS0. They need to be consecutive, no ...
1
vote
2 answers
49 views

Why does regex findall return a weird \x00

I use a regex to build a list of all key-value pairs present on a line (string). My key-value pair syntax matches the following regex: re.compile("\((.*?),(.*?)\)") Typically I have to parse a string ...
0
votes
2 answers
31 views

Using lxml or ??? to extract information from webpages

Currently I have the following code: # Import the Python modules import urllib import lxml import mechanize import sys # Establish the connection to the URL try: URL = urllib.urlopen("http://...") ...
2
votes
1 answer
43 views

How to replace and shift the string pattern in python using re?

I have a snippet like re.sub(r"""\s*(\p{LD}+)\s+NEAR/(\d)\s+(\p{LD}+)\s*""",r""""$1 $3"~$2""",'foo NEAR/4 bar') in Python. The expected output is "foo bar"~4 but now I am getting foo NEAR/4 bar ...
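Two things stand out in that snippet: Python's standard `re` module does not support `\p{...}` property classes, and `re.sub` replacement templates use `\1` (or `\g<1>`) rather than `$1`. A minimal sketch, assuming `\w+` is an acceptable stand-in for `\p{LD}+` (note that `\w` also matches the underscore) and using the illustrative function name `near_to_proximity`:

```python
import re

# \w+ stands in for \p{LD}+ (an assumption; stdlib re has no \p{...} classes).
NEAR = re.compile(r"\s*(\w+)\s+NEAR/(\d)\s+(\w+)\s*")

def near_to_proximity(query):
    # Backreferences in the replacement are written \1, \2, \3 and not $1, $2, $3.
    return NEAR.sub(r'"\1 \3"~\2', query)

result = near_to_proximity("foo NEAR/4 bar")
print(result)
```

If full Unicode property support is needed, a third-party regex engine would be required; that is outside what the excerpt shows.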
5
votes
3 answers
146 views

Why is Python regex so slow?

After long debugging I found why my application using Python regexps is slow. Here is something I find surprising: import datetime import re pattern = re.compile('(.*)sol(.*)') lst = ["ciao mandi ...
0
votes
2 answers
64 views

Why does Python choke on this regex?

This looks like a simple regex, no backreferences, no "any" characters, I'd even dare to say it's parseable by a Thompson DFA and all. It even works, but chokes on very simple non-matches. {\s*? ...
0
votes
4 answers
45 views

Python: how to split a string into words, including words with a single quote?

I have a string a, and I would like to return a list b containing the words in a that do not start with @ or # and do not contain any non-word characters. However, I'm having trouble keeping words like ...
0
votes
0 answers
30 views

regular expression issue in django urls.py

I am serving static files in Django locally through their simple server. I have the following line included in my urls.py: static(settings.STATIC_URL, document_root=settings.STATIC_ROOT) (Bonus ...
0
votes
0 answers
61 views

How do I sanitize a list comprehension given by a user?

I am working on an interface for a simulator that is meant to be friendly to people who prefer the command line to a GUI. To give the simulator the levels, the user types the information into a file, ...
0
votes
3 answers
46 views

Replace two different characters with Regex

I want to use regex to add a space between the parentheses and the arithmetic operators and digits. For example, I want to replace (+ 2 3) with ( + 2 3 ) I wrote this regex, but it doesn't seem to ...
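A small sketch of one way to get from `(+ 2 3)` to `( + 2 3 )`, assuming it is acceptable to pad every parenthesis with spaces and then collapse runs of whitespace. The function name `space_parens` is illustrative, and the asker's own regex is not visible in the excerpt.

```python
import re

def space_parens(expr):
    # Surround each parenthesis with spaces, then normalise the whitespace.
    padded = re.sub(r"([()])", r" \1 ", expr)
    return " ".join(padded.split())

spaced = space_parens("(+ 2 3)")
print(spaced)
```

Capturing both characters in one class `[()]` means a single substitution handles the opening and the closing parenthesis.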
0
votes
1 answer
23 views

How to use regex to capture the correct repeated group?

An HTML document contains 2 lines far apart from each other, like below. Please note that there are 2 identical strings at the beginning of these two lines. <a href="http://example.com">file ...
1
vote
0 answers
47 views

Finding fuzzy ratio average of certain words in a text file

I am trying to find the fuzzy ratio between certain words in a text file and get its average. I have written code which will find the fuzzy ratio of all the lines in a text file with 'hello' but not ...
-1
votes
0 answers
19 views

Trying to use Python regular expressions on a web server (via CGI), works fine in IDLE.

def test_iterator(): with file as dictionary_file: for line in dictionary_file: if re.match(r"^%s$" % word_ex, line.decode('unicode_escape').encode('ascii','ignore')) is not ...
0
votes
1 answer
28 views

Python regex issues

I'm trying to grab proxies from a site using Python by scanning through the page with urllib and finding proxies using regex. A proxy on the page looks something like this: <a ...
0
votes
1 answer
30 views

re.compile does not work properly

I'm trying to find a tag using bs4, where the text is in the format: 'Firma: ...........'. The problem is that re.compile does not work for this at all. I can't find out what I am doing wrong. Here is the code of ...
1
vote
2 answers
45 views

Regex for name extraction on text file

I've got a plain text file containing a list of authors and abstracts and I'm trying to extract just the author names to use for network analysis. My text follows this pattern and contains 500+ ...
0
votes
1 answer
28 views

Can't get a regex to handle brackets properly

Apologies for the vague title. I'm trying to get a regex that searches and OKs something like this: "Brand New Song [Demonstration]" by finding the "[Demonstration]" somewhere in the string, ...
-1
votes
2 answers
48 views

Clean Messy Strings in Python

I have a messy inventory list (around 10K) to clean and I am having some problems using regular expressions in Python to achieve this. Here is a small sample of my list: product_pool=["#101 BUMP STOPPER ...
1
vote
4 answers
65 views

Parse values from a block of text based on specific keys

I'm parsing some text from a source outside my control, that is not in a very convenient format. I have lines like this: Problem Category: Human Endeavors Problem Subcategory: Space ...
0
votes
1 answer
56 views

Search and replace between two files with search phrase in first file: Python

File 1: $def String_to_be_searched (String to be replaced with) File 2: ..... { word { ${String to be searched} } # each line in File 2 is in this format ..... { word { ${String} } # This line ...
2
votes
3 answers
30 views

parsing a path based on regex - Python

I'm trying to scan an Amazon S3 bucket, in order to search if a new version of our installer has been posted: The bucket scan returns me something like: versions/ versions/4.4.1.2/ ...
0
votes
1 answer
46 views

Regular expression to parse sequence IDs

I'm having a bit of trouble with using regular expressions to extract information from flat files (just text). The files are structured as such: # ID (e.g. >YAL001C) Annotations/metadata (short ...
3
votes
3 answers
62 views

Parsing of parenthesis with sed using regex

I am looking for a command in sed which transforms this input stream: dummy (key1) (key2)dummy(key3) dummy(key4)dummy dummy(key5)dummy))))dummy dummy(key6)dummy))(key7)dummy)))) into this one: ...
0
votes
3 answers
47 views

Python re.split with comma and parenthesis “(” and “)” [duplicate]

so I got this code: from dosql import * import cgi import simplejson as json import re def index(req, userID): userID = cgi.escape(userID) get = doSql() rec = get.execqry("select ...
0
votes
2 answers
36 views

How to fix a regex-based file refining program?

I want to refine my CSV file. I imported the fileinput, optparse, and re modules, loaded a CSV file, and set it so that if a word doesn't exist, it is deleted. But I received a blank file. Here is my code: ...
-1
votes
1 answer
27 views

Match repeating characters from set

from re import search import random while True: r = ''.join(random.choice(string.ascii_lowercase + string.ascii_uppercase + string.digits) for _ in range(random.randint(1, 100))) if ...
4
votes
2 answers
48 views

Is there a way to refer to the entire matched expression in re.sub without the use of a group?

Suppose I want to prepend all occurrences of a particular expression with a character such as \. In sed, it would look like this. echo '__^^^%%%__FooBar' | sed 's/[_^%]/\\&/g' Note that the ...
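In Python's `re.sub`, the counterpart of sed's `&` is `\g<0>`, which refers to the entire match without an explicit group. A minimal sketch using the input from the excerpt (the variable name `escaped` is illustrative):

```python
import re

text = "__^^^%%%__FooBar"

# In the replacement template, \\ yields a literal backslash and
# \g<0> is the whole match, so no capturing group is needed.
escaped = re.sub(r"[_^%]", r"\\\g<0>", text)
print(escaped)
```

A callable replacement such as `lambda m: "\\" + m.group(0)` would give the same result and avoids the template escaping rules.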
-2
votes
0 answers
41 views

Python code quality standards: How pythonic can it go? [closed]

So I started writing a simple dictionary parsing program that pulls out the words from a dictionary text file. It tests for the words that end in "ATH" and spits out the results to an output file. ...
-1
votes
2 answers
41 views

What is the meaning of % in this python expression

Can someone explain what this regular expression means? I am looking at someone else's python code, and I just find myself curious as to what the expression is doing. I am also not certain what the ...
0
votes
1 answer
51 views

multiple regex matches on one line not working

I have some HTML from which I want to extract text blocks that: begin with either # or | (pipe symbol) followed by some text and a 'ticker' in brackets followed by all text until the next match ...
-1
votes
1 answer
40 views

Filter a list of tuples by pairwise pattern [closed]

I have the following string, which is full of words and their parts of speech: [('lavadora', 'NCFS000'), ('sencilla', 'AQ0FS0'), ('facilidad', 'NCFS000'), ('casa', 'NCFS000'), ('marca', 'NCFS000'), ...
-2
votes
1 answer
40 views

Python Regexp do not capture www. or .com inside xyz word in text data

View this Demo RegExp. I do not want to capture the word "xyz" between www. and .com. View Screen
-1
votes
5 answers
52 views

Check that two characters are not adjacent in Python

In my program, I need to check a user-input equation to make sure that it's a valid equation. I got rid of any operators at the beginning or end by using myEquation[0].isdigit and ...
3
votes
2 answers
71 views

Regular expression for multiple occurrences in Python

I need to parse lines having multiple language codes as below 008800002 Bruxelles-Nord$Brüssel Nord$<deu>$Brussel Noord$<nld> 008800002 being an id Bruxelles-Nord$Brüssel Nord$ ...
1
vote
6 answers
60 views

Python regex for number with or without decimals using a dot or comma as separator?

I'm just learning regex and now I'm trying to match a number which more or less represents this: [zero or more numbers][possibly a dot or comma][zero or more numbers] No dot or comma is also okay. ...
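The description "[zero or more numbers][possibly a dot or comma][zero or more numbers]" translates almost word for word into a pattern. The sketch below is a literal reading of it, plus a stricter variant that insists on at least one digit; the names `number` and `strict_number` are illustrative, and `re.fullmatch` requires Python 3.4 or later.

```python
import re

# Literal reading: digits, an optional dot or comma, digits.
number = re.compile(r"\d*[.,]?\d*")

# Stricter variant (an assumption about intent): require at least one digit,
# so that "" and a lone "." are rejected.
strict_number = re.compile(r"(?=.*\d)\d*[.,]?\d*")

def is_number(text, pattern=number):
    # fullmatch anchors the pattern at both ends of the string.
    return pattern.fullmatch(text) is not None

for candidate in ["123", "1.5", "1,5", ".5", "1.2.3", "abc", ""]:
    print(repr(candidate), is_number(candidate), is_number(candidate, strict_number))
```

Because every part of the literal pattern is optional, it also accepts the empty string, which is why the stricter variant may be closer to what is wanted.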
0
votes
3 answers
37 views

Python regex capture text between “:” and “. ” (dot followed by whitespace)

I have some pieces of text like this: GAEDS030, GAEDS031, GAEDS032 : Problem reported in a https://twikiae.myweb.es /twiki/bin/view/Grid/ActFeb2011 previous entry has been observed in another disk ...
-5
votes
2 answers
49 views

What is the shortest and/or most efficient regex that will match any input string?

I'm using Python's re module to filter a lot of data. I want to have a default filtering regex for when the user does not care, such that any input string will match. I think the shortest and most ...
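As a hedged illustration of such a default filter: the empty pattern matches at position 0 of any string, including the empty string and strings containing newlines, so it can serve as a "match everything" default. The name `match_anything` and the sample inputs are illustrative only.

```python
import re

# The empty pattern always succeeds with a zero-length match at position 0.
match_anything = re.compile("")

samples = ["", "abc", "line one\nline two"]
results = [match_anything.search(s) is not None for s in samples]
print(results)
```

Whether this is the most efficient choice depends on how the filter is applied; skipping the regex call entirely when the user supplies no filter avoids the question.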