0
votes
1answer
22 views

Python regex for matching words containing characters and digits but not words with only digits [on hold]

I am trying to create a regular expression in Python 3 that matches words containing "A-Za-z" or "A-Za-z0-9" but not only "0-9". For example, I want to match "fooT", "foo23", "fo24ooo", "fo4o444" but NOT ...
-1
votes
2answers
30 views

Python: Take URL as an argument

I have made this script that currently works as I want it to. The URL (visible at the bottom of the script) is obviously hard coded into the script. I want the script to prompt the user for the URL or ...
0
votes
2answers
45 views

Removing all digits from a string in Python

So I need some help removing the digits from this string: import re g="C0N4rtist" re.sub(r'\W','',g) print(re.sub(r'\W','',g)) It should look like CNrtist but instead it gives me 04. I've ...
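A minimal sketch of one likely fix, assuming the goal is only to drop the digits 0-9: `\W` matches non-word characters, whereas `\d` matches digits. The helper name `strip_digits` is made up for illustration.

```python
import re

def strip_digits(text):
    # \d matches any digit; \W (used in the question) matches NON-word characters
    return re.sub(r"\d", "", text)

print(strip_digits("C0N4rtist"))
```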
-1
votes
1answer
20 views

Split document into multiple files based on pattern

I'm trying to split a large text document of articles into multiple text files based on a boundary like this: 9 of 10 DOCUMENTS at the beginning of each chunk. Everything after that pattern but ...
0
votes
1answer
15 views

fnmatch does not work with variables but with static strings

The following code does not find any of the patterns defined in the file patterns. #!/usr/bin/env python import os import fnmatch patternFile = open('patterns', 'r') patterns = ...
-1
votes
2answers
43 views

I can't capture the “?” character in a Django URL

I am trying to write a redirect function in Django. I provide the URL, and I want to redirect to the provided URL. In my urls.py: urlpatterns = patterns('', url(r'^redirect/(?P<name>.*)$', ...
0
votes
2answers
45 views

Pandas clean column and apply optional multiplier

I'm using python and pandas. However this might be a regex question.... BE WARNED! I have a dataframe similar to the following: 21 190000 27 170000 29 120k 31 110K 33 100000s 38 ...
3
votes
2answers
46 views

How to find title a la reStructuredText

Is there a regex pattern that can match titles in the following reStructuredText-like text? The difficulty is that the number of equal signs must be equal to the length of the title. Some basic text. ...
0
votes
1answer
31 views

Django URL patterns don't match

I created a Django app. I want to match example.com/hat/12 to example.com/hat/?hat_id=12. I am trying to send that from a "get" form like this: <form action="/hat/" method="get"> ...
3
votes
2answers
35 views

Regex does not work as expected in interpreter

I want to search through a string and find units in Fahrenheit and convert them into Celsius. To do this, my approach is to use regex to find units in Fahrenheit in a given string, and if I find ...
0
votes
1answer
59 views

Python regex to recognize Chinese numerals

Using python 2.7 I am trying to write a regex that can recognize any utf-8 number 0-9 (not just arabic numerals, but simplified chinese as well) and any unicode word character. For example I have: ...
2
votes
1answer
42 views

Half-space in regex

I am supposed to write a little program that takes in a Persian text and in some places changes the space to half-space. The half-space or a zero-width non-joiner is used in some languages to avoid a ...
2
votes
3answers
26 views

Email ID matcher: can't figure out the Python regular expression

I am trying to match a specific type of email address of the form username@siteaddress, where username is a non-empty string of minimum length 5 built from the characters {a-z A-Z 0-9 . _}. The username cannot ...
-2
votes
1answer
33 views

If statement for regex in string [on hold]

I'm trying to check whether a given string matches a pattern. Input may look like this: 3256-10wyput, so we've used this pattern: [0-9]{4}\-[0-9a-z]{7}. I want to prepare an if statement which checks does ...
0
votes
4answers
32 views

Regex: replace all underscores and special chars with spaces?

So I'm trying to replace all special chars with spaces using regex. My code works, but it won't replace underscores. What should I do? Code: new_str = re.sub(r'[^\w]', ' ', new_str) It's working ...
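A hedged sketch of one way around this, assuming "special chars" means anything that is not a letter or digit: `\w` includes the underscore, so `[^\w]` never matches it, while `[\W_]` does. The function name `specials_to_spaces` and the sample input are hypothetical.

```python
import re

def specials_to_spaces(text):
    # \w includes the underscore, so [^\w] skips it;
    # [\W_] matches any non-word character OR an underscore
    return re.sub(r"[\W_]", " ", text)

print(specials_to_spaces("foo_bar-baz!qux"))
```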
1
vote
2answers
52 views

How can I remove all punctuation from a string?

To remove all punctuation from a string, x, I want to use re.findall(), but I've been struggling to work out what to write in it. I know that I can get all the punctuation by writing: import ...
0
votes
1answer
27 views

python: square bracket in character class

I'm trying to match square brackets (using a character class) in Python, but the following code is not successful. Does anybody know the correct way to do this? #!/usr/bin/env python import re prog ...
4
votes
1answer
86 views

Which pattern has been found?

In the following wrong code, I would like to have the following information for each match found. The alternative \w+ or \d+ that has been found. The position in the text of the match found. I would ...
1
vote
2answers
55 views

Python matching a real/float number with regex

I'm not the best at re. Can anyone tell me if this pattern will work to return a single occurrence of a whole or decimal number before the occurrence of the literal,"each"? The number and the string ...
1
vote
2answers
47 views

Extract IP out of string with Python

I just had a look at regex and I'm a bit confused. I wrote a program which analyses the "auth.log" file in real time, line by line. Now I need different pieces of information out of the entries. if "sshd" in ...
2
votes
1answer
46 views

Python regex not working as expected

Why does the following python regex not generate @Summary\n? import re re.sub('$~ ','@','~ Summary\n')
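A short illustration of what is probably going on, assuming the intent was to anchor at the start of the string: `$` asserts the end of the string, so nothing can follow it and `$~ ` never matches; `^` is the start anchor.

```python
import re

text = "~ Summary\n"

# '$' anchors at the end of the string, so '$~ ' can never match: nothing is replaced
unchanged = re.sub(r"$~ ", "@", text)

# '^' anchors at the start, which is presumably what was intended
fixed = re.sub(r"^~ ", "@", text)

print(repr(unchanged))
print(repr(fixed))
```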
0
votes
4answers
50 views

Python Regex: How to match filenames with optional suffix? Why '(.*?)(\.suffix)?' doesn't work?

I have filenames like this: xxx xxx.suffix xxx xxx.suffix I want to find all the xxx's (which could be anything but does not contain '.suffix') and get rid of the suffixes. I tried ...
1
vote
1answer
40 views

css working on all but two pages

Specifically, pages using an optional regular expression. By optional, I mean PAGE_RE below. I am creating a Wiki. If a user searches a term, and that term doesn't already exist, then the user is ...
0
votes
2answers
26 views

no match returned for this regular expression

I am using raw string notation to express a fairly simple regular expression and I am not getting a match object. Shell transcript follows: [~/Documents/Programming/rlm]$ python python Python 2.7.5 ...
1
vote
3answers
53 views

Saving Images from URL [on hold]

I'm trying to create a script that will download and save all image files from a website into a directory. This is my code but I can't get it to download the files and save them, can anyone see why ...
4
votes
1answer
52 views

Regex: How do I capture a group after an optional capturing group using regular expressions?

Suppose I have the following strings: s1=u'--FE(-)---' s2=u'--FEM(-)---' s3=u'--FEE(--)-' and I want to match F,E,E,M and the content of the parentheses in different groups. I have tried the ...
0
votes
2answers
40 views

Weird behaviour of django dynamic url

I use dynamic URLs in django. It works fine for integer values, and works for strings if the dynamic part is the end if the URL. When there is some other component in the URL after the dynamic ...
1
vote
2answers
45 views

Python : Regular expression

I have the following code which does what I want, retrieve the package name from the result of that command : command : dpkg --get-selections | grep amule string to analyze : string = ...
2
votes
1answer
51 views

Performing incremental regex searches in huge strings (Python)

Using Python 2.6.6. I was hoping that the re module provided some method of searching that mimicked the way str.find() works, allowing you to specify a start index, but apparently not... search() ...
2
votes
2answers
28 views

Python 3.3.3 what happened when re.compile('e') and re.compile('\e')?

Update 1: >>> '\e' '\\e' Above shows that Python literal parser treats '\e' as two literals \ and e. Am I right? If so, re.compile('\e') should also follow this rule first. i.e., It ...
1
vote
1answer
57 views

Python re.sub considered slow?

I am fairly new to Python. I am building a script to grovel through a log file, like I have done a hundred times in Perl. I am using a hash to count occurrences of certain fields in the log file, like ...
2
votes
3answers
58 views

Regex non-greedy OR

Say I have 3 regular expressions A, B and C. I need to match either A and B together or separately but always at least one. C is optional. My combined regex so far is A?B?C but if A and B don't ...
1
vote
2answers
21 views

Python Special Chars Escape

How would I get this into a string in Python? I know I need to escape special chars but I don't know how. I tried adding backslashes but it didn't seem to work. I'm not sure what to do. Here is the ...
0
votes
2answers
47 views

Replacing multiple characters in a string

I have a csv file that looks like this: Mon-000101,100.27242,9.608597,11.082,10.034,0.39,I,0.39,I,31.1,31.1,,double with 1355,,,,,,,, ...
0
votes
4answers
45 views

What am i doing wrong with this regular expression

links = re.findall('href="(http(s?)://[^"]+)"',page) I have this regular expression to find all links in a website, I am getting this result: ('http://asecuritysite.com', '') ...
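A sketch of the usual explanation, with a made-up `page` string standing in for the real HTML: when a pattern has more than one capturing group, `re.findall` returns a tuple per match. Writing the optional "s" as `https?` (no inner group) leaves a single group, so plain strings come back.

```python
import re

# Hypothetical sample input; the real page content is not shown in the question
page = '<a href="http://asecuritysite.com">a</a> <a href="https://example.org/x">b</a>'

# One capturing group only, so findall returns a list of strings, not tuples
links = re.findall(r'href="(https?://[^"]+)"', page)
print(links)
```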
0
votes
1answer
53 views

Python re.search() and re.findall()

I am trying to solve this problem from Hackerrank. It is a Machine Learning problem. Initially, I tried to read all the words from the Corpus file for building unigram frequencies. According to ...
0
votes
2answers
48 views

Q: How do I deal with a logical expression in Python?

let's say I got a logical expression in the format of ie. AvBv~C->D . It consists of boolean elements and operators like (v,~,->) (disjunction,negation,implication). I need to store those ...
0
votes
1answer
32 views

python parsing string using regex [closed]

I need to parse a string from this: CN=ERT234,OU=Computers,OU=ES1-HER,OU=ES1-Seura,OU=RES-ES1,DC=resu,DC=kt,DC=elt To this: ES1-HER / ES1-Seura Any easy way to do this with regex?
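One possible approach, sketched under the assumption that the wanted values are always the second and third OU components (the question does not say how they are chosen). The function name `middle_ous` is hypothetical.

```python
import re

dn = "CN=ERT234,OU=Computers,OU=ES1-HER,OU=ES1-Seura,OU=RES-ES1,DC=resu,DC=kt,DC=elt"

def middle_ous(dn):
    # Collect every OU value in order of appearance
    ous = re.findall(r"OU=([^,]+)", dn)
    # Assumption: keep the second and third OU entries only
    return " / ".join(ous[1:3])

print(middle_ous(dn))
```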
1
vote
2answers
68 views

Python re.search

I have a string variable containing string = "123hello456world789". The string contains no spaces. I want to write a regex that prints only the words containing (a-z). I tried a simple regex pat = ...
1
vote
1answer
48 views

[Python]: Python re.search speed optimization for long string lines

I just want to ask how to speed up re.search in Python. I have a long string line of length 176861 (i.e. alphanumeric characters with some symbols) and I tested this line for an re.search ...
0
votes
4answers
32 views

Regex to retrieve the last few characters of a string

Regex to retrieve the last portion of a string: https://play.google.com/store/apps/details?id=com.lima.doodlejump I'm looking to retrieve the string followed by id= The following regex didn't seem ...
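A minimal sketch, assuming the value of interest is whatever follows `id=` up to the next `&` or `#` (or the end of the URL). The variable names are illustrative.

```python
import re

url = "https://play.google.com/store/apps/details?id=com.lima.doodlejump"

# Look for "id=" introduced by "?" or "&" and capture up to the next "&" or "#"
match = re.search(r"[?&]id=([^&#]+)", url)
app_id = match.group(1) if match else None
print(app_id)
```

For URLs with several query parameters, `urllib.parse.parse_qs` from the standard library may be a sturdier alternative to a regex.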
1
vote
2answers
46 views

Regex with unicode and str

I have a list of regex and a replace function. regex function replacement_patterns = [(ur'\\u20ac', ur' euros'),(ur'\xe2\x82\xac', r' euros'),(ur'\b[eE]?[uU]?[rR]\b', r' euros'), ...
1
vote
1answer
29 views

Unicode issues when using NLTK

I have a text scraped from internet (I think it was a Spanish text encoded in "latin-1" and decoded to unicode when scraped). The text is something like this: 730\u20ac.\r\n\nropa nueva 2012 ... 5,10 ...
0
votes
1answer
24 views

Python extracting number from HTML tag using beautiful soup

I am working on a web scraper using beautiful soup. Here is my function: journalist_result = soup.find_all("h4",class_="slab") if len(journalist_result)>0: journalist_share = ...
1
vote
2answers
41 views

IPv4 address substitution in Python script

I'm having trouble getting this to work, and I am hoping for any ideas: My goal: to take a file, read it line by line, substitute any IP address for a specific substitute, and write the changes to ...
-1
votes
1answer
59 views

How does the regular expression module in Python interpret the pattern '\\\\mac\\\\'?

I cannot figure out how the regular expression module interprets the pattern \\\\mac\\\\. In Python it comes out as \\mac\\. However, I wonder why the re module in Python does not continually ...
-6
votes
0answers
27 views

How to write a set of acceptable characters and unacceptable characters in regular expressions? [closed]

I want to write a regular expression that consists of only 'a', 'b' or 'c' but not 'd', so that it matches 'abcaabbc' but not 'abbccaabdca'. Please help.
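A small sketch of one way to express this, assuming the whole string must consist of the allowed characters: a character class lists what is acceptable, and `re.fullmatch` (Python 3.4+) requires the entire string to match, so anything else, including 'd', is rejected. The name `only_abc` is made up.

```python
import re

def only_abc(s):
    # [abc]+ allows one or more of a, b, c; fullmatch rejects any other character
    return re.fullmatch(r"[abc]+", s) is not None

print(only_abc("abcaabbc"))
print(only_abc("abbccaabdca"))
```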
0
votes
2answers
50 views

Regular expression where first letter in a word is uppercase, and word is surrounded by _

Basically I want a regular expression that will accept only this: Dog Cat_Dog Cat_Dog_Mouse Numbers are allowed [0-9] and are treated as words: Dog_003 -> OK Dog003 -> NOT OK ...
-2
votes
3answers
82 views

How to split by space but ignore it in multiple double-quotes?

I need to split different strings separated by spaces, but I want to ignore spaces within nested double-quotes, or any combination of double-quotes. Here is an example: c "a " bbh "." d1 Output ...
0
votes
2answers
41 views

regex select only the http://www part of the hyperlink

I have searched the forum and couldn't find anything that could solve my question. I am trying to retrieve only the link to a website from a hyperlink, for example. I have 68 different lines like ...
