2
votes
0answers
18 views

Parsing Solr log files - version 2

This post is in reference to: Parsing Solr log files. I re-wrote most of the code and split it up into a couple of classes. Currently, the class functionality is pretty limited, but I can see that I ...
2
votes
1answer
46 views

Parsing Solr log files

I am kind of new to programming. I picked up some Perl about a year ago and am now learning some Python. I am pretty confident in Perl, but Python seems unnatural to me. I wrote a little script that ...
2
votes
0answers
11 views

Parsing time ranges with PyParsing

The following code is intended to parse a string of the following format and return whether the current time falls in that window: ...
1
vote
1answer
74 views
10
votes
1answer
121 views

Configuration file with Python functionality

I'm working on quite a complicated scientific project. I decided to use a configuration file for model description. However, it was quite complicated to parse all strings after ...
5
votes
2answers
80 views

Social Media Hashtag Splitting

I decided to try out Python (3.x) two or so weeks ago, and this is my first real script using it. The program I've written below is slow, clunky, inefficient, inaccurate, and probably poorly coded! ...
3
votes
1answer
68 views

Extracting the text of a specific XML node

I have to extract friendlyName from the XML document. Here's my current solution: ...
2
votes
0answers
68 views

Parsing a website

Following is the code I wrote to download the information of different items in a page. I have one main website which has links to different items. I parse this main page to get the list. This is ...
4
votes
1answer
61 views

Output of 'ldd' to dictionary

I want to use the output of the terminal command ldd: ...
4
votes
2answers
62 views

Packaging a single-file Python copy-tool

I'm currently working on a very simple one-file project: Lumix provides the possibility for the camera TZ41 (and others) to load GPS data and tourist information from a DVD to a SD-card so that you ...
3
votes
1answer
79 views

Parse XML using Python XML eTree

I am a high school intern, and I am trying to parse my mentor's code so that he can read in an XML file and call simple methods to edit or get information from his XML file. I was hoping someone ...
9
votes
1answer
88 views

Processing C++ comments

Here's the first functional version of my Python 2 script for processing comments in C++ source files. It's a personal project, I expect to expand it later with more advanced options (mainly about ...
5
votes
2answers
79 views

First program with scraping, lists, string manipulation

I wanted to find out which states and cities the USA hockey team was from, but I didn't want to manually count from the roster site here. I'm really interested to see if someone has a more elegant ...
5
votes
3answers
83 views
3
votes
2answers
225 views

Speeding up a Cython program

I wrote the following piece of Python as a part of a larger system. Profiling reveals that a large amount of time is spent in DocumentFeature.from_string. So far ...
4
votes
3answers
867 views

Using pycparser to parse C header files

I have a small program which makes use of pycparser to parse C header files. The code, unfortunately, kinda sprawls out everywhere to handle the different cases (example below). What's the best way ...
1
vote
2answers
49 views

How parse nicely a string into three (or more) pieces?

See, I have a file which consists of a bunch of lines like NAME:0001;some text ANOTHERNAME:000103;some more text NEWNAME:42; blah blah. So what I need to do ...
4
votes
2answers
822 views

Python Script to Search PirateBay

I've written a very basic Python 3 script to search ThePirateBay. Since the tracker doesn't have an API, I had to parse the HTML using BeautifulSoup. I'd like to get some reviews, I'm pretty sure the ...
0
votes
1answer
71 views

Reading and writing an unknown character into an appropriately named file

I would like to refactor this code a bit further and make it better and more generic. As of right now, it is doing what I want it to do (reading a list of URLs, splitting the query and the ampersand ...
1
vote
1answer
234 views

Opening a list of URLs and splitting the queries into different files

I recently made a program in Python to open a list of URLs and split the queries into different files. I want to make this code more generic and simple. I am open to suggestions. ...
5
votes
1answer
275 views

Review Simple Logparser

This is my first try with Python. I wanted to parse some Log4J so I thought it a good opportunity to write my first Python program. The format of the logs I deal ...
4
votes
2answers
218 views

Efficient parsing of FASTQ

FASTQ is a notoriously bad format. This is because it uses the same @ character for the id line as it does for quality scores. Deciding what is a quality score and ...
1
vote
1answer
173 views

A (comprehensive) URI parser for Python

For a code challenge, I'm trying to write a comprehensive URI parser in Python that handles both URIs with authority paths (ex: URLs such as ...
3
votes
0answers
1k views

Parse emails and create HTML markup from attachments

I am developing a script that will act as a surrogate of sorts for a web portal and an email blast program. The web portal is used to create web-to-print files and is very good at creating print-ready ...
2
votes
1answer
100 views

New book parser

I know that this is a complete mess and does not even come close to fitting PEP8. That's why I'm posting it here. I need help making it better. ...
1
vote
0answers
95 views

Interpolating code delimited with character that can appear in code

I've got a string that consists of an arbitrary combination of text and {} delimited python code, for instance, ...
5
votes
2answers
524 views

Critiques on a trivially easy to use Python CSV class

I have been working on a project where I needed to analyze multiple, large datasets contained inside many CSV files at the same time. I am not a programmer but an engineer so I did a lot of searching ...
2
votes
3answers
146 views

Logic for an init function in email parser

I am writing an email parser and cannot decide how much logic should be included in the __init__ function. I know that a constructor should make the object ready ...
2
votes
3answers
1k views

Retrieving the first occurrence of every unique value from a CSV column

A large .csv file I was given has a large table of flight data. A function I wrote to help parse it iterates over the column of Flight IDs, and then returns a dictionary containing the index and value ...
3
votes
4answers
2k views

download an image from a webpage

I originally posted this on Stack Overflow and was advised to post it here, so I apologize if you see this twice :) I am trying to write a Python script that downloads an image from a webpage. On ...
2
votes
1answer
1k views

Get metadata from an Icecast radio stream

I am new to Python and not very familiar with advanced Python data structures. I have written a function to receive data from a socket in Python and perform string manipulations on it. The basic ...
2
votes
2answers
190 views

Pair Programming matrix: room for improvement?

At work, we have a "pair programming ladder" where you can keep track of who is pairing with whom (based on check-ins). The idea is to promote "promiscuous pairing" where each developer eventually ...
3
votes
4answers
173 views

String parsing with multiple delimiters

My data is in this format: 龍舟 龙舟 [long2 zhou1] /dragon boat/imperial boat/\n And I want to return: ...
2
votes
1answer
515 views

Python Twitter parser

In the interests of improving my Python coding skills, I wanted to post a program I recently built and get it critiqued by you fine folks. Please let me know where you think I might improve this ...
3
votes
0answers
53 views

DOMDocument field class

Here is the class: ...
2
votes
1answer
367 views

PEG parser in Python

Any suggestions to make this code clearer, more Pythonic, or otherwise better? I'm open to changes to the design as well as the code (but probably won't drop features or error checking since ...
3
votes
2answers
167 views

Parsing HTTP server logs

I have a relatively simple project to parse some HTTP server logs using Python and SQLite. I wrote up the code but I'm always looking for tips on being a better Python scripter. Though this is a ...
5
votes
1answer
755 views

Parsing Wikipedia data in Python

I'm new to python and would like some advice or guidance moving forward. I'm trying to parse Wikipedia data into something uniform that I can put into a database. I've looked at wiki parsers but from ...
4
votes
2answers
229 views

Python wrapper for the Help Scout API

I started porting an API wrapper from Java to Python for practice. I am looking for ways to improve the readability/maintainability of this code. I have done some reading about "pythonic" style and I am ...
1
vote
1answer
82 views

Reversi text board parsing to bitfield

This Python code parses string representations of Reversi boards and converts them into a bit string suitable for use by a bitstring module: b = BitArray('0b110') or ...
1
vote
1answer
496 views

NLTK language detection code in Python

I need to write some code that checks thousands of websites, to determine if they are in English or not. Below is the source code. Any improvements would be appreciated. ...
3
votes
2answers
1k views

Joining url path components intelligently

I'm a little frustrated with the state of url parsing in python, although I sympathize with the challenges. Today I just needed a tool to join path parts and normalize slashes without accidentally ...
2
votes
1answer
852 views

Parse a text file in python

I would like to refactor a large python method I wrote to have better practices. I wrote a method that parses a design file from the FSL neuroimaging library. Design files are text files with settings ...
2
votes
2answers
866 views

Help fix up my Python XML Schema parsing code

I've been working on a lightweight xml schema parser, and have what I think is a moderately clean solution (some parts helped out by previous questions I posted here) so far for obtaining all schema ...
3
votes
2answers
301 views

Text parser implemented as a generator

I often need to parse tab-separated text (usually from a huge file) into records. I wrote a generator to do that for me; is there anything that could be improved in it, in terms of performance, ...
1
vote
1answer
144 views

Improve my python file parsing and duplicate removal code?

I am running through a file and inserting elements one by one. The counties all contain specific county codes which are duplicated many times throughout the file. I am looking for a way to assign ...
0
votes
1answer
195 views

Python xml schema parsing for simpleContent and simpleTypes

I am writing a few python functions to parse through an xml schema for reuse later to check and create xml docs in this same pattern. Below are two functions I wrote to parse out data from ...
4
votes
2answers
5k views

Better way to parse XML in Python

I'm sure there must be a better / simpler way of doing this.... The aim of this code is to return an object which contains all of the movies. Where attributes are not found, they need to return ...
3
votes
1answer
973 views

Reading a large XML file and parsing necessary elements into MySQLdb

I have a fair grasp of programming (as a learner) but am not expert enough to refactor code at the highest level. I am trying to read a huge (100MB-2GB) XML file and parse the necessary elements (attributes) from a ...
2
votes
1answer
4k views

Python xml schema parsing and xml creation from flat files - code review

I am new to python and had to create a schema parser to pull information on attributes and complex types, etc. and then convert data from flat files into the proper xml format. We are processing a lot ...