Take the 2-minute tour ×
Code Review Stack Exchange is a question and answer site for peer programmer code reviews. It's 100% free, no registration required.

This code works. It is designed to retrieve website data directly, parse it, and open it as normal. I will develop it further later. Right now I want to know if there is any way to improve efficiency, such as not using a file etc. I would also like to know if my code could be made more pythonic (this would include PEP standards), or incidences where concatenation would be acceptable to readability. Here is the code:

import urllib.request
from tkinter import *
import webbrowser
import os

proxy_handler = urllib.request.ProxyHandler(proxies=None)
opener = urllib.request.build_opener(proxy_handler)


def navigate(query):
    response = opener.open(query)
    html = response.read()
    return html

def parse(junk):
    ugly = str(junk)[2:-1]
    lines = ugly.split('\\n')
    return lines

while True:
    url = input("Path: ")
    dump = navigate(url)

    content = parse(dump)
    with open('cache.html', 'w') as f:
        for line in content:
            f.write(line)

    webbrowser.open_new_tab(os.path.dirname(os.path.abspath(__file__))+'/cache.html')
share|improve this question
add comment

2 Answers

up vote 2 down vote accepted

Your code does seem to follow PEP 8 which is a good thing. Also, the logic is splitted in different small functions which is even better.

If you want to make your code portable, you probably shouldn't hardcode / in your path but use os.sep or even better os.path.join instead.

In order not to call write multiple times, you could use writelines.

Now, from a functional point of view, it might be a good idea to create temporary files with tempfile for instance. If you don't want to use it, it might be better to use a hash of the original url or of the content and use it in the name of the new file.

share|improve this answer
    
Thank you, I never would have thought of that stuff. Could you explain more to a n00b why using tempfile would be advantageous to my programs functionality? –  Alex Thornton Feb 18 at 17:39
add comment

1. Make sure your variable names are both short and to the point, using a parameter called junk is pointless. Also, this could cause conflicts if you ever write other code and import this.

def parse(junk):
    ugly = str(junk)[2:-1]
    lines = ugly.split('\\n')
    return lines

2. urllib is outdated. Unless there is a specific reason you need to use urllib, you should use urllib2. Urllib2 is faster and more simple.

import urllib.request

3. This part looks a bit messy: webbrowser.open_new_tab(os.path.dirname(os.path.abspath(__file__))+'/cache.html') Perhaps separate into multiple lines for readability.

share|improve this answer
    
Thanks alot, but I get an importerror for urllib2 running python 3.3.2 –  Alex Thornton Feb 18 at 17:37
    
Okay. Maybe ask on StackOverflow about that. –  xv435 Feb 18 at 17:38
1  
urllib is outdated in Python 2, but in Python 3, urllib was removed and urllib2 was renamed to urllib.request. As such, import urllib.request should not be a problem. –  icktoofay Feb 19 at 1:52
    
Okay. Thanks for the info. –  xv435 Feb 20 at 1:58
add comment

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.