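Suppose foo.py defines a function bar. You run import foo, then foo.bar(). You find a bug in bar(), so you modify foo.py. You type import foo again, then nothing happens, because the module is already loaded. Use reload(foo)!

What if the error occurs somewhere inside the bar() function? pdb, the Python debugger, can help: pdb lets you examine the values of variables at the very moment the error occurred.

Here's the code in python_ex_wrong.py: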
#Tyler Moore
#Almost working solution to example 1 and 2 from Python notes
#2 Feb 2012

import urllib2

def readCSV():
    state2PeaceYrs={}
    for line in urllib2.urlopen("http://cs.wellesley.edu/~qtw/data/peaceIndexNoHeader.csv"):
        bits=line.split(',')
        state2PeaceYrs[bits[0]]=[float(b) for b in bits[1:]]
    state2Peace={}
    for state in state2PeaceYrs:
        state2Peace[state]=state2PeaceYrs[state][-1]
    return (state2Peace,state2PeaceYrs)

if __name__ == '__main__':
    (state2Peace,state2PeaceYrs)=readCSV()
    #OK let's debug to make sure this makes sense
    len(state2PeaceYrs) #should be 50
    [len(state2PeaceYrs[s]) for s in state2PeaceYrs] # should be a 50-element list, all of length 19
Here's how we find our error:
>>> import pdb
>>> import python_ex_wrong
>>> (d1,d2)=python_ex_wrong.readCSV()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "python_ex_wrong.py", line 14, in readCSV
    state2Peace[state]=state2PeaceYrs[state][-1]
IndexError: list index out of range
>>> pdb.pm()
> /home/qtw/public_html/code/python_ex_wrong.py(14)readCSV()
-> state2Peace[state]=state2PeaceYrs[state][-1]
(Pdb) state
'\n'
(Pdb) len(state2PeaceYrs.keys())
51
(Pdb) state2PeaceYrs[state]
[]
(Pdb) state2PeaceYrs[state][-1]
*** IndexError: list index out of range
(Pdb)
>>> #type ctrl-D to exit the debugger, fix the code and then reload
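The debugger session points at the culprit: a blank line in the input produces the key '\n' with an empty list, so indexing with [-1] fails. One possible fix, sketched here in Python 3 syntax, is to skip blank lines before splitting. The name parse_peace_lines and the sample rows are made up for illustration; the function takes an iterable of text lines instead of fetching the URL, so it can be tried offline.

```python
def parse_peace_lines(lines):
    """Parse 'state,score,score,...' lines, skipping blank ones."""
    state2PeaceYrs = {}
    for line in lines:
        line = line.strip()
        if not line:  # a blank line is what triggered the IndexError
            continue
        bits = line.split(',')
        state2PeaceYrs[bits[0]] = [float(b) for b in bits[1:]]
    # keep the last value in each list, as the original code does
    state2Peace = {state: yrs[-1] for state, yrs in state2PeaceYrs.items()}
    return state2Peace, state2PeaceYrs

# made-up sample rows, ending with the kind of blank line that broke readCSV()
sample = ["Maine,1.5,1.4,1.3\n", "Vermont,1.6,1.5,1.2\n", "\n"]
state2Peace, state2PeaceYrs = parse_peace_lines(sample)
print(len(state2PeaceYrs))
```

With the blank line skipped, the dictionary holds only real states, and the length checks at the bottom of python_ex_wrong.py become meaningful.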
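URLs have the general form:
scheme://netloc/path;parameters?query#fragment
Special characters in URLs are encoded using % symbols. urllib.quote_plus() deals with special characters, while urllib.unquote_plus() undoes the process.
Query strings are made of attribute=value pairs joined by &:
http://example.com/foo?att1=val1&att2=val2&att3=val3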
http://www.bing.com/search?q=tyler+moore&go=&qs=n&form=QBLH&pq=tyler%2520moore&sc=8-10&sp=-1&sk=
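To see the encoding in action, here is a small sketch. It assumes Python 3, where these functions live in urllib.parse (in Python 2 they are urllib.quote_plus() and urllib.unquote_plus()). It also shows how a value like tyler%2520moore in the Bing URL can arise when an already-encoded string is encoded a second time.

```python
from urllib.parse import quote_plus, unquote_plus

encoded = quote_plus("tyler moore")          # spaces become +
percent = quote_plus("val3%")                # % itself becomes %25
twice = quote_plus("tyler%20moore")          # encoding an already-encoded string
decoded = unquote_plus(unquote_plus(twice))  # undo both rounds of encoding

print(encoded)
print(percent)
print(twice)
print(decoded)
```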
urllib.urlencode() takes a dictionary and creates a query string:
>>> qs={"att1":"val1","att2":"val2 with spaces","att3":"val3%"}
>>> urllib.urlencode(qs)
'att3=val3%25&att2=val2+with+spaces&att1=val1'
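The same call in Python 3 is urllib.parse.urlencode(). A quick sketch using the same dictionary follows; Python 3.7 and later keep a dictionary's insertion order, so the pairs come out in a different order than in the Python 2 session.

```python
from urllib.parse import urlencode, parse_qs

qs = {"att1": "val1", "att2": "val2 with spaces", "att3": "val3%"}
query_string = urlencode(qs)   # each value is escaped the way quote_plus() would
print(query_string)

# parse_qs() reverses the process; each value comes back as a list
roundtrip = parse_qs(query_string)
print(roundtrip)
```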
Requests to the New York Times article search API take the form:
http://api.nytimes.com/svc/search/v1/article?query=(field:)keywords (facet:[value])(&params)&api-key=your-API-key
The search terms go in query. The URL should consist of the base URL + the query + the API key:
http://api.nytimes.com/svc/search/v1/article?query=wellesley&api-key
So let's build this URL bit by bit:
import urllib
apikey="put in your API key here"
baseurl="http://api.nytimes.com/svc/search/v1/article?"
q={"query":"wellesley","api-key":apikey}
url2check=baseurl+urllib.urlencode(q) #encode the dictionary and append it to the base URL
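Wrapped up as a function, the same steps might look like the sketch below. It assumes Python 3 (urllib.parse.urlencode() instead of urllib.urlencode()); build_search_url is a name invented here, and the key shown is a placeholder, not a real key.

```python
from urllib.parse import urlencode

BASEURL = "http://api.nytimes.com/svc/search/v1/article?"

def build_search_url(query, apikey, baseurl=BASEURL):
    """Return baseurl plus an encoded query string (hypothetical helper)."""
    params = {"query": query, "api-key": apikey}
    return baseurl + urlencode(params)

url2check = build_search_url("wellesley", "YOUR-API-KEY")
print(url2check)

# a field-restricted query; the colon is percent-encoded in the URL
titled = build_search_url("title:Obama", "YOUR-API-KEY")
print(titled)
```

Keeping the parameters in a dictionary until the last moment means the encoding is done in one place, whatever characters the query contains.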
Continuing our earlier example about querying for Wellesley (download the stored json file from http://cs.wellesley.edu/~qtw/data/wellesleyNYT.json):
import urllib2
result=urllib2.urlopen(url2check).read() #fetch the raw JSON text
f=open('wellesleyNYT.json','w')
f.write(result)
f.close()
>>> import json
>>> resd=json.loads(result)
>>> for r in resd:
...     print r, resd[r]
...
tokens [u'wellesley']
total 4409
results [{u'body': u'PHILADELPHIA BETSEY STEVENSON and Justin Wolfers might sound like almost any upscale couple. They have impressive degrees and serious careers and the social markers that go with them. They have one child, but there are two strollers, a Bugaboo and a Bob baby jogger, parked in the front hall of their stylish home here. Their daughter, Matilda, who', u'date': u'20120212', u'byline': u'By MOTOKO RICH', u'url': u'http://www.nytimes.com/2012/02/12/business/economics-of-family-life-as-taught-by-a-power-couple.html', u'title': u'Economics of Family Life, as Taught by a Power Couple'}, {u'body': u"LIPPINCOTT--Rosemond, 97, on January 16, 2012, at Mayflower Place Nursing Center, West Yarmouth, MA. Born Summit, NJ, to Dr. Henry M. and Mary O'Reilly. Kent Place School, '32; Wellesley College, '36. Predeceased by husband Job H. Lippincott. Resided Chatham, NJ, 1937-77; Nantucket, MA, 1977-85; and thereafter on Cape Cod. She was a generous", u'date': u'20120129', u'url': u'http://query.nytimes.com/gst/fullpage.html?res=9800E2DA133AF93AA15752C0A9649D8B63', u'title': u'Paid Notice: Deaths LIPPINCOTT, ROSEMOND'}, {u'body': u'HOWARD--Barnaby J. The son of a British lord, who grew up to be a pilot with the British and United States Navies during World War II and later a farmer in Southern Rhodesia (now Zimbabwe) before returning to America to set up a successful investment company (CAIMS), died December 18 at home in Orange Park, FL at age 86 after a courageous battle', u'date': u'20120129', u'url': u'http://query.nytimes.com/gst/fullpage.html?res=9803E3DA133AF93AA15752C0A9649D8B63', u'title': u'Paid Notice: Deaths HOWARD, BARNABY J'}, {u'body': u"Two hundred fifty-two consecutive matches won over 13 years. Thirteen national titles. The longest winning streak in college sports. Trinity College has been a squash dynasty under Coach Paul Assaiante. But two weeks ago in New Haven, Yale overthrew that dynasty in a 5-4 victory. Yale's coach, David Talbott, called it ''a long time coming.'' The", u'date': u'20120129', u'byline': u'By MING TSAI', u'url': u'http://www.nytimes.com/2012/01/29/sports/chef-ming-tsai-devoted-player-and-cooker-of-squash.html', u'title': u'Squash, a Growing Sport, And Nutritious, Too'}, {u'body': u"To the Editor: Hendrik Hartog has it right in ''Bargaining for a Child's Love'' (Sunday Review, Jan. 15). That Republicans disparage entitlement programs astounds me. I don't know of any who have refused Social Security or Medicare for themselves or their parents or grandparents. My mother, born in 1918, often said that it was President Franklin D.", u'date': u'20120124', u'url': u'http://www.nytimes.com/2012/01/24/opinion/benefits-for-the-elderly.html', u'title': u'LETTER; Benefits for the Elderly'}, {u'body': u"CRAWFORD--John Charlton, composer, pianist, professor, beloved father and husband, died on January 5, 2012, at age 80 in his 23rd year of Parkinson's disease in Cambridge, MA. Born the son of academic parents in 1931 in Philadelphia, he was gifted in music and languages. He graduated from Germantown Friends School and the Yale School of Music, and", u'date': u'20120122', u'url': u'http://query.nytimes.com/gst/fullpage.html?res=9C00E2DE133AF931A15752C0A9649D8B63', u'title': u'Paid Notice: Deaths CRAWFORD, JOHN CHARLTON'}, {u'body': u"IT'S show time for Anne M. Finucane. Her co-star on this day, Bill Clinton, is waiting offstage. The audience shifts in its seats. The spotlight goes up and ... action! It's a Thursday in early December, at a conference center near Orlando, and Ms. Finucane is busy shaping an image. Or, rather, trying to reshape one. This choreographed interview", u'date': u'20120115', u'byline': u'By LOUISE STORY and GRETCHEN MORGENSON', u'url': u'http://www.nytimes.com/2012/01/15/business/at-bank-of-america-the-image-officer-has-a-lot-to-fix.html', u'title': u'The Image Officer With a Lot to Fix'}, {u'body': u'KNEUBUHL--James Pritchard of Southbury, CT, formerly of New Canaan, CT and San Marino, CA, died December 30, 2011, at the age of 95. Husband of the late Margaret Woodard Kneubuhl, Jim leaves his daughters, Janet Schloat of Pound Ridge, NY and Barbara Kneubuhl of Wellesley, MA; three grandsons, David, Benjamin, and Michael Schloat and their wives;', u'date': u'20120112', u'url': u'http://query.nytimes.com/gst/fullpage.html?res=9404E5D8123AF931A25752C0A9649D8B63', u'title': u'Paid Notice: Deaths KNEUBUHL, JAMES PRITCHARD OF SOUTHBURY'}, {u'body': u'EDELMAN--Eleanor L. died peacefully in her sleep at her home in Bronxville, New York on January 7, 2012. For 53 years, she was the wife of Albert I. Edelman, an attorney who predeceased her. She was born Eleanor Louise Weisman in 1924 in St. Louis, Missouri and was known to her friends as Elly. Along with her beloved sisters, Beryl and Nanette, she', u'date': u'20120112', u'url': u'http://query.nytimes.com/gst/fullpage.html?res=9E03E6D8123AF931A25752C0A9649D8B63', u'title': u'Paid Notice: Deaths EDELMAN, ELEANOR L'}, {u'body': u'Nina Bich-Phuong Xuan Ha and Stephen Michael Girasuolo were married Friday evening at the Harvard Club of New York. Marylin G. Diamond, a retired acting justice of State Supreme Court in New York, officiated. On Thursday, the Rev. Thich Nguyen Hanh, a Buddhist priest, performed a ceremony that incorporated Vietnamese traditions at the Unitarian', u'date': u'20120108', u'url': u'http://www.nytimes.com/2012/01/08/fashion/weddings/nina-ha-stephen-girasuolo-weddings.html', u'title': u'Nina Ha, Stephen Girasuolo'}]
offset 0
>>> #so most of the data comes in resd["results"]
>>> len(resd["results"])
10
>>> for k in resd['results'][0]:
...     print k, resd['results'][0][k]
...
body PHILADELPHIA BETSEY STEVENSON and Justin Wolfers might sound like almost any upscale couple. They have impressive degrees and serious careers and the social markers that go with them. They have one child, but there are two strollers, a Bugaboo and a Bob baby jogger, parked in the front hall of their stylish home here. Their daughter, Matilda, who
date 20120212
byline By MOTOKO RICH
url http://www.nytimes.com/2012/02/12/business/economics-of-family-life-as-taught-by-a-power-couple.html
title Economics of Family Life, as Taught by a Power Couple
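To experiment without an API key, the sketch below parses a hand-shortened JSON string with the same top-level keys as the response above (tokens, total, results, offset). It uses Python 3 syntax, and the sample holds only two abbreviated results copied from that output, so treat it as an illustration of the structure rather than real API output.

```python
import json

# hand-shortened sample with the same shape as the response shown above
sample = '''{"tokens": ["wellesley"], "total": 4409, "offset": 0,
 "results": [
  {"date": "20120212", "byline": "By MOTOKO RICH",
   "title": "Economics of Family Life, as Taught by a Power Couple"},
  {"date": "20120129",
   "title": "Paid Notice: Deaths LIPPINCOTT, ROSEMOND"}
 ]}'''

resd = json.loads(sample)

# not every result has every field (paid notices lack a byline), so use .get()
titles = [r["title"] for r in resd["results"]]
bylines = [r.get("byline", "(no byline)") for r in resd["results"]]

for title, byline in zip(titles, bylines):
    print(title, "|", byline)
```

Using .get() with a default avoids a KeyError on results that are missing a field.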
Exercise: build a dictionary q that will include the appropriate parameters to answer the following query: get articles written by David Pogue in 2011 that mention "iphone" and "android".

The cPickle module

Suppose we want articles mentioning "Obama", "Romney", "Santorum", "Gingrich" or "Paul". We can issue the queries and store the results in a dictionary:
import datetime, time
rightnow=datetime.datetime.now()
queries=[{"query":"title:"+politician,"api-key":apikey} for politician in ["Obama", "Romney", "Santorum", "Gingrich","Paul"]]
apiResults={}
for q in queries:
    #these 3 lines are just the same as before, just encoding and grabbing the URL
    url2check=baseurl+urllib.urlencode(q)
    result=urllib2.urlopen(url2check).read()
    resd=json.loads(result)
    #OK now store the json result in the apiResults dictionary
    apiResults[(url2check,rightnow)]=resd
    time.sleep(1)
Now apiResults is a dictionary whose keys are 2-element tuples of the URL requested plus the time of the search:
>>> apiResults.keys()
[('http://api.nytimes.com/svc/search/v1/article?query=title%3AGingrich&api-key=[removed API key for security reasons]', datetime.datetime(2012, 2, 14, 15, 3, 39, 685928)), ('http://api.nytimes.com/svc/search/v1/article?query=title%3AObama&api-key=[removed API key for security reasons]', datetime.datetime(2012, 2, 14, 15, 3, 39, 685928)), ('http://api.nytimes.com/svc/search/v1/article?query=title%3APaul&api-key=[removed API key for security reasons]', datetime.datetime(2012, 2, 14, 15, 3, 39, 685928)), ('http://api.nytimes.com/svc/search/v1/article?query=title%3ASantorum&api-key=[removed API key for security reasons]', datetime.datetime(2012, 2, 14, 15, 3, 39, 685928)), ('http://api.nytimes.com/svc/search/v1/article?query=title%3ARomney&api-key=[removed API key for security reasons]', datetime.datetime(2012, 2, 14, 15, 3, 39, 685928))]
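The pattern of keying results on (URL, timestamp) can be tried without network access. In this Python 3 sketch, fake_fetch is a made-up stand-in for the urlopen-and-parse step and the API key is a placeholder; only the bookkeeping matches the loop shown earlier.

```python
import datetime
from urllib.parse import urlencode

baseurl = "http://api.nytimes.com/svc/search/v1/article?"
apikey = "YOUR-API-KEY"  # placeholder, not a real key

def fake_fetch(url):
    """Hypothetical stand-in for urlopen(url).read() plus json.loads()."""
    return {"tokens": [], "total": 0, "results": [], "offset": 0}

rightnow = datetime.datetime.now()
politicians = ["Obama", "Romney", "Santorum", "Gingrich", "Paul"]
apiResults = {}
for politician in politicians:
    q = {"query": "title:" + politician, "api-key": apikey}
    url2check = baseurl + urlencode(q)
    # tuples are hashable, so (url, time) works as a dictionary key
    apiResults[(url2check, rightnow)] = fake_fetch(url2check)

# look up a stored result by searching the keys for a URL fragment
for (url, when), resd in apiResults.items():
    if "title%3AObama" in url:
        print(when, resd["total"])
```

Storing the timestamp alongside the URL records when each answer was obtained, which matters because the same query can return different results later.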
Python has built-in "object serialization" via the "pickle" module:
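import os
import cPickle as pickle #we use cPickle, a C implementation of the pickle module that runs faster
path=os.path.expanduser("~/qtw/inclass/data/apiex.pkl") #open() does not expand ~ by itself
pf=open(path,"wb") #wb= write to a binary file
pickle.dump(apiResults,pf,pickle.HIGHEST_PROTOCOL) #use a binary pickle protocol
pf.close()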
Fire up the interpreter (or put the code in your own module) and type:
import os
import cPickle as pickle
path=os.path.expanduser("~/qtw/inclass/data/apiex.pkl") #open() does not expand ~ by itself
pf=open(path,"rb") #rb= read from a binary file
apiRes=pickle.load(pf)
pf.close()
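A self-contained round trip, as a sketch: Python 3 has no separate cPickle module (plain pickle uses the C implementation when it is available), and the file here goes into a temporary directory rather than ~/qtw/inclass/data. The dictionary is a tiny made-up stand-in for apiResults.

```python
import datetime
import os
import pickle
import tempfile

# tiny made-up stand-in for apiResults: keys are (url, time) tuples
stamp = datetime.datetime(2012, 2, 14, 15, 3, 39)
apiResults = {("http://example.com/foo?att1=val1", stamp): {"total": 1, "results": []}}

path = os.path.join(tempfile.mkdtemp(), "apiex.pkl")

with open(path, "wb") as pf:      # wb = write to a binary file
    pickle.dump(apiResults, pf, pickle.HIGHEST_PROTOCOL)

with open(path, "rb") as pf:      # rb = read from a binary file
    apiRes = pickle.load(pf)

print(apiRes == apiResults)
```

The with blocks close the file even if dumping or loading raises an error. Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.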