Father’s Day Post 2009
By stephen | June 21, 2009
This blog post continues the tradition of doing my Father’s Day tribute to my dad in blog post format. Happy Father’s Day Dad! You are the best!
I have learned so many things from my dad. He is a great example of someone who is enthusiastic about his job but does not let it get in the way of his family responsibilities, which can be considerable with a house full of teenage daughters. He has several interesting projects going on at the moment, but still managed to attend all of the graduation events for both me and my younger sister, who is also graduating. We even got to change the brakes on my car on Friday, and it was fun spending a few hours in the garage together. I am somewhat glad that my family’s garage has always been messy enough to necessitate some Saturday time cleaning it, and therefore spending time together.
It is fun to stay involved with my dad not only on a personal level but on a professional one as well. Amazon’s web services have been crucial to some of the research I have done, and AWS will continue to provide important tools for my current and future research. I enjoy seeing my dad have success in his field and be at the forefront of important new technologies in the cloud computing arena. I think that many of my most valued character traits were developed and nurtured by the example set by my dad. My dad, brother, and I share a fantastic father-son bond.
Father and sons
Happy Father’s Day Dad! All the best! Love, Stephen.
Topics: General | No Comments »
quick R + Sweave + xtable + summary code chunk
By stephen | June 15, 2009
I use Sweave, LaTeX, and R to make documents. It is nice to make a quick table of summary statistics. However, the default table made by a summary of a data frame is really ugly. If you are just summarizing numerical (rather than categorical) variables within a data frame, this helper function will be nice for you. Yes, I know it is ugly. It is a quick hack, which I will fix up later when I am not busy. But in the future I can see building up a library of nice table-generating functions for use with xtable.
######## MY GENERIC UTILITY FUNCTION #######
nicePanelSummary <- function(df, cols) {
  # The default summary method on a panel is ugly;
  # this should look nicer.
  # Start by recomputing the six default summary statistics per column.
  # Taking [1:6] drops the NA count that summary() appends when a
  # column has missing values.
  s.num <- sapply(df[, cols, drop = FALSE],
                  function(x) as.numeric(summary(x))[1:6])
  rownames(s.num) <- c("Min.", "1st Qu.", "Median", "Mean", "3rd Qu.", "Max.")
  # Add the standard deviation (ignoring NAs, as summary() does)
  sds <- sapply(df[, cols, drop = FALSE], sd, na.rm = TRUE)
  s.num <- rbind(s.num, "St. Dev" = sds)
  return(s.num)
}
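For anyone who wants to see the whole path from data frame to LaTeX table, here is a rough end-to-end sketch. It is only an illustration: `summaryTable` is a hypothetical stand-in written for this example (not the helper above), `mtcars` is R's built-in dataset, and the `xtable` call is skipped if the package is not installed.

```r
# Hypothetical stand-in helper: one column of statistics per numeric variable.
summaryTable <- function(df, cols) {
  sapply(df[, cols, drop = FALSE], function(x) {
    # Quartiles computed directly, so nothing is parsed from printed output.
    q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1),
                  na.rm = TRUE, names = FALSE)
    c("Min." = q[1], "1st Qu." = q[2], "Median" = q[3],
      "Mean" = mean(x, na.rm = TRUE), "3rd Qu." = q[4], "Max." = q[5],
      "St. Dev" = sd(x, na.rm = TRUE))
  })
}

# mtcars and these three columns are chosen purely for illustration.
tab <- summaryTable(mtcars, c("mpg", "hp", "wt"))

# Inside a Sweave chunk opened with <<results=tex, echo=FALSE>>=, the
# printed LaTeX ends up directly in the compiled document.
if (requireNamespace("xtable", quietly = TRUE)) {
  print(xtable::xtable(tab, digits = 2, caption = "Summary statistics"))
} else {
  print(round(tab, 2))  # plain-text fallback when xtable is unavailable
}
```

The matrix has one row per statistic and one column per variable, which is the shape xtable turns into a readable table.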
Hope this helps someone.
Topics: Uncategorized | No Comments »
post-finals update
By stephen | June 10, 2009
School:
Finals are done! Just 1 paper (my honors thesis) is left, and then I am completely done with the UW up to this point. This quarter has been fantastic. I took Econ 426 - Advanced Financial Derivatives, and it was very interesting. We studied a handful of different types of derivatives, many of which have been getting a bad reputation recently: futures, forwards, options, credit default swaps, mortgage-backed securities, etc. It was quite a bit of fun. There was certainly some philosophical discussion in the class, namely, who was to blame for the crisis?
Is there a moral hazard involved when a company originates loans only to rapidly sell them off, where the loans are put into mortgage-backed securities? Or is it completely the fault of the mortgage holder who entered into obligations that they did not understand and could not meet? Through tranching, it is possible to take a bunch of bad mortgages, and as Professor Davis said, something AAA comes out the other end.
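To see how tranching can turn risky loans into something that looks safe, here is a back-of-the-envelope sketch. Every number in it is an assumption made up for illustration: 100 equal-sized loans, a 20% default probability per loan, zero recovery, a senior tranche that only loses money after 40% of the pool is wiped out, and, most importantly, defaults that are independent of each other.

```r
# Illustrative assumptions only; none of these numbers come from the class.
n_loans   <- 100    # equal-sized loans in the pool
p_default <- 0.20   # each loan defaults independently with this probability

# Junior tranches absorb the first 40 defaults (zero recovery assumed),
# so the senior tranche is hit only if MORE than 40 loans default.
attachment   <- 40
p_senior_hit <- pbinom(attachment, size = n_loans, prob = p_default,
                       lower.tail = FALSE)

# The first-loss tranche is hit as soon as a single loan defaults.
p_junior_hit <- 1 - dbinom(0, size = n_loans, prob = p_default)

print(c(senior = p_senior_hit, junior = p_junior_hit))
```

Under these assumptions the senior tranche almost never takes a loss even though every loan in the pool is bad. The catch is the independence assumption: if defaults move together, as they do when house prices fall everywhere at once, the senior tranche is far riskier than this calculation suggests.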
The class went well. I also had Econ 422 with Professor Eric Zivot, which was fun (and technically a pre-req to 426 but who’s checking?) and not too difficult. And, Spanish 103, which allows me to graduate.
Life
My PhD program at the University of Rochester begins July 6th. I’ll be doing quite a bit of driving between now and then. About 4K miles worth. I haven’t quite decided on a route yet, but here are a few options that I have.
1: Google Maps suggested route. Mostly through North Dakota, around through Chicago, then to Rochester
2: Through South Dakota, Michigan, stop in London, Ontario, then Rochester.
3: Head north, going just south of Lake Superior, down to Manitoulin Island, ferry between Lake Huron and Georgian Bay, then a night in Toronto, then a short 3-4 hours to Rochester the next day.
As 3 seems the most exciting, I think I will give that a try. Any thoughts? All of them are about 2700 mi. in length. Then, once I’m established in Rochester, I will go down to DC for the 4th, and be back for classes on the 6th. This will be great. I am excited to start this program.
Blog
I fixed the blog so that comment spam should no longer be an issue, and I deleted all bad comments. I installed a captcha; I know that is annoying, but it is better than manually wading through thousands of spam comments. Please comment. I have also been on Twitter lately, following the #rstats hashtag, about the R statistical language, and a few other interesting feeds. I don’t really follow too many of my friends on Twitter, but I think that it is extremely useful to follow subjects, because you can get a feel for what people in general are thinking about concerning a specific topic.
Books
I recently read Phantoms in the Brain, by Ramachandran. It was really cool, and explained theories on how certain parts of the brain work, and which parts are responsible for which functions. In theory, there are parts of the brain which are responsible for very specific functions, such as adding, or differentiating between reflections and images. After strokes, these parts of the brain can be damaged, and people who are otherwise normal can be missing a very important function relating to thought or cognition. After reading the first few chapters, I typed up some ideas. It is quite interesting. I have ordered On Intelligence, by Jeff Hawkins. I am looking forward to reading it.
Topics: General | 1 Comment »
TSCC is over
By stephen | May 21, 2009
Sad news. Terminator: The Sarah Connor Chronicles is over. Too bad. I really enjoyed that show. It embodied future John Connor’s great idea of “Hey, I’m sick of sending back terminators that look like a dude. Let me send back a Terminator dressed as an attractive female to protect my young adult self. Yeah, that will work.”
The DenOfGeek blog provides an analysis:
Sarah Connor was a non-populist, meditative, complex piece of television on a smash-bang, show-me-the-ratings kind of network. The two were never going to get on, but kudos to Fox for giving it two seasons to prove itself, especially when it was obvious after the first, unsuccessful season, that showrunner Josh Friedman had no intentions of changing the formula.
So, the show is over. Perhaps another network will pick it up and run with it. That would be great.
Topics: Uncategorized | No Comments »
network economies, crowdsourcing, etc.
By stephen | May 18, 2009
I just skimmed through a paper over at Digital Urban about Mapping for the Masses using Web 2.0 tech. Part of the Web 2.0 (which may be an abused buzzword at this point) paradigm is crowdsourcing, or using large numbers of people to do simple tasks for you. The introduction to the paper has an interesting thought:
The notion that there might be value in harvesting the knowledge of individuals is based on the observation that, although a large number of individual estimates may be incorrect, their average can be a match for expert judgment. Judiciously handled, randomly sampling the opinions or calculations of a large number of users might lead to data and information that is surprisingly accurate and that, in some cases, cannot be recorded in any other way (Surowiecki, 2004).
This raises the question: is there a Central Limit Theorem for knowledge of crowds? For some types of knowledge, probably. For others, probably not.
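A quick simulation shows both halves of that answer. This is only a sketch with made-up numbers: the true value, the crowd size, the noise level, and the shared bias are all assumptions chosen for illustration.

```r
set.seed(42)      # fixed seed so the sketch is reproducible
truth <- 100      # hypothetical true quantity the crowd is guessing
n     <- 1000     # hypothetical crowd size

# Case 1: independent, unbiased errors. Averaging cancels them out,
# and the error of the mean shrinks roughly like sd / sqrt(n).
unbiased       <- truth + rnorm(n, mean = 0, sd = 30)
err_individual <- mean(abs(unbiased - truth))  # typical individual error
err_crowd      <- abs(mean(unbiased) - truth)  # error of the crowd average

# Case 2: everyone shares the same bias of +15. Averaging removes the
# noise but cannot remove the common bias.
biased           <- truth + 15 + rnorm(n, mean = 0, sd = 30)
err_biased_crowd <- abs(mean(biased) - truth)

print(c(individual = err_individual, crowd = err_crowd,
        biased_crowd = err_biased_crowd))
```

When individual errors are independent and centered on the truth, the crowd average is much closer than a typical individual. When the whole crowd leans the same way, no amount of averaging fixes it, which is the "probably not" case.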
Topics: Uncategorized | No Comments »
Daniel Hannan MEP: The devalued Prime Minister
By stephen | May 16, 2009
- You have run out of our money. The country as a whole is in negative equity
- Servicing that debt is going to cost more than educating the child
- You used the good years to raise borrowing even further…we are now running a deficit that touches 10% of GDP. Higher than Hungary. Higher than Pakistan. Countries where the IMF has already been called in
- Last year 100,000 private sector jobs were lost, and yet you created 30,000 public sector jobs. Prime Minister, you cannot carry on forever squeezing the productive bit of the economy in order to fund an unprecedented engorgement of the unproductive bit.
Strong, true words. Watch the video.
Topics: General, News | No Comments »
MTURK with Smartsheet
By stephen | May 15, 2009
I have been getting ready for the big move to Rochester. One of my first steps was to start packing up my prized possessions…meaning my books. I filled 3 small-medium boxes with books that I don’t think that I will need during the next month. As I was packing, I decided to type each book’s ISBN into a list. From there, I thought, what interesting and useful information could I get from these ISBNs? My first thought was, what is the value of my collection, and (for moving) what does it weigh?
Luckily, Amazon has this information. It wouldn’t be too hard to write a program that used web services to query Amazon for the information, but there is a much easier way. Get Turkers to manually look up the information for you. What I wanted was a spreadsheet with the following columns:
ISBN, Box Number, New or Highest Used Price, List Price, Title, Shipping Weight, and Misc Notes.
I only had ISBN and of course box number (since I packed them). My dad let me know about Smartsheet.com, and we used them to very easily create an MTurk task to get the rest of the information. Smartsheet.com works on the following premise: a person has a spreadsheet with some values unknown for each row -> Smartsheet uses MTurk to create tasks, 1 per row, for Turkers to fill in the missing information -> it hands you back the completed spreadsheet.
The result: within 5 minutes we had deployed tasks to MTurk, and within another 45 minutes we had 100 HITs complete, with each Turker looking up and filling in all of the information I needed from Amazon. The sum of the new or highest used price of all my books is $4226.62 for those 3 boxes. Hurray for college! I think we paid $0.05 per HIT, so the entire job cost $5. The programmer in me says, “Bah, write it yourself. You know how to use Perl + Web services + aggregate into a data structure + export results to CSV + generate PDF.” And, if I were adding up the value of more than 10,000 books, I would probably write it myself. But the economist in me said, “For $5 you can have this problem solved and be doing something more valuable with your time.” My internal economist won the debate, and I now have a cool spreadsheet that I will use as a packing list. And I now know that I definitely need to insure these boxes.
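Once the spreadsheet comes back, the totals are a few lines of R. The sketch below uses made-up rows: the column names, ISBNs, prices, and weights are all placeholders for illustration, not my real data. Only the HIT count and the price per HIT come from the job described above.

```r
# Made-up rows in the shape of the finished spreadsheet.
# Column names and every value here are illustrative placeholders.
books <- data.frame(
  isbn      = c("0000000000001", "0000000000002",
                "0000000000003", "0000000000004"),
  box       = c(1, 1, 2, 3),
  price     = c(45.00, 120.50, 19.99, 89.95),  # new or highest used price, USD
  weight_lb = c(2.5, 3.1, 0.8, 2.2),           # shipping weight in pounds
  stringsAsFactors = FALSE
)

# Value of the whole collection, and weight per box for the movers.
total_value   <- sum(books$price)
weight_by_box <- tapply(books$weight_lb, books$box, sum)

# Cost of the MTurk job: 100 HITs at $0.05 each.
n_hits        <- 100
price_per_hit <- 0.05
job_cost      <- n_hits * price_per_hit

print(total_value)
print(weight_by_box)
print(job_cost)
```

Exporting the sheet to CSV and loading it with `read.csv` would replace the hand-built data frame.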
Topics: General | No Comments »
newspapers
By stephen | May 11, 2009
Assume I run a donut shop. I am making profits, but the public is also benefiting since my incredible donuts are available to them, so arguably my donuts are a “public good”. Plus, I am charging $0.50 per donut, or $1.50 for a donut on Sunday, but it is slightly bigger and wrapped with advertisements. Then, I find that at my current prices, I am not profitable. This is partially due to the fact that food bloggers are teaching people how to make donuts at home, and other people give away donut recipes for free. Do I:
- A - Try to get the government to temporarily exempt the donut industry from anti-trust legislation such that I can form a donut cabal for an across the board padded pastry premium
- B - Declare that my donut stand is now a non-profit, and get a tax exemption
- C - Get the government to subsidize my business under the Donut Protection Act
- D - Charge enough for my donuts so that I make profits. If, at that price, nobody buys my donuts, then perhaps I should not be in the donut business.
Does this sound logical? Switch donut for newspaper in the above itemized list and see how it sounds.
On the way to visit my family for Mother’s Day/Sunday dinner, I heard a piece on NPR’s On the Media show about the state of newspapers. Particularly, it was about a Senate committee Future of Journalism hearing. There are some characters making quite interesting propositions for newspapers. Take, for example, Jim Moroney, of the Dallas Morning News, who wants the government to suspend anti-trust regulation so that a newspaper cartel can simultaneously raise prices. He says:
To try to bring it back one website at a time, one daily newspaper website at a time, will not work. If The Dallas Morning News today put up a pay wall over its content, people would go to The Fort Worth Star-Telegram.
Congress should act quickly on legislation providing a limited antitrust exemption that will allow newspapers some breathing room to share ideas and jointly explore innovative business models.
Mr. Moroney, somehow I do not think monopoly and innovative business models are in the same league.
Other solutions are proposed. Newspaper subsidies. A newspaper protection act that would give newspapers the same treatment as non-profit organizations. All while making fun of bloggers and the “new media”.
Then, there is Senator John Kerry. I am curious if he has ever used Google News, which was accused of stealing content. Google’s Marissa Mayer gave a simple, technically grounded rebuttal:
All newspapers and all publishers right now can opt out of aggregation. There are standard industry practices, files that you can put in place that say, please don’t collect my content.
It is true, though, that most newspapers, in fact, prefer the distribution. The distribution is better for them. It’s also better for users.
What she is talking about is the “Robots Exclusion Standard”, which, as she says, makes it trivially simple for websites to opt out of being crawled by Google.
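For the curious, opting out looks roughly like this. It is a minimal sketch of a robots.txt file, assuming the site wants to turn away Google's main crawler entirely; a real newspaper would likely block only selected paths rather than the whole site.

```text
# robots.txt, served from the root of the site (e.g. /robots.txt)

# Ask Google's crawler not to fetch anything on this site.
User-agent: Googlebot
Disallow: /

# All other crawlers may fetch everything (an empty Disallow allows all).
User-agent: *
Disallow:
```

That is the entire mechanism: a plain-text file that well-behaved crawlers read before fetching anything else.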
Senator Kerry responds:
JOHN KERRY: But isn’t there a greater - It’s a product. It’s created by somebody…. It, it is - it’s intellectual property, which we recognize as having a value. Correct?… Why do they not have a right - why is it antiquated to believe they have a right to be paid for their product?
Senator, go to news.google.com. There are 2-sentence summaries of articles, plus direct links to a variety of different news sources about each headline. If I have a friend who regularly visits a handful of donut shops, and I ask him where the best bear claws are, is he a donut thief for telling me the answer? No.
More complaints come from Mr. Moroney:
JIM MORONEY: Dallasnews.com, if we’re doing 50 million page views a month and 6 million unique visitors, we’re not putting up PDF pages of our newspaper. We are using the same sophisticated technology that’s driving the whole Internet ecosystem. So this charge that we’re not investing millions and millions of dollars in our websites and that we’re somehow clinging to the past is wrong.
The problem is I invest 30 million dollars a year in newsgathering resources at The Dallas Morning News. I don’t do 30 million dollars of revenue through Dallasnews.com in a given year. So, if I quit publishing the newspaper tomorrow and went purely digital, what’s going to have to go is a great number of the professional journalists that I employ today.
So, don’t invest any more online. Nobody is forcing you. If it is a bad investment, don’t do it.
Topics: General, News, Theory | No Comments »
unemployment statistics
By stephen | April 29, 2009
I was reading this WSJ Economics post about recent unemployment statistics. The summary in one word: up.
From the article:

I have recently been working with a time series of bankruptcy filing statistics from USCourts.gov. (Note to self: post my cleaned-up bankruptcy data.) I’ll have to see how these are correlated.
Topics: General, News | No Comments »
the announcement
By stephen | April 25, 2009
It is official. I am moving to Rochester. This is going to be fantastic.
The next few weeks will be quite busy. I am moving to New York, graduating, planning a vacation, finishing my honors thesis, plus doing some other side projects. Wow, this is so fun!
Topics: News | No Comments »


