Optimize Prime


Monday, March 3 2008

Don't use Pound for load balancing

We were using Pound for load balancing at Justin.tv until today. It was consistently using about 20% CPU, and during spikes would use up to 80% CPU. Under extremely high load, it would occasionally freak out and break.

We just switched to Nginx, and load immediately dropped to around 3% CPU. Our pages feel a little snappier, although that might be my imagination. Not only is the config format easier to understand and better documented, but it offers a full webserver's complement of functionality. We haven't hit any spikes yet, but given the current performance I suspect it will cream Pound.

In short: Pound is outdated. Nginx is a good replacement, although there are many, many, many other options I haven't tried.

Nginx vs. Pound

Edit: More data now available!

Week view on Pound vs. Nginx

Saturday, March 1 2008

The next tinyurl

Tinyurl seems like a ridiculous idea for a website. The service is so simple that any competent hacker I know could write the central feature in under half an hour. It's not even original, in the sense that hashing long keys to short keys is one of the oldest tricks in computer science.

Yet writing it as a web service was original: Tinyurl is an Alexa top-1000 site. That's more traffic and usage than 90% of the startups I've met have (or will probably ever have). Not bad for what couldn't have been more than an afternoon's work. I don't believe it's the only simple utility that should exist, and yet currently doesn't. It opens up the tantalizing possibility that the right little hack could be used by millions of people.
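To give a sense of how small the central feature really is, here is a minimal in-memory sketch. It is not how Tinyurl itself works; the `Shortener` class and `encode` helper are made-up names, and it hands out keys from a counter rather than hashing the URL, which sidesteps collisions at the cost of guessable keys.

```python
import itertools
import string

ALPHABET = string.digits + string.ascii_lowercase  # base-36 digits


def encode(n):
    """Render a non-negative integer as a base-36 string."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, len(ALPHABET))
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


class Shortener:
    """Hypothetical in-memory store mapping short keys to long URLs."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._by_key = {}
        self._by_url = {}

    def shorten(self, url):
        # Reuse the existing key if this URL has been seen before.
        if url in self._by_url:
            return self._by_url[url]
        key = encode(next(self._counter))
        self._by_key[key] = url
        self._by_url[url] = key
        return key

    def expand(self, key):
        # Raises KeyError for unknown keys.
        return self._by_key[key]
```

Everything a real service adds on top (persistence, an HTTP redirect, abuse handling) is left out here.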

The best idea in this vein I've had so far applies the tinyurl concept to the page itself. Most pages on the internet consist of a tiny patch of content, surrounded by acres of ancillary and mostly unnecessary wrapping. Given a url, the utility returns an embed for the actual content portion; the way this works on YouTube, Alexa, Flickr, or Justin.tv(1) should be obvious. I can imagine using this kind of a utility in several contexts in Justin.tv, and obviously it would quickly become a favorite with linkjackers everywhere (2).

Unfortunately, unlike tinyurl, this project would probably take a week to do right, and I don't really have the time right now. So I doubt I'll get around to writing this, but I hope someone else does. Let me know how it goes, so I can claim credit for all your hard work.

1. Sorry for the plug.

2. As tinyurl is the friend of the shock-site trickster.

Wednesday, August 22 2007

Insecure By Default

I never really paid much attention to the administration of SSH on my boxes. I figured Debian probably set it up to be essentially secure by default.

While the algorithm is perfectly secure, there's a big problem. I was poking around my system logs after reading an article about someone else's box being compromised, and discovered multiple IP addresses trying to guess root passwords. Now, we have root login disabled on our boxes. But the fact that someone could just sit there guessing disturbed me greatly. What if someone actually wanted to break into our boxes? It's not like our usernames are highly obfuscated.

Turns out there's an easy solution:

$ sudo apt-get install denyhosts

Denyhosts blocks (temporarily) anyone who makes a sufficient number of failed login attempts. Why this behavior is not the default is very unclear to me. Just whitelist your own IP to make sure you don't lock yourself out:

$ nano WORK_DIR/allowed-hosts

(look up the WORK_DIR in denyhosts.conf)

When I installed denyhosts it instantly locked out 4 people who were *currently* trying to crack our network. If you haven't considered installing it yourself, perhaps you should.
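The basic idea is simple enough to sketch: count failed logins per source address and block anything over a threshold, skipping whitelisted hosts. This is only an illustration of the technique, not Denyhosts' actual code. The function name, the threshold of 5, and the example addresses are assumptions; the regex targets the usual OpenSSH "Failed password" log message.

```python
import re
from collections import Counter

# Matches OpenSSH lines such as:
#   "... sshd[123]: Failed password for root from 203.0.113.7 port 4000 ssh2"
#   "... sshd[123]: Failed password for invalid user admin from 203.0.113.7 port 4000 ssh2"
FAILED_LOGIN = re.compile(r"Failed password for (?:invalid user )?\S+ from (\S+)")


def hosts_to_block(log_lines, threshold=5, allowed=frozenset()):
    """Return the hosts with at least `threshold` failed logins, minus `allowed`."""
    failures = Counter()
    for line in log_lines:
        match = FAILED_LOGIN.search(line)
        if match:
            failures[match.group(1)] += 1
    return {
        host
        for host, count in failures.items()
        if count >= threshold and host not in allowed
    }
```

The `allowed` set plays the same role as the allowed-hosts file: your own address never ends up in the result, no matter how many times you mistype your password.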

Monday, April 9 2007

Google Ownz Me

Apparently I've been using my gmail account too much:

Sector 6

I'm not using anything like GmailFS...I'm not even close to filling up my gmail account. I wonder what I did to trigger this?

If someone at Google reads this - FIX MY EMAIL.

Tuesday, March 6 2007

Oh the things that you'll see

I've wound up hacking a lot of Flash recently, which is a real PITA if you're used to real development environments. For one thing, there's no way to get debug output from a Flash widget once it's been embedded in a real page. Actually, it turns out that there is: the Flash debug player. After you install it, all trace output from any embedded Flash player is written to a log file. Throw tail -f on the file, and you can watch the trace messages from your Flash in real time. The cool thing is, you can also watch the debug messages from everyone else. There are a few interesting ones, but the best I've found so far is the YouTube embedded player. The tastiest bits:
START PLAYING :http://www.youtube.com/get_video?video_id=_-XoafyJ9K4&t=OEgsToPDskIMqdmC3eeaF1meusSwSKjs
playing.. the movie
status code is:NetStream.Play.Start
we got meta fuck yeah!
time is:127.494
and
result xml:0
node is:0 length:1
status code is:NetStream.Buffer.Flush
status code is:NetStream.Buffer.Empty
showing the goddamn play button
I'm always glad to see how much I have in common with the engineers at other companies.

Thursday, January 4 2007

Food Riffs

Justin and I were discussing baking lemon bars the other day, when the same thought hit us both simultaneously: why are we so limited, to consider only the Lemon variety?

[Photos: lemon, lime, and orange bars]

They're still cooling. I'll update with the success, flavor-wise, tomorrow. Later on the agenda: Grapefruit!

Tuesday, December 19 2006

What's Wealth?

From The Consumerist:

A recent United Nations study on personal wealth found that having just $2,200 per adult puts a household in the top 50% of the world's richest. However, thanks to large amounts of consumer debt, "many people in high-income countries have negative net worth and - somewhat paradoxically - are among the poorest people in the world in terms of household wealth."

What does it truly mean to be wealthy? The American child of middle class parents with student loans and credit cards is much more secure in his access to all the accouterments of wealth for the duration of his entire life than a debt-free child of Ugandan refugees. That security is the true measure of wealth. Merely counting up the quantifiable assets (or debts) held is ridiculously simplistic. First, there are personal intangible assets (the American's college degree), familial assets (the American's parents probably have positive net worth), and most importantly societal assets (the security that men won't come and raze your house). And I'd be wealthier renting an apartment in America, in hock up to my neck to the credit card companies, than living completely debt-free in Uganda.

Monday, December 4 2006

Splay Tree

There was no convenient Ruby implementation of Splay Trees. Now there is; this is pretty much a straight port of the Java version available here. I wrote up a quick set of tests as well.

Thursday, November 30 2006

This Blog Is Boring

I think that's objectively true. What I'm really worried about is that this reflects my current personality, working for a startup: boring. More generally, I've been concerned that everyone in startups winds up, to an outsider, fundamentally boring. The startups in Crystal Towers (we need a collective name - suggestions) were discussing this a few days ago. Someone noted that when they went home they had no idea what to talk about with people, if they couldn't discuss work. Luckily I didn't have that problem so much, but only because there were other people pulling the conversation forward. If I had to generate conversation topics on my own, I'm not sure I'd be able to do it for long without work.

Even this post is about startups and their effects. That's a disease I've noticed even in great writers I know who are involved in startups. And I'm not even a particularly good writer, let alone great, so it's a fairly grim outlook for me. My only hope is to retire to a monastery in the mountains.

Tuesday, November 7 2006

Somehow it seems I always wind up rolling my own...

I spent a few days looking for a customizable real-time chat component to use on our new project. There were plenty of excellent components (mostly built in Flash) that you couldn't customize at all. Since we need a bunch of features that those chat clients don't offer, they were only tempting dead ends. In the end, I decided to roll my own. In the search process I ran across Juggernaut, a Rails plugin for persistent connections. Building basic chat off of Juggernaut has required adding a couple of feature enhancements (a system of handlers for connect and disconnect events), but overall it's been a solid base. My new chat project is called Zinzani; most of the functionality is now in place, although the default template is still very ugly.

I'll try to make http://zinzani.rubyforge.org/ a functional demo soon.

Thursday, October 26 2006

That's very liberal of you

It used to be that an act of generosity or mercy would be called "christian" by many. I've always thought that was poor word choice, because those actions aren't particularly Christian. "Frank led me to Jesus by his great love of Christ. That was very Christian of him": good word choice. "Frank let me borrow his lawnmower. That was very Christian of him": poor word choice. It's an important word choice as well, because it implies that non-Christians lack those traits, and the belief that outsiders lack generic positive traits is a symptom of unwarranted discrimination and prejudice.

Luckily, nowadays you don't hear that phrasing very often. It sounds slightly archaic. But I recently overheard something here in San Francisco that disturbed me. Someone said "He helped me set up the event; he's a real liberal." Liberal here is being used in the same sense as Christian: a placeholder for "good". This is disturbing because it implies that the members of this culture actually believe that conservatives lack personal, positive traits. That's simply not true, and it's foolish to think otherwise. That kind of attitude is exactly what we should be fighting: people are people, good and bad, no matter what their religion or politics are. Neither Christians nor liberals have any monopoly on kindness.

Sunday, October 22 2006

Adventures in Devices

Steps required to get Philips SPC900NC webcam to work under Ubuntu (Dapper-Drake) Linux:

wget http://www.saillard.org/linux/pwc/snapshots/pwc-v4l2-20061020-042701.tar.bz2

# A big thank you to Saillard for putting in all the work for these drivers!

tar xjf pwc-v4l2-20061020-042701.tar.bz2
cd pwc-v4l2-20061020-042701

# If you try to run make now, it will fail because there's no /build or /source directory in /lib/modules/`uname -r`,
# so you need to get the kernel headers and softlink them in.

sudo apt-get install linux-kernel-headers
sudo ln -s /usr/src/linux-headers-2.6.15-27-386/ /lib/modules/2.6.15-27-386/build

# Now make will work.

make
sudo make install
sudo modprobe pwc

# Plug in the webcam and test:

sudo apt-get install camorama
camorama

I don't even want to talk about how many dead ends I had to go down to figure that out. At least I sort of understand how Linux device support works now...

Wednesday, October 18 2006

Blacklist Script for Reddit

I've seen a lot of people complaining about the prevalence of political articles on reddit. I myself have thought, "I would rather not see any more stories about the LATEST OUTRAGE BUSH DEMOCRATS OMG". So I wrote a simple greasemonkey script that checks for the presence of various words in article titles. If it finds one of the words, it calls removeSiteDom to remove it.

If you like politics but never want to see "10 worthless CSS tricks!" again, feel free to edit the word list.

(download the script at http://blog.emmettshear.com/public/nopolitics.user.js)
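The matching logic itself is tiny. Here is a rough sketch of it in Python rather than Greasemonkey JavaScript, with the DOM handling left out. The word list and function names are made up for illustration, and this version matches whole words only, which may not be what the actual script does.

```python
import re

# Hypothetical word list; edit to taste.
BLACKLIST = ["bush", "democrats", "republicans"]


def is_blacklisted(title, words=BLACKLIST):
    """True if any blacklisted word appears as a whole word in the title."""
    tokens = set(re.findall(r"[a-z0-9']+", title.lower()))
    return any(word in tokens for word in words)


def filter_titles(titles, words=BLACKLIST):
    """Keep only the titles that contain no blacklisted word."""
    return [title for title in titles if not is_blacklisted(title, words)]
```

Matching whole words means a title about ambush predators survives a blacklist containing "bush"; a plain substring check would be shorter but would remove it.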

Monday, October 16 2006

The World: Very Very Small

We opened our mailbox at our new place and found a magazine sent to the previous resident. Upstairs in our apartment, where the Xobnis are crashing with us, Matt was browsing through it and found none other than Reddit mentioned in an article. Tight circle...

Saturday, October 14 2006

Scratchtop

I threw together scratchtop.com in frustration with all the current ways to write and share simple documents on the web. Why are you making me register? Why do I have to click 10 times to start writing my first document? Why do I have to click edit and save? Why do I have to click at all? Feel free to use Scratchtop, but be warned that it's still very much development software.

As far as I know, Scratchtop is the simplest useful web application ever written. Anyone know anything simpler?

Arrival in San Francisco

Haven't posted in a few weeks. Was busy driving across country. Trip was fine, although Mormons stole Justin's sunglasses in Salt Lake City. Time to get to work.

Monday, September 18 2006

Lessons from eBaying Kiko

We at Kiko Software Incorporated have recently had an experience which is fairly unique in the crowded software world: we successfully sold a full software product on eBay. This hasn't been done too many times before (just once, as far as I know), but it's a pretty good way to exit a startup. There are a few potential problem areas though, and I'd like to share what we learned from the process.

Most importantly, specify the contract fully up front. Because we failed to do this, we wound up negotiating that contract after the sale. As anyone who's ever paid a lawyer to negotiate anything knows, that's more expensive than you'd like. Luckily Tucows was very reasonable and the negotiations went fine; if we'd been less lucky it could have been a real problem. Different sets of terms are worth very different amounts. In our deal, we very specifically did not offer a long term support contract with the software, even though that could have potentially increased the final price quite a bit.

If you think you have two potential sets of terms that would appeal to different buyers and you're willing to entertain either, consider selling an option bundled with the software. For example, if we'd been willing to offer a long term support contract for a reasonable price, we could have included that.

Pay careful attention to eBay's terms of service though. From what I can tell, certain kinds of options might be against the rules. Kiko's auction was pulled on the 7th day (out of 10) for having 2 links to kiko.com (one to our main page, another to the API documentation). Apparently that's one over the limit, and an extremely vigilant community member killed our auction for it. We relisted it as a 3 day auction and it doesn't seem any long term harm was done, but it was very nerve wracking.

Relisting it for 3 days was actually a mistake. We should have taken the removal as a blessing in disguise and relisted it for 10. Another week and a half would have allowed more companies to bid. Corporations are slow. Ten days isn't enough time for most of them to both find out about the auction and go through their internal procedures. So a ten day auction is probably too short. If you can find another auction site besides eBay that allows longer auctions, you might consider using them.

We did do at least one thing right when we started our auction at $50,000 instead of $1. It's easy to see why it's a good idea if you consider that the real bids are all going to come in the last ten minutes of the auction. Earlier bids are essentially meaningless noise, and a low starting price can only induce more of those. The disadvantage of a low starting bid is you could be forced to sell for less than your minimum. Why take the chance?

Overall, the experience was extremely positive for us. Even with the mistakes we made, the process was very fast, the asset sold for more than we expected, and our baby found a good home with Tucows. If you find your business with a large, valuable asset, consider eBaying it. It may sound crazy, but it seems to work.

Wednesday, September 6 2006

The Privacy Continuum

The Facebook just released a new feature that shows all the recent changes from your friends on your main page. Less than a day later, there are multiple groups protesting this as an invasion of privacy; the largest I've found is nearly 25,000 students strong. This is a very important reaction that we should pay attention to: a great many people feel their privacy has been invaded, but no new information has been revealed! The new feature only aggregates publicly available information. How could that be an invasion of privacy?

Because our previous conception of privacy, as public/private, is a flawed dichotomy that must be discarded in the face of changing technology. Data is not merely private or public, known or not known. Data has an associated cost of retrieval which is either high or low. The strength of this reaction to the new feature might have caught its creators by surprise, but it shouldn't have. After all, The Facebook's walled-garden approach to schools is largely responsible for its popularity, and that approach is entirely based around increasing the cost of accessing data. No one really believes that information they enter there is private, just that it will be relatively difficult for outsiders to find. A huge amount of information had been entrusted to The Facebook with the understanding that it would be available only to other users of The Facebook, and to them only so long as they paid the time cost of looking for it. Now that time cost has been reduced to nearly zero, privacy has been reduced correspondingly. The protesting students' instinctive reaction is precisely correct; the privacy bargain has been unilaterally modified.

This is the same kind of invasion of privacy being proposed by the Bush Administration in the Total Information Awareness project. No private information would go into the database, and yet aggregating that data into one place is still an invasion of privacy. We all agree that our whereabouts are not secret when we enter public spaces, yet I doubt many of us would be comfortable with the government recording our every movement. That our purchasing histories are public is no big deal when accessing one takes effort and time; they are still mostly private because most people will not spend the time to find them.

Data aggregation is an invasion of privacy because it reduces the cost of access to that data, and cost of access is the continuum upon which "public" and "private" are poles.

  • Update: 4 hours later, it's up to 85,000 students.
  • Update: A day later, it's up to 280,000 students.
  • Update: A few days later, it maxed out around 750,000 students.

Thursday, August 31 2006

Things I Didn't Know About The History Of Schools

  • In 1930, there were 260,000 schools in the USA (including approximately 150,000 with only a single teacher). In 2000, there were fewer than 95,000 schools, almost none with only a single teacher. The average number of students per school grew from 89 to 502.
  • The number of support staff per student has increased three-fold since 1950, from 1/83 to 1/27.
  • The number of teachers per student has increased two-fold since 1950, from 1/26 to 1/12.
  • The average number of years of experience for a teacher has increased 7 years since 1966, from 8 to 15. The average age of a teacher has also increased by 7 years over the same time period, from 37 to 44.

The statistics are from School Figures, which is published by the Hoover Institution. I hadn't heard of them before, but from reading their choice of statistics it's clear they're in the privatization/school-choice crowd. The idea of school choice had an appeal for me at one point, when I still believed that what made a school good could be measured by tests. School choice advocates are entirely right that if you place enough emphasis on competing for students via test scores and the school ratings based on them, you'll see scores go up. The problem with that approach is that any kind of test scores are basically broken as a measure of the success of schools.

First, it's difficult to say whether test scores are truly rising or falling at all. There are a huge number of confounding factors involving demographic trends and changes in the tests themselves. Second, even assuming we could correctly tease out the true change in test scores, there's very little evidence that rising test scores mean better educated students. It's difficult to deny that students who score 1600 on the SATs are much more likely to be well educated and thoughtful than those who score 900. But as those 1600 point scoring students would tell you, correlation is not causation! It might be that on average tall students score higher on their SATs, but that doesn't mean giving students stilts will improve test scores.

This is all a roundabout way of saying that I don't think much of the book, although it has a lot of pretty graphs. They've managed to pull together a very large and impressive number of statistics, a few of which are meaningful and most of which are not. Other offenses include, but are not limited to: extrapolating trends from as few as two data points, representing public opinion polls as meaningful measures of something more than people's opinions, and failing to consider alternate explanations for trends. I shouldn't pick on them too much, since all of these are common crimes, but I wish for a book on the history and facts of education that I could actually recommend at some point.

Terse, to say the least

I've been thinking and talking a lot about programming language design recently. I find when discussing the relative merits of languages, I often get hung up on the question of what is and isn't possible in a decent way in any given language. So I've decided to compile a set of the ways to solve a particular problem in several different languages. Hopefully this is the first of a series.

Problem: define a function "adder" of one variable (x) which returns another function of one variable (y) that sums x + y

My answers for every programming language I've thought about or used recently, in order of decreasing length:

Javascript:

function adder(x){ return function(y){ return x + y; } }

Scheme:

(define adder (lambda (x) (lambda (y) (+ x y))))

Ruby (a Proc is not exactly like a function because Ruby is dirty):

def adder(x) Proc.new {|y| x + y }; end

Erlang:

adder(X) -> fun(Y) -> X + Y end.

Forth:

: adder quote [+ ] append ;

Arc:

(def adder (x) [+ _ x ])

Haskell (currying is cheating):

adder = (+)

C and Java got skipped because the question isn't really meaningful for them, since they don't have anonymous functions.
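For comparison, here is a Python version. Python isn't one of the languages in the list above, so this is just an extra data point in the same spirit, not part of the original set of answers.

```python
def adder(x):
    # The returned lambda closes over x, so each call to adder
    # produces an independent one-argument function.
    return lambda y: x + y
```

Usage is what you'd expect: `add_five = adder(5)` gives a function, and `add_five(10)` evaluates to 15.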
