Monthly Archives: June 2007

SVC

I met some people (friends of friends) for drinks a couple nights ago at The Parlor in Lincoln Square.  One of the guys there was a fellow ‘softie, who works in the SVC (Silicon Valley Campus) on the Live Search (Research) team.  He mentioned how the Valley is quite hot right now for start-ups, how the Google campus is amazing (not to mention the food), and how Facebook is hiring like crazy (they currently have ~200 people and are planning to double in the next X months [this stuff isn’t referenced from anywhere]).  He mentioned that it would be a good time to join them, before they IPO.  I recall reading somewhere that Facebook viewed themselves as the next Google.  Crazy?  I wonder what it would be like to live in the Valley.  (Interestingly enough, I have a friend who works at Real and is contemplating moving down to Cali–she just got an interview with Yahoo!)  It’s quite the market out there.  If you’ve read MSFTextrememakover, you’ll see things aren’t all fine and dandy over here in Redmond.  For one, it’s been cold and raining a lot lately–and it’s June!

Most Stylish Man

A Microsoft developer was recently awarded the title of "Seattle’s Most Stylish Man" by Seattle Metropolitan magazine.  Pretty funny.  That being said, he is way more stylin’ than I am.  No, I don’t troll fashion websites; I found this out from an article on Micronews/MSW.  My favourite quote, though, is by Danny Glasser:

Gopi’s manager, Danny Glasser, suggested the award might elevate Microsoft’s status in fashion circles. “After years of being laughed at, spit on, and having our lunch money taken, it fills us with pride to see the glitterati recognize the fabulousness that is the lifestyle of the Microsoft software developer.”

Service Reliability

Omar Shahine, a PM on Windows Live Hotmail, has an excellent post entitled, "Designing for Services Dependencies" from the Hotmail perspective.  Reading it brought back a bunch of memories and lessons learnt from Messenger Server.  (I don’t have to think about service reliability these days as we have more "qualified" people to do so.)  Let me focus on the "reliability" aspect rather than the "dependency" aspect.

Who remembers the 8+ day outage of MSN Messenger back in July 2001?  I confess that I was "not a fan" of MSN Messenger back in those days (I had just started to use AIM over ICQ) so it didn’t affect me.  But the legend of that outage can still be heard if you seek out the people who were on that team back in the day (who are now scattered all over the place–everywhere but Messenger, including Google and Yahoo!).  You can read about the whole ordeal here.  (As far as I know, though, the outage had nothing to do with .NET; the .NET Messenger Service was just a PR branding exercise.)  What about the outage in early 2003?  I don’t have first-hand accounts of these weeks as I wasn’t around during that time.

In the old days, what we would do is schedule several hours for server upgrades where we would kick off the entire Messenger user base (that’s millions of people around the world), take the entire cloud offline, deploy new binaries to the machines, restart the machines, smoke-test a bit, and then finally start taking traffic.  It was a heavily manual, intensive process, often taking many hours.  These were scheduled during the lowest-traffic period in the week, which happened to be Friday nights, around 9 PM PST (sorry, Asia!).  We would be down for several hours while we upgraded bits.  Some people would be in the S.O.C. while the rest of us would be in a large lecture hall watching a movie (before!) and then watching the "action" on the big screen during deployment.  People got tired and sometimes made mistakes.  [Note that in the really old days, the servers were rebooted by developers to "try" out new fixes/features.  In the really really old days the servers were boxes under someone’s desk.  It would have been cool to be around then.]

So when was the last time you saw the pop-up dialog when using Messenger: "The Messenger Service will be performing maintenance in 5 minutes"?  It should have been some time late in 2004.  That maintenance, if I recall correctly, lasted several hours, during which time 100% of the Messenger user base was disconnected.  (I remember my sister on the east coast IM’ing me, "AHH!! No! No maintenance!  I need to talk to my partner to get this project done!!!!")  The outage before that was in fact the weekend of October 9-10, 2004.  That was a ‘fun’ weekend.  According to this article:

By early afternoon Monday, a representative of Microsoft said the company had fixed the issues that had prevented its users from logging on to Messenger.

"The system is now back up and running," the spokesperson said at 1 p.m. PDT. "We believe that the problem is now fixed." …

The spokesperson would not give further details about the problem, except to say that the Monday morning outage was due to "administrative maintenance."

Indeed, that was a big SNAFU.  (How much can I divulge here without getting fired?  Just use your imagination.)  It took a long while to recover from that outage, and we had learnt our lesson.  It was interesting to note, as well, that when you have an outage, it actually takes a while before the number of online users reaches the pre-outage numbers.  I suppose that it’s not surprising that you lose some percentage of users when your service is out-of-commission, but you don’t really wrap your head around the fact that even a small percentage of millions of people is a lot of people.  Nine hours in a year (roughly 99.9% uptime) is not a lot of time, yes, but nine hours at once passes by fairly quickly, and you’re guaranteed to hit many other blips along the way.
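To make the "nine hours = 99.9%" arithmetic concrete, here is a minimal sketch that turns an availability target into a yearly downtime budget.  The function name and the 365-day year are my own simplifications, not anything the Messenger team used.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years (assumption)


def downtime_budget_hours(availability_pct: float) -> float:
    """Hours of downtime per year allowed at a given availability percentage."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


if __name__ == "__main__":
    # Each extra 9 cuts the yearly budget by a factor of ten.
    for pct in (99.0, 99.9, 99.99):
        print(f"{pct}% uptime allows {downtime_budget_hours(pct):.2f} hours/year")
```

At 99.9% the budget works out to about 8.76 hours, so a single bad weekend can use up the whole year.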

How do we measure reliability?  Surprisingly enough (it surprised me), we don’t actually take the statistics that clients upload to us since the checkbox to "Join our Customer Experience Improvement Program" is turned off by default, unfortunately, and not very many people turn it on.  (By the way, you really should check off that box: Tools > Options > General > Quality Improvement.)  Instead, we have these little programs that run against the cloud, simulating actual clients, every X minutes or so.  Using a little math, one can turn these into a rough measure of percentage uptime.  I always found this to be somewhat arbitrary.  For instance, it’s easy to figure out what happens if Passport is down.  As Omar says, you’re down on your ass.  But what does it mean, for example, if you can log in and get the status of your buddies, but you can’t establish an IM session with them?  How does one weight that for ‘service reliability’?  The best way (in my opinion) would be to get actual client data–how often people try to set up IM sessions compared to how often they fail; but we don’t have all that data (and definitely not all of it in real-time), so we figure out some complicated formula.  We try to be objective but it’s totally subjective.
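As a rough illustration of how simulated-client results might be turned into a single uptime number, here is a hedged sketch.  Everything in it is hypothetical: the three checks, the weights, and the rule that a failed login scores zero are assumptions made for the example, not the actual formula.  The arbitrary weights are exactly the subjective part.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    """Outcome of one simulated-client run (hypothetical checks)."""
    login_ok: bool
    presence_ok: bool
    im_session_ok: bool


# Hypothetical weights; choosing these is the subjective part.
WEIGHTS = {"login_ok": 0.5, "presence_ok": 0.2, "im_session_ok": 0.3}


def probe_score(result: ProbeResult) -> float:
    """Score one probe between 0.0 (fully down) and 1.0 (fully up)."""
    if not result.login_ok:
        # If login fails, nothing else is reachable.
        return 0.0
    return sum(w for name, w in WEIGHTS.items() if getattr(result, name))


def availability_pct(results: list[ProbeResult]) -> float:
    """Average the probe scores into a rough percentage uptime."""
    if not results:
        raise ValueError("no probe results to measure")
    return 100 * sum(probe_score(r) for r in results) / len(results)
```

With these weights, a probe that can log in and see buddies but cannot open an IM session counts as 70% "up".  Change the weights and the same outage produces a different reliability number.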

Keeping a service up and "reliable" may sound extremely boring to many folks, but there are people here dedicated to doing exactly that.  These people covet an extra 9 and will do anything to get it.  Funny enough, it’s a constant struggle between those that want to keep the service reliable ("don’t touch what ain’t broke") and others that want to roll out new features to users as quickly as possible.  Interesting dilemma there too.

Messenger has gotten much much better at dealing with downtime since 2004.  Part of that, of late, has to do with some platformizing and leveraging some cool work done in Search.  Which is why, although you may experience "unable to connect" every now and then, you won’t see "maintenance" any time soon.

Facebook Applications

I hesitate to keep making posts about Facebook, but it has become the de facto social network standard (at least, for me), and is the one creating all the buzz lately.  A friend of mine recently asked me what I thought about Facebook applications (in particular, one by EF Globalprint).

When people started adding applications a week or so ago, my first thought was surprise at how long it had taken Facebook to finally open up their platform to allow for third-party extensibility.  I spent a little bit of time digging around their APIs before I realized that Facebook applications needed to be hosted on some external webserver.  I was expecting the platform to allow for mash-ups to be hosted on the Facebook servers, but I was mistaken.  If you take a look at the developer site, you’ll see the primary goal of their platform: to allow third parties to build (and monetize) applications by leveraging the social network users have built on Facebook (Deep Integration/Mass Distribution/New Opportunity).

It’s a smart (and long overdue) move on their part.  Harnessing the ability of third party developers/companies to create apps that make the Facebook experience more fun and enjoyable is a win-win situation.  It’s too bad Windows Live has taken so long to come up with a (rather disjointed) dev story.

So what do I think about these apps?  They’re clunky.  And it’s not just the UI (although the UI is clunky as well).  Adding an application that either a friend has added or I found brings me to a Facebook-hosted application webpage that doesn’t have nearly enough information on it about what the application is, or does.  This is probably the fault of the app developers, but it definitely doesn’t help users that might be interested in playing with the app.  Underneath the short blurb/logo there’s a "Discussion Board" followed by "Reviews" that contain little or no useful information (in fact, they distract the user from the main goal).  Let’s say you do take the leap of faith, and click "Add Application".  You are then prompted to confirm that you allow the app to "know who I am and access my information", as well as a myriad of other choices.  Fine.  Now I add the app.  Woo-hoo!  I’ve added an app!

But wait.  It turns out that, if you click on the app (via your profile or navigation bar or wherever), the app isn’t yet functional.  You’re prompted to either log in/associate your Facebook account with the custom app’s account, or create a new custom app account.  Am I the only one that thinks this is hugely cumbersome?  So, in order to take part in/use an application that my friend has added, I have to navigate a bunch of webpages, hesitate over whether or not I should trust this app, and then go through yet another sign-up process (username:password)?!?  Not to mention that it probably won’t be interesting to use until some critical mass of my friends has added the same application.  Thanks, but no thanks.  I suspect that apps won’t gain much usage until some form of OpenID is implemented by Facebook and adopted by the third parties.

Aside on EF Globalprint:  First off, I can’t find a website for this application on Google (which may or may not be officially affiliated with EF Education).  In fact, their new user registration points to an IP address of some box (in Stanford?).  I suppose this is testament to how easy it is to write a Facebook application.

What else?  It’s also strange how the app’s account identifier hangs off a public/private key pair issued to an individual’s account (although there are probably ways around this if you’re a sizeable company).

Anyhow, I suppose most of my beef is that (other than the identification thing) the UX isn’t great.  And I just haven’t found/heard of any really compelling application to add yet.  (It would be interesting to be able to see the number and type of applications added by my friends.)  Regardless though, I do think there are endless possibilities here.  It’s a good start for a social network platform.

Interesting to note that the top 3-4 most popular applications currently are:

  • Mobile, by Facebook.  1.76 million users.  (Does this even count?  It’s considered an ‘app’ even though it came out long before the developer story was out.)
  • iLike, by iLike, Inc.  1.45 million users.
  • Horoscopes, by RockYou!  0.996 million users.
  • Movies, by Flixster.  0.839 million users.

For relative comparison, there are over 20 million users on Facebook (over half of which log in daily).