Vineet's profileVineet GuptaBlogListsGuestbookMore Tools Help

Blog


    June 11

    Notes on XMPP Part I - Fundamentals

    We recently augmented the team working on our desktop product. At the core of the product is XMPP – the protocol that drives several instant messaging servers and clients, sites like Chesspark and now Google Wave. Since XMPP is not known by many people, let alone be understood well enough, every time we on-board someone new, they have to go thru a steep learning curve. This post is an attempt to make it easier to understand the protocol .

     

    What is XMPP

    XMPP generally refers to a collection of specifications that define protocols for real time interactions over the public internet.  The core set of specifications has been standardized by IETF. While the first application (and the origin) of the protocols was to address instant messaging, the extensible nature of XMPP has led it to be used for a wide range of applications which need real time communication.

     

    Architecture

    XMPP has a decentralized client-server architecture (like the WWW) – there are several hundreds of XMPP deployments, each running anywhere from one to hundreds of servers to which millions of clients connect. Key aspects:

     

    Addressing:

    Each interacting entity on XMPP needs to have a unique address. This address is called a Jabber Id (JID). A JID consists of three parts:

    user-identifier: Unicode string representing the interacting entity. For example: user1

    domain-identifier: Unicode string representing the domain of the interacting entity. For example: directi.com

    resource-identifier: Unicode string representing a resource used by the entity. For example: pwdesktop

    A full JID looks as follows: <user-identifier>@<domain-identifier>/<resource-identifier>. For example: user1@directi.com/pwdesktop. An identifier without the resource is  called a bare JID: user1@directi.com. Another way of addressing is to use XMPP URIs: xmpp:user1@directi.com

     

    Communication:

    The original purpose of XMPP was to do instant messaging. This requires the server to be able to push messages out to the client on a real time basis. Over HTTP, this requires the client to poll the server, or a technique like Comet where the server caches HTTP requests and then sends responses on those requests. XMPP however uses a long lived TCP connection. This gives the server an always on channel to push info to the client. The client also need not wait for the server to respond to its messages, but can instead send an indefinite amount of messages to the server without blocking, which the server can then respond to, enabling an asynchronous kind of communication. This is further helped by the fact that XMPP does not require every packet being sent over the wire to be acknowledged. An entity assumes a packet to be delivered unless it receives an error.

     

    Protocol:

    XMPP uses streaming XML over a long lived TCP connection (though HTTP is also possible thru BOSH – see later) for communication. There is one stream from client to server and another stream from server to client. To start communicating, a client would send an opening XML tag:

    <?xml version="1.0"?>

    This is followed by a stream element – this marks the beginning or root of the document:

    <stream:stream>

    This in turn is followed by various messages sent across as XML elements. These elements are called stanzas. These stanzas can continue to get exchanged between the server and the client endlessly till a closing stream tag is sent which marks the end of the communication. There are three kind of stanzas:

    1) Message: The <message/> stanza denotes the basic push method for sending stuff from one entity to the other. These need not be acknowledged and provide a quick fire and forget mechanism to send info from one point to the other. A Message has a “to” and a “from” attribute denoting the receiver and the sender, and can include one or more payloads. In IM conversations, the payload is often HTML markup for richly formatted text (defined by XHTML-IM)

    2) Presence: The <presence/> stanza denotes a broadcast sent out by an entity to advertise its availability to other entities who have subscribed to receive these updates from the advertising entity. Presence messages can also include a payload. Common uses are to include an availability state like “away,” “busy,” etc. and personal status messages like “Working on blog post 1 of 6.”

    3) Information Query: The <iq/> stanza provides a Request-Response mechanism like HTTP verbs GET, PUT and POST. The payload defines the request of the sender which needs to be processed by the receiver. This is the only stanza type where the sender expects a reply – a result or an error. This makes IQ more reliable than a Message, allowing the two entities to carry out a structured interaction. Common examples of IQ usage in IM applications are to fetch the the Roster, and add / remove entries in the Roster.

    A closing stream tag indicates end of conversation.

     

    Decentralized client-server architecture

    XMPP servers talk to each other directly. So if user1@domain1.com needs to interact with user2@domain2.com, the servers at domain1.com would interact with domain2.com servers directly. This is different from email where communication between servers on different domains happens thru hops (which can lead to address spoofing and other issues) or HTTP where servers do not interact with each other at all.

     

     

    Creating a XMPP Session

    user1@domain1.com wants to talk to user2@domain2.com. To accomplish this:

     

    1) Client creates a TCP session with the server:

    The client used by user1 needs to find out the box that hosts the XMPP service for domain2. This is accomplished by doing a DNS service lookup by checking the DNS SRV record which maps the service to the machine name and the port of the service (5222 by default) becomes known, a AAAA lookup gives the IP of the machine. With the IP and the port we can now open a TCP connection.

     

    2) Client and Server start streams in opposite directions:

    a) The client sends across the opening XML text declaration tag (optional) followed by an initial stream header:

    <?xml version="1.0"?>
    <stream:stream to="domain2.com"
        version="1.0"
        xmlns="jabber:client"
        xmlns:stream="http://etherx.jabber.org/streams">

    b) The server sends back a response stream header with a unique stream id:

    <?xml version="1.0"?>
    <stream:stream from="domain2.com"
        id="0123456789" version="1.0"
        xmlns="jabber:client"
        xmlns:stream="http://etherx.jabber.org/streams">

     

    3) Client and Server negotiate Stream Features:

    Right after sending the response stream header, the server send across a <stream:features> message on the features it supports. These features are typically about:

    a) Whether server supports TLS or not (recommended)

    b) Authentication mechanism supported (see my earlier blog post on authentication mechanisms) – typically SASL plaintext and digest-md5 are supported. Ideally one should not use plain text without TLS since in that case the password is sent in clear on the wire.

    c) Stream compression (optional)

    At this point, the client uses <iq/> stanzas to negotiate which features it wants to use, and if TLS is used, enter into a TLS negotiation, or if SASL is used, authenticate via the appropriate SASL mechanism.

     

    4) Post Authentication Stream Negotiation

    After authentication, the server resets the session by sending a new stream header with a new stream id. This is done for security purposes. This new stream does not publish any authentication features (since that is already done), but now publishes new features. These typically include:

    a) Compression support

    b) Resource binding

    c) Formally starting an XMPP session

    At this point, the client again uses <iq/> stanzas to negotiate which features it wants to use, and once that part is over, the actual task of application specific stanza exchange can start between the client and the server.

     

    BOSH

    I earlier mentioned that XMPP can work over HTTP as well. This seems counter-intuitive: XMPP requires push, and HTPP is pull based (client sends a request and server responds). However, it turns out that one can do push over HTTP as well – the technique for using XMPP over HTTP is called Bidirectional-streams Over Synchronous HTTP (BOSH):

     

    1) There is a server in front of the XMPP server which handles HTTP clients. This is called a BOSH connection manager (CM).

     

    2) Client sends DNS query for TXT records, and discovers that there is an entry for BOSH connection which points to the BOSH server mentioned above.

     

    3) Session Creation Request: Client now sends a HTTP POST with an empty <body/> tag with some attributes set. The important ones are:

    a) hold – the number of HTTP requests the BOSH server can queue. This is typically set to 1

    b) wait – the timeout in seconds before which the server must respond to a pending request

    c) rid – a large random number that acts as the initial request id

     

    4) Session Creation Response: BOSH server opens a regular XMPP stream with the XMPP server over a TCP connection, receives the server’s XMPP response, wraps it up in a <body/> tag and returns this to the client over HTTP. The body tag contains the following attributes:

    a) hold – same as earlier

    b) requests – max number of HTTP requests that the client can open with the BOSH server at any time. This is typically set to hold + 1. Since hold is typically 1, requests is typically 2.

    c) sid – a large random number that acts as the session id. This is diff from the stream id sent by the XMPP server. The client must now include the sid in every subsequent request.

     

    5) Hereafter the client and the server negotiate stream features and authenticate pretty much in the same way as with a TCP connection, with the BOSH server sitting in between and wrapping / unwrapping the <stream> and <iq/> stanzas in <body/> tags. The XMPP application is now ready to exchange its specific stanzas.

     

    6) The question at this stage is, how does the server do a push. Recall that hold=1 is the max number of HTTP requests the BOSH server can queue, and requests=2 is the max number of HTTP requests that the client can open with the BOSH server at any time. Assume that the last request from the client was sent to the BOSH server just around 60 seconds back (let’s say a <presence/> packet) . The server had nothing to respond because the XMPP server had no stanzas. Now since the 60 seconds timeout is about to be over, this is what happens:

    a) BOSH server returns a HTTP 200 ok response to the last request with an empty <body/> tag. If the client too has nothing to say, it also sends across a HTTP 200 ok with an empty <body/> tag, to which the server can again respond at the 60 second timeout. This can go on ad-infiniteum as a keep-alive mechanism.

    b) Now assume that the client has something to say. One request is already in the play and max two are allowed. So the client can now send the new stanza in a new HTTP request. The BOSH server immediately responds to the earlier request (which was kept on hold) with a HTTP 200 ok with empty <body/> tag. It now again has one request outstanding which it can use to either send back the keep-alive, or send back a response from the XMPP server.

    c) Assume that the client and the server have been playing the keep alive game. Now the XMPP server sends a stanza (say an authorization request). The BOSH server needs to push this to the client. This is easy since it has a cached HTTP request. Hence push is accomplished over HTTP, without doing constant polling.

     

    The advantage of using BOSH is that it can work even in flaky networks where a TCP connection would break, forcing the client to once again establish an XMPP session. Also, this makes it possible to use XMPP in web clients where one cannot open a TCP connection, for example, Facebook’s chat feature uses XMPP over BOSH.

     

    Jingle

    XMPP uses a client-server model for all communication and is optimized for small snippets of info. So if the amount of data to be exchanged is very large, for example in applications like file transfer, audio-video calls and screen-sharing, an XEP called Jingle. Jingle is large and complex enough to deserve its own series of posts. I will summarize basic facts here:

    1) The basic idea behind Jingle (and other multimedia protocols like SIP) is to use two channels:

    a) Signaling Channel to set up, manage and tear down application defined sessions

    b) Media Channel to transfer the payload either peer to peer or relayed thru a mediator over a application defined transport

     

    2) In a Jingle negotiation (<jingle/> element inside a <iq/> element),  the initiator makes an offer to start a session by declaring one or more app type (say voice video, etc.) and a transport method (ICE, UDP, etc.). The responder and the initiator then negotiate a set of parameters (for example codecs to be used), and if the negotiation works data is exchanged. Some parameters can be modified even while the data is being exchanged.

     

    3) Jingle supports two transport types:

    a) Datagram transports like UDP – can tolerate packet loss – meant for apps like media streaming

    b) Streaming transports like TCP – no packet loss tolerated – for example file transfer

     

    4) The real power of Jingle comes from using Jingle over ICE. ICE provides a mechanism for two entities to communicate and negotiate all possible ways of connecting between each other – direct or mediated. ICE in turn can use a STUN server to find out the IP address and port of an endpoint from outside the firewall, and a TURN server to relay data in case a direct peer to peer connection is not available.

    May 24

    Lessons from Twitter @Replies tweaks

    When the Twitter reply feature tweak story started breaking, my first reaction was – this could be us in future. Watching Twitter struggle with the change and the backlash it generated was a big public lesson in product design, tech implementation and communications, and I thought I should document it here, lest I ever forget it.

    For the uninitiated, Twitter recently removed a setting where you could see responses by people who you follow to other people who you don’t follow. This was primarily done for tech reasons, but the blog announcement did not explain that very well, leading people to think of it as an arbitrary move, resulting in some strong feedback. Twitter however responded quickly with the real story: it was done for technical reasons. Here’s what I learnt:

     

    1) Product Design:

    According to Alex Payne – Twitter’s API lead, only 3% of the users had ever turned on that feature, however, these 3% users are apparently power users, which in Twitter’s world also means that they are also strongly vocal with a large influence base, and they were using the feature to discover new friends. Hence the strong backlash. The lessons are:

    a) If you change a feature, figure out the audience profile it impacts most, and the consequence of that impact. Numbers-wise 3% is low, but what if they are some of your most important and vocal users?

    b) When you do change a feature, figure out if the impact can be lessened by coming up with an alternative. It seems that the Twitter team is now working on this – they could have started the work before going ahead with the decision to remove the setting.

     

    2) Tech Implementation:

    The feature was removed for tech scalability reasons. Let us see what the situation is:

    • User U1 has n1 followers represented by the set S1. One of the followers in S1 is U2 who has in turn n2 followers represented by the set S2.
    • User U1 makes a twitter post – all n1 followers in S1 see it
    • User U2 makes a reply to this post. Now twitter needs to deliver it to people from both S1 and S2:
      • First deliver it to the intersection of S1 and S2 since that is the set which is following both U1 and U2.
      • Now find out (S1 + S2) - (S1 x S2) – this is the remaining set which is not following both U1 and U2
      • In this set find out the 3% users who have opted for receiving replies even when they are not following the user.

    How hard is it to do this? Well you are looking at at least two select queries, where the second one involves a complex join. This is where the stress on the system comes from. Can this not be addressed by caching? Here’s what would need to be cached: the follower list for each user, and the 3% of the total users who have got this feature turned on. It would be ineffective if the follower numbers n1, n2, n3, … for users U1, U2, U3, … were very large and not easily cacheable. For a typical social network like Facebook it would not be very difficult, but Twitter is different: it is asymmetric – you can follow someone without requiring that person to follow you. This results in massive values of n1, n2, n3, … for popular users. How big? According to http://wefollow.com/top, each of the top 10 twitter users has more than a million followers, with Oprah Winfrey at 1.02M and Ashton Kutcher at 1.85M. Clearly not an easy number to cache. What is scarier is that Twitter is growing at a very fast pace – faster than anyone else according to http://blog.nielsen.com/nielsenwire/online_mobile/twitters-tweet-smell-of-success/, and the number of celebrities and businesses using Twitter is growing at a fast pace. According to http://twitterholic.com/, 5 of the top 10 users have joined in the last one year.

    Lesson: when designing features, design with scalability in mind. If it is not scalable, avoid putting it in – you may need to kill it.

     

    3) Communications:

    This has been covered ad nauseam all over the web, so I won’t repeat it here. The moral of the story is that if you do have to kill a feature, the key thing is to be honest about the reasons and be clear in the message leaving no ambiguities. First Facebook learnt the lesson and now Twitter has. Hopefully, we would be able to avoid these mistakes.

    May 01

    Activity Streams 101

    One of the cornerstones of the .PW platform is the Wall - the real time aggregate of activities being done by an entity and its network. So while I am personally rather inactive on the various social networks (way too distracting), the recent announcement by FB on opening up their feed via activity streams led to some analysis and thought – here’s a summary.

     

    The Premise

    • A user carries out several activities across multiple systems – posting items, joining forums, connecting to people, etc.
    • Each of these systems have their own way of capturing, storing and publishing this information.
    • Each system wants to puts various degrees of control on the usage of this information. Twitter makes it freely available, FB puts a wall-garden around it, others fall in between.
    • The user wants to be able to control who can see what information, and exercise his copyright on that information in terms of how it is consumed, used, persisted and further shared.

     

    Challenges

    1. How does one capture the semantic richness of the various types of activities in a simple, precise way.
    2. How does one publish the stream in near real time without going into expensive polling. Why is this imp? Because on Jul 21, 2008 FriendFeed crawled Flickr ~2.7M times for a grand total of 6700 updates. HTTP was not made for Push, and Pull is a resource hog.
    3. How does a user continue to exercise his copyright on his content even as his feed becomes available in a machine readable way and published.

     

    Solutions

    • Challenge #1 is being addressed thru the emerging activity streams standard. More on this in a minute.
    • For #2, activity streams uses Atom. So this can theoretically be layered over XMPP. (Any implementations, anyone?)
    • For #3, there are no straightforward solutions. Twitter makes everything public – as a user you can opt out. FB comes from the other side of the fence – the user opts in for sharing. Copyright enforcement in both cases is contractually enforced.

     

    Activity Streams Standard

    Take a look at the formats of the following public feeds:

    1) http://api.flickr.com/services/feeds/photos_public.gne

    <entry>
    <
    title>death valley 07a</title>
    <
    link rel="alternate" type="text/html" href="http://www.flickr.com/photos/willburn25/3488468568/"/>
    <
    id>tag:flickr.com,2005:/photo/3488468568</id>
    <
    published>2009-04-30T08:40:47Z</published>
    <
    updated>2009-04-30T08:40:47Z</updated>
    <
    dc:date.Taken>2009-04-30T02:40:47-08:00</dc:date.Taken>
    <
    content type="html">
    &lt;p&gt;&lt;a href=&quot;http://www.flickr.com/people/willburn25/&quot;&gt;Shackleton, Jules&lt;/a&gt; posted a photo:&lt;/p&gt;
    &lt;
    p&gt;&lt;a href=&quot;http://www.flickr.com/photos/willburn25/3488468568/&quot; title=&quot;death valley 07a&quot;&gt;&lt;img src=&quot;http://farm4.static.flickr.com/3391/3488468568_b3eea508d4_m.jpg&quot; width=&quot;240&quot; height=&quot;92&quot; alt=&quot;death valley 07a&quot; /&gt;&lt;/a&gt;&lt;/p&gt;
    </content>
    <
    author>
    <
    name>Shackleton, Jules</name>
    <
    uri>http://www.flickr.com/people/willburn25/</uri>
    </
    author>
    <
    link rel="enclosure" type="image/jpeg" href="http://farm4.static.flickr.com/3391/3488468568_b3eea508d4_m.jpg" />

    </
    entry>

     

    2) http://picasaweb.google.com/data/feed/api/all 

    <entry>
      <
    id>http://picasaweb.google.com/data/entry/api/user/isabellechedin/albumid/5284084403719516961/photoid/5330400059699013378</id>
      <
    published>2009-04-30T08:34:36.000Z</published>
      <
    updated>2009-04-30T08:34:36.000Z</updated>
      <
    categoryscheme='http://schemas.google.com/g/2005#kind' term='http://schemas.google.com/photos/2007#photo'/>
      <
    titletype='text'>DSC03419.JPG</title>
      <
    summarytype='text'/>
      <
    contenttype='image/jpeg' src='http://lh6.ggpht.com/_2pkj6pXKPwY/SflinNUUpwI/AAAAAAAABxs/gXHsFzjrylU/DSC03419.JPG'/>
      <
    linkrel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://picasaweb.google.com/data/feed/api/user/isabellechedin/albumid/5284084403719516961/photoid/5330400059699013378'/>
      <
    linkrel='alternate' type='text/html' href='http://picasaweb.google.com/isabellechedin/ExpatriationHongKong#5330400059699013378'/>
      <
    linkrel='http://schemas.google.com/photos/2007#canonical' type='text/html' href='http://picasaweb.google.com/lh/photo/FV74oIogEEchKANsA9dfLg'/>
      <
    linkrel='self' type='application/atom+xml' href='http://picasaweb.google.com/data/entry/api/user/isabellechedin/albumid/5284084403719516961/photoid/5330400059699013378'/>
      <
    linkrel='http://schemas.google.com/photos/2007#report' type='text/html' href='http://picasaweb.google.com/lh/reportAbuse?uname=isabellechedin&amp;aid=5284084403719516961&amp;iid=5330400059699013378'/>
      <
    author>
        <
    name>isa</name>
        <
    uri>http://picasaweb.google.com/isabellechedin</uri>
        <
    email>isabellechedin</email>
        <
    gphoto:user>isabellechedin</gphoto:user>
        <
    gphoto:nickname>isa</gphoto:nickname>
        <
    gphoto:thumbnail>http://lh5.ggpht.com/_2pkj6pXKPwY/AAAArGA7Css/AAAAAAAAAAA/lspX8vyux5o/s48-c/isabellechedin.jpg</gphoto:thumbnail>
      </
    author>
      <
    gphoto:id>5330400059699013378</gphoto:id>
      <
    gphoto:albumid>5284084403719516961</gphoto:albumid>
      <
    gphoto:access>public</gphoto:access>
      <
    gphoto:width>1600</gphoto:width>
      <
    gphoto:height>1200</gphoto:height>
      <
    gphoto:timestamp>1240920712000</gphoto:timestamp>
      <
    exif:tags>
        <
    exif:fstop>3.5</exif:fstop>
        <
    exif:make>SONY</exif:make>
        <
    exif:model>DSC-T100</exif:model>
        <
    exif:exposure>0.04</exif:exposure>
        <
    exif:flash>false</exif:flash>
        <
    exif:focallength>5.8</exif:focallength>
        <
    exif:iso>400</exif:iso>
        <
    exif:time>1240920712000</exif:time>
      </
    exif:tags>
      <
    media:group>
        <
    media:contenturl='http://lh6.ggpht.com/_2pkj6pXKPwY/SflinNUUpwI/AAAAAAAABxs/gXHsFzjrylU/DSC03419.JPG' height='1200' width='1600' type='image/jpeg' medium='image'/>
        <
    media:credit>isa</media:credit>
        <
    media:descriptiontype='plain'/>
        <
    media:thumbnailurl='http://lh6.ggpht.com/_2pkj6pXKPwY/SflinNUUpwI/AAAAAAAABxs/gXHsFzjrylU/s72/DSC03419.JPG' height='54' width='72'/>
        <
    media:thumbnailurl='http://lh6.ggpht.com/_2pkj6pXKPwY/SflinNUUpwI/AAAAAAAABxs/gXHsFzjrylU/s144/DSC03419.JPG' height='108' width='144'/>
        <
    media:thumbnailurl='http://lh6.ggpht.com/_2pkj6pXKPwY/SflinNUUpwI/AAAAAAAABxs/gXHsFzjrylU/s288/DSC03419.JPG' height='216' width='288'/>
        <
    media:titletype='plain'>DSC03419.JPG</media:title>
      </
    media:group>
      <
    gphoto:albumtitle>expatriation à Hong-Kong</gphoto:albumtitle>
      <
    gphoto:albumctitle>ExpatriationHongKong</gphoto:albumctitle>
      <
    gphoto:albumdesc/>
      <
    gphoto:location/>
      <
    gphoto:snippet/>
      <
    gphoto:snippettype>PHOTO_DESCRIPTION</gphoto:snippettype>
      <
    gphoto:truncated>0</gphoto:truncated>
    </
    entry>

    While both of them are about public photos, the formats are vastly different since they come from two different providers. So a potential system that wants to aggregate photo updates across multiple providers needs to understand each vendor’s format separately. Efforts like Yahoo’s Media RSS extensions seek to address this.  (The picasa feed above uses the Yahoo extensions but the Flickr feed does not, which is strange considering it is a Yahoo property). Another example is Apple’s iTunes RSS extensions. However, these are RSS extensions, and what we really need in this space is an Atom extension. (why? see http://blog.unto.net/work/on-rss-and-atom/). An example is http://martin.atkins.me.uk/specs/atommedia.

    However, this is only about photos which is just one of the activities. When we start considering the possible list of activities, the problem scope becomes extremely large. This is where the Activity Streams effort comes in. Consider the following activity: “Vineet posted new photographs to Agastya on Facebook 6 hours ago.” This can be broken down as:

    • Vineet = actor
    • posted = verb
    • new photographs = item / article
    • Agastya = social object (album here)
    • Facebook = site
    • 6 hours ago = timestamp

    Other possible fields here can be:

    • User Agent / tool (example Thwirl)
    • Location
    • Mood
    • Title
    • Summary
    • Detail
    • Link
    • Verb collection
    • Actor collection
    • Object collection
    • Comments (points to another object)
    • The producer of the content may define the default viewing mechanism which can be used by a consumer

    To capture the above, the following draft specs are currently in place.

    1. Atom Activity Extensions defines Actor, Verbs, Objects, Title, Summary, Detail, Link, etc.
    2. Atom Activity Base Schema defines the semantics of various Verbs and Objects, Location and Mood
    3. Atom Media Extensions defines how typical media – videos, photographs, etc. should be described. So in the above example, if we wanted to enclose the actual photographs of Agastya, we would need to describe the actual link, the preview thumbnail, image type, height, width, etc.

    Note that this is work in progress and if you read the drafts, the gaps are fairly obvious. The big ones (I think) are:

    • As activities get republished, a downstream consumer may end up dealing with duplicates. This can be addressed by a combination of a URI + per item identifier
    • Some activities may be in response to other activities (classic case being video responses on YouTube) – the relationship would need to be captured
    • Instead of a single actor-verb-object, we may have collections of actors, verbs and objects. Examples being multiple actors on a single object (wiki editing) or single actor, multiple objects (uploading multiple pictures) or combinations.

     

    Activity Streams Implementation

    1. MySpace: http://wiki.developer.myspace.com/index.php?title=Standards_for_Activity_Streams is compliant to the current draft specs. Also, their documentation is simple to read, understand and use.

    2. Facebook: http://wiki.developers.facebook.com/index.php/Using_Activity_Streams. For one, the stream only consists of user generated content and not app generated content. Second, they have a model where you need to prompt the user for permission to access the stream. This has two repercussions:

    • This requires the user to be logged on to Facebook
    • You can only show the user his own data. So if I import the FB feed on Friendfeed, the subscribers to my Friendfeed feed would not get to see the FB feed. This could have been possible had apps like FF could have persisted FB data, but FB TOS prohibit caching data for more than 24 hours.

    Note that the Activity Stream standard is just one way of accessing the FB stream. They also provide a XML/HTTP and a FQL API. See Using the Open Stream API for details. The restrictions however stay in place.


    So in summary, while the Activity Stream standard effort is quite exciting, the current FB implementation is not. All one can use it for is building clients which show the user her own data. The real value would be in letting apps re-publish the data without violating the privacy needs of the user. As of now, the wall garden is very much in place.

     

    Further Reading


    April 21

    Authentication Mechanisms

    After publishing my previous post, I had thought that I would not be coming back to Crypto for a while. However, today evening Sebastiaan posted on SCRAM on one of the Directi mailing lists, and I got compelled to write down this one.

    Authentication in Cryptography has two aspects: data authentication and entity authentication. Data authentication is addressed using HMACs and Digital Signatures (discussed in my previous post). What we are talking about here is entity authentication: how does one entity get to know that the other entity is actually what is claims to be. This in turn has two aspects:

    1. Alice and Bob are communicating in “real time” – the typical way of authentication here is by Alice issuing a challenge to which only Bob can respond.
    2. Alice and Bob are sending messages to each other which are delivered after an appreciable delay since a message may be stored and forwarded thru multiple devices. Here since Alice can’t wait for Bob to respond to the challenge, the interest is only in validity of origin of data.

    Here we are talking about the first kind of authentication. The goals of an entity authentication / identification protocol are as follows:

    1. Bob should be able to successfully authenticate himself to Alice.
    2. A third party should not be able to impersonate Bob to authenticate itself as Bob to Alice
    3. Alice should not be able to use the data used in authenticating Bob to impersonate Bob to a third party.

     

    Passwords

    A password is a shared secret between two entities. By sending across the shared secret, one entity can prove to the other that it is indeed the possessor of the secret and hence the valid entity. However, sending the shared secret across as a mechanism for authentication is fairly weak. Here are some inherent weaknesses in the scheme:

    1. Passwords are easy to guess (or brute force) since they are unlikely to be very long or use a random sequence or be chosen from a large alphabet set.
    2. Passwords do not change frequently so brute forcing them becomes easier since there is a huge time window available.
    3. A password is a shared secret between the two entities. Any entity that is able to intercept the password being sent by Bob over the wire can now use the password to authenticate itself to Alice.
    4. In systems that store password in clear text, this is even worse, since privileged insiders or administrators now have access to the shared secret.

     

    Some of these issues can be addressed to some extent thru the following:

    1. Enforce a password policy that requires a long length, and inclusion of numbers or special symbols. This makes a brute force / dictionary / exhaustive search attack harder, but not impossible.
    2. Make the user change the password frequently – while good in principle, in practice this does not help in a huge way since most password changing policies provide a lot of time (30-45 days) for a dedicated hacker.
    3. Encrypt the password, preferably using a one-way function so that even if the algorithm, key and the Ciphertext are all available to the privileged user, he cannot get to know the shared secret.
    4. Salting the one way function by using a randomly generated number as a key also makes a dictionary attack harder.

    However, the biggest drawback of the fixed password approach is that it is open to a replay attack. So it cannot be used over an open network, but must require a trusted connection.

     

    Challenge Response Identification

    The idea of a Challenge  Response scheme is simple: Both Alice and Bob have a shared secret. To authenticate Bob, Alice makes him a challenge. To respond to the challenge, the shared secret is required which only Bob has. Thus by sending the response, Bob proves that he is the possessor of the shared secret without ever sending across the shared secret. There are multiple ways of doing challenge response:

     

    Symmetric key Challenge Response using Encryption

    Here the assumption is that Alice and Bob exchange the shared secret a priori. Let’s  suppose the shared key is K. Then:

    1. Alice generates a random number n1, and sends it to Bob.
    2. Bob applies the encryption algorithm to generate n2 = EA(n1, K) and sends it to Alice
    3. Alice apples the decryption algorithm to compute n3 = DA(n2, K)
    4. If n1 = n3, Bob indeed owns the shared secret and is who he claims to be.

    The EA is typically a function like modular exponentiation which is easy to compute, but hard to reverse. Also, data integrity must be ensured in all transmission by using a HMAC.

     

    Symmetric key Challenge Response using One-Way functions

    In case the usage of an Encryption / Decryption routine is infeasible (because of computing cost maybe), it is possible to achieve the same using one-way functions as well:

    1. Alice generates a random number n1, and sends it to Bob.
    2. Bob applies the one-way function H to generate n2 = H(n1, K) and sends it to Alice
    3. Alice also applies the one-way function H to generate n3 = H(n1, K)
    4. If n1 = n3, Bob indeed owns the shared secret and is who he claims to be.

     

    Asymmetric Challenge Response

    1. Alice generates a random number n1 and sends it to Bob
    2. Bob encrypts the random number n1 using his private key to generate n2 and sends it over to Alice
    3. Alice decrypts n2 using Bob’s public key and obtains n3
    4. If n1 = n3, Bob indeed owns the public key under his name

     

    Zero Knowledge Protocols

    Since in all the cases above, Bob responds with some response based on the secret key, there is a theoretical risk of information leakage. Zero knowledge protocols seek to address this by enabling a claimant to demonstrate knowledge of the secret without revealing any information whatsoever.  The basic idea is that the claimant must answer two questions: one involving demonstrating the knowledge of the secret, and the second is a simple question to prevent cheating. The series of steps is repeated several times (typically 40) to reduce the probability of cheating, and only if all answers are correct is the entity authenticated. The math involved in doing this is detailed but straight forward, and it is also easy to understand how the protocol works without revealing any information. You can read about it any standard cryptography textbook.

     

    SASL Mechanisms

    With that backgrounder, let’s come back to the issue of SCRAM-MD5 where we started. Basically SASL is a framework for entity and data authentication supporting multiple mechanisms including anonymous, plain-text, one-time-password, CRAM, Digest and now SCRAM. Let’s quickly examine these (please do not take the descriptions below as accurate – they are greatly simplified):

    1. Anonymous: As the name suggests, the purpose of the mechanism here is for an entity to be able to use the service without proving its identity. The claimant basically sends a single string indicating that it wants to access the service anonymously. The server if it has that capability (which typically is checked by the client before sending the anonymous auth request) allows the entity to use the service anonymously.
    2. Plain-text: The plain-text mechanism is expected to be used over a secure pipe provided by a lower layer which takes care of confidentiality and integrity. Basically, the client sends over the username and password in plain text over TLS. The problem here is that in the absence of a secure lower layer, the entity can be easily compromised. Also, there is no mutual authentication.
    3. OTP: One time passwords are a simple concept:
      • The user provides the password.
      • The client receives a non-secret seed from the server and appends the password with the seed
      • The result is passed thru a hash function (typically MD5, SHA1 and MD4 also allowed) and then reduced to 64-bits. This result is called S
      • This secret S is now run thru the hash function N-times, then N-1 times, N-2 times, and so on to produce multiple one time passwords.
      • Every time authentication needs to be done, the OTP along with the sequence number and the hashing function used are send over to the server
      • The adversary cannot invert the hash function and hence cannot recover the original password
      • The server applies the same hash function the same number of times and if it gets the same data, authenticates the user.
      • Good: makes replay attack difficult
      • Bad: Server stores password, no mutual auth. Open to dictionary attack. Hard to maintain sequence in the event of communication failure. Open to a pre-play attack.
    4. CRAM:
      • Server sends a challenge string
      • Client responds with username followed by a message digest created by hashing the challenge string using the password as a key
      • Server performs the same computation on its side, and if it obtains the same message digest, authenticates the user
      • Good: Not open to replay. No disclosure of password
      • Bad: No mutual auth. Open to a dictionary attack if a CRAM exchange is captured. Server needs to store passwords (hopefully encrypted).
    5. Digest:
      • Server stores one way hash of username and password H1 = Hash(username, password)
      • On request of a resource, server calculates hash of resource and the method required to access it H2 = Hash(resourceURI, method)
      • Server sends a nonce to client
      • Client calculates H1 = Hash(username, password), H2 = (resource URI, method)
      • Client generates Response = Hash(H1, nonce, H2) and sends it back to the server
      • Server does the same calculation at its end and if both are identical, the user is authenticated.
      • Good: Server does not store password, but a one-way hash
      • Bad: Usage of MD5 as a hashing function. Many security settings are optional. Open to a man in the middle attack.
    6. SCRAM: It is tedious to describe the protocol in brief since it involves multiple repeated computations. Best to read the draft. The key improvements over Digest and CRAM, etc. are:
      • Server stores salted one way hash of password
      • Mutual authentication
      • Channel bindings
      • Can be used as a GSS API mechanism as well. GSS API is like SASL – a generic security framework.
      • Server cannot impersonate the client to other servers
      • Permits the use of a server-authorized proxy without requiring the proxy to have super-user rights with the backend server

    I am mentioning the last two points in good faith. Have not studied the protocol deeply enough to arrive at how these two conclusions are valid.

    April 20

    Crypto 101

    Most developers whom I have come across, lack a solid grasp of the fundamentals of cryptography.  When a developer who does not understand crypto needs to use crypto, several things can go wrong:

    •    Not understanding the implications of using some crypto technology in the code
    •    Not realizing where to use crypto
    •    Not implementing crypto correctly and hoping that the implementation is correct
    •    Not implementing crypto correctly, but feeling secure because “we have used cryptography”
    •    Not using crypto at all

    The unfortunate part to this whole situation is that Cryptography is not hard to understand and most of the perceived complexity is in areas where an application developer would typically never venture: designing an algorithm or a protocol, cryptanalysis, etc.  In most cases, one can use one of the modern crypto libraries and go by with a minimal level of understanding of the principles involved. The aim of this post is to provide that minimum background required to start using crypto effectively in day to day work.

     

    1. Introduction
    While Cryptography can serve in multiple ways, its most common usage – which also illustrates most of its principles – is secure communication. To understand the notion of secure communication, let’s bring in our cast:

    This story is about two nice, genuine people - Alice and Bob - who are in separate locations and want to have a conversation. The third character in our story is Eve who does not like Alice and Bob and is their adversary. She wants to intercept this conversation, find out what Alice and Bob are talking about, and if possible, disrupt the conversation by modifying the messages between them or by impersonating Alice or Bob.

    Now Alice and Bob are aware of Eve and to prevent her from accomplishing her purposes, they set up a pipe which goes from Alice to Bob. Whatever message Alice drops in the pipe at her end would be received by Bob at the other end and vice versa. The pipe is opaque and indestructible so Eve cannot look at the messages nor modify them.

    Such a pipe or channel then has the following properties:

    •    Confidentiality: Contents of messages passing between sender and receiver cannot be seen by the adversary since the pipe is opaque.
    •    Integrity: Contents of messages cannot be modified since the pipe cannot be penetrated
    •    Authentication: The pipe, being indestructible, does not allow anyone else except Alice and Bob to send messages. So Eve cannot masquerade as Bob.
    •    Non-Repudiation: Since only Alice and Bob can send messages on the pipe, when Bob receives a message on the pipe, Alice cannot deny that she had not sent that message.

    The purpose of cryptography is to establish this kind of a secure channel over a public network like the Internet.

     

    2. Confidentiality using a Shared Secret
    Traditionally the problem of making communication confidential over an insecure transport has been solved by Private-Key Encryption [also called Symmetric Encryption]. The idea is simple: Before exchanging messages, Alice and Bob meet in person and decide on the following:

    • An Encryption Algorithm (EA)
    • A Decryption Algorithm (DA)
    • A Shared Secret – a key (K)

    [Note: Alice and Bob do not care if Eve gets to know EA or DA, as long as K remains a secret. This is a very important point in modern-day cryptography: we assume that the Adversary has the knowledge of how the algorithm works. This is called Kerckhoffs’ Principle. Mechanisms that are based on keeping the working of an algorithm secret are obscure, not secure. Such mechanisms are of historical interest only and are inadequate for today’s world.

    Some people compare this with government agencies (like NSA in the US) not publishing their mechanisms. While this is true, such non-disclosure does not imply that the algorithms used by agencies like NSA are designed on the basis of algorithm obscurity. Security agencies in governments (or large IT companies creating security technology) hire top-notch cryptographers who are well aware of the Kirchhoff's Principle and design with the assumption in mind. Obscurity then is either an extra defense (however weak that maybe), or a way of protecting intellectual property rights on the mechanism. Bottom-line: modern-cryptography assumes that the attacker has knowledge of the working of the algorithm and only the key is secret.]

    Now to send messages securely, Alice takes the message (called Plaintext in this context) and generates an encrypted message out of it (called Ciphertext) by using the Encryption Algorithm EA and the Secret Key K. Thus:

    Ciphertext = EA(Plaintext, K)

    This Ciphertext is now sent to Bob over the public network. On receiving the Ciphertext, Bob uses the Decryption Algorithm (DA) and the Secret Key (K) to generate the Plaintext back. Thus:

    Plaintext = DA(Ciphertext, K)

    Eve does not have the Secret Key (K) and is therefore not able to obtain the Plaintext. We have achieved confidentiality by sharing a secret.

     

    2.1 Substitution Ciphers
    The simplest way of understanding how private-key encryption works in practice is to consider an elementary algorithm - The Substitution Cipher. A Substitution Cipher works as follows: There is a message M that needs to be sent. To encrypt the message, characters in blocks of n characters are taken, and substituted by a different set of characters picked up from the Alphabet, differing from the original set by a function of K.

    The idea is not new. It’s most famous usage was by none other than Julius Caesar who replaced each character in the message by a character three characters ahead (shifting by 3) in the Latin Alphabet. This is called the Caesar-Cipher. Given that most of his enemies were not even literate, let alone thinking of breaking ciphers, it would have been fairly secure. Even as late as 1915, the Russian army was using the Caesar cipher since their troops could not learn to use more complicated ones. A trivial modern day implementation is called ROT13 (Rotation 13 – rotation by 13 positions). Since the English alphabet is 26 characters, ROT13 is its own inverse, i.e., the decryption and encryption algorithms are the same rule because of the choice of the key (13). ROT13 is used by people asking questions / puzzles, etc, on the Usenet. The standard Usenet client has ROT13 built-in.

    What can Eve – the adversary – do about breaking a substitution cipher being used by Alice and Bob? Cryptanalysis - analysis of a Cryptographic Scheme with the goal of breaking it – of a substitution Cipher is rather straightforward, and by analyzing a few instances of Ciphertext produced by a substitution cipher, a cryptanalyst can easily break the cipher.

    One of the simplest techniques that Eve can use for this purpose is Frequency Analysis. This is based on the fact that in any language some letters are used more than others. (In English the approximate sequence is ETAOIN SHRDLU for the top 12 characters). So by analyzing the frequency of the alphabets appearing in the Ciphertext, the Plaintext can be figured out. The earliest use of this technique was by the Arabs in the 9th century who realized that certain characters appear more frequently in the Koran than others. Several of the ciphers used in World War II were also broken using frequency analysis.

     

    2.2 Transposition Ciphers
    We know that substitution ciphers are not strong enough. So what are our other choices? There is another technique called the Transposition Cipher, where the characters are permutated, i.e., positions of characters are interchanged. Transposition Ciphers have also been known for a very long time and one can find a transposition cipher every day in a newspaper. We know it by the name “Anagram.” An anagram is a very simple form of a transposition cipher, and has more recreational value than cryptographic. Nonetheless, anagrams have been used by linguists and intellectuals throughout history for amusement or for hiding messages, and provided a powerful plot device in Dan Brown’s The Da Vinci Code.

    Anagrams aside, the earliest example of a Transposition Cipher being used for a security purpose is a Scytale – Greek for a baton – a device used by the ancient Greeks to communicate messages during a military campaign. The working was really simple: a strip of paper was wound around a cylinder of a known diameter and a message written on the paper. The paper would then be peeled off and sent thru a messenger. The recipient would have a similar cylinder on which it would wind the paper again to read the message. Unless the recipient’s cylinder was of the same diameter, the letters would not come together properly and the message would not make sense. However, manual analysis of the letters is straightforward – letters after a specific distance need to be put together to make sense out of the message, and the strip of paper is a big giveaway.

     

    2.3 Product Ciphers
    Transposition Ciphers or Substitution Ciphers are both open to fairly simple Cryptanalysis, but a combination of a Transposition operation with a Substitution operation leads to a much stronger cipher. For example, a Product Cipher, alternates transposition and substitution, and carries out this core function over several iterations. The ciphers used in World War II (ENIGMA by the Germans, PURPLE by the Japanese) were Product Ciphers. Modern-day Product Ciphers include DES as well as AES (a.k.a. Rijndael). DES uses a 64-bit key and alternates 16 substitutions with 15 transpositions. AES uses a 128-bit key and alternates 10 substitutions with 10 transpositions. AES is stronger.

     

    2.4 Modes of Operation
    Depending on how a Symmetric encryption algorithm deals with data, we can define two distinct categories for symmetric algorithms: Block Ciphers and Stream Ciphers. In a Block Cipher, data is taken in blocks (typically 64-bit or 128-bit block) and the algorithm encrypts / decrypts the block as a whole. A Stream Cipher operates on a stream of data and encrypts / decrypts the data one-bit at a time.

     

    2.4.1 Stream Ciphers
    A Stream Cipher uses the bits in the secret key to perform some operation with each bit in a stream to generate Ciphertext. For example, a simple operation would be to XOR each bit in a key with each bit in the Plaintext stream. However, it is highly unlikely that the key would be as long as the Plaintext (see “One Time Pad” in section 3.1 for a discussion), so what typically happens is that the Secret Key is used to generate a pseudo-random bit-stream called keystream, and the keystream is then used to perform some operation with the Plaintext stream.

    The keystream is generated using some internal state in the algorithm. For higher security, this internal state needs to be updated continuously. Based on how this state change takes place, Stream Ciphers can be classified as:

    1.    Synchronous Stream Ciphers: The internal state of the cipher changes independent of Plaintext or Ciphertext.

    2.    Self-Synchronizing Stream Ciphers: The internal state changes based on the previous Ciphertext digits.

    Since the keystream itself is not really random, the bit sequence in the stream would repeat itself after a period. The longer the period, the more secure the algorithm.

     

    2.4.2 Block Ciphers
    A block cipher operates on a pre-defined block-size. Typically the block-size is 64-bits or 128-bits. So for a block cipher, the biggest question is how to deal with messages longer than the block-size. There are various modes of operation which a Block Cipher can employ. The main ones are:

    1.    Electronic Codebook (ECB): This is the simplest mode. The message is split into parts and each block is encrypted / decrypted separately. This is insecure and not recommended since it is subject to two simple attacks:
    •    Stereotyped Beginnings and Endings: Messages have typical headers and footers and that knowledge can compromise the key.
    •    Replay Attack: The adversary does not know the Plaintext or the key, but she simply intercepts the Ciphertext and forwards it to the receiver. An example where this can hurt is authentication.

    2.    Cipher Block Chaining (CBC):  In this mode, each block is XORed with the previous block before encryption. It is very commonly used. A variation on CBC is called Propagating Cipher Block Chaining (PCBC). This mode is used in Kerberos.

    3.    Cipher Feedback (CFB), Output Feedback (OFB), Counter (CTR): All of these modes convert a block-cipher into a stream-cipher by taking the output from the previous block. CFB converts a CBC into a self-synchronizing stream cipher. OFB and CTR makes a block cipher a synchronous stream cipher.

    All modes except ECB use output from the previous block to operate on the next block. This leads us to the question: How is the first block processed? To solve this problem, a dummy block, called an Initialization Vector (IV) is used. The IV need not be secret, but the same IV should never be used with the same key.

     

    3. The Chimera of Perfect Secrecy
    I mentioned that Product Ciphers are stronger as compared to Transposition and Substitution Ciphers. But on what basis does one make a statement like that? Is there a way of figuring out how secure an encryption is, or how difficult it is to break it? Till recently, most cryptography was ad-hoc without any solid theory behind it. This changed in 1948 when Claude Shannon - the Father of Information Theory and Modern-Day Cryptography - published A Mathematical Theory of Communication (I tried reading it and lost my way by the 3rd page, so this really is the interpretation of other people – unlikely to be wrong since a lot of very learned people say the same thing). What he showed was that a secure encryption system could exist only if the size of the Secret Key was as large as the total size of the messages that would ever be exchanged using that secret key.

    3.1 One Time Pad
    An example that illustrates Shannon’s principle is the One Time Pad (OTP). The idea is this: suppose Alice has Plaintext of length N characters. She now generates a key randomly, also of length N and uses it to generate the Ciphertext. The key is used only once, and for the next message, a new key is generated, again equal to the length of the message. This is mathematically proven to be unbreakable. (Since the key can be used only once, it is called “One Time”, and “Pad” because the key used to be on a slip of paper torn from a Pad on which random text was printed).

    The problem with OTPs is key distribution – it is logistically tough and prone to security breaches. So while OTPs in themselves are perfectly secure, their implementations are vulnerable because of the insecurity involved in key distribution. OTPs are typically never used, except in highly critical applications where the parties involved are willing to undertake the key exchange challenge.

    The reason I brought up OTPs is to point out that perfect security – that can defeat unlimited computing power - is not always practical. Modern-day Cryptography does not chase this chimera. Its goal is not to provide security in the face of infinite computing capacity. Instead, it assumes that the adversary has (reasonably) limited resources. What makes modern-day Crypto work is the gap between the computational ability it takes the legitimate user to encrypt vs. the computational infeasibility of decryption for the adversary who does not have the secret.

    3.2 Computational Complexity Theory
    Whether or not a computation is feasible can be found out if we know the cost involved in making the computation. The discipline of computer science which deals with cost of computation is Computational Complexity Theory. The theory measures the cost of making a computation in terms of the space (quantity of information storage) and time it takes to make the computation. Here it is important to note that when we think of computation, we are not thinking about a specific computation model like the Von Neumann Architecture on which modern PCs are based, or a Turing Machine, but any arbitrary model.

    3.2.1 Algorithm Efficiency
    Before we get deeper into the theory, a quick word about algorithm efficiency: An algorithm’s efficiency is defined as the amount of the time it takes the algorithm to compute the problem, as a function of the problem size. (The problem size measures the size of the input to the program. For different problems, problem sizes are differently defined, and getting it right takes some familiarity with complexity theory.) So for example, the time taken could be:

    T(n) = n3 + n2 + 1

    Where n = problem size.

    Now as n gets arbitrarily large, the value of (n2 + 1) becomes insignificant as compared to n3. So T(n) = n3 for very large n. Such an algorithm then has complexity of the “order of n cube”, and the algorithm is called a O(n3) algorithm.

    An algorithm is considered to run in polynomial time if T(n) = O(nk) where k is a constant. Example include

    •    Constant time: O(1) [k = 0].
    •    Linear time: O(n) [k = 1]
    •    Quadratic time: O(n2) [k = 2]
    •    Cubic time: O(n3) [k = 3]

    An algorithm running in polynomial time would be more efficient than an algorithm running in exponential time: T(n) = O(kn)

     

    3.2.2 Complexity Classes
    In general, complexity theory looks at the universal set of problems in terms of what are called Complexity Classes. A complexity class is a set of problems that are of related complexity, as in, the algorithm efficiency for solving that set of problems can be expressed by T(n) = O(f(n)).  As can be expected, there are a lot of complexity classes, but the ones of biggest interest to us are called P and NP classes.

     

    3.2.3 Class P Complexity
    A problem that can be solved in polynomial time is said to have Class P Complexity. A number of problems fall under Class P complexity:

    • Finding the greatest common divisor
    • Determining if a number is Prime
    • Linear Programming
    • Shortest Path
    • Sorting
    • Modular Exponentiation MOD(x, y, z) = xy  MOD z

    From the perspective of Complexity theory any algorithm that runs in polynomial time is considered as being efficient (or feasible), and any algorithm that does not run in polynomial time as being intractable or infeasible. So any computation in the class P-Complexity is considered efficient. However, this is only in theory, it may not be so in real life. For example if the value of k is 10000, T(n) = O(n10000) is certainly a very inefficient algorithm.

     

    3.2.4 Class NP Complexity
    Put informally, problems of class NP complexity can be defined as ones having a verifier function that can be solved in polynomial time. To understand this better, consider the problem of finding out whether a number is composite or not. Doing this for large numbers is really hard, with T(n) = O(kLog n) for the most efficient algorithms. However, once we have a candidate factor of n (say x), finding if x is actually a factor or not is trivial:

    bool IsFactor(int n, int x)
    {
        if (n % x != 0 || x == 1 || x == n) return false; else return true;
    }

    Since the verifier function (IsFactor) is a class P algorithm, the problem of whether n is composite or not is a NP problem. All NP problems have a verifier function which accepts an input and a verification value, which can be computed in polynomial time. A number of problems are Class NP Complex, the most common being the Factoring problem and the traveling salesman problem.

    The hardest problems in Class NP complexity are said to be NP Complete. An example is the problem of multi-processor scheduling: Find the minimum time required to schedule n jobs (of varying lengths) on m processors such that no two jobs overlap.

     

    3.2.5 Is P = NP?
    Consider the case where a P class problem also has a P class verifier. Would this problem be a class NP problem or a class P problem? The simple answer is P is a subset of NP. So every P problem is a NP problem, but the reverse may not be true. However, some researchers believe that any problem whose verifier can be solved in polynomial time, can itself also be solved in polynomial time. In short, P = NP.

    Whether P is equal to NP is one of the most important unsolved problems in modern mathematics and if it can be proven that all class NP problems are actually class P problems, the implications would be stunning, with a huge impact on our society (for starters, most of modern-day Cryptography would be rendered useless)

     

    3.3 Information Leakage
    While the computation of the Plaintext from the Ciphertext may be infeasible (NP Complete), what if it is feasible (P Complex) to compute part of the Plaintext? For example, what if the length of the Plaintext can be known? Does that compromise the security of the message? Assume a situation where Eve expects Bob to ask Alice whether she would marry him. The answer which Alice would return is a simple yes / no, represented by a single bit. Now if the Eve is looking at the Ciphertext going along the wire, and discovers a single unit moving across, she may deduce that Alice has replied to Bob’s question.  What now is the probability of Eve finding out the correct answer? 50%. Much higher than what a secure communication may require.

    So from a Cryptography standpoint, when we look at computational feasibility, we need to look at it not just from the perspective of the entire data becoming available, but also how feasible it would be to recover part of the information.

     

    4. Cryptographic Primitives
    Modern-day cryptography - based on computational complexity – is built using certain Cryptographic Primitives. What is a primitive? A primitive is an atomic operation (from the perspective of the layer above it), a building block, which does a certain defined task. Cryptography requires primitives that have certain computational hardness. All computational complexity in Cryptography comes from these underlying primitives. Understanding the primitives is the key to understanding Cryptography. While you would not have to typically deal with how Protocols are constructed, a basic understanding is important to get the fundamentals right.

     

    4.1 One Way Functions
    The simplest of the primitives is a One-Way Function. A one-way function is any function that is easy to compute but hard to invert. A mathematical function that is hard to invert is multiplication: Given integers P and Q, N = P x Q can be easily calculated, but given a random N, how would one factor it into P and Q? Sure, we can produce all the factors. But getting P and Q specifically is probabilistic. The higher the value of N, the lower is the probability of getting P and Q back quickly and consistently. This is called the factoring problem, and it is a good example of a function that is efficient to compute, but infeasible to invert.

    Today we have no rigorous mathematical proof that what we consider one-way functions, are indeed one-way. What that means, is that today we are not sure whether there exist inversion methods that have lower computational complexity than the ones we are aware of. Just because a lower-complexity inversion has not yet been discovered, does not mean that they cannot exist. So cryptography is built on assumptions. So far the assumptions have been valid.

     

    4.2 Hash Functions
    One of the most important one-way functions is a Hash Function. A Hash function takes an arbitrary number of bits as input and produces a smaller number of bits based on the input. The algorithm used is deterministic (given the same input, it would always pass thru the same steps and produce the same output), so for the same input bits, it would always generate the same output bits. In context of hash functions, the input is often called a message and the output is called a message digest or a digital fingerprint. So:

    Digital Fingerprint / Message Digest = HashFunction(Message)

    A hash function is obviously one-way. That’s because a hash function is lossy – the number of output bits is less than the number of input bits so there is not enough information in the output to calculate the input. However, this alone would not make hash functions useful. What makes hash functions useful is that they are collision resistant and create an Avalanche effect.

    A collision is a situation where the hash function generates the same output for two different inputs. A good hash function is collision resistant. Once again, if you consider the fact that the number of output bits is less than the number of input bits, it is obvious that there is no way of avoiding a collision. A simple way of approaching this is to think about the input as a 64-bit number and the output as a 16-bit number. Now there are 18,000 trillion 64-bit numbers (264 = 18,446,744,073,709,551,616) 64-bit numbers and 65,536 16-bit numbers. As we keep reducing each number in the 18,000 trillion-range to 65,536 numbers, there are bound to be collisions.

    So a hash function cannot be collision resistant. However, we can reduce the probability of collision if we choose a large enough output range and a good algorithm. For cryptographic purposes, we define collision resistance as: It should be computationally hard to find two inputs m1 and m2 so that their output h = HashingFunction(m1) = HashingFunction (m2). The other important property for a cryptographically secure hashing algorithm is that it should be Pre-Image Resistant: Given an output h, it should be computationally hard to find an input m so that h = HashingFunction(m).

    The other desirable property of a good hash function is what is called the Avalanche Effect. The idea is that if the input changes even by a small amount, the output changes dramatically. The Strict Avalanche Criterion states that if one bit is flipped, each of the output bits should change with a probability of 0.5, in other words, half of the output bits should flip.

     

    4.3 Random Numbers
    The other important family of Cryptographic primitives is Random numbers and their generators. Recall Kerckhoffs’ Principle – the only secret is the key, not the algorithm. Now, we may keep the key a secret, but what if someone guesses the key? So we want to make the key harder to guess. A “guess” can even be an automated attack where the program tries all possible permutations. (This is called a brute-force attack) To reduce the possibility of such an attack from being successful, there are two things that are needed:

    1. The key should be long. The longer the key, the stronger the encryption
    2. The key should be unpredictable. The more random a key, lesser are the chances of it being predictable.

    Point one is easy to understand, but point two is a little ambiguous. What is “more random?” Without getting into a philosophical discussion on randomness, I would just like to say that a random number is one which is generated in such a way that its probability of being generated is as high / low as any other number, and there is no way of determining when the number would be generated.

    Generation of random numbers depends on the initial input on which a mathematical calculation would be performed. But unless that initial condition has randomness in itself, the output would also not be in random. This initial randomness is called an Entropy Source. Based on the entropy source, Random Number Generators (RNGs) can be classified as:

    1. RNGs based on Deterministic Random-bit Generators (DRBGs): DRBG means that the source of entropy is deterministic. Modern computers are all deterministic (as compared to quantum computers which are non-deterministic). If the entropy source is deterministic, the numbers generated cannot really be random. This means that a software-based generator does not generate truly random numbers. Hence we call the numbers generated by such a generator Pseudo-Random Numbers, and the generator itself is called a Pseudo-Random Number Generator (PRNG). All software based RNGs are PRNGs.

    2. RNGs based on Non-Deterministic Random-bit Generators (NDRBGs): NDRBG means that the source of entropy is non-deterministic. Such RNGs are typically hardware based and use a physical phenomenon as a source of entropy. Examples of physical phenomena used by NDRBGs include photoelectric effect, thermal noise, or some other quantum phenomena. Quantum phenomena in theory are unpredictable and therefore a reliable source of entropy. We call the numbers generated by such a generator Truly-Random Numbers, and the generator itself is called a True-Random Number Generator (TRNG). TRNGs are rather expensive and are needed only for very high security scenarios.

    How does one find out if a PRNG is good enough to be used in Cryptography? A random number generator would typically generate numbers in a given range. This leads to two important properties:

    • The distribution of values of the numbers generated in the range. The more even the distribution, the better.
    • Ease of prediction of numbers based on a given set of numbers from the generator. The more difficult the prediction, the better is the PRNG.

    In order to have these properties, a PRNG needs two things:

    1. A random bit stream from a source of entropy. There is some entropy even in a deterministic computer. Examples include Clock Tick count, and values of System Handles (like Thread handle, Process handle, Network Card Id, etc.)
    2. A mathematical calculation that removes the bias from a bit-stream and makes it more randomized. Examples include hashing algorithms like MD4, SHA, etc.

    For the second point, there is a standard by a body called FIPS (Federal Information Processing Standard) which can be found in FIPS 186-2.

    A function catering to the above two requirements is called a Cryptographically Secure Pseudo Random Generator (CRNG). Typical random number generators like the C++ rand() are not CRNG and they should not be used in Cryptography.

     

    5. Confidentiality without using a Shared Secret
    Coming back to the problem of message exchange between Alice and Bob, the situation so far is as follows:

    • Prior to exchanging messages, Alice and Bob agree upon a large random key. The longer and more random the key, the more secure the encryption. The key is private to Alice and Bob – it is a shared secret between them and not disclosed to a third party.
    • The sender then uses a sophisticated encryption scheme (like a Product Cipher) to encrypt the message. The strength of the algorithm being determined by how efficient the algorithm is for the sender / receiver to execute, but computationally hard for Eve (the Adversary) to invert. The bigger this difference, the better the algorithm.
    • To decrypt the message, the recipient uses the secret key to reverse the computation carried out to encrypt the message. The shared Secret makes the reversal of the computation efficient for the recipient and gives him the computational advantage over the Adversary.

    The only problem in this situation is the initial condition – the sharing of the secret key. Since the security of the scheme derives from the key (recall Kerckhoffs’ Principle), its sharing in a secure way is a pre-requisite. The question is how does one share the secret in a secure way without meeting in person? There has to be some way of transferring the key in a secure way too. But wouldn’t secure transmission of the key also involve some sort of encryption? And in that case, wouldn’t we need a key once more? The problem seems to be recursive. But we do know that the problem has been solved, and the key to solving the problem (pun unintended), lies in the symmetry of the situation.

     

    5.1 Diffie-Hellman Key Exchange
    The first time a solution was published for solving the problem of key sharing was in 1976 when Whitfield Diffie and Martin Hellman proposed The Diffie-Hellman Key Exchange:

    • Alice and Bob agree to use a prime number p (say 11) and a ‘generator’ g (say 5). Now these numbers are well established and everyone (including Eve, the Adversary) is aware of it.
    • Hereafter, the following actions take place on Alice’s and Bob’s sides (MOD is the modulus / remainder operation):
    Alice Bob
    Alice chooses an integer 'a' at random. Say 2. Bob chooses an integer ‘b’ at random. Say 3.
    Alice computes (ga MOD p) and sends it over to Bob. (52 MOD 11 = 3) is sent over. Bob computes (gb MOD p) and sends it over to Alice. (53 MOD 11 = 4) is sent over.
    Alice now computes (gb MOD p) a MOD p. So Alice computes 42 MOD 11 = 5. Bob now computes (ga MOD p) b MOD p. So Bob computes 33 MOD 11 = 5.
    Alice obtains 5 Bob obtains 5
    Shared Secret = 5 Shared Secret = 5

    Hence both Alice and Bob are able to arrive at a shared secret. (Of course in a real implementation, a, b and p will be very large numbers)

    What makes D-H work is the fact that gba = gab. This leads to:

    (ga MOD p) b MOD p = (gb MOD p) a MOD p

    But the big question is what prevents the Adversary Eve from also carrying out the same computations given that she also knows p and g? We need to give Alice and Bob some advantage over Eve for this shared secret to be limited to just Alice and Bob. The advantage Alice and Bob have is that they have direct knowledge of a (or b) while Eve is looking at ga (or gb)

    Eve needs to calculate gab given g, ga, and gb where g belongs to a group defined as (Integers MODULO prime). This is called the Diffie-Hellman Problem (DHP). A fast means of solving this problem would break Diffie-Hellman Key Exchange (and a number of related schemes). The fastest way of solving it today is to find ‘a’ given ga. This is called the Discrete Logarithm Problem (DLP). There is no known efficient algorithm to solve DLP as of today. Why is that so? We have no rigorous mathematical proof; just that no efficient algorithms have so far been discovered. But to understand the nature of the problem, let’s try and break this down further.

    Let’s start by thinking about what is known to Eve. When Alice sends across her first computation to Bob, Eve knows:

    1.    p
    2.    z
    3.    N = (ga MOD p).

    Now given this information, Eve needs to calculate ‘a’. Note that MOD is not a symmetric operation. Meaning that given (x = y MOD z), and given x and z, you cannot carry out a mathematical operation that yields y.

    The values of y can be = (z * n) + x, where n is a natural number. Hence there is an infinite set of values that can be taken by y.

    Applied to our case, when Eve looks at (25 MOD 11), she is looking at 3. And she knows ‘p’ is 11. So ga can be = (11 * n) + 3 where n is a Natural number. Given that g is known to be 5, our equation becomes:

    5a can be = (11 * n) + 3.

    Now Eve has the unenviable job of calculating a. How easy is this computation? A trivial approach would require Eve to run thru all powers of 5 and compare each with (11 * n) + 3 for several iterations of n. Are there multiple solutions? That depends on values for g, p. Let’s assume that there is a single solution only. But even if there is a single solution, if we choose p, a, and b to be very large integers (g can be small. It is usually 2 or 5), the calculation is a very tough one for Eve, and takes a lot more computational power than it would take Alice or Bob to calculate the Shared Secret. This asymmetry is what makes Diffie-Hellman work.

    Note: These days, it is recommended that p be at least 512-bits long. For more security, use 1024-bits.

     

    5.2 RSA
    Another approach to solving the problem of symmetric encryption was published in 1977 by Ron Rivest, Adi Shamir and Len Adleman at MIT, and they called it RSA after their initials. The approach taken in RSA is that the secret is not required to be shared at all. In RSA, the encryption and decryption operations are done using different keys. The mechanism is:

    1. Alice generates a pair of keys. Let’s call them K1 and K2.
    2. The algorithm is such that the message is encrypted using K1 and decrypted using K2. Message encrypted using K1 cannot be decrypted using K1; you need K2 to decrypt the message. This is called asymmetric encryption.
    3. Alice makes K1 public: Bob, Eve and the entire world now has K1. We call K1 Alice’s Public Key. Bob (or anyone else) can now encrypt a message using Alice’s public key and send it to her.
    4. K2 stays with Alice as a secret. This is Alice’s Private Key, used to decrypt the message. Only Alice, and no one else, can read the messages that are encrypted using her public key.

    Bob also does a similar thing. He also creates a private-public key pair and publishes the public key. Alice can now send messages to Bob by encrypting the messages with his public key. Only Bob would be able to decrypt the messages (using his private key). We have solved the problem of the secret key by removing the symmetry of the model.

    The interesting part is the algorithm itself which requires one key for encryption and another key for decryption. Again, the math involved is straightforward, just that the calculations involve some huge numbers which means that the computing power required is rather high.

    As usual, a cryptographic scheme works if the legitimate users can compute efficiently while the adversary’s computations are complex. So the questions we should ask for RSA are:
    1. How complex is to compute the private key using the public key.
    2. How efficient is to encrypt / decrypt messages using asymmetric algorithms.

    We will not get into the details here, but take it for granted for now that:

    1. Computation of the private key from the public key is computationally hard. No simple algorithms exist as of today and the inversion is infeasible. Of course this is an assumption based on knowledge that we have today. So far the assumption has been valid, and that is why RSA works!

    2. It is not very efficient to encrypt / decrypt messages using asymmetric algorithms. For one, RSA calculations involve really long numbers. So while a typical symmetric algorithm would be secure with a 128-bit key, the minimum key-size required to achieve security in RSA would be 1024-bits. Obviously the computations on such large numbers would be really slow. Hence it is rather inefficient to encrypt / decrypt large amounts of data using RSA. So the way RSA based cryptography really works is that the only data that is encrypted using asymmetric encryption is the key for a symmetric encryption. Thus:

    • Bob generates a key for symmetric encryption and now needs to share this key with Alice
    • Bob has Alice’s public key which she has published for the whole world to send secrets to her
    • Bob encrypts the shared key using Alice’s public key and sends it over to her
    • Alice decrypts the symmetric key sent by Bob using her private key
    • Now Alice and Bob both have the secret key and they can use symmetric encryption
    • The actual data is encrypted / decrypted using symmetric encryption

     

    6. Integrity
    So far we have assumed that Eve is a passive spectator, but what if she is more malicious, more powerful? Consider the scenario where when Bob asks Alice if she would marry him or not, Eve is able to send a response (No). If Bob does not know that the message had been tampered with and that the message he received was not by Alice, but by Eve, we have a classic Hindi film situation. To avoid this, Bob should be able to establish that the message was not modified on the way, that it originated from Alice, that the message is authentic.

    This is called Message Integrity. In fact, integrity is often a more critical requirement than confidentiality. We are ok if an adversary discovers our contents of our message, what we certainly do not want is the adversary modifying the contents.

    6.1 Encryption does not give Integrity
    Prima Facie, it seems that encryption automatically gives integrity. The argument goes as follows:

    • Alice prepares Ciphertext out of Plaintext and sends it over to Bob
    • In the regular scheme of things, Bob would receive the Ciphertext and get Plaintext out of it.
    • In the attack scenario, Eve intercepts the message and modifies it. Now Bob gets the modified Ciphertext.
    • When Bob decrypts the modified Ciphertext, he gets garbage. This means that the message is no longer authentic.

    Now though it seems correct, this argument is fallacious. For example, if Eve knows that Alice is responding to Bob’s message of whether she would marry him, Eve may simply flip the bits on the response being sent by Alice. Since the response was supposed to be a yes / no, flipping the bits of the Ciphertext may also flip the response. This does depend on the strength of the encryption algorithm, but the possibility cannot be discounted. So in principle we cannot assume that encryption provides integrity.

    Proving the fallacy of the “encryption provides integrity” argument in general requires that we look at some math. We will avoid that right now. Suffice to say that if Eve has some information about the nature of the data being sent, and the algorithm being used, she can modify parts of the Ciphertext in such a way so that the decryption of the Ciphertext actually yields wrong Plaintext which is meaningful to Bob.

    Since encryption does not provide Integrity, we need some additional mechanisms. There are two popular solutions: Machine Authentication Codes and Digital Signatures.

    6.2 Machine Authentication Code
    The simplest way of solving the data integrity problem is called a Machine Authentication Code (MAC). The idea behind MACs is that we send a tag along with the message. The tag is a function of the key and the message. If the message gets modified, a verification check will reveal that the tag can no longer be computed using the message and the key. This means that the message has been tampered with. MACs work as follows:

    • Alice needs to send a message (M) to Bob. Confidentiality may or may not be a goal, but integrity certainly is.
    • For achieving integrity, Alice and Bob have a shared secret key K
    • Alice uses a “Tagging Algorithm” to generate a Tag T using the Message M and the Key K: T = TaggingAlgorithm(M, K)
    • The message M is then sent (maybe encrypted, maybe as Plaintext) along with the tag T to Bob
    • Eve intercepts the message, modifies it and forwards it Bob
    • Bob uses a “Verification Algorithm” to check if the Tag can be calculated using the Message M. If yes, the message is authentic, otherwise it is not.

    The Verification Algorithm may look like: bool VA(M, K, T)
    {
      T2 = TaggingAlgorithm (M, K) ;
      if (T2 == T) return true; else return false;
    }

    While the mechanism is similar to that in symmetric encryption, the goals here are totally different. In symmetric encryption, the message being sent – the Ciphertext – contained the entire information of the Plaintext in an encrypted form. The way the Tag is different from the Ciphertext is that the Tag does not represent the complete information of the Plaintext and is a much smaller value, a number, which is computed using the Plaintext and the secret key K, and is sent along with the message (in Plaintext or Ciphertext).

    Note that symmetric encryption and MACing are not mutually exclusive. For the same Plaintext, an Encryption Algorithm, along with an Encryption Key K1 can generate Ciphertext, while a Tagging Algorithm along with a Tagging Key K2 can generate a Tag. So we can achieve both confidentiality and integrity. It is just that in this scenario, Alice and Bob have to share two secret keys.

    Now this does seem secure, but there are a couple of interesting things that can still happen. The simplest one is what is called a Replay Attack. In a replay attack, Eve watches a message M along with a tag T being sent by Alice to Bob. She picks a copy of the message from the wire and keeps it with her. Sometime later, she sends the same copy over to Bob. How can this become an attack? What if the message Alice sent was: “I borrowed Rs. 1000 from Eve. Please credit Rs. 1000 in Eve’s account.” Every time Bob receives this message, he transfers Rs. 1000 to Eve’s account. Alice wanted to send this message only once, but Eve carries out a replay attack and gets richer. Another example is that of a user sending over a username + password combination to a web-server and the server sending over a cookie to denote a successful login. If a hacker intercepts the packet being sent over to the server and later does a replay, the hacker will also get the cookie, without knowing the username and the password.

     

    6.3 Digital Signature
    In MAC, we solved the Integrity problem by giving both Alice and Bob a secret key using which they could generate a Tag. But let us not forget that it is not easy to share the secret key in the first place. How then do we solve the integrity problem without sharing the key? The answer lies in what is called a Digital Signature.

    Recall from our discussion on RSA that we addressed the confidentiality requirement by having the receiver publish its public key. The sender would use this public key to encrypt the data and send it to the receiver who would decrypt it using its private key.

    Digital Signatures address the integrity requirement by having the sender publish its public key. To send a message, the sender generates a Signature using his private key and the message:

    Signature = SigningAlgorithm(Message, SendersPrivateKey)

    He then sends the Signature along with the message. The receiver uses the sender’s public key to verify whether the Signature is valid or not:

    Yes / No = VerificationAlgorithm(Message, Signature, SendersPublicKey)

    As you notice, Digital Signatures are similar in principle to MACs, except that the setting here is asymmetric. Once again, as with MACs, the confidentiality and integrity requirements are not mutually exclusive. We can achieve both.

    An important point of difference between MACs and Digital Signatures concerns non-Repudiation. When Alice sends a message to Bob, we don’t want Alice to be able to deny the fact that she actually sent the message and not someone else. MACs are not able to create Non-Repudiation. If Bob can verify a tagged message, he can also produce a tagged message (because the key is the same). So Bob cannot drag Alice to a court of law, produce a tagged message and claim it was sent by Alice. Digital Signatures on the other hand do provide Non-Repudiation. Since a Digital Signature can only be produced by the Private Key of the sender, the sender cannot claim that the receiver concocted the evidence.

     

    Summary

    I have tried to cover most of what I think are essential concepts towards understanding cryptography. With this, one can now get started putting publically available crypto libraries to use. There are quite a few things I wanted to cover but have not been able to since this text grew too long and I basically got tired of working on this. These are about how typical algorithms work - Feistel networks, key schedules, S-box, P-box, etc., survey of the various standard algorithms – DES, 3DES, AES, MD5, SHA0, 1, 2, and choosing block sizes, IVs, key sizes, modes of operations, etc., and finally material on Authentication, Digital certificates, PKI, and Elliptic Curve Cryptography. Of course, even with all this info included, I would still have only scratched the surface since there is a ton more to be learned. The reading list below is therefore highly recommended:

    1. Applied Cryptography: Protocols, Algorithms and Source Code in C by Bruce Schneier: Comprehensive, but dated book. Written in a simple way which is very easy to follow.
    2. Handbook of Applied Cryptography by Menezes et al: Provides the mathematics and the fundamentals skipped in the Schneier book. Don’t let the math scare you – it is easy enough to follow if you pay attention and are willing to surf the net for bits you do not have background on. More suited if you are thinking of designing your own algorithms, etc.
    3. Cryptography and Network Security by William Stallings: Great info on hashing, choice of algorithms and design of secure networks. More suited for the application programmer.

    December 11

    Usability - Part 1 - Research Findings

    I have never been a UI guy, my focus being mostly on databases and middleware. Sure I know HTML / CSS / XAML, but my knowledge here is far less than what I know about the other tiers. In any case this is from a development point of view - whatever little I do know of these languages does not help me distinguish a good UI from a bad UI, leave alone design a good UI. However, users ultimately buy experience, and as someone who owns a few products at Directi it is my job to own that experience. Now I doubt I can teach myself aesthetic appreciation, but usability is a little bit more objective. So of late I have been investing some time in understanding the core principles behind usability. This series of posts summarizes the key things that I learnt and is meant as notes to myself. Hopefully, this material would be useful to others as well.

    Basic Usability Design

    The first thing any programmer who is looking to improve his UI skills must read is User Interface Design for Programmers - classic Joel Spolsky from his early days. The key points are simple enough:

    1. Invent some users
    2. Figure out the important activities
    3. Figure out the user model -- how the user will expect to accomplish those activities
    4. Sketch out the first draft of the design
    5. Iterate over your design again and again, making it easier and easier until it's well within the capabilities of your imaginary users
    6. Watch real humans trying to use your software. Note the areas where people have trouble, which probably demonstrate areas where the program model isn't matching the user model.

    However, what is hard is implementing these steps correctly, and Joel gives several examples on correct and incorrect decision and typical mistakes. As I said, a must read.

     

    Research Findings 

    This content comes from a document that my friend Ninad Rawal - the Usability Lead at Directi - shared with me. He is also responsible for introducing me to the Web Form Design book which is the topic of the second post.

     

    1) Memory

    a) Initial short-term memory is fragile - you cannot remember a lot. (So avoid having your user remember stuff across screens)

    b) More you process info, more you remember it. So for example if your screen refreshes to display an update, keep that update in the same spot because your mind remember the update being there at that spot. Example is the clock in Windows Taskbar in the bottom right corner. Consistency is key since it increases recall.

    c) Interfering with info while it is being processed affects how much you remember. I think it is for this reason that I prefer the static notification icon in Facebook to the popup toasts common in IM apps.

    d) How much you are able to remember can be assisted by chunking information. For example, phone numbers. Or updates on a micro-blog being grouped by date. Ideal chunking is 3-4 chunks with 3-4 items / chunk.

    e) Mostly you do not remember events, you remember your interpretation of events. You typically remember parts of the event and fill in the gaps by parts of other memories or what you believe is probable. Memory is not perfect and not reliable. Avoid having people to remember things. (I am thinking passwords here!)

     

    2) Visual Recall

    a) There is a common perspective associated with each object we see in daily life. In engineering terms, it happens to be the isometric perspective, followed by the front-perspective, followed by the top-perspective. This is why icons should be isometric (3d like) and not flat. For 2d objects like files and messages, a 2d icon is alright.

    b)  Humans are great at shape recognition (reading is a form of shape recognition). The sides of a shape are more important than the middle of a shape for recall. The mind can extrapolate and create the middle parts if the edges are present but not the other way round.

    c) People tend to scan horizontally rather than diagonally. While looking for something specific, we start at top-left and then go to center, avoiding edges. So put most imp info in the center and assume a horizontal scan. Don't put imp info at the edges.

    d) Clutter creates confusion. This is because if two things are present together, the mind processes them together, even if they are unrelated. So put less space between related items and increase space between unrelated items.

    e) In printed text, it is easier to read mixed upper case and lower case. For isolated words, all caps is better.

     

    3) Decision making

    a) First impressions are important, especially if they are consistently reinforced. It is hard to change once someone has made up her mind even sub-consciously.

    b) Humans like to have more info. So if your user needs to use x amount of data, you may need to provide 4x.

    c) Abstract info is hard to grasp - real-life examples, especially actual events (case-studies, anecdotes) create a bigger impact

     

    4) Text Reading

    a) Length of Line: (source:  http://www.ingentaconnect.com/content/tandf/tbit/2004/00000023/00000006/art00002)

    People prefer reading text with a smaller line length because of less eye movement, even though they are able to read faster with a longer line length of around 95 chars.

     

    b) Number of Columns and Justification: (source: http://psychology.wichita.edu/surl/usabilitynews/72/columns.asp)

    - Fast readers get best reading speed and efficiency in two column fully-justified text

    - Slow readers get best reading speed and efficiency in single column left justified text

    - If you are not sure, choose single column left justified text

     

    c) Serif vs. Sans-serif Typeface: (source: http://www.alexpoole.info/academic/literaturereview.html)

    - They are equally legible - could be a matter of preference, not of performance

    - Choose a standard typeface - TNR, Arial, Verdana ...

     

    d) Visual Factors for GUI and Web: (source: http://www.ingentaconnect.com/content/hfes/hf/2005/00000047/00000001/art00012)

    - Reading Performance reduces with in pages with many links and variable densities

    - Text Alignment is not a performance enhancing factor

     

    e) Font and Background Color: (source: http://sigs.aisnet.org/sighci/bit04/BIT_Hall.pdf)

    - People believe black text on white background is most readable as compared to color fonts / backgrounds (though it does not affect their ability to retain info)

    - People find color to be aesthetically appealing

    - People equate higher readability with higher professionalism

     

    5) Interaction

    a) Icons (source: http://www.informaworld.com/smpp/content~content=a785033788~db=all)

    - Bigger icons are faster to work with

    - Icons left on the screen (like on the Windows task bar) result in faster response. Increasing number of icons does not increase response time.

    - Icons appearing on the screen on an action (say on pressing Start button in Windows) result in a slower response. Increasing number of icons increases response time.

    - Minimizing movement reduces response time (Fitts' Law) - hence square configuration is best, followed by horizontal

     

    b) Keyboard Shortcuts (source: Hidden Cost of GUI : Failure to Make Transition form Menu and Icon Toolbars. International Journal of Human Computer Interaction, 18(2), pp. 133-144)

    - Using a keyboard shortcut is fastest, followed by icon, followed by menu

    - People don't learn shortcuts despite being easy to learn.

    - Habitual patterns dominate performance

    October 29

    Azure - Microsoft's Cloud Platform

    Microsoft's application platform has now got a shade of the sky - Azure was formally announced at the PDC yesterday by Ray Ozzie in his opening keynote address.

    I have said this several times in the past - computing going forward is going to be all about parallelization. On the client, parallelization of code is necessitated by the advent of many-core CPUs. On the server, parallelization is about scaling out - partitioning your code, app, data to be able to run on several servers in parallel to provide high-scale and redundancy. Of course it is rather difficult to design, build and operate the app infrastructure required to run an application that can potentially run across hundreds of servers. And this is where the Azure platform comes in - it is a platform to run .Net applications on Microsoft's cloud fabric, hosted in its datacenters across the globe, offering virtually limitless scalability and availability. This is done by providing a set of abstractions that let you focus on the core business logic of the application instead of worrying about the underlying physical infrastructure. You can get an overview of these abstractions by reading David Chappell's excellent whitepaper that introduces the three key set of abstractions announced at the PDC. In this post, I am going to focus on the lowest layer - Windows Azure.

     

    Cloud Operating System

    Windows Azure is the OS for the Cloud. Marketing speak? Not really. What does an OS do? It abstracts away the hardware. Before the OS came along, people used to code specifically for each machine architecture. You had to know the instruction set of a machine in order to be able to program for it. This changed with the notion of an OS. You did not have to worry about the specific machine architecture - you wrote to the OS and it took care of loading your code in memory, loading stack in the register, moving the instruction pointer, loading the data from disk to RAM, RAM to cache, cache to register, etc.

    Today, when we think about creating really large scale applications, we need to know our deployment environment fairly well. How many web-servers are there? How should I partition my data - where is the node to which I need to send the writes, from which nodes can I do the reads. And then servicing this entire thing is difficult as well - how do I upgrade my code across the farm? How do I apply patches? And then there is the problem of recovery - a node goes down, now how do I route traffic? What happened to in-flight data on that node? What about the database node that just went down? Recovery gets very hard in a truly large scale app that is deployed across several physical boxes. It is not that it cannot be done - every single successful dotcom / web2.0 site has gone thru this route. But it is a hard route to follow, it takes a ton of engineering talent, a lot of upfront money and you still need to be lucky to get it right.

    Enter the Cloud OS - an abstraction for a scaled out infrastructure that you can start using without bothering about underlying details. The idea is this - you design your app to scale (yes, you still have to do the design - no abstraction can help scale an app that has sticky session state or unpartitionable data design), and stop bothering about everything else underneath it. You basically state your intent by defining your deployment architecture declaratively by telling the OS the number of front-end servers you need, the number of back-end processing machines you need, what is the response time you expect on a certain operation, etc. and the system figures out the rest of it - it is an OS for the Cloud!

     

    The Programming Model

    At a high level, the programming model is really straight-forward. You write your code just the way you typically do (hopefully designed for scale-out) and alongside, you provide meta-data about what your app is and how it should be deployed. Then you submit your code and the metadata to the Cloud OS.

    In the OS, a component called the Fabric Controller examines your metadata and finds the available resources that match the physical needs you describe and pushes your code to these resources - typically virtual machines. On each of these resources another component called an Agent is running which is in constant communication with the Controller. When the Controller pushes your code to these boxes, the agent configures it on this box as you defined - and keeps a check on its activities to make sure that the code is able to deliver the kind of characteristics you desire in terms of response time, footprint, etc. Similarly, if your metadata describes other resources to be used - routers, switches, load balancers, etc. - the Controller also gets these resources provisioned and configured according to your definition.

    Once your app is up and running, the agent on a box maintains a heartbeat with your code - if it sees anything moving out the range it may take action like moving your app to a different box, or provisioning another box, etc. Of course what can also happen is that this box itself may die, taking the agent with it. This is when the Controller comes into action. The controller also maintains a heartbeat with the various Agents, and in case an Agent goes down, it can try and re-start the machine, etc.

     

    Service Architecture

    The architecture of a Service targeting the Azure CTP release is given at http://msdn.microsoft.com/library/dd179341.aspx. Couple of key points:

    1) The service can consist of up to one app in a web role and one app in a worker role. However, there can be any number of instances of these apps.

    2) The metadata is split into two parts:

    a) Static Part - This part, called the System Definition, cannot be changed for a running service. This describes the "shape" of your service - no of web roles, worker roles. To change this, the service has to be republished. The schema is given at http://msdn.microsoft.com/library/dd179395.aspx

    b) Dynamic Part - This part, called the System Configuration, can be changed for a running service. This describes the instancing of your service - how many instances of the web role and the worker role do you need. The schema for this is given at http://msdn.microsoft.com/library/dd179389.aspx

    3) Storage: There are two options for storing data: First one is to use SQL Server Services - this is the database in the sky option. More on this at http://msdn.microsoft.com/library/dd200927.aspx. Second option is to use Windows Azure native storage: This is a simpler storage option. There are three services here:

    a) Blob Storage: Meant for opaque objects like media files. This is similar to Amazon's S3.More on http://msdn.microsoft.com/library/dd135733.aspx.

    b) Table Storage: Meant for structured data. This is conceptually the same as ADO.net Data Services. More on http://msdn.microsoft.com/library/dd179423.aspx

    c) Queue Storage: A basic message queue system. More on http://msdn.microsoft.com/library/dd179363.aspx 

     

    Creating a Azure hosted Service

    The Azure SDK provides an emulator called the Development Fabric that simplifies development tasks a lot. Here are the key steps:

    1) Write your code

    2) For storing data, a Development Storage is provided. You need to initialize this storage thru a tool called DevelopmentStorage

    3) Create the Service Definition and Service Configuration files

    4) Change the storage URIs as per http://msdn.microsoft.com/library/dd179425.aspx in the configuration file

    5) Package the code along with the Service-Def file using a tool called CSPack.

    6) Publish the package using a tool called CSRun.

     

    Of course, before you can carry out step 6) you need to get your Azure account. To do that, visit https://connect.microsoft.com/site/sitehome.aspx?SiteID=681.

     

    But Remember ...

    ... that though the OS makes it simpler to run a scalable application by abstracting away all the infra underneath and by providing multiple services, the key is to design the scalable app. I will write more on writing scalable applications and also on using Azure in subsequent posts.

    September 10

    Humans, don't destroy the Universe

    No this is not a line from a new novel by Douglas Adams, but our regular doomsayers. Yessir, not only have we become powerful enough to destroy the Universe, we are foolish enough to do it too! So after several failed warnings, finally the end of the world is in sight and you do not have to pay your EMI any more. Only this time it has not do with the turn of the millennium or a celestial body bringing the wrath of the Gods or the second coming of Christ, but something we humans are going to unleash upon ourselves. And it goes by the inane name of LHC. No it is not a mass hypnosis drug - far from it - LHC stands for the Large Hydron Collider, the largest particle accelerator in the world.

    What are particle accelerators, you ask and how can one end the world? Well particle accelerators are instruments built by physicists (the Einstein variety of scientists - classification by subject matter, not gray) to study the behavior of very tiny particles at high energies. To get these tiny particles at the high energy levels, they make them go round and round in a long circular tube, accelerated by magnetic fields, till they start approaching the fastest speed possible - that of light in vacuum (a staggering 300,000 km / second). Then these particles are smashed into one another producing fireworks which the physicists watch with anticipation and glee. The glee is mostly because of being able to spend billions of dollars to smash things into one another at high speeds, but occasionally is also caused by observation of phenomena that reveal truths about the fundamental nature of our world - how the Universe got created and why things work the way they do.

    At a length of 27 km and a budget of 6 Billion dollars, the LHC is just the largest and the most expensive of these toys yet. And apparently this toy can create particles that can destroy our world, or so say some scientists. Nay, say the toy makers- CERN - the same lot who built the world wide web. Nay, also say the 7000 odd kids who are going to play with the toys - some of the most eminent physicists of the world. The situation has the beautifully tragic potential of a foolish young civilization ending itself by probing too deep into the mind of God. It also has an element of ironical comedy which would have fitted straight into a sub-plot of the Hitchhiker's Guide to the Galaxy. But we are dealing with an important question here - whether to pay the EMI or not - and must find out which camp to believe.

    Now popular media cannot be believed, since it is caught up between two forces - the commercial force of getting more eyeballs which leads to sensationalism and the commercial force of credibility so that they can continue to get eyeballs which leads to more responsible reporting. Of course when it comes to such a juicy story, the immediate grabbing of eyeballs is more important, credibility can be built by other stories. Which is why you see "News" channels these days warning people against the wrath of the solar-eclipse and the evil eye of Saturn in the vocal styles of C-grade horror movie protagonists - it is all about immediate cash flow. However, this blog is not constrained by commercial interests - partly because my employers take good care of me (not for long, says my manager, if I continue writing such pieces), but mostly because Microsoft is late to bring out an ad-module for Live Spaces. So lets examine the arguments on both sides.

    Of the several particles that would be produced at the LHC, the following four are the ones under the spotlight:

     

    1) Micro Black Holes:

    What is a Black Hole?

    All masses attract each other thru the force of gravity. However the force is very weak - if you place two balls apart, they do not automatically collide since the force of attraction is unable to overcome the force of friction. This force of gravity increases as the mass of an object increases and the distance between the objects decreases. For objects as large as the planet Earth, the force is quite strong - actually it is this force that holds you and me to Earth.  Since matter is made up of smaller particles, all particles also attract thru the force of gravity, a single piece of matter can get compressed. However, the internal repulsive forces prevent this from happening. Which is why we do not collapse to a single point. However, when the mass is as high as a Star, the internal pressures can be overcome and the mass can collapse on itself undergoing a compression. The reason Stars do not collapse is because the tight compression fuses nuclei and generates energy which pushes particles apart, preventing further compression. However, very large stars, after exhausting their nuclear fusion can go on compressing infinitely producing what is called a Black Hole. Nothing is able to escape the gravitational pull of a Black Hole, not even light with its fast speed. That is why the name Black Hole - it cannot be seen because light is not able to escape it. So a black hole swallows matter around it and grows stronger. However, Black Holes are not a one way tunnel. Stephen Hawking proved that Black Holes evaporate energy, diminishing their mass gradually unless they completely disappear. This is called Hawking Radiation. Hawking Radiation is stronger for smaller black holes than larger black holes. This means that a small black hole radiates energy faster. Since the pull of a small black hole is smaller, it also grows slowly as compared to a larger black hole. Hence, a small black hole dissipates faster than a big one.

    What is a Micro-Black Hole?

    Theorists have long held that lighter black holes are also possible. As light as a millionth of a gram! However, the energy that it would take to produce such a small black hole is quite high. Now remember that in the LHC, particles would collide at very high energies, so maybe, just maybe a micro black hole can get formed. According to one theoretical model however, the energy required would actually be just sufficient and micro black holes can get formed at a rate of as much as one black hole per second in the LHC.

    What is the Danger?

    Since black holes grow and swallow other mass, there is a chance that a micro black hole will be formed that can swallow the entire Earth. There are several factors required to lead to this situation:

    a) Micro black holes should be formed in the first place. The currently accepted models of micro black holes put the energy levels required at a thousand times the capability of the LHC. The doomsday scenario people claim that an alternative model makes it possible

    b) The Micro black hole must not evaporate. This would happen only if Hawking Radiation is not a true phenomenon - that is, only a theoretical curiosity, not an actual fact.

    c) The Micro black hole would be produced at low velocity so that it does not cross the Earth very quickly. At a high speed (remember the particles are colliding in the LHC at speed approaching that of light), the particle can leave the Earth very quickly without causing any harm

     

    2) Strangelets:

    All matter consists ultimately of a set of fundamental particles called Quarks and Leptons. I will not explain Quarks and Leptons here, you can read more on them on my other blog. There are 6 kinds of Quarks and one of them is called the Strange Quark. Don't go on the name - the scientists named the quarks in creative ways and all quarks are equally strange. Typically matter is made of neutrons and protons which are made of Quarks called Up and Down quarks. However, there can also be matter made of Up, Down and Strange Quarks and such matter is called Strange Matter. Strange matter can occur in the core of very dense stars called neutron stars (these stars fall somewhat short of being a black hole) or in isolated droplets called Strangelets or a neutron star may get converted to a Strange Star if the mass is very high, but just short of being a black hole.

    What is the Danger?

    Typically we do not observe Strange matter because it is unstable. However, it is possible that if there is more of Strange Matter, it may be stable. The LHC may produce strangelets in enough quantity that they become stable. It is also possible that when a strangelet comes in contact with another nucleus it may convert that to a strangelet as well, thus setting off a chain reaction which may convert all of matter on Earth into Strange Matter.

     

    3) Vacuum Bubbles:

    There are theoretical models that predict that the vacuum in space is not the lowest-energy vacuum, but that there is a lower state vacuum called True Vacuum. This means that the Universe has False Vacuum. Thus at some point, our Universe may transition into the True Vacuum. However, this has not happened in 13 billion years so far. At LHC, since very high energy conditions would be there, it is possible that a region of True Vacuum - called a Vacuum Bubble may appear.

    What is the Danger?

    If a Vacuum Bubble is created, space surrounding it would also collapse into a lower energy state and it would expand. Such a bubble can grow at the speed of light and convert the entire Universe in a state of True Vacuum. This can destroy the Universe as we know it.

     

    4) Magnetic monopoles:

    If you have seen a magnet, you know it has two poles - North and South. Ever seen a magnet without two poles? What do you think would happen if you cut a magnet in half? People have tried this, and it leads to two magnets - each with two poles. No one has a seen a magnet with a single pole - a magnetic monopole. However, current theories predict magnetic monopoles, in fact they require that magnetic monopoles exist in order to be self-consistent. Just that they have not been observed anywhere, simply because they are obviously very rare and hence not likely to be encountered by a particle detector. Also, because they are very heavy, they are not created by particle accelerators. However, the LHC is a bigger and more powerful accelerator than any. And hence the possibility of a magnetic monopole getting formed.

    What is the Danger?

    In some models, it is predicted that a magnetic monopole can catalyze nucleon decay - this means that if a magnetic monopole strikes a large number of nuclei, causing them to decay and to release massive amounts of energy creating a gigantic nuclear explosion.

     

    What Does CERN say?

    Having understood the various threats in a little bit of detail, let's now try and see why CERN and a majority of Earth's physicists are not worried about this. The basic argument has to do with the LHC experiment being common in nature. To understand that we need to understand a phenomenon called Cosmic Rays.

    Earth is constantly bombarded by high energy protons coming from deep space. These particles zipping all around us - called Cosmic Rays - originate in the Sun, in Supernova explosions in the outer Galaxy, in the Super-Heavy nuclei of various Galaxies, in Gamma Ray bursts, and in phenomena we do not know / understand yet. These particles can travel at very high energies. The highest energy comic ray possible is defined by the GZK limit and is held at 6 x 1019 eV - around that of a tennis ball thrown at 90 km/hr. However, cosmic rays with higher energies have also been observed and these are called Ultra High Energy Cosmic Rays. The magnetic field of the Earth and the Sun is able to divert most Cosmic Rays towards the poles - this is what leads to the beautiful Northern Lights phenomenon, and that is one reason why life was able to originate on Earth - the magnetic field allowed creation of more complex molecules which would not have been possible under an open exposure to the Cosmic Ray shower. But the point is this - Cosmic Rays are common.

    What does this have to do with the LHC? Just this - the particles produced in the LHC would have energies around 1017 eV - three orders of magnitudes less than the highest energy cosmic rays! What? I hear you ask! Then why spend 6 billion dollars of tax payers money to produce stuff that nature already produces??? Well, one reason is that big boys like playing with big toys. The other reason is that the particle accelerator provide controlled environments to reproduce and study such phenomena which is not possible in the open Cosmic Ray environment. Nevertheless, the collision energy produced in the LHC is far less than what occurs in nature and hence it would not produce anything which is not otherwise found in nature. It would just provide an environment to study the phenomenon. According to CERN:

    "Over the past billions of years, Nature has already generated on Earth as many collisions as about a million LHC experiments – and the planet still exists. Astronomers observe an enormous number of larger astronomical bodies throughout the Universe, all of which are also struck by cosmic rays. The Universe as a whole conducts more than 10 million million LHC-like experiments per second. The possibility of any dangerous consequences contradicts what astronomers see - stars and galaxies still exist."

    While CERN also goes on to publish rebuttals to each type of threat individually, to me the argument above sounds like a good one. If you are interested in a more detailed discussion, read http://cern.ch/lsag/LSAG-Report.pdf.

     

    So my recommendation is to plan your life on the assumption that the world is not ending. Not tomorrow when they inject the first set of particles, not after a couple of months when the LHC is fully functional, and not after a few years when enough meta-stable micro-black holes, magnetic monopoles, strangelets and vacuum bubbles have accumulated to destroy everything we know and love. And yes, do pay your EMI. I am continuing to pay, and so are the doomsayers!

    August 15

    Moving On ...

    Yesterday was my last working day at Microsoft India.

    When I joined Microsoft in Oct 2001, I never thought this day would come.  After all, joining Microsoft was a dream, and when I finally did join the company, I thought I am going to be here for life. However, when you have lived a dream for seven years, taken nourishment from it, enjoyed it the fullest, soaked in it ... it gets absorbed in you. And then you need a new dream to retain that same level of vitality.

    Thankfully, there is a lot to dream about these days - there are changes happening in mainstream computing -  some of which I have alluded to in earlier posts as well - which would precipitate a major re-architecting of a lot of existing software and lead to some completely new kind of apps. This opens up huge opportunities both on a business front and on a technology front. Then there is the India factor - India is the like a laboratory for the world right now because of it being the only English speaking country out of the BRIC nations. This creates even more opportunities.

    With its assets, vision and execution path, I firmly believe Microsoft to be amazingly well positioned to leverage these opportunities and define the next era in computing. Of course there are challenges - the misconception on the role of the Client, the negative perception on Vista (some of it deserved, some of it just plain perception as demonstrated by The Mojave Experiment), the lack of online ad-inventory are perhaps the most important ones. But there is a ton of amazing work happening at Microsoft - the I/O changes in Vista, the emerging Parallel Computing platform, the work on declarative frameworks, innovations in languages, Software + Services platforms like Live Mesh, stuff like Sync Framework and Velocity are all about leveraging the opportunities ahead. And wait till you see some of the stuff coming out of Redmond in the next one year or so - it is wicked cool and a lot of public perception would change dramatically!

     

    So if I believe so strongly in Microsoft, and I love the company so much, why am I moving out? Well couple of reasons:

    1) I wanted to participate more directly in the changes happening in the industry - that means shipping a product. Something I could own end to end, and which I could see in the hands of people, making a difference to their lives.

    2) All my life I have been a Microsoftie and while I have studied the other stacks academically, I have never really worked on them in a big way. So I wanted to explore some of the other stuff out there - technologies, techniques, software-development models, etc.

    3) One day, I hope to run my own business, and that requires developing some competencies which I do not have today in full measure.

    Now all of this can happen at Microsoft as well, but it would have taken time. And hence the role change - I want to move fast on this.

     

    Of course making a change like this is never easy. The last one month has been an emotional one for me. The feeling is similar to what I felt while leaving school - a sense of leaving an institution where you grew up, where you laughed and cried, got angry and frustrated, saw success and failure, received support, love and respect, where you made some terrific friends and learnt from some great people. An institution that you have had the honor of representing, an institution that is a part of you. It is very hard.

     

    So where does one go when one leaves a place like Microsoft? You need the same level of passion, energy, smarts and most importantly that attitude of changing the world and enjoying yourself to the fullest while doing it! Well, I guess I have been extremely lucky to find just such a place. It is a company called Directi - in my opinion, one of the most innovative companies in India. My role there would be to help build Directi's next generation of products and services which take advantage of some of these opportunities I talked about above. From what little I have seen of the culture and the people so far, it looks like a fun ride ahead. Folks here are smart, passionate and thoroughly enjoy the work they do - some really cool concepts and technology. I can't wait to get started!

    June 12

    Started a New Blog

    As a kid, I found studies quite boring. (who doesn't?) The first subject I fell in love was Physics (actually second - first one was Computer Science) I used to find it utterly boring too, but this changed when I was in 11th standard and was forced to stay at home for about two months because of a strong bout of Juvenile idiopathic arthritis in my knees. I missed out on the first-term exams because of this and the fear of not doing well in the mid-term made me start studying my weakest subject - Physics. That is when I realized the beauty of Physics - it made me see the world around me differently. That love affair continues till today.

    After I started working, I also started developing a love for Evolution, Anthropology, History and Economics as well - they made me look at the world around me in a different way. And this world is full of wonder and beauty. I wish I could go back to school now and find the time to study all of them at leisure. I think the problem with our education system is that instead of letting kids understand these wonderful views of our world on their own, at their own pace, they emphasize on spoon feeding the details of each view. You miss out on all the fun in the process.

    So for the last five years or so (I recall telling my wife about this when I first met her more than four years ago) I have been thinking about writing down the story of our world in a simple, easy to understand way which emphasizes the big picture and connects the dots. I finally started writing some early thoughts a couple of weeks back and the result is a new blog: http://bigtale.blogspot.com/. I'm having a ton of fun writing this and hope that it serves as a useful resource to anyone who wants to know more about our world.

    June 11

    Edge vs. Cloud Computing

    For about 2-3 years now, I have been talking internally in my group and to other people in Microsoft on Edge Computing vs. Cloud Computing. This thing came up all over again in a discussion and I thought it would be handy to summarize my thoughts in writing somewhere.

    Terminology

    The way I use the terms Cloud and Edge are not common to the industry (are there common industry definitions on this at all?), so let's just put this down first:

    1) Cloud Computing is computing that happens in distributed, geo-plexed infrastructures on the Internet. Amazon Web Services and Google App Engine are examples of general purpose Cloud infrastructures. You can call it off-premise computing as well. The idea is that instead of creating your own facility, or putting your server with a hoster, you put your data / apps in a Cloud. This cloud is assumed to have 100% availability, and infinite scalability. This is typically achieved in two ways:

    a) Virtual machine hosting service - The advantage is that you can bring in any existing workload, virtualize it and host it. No changes required to your app architecture. For example Amazon EC2 provides Xen virtualization. However, this is not very elastic in the sense that the computing power is handed out in discrete units of virtual private servers and there is no continuous scaling. The underlying fabric may have infinite continuous scaling but the way it is handed out is discretized.

    b) Abstracted Infra for Prescriptive Services - This approach is where the Cloud vendor offers specific services for data storage, communication, app execution, etc. For example Google AppEngine provides data storage, user management and an app execution environment based on Python. Similarly, Amazon provides a storage service (S3), a message queue (SQS) and a structured storage service (SimpleDB). Microsoft has also announced a couple of services: BizTalk Services for hosted message oriented communications and SQL Server Data Services which aims to provide SQL Server functionality in the Cloud. IBM announced Blue Cloud in Nov, but haven't released much info on it since then. The key thing here is that these services themselves are infinitely scalable, and this means that apps built to these services are also elastically scalable. Of course billing may happen in price bands based on consumption, but the app itself is not boxed.

    2) Edge Computing is computing that happens outside the Cloud. This could be on a hand-held device, on a PC, on a server in a data-center. For the server scenario, this could be hosted on premise, or this could be in a hoster's datacenter. The key thing is that this is not distributed, geo-plexed, on-demand scaling in the Cloud. The computing is contained in isolated boxes that may / may not talk to each other.

    Now the hosting scenario is a bit of an overlap - when does an app hosted with a hoster stop being Edge and become Cloud? To me the key difference is distribution because it is distribution that provides 100% availability and scalability. If you place your server with a hoster, if that hoster's datacenter goes down, your server is down. This is Edge. If you place your app / data with a hoster that has got a distributed infrastructure, is able to move load from datacenter to datacenter based on traffic conditions and lets you take unlimited load - it is Cloud.

    Now with that out of the way, let's get down to examining the trends and see which model is likely to succeed.

     

    A Lesson in Computing History 

    If you go back a few decades in time, computing was primarily driven by governments, and specifically defense establishments or research agencies. Then something happened in 1946 - scientists working for a US government agency came up with the first general purpose computer – the ENIAC. The impact of this development was that computing moved from being solely owned by a few governments in the world, to becoming affordable to businesses. And even then, only some of the richest businesses – the largest banks or manufacturing companies – could afford to buy computing power. Over a period of time, computing got cheaper and more and more businesses were able to use the power of the computer. For the next two decades, well into the 1970s, all computing was business computing or special-purpose government / scientific computing

    The second revolution in computing came when Kilby and Noyce independently came up with the Integrated Circuit, which was used by Ted Hoff and Federico Faggin at Intel to develop the Microprocessor. A couple of smart people – Steve Wozniac, Bill Gates, Paul Allen - figured that this meant that computing power cost was to drop in a non-linear way, and would become affordable to consumers. This eventually led to the PC.

    The first personal computer – the Altair 8800 -  when it came out in 1975, was targeted at the uber-geek of the day – the computer enthusiast! This would probably not have become a mainstay had it not been for a program called the VisiCalc spreadsheet. VisiCalc was the predecessor to the Lotus Spreadsheet which in turn preceded Microsoft Excel.  VisiCalc was what propelled the PC into business computing.  Businesses started buying PCs so that their finance people could run VisiCalc. This mass purchasing in turn brought down prices which in turn led to more consumption, both on the business computing and the personal computing front, which spurred more innovation and more price falls. A virtuous cycle got started, and for the first time we saw the growth of a segment called personal computing.

    However, personal computing would perhaps have remained small had it not been for a phenomenon which caught almost everybody unawares – the World Wide Web. When Berners Lee invented the hyper-text at CERN in 1990 to solve the problem of navigating thru complex documentation, no one realized that it would enable hyper linking on a global scale, making everyone with a connection to the Internet a publisher with a planet-wide audience.

    To summarize, the 1950s saw computing move from the domain of a few governments to becoming available to businesses – business computing was born. The 70s saw the birth of personal computing. The 90s saw the WWW which led to huge growth in personal computing as well as business computing. Every 20 years, the industry witnesses a revolution of sorts. It is almost 20 years since the last one. What’s next?

     

    Industry Trends

    One of the truest trends in the industry has been Moore’s Law - it has held true for the last 40 years – computing power is still doubling every 18 months or so. Differently put, computing power prices are still dropping non-linearly after four decades. Incidentally price of storage has also been falling non-linearly. The only major component which does not obey this law is bandwidth where price falls seem to be linear. I have outlined the hardware trends in a couple of earlier posts as well, you may want to go thru them for more context.

    Second, the mobile phone has put enormous computing power in the hands of everyone. The cellphone that I carry runs a Windows operating system on a 32-bit chip and has more storage and computing power than was available on a full-fledged desktop PC a decade ago. And these phones are getting more powerful faster than the PCs – the only constrain is battery power. With the converged device getting cheaper and more strongly subsidized, and the telcos tearing down their wall gardens, you can bet that the mobile phone would be key to pushing the computing power in the hands of the masses. Someone called the Mobile Phone the PC of India. I couldn’t agree more.  

    What does all this mean? If you think about the buying capacity pyramid, the governments form the tip, large businesses form the top, small & medium enterprises form the bottom and consumers form the base. The history of computing has been one of transfer of growth from top towards bottom. Moore’s law ensures that. What it means is that over the next decade, the big growth in IT would happen towards the lower half of the pyramid. Personal computing would drive the next generation of growth in computing. This is also evident from the data on shipment and growth rate for PCs, Servers and Mobile devices:

    - Servers: 8.8 million server shipments in 2007, growing at 7% (https://www.gartner.com/it/page.jsp?id=608710)

    - PCs: 255.7 million PC shipments in 2007, growing at 10.5%. (http://gartner.com/it/page.jsp?id=502458)

    - Mobile Devices: 32.2 million Smartphones in Q1 2008, growing at 29% (https://www.gartner.com/it/page.jsp?id=688116)

    Now a significant chunk of the 8.8 M servers is being used to power up the cloud. (Microsoft alone adds around 10k servers / month), but as you can see, the much bigger growth in computing power is coming from the Edge - the PCs, the mobile devices and a significant number of those Servers are getting deployed on the Edge. Also the growth rate is faster on the Edge. So try and visualize this:

    - For several years before the advent of the Internet, businesses had been investing in computing power on the Edge - so there is a much bigger base of data and computing power on the Edge

    - Far more computing power gets added to the Edge year on year as compared to what goes on the Cloud. Also, the rate at which this power is added is much higher for the Edge than the Cloud.

    So what this means is this: The amount of data and computing power in the Cloud are much smaller than on the Edge, and are also growing at a much slower pace than the Edge. I could not find enough data to quantify this, but I hope you are getting the picture - the Edge is much bigger than the Cloud, and is growing at a faster pace, and given the cost trends, this is likely to continue as more and more consumers are able to afford computing.

     

    Distributed Computing Economics:

    Jim Gray's famous paper on distributed  computing economics makes the argument that if we look at computing tasks in terms of their costs apportioned across CPU, Network, Data Access and Data Storage, most computing tasks require that the computation happen close to the data based on the fact that the cost of computing has been falling way faster than cost of network.

    Moving computation to a remote CPU requires that the cost saved by moving computation be > cost of moving data to the remote facility.  This typically works for a compute intensive, stateless task which has minimal data access /storage during computation, and a small input / output. Such a task is highly mobile - an example would be to calculate no. of primes in a range.

    However most computing tasks are data bound - they have a large input and output, and are data intensive during computation. Such tasks are immobile. Most business computing falls under this category, as does most of Internet computing - HTML, Email and FTP, since all of this requires data.

     

    The Swing of the Pendulum

    The takeaway so far is:

    1) There is more data and computation available on the Edge than in the Cloud and this trend is likely to continue.

    2) Computation should happen close to data

    What this means is that a lot more computing would happen on the Edge than on the Cloud. In a sense, computing would flow from the Cloud to the Edge – it is cheaper to compute locally than pay for the round trip to a service in the sky. It just does not make economic sense to let that computing capacity on the Edge go waste, and instead take data from the Edge to the Cloud to compute it there.

    Ray Ozzie – Microsoft’s Chief Software Architect – often talks about this thing he calls the “Swing of the Pendulum.” The trend is this – there is a movement in IT from centralization to decentralization and back. 1950s saw the giant mainframes – as centralized as it could get. The decentralization came in 1970s with the PCs. The web then drove centralization once again in 90s with key applications like email and search becoming centralized. Today, you hear about centralization in the form of Web 2.0 in the consumer space, in the form of Service Oriented Architectures (SOA) in Enterprise IT, and Software as a Service (SaaS) for the SME. If the argument above is valid, the Pendulum is going to swing once more toward decentralization. Driven by economic forces, a lot more computing would happen on the Edge than on the Cloud.

     

    The Best of Both Worlds?

    To proponents of centralized computing - Web 2.0 / SaaS - the conclusion above may come as a bit of a surprise. But there are arguments on the centralization aspect as well. The biggest argument in the favor of centralization is Utilization Efficiency. A lot of cheap hardware at the end of the day is still costly. And the power that computing is consuming is increasing day by day. (I would actually love to see a graph which plots % consumption of energy worldwide by workload. I think computing would be second only to transport in the aggregate). So efficiency in hardware and power utilization are going to be key concerns  going forward. It is well known that Cloud computing offers much better Utilization Efficiency as compared to Edge computing. This would be a key driving factor in the favor of Cloud Computing. The other trend that would favor Cloud computing is a change of business model in getting computing subsidized by advertising. Remember from an earlier point in the discussion that the big growth in computing would come from Consumer Computing. And this is where advertising plays a big role.

    So what I believe would happen is that we are going to see would be a hybrid world with the tenets of centralized computing combined with decentralized computing. The richest scenarios would be the ones where the power of Cloud Computing (infinite scalability, 100% uptime, ability to process a lot of data) is combined with the power of Edge Computing (local data, rich UX). The recent trends in RIA are about this: From HTML (pure Cloud) to AJAX (JavaScript in browser) to Flash / Silverlight / AIR (abstracted OS APIs) to WPF (full utilization of client computing power). The recent trends in communications are also about this: On-premise Email (Exchange Server) is being complimented with Hosted Email (Hosted Exchange) and EMail Services (Windows Live Mail). Windows Live Mesh is yet another example of combining the Cloud and the Edge.

    Microsoft calls hybrid apps like these "Software + Services" applications. Terminology apart, you can bet that the dominant application architecture going forward would combine Cloud Computing with Edge Computing. What exact shape would this take? I discussed some of the lower level issues in my previous post: Learning to Program all Over Again. However, it is a much bigger discussion and I would try and cover some of my thinking on this in a few posts going forward.

    May 30

    Learning to Program all Over Again

    This is just a jotting down of thoughts I have been having for the last couple of weeks. Had an interesting discussion with Pandu on this last evening and I thought it would be valuable writing down some of the key takeaways. Most of this is early thinking, but I think that as programmers, all of us need to start thinking hard about the issues below and start figuring our way around them.

    1) Learn the trade-offs all over again: The CPU is getting cheaper and would continue to do so at an ever increasing pace as transistor density continues to increase. However, while disk capacity is getting cheaper and would continue to do so, disk IO is not getting any faster. While IO has always been the big bottleneck , with CPUs getting more powerful, more and more workloads would become IO bound. Thankfully Flash provides some reprieve here and as hybrid disks and pure Flash storage gets mainstream, IO contention should reduce. But to take advantage of Flash, we would need to design apps which are Flash aware so that you can keep the hotter areas of your data on the Flash storage while the colder ones go on magnetic storage. A much harder problem is the memory wall - the access to memory is not getting faster and as CPUs get more powerful, all that computing power would end up waiting on memory. There are no hardware solutions to this, so it boils down to better caching and being super aware of data locality. So when you design a app next time, think about the trade offs all over again - CPU is cheap, code needs to be Flash aware and design for better caching and locality.

    2) Learn to Parallelize Code: I have written and talked about multi-core several times, and it always surprises me to see people getting surprised when they learn that their apps would need to be re-written to take advantage of the next set of chips. All that cheap computing power in the previous point is not easy to harness because it would be distributed across multiple cores. To take advantage of these cores, your code would need to be parallelized. This needs to happen on both client side and server side:

    a) Server-Side: While server-side workloads are mostly parallelized already thanks to the transaction model on databases for stateful workloads and the atomic request-reply model in the web-server stateless model, we do have a problem is on the business logic code. As your web-server hands over a request to a business logic component and it does its computation, there are other requests waiting to be serviced by the same business logic component even as a CPU core lies idle.

    b) On the client, the situation is worse - all UI is done using one thread. And while some apps use the background worker thread approach, even there most of the apps tend to use a single thread only. Now on the client again, the CPU is not getting any faster, but more and more slow ones are becoming available. So for your app to take advantage of the new hardware, it needs to be parallelized.

    So irrespective of whether you are on the server or the client, you need to re-write your code. I had posted on some of what you can do earlier over here. Since then, Microsoft has launched a full fledged Parallel Computing Developer Center which is a great resource to learn the issues and how to address them.

    3) Learn to Work with Offline Data: While bandwidth cost is getting cheaper and coverage is increasing, the fact is that it is flaky - sometimes you have a lot of it, sometime you have none at all. Also, there are other points of failure in the system and there is no chance that you would have the latest, correct data all the time. This has two consequences:

    a) From a client side perspective, since the expectation is that with pervasive connectivity, you can be online anytime, connected apps is the way to go. However, the flakiness also means that you need to work offline. That means storing data locally and sync-ing it with the online service depending on bandwidth availability. This would at times lead to issues - the pricelist on the tablet PC of the sales guy may change while he makes a transaction based on the data he has. Offline data is not friendly towards consistency, but availability. And a business would rather make a system more available, even if it means that sometimes business transactions would happen on wrong data which would require compensating actions.

    b) From a server perspective, since your app and data would be residing in the Cloud, spread over several boxes, maybe several datacenters, changes to data would happen in different places at different points in time by different activities. So the change in inventory for example would need to be propagated and since bandwidth is flaky and boxes would go down, you would need to take the decision on the current inventory level based on the data that you have, and not some global version of the inventory. At some point in time, the data would sync and it would be figured that some transactions happened on bad data and these would need to be compensated again.

    This model of working with data you have (I am calling this offline data for the want of a better word) in an optimistic way hoping that the data is correct, syncing at some later time in point, figuring out the inconsistencies and then dealing with them is increasingly going to be a complex problem all App architects would have to solve. While there is some technology to help - ADO.net Sync Framework, Workflow Foundation and Communication Foundation come to mind - the business logic of detection, deciding on the compensation and carrying out the compensation would need to be designed in the biz-logic of the app.

    4) Learn to Slice up the App: The point on data above leads to a bigger point on slicing up the app in general. The first level of slicing is on having an offline semi-connected client. The second level of slicing is on the server-side. Some of the server-side functionality would need to be in the Enterprise datacenter where your app may need to talk to some of the other systems. Some of the functionality would need to go to the Cloud - mostly areas where you need infinite scaling and 100% uptime. This leads to questions on what should be the boundaries, how do you come up with the boundaries, how does the communication happen between the boundaries, etc. So how we slice up the apps is a huge issue, and we all need to find our feet when it comes to this. I have some thoughts on this, but over here I basically want to highlight the issue and not discuss possible approaches - that is a topic for some other post in future.

    So those were the major points in my mind on the challenges ahead and skills we would need to acquire to address these challenges. Obviously, various technologies would also evolve to help solve these issues - new languages, new algorithms, new frameworks, new design patterns .. I think the way to go would be declarative - state your intent and let the runtimes handle stuff for you. Also it would be mostly model-led. Lots of new stuff to learn - it never gets too dull here!

    May 08

    Web Innovation Conference 2008 - Mumbai

     

    Just got back to my hotel room from the Web Innovation Conference 2008, and despite being really tired, I am feeling quite happy since this was one of those rare conferences that I really enjoyed - both as a sponsor and as an attendee. The crowd was quite diverse and knowledgeable and I had a lot of very lively conversations with decision makers, developers, designers and marketeers from our customers and partners.

    My key takeaway was that in India, people are doing some great work, have great ideas and are truly excited about converting them to reality. This was most energizing. What was also very heartening was to see the buzz and excitement around Silverlight. Almost everyone whom I spoke with (and I must have spoken to at least 50-60 people thru the day) was quite impressed by the capabilities of Silverlight and wanted to find out what more could be done with it.

    But most of all, I think I would remember this conference for making some new friends - met Amuleek Bijral from RSA after several years and it was fun catching up, had a great discussion with Amit Somani from Google on various aspects of cloud computing and hardware trends, the state of computer science research and a shared respect for some of the greats in the field, and had a lot of discussions with Bhavin Turakhia from Directi on a thousand interesting topics - I think we could have gone on for a couple more days without a break and without repeating a subject. All thanks to Kaushal Karkhanis who got us to talk on WPF vs. AIR on camera and then ran out of space as we just went on and on!

    Great fun and very energizing!!

    May 03

    Getting Started with Google App Engine - Part I

     

    I have a new found interest in Cloud Computing, and the first attack is on the Google App Engine. Didn't exactly get off to a flyer:

    1) Limited beta - they are going to send me a mail on my GMail account on when I can start using it. But I can start right away by using the SDK.

    2) Downloaded the SDK and ran the installer - damn! Requires Python - go download separately!! This is ridiculous - why can't they bundle it with the installer?

    3) Now I am on the Python site and see a set of versions - do I install the latest version? Not sure, let me run the installer again ... ok, it requires 2.5 from Python.org or from activestate. Would 2.5.2 work? Not sure. Wish they could have given a page with a link rather than a modal dialog!

    4) Downloading Python-2.5.2.msi from http://www.python.org/download/ - it is 10 MB. Went to ActiveState and after 10 odd clicks, I now have to fill up a form... I think I am going to download the python.org version - AppEngine cannot be using ActiveState specific extensions.

    5) Download started - need to learn Python now ... next step, download the documentation from http://docs.python.org/download.

    6) Python compiler installed. Now running the installer .. it is Vista ready, but not signed yet .. makes sense, bits are in beta. Installed smoothly.

    7) Ok, now to figuring out what this stuff is. http://code.google.com/appengine/docs/ talks about a Python runtime, a datastore, users API, URL Fetch (sounds like a REST consumer) and a mail API - not bad, got this piece.

    8) Finally, how to work with this - http://code.google.com/appengine/docs/gettingstarted/ basically says that there is a dev web-server which I think will run locally and a utility to upload code to the Google cloud. The web-server seems interesting - it is an emulation environment for the cloud infra. Can I use it without the invite? Seems I can!

    9) Now reading http://code.google.com/appengine/docs/gettingstarted/helloworld.html - so the idea is that you write your code in python, describe the app environment using app.yaml (that obviously uses the YAML syntax) and then start the dev web-server with the path to your code. Pretty simple, eh? The URL - http://localhost:8080 is hardcoded, but that's ok.

    10) My next question - where is the web framework - is answered in http://code.google.com/appengine/docs/gettingstarted/usingwebapp.html. So any CGI targeting framework written in pure Python would work. You got to upload the framework along with the app ... hmm, that keeps things simple for AppEngine while giving a choice to Python toting devs. The other choice is to use the AppEngine supplied framework called webapp. Since I do not know any Python based framework, I think I am going to use this. Not sure of what Python devs would typically use. I have heard of Django and Zope.

    11) Finally went thru the links in http://code.google.com/appengine/docs/python/ and this gives me a decent idea of what this environment is like. It is a simple programming model and I think that is a strength, though having to learn Python may dishearten your average dev.

    So I think what I am going to do next is to maybe write a bunch of simple apps on the local dev-server and pick up Python while I wait for my AppEngine invite. This is cool - I wanted to learn a new language anyway and AppEngine forces me to learn Python which I can also use for SL 2 demos!

    March 24

    Microsoft's Recent Announcements on Web Technologies

    I have now been away from the blogosphere on yet another extended leave of absence. The reason is that I have been in the midst of change - I now lead Web Strategy at Microsoft India. The web is evolving at a very fast pace and Microsoft has a unique position amongst all technology vendors when it comes to this space. This becomes even more interesting in the India context - the pace of socio-economic change is unprecedented, and companies here are coming up with some very innovative business ideas to succeed, so I feel both excited and humbled at the opportunities ahead.

    Now if you are interested in the Web (who is not?) then it cannot be that you are not aware of Mix - Microsoft's biggest conference on the web technologies space. Like the last two years, Microsoft made some big announcements around the Mix timeframe and showcased some very interesting new technology. This post captures the announcements and explains what they really mean.

    1. Internet Explorer 8
      • CSS 2.1 + DOM Enhancements: As Dean Hachamovitch pointed out in the Mix keynote, IE 8 is heavily focused on Developers, and the biggest problem web-developers have had is the non-compliant, quirky behavior of Internet Explorer which has led to countless hours spent in achieving cross-browser compatibility. The goal with IE 8 is to achieve 100% compatibility with the CSS 2.1 spec (note that the spec is still in a "candidate recommendation" status). It also tries to address some of the top requested features from CSS 3.  More details are available on CSS Improvements in Internet Explorer 8. The other big step towards standardization is an enhanced DOM which is closer to the implementation in other browsers. More details on this are available on What’s New in Internet Explorer 8.
      • CSS 2.1 Test Cases: It is never easy to define the behavior of software unambiguously in regular English. Specs therefore go in great formalism to try and define as clearly as possible the expected behavior of software. (This has also led to efforts like Spec#) Getting that clarity for sophisticated markup like HTML and CSS is even more difficult. Which is the reason why the W3C also defines a test suite for CSS that consists of 487 odd tests. What Microsoft has done is to submit an additional 700 tests to W3C - the idea is to get these included in the W3C suite and get feedback to improve IE's adherence to CSS specs.
      • Standards Mode: The evolving nature of the HTML and CSS specs means that there is a lot of content on the Web which is not compliant with the standards. This means that browsers like Internet Explorer need to follow the "Don't Break the Web" rule in a strong way, while being standards compliant at the same time. Note that this not just an IE problem. The so called Quirks Mode is something that all browsers - Mozilla, Opera, IE - need to maintain. Now what has changed in IE is that besides using the DOCTYPE switch, in order to take advantage of the fully standards compliant mode of IE 8, authors would need to include a META content tag as well. The reasons for doing this and its implications are a bit complex and I would discuss that in detail in a future post. For now, if you want to read more about this, start from Chris Wilson's post on Compatibility and IE 8, and finally read Versioning and IE Modes.
      • Scripting Improvements: The other area where standards compliance was needed was the JScript engine. There are certain compatibility issues in the Microsoft implementation of the ECMAScript Language Specification 3rd Edition (ES3). These have been identified and documented at JScript Deviations from ES3 which should act as a guidance for developers working on cross-browser apps. Of course as a developer, I do not want to understand the differences and JavaScript should behave identically on all browsers and my code should just work! For those of us who were expecting IE 8 Beta 1 to deliver on this, the news is not exciting - IE 8 JScript Engine does not fix the compatibility issues. The JScript team is of course fully committed to achieving full compatibility, but we will have to wait a little longer to see that. However, what we are getting in JScript as a part of IE 8 are improvements on areas which were huge pain points for all developers. There are two key areas of improvement - performance and memory management:
        1. The perf issue - JScript is notorious for its string performance. (see this benchmark for details) The good news is that now operations like String Concatenation - which are very common in AJAX - are several times faster. The JScript GC has also been tuned so if you are creating a number of strings and discarding them, the number of collections is reduced, hence increasing the perf. Improvements have also been made to built-in array operations. Other improvements include DOM Search using the getElementById and getElementsByTagName methods, and the availability of CSS Selector APIs. More details are on perf improvements can be found on this whitepaper.
        2. Circular Reference Memory Leaks: I am not sure how many people are aware of this, but the JScript Garbage Collector does not control lifetimes of DOM objects, it does lifetime management only for JScript objects. The result is that JScript cannot break circular references between JScript objects and DOM objects unless the IE process terminates (IE 6) or the user navigates away from the page (IE 7). The issue is a bit more complicated than just this - read this MSDN article if you want to know what is really happening. Now what the JScript team has done for IE 8 is to go and make changes to underlying COM plumbing infrastructure to address this problem at a fundamental level so that the issue is revolved not just for JScript, but also for any other components which face similar issues. For a JScript developer, the impact is that now DOM object lifetimes are also controlled by the JScript engine and hence it can break circular references between JScript and DOM objects and hence not leak memory. More details are available on this whitepaper.
      • Developer Tools: Every time I had to understand why my page was behaving in a certain way, my only choice was to open that page in FireFox and use FireBug. The IE Developer Toolbar solved it to some extent, but not completely. The IE 8 team has now gone and delivered a set of dev tools which tops even FireBug. You can inspect and edit HTML + CSS + DOM. You can see the details of Style, Layout and even trace styles to view computed styles. But the feature I like best is the script debugging support. You can do breakpoints, watch windows, call stack, custom script execution - it is almost a Visual Studio experience, and it ships as a part of IE 8! More details are available on this whitepaper.
      • Activities: IE 8 introduces the notion of an "Activity" - a contextual UI pattern that can used to access a service from a web page. The pattern is activated when the user brings up the context menu by a Right Click or by selecting some content on the page which brings up a smart-tag like button. Activities can then be used to perform Lookup operations like Search, or Send operations like email and blog. Much like the Search Provider functionality that shipped with IE 7, an Activity is user deployed and can be promoted by any website. In this sense, an Activity is yet another kind of a browser plug-in. Activities are declaratively created, using an XML grammar called the Open Service Description. More details can be found on this whitepaper.
      • Web Slices: A Web Slice is a subscription to a frequently updated portion of a Web page, much the way RSS is a subscription to the content of a Web-page. WebSlices uses the hAtom microformat to describes a feed and feed items. But while hAtom can only represent static content, WebSlices work with dynamic content that is expected to change. To achieve this, WebSlices reuses properties from hAtom and adds properties for subscription. From a publisher standpoint, this allows creation of update-able content right inside the page which can stay with the user even after he has closed the page. More details can be found on this whitepaper.
      • Other Improvements: In IE 8, the IE Frame and the tabs are separated into different processes. This increased isolation improves perf and reliability. This allows the Windows Vista Integrity Mechanism to be used in an interesting way: The Frame runs at medium integrity level and individual tabs - manifested as separate browser processes - can run at low or medium integrity levels. More details can be found at Loosely-coupled Internet Explorer. These changes not only isolate errors from each other, but also help the browser recover from crashes and hangs. This is done by an object in the Frame that stores key data such as per tab URL history, tab URLs and order, etc. in a persistent store. When there is a hang /crash in a browser process (typically because of an external component), this object can recover the data. Since the data is persisted, it survives even Frame crashes. More details can be found at Automatic Crash Recovery. The other key architectural change is that a lot of internal communication inside the browser is now asynchronous which leads to better response times. Besides these key changes, there are improvements on the accessibility front that are detailed at W3C's ARIA support, Zoom v2, Manage Add-Ons, Per-Site ActiveX and Non-Admin ActiveX Controls, Printing Improvements, DOM Core Improvements, HTML Improvements and ACID2, etc.

        In all, IE 8 has a number of enhancements which are very exciting for a developer and promise to make life a lot simpler for a consumer. You can download details on most of these from http://code.msdn.microsoft.com/ie8whitepapers. More information on IE 8, including hands on labs and IE 8 VPCs is available on http://www.microsoft.com/windows/products/winfamily/ie/ie8/readiness/.

    2. Silverlight 2: One of the most significant announcements at Mix 08 was that Silverlight 2 Beta 1 bits were made available. An architectural diagram of SL 2 can be found at http://en.wikipedia.org/wiki/Image:Microsoft_Silverlight_stack.svg. I am going to describe the key constituent pieces here: 
      • Managed Code: Silverlight 2 is browser plugin that hosts a specially created version of the CLR (called CoreCLR) along with a subset of the libraries that ship with the .Net Framework. This means that as a .Net developer, you can start writing code for SL 2 without having to learn anything new - fundamental things like memory management, type system, generics, reflection, threading, collections, work just the same. More information on this is available on Common Language Runtime and Base Class Library in Silverlight.
      • Dynamic Languages: When Microsoft released CLR back in 2001, one of the stated goals was support for different kinds of programming languages. A category of languages which was hard to target for the CLR was Dynamic Languages. Thru the work done by people like Jim Hugunin, Microsoft has in the works a Dynamic Languages Runtime which facilitates targeting languages like Ruby, Perl, Lisp and Python on the CLR. The good news is that Silverlight 2 hosts the DLR, and hence it is possible to program against SL 2 using languages like IronRubyIronPython and Managed JScript.
      • Browser Interaction: At the end of the day, a SL app is a browser app, so interacting with the browser host is critical for several SL apps. To achieve this, the Browser is exposed to SL via .Net classes, and similarly, SL objects are available to JavaScript code running in the browser. To interact with the browser from inside SL, you go thru the System.Windows.Browser namespace.This allows you to work with browser information, get to the HTML DOM tree, hook on to events, invoke JavaScript code, etc. Details are available on Calling Client Script from Managed Code. Similarly, it is possible to interact with SL from JavaScript code running in the browser. For this, the code to be called in SL is registered during the Application Startup event in SL as a Scriptable object. This then is exposed as a property on the SL control, callable by JavaScript. Details are available on Calling Managed Code from Client Script.
      • Integrating with ASP.net: While Silverlight content can be embedded in an HTML page, the most interesting Silverlight applications would be generated using server-side technologies like ASP.net. To make it really easy for developers to integrate ASP.net with Silverlight, two ASP.net controls are provided in the SL 2 Beta 1 SDK: The ASP.net MediaPlayer Control to integrate media content (no need to learn XAML or JavaScript), and the ASP.net Silverlight Control which is a generic control that allows you to integrate XAML and any other supporting code. More details are available on ASP.NET Silverlight Overview.
      • Controls Framework: A number of web applications are form-based, and need richer functionality like the ability to work with formatted text, offline forms, spell check, etc. For developing these apps quickly, a set of controls and extensibility using a Controls Framework is required. SL 2 provides both. Moreover, the controls in SL 2 are customizable - they come with templates that can be changed to completely modify their look and feel. This is achieved by decoupling the the visual behavior of a control from its logic. The visual behavior is defined in a ControlTemplate which in turn consists of several parts. If you want a particular visual behavior element to be re-definable, you create a separate part for it. These parts can then be individually customized without any impact to the logic of the control itself. Of course this maybe too complicated for some scenarios, in which case you can always create a control using a UserControl class in the familiar ASP.net / Windows Forms way. The controls that Silverlight ships with have this decoupled visual behavior that allows them to be customized. More details can be found on Styling and Templating Overview.
      • Web Services: The term web-service is classically associated with SOAP based web-services. However, when we think of a web-service being accessed by Silverlight, we are thinking of the following kinds of services:
        • Self-Describing Services: For a service that can describe itself using using machine-readable metadata, we can generate a proxy to which Silverlight can bind. This is the process of service consumption that happens when you click "Add Web-Reference" to a project in Visual Studio. This works exactly the same way in Silverlight 2 projects as well - you add a reference to the web-service and then you can call that service from inside SL. Now which services are self-describing? These are the following:
          • SOAP-based Services: SOAP Services from the beginning have been self-describing (thru WSDL) and hence can be consumed using a "Web-Reference" from inside VS. From a service design standpoint, the restriction is that WS-* services like addressing, transactions, etc. cannot be used in SL. So if you have a WCF service, Silverlight can use the BasicHttpBinding. The implication of this is that you cannot have message-level security and need to rely on transport-level security using SSL.
          • ADO.net Data Services (Astoria): ADO.net Data Services allows you to store arbitrary data on Microsoft's infrastructure, and provides semantics for reflection on URIs and payloads. The service itself can be described using an Entity Data Model (EDM) conceptual schema. This allows tools to consume this metadata and generate proxies which can then be consumed directly from within the Silverlight code. A simple way of doing this is to use the ADO.net Data Services .Net Client Library. Please note that ADO.net Data Services is not to be confused with SQL Server Data Services which serves a different purpose (explained below)
        • Non Self-Describing Services: These services have human-readable markup. The model is to, well, read the documentation, create HTTP requests and work with Request / Response data in XML / JSON - pretty much everything by hand. Accessing HTTP-Based Services Directly describes how you should consume such services from Silverlight.
      • Data: Silverlight is a client side technology and hence its work on data is also on the client side. On the client side, there is no database and the data is usually held in the browser's isolated storage in the form of XML files and manipulated there. This data may be data pulled in by services mentioned in the previous point, or/and it can be AJAX data. To store and manipulate the data, a virtual file system can be created using System.IO.IsolatedStorage. Irrespective of how the data is pulled in, it is manipulated using Silverlight XML facilities. This is described in detail on Working with XML Data in Silverlight.
      • Deep Zoom: In Feb 2007, Microsoft acquired a company called SeaDragon Software. Out of this acquisition came the technology called PhotoSynth and a technology called SeaDragon, which has now found its way in Silverlight 2 under the name Deep Zoom. The problem this technology solves can be understood as follows: More and more visual information is finding its way into images, (curtsey high definition cameras and the need to share detailed visual information digitally), however, our ability to access this information is limited by screen size (what are the dimensions of a 4 giga-pixel image?), bandwidth (how do you download a 4 GP image?) and graphics processing capability (how do you fit a 4 GP image into the memory of a GPU card?). Deep Zoom addresses this thru a simple solution: if you have a really large image, you are not going to see all the detail on your monitor, and will only see some of the detail by zooming into that. The Deep Zoom Composer tool converts a large image into a pyramid of small images of various sections of the image at different levels of zoom, with the whole image representation being at the top, followed by a set of images representing the next level of zoom, followed by a larger number of images for the next zoom level,  and so on. When you view the whole image, the first image at the top of the pyramid is rendered, and then as you keep moving around and zooming in an out, the apt image from the pyramid is pulled out and rendered. If you zoom in, the image would be pulled from the lower side of the composition pyramid and if you zoom out, the image would be from the higher side of the pyramid. Moving around at the same zoom level pulls out images from the same level in the pyramid. This means that at any given point, you are only downloading a small piece of data and rendering that. In SL2, you use Deep Zoom using the MultiScaleImage control. At the Mix keynote, the Hard Rock Cafe guys showed how they have used Deep Zoom to provide access to their memorabilia collection.
      • Media: Media is a key scenario in SL 2 much like it was in SL 1. The focus is to provide high quality, high performance and low TCO. The key capabilities are:
        • Content Protection: Silverlight provide DRM support based on Microsoft PlayReady, for both Windows and Mac. Note that PlayReady is general purpose DRM solution so it can be used not just for media but also other content types like images, animations and graphics. Plan is to also support Windows Media DRM - which is the media specific DRM solution used by Windows Media - on both these platforms. Also, one can use SSL to protect progressive media delivery from IIS 7.
        • Format and Codec Support: SL 2 supports WMA, WMAPro, WMV/VC-1 and MP3, with support for HD 720p video. VC-1 is considered to have twice the decode efficiency of H.264 which makes it more capable for lower-end PCs, besides having other advantages for broadcasters. A detailed comparison is available here
        • Adaptive Streaming: An interesting announcement made at Mix was that under a partnership with Move networks, SL would be able to playback a Move Adaptive Stream using the VC-1 codecs. Note that Adaptive Streaming is not a part of Beta 1 and would be made available in a later beta. So my description of Adaptive Streaming here is generic and not SL specific. Adaptive Streaming is a media delivery innovation on top of HTTP where the video stream is chunked into smaller video segments, cached on the network, and re-packaged on the client. In traditional streaming, there are maybe a couple of streams (at different bit-rates) available and the user chooses one at startup time and then stays with that stream. This ignores the fact that in reality, bandwidth varies over a period of time and the decision on which bit rate to choose should be dynamically taken. Adaptive Streaming addresses this by providing several streams at different bit-rates, splitting each stream into tiny, multi-segment streamlets. The player - based on network and client-side characteristics - can decide at startup which stream to use, and as available bandwidth or available resources on the client machine change, ask for the next segment to come on a different bit-rate. This optimizes the bandwidth utilization and provides the best possible client side experience while also reducing server-side bandwidth cost. In order for adaptive streaming to work, the condition is that the media server should expose multiple streams, should chunk the streams, and information about the streams should be query-able.
        • Media Serving: While SL is a media consumption technology, I also want to talk here about the media serving platform. There are three choices available from Microsoft:
          • Windows Media Server: Windows Media Server 9 was considered by many customers and analysts as being the best in the market for live and streaming delivery, and with Windows Media Services 2008, these capabilities have improved even further. WMS08  is twice as scalable as WMS 9, the server-core deployment reduces the footprint for dedicated server scenarios, and with a built-in cache/proxy, it is easier to configure it for edge scenarios.
          • IIS 7 Media Pack: For those customers who do not want a full fledged media server, there is the IIS 7.0 Media Pack. At Mix 08, a couple of interesting Media Pack features were described: Bit-Rate Throttling and Web-Playlists. In a progressive download scenario, BRT detects the encoded bit-rate of the file being downloaded and controls the speed at which the first few seconds are downloaded, followed by the speed for the rest of the file. This saves bandwidth and also preserves a fast startup experience. Web Playlists are meant to do two things: hide media assets and provide the ability to sequence media playback to provide pre- post- roll advertising and in-stream advertising. Both these features were the preserve of a media server, but now you get them with the web server.
          • Silverlight Streaming: Silverlight Streaming is a media streaming solution hosted on the Microsoft platform. The idea is to host media + other client-side assets (including HTML/js files which can host SL) on the Microsoft infrastructure. This transfers the hosting + bandwidth cost to Microsoft instead of you having to pick it up. Currently the service is in a pre-beta stage and targeted at developers with a per user limit of 10 GB storage and 5 TB / month of aggregated bandwidth being offered for no charges.
      • Mobile: A very exciting announcement made at Mix was that Microsoft would be building SL for Mobile devices as well. People were expecting it for a long time, ever since WPF/e - the former codename for Silverlight - was announced, however, what was unexpected was that SL Mobile would be available cross-platform, with support for Symbian as well. With this announcement, XAML is now available across three screens: PC, TV and Phone. (Confused about TV? Well, Media Center SDK also uses XAML). Now SL for Mobile devices is going to have a somewhat different roadmap: SL 1.0 for Mobile CTP is expected in Q2 2008, with the RTW in Q4. As SL 1 Mobile goes to RTW, you would be able to see the CTP for SL 2 Mobile, which would then RTW in Q2 2009. Or so says the slide deck from the session T12 at Mix by Amit Chopra and David Kline.

      To build apps using Silverlight 2 Beta 1, you would need: Silverlight 2 Beta 1 runtime, Microsoft Silverlight 2 Software Development Kit Beta 1, Microsoft Silverlight Tools Beta 1 for Visual Studio 2008, Expression Blend 2.5 March 2008 Preview. More resources and tools can be found on http://www.microsoft.com/silverlight/resources/default.aspx.

    3. Expression Studio 2: Microsoft also announced Expression Studio 2 Beta 2 with an expected release of H1, 2008. Note that Expression Studio 2 targets Silverlight 1. For Silverlight 2, support will be provided in future versions. There is a ton of new functionality available across the Expression 2 products, and you can find all that on the new community site - http://expression.microsoft.com/ - which was also announced at Mix. To build Silverlight applications, as a designer you would need:
    4. SQL Server Data Services: To me personally, the most interesting announcement at Mix 08 was that of SQL Server Data Services.  As Soumitra Sengupta - architect on the SSDS team - was keen to point out, SSDS is Simple, but it is not SimpleDB. Much the way SQL Server is available on a PC, on a server and on a mobile device, it is now also available as a service. This service runs on several commodity boxes hosting the same SQL Server technology that you and I use on a daily basis. Right now SSDS exposes a small subset of the capability of the on-premise SQL Server product, but this would change as the service matures. The vision is that customers would be able to store virtually unlimited data at on-demand scalability on SSDS under enterprise-ready Service Level Agreements on a pay-for-use model. The data can be synced with an on-premise SQL Server using Microsoft Sync Framework, or consumed by an app using SOAP / REST interfaces. The idea is to have three types of SQL Server capabilities for customers: on-premise, hosted and as a service. Exchange Server already has the service offerings, and BizTalk Server is well on that journey with the BizTalk Services initiative. Note that this is different from ADO.net Data Services which has a similar name, but serves a different purpose. Nigel Ellis gave a talk on SSDS at Mix 08, and there is a blog by Jeff Currier that explains how to use SSDS. There is also a channel 9 video of Dave Campbell talking about SSDS. The technology and its implications are very interesting and I will post more on this in detail later. To find out for yourself, register for the free beta.
    5. Partnerships: Microsoft also announced a number of partnerships at Mix 08. The keys ones are: 
      • Nokia: The agreement with Nokia is to make SL Mobile available for S60 devices on Symbian, Series 40 and Nokia Internet tablets. Given that S60 has more than 50% of SmartPhone marketshare, this gives SL Mobile tremendous penetration. This also means that the gates are now open for other handset manufacturers to approach Microsoft to provide SL-enabled phones. However, the fact that SL Mobile is a royalty-free runtime, may not work for somebody like Apple who want to charge for runtime on the iPhone.
      • DoubleClick:  DoubleClick announced DoubleClick In-Stream - their SL2 based SDK for in-stream advertising. This would allow publishers to dynamically serve video ads from DoubleClick's DART platform into the SL 2 video players and get reporting and forecasting data. According to Ari Paparo, VP Advertiser Products, DoubleClick, who spoke about the partnership at Mix, the SL 2 support is expected in Q2 2008.
      • Move Networks: I talked about Adaptive Streaming support in SL2. The partner with whom this capability would be going live would be Move Networks. The nature of the partnership was not described, but I guess more details would be shared in the next few months.

    So Mix 08 had some interesting announcements on all fronts, but most prominently on the client side with IE 8 and SL 2 betas getting released. Some of the long-term bets Microsoft is making - SL Mobile and SQL Server Data Services would have implications not just for consumer computing, but also for business computing. Ray Ozzie also hinted at some interesting stuff happening on the Live front during his keynote. Combine this with the Oslo announcements a couple of months back, and you start seeing the Software+Services vision which Ozzie outlined in Mix 07 last year taking shape: On-Premise software with corresponding Service and Hosted offerings, with tight integration with x-plat client side technologies. All very exciting stuff ... what a time to be here in this industry!!

     

     

    September 11

    On Passwords

    At Microsoft India, we have been trying to raise  security awareness in developers by sharing best practices for quite some time now. Yet, the lack of security knowledge is astounding, even on basic issues like password verification. So when I see a post like this one, it really makes me feel better! Thomas Ptacek talks about salted hashes, techniques of password breaking and how to counter them in a very easily understandable way. If you are a developer and are not 100% sure of your password verification scheme, please do read the post. If you want to know more about passwords, especially on Windows, read this excellent article by Jesper Johansson.
     
    September 09

    Do you know about OBA?

    I've been away from the blogosphere for a almost three months now. June was spent planning and executing TechMela, Jul is the beginning of a new financial year at Microsoft and I was first taking a bit of break and then was heads down in FY08 planning. Since Aug beginning, I have been on the road, evangelizing Office Business Applications (OBA) - my new charge this year.


    What is an OBA? It is not really a new concept - the idea is to access backend line of business (LOB) applications like ERP, CRM, SCM, Core Banking / Insurance applications via the Office Platform. Microsoft has been urging its customers and partners to adopt this approach for a pretty long time. Four years ago when I was responsible for our E-Biz stack (BizTalk, SPS and CMS) and Office 2003 was released, I had led several projects involving InfoPath talking to a SPS Doc Lib in turn integrating with a backend app via BizTalk. So what has changed with Office 2007? Technology-wise, the changes are incremental, but business strategy-wise the focus on backend integration is very strong, and that whole strategy is coming together under the brand name - Office Business Applications. This is a long term play for Microsoft and you would see us providing more and more capabilities on this going forward.


    Now why is Microsoft so serious about this and why should you care? Javed Sikander provides a great coverage in his whitepaper - Building Office Business Applications: A New Breed of Business Applications Built on the 2007 Microsoft Office System and I would strongly urge you to read it to get the context on OBA. In brief:
    - Companies spend a lot of time, effort and money to deploy backend LOB applications. Adoption of these applications however remains low because of user discomfort with the new (and often clunky) UI
    - Even if the application is adopted by the users, it only fulfills the needs of task workers, knowledge workers however continue to do a lot of work outside these business apps which is mostly ad-hoc collaborative work involving spreadsheets, emails and word documents.
    - This data remains locked in users' desktops and never captured by central IT, nor are the ad-hoc collaborations under the purview of central IT.
    - Office Business Apps allow users to access backend apps via Office increasing user adoption and bringing the data locked in users' desktops and the collaborative processes back under Central IT.


    Obviously this is huge value for a customer - OBAs not only increase the return on the investment in the backend apps, but also allow IT to capture organizational intelligence. ISVs who make the backend applications also like the OBA approach because this increases the adoption of their products and that helps increase the revenue they make. Finally, system integrators love OBAs because they make money on services required to integrate office with the backend apps. And yes of course OBAs are strategic to Microsoft because they increase the value of Office to the customers. This is what you call a win-win.
     

    So which technologies are used to make OBAs? Top of the mind recall:
    1) OpenXML allows embedding of LOB data in Office documents and allow server-side processing of Office docs which was cumbersome so far as it involved invoking the Office Apps on the server side which they are not really meant for
    2) The Fluent UI allows creating custom Ribbon Tabs / Groups / Buttons and Custom Task Panes which can be used to provide contextual interfaces
    3) InfoPath provides a forms interface which is available both as a standalone app as well as in the browser
    4) WSS 3.0 and MOSS (Microsoft Office SharePoint Server) 2007 provide workflow capabilities as they host Workflow Foundation. The workflows can be created using SharePoint Designer as well as Visual Studio
    5) MOSS Search provides enterprise search across websites, fileshares, mail servers, business data and extensiblity mechanisms to index custom file formats (using IFilters) and custom content locations (using Protocol Handlers)
    6) Excel Services provide a server-side equivalent to Excel including a web interface and a web-services API
    7) The Business Data Catalog in MOSS is used to model applications inside MOSS and surface LOB data inside MOSS which can then be consumed in Search, Business Data Web Parts and custom applications
    8) Enterprise Content Management capabilities in MOSS
    9) MOSS itself provides a highly scalable server side composite apps platform which can be used in organizations of all sizes.


    What have we been doing about OBAs in India? A lot - we were fortunate to have Javed Sikander visiting India in Aug and while he was here, we did a couple of Architect Roundtables, met several large partners all of whom are very excited about OBAs for the reasons I outlined above and spoke with Press to help raise awareness on OBAs. We also did a webcast series on OBAs and conducted a couple of workshops. Now in Sep, we are going on the road once again talking about Enterprise apps and Office Business Apps. In the coming months, we would be conducitng a number of sessions and posting a whole lot of content on OBAs. Stay tuned!

    May 22

    TechMela - Why One City?

    A lot of people have been asking me on mail as well as on this blog why we are doing TechMela in just one city when traditionally we have done TechEd around the same timeframe in multiple cities. Well, here are the reasons:

    1) We wanted to do a much bigger event this time by covering a much larger number of technology areas by combing content from TechEd, Mix, MEDC and ITPC:

    a) The number of sessions we are doing this year is around 150, more than twice the number we have done usually. BTW, this does not include HOLs.

    b) The number of tracks we have this year is 21, four times as much the number of tracks we usually do

    c) The number of speakers we have is around 50; again more than twice the number we get usually. A big percentage of the speakers are coming from the product groups.

    2) One of the consistent pieces of feedback has been around the need for our developers and IT Pros to hear directly from the Product teams in US and the the rest-of-the-world. This interaction is immensely helpful

    a) to the Redmond teams, as they take back valuable input from our audience and that translates into better product

    b) for the participants because they get a view from the guys who actually build the product and their deep insight into how the technology landscape is evolving

    3) Its tough to maintain the same quality of experience when the venues change and the facilities don’t keep pace with the demands of our marketplace. To meet the diverse needs of the audience we need atleast 7 tracks (each day) and we need to have atleast 200 – 300 person capacity in each track. The hard part of the issue is that we don’t have that infrastructure in India!! Heard about public infrastructure lagging behind….well looks like that’s true for convention hall infra too!! Its improving but not fast enough.

    There were a couple of other points I wanted to address:

    1) Why Mumbai, why not Bangalore?
    There were two reasons for choosing Mumbai:

    • Venue: The scale at which we are doing the event this year was not possible at any venue in Bangalore. The only possible venues were either in Mumbai or Hyderabad. After considering connectivity options on domestic and international flight circuits, Mumbai won.

    • No. of events: Almost all developer events, by Microsoft and by other companies, end up happening in Bangalore. This results in what our marketing team likes to call “audience fatigue” – an allusion to the fact that since the target audience is same, people get sick of getting initiations to events and are not able to take out time for “yet another event”

    2) What about other countries? Don’t they also do TechEd in multiple cities?
    India was unique in doing the events across so many cities. If you look at TechEd US, that happens in one city (this year in Orlando), the same with TechEd Europe (Barcelona), and most other countries. For details, check out http://www.microsoft.com/events/teched2007/worldwide.mspx. I do realize that by doing this in one city, there will be people in some cities who would be constrained in their travel, and would not be able to make it. The way we seek to address this is by rotating the cities where we do this event every year going forward.

    So if you are not able to make it to Mumbai on the D-Day, what are your options? Well, frankly I do not know today. We do want to take some of the content from the event to other parts of the country, but as of now there are no plans for it. So your best bet would be to take out some time, get your company to invest some money in your skills and come over to TechMela. One thing that we can promise to you is that we will make this worth your time and money!

    May 14

    TechEd + Mix + MEDC + ITPC == TechMela

    Today is May 14, 2007, and exactly a month later, Microsoft India would be kicking off its biggest event ever for Architects, Developers, IT Pros and Designers – TechMela. TechMela 2007 takes the content from four major events Microsoft India has been doing – TechEd, MEDC, IndiMix and ITPC – and adds even more content to tell for the first time the full Microsoft story of Software, Services and Experiences.

    Over three days and across seven streams we are going to conduct almost 150 technical sessions, over 15 chalk talks and over 50 self-paced HOLs. We are getting speakers from all parts of the world and all parts of Microsoft – experts who have developed the technologies, technicians who are working with some of our biggest customers to adopt the technologies and consultants who are helping our partners build products and solutions on the technologies. It’s a galaxy of technology super stars all under one roof coming together at such a large scale for the first time. And to top it all, we expect over 3000 technical professionals from all over the country to join us at Mumbai where the event will be taking place.

    As we begin the countdown to TechMela, thru this blog I will try and take you behind the scenes into the motivations behind doing some of the key sessions and the background to the technologies we will be covering in these sessions. I will also try and provide links and references to other material which you will find useful to read thru before attending the sessions. Stay tuned in and you will discover even before you get to TechMela what all would be covered there. Of course to experience the real thing, you would want to join us at TechMela!!

     

    February 26

    Key I/O Changes in Windows Vista

    Updated on Mar 31 with Links, more background material and detail.

     

    The most talked about improvement in Windows Vista seems to be Aero - the composition based desktop which people either seem to love or hate. This is followed by the UAC which again seems to enjoy a love-hate relationship. However, there is a lot of technology shipping in Windows Vista which anticipates long term changes in storage hardware and helps the operating system be more peformant in light of those changes. This article gives an overview of some of these enhancements; the goal here is not be exhaustive and talk about every improvement, but to provide insight into why these are so important and their long term impact on PC computing.

     

    Background

    Just to establish the vocabulary, and ensure that there is no ambiguity, I will define some terms here:  

    1)      Primary Storage – This is connected to the CPU and is volatile. There are three kinds of primary storage:

    a)      CPU Registers: Internal to the CPU – this is the fastest storage

    b)      Main Memory: This contains the programs that are currently running and the data the programs are operating on. This is typically implemented as solid state volatile RAM connected to the CPU via a memory bus and a data bus.

    c)       Cache Memory: Modern CPUs are too fast for main memory so a cache is often used to prevent the CPU from waiting on main memory. This cache memory is quite small (cache size to RAM size ratio could be anywhere from 1:1 to 1:64 or even more), but is much faster than the main memory. Data is duplicated from Main memory into the Cache. Typically Cache memory is implemented as a hierarchy – primary cache (L1) being closer to the CPU and smaller, faster than the secondary caches (L2, L3)

    So you can actually imagine the primary storage to be a hierarchy with the fastest storage being closest to the CPU (in fact there could be hierarchy even within the CPU registers) and storage getting slower and bigger as we move away from the CPU.  

    2)      Secondary Storage – This is used for persisting data and is therefore non-volatile. Secondary storage is connected to the CPU using I/O channels. This is mostly implemented as magnetic disks with spinning platters or more recently using solid state drives using non volatile memory such as Flash.

     The key trends in storage hardware are:

    1)      Price of flash storage is dropping very fast, falling faster than expected at around 40% / annum. Compare this with an average of 32% for DRAM over the last 30 years or so.

    2)      Magnetic disk drives storage is getting bigger and cheaper, but so is flash memory – in fact the trend is almost parallel. There is however more to understanding the size-price trends for both. I’d recommend reading Flash Memory vs. HDD – Who Will Win for more detail. This article is dated, but has a comprehensive discussion.

    3)      The disk has been the slowest component in the hardware stack over the past 15 years, and it is unlikely that disk RPM would increase beyond 20,000. So with the introduction of multi-core, the rate of change of CPU power will outpace rate of change of storage speed even more.

    What these changes mean is that the slowest component in the system – the disk drive - would get slower and slower with regard to the CPU, so a powerful CPU would need to keep waiting on I/O. This is obviously not acceptable and we have to find a way around this.  The solution is to move data from the slowest storage to the fastest storage in such a way that we can let the CPU work mostly on faster storage, and avoid going to the slower storage as far as possible. This is done at two levels:

    1)      At one level we try and move the data from the secondary storage into the primary storage in as optimum a way as possible.

    2)      On the other level, we try and move data from the main memory into the CPU caches and the CPU registers.

     

    In this article we are going to focus on the optimization mechanisms to point 1), and specifically on moving the data from the disk into the RAM. All of these mechanisms are about buffering against disk I/O and using a faster, cheaper buffer for random I/O.

     

    Demand Paging

    There has always been this trade-off between RAM and disk – RAM is faster and expensive and volatile, and the disk is slower and cheaper and persistent. Traditionally, operating systems have been able to work out the trade off by a scheme called demand paging. The idea of demand paging comes with virtual memory, and is simple to understand – there is limited main memory, so the disk is used as an extended store, and the OS abstracts away the memory by giving each process an address space. Based on usage of the address space, the OS allocates memory from the combined disk + main memory store to the process. A page is a unit of allocation, and depending on when that part of memory was last used, the page could be in main memory or on the disk. So when an application needs to make use of data which is held in a page, if that page is not already in physical memory, it would be read off from the disk and loaded into main memory. This is called demand paging since the pages are loaded into the main memory on demand.

     

    Pre-Fetch

    The trouble with demand paging is that disk IO is terribly slow, so when the page is read from the disk instead of RAM, there is a huge performance hit. If this I/O cost could be reduced, performance would improve significantly. An interesting observation which can help here is that random IO is slower than sequential IO. Tests reveal that depending on disk and workload, sequential IO can be up to 30-100 times faster than random IO. So if the OS was to do more sequential IO than random IO for demand paging, performance would improve in a big way.  With this goal, starting from Windows XP, the OS started identifying patterns in application usage and would try and group together the files on the disk which would be used frequently.  This would convert a lot of random IO into sequential IO and make frequently used files and applications load faster. This feature in Windows XP was called Pre-Fetch.

     

    Super-Fetch

    While Pre-Fetch improves things by converting random I/O into sequential I/O, this can get even better if we were not to do any I/O at all. If the OS could somehow predict which pages would need to be read from the disk ahead of time and pre-load them into main memory, when the time comes to use them, there would be zero disk I/O cost.  This feature in Windows Vista is called SuperFetch – Vista speculatively loads code and data likely to be used by a user ahead of time from the disk into memory so that an application gets launched (or data is read by the application) “warm out of memory” instead of “cold out of disk”.

    A simple example of this is the scenario where you have a couple of applications running (say Outlook and IE), and you go for lunch. While you are away, a background program (search indexer, defragmenter, anti-virus, etc.) runs, consumes a lot of memory and releases it. To get this program to run, the Virtual Memory Manager in Windows would have to evict the pages of your foreground apps to disk. So when you come back and start reusing your apps, you would find them sluggish. This is where SuperFetch comes in – once physical memory becomes available, it would ask the memory manager to reload the memory with the data that was recently paged to the disk. So when you start using the system with SuperFetch running, you would not have to wait for I/O – SuperFetch would have loaded the pages ahead of time.

    Now given the fact that the size of disk drives is about 20 - 100 times the size of memory (I am writing this article on a 2 GB RAM / 60 GB HDD machine), SuperFetch needs to prioritize what data to pre-load, and what to leave behind on the disk, and also which activities are more important than others.  For this purpose, Vista ships with a set of heuristics that provide the prioritization, and once it starts getting used, Vista does a lot of data gathering and analysis so that it is able to recognize the user’s usage patterns and anticipate when you would need what data.  As the user keeps using her system, the data Vista gathers becomes more and more tuned to the usage and system performance improves. While the performance would improve for any scenario, some scenarios are going to benefit more than others.  The biggest impact of SuperFetch is seen today in application launching and resuming from hibernation / suspension.

    So to summarize, SuperFetching is different from demand paging in the sense that the OS is not waiting for the demand to occur – it has been recognizing patterns in demand over a period of time, and now pages based on anticipated need rather than demand.  As the gap between disk IO and processing power increases even more, making the data available in memory ahead of time would become even more important going forward.

     

    ReadyBoost

    ReadyBoost tries to solve a simple problem: RAM is great for disk caching, but adding more memory to a system is both difficult and expensive.  So what ReadyBoost allows you to do is to use a fast flash-based device (USB Stick, SD cards, Compact Flash, etc.) as an extension of physical memory which can be used for disk caching.  This works because flash is faster than disk for random I/O (and most I/O is random), and Vista performs a check on a USB disk on insertion, and if it meets the perf criteria (2.5 MB/s for random 4K reads and 1.75 MB/s for random 512K/s writes) and the minimum space requirements (230 MB free space), will let a user use it as an extension of physical memory for disk caching.  ReadyBoost is therefore ideal as a supplementary cache for SuperFetch, and in fact was developed by the same team.

    There are a number of optimizations used:

    1)      Readyboost recognizes large sequential reads and lets it be serviced by the hard disk instead of the flash since hard disks are faster at sequential I/O

    2)      The data is compressed by a factor of ½ before storing it to derive the maximum out of the storage

    3)      The data is encrypted with a randomly per boot generated key using AES 128

    From a bigger picture perspective, as the prices of flash continue to fall, we can expect that within 2007, PCs would start shipping with in-built fast-flash storages. (Intel recently announced a solid state drive product line aimed at embedded market and at motherboards in servers, PCs and notebooks) These storages would actually be much bigger and faster than the typical USB drive that you would plug into Vista today. The OS would recognize this as non-removable, non-volatile fast flash storage and would automatically use it for disk caching.  Since this storage would be non-removable, the encryption that happens today would not be required and would make the system even faster.

     

    ReadyDrive

    While ReadyBoost looks at the idea of Flash augmenting RAM in terms of increasing its size so that it can do disk caching, ReadyDrive is about Flash augmenting the magnetic disk by providing a solid state buffer.  It is a technology that ships with Vista that allows the OS to use new types of disks in the market which are called Hybrid Disks. These disks have an integrated non-volatile solid state flash memory (NVRAM) in addition to the standard rotating media. One could think of this as being a cache on the disk itself.  This cache buffers disk I/O and allows the disk to stay spun down for a longer time, thus improving perf as well as battery life.  The key areas which benefit from ReadyDrive are disk contention scenarios, booting, and resumption from hibernation.

    Ready-Drive controls the contents of the NV Cache and divides it into multiple parts. These are:

    1)      Write Cache (32 MB): Buffers write requests to the disk, allowing the disk to stay spun down longer

    2)      OEM-pinned data (1 MB+): Used by OEMs to pin key data for quickly launching preferred user experiences or Windows HotStart experiences

    3)      SuperFetch Pinned Data (remainder): Used by SuperFetch

    These drives are not far off - in fact, Samsung has recently released its Hybrid Drive to computer makers. With the falling prices of flash, it is obvious that over a period of time all disks will ship with a NVRAM cache and that the ratio of the cache to disk size would increase in the favor of the flash based cache.  ReadyDrive will therefore become more and more important going forward. In fact in near future, one can expect solid state disks to become the norm as compared to spinning disks – this trend has been there for a pretty long time, and is now coming of age.

     

    I/O Prioritization

    One can buffer against disk access using all the mechanisms discussed above, but at the end of the day, disk I/O still needs to happen.  From an overall system perspective, there is therefore a big need to be optimum about the I/O itself.  The I/O Prioritization changes in Windows Vista seek to do exactly this.

    Today the priority of an IO request from a background process is treated at par with the one from an application which may be in the foreground. If we consider applications like disk defragmenters and anti-malware agents, which do a lot of I/O requests as compared to a typical foreground application like Microsoft Word, this treatment at par can slow down the foreground applications considerably. So in Windows Vista, an application which would mostly run in the background can choose to specify that its I/O request should be given a low priority. This is done by calling SetPriorityClass and SetThreadPriority and specifying background mode.

    What this does is to embed the low priority setting in the flags field of each I/O request packet (IRP) originating from that process, and this can be read by a driver by calling IoGetPriorityHint. This ensures that I/O – a costly operation – gets prioritized correctly across the system. I/O intensive background tasks in Vista – Windows Defender, Indexer, and Defragmenter – are already using this feature, and more ISVs who develop such applications are expected to start using I/O prioritization.

     

    Better Together

    It is important to note that Super Fetch, ReadyBoost and ReadyDrive compliment each other.  ReadyBoost and ReadyDrive provide with faster caches.  This provides SuperFetch with a bigger canvas to work with – now it can put some pages in RAM, some other in the ReadyBoost storage and some other in ReadyDrive. The fact that SuperFetch does a great job in pre-loading the main memory with code and data from disk gives an opportunity for the disk to stay spun down for a longer time which alongwith ReadyDrive improves battery life and disk ruggedness.

    In terms of the entire I/O stack, SuperFetch is towards the top of the I/O stack – when your code tries to access a particular piece of data in the memory, the goal SuperFetch has is to anticipate that data and the page for that data ahead of time.  For this, it will try and pull that page out of the disk, and put it in ReadyDrive or ReadyBoost storage if they are available, or the RAM.  ReadyDrive and ReadyBoost on the other hand, are below the filesystem in the I/O stack – both of these mechanisms think in terms of disk sectors and do not think about data.

    Finally, we need to think about the impact of these optimizations – SuperFetch itself cause extra I/O and extra CPU usage, so how does one do the trade-off? The solution lies with prioritization – SuperFetch uses low priority I/O and low priority threads, so the CPU and Disk I/O overhead is minimal and does not interfere with the actual jobs being performed. In fact, SuperFetch reads only a few pages per second at Very Low Priority.  There is certainly a memory overhead which is caused by having to maintain information about what all is happening in the system, but that overhead is minimal and more than makes up with all the advantages provided.