fileUploader.js: Resumable HTML5 Uploads

fileUploader.js

fileUploader.js is a JavaScript library designed to interact with nginx's upload module, allowing for resumable uploads. The library works as follows:

var uploader = fileUploader(file, segmentSize, sessionId)

where file is a File object from the HTML5 File API, segmentSize is the size in bytes of each segment of an upload, and sessionId is the session ID used to represent this file upload (must be unique!). The methods of the uploader object return Bluebird promises (Promises/A+ compliant), which may be interacted with using the Bluebird API. The uploader offers the following methods:

  • uploader.fetchStatus(): returns a promise for a status object, which has the following structure:

      {
        completed : [true|false],
        start : <The first byte uploaded. Should always be 0>,
        end : <The last byte uploaded>,
        total : <The total size of the file in bytes>
      }
    

    completed should be true if and only if end === total - 1.

  • uploader.uploadSegments(status, onSegmentComplete): given a status object, will upload the remaining segments of the file. onSegmentComplete(newStatus) will be called after each segment is uploaded, with a status argument reflecting the new state. It returns a promise with the completed status.

Thus, to upload a file, one may do:

  uploader.fetchStatus()
      .then(function(status) {
          // Return the promise so the next step waits for the upload to finish.
          return uploader.uploadSegments(status, function(newStatus) {});
      })
      .then( function(status) { console.log("upload complete"); } );

Design

Resumable uploads in the upload module work in the following way:

  1. Uploaded segments of a single file should share a session ID to identify them as part of the same file.

  2. When a segment of a file is uploaded, it will return a status in the body, e.g. 0-5,9-15/24, indicating that the file is 24 bytes long, and bytes 0-5 and 9-15 have been uploaded. Partial uploads will get a 201 response.

  3. Segments may be re-uploaded, as long as the uploads are not made in parallel and each re-upload carries the same data for that segment.
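
As a rough illustration of item 2, the status body can be parsed into numbers with a few lines of JavaScript. This is only a sketch: parseRangeStatus and toStatus are hypothetical helper names, not part of fileUploader.js, and the sketch assumes the body has exactly the shape shown above (comma-separated ranges, a slash, then the total). It also assumes that only a first range starting at byte 0 counts as uploaded, and uses -1 for "nothing uploaded yet".

```javascript
// Parse a range status such as "0-5,9-15/24" into numbers.
// Hypothetical helper; the name and return shape are illustrative only.
function parseRangeStatus(body) {
  const match = /^\s*([\d,\-\s]+)\/(\d+)\s*$/.exec(body);
  if (!match) {
    throw new Error("Unrecognized status body: " + body);
  }
  const total = Number(match[2]);
  const ranges = match[1].split(",").map(function (part) {
    const bounds = part.trim().split("-").map(Number);
    return { start: bounds[0], end: bounds[1] };
  });
  return { ranges: ranges, total: total };
}

// Collapse the parsed ranges into the status shape returned by fetchStatus().
// Assumption: only a first range that starts at byte 0 is treated as uploaded.
function toStatus(parsed) {
  const first = parsed.ranges[0];
  const end = first && first.start === 0 ? first.end : -1;
  return {
    completed: end === parsed.total - 1,
    start: 0,
    end: end,
    total: parsed.total
  };
}
```

For the example body, toStatus(parseRangeStatus("0-5,9-15/24")) gives end 5 and total 24, so completed is false; the bytes from 6 onward would still need uploading.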

The fileUploader.js library first attempts a small upload (the first 2 bytes of the file) to get the status of an upload. Two bytes are used because the upload module has a bug wherein it will not handle repeated uploads of a single byte of a file. From that starting status, the uploader incrementally uploads the file in 1MB segments.
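
The resume step can be pictured as computing which byte ranges are still missing from a status object. The sketch below is not the library's actual code: planSegments is a hypothetical helper, and it assumes the status shape documented above, where end is the last byte already uploaded.

```javascript
// Compute the byte ranges still to upload, given a status object.
// planSegments is a hypothetical helper, not part of fileUploader.js.
function planSegments(status, segmentSize) {
  const segments = [];
  // Resume at the byte after the last one the server has acknowledged.
  for (let start = status.end + 1; start < status.total; start += segmentSize) {
    // The final segment may be shorter than segmentSize; end is inclusive.
    const end = Math.min(start + segmentSize, status.total) - 1;
    segments.push({ start: start, end: end });
  }
  return segments;
}
```

With a 10-byte file whose first 2 bytes are uploaded, planSegments({ end: 1, total: 10 }, 4) yields the ranges 2-5 and 6-9. Each range would map onto the File API as file.slice(segment.start, segment.end + 1), since the end argument of Blob.slice is exclusive.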

Stieltjes: a minimal Scala client for Riemann

Stieltjes is a minimal, UDP-only, Netty-based Scala client for Riemann, the events and monitoring system for distributed systems.

    import stieltjes._

    val client = new UdpRiemannClient("127.0.0.1")
    client.write(
      Event(Host("myhost"), Service("server"), State("ok"), Metric(3.0F)))

By default, the client uses client port 5556 and server port 5555.

Events are immutable, but can be used as templates for other events:

    val defaultEvent = Event(Host("myhost"), Service("server"), State("ok"), Ttl(20))
    ...
    client.write(defaultEvent(State("critical"), Metric(1000L), Description("critical error")))

Caveats

Because Stieltjes uses UDP, event delivery is not guaranteed. This should be used for high-volume stats tracking, rather than error reporting.

Because events are sent as fire-and-forget UDP packets, there is no mechanism for detecting if the Riemann server is down.

Responsive Design

I've been playing around with Twitter's Bootstrap library the last few days. I made a version of my site using it, which I think looks pretty nice. The design is minimal and appealing, and responds to the screen size and browser automatically. This means my site looks nice on mobile now too.

Maybe I'll switch to that version of the site once I get social and music integrated. 

English/Transliterated Persian Translator

UPDATE: Not even a week after I made this, Google announced that they will be shutting down or deprecating all of the APIs used in this project. This is slightly frustrating.

---

Quick, pronounce this:

هاورکرافت من پر مارماهى است

This is Persian. It uses the Perso-Arabic alphabet. If you're like me (that is, you don't know Persian), you can't pronounce this easily, so tools that translate from English to Persian aren't of much use. Instead, you want an English to transliterated Persian translator; unfortunately, these don't exist.

So I made one.

How it Works

Google provides APIs for translation, English to Persian transliteration, and Arabic diacritization.

Transliteration is the act of converting the alphabet used to represent a language from one to another. Transliteration to English (from, for example, Arabic-based alphabets) is well-studied. Often a user will have access to only a western keyboard, but want to type in Persian, or Chinese, or Russian. Tools like Google Transliteration will do this, but they don't go the other way.

Transliterating well is enormously difficult, as hard as translation: word meaning and intent is important. For example, how do you pronounce "bow"? It's not clear, because the word for the thing that shoots an arrow is pronounced differently from the word for bending at the waist. In other alphabets, these could have different transliterations.

However, transliterating poorly is often not hard: make a one-to-one letter map from one alphabet to the other. In Persian, though, the written form usually omits the short vowels, so doing this naively would result in a mess of consonants. Diacritization, the act of adding the vowel pronunciation marks to Persian, can provide the vowels.
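
To make the problem concrete, here is a minimal sketch of the naive letter-map approach. It is not the tool's actual transliterator: the tables cover only a handful of letters, transliterateNaively is an illustrative name, and the vowel values are the usual Persian readings of the three short-vowel marks.

```javascript
// A deliberately tiny letter map; a real one would cover the whole alphabet.
const LETTERS = {
  "\u0645": "m", // mim
  "\u0646": "n", // nun
  "\u0627": "a", // alef
  "\u0633": "s", // sin
  "\u062A": "t"  // te
};

// Short-vowel diacritics with their usual Persian values.
const DIACRITICS = {
  "\u064E": "a", // fatha (zabar)
  "\u0650": "e", // kasra (zir)
  "\u064F": "o"  // damma (pish)
};

// Map each code point independently, passing through anything unmapped
// (spaces, punctuation, letters missing from the tiny tables above).
function transliterateNaively(text) {
  let out = "";
  for (const ch of text) {
    if (ch in LETTERS) out += LETTERS[ch];
    else if (ch in DIACRITICS) out += DIACRITICS[ch];
    else out += ch;
  }
  return out;
}
```

The undiacritized word "\u0645\u0646" comes out as "mn", a bare pair of consonants. Once a fatha is added after the first letter ("\u0645\u064E\u0646"), the same map produces "man".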

To translate from English to transliterated Persian, the tool first translates to Persian, diacritizes the result, and then transliterates the diacritized Persian-alphabet text back to English. The first two steps are done using Google's APIs, and the last step is implemented, in rudimentary fashion, by me. The first two steps also tend to make errors; after all, translation and diacritization are hard problems. Nonetheless, the end result seems to be OK: for simple sentences, it is readable and either correct or close to correct. Often the translation itself will be poor, which reflects how hard a problem translation is.

The other direction, transliterated Persian to English, is easier implementation-wise, because Google's APIs will transliterate the "pinglish" back to real Persian (transliteration API) and then translate it to English (translation API). It's just a matter of stringing the APIs together.

(The pronunciation of the phrase at the top? "havercrafte man pore marmahi ast")

Graph Drawer

Graph Drawer is the example app I built on top of my Graph.js graph drawing API, and is actually pretty fun. You can share the graphs you draw (the link above goes to my drawing of the Petersen graph).

The tool also provides a way to create graphs, which can then be accessed using the API in other contexts. This is how the graphs in the Graph.js demo were created.