Wonders of YouTube

Posted on Fri 19 May 2017 in life • Tagged with life, italian, youtube, spirits, mediums

Who was Eusapia Palladino?

Eusapia Palladino era una medium italiana. ("Eusapia Palladino was an Italian medium.")

I remember saying these words over and over in my dorm room in Schapiro Hall during my third year of college. I was practicing for the midterm presentation in my Advanced Italian Conversation course, in which we had to make a video on a topic of Italian culture that interested us. Behold, in the grand wonders of the internet, the short video that I created --- and completely forgot about --- has ended up with almost 10,000 views on YouTube (as I write this) and is the most viewed video on the exciting and popular subject of Eusapia Palladino, Italian medium/magic person.

Looking back now, the video is reasonably well done, and my countless trials to properly enunciate all the words paid off. Yet you can imagine my confusion and wonderment when I checked back on this video almost five years later. What was hosted on YouTube just so it could be played one time in my class session has been viewed by thousands around the world. People have even commented. Craziness.

Eusapia Palladino was, in fact, a pretty interesting person. She claimed to possess extraordinary powers, and would demonstrate the ability to move furniture and play musical instruments without moving her body, as well as summon spirits to the room. Her abilities were "vetted" by several Nobel laureates, including Pierre and Marie Curie and physiologist Charles Richet. They detected "signs of trickery" but "could not explain all of the phenomena". Indeed, Pierre Curie, days before his death, wrote to a friend that

There is here, in my opinion, a whole domain of entirely new facts and physical states in space of which we have no conception.

Now that I look back on the Wikipedia article, a fact stands out in a footnote that is not included in the article body.

The most notorious medium who used her sexual charms to seduce her scientific investigators was Eusapia Palladino... [She] had no qualms about sleeping with her sitters; among them were the eminent criminologist Lombroso and the Nobel Prize—winning French Physiologist Charles Richet.

Perhaps Richet had difficulty "explaining all of the phenomena" because he was betwixt the sheets back in his hotel room.

You need a doorbell

Posted on Fri 03 February 2017 in programming • Tagged with life, programming, python, twilio, flask

Sometimes a dumb technical approach can be a solution to a real world problem.

I live in a graduate residence that doesn't have a buzzer system. To be granted access, my guests have to text/call me directly and I have to walk downstairs to let them in. Though this may not seem like a big deal, if I'm having more than just a couple people over, it can be a bit much to continuously be checking my phone and interrupting conversations to sprint through hallways to the door.

I know it's not just me. Two of my friends had an apartment near me, when I lived in New York, on Mott Street. Their apartment building, which was built in 1900, had no buzzer system, even though it probably housed 50 people or more (and cost more than your parents' mortgage payment per month for about 12 square feet of space). When friends were coming over, the new arrivals would text to announce that they were outside, and my friends would drop the front door key in a sock out the window onto the busy street. Somehow no one was ever injured, and it sure was a lot of fun.

The approach I take here is to make the arrival process more seamless and to transfer the burden of opening the door from the party hosts onto the guests. Go ahead and design a robot that turns a doorknob on command at your own peril.

Introducing you-need-a-doorbell

you-need-a-doorbell is a simple app that uses text messages and audio alerts to manage your guests' arrival at the party. When they arrive, guests text a phone number that you have registered through Twilio. (Not your own cell!) Their contact information is validated against permission lists, and if the guest is authorized to enter the party, an announcement is made from your computer speakers.

DING DONG! Guest Flo Rida has arrived to the party.

Since everybody in the party can hear this announcement, it gives you, the host, latitude to spin on your heels and point to some unsuspecting soul: "Go let Flo Rida in!"

The host can even have the app randomly assign responsibility to open the door from the list of guests who have already arrived. This further removes responsibility from your plate as now, not only are you not running down to open the door, but you're not even pointing a finger at anyone! (And everyone knows computers are blameless.)
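The random assignment itself is trivial. Here's a minimal sketch, with a hypothetical function name and guest names that are not from the actual app:

```python
import random

def pick_door_opener(arrived_guests, rng=random):
    """Randomly choose one already-arrived guest to open the door.

    Returns None when nobody has arrived yet (the host is on duty).
    """
    if not arrived_guests:
        return None
    return rng.choice(arrived_guests)

# Example: after two guests have arrived, one of them gets door duty.
arrived = ["Flo Rida", "Ada"]
opener = pick_door_opener(arrived)
```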

Check out you-need-a-doorbell on GitHub.

Making the app

A simple Flask web server runs on my macOS desktop. When a guest sends a text to the party-specific phone number, Twilio packages the sender's number and the message body into a POST request to a URL on my server. After the phone number is validated, a text-to-speech utility like say plays a message out of the speakers. The original version I whipped up for a small kickback at my place was only 49 lines of code.
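A minimal sketch of what such a webhook can look like — the route, allowlist, and guest details here are hypothetical, not the actual you-need-a-doorbell code:

```python
# Hypothetical sketch of a Twilio SMS webhook; not the actual app code.
import shutil
import subprocess

from flask import Flask, request

app = Flask(__name__)

# Allowlist mapping guest phone numbers to names (hypothetical entries).
ALLOWED = {"+15551234567": "Flo Rida"}

@app.route("/sms", methods=["POST"])
def sms():
    # Twilio POSTs the sender's number as "From" and the text as "Body".
    number = request.form.get("From", "")
    guest = ALLOWED.get(number)
    if guest is None:
        return "", 403  # unknown number: stay silent
    # macOS text-to-speech; guarded so it degrades gracefully elsewhere.
    if shutil.which("say"):
        subprocess.call(["say", f"Ding dong! Guest {guest} has arrived."])
    return "", 200

# To run locally: app.run(port=5000), with Twilio's webhook pointed at /sms.
```

In practice the server needs a publicly reachable URL for Twilio's webhook (e.g. via a tunneling tool).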

I added new features after my own alpha test and packaged things more nicely so that options could be specified in a config file. I really enjoyed making this, as I learned a bit about Flask and admired the simplicity of the Twilio API.

Going forward, I would re-design this so that the app was hosted in the cloud and party hosts could access everything through a web interface. The server could run on EC2 or some other service, listening for requests from Twilio. Simultaneously, the party host would log in to the server. A connection would be kept open using a web socket. There, they could manage permissions for potential guests, view those who had arrived (or those that were still expected), and listen for arrival messages. One could even import attendance info from a Facebook event. Perhaps, as I continue to learn more about web development, I'll create this interface. For now, I'm happy with what I have, and I hope you may be too.

Since I added new features after my alpha test and didn't do very thorough testing, you may find some bugs. Please let me know if that's the case.

I found that the text message interface for guests worked really well. It's interesting how everyone and their mother releases iOS/Android apps to facilitate just simple messaging with their organization and access to static resources like maps or schedules. I'm not sure that this interface is any better than a text message hotline and a website. As "bots" are becoming the rage for many simple tasks, maybe we'll see the lowly SMS make more of a resurgence.

My first crossword

Posted on Tue 24 January 2017 in crossword • Tagged with crossword

I made my first crossword! This is something that I've been working on very intermittently since 2011. Very intermittently, in the sense that I would make progress only when I was flying across the country on college breaks and, with no internet connection, sometimes didn't have anything better to do.

(Skip to download the puzzle below.)

Actually, this is my second puzzle, but my first one is pretty sloppy and I'd rather show you this one. Maybe that one will see the light one day, maybe not.

This puzzle even has a theme of sorts. See if you can guess which are the theme clues. (I'm sure you can.) I imagined this printing in the New York Times on Thanksgiving of some year as I triumphantly waved drumsticks in the air and forced loved ones to solve it.

Making the puzzle

It was surprisingly difficult to create the puzzle. I still haven't looked into the process and resources that experienced creators use, though I'm sure they would ease things substantially. In my attempt, I created a 15 x 15 table in iWork Pages/MS Word and shaded cells to achieve black squares. I would place my theme answers, then try to fill in corners one at a time. Eventually, I would come to an impasse and have to backtrack, deleting answers until a fill was possible again.

At first, all answers came from my head. But if I can't solve every crossword, the set of answers I could think of was surely much smaller than what's needed to ensure a diverse fill. I next resorted to using regex with /usr/share/dict/words (or larger variants). This was an imperfect solution, as dictionaries don't have nearly the phrases, abbreviations, acronyms, and proper nouns that make for an interesting puzzle. I finally used XWord Info's excellent clue finder and, in my most active phase, even shelled out a small donation to get more access. (You may note that by this time I was not necessarily flying on an airplane.)
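For the curious, the regex step amounts to filtering the word list against a slot pattern. A sketch, with a tiny inline list standing in for /usr/share/dict/words:

```python
import re

def fill_candidates(pattern, words):
    """Return words that fully match the crossword slot pattern."""
    rx = re.compile(pattern)
    return [w for w in words if rx.fullmatch(w)]

# In practice the word list comes from the system dictionary:
#   with open("/usr/share/dict/words") as f:
#       words = [w.strip().lower() for w in f]
words = ["text", "next", "exact", "sixty"]

# Every four-letter word ending in "xt":
print(fill_candidates(r"[a-z]{2}xt", words))  # → ['text', 'next']
```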

This puzzle has seen many iterations. For each, I was my own worst critic, circling my "groaners" even as friends who play-tested it didn't seem to mind. I minimized these in the final version as best I could, though I'm sure Rex Parker would show no mercy on several of my answers.

Get the puzzle

Are you hungry for a crossword by now? Well, get the puzzle!

You can download Across Lite for free here.

Hope you enjoy, and please leave any comments below!

I ran a race and made some graphs

Posted on Thu 17 November 2016 in dataviz • Tagged with dataviz, running, julia

The Cambridge Half Marathon was this past Sunday and I was able to race with a couple of friends. It was a gorgeous day and a very nice course. As always, I was inspired by the diversity of runners.

After finishing, I was happy to see that the full results were dumped in one not-too-poorly formatted text file. I whipped it into a Julia DataFrame to do a bit of analysis.

How fast are old people?

It's a thrill to get passed by older women and men who are pacing along by themselves or running with their daughters or sons. How fast were the older runners in this race, especially compared to runners in their "primes"? To answer this, I plot the estimated distribution of pace (minutes per mile) across age groups.
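(The original analysis was done in Julia; below is a rough pandas/scipy analogue of the density estimate, using made-up data and hypothetical column names, just to show the shape of the computation.)

```python
# Sketch of per-age-group pace density estimation; data and column
# names ("age_group", "pace") are illustrative, not the race results.
import numpy as np
import pandas as pd
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age_group": ["21-30"] * 200 + ["51-65"] * 200,
    "pace": np.concatenate([rng.normal(9.0, 1.2, 200),   # younger group
                            rng.normal(9.8, 1.4, 200)]), # older group
})

grid = np.linspace(5, 15, 200)  # pace grid, minutes per mile
modes = {}
for group, sub in df.groupby("age_group"):
    density = gaussian_kde(sub["pace"])(grid)
    modes[group] = grid[density.argmax()]  # pace where the density peaks
    print(f"{group}: modal pace around {modes[group]:.1f} min/mile")
```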

It's interesting to note that, besides the literal elderly, runners from very different age groups have similar distributions of paces. Sure, there's a little more mass on faster paces for the youngest runners (20 years old or younger) and a bit more mass on slower paces for the 51-65 year old runners. But the five age categories in the middle show quite similar distributions, and the modes are almost identical at around 9:00 minutes per mile.

I haven't looked at the science for how pace changes with increased age in general — though presumably speed converges to 0 when we die — nor looked at data from other races, so I'm interested to see if this phenomenon is seen more widely. I would expect that for the older age groups, there is a fair amount of self-selection in which more dedicated runners compete and the "recreational" runners stay home. This could balance out the natural decreases in pace from aging.

How fast are young people?

With those oldies speeding by, it's important not to lose track of my own peers. Below, I plot the estimated distribution of pace for the group of 23-year-old males and show the pace of several key individuals.

I turn out not to be the last of the pack for my group, for which I'm relieved. However, while I handily beat the oldest racer of the day — a 71-year-old man — I get crushed by the youngest racer of the day — a 9-year-old boy, who ran an amazing 7:14 pace.

It's inspiring to take part in a race with such a diverse group of racers in terms of age, experience, and ability. A special shout out as well to the eight visually impaired runners and two wheelchair racers who completed the course.

The data and code can be seen on my GitHub.

Why I'm bailing on Julia for machine learning

Posted on Fri 04 November 2016 in programming • Tagged with julia, python, ml

I'm bailing on Julia for machine learning — just for my one class, that is. Don't worry ~too much~!

I'm taking graduate machine learning (6.867) this semester at MIT. There are three homework assignments in the course that are structured as mini-projects, in which students implement canonical algorithms from scratch and then use them to analyze datasets or explore the effects of hyperparameters. "Official support" — in the sense of skeleton code, plotting routines, and TA assistance — is provided for MATLAB and Python only. Working in Julia (or another language) is allowed, but the going is solo.

After sticking with Julia for the first two assignments, I'm bailing for the rest of the semester. Although Julia is great-looking, fun to write, and as performant as ever, I ran into a lot of challenges using existing functionality within my assignments. Specifically, I found the stats/ML packages pale in comparison to sklearn in terms of functionality and ease of use. While it was great to use Julia to implement my own algorithms, it turned out to be a real hassle to tie in with existing functionality.

As one of several small issues, here's the trouble I went through just to use pre-existing functionality to fit a logistic regression model.

Logistic regression case study

What's the quickest way to fit a logistic regression model for classification? A quick search brings up the JuliaStats page (as my first result, at least) with a variety of packages listed. From the descriptions, it seems like our candidates for logistic regression solvers are GLM.jl and RegERMs.jl. The next search results are for Regression.jl and a couple DIY logistic regression examples.

Let's assess our options:

  • GLM.jl: Has most of the goods, but as we'll see, it didn't have all the features I needed for my assignment.
  • RegERMs.jl: Doesn't even load on Julia 0.5, last commit over a year ago.
julia> Pkg.add("RegERMs")
ERROR: unsatisfiable package requirements detected: no feasible version could be found
for package: Optim
  • Regression.jl: Doesn't even load on Julia 0.5, last commit over a year ago.
julia> using Regression
ERROR: LoadError: LoadError: LoadError: UndefVarError: FloatingPoint not defined

So GLM.jl is our only option. Even then, a new Julia user might be lucky to find it at all. It's not immediately clear from the JuliaStats blurb that GLM.jl can be used for logistic regression, and there's no mention of "logistic" either in the documentation or in the repo itself, besides a comment in a test case. (An astute user, of course, may note that LogitLink is relevant, and will likely be aware of the features of the popular R package.)

But it's not too bad to fit a logistic regression model using GLM.jl:

using GLM, DataFrames
df = DataFrame(x = rand(10,1), y = rand([0,1], 10))
model = fit(GeneralizedLinearModel, y ~ x, df, Binomial(), LogitLink())

Note as well that the DIY logistic regression attempts that rank highly in search results (like here and here) are not super helpful for the purposes of quickly fitting a model, but are typical of the content that comes up in results for Julia queries.

Adding L1/L2 regularization

In one of the 6.867 assignments, we are asked to apply logistic regression with L1/L2 regularization. GLM.jl doesn't provide this functionality, and the other seeming possibility, RegERMs.jl, was non-functional. I was out of luck, so I switched to sklearn for the rest of the problem.

import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.random.rand(10, 1)       # features
Y = np.random.rand(10).round()  # random 0/1 labels
model = LogisticRegression(penalty='l1')  # L1 regularization in one argument
model.fit(X, Y)

After later investigation, I did realize that GLMNet.jl, which wraps the glmnet Fortran library, would have done the job, with sufficient user effort.

The pieces are there, the whole is missing

We were able to fit our logistic regression classifier after a fair amount of digging. But this digging shouldn't be necessary. It should have been easier to find a working logistic regression from the JuliaStats landing page, given that this is a pretty standard learning algorithm. Inconsistencies within the JuliaStats organization seem to be a real problem, too. A user might start with GLM.jl, but adding regularization to the loss function requires switching to a different package with a slightly different API, so the old code can't be dropped in. The "interface" in StatsBase.jl isn't fully implemented in some of the more niche packages (GLMNet.jl, Lasso.jl), or isn't followed at all, especially when the package wraps an underlying library (LIBSVM.jl).

Then, we have the entirely different entity that is JuliaML. Here, we have a design and API that seem to be in direct competition with JuliaStats (StatsBase.jl/MLBase.jl vs LearnBase.jl, Distances.jl vs LossFunctions.jl, RegERMs.jl vs MLRisk.jl, etc.). I wasn't sure how to even start using it, let alone attempt to combine LossFunctions.jl, StochasticOptimization.jl, and MLMetrics.jl into something resembling an end-to-end model. I can't quite figure out what space JuliaML is trying to fill.

Overall, I think that JuliaStats is doing a very good job and is almost there. Packages like StatsBase.jl, DataFrames.jl, and Distributions.jl are really great to use. Certainly, the obvious response to the difficulties I had above is that more community support is needed. Can't argue with that.


Julia has been perfectly suited for quickly coding up ML algorithms from scratch and really getting my hands dirty. But when I wanted to quickly and easily drop in robust community packages, I found that the functionality wasn't there.

I'll be using the Python ecosystem for the next assignment, in which we implement neural nets/backprop. If I find myself having to bust out sklearn more regularly, I better figure out how to use it fluidly.

Eight days a week, revisited

Posted on Thu 02 June 2016 in economics • Tagged with life, economics, corporate drones, nundinum

If, as I investigated in a previous post, you were down with an eight-day week, then you were down with an idea that is possibly welfare improving. Give yourself a pat on the back.

That's according to Maya Eden in a paper elegantly titled The Week. Dr. Eden presented her work at a New York Fed seminar yesterday on this issue near and dear to my heart.

Eden's presentation of some of the background of the modern week included some joyfully quirky tidbits:

  • Brunei today uses a 4-1-1-1 week (Friday and Sunday off) to accommodate the religious Muslim (day of rest on Friday) and Christian (Sunday) populations.
  • Beginning in 1929, the Soviets experimented with alternative weeks, including a rotating 4-1 (a different fifth of the population was off on each of five days) and a coordinated 5-1. A large part of this change was to diminish the influence of religion, from which the structure of the week originates. Eventually, they returned to a normal seven-day week with Sundays off. Historians use this latter fact to suggest that insubordination or noncompliance by religious citizens was not an insignificant problem.

The model

In the model, labor productivity is a function of "fatigue" and "memory". Fatigue is straightforward: people get tired. And when they are tired, they are less productive and make more mistakes. This is the same issue that causes The Atlantic or The Economist to publish thinkpieces on or around Labor Day about reducing the work week.

While fatigue affects labor productivity in the short run, "memory" affects productivity in the longer run. Memory is interpreted as "learning by doing" capital, in which skills sharpen through practice but depreciate through disuse.

  • Vacation is investment in rest
  • Work is investment in memory


After including a utility framework and calibrating the model, Eden makes some arguments about productivity and social welfare. Right away, the modern week is swept into the gutter -- for no admissible vector of model parameters is the modern 5-2 week socially optimal.

Rather, as she told us,

There is never a cycle with more than one vacation day that is optimal.

The intuition here is that pairs (or more) of non-contiguous vacation days per (seven-day) week do just as much to restore "rest" as two contiguous vacation days but reduce the depreciation in memory:

There are productivity gains for having more Mondays per week.

Given Eden's calibration, the optimal i-j weeks (cycles of i workdays followed by j vacation days) are either 2-1 or 3-1. Here, it is interesting to note that both decreasing and increasing the share of vacation days relative to work days, as compared to the current 5-2 week, can lead to welfare gains.

welfare table

My beloved proposition for a 5-3 week did not meet with as much support as I had hoped. In the table above, you can see that the 5-3 cell has some mass on welfare increasing but is welfare reducing on average. The parameter calibration is an important factor here. (Though hark, let welfare arguments not be the end of us.)

Field research

I asked a couple of friends what they thought of the proposal for a 2-1 week. Universally (N=3), they were uninterested. Concerns mounted around the shortened weekend, which would increase the relative effect of harumph hangovers. With one-day weekends, one additional vacation day would be needed to increase the weekend to consecutive days off; and while two additional vacation days would increase the weekend to four consecutive days off, this wouldn't be any different from the current model. The fact that 2-1 weeks would lead to around 17 extra days off per year was lost in the analysis.

"It's like every weekend is a Sunday only, and Sundays are the worst weekend days!"

"Right, but the first day of every week is like a Thursday."

The concerns about contiguous vacation time highlighted a missing piece of Eden's model -- no complementarity of contiguous vacation time. Even if these contiguous days are not used for traveling, there seems to be something about being able to recover from a hangover and not despair the coming of another week as a corporate drone.


The optimal arrangement of the week remains an open question. Eden's model could use some additional features and improved calibration (with less of the "magic macro wand"). I'm also interested in seeing some (not-really) serious proposals for how changes in the week could be implemented and the costs thereof. Until then, let's get some ramen on Micahday and catch up.

Using vimdiff with dumb paths

Posted on Tue 02 February 2016 in coding • Tagged with vim, git, matlab

I've been loving vimdiff as my git difftool for a while.

Vimdiff and Matlab packages

Vimdiff runs into some problems when working with Matlab code. Matlab considers directories with names beginning with + as "packages", a natural way to organize projects. However, this means that many paths relative to the git top-level directory begin with +. Of course, this applies to any path beginning with a +, though this is otherwise an uncommon naming convention.

Suppose that your project looks like this:

+code/
├── foo.m
└── bar.m
You might want to use vim to diff your two files:

vimdiff +code/foo.m +code/bar.m

But vim will tell you:

Error detected while processing command line:
E492: Not an editor command: code/foo.m
E492: Not an editor command: code/bar.m

What's going on? Vim interprets + as the start of a command-line argument (beyond the typical -) and fails to find options/commands called code/foo.m and code/bar.m.

The vimdiff solution

Simply tell vimdiff somehow that the paths you are specifying are not options. (By the way, note that vimdiff is pretty much just an alias for vim -d.)

We could prefix them with the current directory:

vimdiff ./+code/foo.m ./+code/bar.m

Better yet, note that the -- option

Denotes the end of the options. Arguments after this will be handled as a file name. This can be used to edit a filename that starts with a '-'.

Or, in our case, to edit a filename that starts with a +:

vimdiff -- +code/foo.m +code/bar.m

Bringing git into the picture

We now update +code/foo.m and want to compare to the previous commit. Using git's difftool command, we diff all modified git-tracked files using a tool of our choice:

git difftool --tool=vimdiff

Unfortunately, we run into the same problem as before.

"/private/var/folders/7v/cjvpqzt57c5fsb7qvsjmm5sw0000gn/T/7i6IKG_foo.m" [readonly] 1L, 6C
Error detected while processing command line:
E492: Not an editor command: code/foo.m

Since you haven't specified the file names yourself, you can't use either of the solutions from the previous section!

The git solution

Git is basically writing the "before" state of the file to a temporary location and invoking a typical vimdiff command. If we can modify this command, we can use one of our solutions above.

One possibility is to set git's difftool.<tool>.cmd config option:

Specify the command to invoke the specified diff tool. The specified command is evaluated in shell with the following variables available: LOCAL is set to the name of the temporary file containing the contents of the diff pre-image and REMOTE is set to the name of the temporary file containing the contents of the diff post-image.

We can do

git config --global difftool.vimdiff.cmd 'vimdiff -- $LOCAL $REMOTE'

That'll do it. While you're at it, why not update your ~/.gitconfig with the following handy diff-related settings:

[alias]
    d = difftool
[diff]
    tool = vimdiff
[difftool]
    prompt = false
[difftool "vimdiff"]
    cmd = vimdiff -- $LOCAL $REMOTE

Most people will likely never come across this problem: none of their paths are this crazy, they use a different (read: graphical) difftool, or they use an external diff wrapper. But this simple solution is all I need.

How Goldman stays cool

Posted on Mon 11 January 2016 in buildings • Tagged with buildings, ice, cubes, ice cubes

Goldman literally has a vault full of ice cubes.

I wasn't sure if I had imagined this fun fact or not, but sure enough, Goldman's FiDi headquarters is literally air conditioned in hot weather by the melting of massive ice blocks in their basement.

According to the WSJ, which covers such fun topics,

The basement houses 92 storage tanks that hold 1.7 million pounds of ice made each night when electricity rates are lower than during daytime hours. Air cooled by the melting ice circulates throughout the building.

That would only fill an Olympic-sized swimming pool one third of the way.
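A quick back-of-envelope check of that comparison (pool dimensions and ice density are standard reference values, not from the article):

```python
# Sanity-check the swimming-pool comparison for 1.7 million pounds of ice.
ICE_LB = 1.7e6          # pounds of ice, per the WSJ figure
KG_PER_LB = 0.4536
ICE_DENSITY = 917.0     # kg per cubic meter
POOL_M3 = 2500.0        # Olympic pool: 50 m x 25 m x 2 m

volume_m3 = ICE_LB * KG_PER_LB / ICE_DENSITY
fraction = volume_m3 / POOL_M3
print(f"{volume_m3:.0f} m^3 of ice fills {fraction:.0%} of the pool")
# → 841 m^3 of ice fills 34% of the pool
```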

Without knowing anything about the economics of air conditioning, this sounds at first like nothing more than a gimmick. But apparently, the system can be as much as 40 percent cheaper compared to traditional methods. The system, called thermal storage, freezes the water into ice at night to take advantage of lower electricity costs.

"It requires about 150 to 400 pounds of ice to cool down a person in an office building every day," said Mark MacCracken, chief executive of Calmac.


Eight days a week

Posted on Sat 09 January 2016 in economics • Tagged with life, economics, corporate drones, nundinum

What would you give up for a three-day weekend?

The problem

How great are three-day weekends? For me, at least, they recall the days in undergrad in which Thursday at 2pm was the end of the work week. The rejuvenation of those three-day weekends is certainly a step above the normal weekend of the 21st century corporate drone.

If only we could do something about our real-life weekends...

The proposal

With that in mind, I propose a three-day weekend for all workers. Now hang with me here.

No employer would be happy to just give all of their employees Friday off each week - an extra 52 vacation days! They might be open to weekly "flex days", with working hours Monday-Thursday extended from 7 to 8.75 hours per day. This arrangement might be too much for some workers but is not altogether unreasonable.

There are two balancing incentives:

  • employees' desire for longer contiguous time off
  • employers' desire for a minimum amount of hours worked by employees

(Though on the latter point, it is not clear that this is the best proxy for productive output.)


I propose an eighth day of the week, which we can tentatively name "Micahday." (Alternate names are welcome.) Micahday would be inserted between Saturday and Sunday. The key: \(\frac{4}{7}<\frac{5}{8}<\frac{5}{7}\).

We maintain our 5-day work week, and add a full, luxurious 3-day weekend. Fly, fool -- travel every weekend if you want to!

Hours worked

Assume a typical US employee works 35 hours/week for 50 weeks/year for a total of 1,750 hours. (This roughly corresponds to the actual average work week of 34.5 hours and actual average work year of 1,789 hours.) We want to make our proposal attractive to employers by not substantially changing the total number of hours worked. A calendar with eight-day weeks would have about 43.5 work weeks per year (excluding two weeks of vacation). So the average workday under this proposal would increase from 7 hours to a few minutes over 8 hours. More reasonable.

System      Hours/week   Weeks/year   Total hours
Normal      35           50           1,750
Proposal    40           43.5         1,750
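The arithmetic behind the table, as a quick check:

```python
# Hours arithmetic for the eight-day-week proposal.
DAYS_PER_YEAR = 365
total_hours = 35 * 50                       # current: 35 h/week, 50 weeks

weeks_8day = DAYS_PER_YEAR / 8 - 2          # ~43.6 work weeks after 2 weeks off
hours_per_week = total_hours / weeks_8day   # ~40 hours per 8-day week
hours_per_day = hours_per_week / 5          # 5 workdays per 8-day week

print(round(weeks_8day, 1), round(hours_per_day, 2))  # → 43.6 8.02
```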

Previous attempts

Apparently the Romans at one point used the Nundinal Cycle until the introduction of the Julian calendar. The nundinum ("market day", yes, please attempt to use that in context tomorrow) was

the day when city people would buy their eight days' worth of groceries.

Sounds like a typical TJ's run to me.

Some more or less crazy attempts have also been made in the modern era.

Problems, duh

Obviously it is far-fetched to believe that anyone would actually overhaul the calendar to accommodate eight-day weeks. In one sense, it wouldn't make that much of a difference -- January 9 would stay January 9, a year would have the same number of months, and a month would have the same number of days.

Sure, maybe under my proposal a November beginning on a Friday or Saturday wouldn't have a Thanksgiving. Sure, maybe every piece of software to ever use a date would have to be patched.

But think about those long weekends...

Try it out

There's nothing to stop a forward-thinking employer from simply scheduling their employees 5 days on and 3 days off under the existing calendar. A company with less need for constant interface with the outside world (in terms of banks, markets, clients), such as a software company doing initial product development, could give this a shot. Make sure to pay me the fat consulting fees if you do.


Posted on Fri 08 January 2016 in life • Tagged with life

Welcome to my site. I'll be writing about various topics, such as the world to which we are all party.