martinwguy: March 2018

Lots of blog posts are appearing entitled "How I failed my interview with Google" or something similar, in which the authors excitedly tell all about the questions they were asked in the interviews, exactly as Google had asked them not to do. Here, by contrast, is how Google failed its interview with me.

The first contact was a Google headhunter, a certain Ashley, pleasant and uninvasive, who invited me to do some telephone interviews. I did one with the suits with multiple-choice questions and two with programmers, simultaneously coding on a shared Google Doc. During the second programmer interview I even heard a gasp when I coded a couple of lines. Both gave a 10: "Go get him!"

All well and good.

Then they passed me over to Human Resources in Dublin to organize flights and accommodation for an in-person interview in Zurich. I told them that I was in the middle of a software product release and that I wouldn't have much time for chat. The first girlie was so offensively bubbly and stupid that I had to ask her to pass me on to someone a bit more... er... adult which, to her credit, she did. The second person, from a quick Google search, turned out to be her best friend on Facebook.

Some other agency, unrelated to Google, was charged with booking me two flights. They suggested two alternatives for each way, one in the morning and one in the afternoon, and asked me which were more convenient for me. I was on an afternoon-evening waking/working cycle so said the afternoon ones would be best. They booked me the morning flights, both of them.

The HR Dublin weeb then kept phoning me up every day, desperate to gabble at me to "prepare" me for the interview. I only got her off my back by telling her to stop phoning me and to write email if she really had to. She wrote me a load of stuff that it was entirely inappropriate for a company to say to an interviewee, including a list of eight 800-page books that I was supposed to read before coming to the interview the next day.

For my accommodation in Zurich I would have been happiest with a 13-euro BnB. Instead they booked me into a 250-euro-per-night 5-star hotel for two nights, then said "You'll have to pay for the hotel yourself, then send the originals of the receipts to an I.B.M. address in Poland (really! straight up!) and they will refund you." I didn't have 500 euro, thank heavens, and said so. "Oh! Oh! No, it's OK, we'll make an exception and we'll pay for it!". Damn right, you will.

Then they sent me a 1500 euro set of plane tickets for Zurich and the very morning of the flight, the same turd kept phoning me up from before 9am to wake me up and nag me. In the end, at 11:30, mentally destroyed by the continual rude awakenings, I just said "Fuck it" to myself and went back to sleep. They phoned me again. I said "Listen, I'm not coming, so this is now just an argument" and debatteried the phone.

Why would they do that to the highest-scoring and most expert programmer they'd seen for ages? Well, at the time, Google had issued a directive to hire 1000 new programmers, so HR droids on short-term contracts would probably think it in their best interests to make the hiring drive last as long as possible by dicking the best candidates about until they ran away screaming, but then again, the Chinese wisdom does say: "Never ascribe to malice, that which can be explained by incompetence".

I searched for the first girlie some months later. She'd quit Google and was now working in HR at Amazon. Poor Amazon!

I still have the printout of the 1500€ PDF air ticket for Zurich. I should get it framed.

Sorry about the formatting; it's pasted from a LibreOffice document.

Resynthesizing audio from spectrograms

Martin Guy <martinwguy@gmail.com>

Work: July-August 2016; Docs: February 2018.

ABSTRACT

It can occur that the only available source of a piece of music is a JPEG image of its spectrogram. An algorithm is presented to convert such a graphic back into a best-effort approximation of the audio from which it was created.

Here is an example of a source graphic from the case that provoked this work: spectrograms of unpublished samples of electronic music by pioneer Delia Derbyshire in James Percival's 2013 dissertation for his master's degree, Delia Derbyshire’s Creative Process:

Fig. II.4 from Delia Derbyshire’s Creative Process:
“Spectrographic analysis of CDD/1/7/37 (2’49”-3’00” visible)”

CDD/1/7/37 is Singing Waters: “It is raining women’s voices”, a musical arrangement of Apollinaire’s graphic poem Il Pleut.

This represents 11 seconds of sound in 618 pixel columns (about 56 columns per second), covering 5 Hz to 1062.8 Hz in 252 pixel rows (so with frequency bins spaced about 4.2 Hz apart).
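Those figures can be checked with a few lines of arithmetic. This is only a sketch in Python using the numbers quoted above. The variable names are mine, and it assumes that the bottom and top pixel rows sit exactly on the lowest and highest frequencies.

```python
# Geometry of the example spectrogram, using the numbers quoted in the text.
duration_s = 11.0             # seconds of sound shown
columns = 618                 # pixel columns
rows = 252                    # pixel rows
f_min, f_max = 5.0, 1062.8    # Hz, bottom and top rows

columns_per_second = columns / duration_s

# Assumption: the bottom and top rows sit exactly on f_min and f_max,
# so there are rows - 1 intervals between them.
bin_spacing_hz = (f_max - f_min) / (rows - 1)

def row_to_frequency(row_from_bottom):
    """Centre frequency in Hz of a pixel row, counting 0 from the bottom."""
    return f_min + row_from_bottom * bin_spacing_hz

print(round(columns_per_second, 1), round(bin_spacing_hz, 2))
```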

It has linear time and frequency axes and consists of a regular grid of coloured points, with frequency and colour scales on the left that show which frequency each row represents and which sound energy each colour represents.

Algorithm

In brief, we turn the colour values back into estimated amplitudes, then reverse FFT those to create an audio fragment from each pixel column. We then mix these to produce the audio output.
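To give a feel for the "audio fragment from each pixel column" step, here is a minimal sketch in Python. It is not the code from the repository: in place of a reverse FFT it sums one cosine per frequency bin directly, which is slower but needs no FFT library. The function name, the sample rate and the example values are all hypothetical.

```python
import math

def synthesize_fragment(freqs_hz, amplitudes, phases, n_samples, sample_rate):
    """Sum one cosine per frequency bin to make a fragment of audio.

    Stands in for the reverse FFT (an assumption made for illustration).
    freqs_hz, amplitudes and phases hold one entry per frequency bin.
    """
    fragment = []
    for n in range(n_samples):
        t = n / sample_rate   # time of this sample within the fragment
        fragment.append(sum(a * math.cos(2.0 * math.pi * f * t + p)
                            for f, a, p in zip(freqs_hz, amplitudes, phases)))
    return fragment

# One pixel column with two active bins (all values hypothetical).
frag = synthesize_fragment([220.0, 440.0], [0.5, 0.25], [0.0, 0.0],
                           n_samples=8, sample_rate=8000)
```

How the amplitudes and phases are obtained, and how the fragments are joined, is covered in the sections that follow.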

Colour-to-amplitude conversion

We make an associative array mapping the colour values present in the scale to their decibel equivalents by sampling a vertical strip of the colour scale, knowing on which pixel rows it starts and ends and by reading off the minimum and maximum decibel values on the scale. Using this, we map the colour values present in the spectrogram (or their “closest” equivalents on the scale) to create an array representing the energies at each frequency shown in the spectrogram, for each of the moments represented by its pixel columns.
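A minimal sketch of that lookup in Python follows. The function names are mine and it rests on three assumptions: that decibels vary linearly along the scale strip, that "closest" means the smallest squared distance in RGB space, and that decibels convert to amplitude as 10^(dB/20). It uses a list of pairs rather than an associative array so that the nearest-colour search is easy to follow.

```python
def build_scale(scale_colours, db_top, db_bottom):
    """Pair each colour sampled down the scale strip with a decibel value.

    scale_colours: list of (r, g, b) tuples, top of the strip first.
    db_top, db_bottom: decibel values read off the two ends of the scale.
    Assumption: decibels vary linearly along the strip.
    """
    n = len(scale_colours)
    step = (db_bottom - db_top) / (n - 1)
    return [(colour, db_top + i * step) for i, colour in enumerate(scale_colours)]

def colour_to_db(colour, scale):
    """Return the dB value of the scale colour closest to `colour`.

    Assumption: "closest" means smallest squared distance in RGB space.
    """
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(scale, key=lambda entry: dist2(entry[0], colour))[1]

def db_to_amplitude(db):
    """Convert decibels to a linear amplitude (assumes 0 dB maps to 1.0)."""
    return 10.0 ** (db / 20.0)

# Toy three-step scale: white = 0 dB, grey = -30 dB, black = -60 dB.
scale = build_scale([(255, 255, 255), (128, 128, 128), (0, 0, 0)], 0.0, -60.0)
column = [(250, 250, 250), (120, 130, 125), (3, 0, 2)]   # one pixel column
amplitudes = [db_to_amplitude(colour_to_db(c, scale)) for c in column]
```

The nearest-colour search matters because JPEG compression means a pixel's colour will often not appear exactly on the scale.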

Interpolation between frequency-domain frames

One can optionally reduce the choppiness of low frame-rate spectrograms by interpolating between FFT frames before doing the transform, thereby effectively increasing the frame rate.
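As an illustration, here is a sketch of the simplest way this could be done: linear interpolation of the amplitudes between each pair of successive frames. The text does not say which kind of interpolation is used, so the linear choice is an assumption, as are the function and parameter names.

```python
def interpolate_frames(frames, factor):
    """Linearly interpolate between successive amplitude frames.

    frames: list of frames, each a list of per-bin amplitudes.
    factor: number of output frames per input interval (1 = no change).
    Returns (len(frames) - 1) * factor + 1 frames.
    """
    out = []
    for a, b in zip(frames, frames[1:]):
        for k in range(factor):
            w = k / factor   # 0.0 at frame a, approaching 1.0 at frame b
            out.append([(1.0 - w) * x + w * y for x, y in zip(a, b)])
    out.append(list(frames[-1]))   # keep the final frame as it is
    return out

# Two frames of two bins each, with the frame rate quadrupled.
frames = [[0.0, 1.0], [1.0, 0.0]]
smoother = interpolate_frames(frames, 4)
```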

Phase

As well as an array of amplitudes, each reverse FFT also needs a phase component for each frequency bin, chosen so that the sine wave output due to each bin of one frame is in phase with the output from the same bin in all the other frames.

We do this by setting the phase for a bin centred at f Hz at time t seconds to

random_offset[f] + t × f × 2 pi radians

The constant random phase offset, different for each bin, avoids artifacts caused by many partials coinciding in phase periodically and producing harsh cos-like or sin-like peaks:

[Figure: ASCII sketches of the two peak shapes, a symmetric cos-like spike and a sawtooth-shaped sin-like spike]
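The phase rule can be sketched in a few lines of Python. The function names, the seed, the 440 Hz bin and the frame period of 1/56 s are mine, chosen only for illustration. The sketch takes t to be measured from the same reference point for every frame.

```python
import math
import random

def make_phase_offsets(n_bins, seed=0):
    """One constant random phase offset per frequency bin, from 0 to 2*pi."""
    rng = random.Random(seed)
    return [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_bins)]

def bin_phase(offset, f_hz, t_s):
    """Phase in radians for a bin centred at f_hz, for a frame at time t_s."""
    return (offset + t_s * f_hz * 2.0 * math.pi) % (2.0 * math.pi)

offsets = make_phase_offsets(252)

# Phases of a hypothetical bin centred at 440 Hz in two successive frames.
frame_period = 1.0 / 56.0
p0 = bin_phase(offsets[0], 440.0, 0.0)
p1 = bin_phase(offsets[0], 440.0, frame_period)
```

With these phases, a sine wave that starts the first frame at phase p0 reaches phase p1 exactly when the second frame starts, so the two frames' outputs for that bin line up.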

Mixing successive frames

To avoid discontinuities when the output audio changes from the results of one reverse FFT to those of the next, the size of the FFT is twice the number of samples represented by a pixel column, and we then overlap the putative audio output fragments by half a window and fade between them sample by sample to create the final audio data.

The fading function is a Hann window which, being cos squared, has the useful properties that it crosses 0.5 at 1/4 and 3/4 of its width, that each half has 180° rotational symmetry so that the sum of two adjacent windows’ contribution factors is always 1.0, and that its endpoints are both at 0. Its bell shape also means that the sound output for the middle half of each window depends mostly on the data from the corresponding pixel column.
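These properties are easy to check numerically. The sketch below writes the window as sin²(πx) for a position x running from 0 to 1 across its width, which is the same curve as a cos squared centred on x = 0.5. The names are mine.

```python
import math

def hann(x):
    """Hann window as a function of position x, from 0 to 1 across its width."""
    return math.sin(math.pi * x) ** 2

# Adjacent windows are offset by half a width, so at position x in the
# overlap one window contributes hann(x + 0.5) and the next hann(x).
positions = [i / 100.0 for i in range(51)]
sums = [hann(x) + hann(x + 0.5) for x in positions]
worst = max(abs(s - 1.0) for s in sums)   # largest deviation from 1.0
```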

In our implementation we centre each fragment of output audio on the time represented by the centre of its corresponding pixel column and mix using a double-width window, so a quarter of the first window extends before the piece’s stated start time and a quarter of the last window extends beyond its end, making the total length of our audio output the stated length plus the time for one pixel column.
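The mixing step can be sketched as an overlap-add loop. This is an illustration in Python rather than the code from the repository, and the names are mine: `hop` is the number of samples represented by one pixel column, and each fragment is assumed to be already synthesized and 2 × hop samples long.

```python
import math

def overlap_add(fragments, hop):
    """Cross-fade fragments of length 2*hop, each offset by hop samples.

    fragments: list of sample lists, one per pixel column, each 2*hop long.
    hop: number of samples represented by one pixel column.
    Returns (len(fragments) + 1) * hop samples, i.e. one column's worth
    more than the stated length of the piece.
    """
    size = 2 * hop
    window = [math.sin(math.pi * n / size) ** 2 for n in range(size)]  # Hann
    out = [0.0] * ((len(fragments) + 1) * hop)
    for i, frag in enumerate(fragments):
        start = i * hop   # each window starts one column after the previous
        for n in range(size):
            out[start + n] += window[n] * frag[n]
    return out

# Three constant fragments: wherever two windows overlap, the weights sum to 1.
hop = 4
mixed = overlap_add([[1.0] * (2 * hop)] * 3, hop)
```

In the toy example the output fades in over the first half-window, stays at 1.0 wherever two windows overlap, and fades out over the last half-window.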

Results

A program to perform this transformation, specialized for the example graphics, is available under http://github.com/martinwguy/delia-derbyshire in the “anal” folder, file “run.c” with a driver script “run.sh”. The sample input files can be extracted from the thesis, available under https://wikidelia.net/wiki/Delia_Derbyshire%27s_Creative_Process and the audio output from the example spectrogram cited in the text can be heard at https://wikidelia.net/wiki/Singing_Waters

The other spectrograms present in the thesis give similar results, but Singing Waters is the prettiest of them.

martinwguy

Sunday, 4 March 2018

How Google failed its interview with me
