<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[Phantom Dust]]></title><description><![CDATA[A collection of random stuff and thoughts]]></description><link>https://dust.teckyianlim.me/</link><image><url>https://dust.teckyianlim.me/favicon.png</url><title>Phantom Dust</title><link>https://dust.teckyianlim.me/</link></image><generator>Ghost 4.34</generator><lastBuildDate>Tue, 07 Apr 2026 17:28:18 GMT</lastBuildDate><atom:link href="https://dust.teckyianlim.me/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Fixing macOS's terminal Home, End, and Function keys]]></title><description><![CDATA[Fix the keymaps for macOS's terminal for your sanity.]]></description><link>https://dust.teckyianlim.me/fixing-macoss-teminal/</link><guid isPermaLink="false">62772f5cc9409d4f81b62dc4</guid><category><![CDATA[short read]]></category><category><![CDATA[macOS]]></category><category><![CDATA[config]]></category><dc:creator><![CDATA[TeckYian Lim]]></dc:creator><pubDate>Sun, 08 May 2022 03:36:46 GMT</pubDate><content:encoded><![CDATA[<p>macOS&apos;s defaults for the Home, End, and F1-F4 keys behave differently from what one would expect coming from a Linux terminal. Instead of moving to the start and end of a line, Home and End scroll the entire terminal window, and the F1-F4 function keys are mapped to other functions. Fortunately, the Terminal app comes with key mapping built in. </p><p>To change this, go to Terminal -&gt; Preferences and click on the Profiles tab. In this tab, on the profile that you wish to use, click on the &quot;Keyboard&quot; tab. Click on the &quot;+&quot; button below the list of keymaps to add mappings for Home and End with the following escape sequences. 
Note that in the key mapping editor, use the <code>Esc</code> key to enter the starting escape sequence <code>\033</code>. Typing <code>\</code> will result in an escaped backslash (i.e. <code>\\</code> ) instead.</p><ul><li>Home: <code>\033OH</code></li><li>End: <code>\033OF</code></li><li>F1-4: <code>\033[11~</code>, <code>\033[12~</code>, <code>\033[13~</code>, <code>\033[14~</code></li></ul><p>The function keys should already have keymaps; edit those so that they work as intended in terminal programs. Finally, it might also be useful to check the &quot;Use Option as Meta key&quot; option below the keymaps; I use it for several mappings in Vim ( <code>&lt;M-...</code> ). </p><p>Don&apos;t want to do it yourself? <a href="https://gist.github.com/moodoki/008c42b780d8ade0743f7ef511599a7b">Here&apos;s</a> my Terminal profile with the keymaps configured, together with my color scheme. </p>]]></content:encoded></item><item><title><![CDATA[Visualizing High Dimensional Data - PCA, t-SNE and UMAP]]></title><description><![CDATA[<p>Much of the data that we deal with live naturally in a high dimensional space. Being humans in a 3-dimensional world, we have difficulty visualizing such data. Effective visualization is often useful in helping us gain insights on the data that we are dealing with. In order to do so,</p>]]></description><link>https://dust.teckyianlim.me/visualizing-high-dimensional-data-pca-t-sne-and-umap/</link><guid isPermaLink="false">60c95f838362b13de78b7739</guid><category><![CDATA[short read]]></category><category><![CDATA[code-snippet]]></category><dc:creator><![CDATA[TeckYian Lim]]></dc:creator><pubDate>Wed, 30 Jun 2021 06:00:19 GMT</pubDate><content:encoded><![CDATA[<p>Much of the data that we deal with live naturally in a high dimensional space. Being humans in a 3-dimensional world, we have difficulty visualizing such data. Effective visualization is often useful in helping us gain insights on the data that we are dealing with. 
In order to do so, we require tools to reduce the number of dimensions to 1, 2 or 3. Fortunately, many such tools are already implemented in popular data science packages like <a href="https://scikit-learn.org/">scikit-learn</a>, and visualizing such data is often as easy as a call to <code>fit_transform(data)</code>.</p><pre><code class="language-python"># Packages we use for plotting

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd</code></pre><p>In subsequent code snippets, <code>data</code> is an array or a slice of a pandas DataFrame, <code>df[feature_cols]</code>, where each row is a data point and columns are feature dimensions. </p><p>A colab notebook for this post is available <a href="https://gist.github.com/moodoki/851004bd4ac24e49e833d74da2162d1e">here</a>.</p><h3 id="pca-principal-component-analysis">PCA: Principal Component Analysis</h3><p>PCA finds the direction along which the most variance is observed and sets it as the first component. It then finds the direction with the next largest variance after removing the first, sets it as the next component, and repeats this process until the desired number of components is obtained. We are often able to stop well below the original number of dimensions, while capturing the majority of the variance in the data. </p><p>As a visualization method, PCA is good when the data is already linearly separable. However, it might not be as useful if the data lies on a lower-dimensional manifold embedded in a high dimensional space. It is also relatively cheap to compute, thus making it a good first choice.</p><pre><code class="language-python">from sklearn.decomposition import PCA

pca = PCA(n_components=2)
pca_result = pca.fit_transform(data)
df[&apos;pca_0&apos;] = pca_result[:, 0]
df[&apos;pca_1&apos;] = pca_result[:, 1]
print(f&apos;Explained var: {pca.explained_variance_ratio_}&apos;)

plt.figure(figsize=(16,10))
sns.scatterplot(
    x=&apos;pca_0&apos;, y=&apos;pca_1&apos;,
    hue=&quot;y&quot;,
    palette=sns.color_palette(&quot;colorblind&quot;, 10),
    data=df,
    legend=&quot;full&quot;,
    alpha=0.3
)</code></pre><h2 id="t-sne-t-distributed-stochastic-network-embedding">t-SNE: t-distributed Stochastic Neighbor Embedding</h2><p>Suppose that our data is inherently low-dimensional but lives in a high dimensional space (a rolled-up 2D sheet, the &quot;swiss roll&quot;, or a tangled strand of string are common examples of such cases). Here, PCA and other linear methods would not give an effective visualization. </p><pre><code class="language-python">from sklearn.manifold import TSNE

tsne = TSNE(n_components=2, verbose=1, perplexity=50, n_iter=300)
tsne_result = tsne.fit_transform(data)
df[&apos;tsne_0&apos;] = tsne_result[:, 0]
df[&apos;tsne_1&apos;] = tsne_result[:, 1]

plt.figure(figsize=(16,10))
sns.scatterplot(
    x=&apos;tsne_0&apos;, y=&apos;tsne_1&apos;,
    hue=&quot;y&quot;,
    palette=sns.color_palette(&quot;colorblind&quot;, 10),
    data=df,
    legend=&quot;full&quot;,
    alpha=0.3
)</code></pre><p>t-SNE, however, comes with some hyperparameters, and not setting them correctly could lead to misreading the structure of the data. Here&apos;s a good interactive post showing how each of these parameters matters and how to avoid certain pitfalls when using t-SNE as a visualization technique: <a href="https://distill.pub/2016/misread-tsne/">How to Use t-SNE Effectively (distill.pub)</a></p><h2 id="umap-uniform-manifold-approximation-and-projection">UMAP: Uniform Manifold Approximation and Projection</h2><p>UMAP is a method that isn&apos;t included in scikit-learn; it lives in the separate <code>umap-learn</code> package. Using it is almost exactly the same as using scikit-learn methods.</p><pre><code class="language-python">import umap

umap_reducer = umap.UMAP()
umap_result = umap_reducer.fit_transform(data)

df[&apos;umap_0&apos;] = umap_result[:, 0]
df[&apos;umap_1&apos;] = umap_result[:, 1]

plt.figure(figsize=(16,10))
sns.scatterplot(
    x=&apos;umap_0&apos;, y=&apos;umap_1&apos;,
    hue=&quot;y&quot;,
    palette=sns.color_palette(&quot;colorblind&quot;, 10),
    data=df,
    legend=&quot;full&quot;,
    alpha=0.3
)</code></pre><h2 id="other-methods">Other Methods</h2><p>scikit-learn is an amazing package. It includes several other dimension reduction methods with a largely similar API. </p><hr><h3 id="changelog">Changelog</h3><ul><li>2021-06-29 Initial version</li><li>2021-07-01 Clarity on code, intro and some additional points on PCA</li></ul><hr><h3 id="todo">TODO?</h3><ul><li>Add additional reading as references</li><li>Add some useful insights and use cases</li></ul>]]></content:encoded></item><item><title><![CDATA[Embedding Youtube Videos]]></title><description><![CDATA[How to set sizes automatically with fixed aspect ratios in responsive webpages.]]></description><link>https://dust.teckyianlim.me/embedding-youtube-videos/</link><guid isPermaLink="false">602f54efd43fec087330278a</guid><category><![CDATA[web]]></category><category><![CDATA[code-snippet]]></category><dc:creator><![CDATA[TeckYian Lim]]></dc:creator><pubDate>Sun, 21 Feb 2021 09:50:00 GMT</pubDate><content:encoded><![CDATA[<p>When copying the embed code from YouTube, a fixed width and height is given for the <code>iframe</code>. This will probably look ugly on web pages with responsive designs. As <code>iframe</code>s are not images, there&apos;s no way for the browser to know what height to set the frame to. Fortunately, all YouTube embeds are of 16:9 ratio, and we can work around this with a little bit of CSS.</p><h2 id="html-snippet">HTML snippet</h2><pre><code class="language-html">&lt;div&gt;
    &lt;div style=&quot;position:relative;padding-top:56.25%&quot;&gt;
        &lt;iframe src=&quot;&lt;youtube-embed-url-here&gt;&quot; frameborder=&quot;0&quot; allow=&quot;accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture&quot; allowfullscreen style=&quot;position:absolute;top:0;left:0;width:100%;height:100%;&quot;&gt;&lt;/iframe&gt;
    &lt;/div&gt;
&lt;/div&gt;</code></pre><h2 id="how-this-works">How this works</h2><p>Notice that we are not using the <code>width</code> and <code>height</code> properties; instead we are using CSS styles. Two things make it work. First, <code>padding-top:56.25%</code> creates a 16:9 aspect ratio <code>div</code> box for the <code>iframe</code> to fill up. Next, we set the <code>iframe</code>&apos;s <code>style</code> to <code>position:absolute;top:0;left:0;width:100%;height:100%;</code>. This positions the <code>iframe</code> at the top-left corner of the <code>div</code> block and sets it to occupy the entire block, which is of the correct aspect ratio. Now we have an auto-resizing embedded block that&apos;s filled with the desired YouTube video!</p>]]></content:encoded></item><item><title><![CDATA[Goodbye Heroku, hello GCP]]></title><description><![CDATA[Having a VM is still way easier. ]]></description><link>https://dust.teckyianlim.me/goodbye-heroku-hello-gcp/</link><guid isPermaLink="false">5e45718174d8c87d768b9f61</guid><category><![CDATA[web]]></category><category><![CDATA[cloudflare]]></category><category><![CDATA[gcp]]></category><category><![CDATA[cloud]]></category><dc:creator><![CDATA[TeckYian Lim]]></dc:creator><pubDate>Sun, 05 Jul 2020 11:16:40 GMT</pubDate><content:encoded><![CDATA[<p>It&apos;s been a good run with Heroku, but it got a bit too hard to keep everything updated, and I get impatient waiting for free apps to spin up as well. Perhaps I&apos;ll find other uses for Heroku.</p><p>Google provides a permanently free VM (alongside several other services, see them <a href="https://cloud.google.com/free">here</a>). This is the smallest instance type (f1-micro) that&apos;s available on Google Cloud Platform. We get:</p><ul><li>614 MB of RAM</li><li>30 GB persistent storage</li><li>1 GB egress</li></ul><p>For a small personal website, this is more than sufficient. 
</p><h1 id="create-vm-instance">Create VM Instance</h1><p>This should be pretty straightforward. From the GCP console, navigate to Compute Engine and click on &quot;Create an instance&quot;. Make sure to select <code>f1-micro</code> and use a disk size of 30GB for things to be free. You should see that this costs approximately $5 a month, but also an additional note saying that the first 744 hours of this instance are free. Follow the on-screen instructions and after a few minutes, your instance should be ready!</p><figure class="kg-card kg-image-card"><img src="https://dust.teckyianlim.me/content/images/2020/07/instance_creation.png" class="kg-image" alt loading="lazy" width="851" height="781" srcset="https://dust.teckyianlim.me/content/images/size/w600/2020/07/instance_creation.png 600w, https://dust.teckyianlim.me/content/images/2020/07/instance_creation.png 851w" sizes="(min-width: 720px) 720px"></figure><p>A static public IP should be assigned as well, and you should set up your DNS settings accordingly with your registrar. </p><h1 id="installing-ghost">Installing Ghost</h1><p>I chose Ubuntu as my starting image and configuration for most things is a breeze. Detailed instructions on installing Ghost can be found on <a href="https://ghost.org/docs/install/ubuntu/">Ghost&apos;s Documentation page</a>, but as always, there&apos;s a paste-able block to save time.</p><figure class="kg-card kg-code-card"><pre><code class="language-bash"># log into your instance with gcloud compute ssh &lt;name&gt;

#Add a user (Ghost reserves the special username ghost, so create another to avoid a conflict)
sudo adduser ghost-user
sudo usermod -aG sudo ghost-user

#Install dependencies
sudo apt-get update
sudo apt-get upgrade
# Nginx
sudo apt-get install -y nginx
sudo ufw allow &quot;Nginx Full&quot;
# MySQL
sudo apt-get install -y mysql-server

# Node
# Add the NodeSource APT repository for Node 12
curl -sL https://deb.nodesource.com/setup_12.x | sudo -E bash

# Install Node.js
sudo apt-get install -y nodejs

# Ghost-CLI
sudo npm install ghost-cli@latest -g</code></pre><figcaption>Ghost system preparation</figcaption></figure><p>With all the software components in place, it&apos;s now time to actually install Ghost. MySQL, in some cases, might require you to set a password; this can be done with:</p><figure class="kg-card kg-code-card"><pre><code class="language-bash"># To set a password, run
sudo mysql

# Now update your user with this password
# Replace &apos;password&apos; with your password, but keep the quote marks!
ALTER USER &apos;root&apos;@&apos;localhost&apos; IDENTIFIED WITH mysql_native_password BY &apos;password&apos;;

# Then exit MySQL
quit</code></pre><figcaption>Prepare MySQL</figcaption></figure><p>Finally, we will install ghost in <code>/var/www</code> as that&apos;s the convention.</p><figure class="kg-card kg-code-card"><pre><code class="language-bash">sudo mkdir -p /var/www/ghost
sudo chown ghost-user:ghost-user /var/www/ghost
sudo chmod 775 /var/www/ghost
cd /var/www/ghost

sudo -u ghost-user -i
ghost install</code></pre><figcaption>ghost install</figcaption></figure><p>The install script will ask you several questions. It&apos;s pretty straightforward; let the script set up Nginx, the MySQL user and database, systemd for automatic startup, and so on. The site name and other details aren&apos;t important if you are importing from another site.</p><h2 id="updating-ghost">Updating Ghost</h2><p>New version released? Updating is way easier now, as we don&apos;t need to dance around Heroku&apos;s peculiarities. </p><figure class="kg-card kg-code-card"><pre><code>cd /var/www/ghost
sudo -u ghost-user -i
ghost check-update
ghost upgrade
exit</code></pre><figcaption>ghost update</figcaption></figure><h1 id="migrating-content">Migrating Content</h1><p>Fortunately, migrating content with Ghost is incredibly painless. Every single configuration and post can be exported as a giant, glorious json file. Just head over to the admin page <code>/ghost/</code> and click on &quot;Labs&quot;. There you should see an option to export your content.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://dust.teckyianlim.me/content/images/2020/07/export_content.png" class="kg-image" alt loading="lazy"><figcaption>Export content json</figcaption></figure><p>This <code>json</code> file can then be easily imported. Just head to the same &quot;Labs&quot; page on the newly running Ghost website hosted on the free-tier VM. </p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://dust.teckyianlim.me/content/images/2020/07/import_content.png" class="kg-image" alt loading="lazy"><figcaption>Import content json</figcaption></figure><h1 id="performance-of-the-vm">Performance of the VM</h1><p>I&apos;m serving quite a few things on the VM, mostly random stuff that I&apos;m experimenting with. For the most part, things run reasonably well. However, occasionally the SQL server seems to quit, as memory is very limited. I found it helpful to enable swap. This can be done easily with <code>dphys-swapfile</code>:</p><figure class="kg-card kg-code-card"><pre><code>sudo apt-get install -y dphys-swapfile

#Configure swap size
sudo vim /etc/dphys-swapfile</code></pre><figcaption>Enable swap with dphys-swapfile</figcaption></figure><p>Finally, Cloudflare&apos;s <a href="https://www.cloudflare.com/plans/">free website plan</a> could also be used to improve overall site performance and save some bandwidth. Simply create a free account, add a free website, and everything else should be pretty straightforward. </p>]]></content:encoded></item><item><title><![CDATA[Linux, Ultrabooks, CUDA and eGPUs]]></title><description><![CDATA[<!--kg-card-begin: markdown--><p>I work mostly on my Dell XPS 9365 these days. Since I&apos;m working with deep learning, it&apos;s often helpful to have a GPU locally for experimenting. Since I&apos;ve been able to get my hands on a Titan RTX, I&apos;ve decided to go</p>]]></description><link>https://dust.teckyianlim.me/cuda-on-an-ultrabook-with-egpu/</link><guid isPermaLink="false">5e41bf6321d5e8676ca52f1b</guid><category><![CDATA[hardware]]></category><category><![CDATA[xorg]]></category><category><![CDATA[egpu]]></category><category><![CDATA[linux]]></category><dc:creator><![CDATA[TeckYian Lim]]></dc:creator><pubDate>Tue, 19 Nov 2019 18:16:54 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p>I work mostly on my Dell XPS 9365 these days. Since I&apos;m working with deep learning, it&apos;s often helpful to have a GPU locally for experimenting. Since I&apos;ve been able to get my hands on a Titan RTX, I&apos;ve decided to go ahead and give my main workhorse a boost when I&apos;m at my desk.</p>
<p>Getting an external GPU working is no longer as difficult as it was several years ago, when USB-C and Thunderbolt were initially introduced.</p>
<p>It&apos;s almost plug and play: simply plug everything in and install the CUDA drivers as per the <a href="https://developer.nvidia.com/cuda-downloads">instructions</a> from Nvidia.</p>
<p>If you reboot now, you will find that the graphical login manager fails to start. Studying the logs reveals that X isn&apos;t able to find a usable display. This is due to X not allowing external GPUs by default. If you have an internal GPU, you might not face this problem. The external GPU is now available for CUDA, but not for running X.</p>
<p>To get X working, we need to add <code>Option &quot;AllowExternalGpus&quot; &quot;True&quot;</code> to the X configuration template <code>/usr/share/X11/xorg.conf.d/10-nvidia.conf</code>.</p>
<p>This is how the file should look after the edit:</p>
<pre><code>Section &quot;OutputClass&quot;
    Identifier &quot;nvidia&quot;
    MatchDriver &quot;nvidia-drm&quot;
    Driver &quot;nvidia&quot;
    Option &quot;AllowEmptyInitialConfiguration&quot;
    Option &quot;AllowExternalGpus&quot; &quot;True&quot;
    ModulePath &quot;/usr/lib/x86_64-linux-gnu/xorg&quot;
EndSection
</code></pre>
<h1 id="importantnoteabouthotplugging">Important note about hot-plugging</h1>
<p>I&apos;ve not tried hot-plugging and have no idea what will happen if I do, but I don&apos;t need this functionality for now. It should work in theory, and much more information can be found on the incredibly useful <a href="https://egpu.io">eGPU.io</a>, where I did my research before purchasing my eGPU enclosure.</p>
<hr>
<h3 id="mysetup">My setup</h3>
<ul>
<li>Dell XPS 9365</li>
<li><a href="https://amzn.to/2rZviih">Razer Core X Chroma</a></li>
<li><a href="https://amzn.to/2XqhuZQ">NVidia Titan RTX</a></li>
<li>Ubuntu 18.04.3 LTS</li>
</ul>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Updating Ghost on Heroku]]></title><description><![CDATA[oh no... security warnings!]]></description><link>https://dust.teckyianlim.me/updating-heroku-ghost/</link><guid isPermaLink="false">5e41bf6321d5e8676ca52f1a</guid><category><![CDATA[web]]></category><category><![CDATA[heroku]]></category><dc:creator><![CDATA[TeckYian Lim]]></dc:creator><pubDate>Tue, 19 Nov 2019 17:49:57 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p>So Github now has security warnings; this means updates are due. The update instructions on Ghost.org don&apos;t really help with a Heroku-hosted setup. It took me a while to figure out how to update everything.</p>
<h2 id="downloadandextractthenewversionovertheoldones">Download and extract the new version over the old ones</h2>
<pre><code class="language-bash">#Download the newer version (1.26.0 as of writing)
wget https://github.com/TryGhost/Ghost/releases/download/1.26.0/Ghost-1.26.0.zip
cd $APP_DIR

#Overwrite all the old files (-o avoids prompting). Additional node modules installed previously need to be re-added later
unzip -o ../Ghost-1.26.0.zip

#Reinstall the storage adapter
yarn add ghost-github
#Also fix the submodule with the configs
git submodule foreach git pull origin master

</code></pre>
<p>At this point, there will be a bunch of updated files in node_modules. Make sure to ignore this in the git commit.</p>
<h2 id="smalleditsrequiredforfreedatabases">Small edits required for free databases</h2>
<p>The free JawsDB database that we configured previously has a limit of 10 connections. The defaults used by the database migrator seems to exceed this limit. This limit can be set by setting a config variable:</p>
<pre><code class="language-bash">heroku config:set database__pool__max=2
</code></pre>
<p>However, it seems like this value will get interpreted as a string. To fix this, edit <code>core/server/config/index.js</code> (around line 30):</p>
<pre><code class="language-js">nconf.env({
    separator: &apos;__&apos;,
    parseValues: true,
});
</code></pre>
<h2 id="nowweshouldbereadytocommitandpusheverything">Now we should be ready to commit and push everything</h2>
<pre><code class="language-bash">git add .
git commit -m &quot;Update to 1.26.0&quot;
git push heroku master

#Migrate database 
heroku run knex-migrator migrate db
</code></pre>
<p>Some security issues still show up on Github. EOL for Ghost 1.x is January 2020, so hopefully upgrading to 2.x or 3.x won&apos;t be too difficult. But that&apos;s for a later time.</p>
<hr>
<h1 id="githubstorageadapterfixes">Github Storage Adapter Fixes</h1>
<p>Updated repo can be found <a href="https://github.com/moodoki/ghost-github.git">here</a></p>
<p><code>index.js</code> was copied from <code>node_modules</code> after <code>yarn add ghost-github</code>. Processing of Heroku environment variables was added to the constructor.</p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Basics of FMCW Radar]]></title><description><![CDATA[A not too technical overview of the basic operating principles of an FMCW radar.]]></description><link>https://dust.teckyianlim.me/basics-of-fmcw-radar/</link><guid isPermaLink="false">5e41bf6321d5e8676ca52f14</guid><category><![CDATA[signal processing]]></category><category><![CDATA[radar]]></category><category><![CDATA[dsp]]></category><category><![CDATA[fmcw]]></category><dc:creator><![CDATA[TeckYian Lim]]></dc:creator><pubDate>Thu, 01 Aug 2019 04:08:28 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p>Radar, short for &quot;RAdio Detection And Ranging&quot;, was initially a top-secret military technology for detecting invading aircraft long before they were visible; it is now making its way into our daily lives. Many modern vehicles are equipped with short-range radars as a safety feature, in adaptive cruise control and collision avoidance systems. Google&apos;s <a href="https://atap.google.com/soli/">Project Soli</a> takes this to the next level by using radar as a close-range sensor for mobile devices.</p>
<h1 id="basicprinciplesofradars">Basic Principles of Radars</h1>
<p>Radars work on a simple idea: send out a radio signal, wait for an echo. The time it takes for the echo to arrive is directly proportional to the distance of the reflecting object.</p>
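<p>As a quick numerical sketch (mine, not from the original post), this timing argument can be written directly in Python; the factor of two accounts for the out-and-back path of the echo:</p>

```python
# Speed of light in free space (m/s)
C0 = 3.0e8

def echo_delay_to_range(tau):
    """Convert a measured round-trip echo delay (seconds) to target range (meters).

    The echo travels out to the target and back, hence the division by two.
    """
    return C0 * tau / 2

# A 1 microsecond round-trip delay corresponds to a target roughly 150 m away.
range_m = echo_delay_to_range(1e-6)
```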
<p>A basic version of this idea would be a Pulse Radar<sup class="footnote-ref"><a href="#fn1" id="fnref1">[1]</a></sup>. The transmitter is on for an instant, followed by a period of waiting for echoes. Mathematically, the transmitted signal is:<br>
$$S_T = A(t)\sin( 2 \pi f_c t + \phi_0 )$$</p>
<p>Where $A(t)$ is a constant transmit amplitude when the radar is transmitting and zero otherwise, $f_c$ is the transmission frequency, and $\phi_0$ is the starting phase. Without loss of generality, we can assume that the starting phase is $0$ and drop the term for clarity of notation, reintroducing it only where the difference is significant.</p>
<p>In addition to estimating range from the time delay, non-zero relative velocity results in frequency shifts in a phenomenon known as the Doppler effect<sup class="footnote-ref"><a href="#fn2" id="fnref2">[2]</a></sup>. As the transmitted signal is a single frequency, we can estimate the relative velocity of the reflecting object by measuring the Doppler effect that causes a change in frequency of the reflected pulse.</p>
<p>Although simple in terms of operating principles, pulse radars are, due to the speed of light, blind at very short ranges (below 1 km). While not an issue for long-range applications (e.g. aircraft, ships), this makes them of limited use where the range is small.</p>
<h1 id="fmcwradars">FMCW Radars</h1>
<p>In contrast with traditional pulse radars, an FMCW (Frequency Modulated Continuous Wave) radar transmits a signal whose frequency changes with time, often referred to as a chirp:</p>
<p>$$S_{T}(t) = A_{T} \cos\left(2 \pi (f_c + f_\tau(t) ) t \right)$$</p>
<p>Where $f_c$ is the starting frequency and $f_\tau(t)$ is a function describing how the frequency changes over time. One possible waveform is a sawtooth (in frequency-time) signal, i.e., for a single chirp:</p>
<p>$$S_{T}(t) = A_{T} \cos\left(2 \pi (f_c + B t ) t \right)$$<br>
Where $B$ is the slope, i.e. the rate of change of frequency. For the rest of the discussion, we assume that we are working with a sawtooth wave.</p>
<p>Similar to the classical radar, we expect to receive a time-delayed and Doppler-shifted version of the transmitted signal. In contrast with the classical radar, both the transmitter and receiver are on simultaneously. Thus, there are no problems with very short ranges.</p>
<h2 id="estimatingrange">Estimating Range</h2>
<p>The reflected waveform is a delayed version of the transmitted wave. Again, by measuring this delay, we can compute how far the object is from the radar. At the receiver, a mixer (multiplier) mixes the reflected signal with the transmitted signal. Next, this signal passes through a low-pass filter and is sampled by an ADC. At any instant, we can describe the signal as:</p>
<p>$$S_{rx} = A \cos(\alpha)\cos(\beta)$$</p>
<p>Where $\alpha$ is the instantaneous phase of the signal being transmitted and $\beta$ is that of the signal that had been reflected. Using the product-to-sum identity, we can see that:</p>
<p>$$S_{rx} = (A/2) \left( \cos(\alpha-\beta) + \cos(\alpha + \beta) \right)$$</p>
<p>In this form, we see that there are two frequency components in the received signal &#x2014; one of much lower frequency than the transmitted waveform and one of very high frequency. After a low-pass filter, this leaves us with a signal that does not have very stringent ADC requirements, as compared to the original GHz-band signal.</p>
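<p>A small numeric check of the product-to-sum step (toy frequencies, chosen by me so that both mixing products land on exact FFT bins): the spectrum of the mixed signal contains exactly one peak at the difference frequency and one at the sum, and the low-pass filter keeps only the former.</p>

```python
import numpy as np

fs = 1_024_000          # sample rate (Hz)
n = 1024                # FFT length, giving a bin width of exactly 1 kHz
t = np.arange(n) / fs

f_tx, f_rx = 101_000.0, 100_000.0   # toy "transmitted" and "reflected" tones
mixed = np.cos(2 * np.pi * f_tx * t) * np.cos(2 * np.pi * f_rx * t)

spec = np.abs(np.fft.rfft(mixed))
# Energy sits at |f_tx - f_rx| = 1 kHz (bin 1) and at f_tx + f_rx = 201 kHz (bin 201).
```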
<p>Since the slope is known, we can determine the time delay (and hence the distance) easily as follows:<br>
$$d = \frac{ f }{2B} \cdot c_0$$<br>
Where $f$ is the measured beat frequency and $c_0$ is the speed of light in free space.</p>
<p>Since the mixed signal gives us a frequency difference, all we have to do is to perform an FFT over the entire chirp, and the (frequency) location of the (amplitude) peaks is directly proportional to the range of the target. In FMCW radar literature, this is often referred to as the &quot;intermediate frequency&quot;, &quot;beat frequency&quot; or the IF signal.</p>
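<p>Putting the pieces together, here is a minimal single-target simulation (the slope, sample rate, and chirp length are illustrative values I picked, not parameters from the post): synthesize the IF tone, take the FFT over the chirp, and map the peak bin back to range using $d = f c_0 / (2B)$.</p>

```python
import numpy as np

C0 = 3.0e8           # speed of light (m/s)
B = 50e6 / 1e-6      # chirp slope: 50 MHz per microsecond, expressed in Hz/s
FS = 10e6            # ADC sample rate (Hz)
N = 1024             # samples per chirp

def if_signal(d):
    """Simulated IF (beat) tone for a single point target at range d meters."""
    tau = 2 * d / C0             # round-trip delay
    f_beat = B * tau             # beat frequency = slope * delay
    t = np.arange(N) / FS
    return np.cos(2 * np.pi * f_beat * t)

def estimate_range(x):
    """Range FFT: locate the peak of the IF spectrum and map it back to meters."""
    spec = np.abs(np.fft.rfft(x))
    f_peak = np.argmax(spec) * FS / N   # peak bin -> Hz
    return f_peak * C0 / (2 * B)        # d = f * c0 / (2B)
```

<p>With these numbers, a point target placed at 7.5 m is recovered at 7.5 m, quantized to the FFT bin spacing.</p>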
<h3 id="ondopplereffect">On Doppler Effect</h3>
<p>With a sawtooth wave, there is no way to disentangle frequency shifts due to a non-zero relative velocity from those due to range; the Doppler shift is treated as measurement noise for low-velocity targets. If this is not the case, a different waveform might be a more suitable choice.</p>
<h2 id="estimatingrelativevelocity">Estimating Relative Velocity</h2>
<p>While we are unable to resolve the velocity of a target from a single chirp, if we look across multiple chirps, the relative velocity can be recovered. Recall that we are assuming that the velocity of the target is small, and its range does not change significantly over several chirps. Numerically, this results in FFTs with peaks at the same frequency bin. While it cannot be resolved as a difference in distance, this small displacement manifests as a phase shift.</p>
<p>Suppose two chirps are sent $T_c$ seconds (usually on the order of microseconds) apart. Recall that the IF signal is a sinusoid:<br>
$$A\cos(2\pi f t + \phi_0)$$</p>
<p>If the object is stationary, the phase term of the first chirp will be identical to that of the second chirp. However, if there is a slight change in distance between the first and second chirp, the IF signal of the second chirp will be a phase-delayed version of the first, with phase delay:<br>
$$\Delta \phi = \frac{4\pi \Delta d}{\lambda}$$</p>
<p>Using a $77\text{GHz}$ radar, a $1\text{mm}$ ($\approx\lambda/4$) displacement will result in a $\pi$ phase shift, with only an insignificant change in frequency. (The reader is encouraged to plug in some values here to see this. A typical slope, $B$, for a 77GHz FMCW radar is $50 \text{MHz}/\mu s$.)</p>
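<p>The arithmetic is quick to verify, using the $\Delta \phi = 4\pi \Delta d / \lambda$ relationship above:</p>

```python
import numpy as np

C0 = 3.0e8
lam = C0 / 77e9                  # wavelength at 77 GHz, roughly 3.9 mm
dphi = 4 * np.pi * 1e-3 / lam    # phase shift for a 1 mm displacement

# dphi works out to about 3.2 rad, i.e. on the order of pi,
# while the corresponding change in beat frequency is negligible.
```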
<p>Rearranging and dividing by the time between chirps, $T_c$, we obtain the relationship between the phase difference and the velocity of the target:<br>
$$v = \frac{\lambda \Delta \phi}{4 \pi T_c}$$</p>
<p>Numerically, the phase difference can be obtained by performing an FFT across chirps. The number of chirps and the period between the chirps determines the velocity resolution.</p>
<p>In a practical FMCW radar system, $N$ chirps are sent and processed as a group in order to determine the velocity of the target. We call this sequence of chirps a frame, and this is the basic unit of an FMCW radar signal.</p>
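<p>The velocity relationship above can be sketched as a one-line helper (the chirp period below is an illustrative value I chose, not one from the post):</p>

```python
import numpy as np

def velocity_from_phase(dphi, lam, Tc):
    """Relative velocity from the chirp-to-chirp phase difference dphi.

    Implements v = lambda * dphi / (4 * pi * Tc) from the derivation above.
    """
    return lam * dphi / (4 * np.pi * Tc)

lam = 3.0e8 / 77e9      # wavelength at 77 GHz
Tc = 100e-6             # assumed chirp period of 100 microseconds
v = velocity_from_phase(np.pi / 2, lam, Tc)   # about 4.9 m/s
```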
<h1 id="conclusion">Conclusion</h1>
<p>We have now established the basic principles behind the FMCW radar. We saw that by performing 2 FFTs, one within a chirp and another across chirps, we can estimate the range and relative velocity of a reflecting target. To design a system that operates with some desired performance parameters, we leave the following points as things to ponder:</p>
<ol>
<li>What are the limitations of an FMCW radar?</li>
<li>What determines the minimum resolvable distance (i.e. range resolution)?</li>
<li>What is the velocity resolution?</li>
<li>Is there an ambiguity in velocity estimation?</li>
<li>To measure the speed of vehicles, how long should each chirp be? What&apos;s the periodicity of the chirps?</li>
<li>What about angle estimation?</li>
</ol>
<p>A reader with some knowledge in digital signal processing should be able to derive these limits with the information in this post. We will leave these topics as an exercise for now, and provide a detailed treatment in the next post.</p>
<hr class="footnotes-sep">
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn1" class="footnote-item"><p><a href="http://www.radartutorial.eu/02.basics/Pulse%20Radar.en.html">http://www.radartutorial.eu/02.basics/Pulse Radar.en.html</a> <a href="#fnref1" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn2" class="footnote-item"><p><a href="https://en.wikipedia.org/wiki/Doppler_effect">https://en.wikipedia.org/wiki/Doppler_effect</a> <a href="#fnref2" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
</ol>
</section>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[4 Lines to Using TPUs on Google&apos;s Colab]]></title><description><![CDATA[Just add 4 lines to your Keras code and you can now train on TPUs in Colab for free. Why not give it a go?]]></description><link>https://dust.teckyianlim.me/using-tpus-on-googles-colab/</link><guid isPermaLink="false">5e41bf6321d5e8676ca52f12</guid><category><![CDATA[tensorflow]]></category><category><![CDATA[tpu]]></category><category><![CDATA[machine learning]]></category><category><![CDATA[code-snippet]]></category><dc:creator><![CDATA[TeckYian Lim]]></dc:creator><pubDate>Fri, 03 May 2019 05:17:42 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p><a href="https://colab.research.google.com">Google Colab</a> is a massive contribution to the democratization of machine learning. Not only are GPUs available for free (1x K80 at the time of writing), but you can also use Google&apos;s TPUs (Tensor Processing Units) for free. While there are some limitations, pretty big and non-trivial models can be trained so long as you have access to the internet and a relatively modern browser. What&apos;s more, it should not take more than a few minutes of your time to try it out.</p>
<h2 id="selecttpuruntime">Select TPU runtime</h2>
<p>In Colab menu, &quot;Runtime -&gt; Change runtime type&quot;. In the window that appears, under Hardware Accelerator, select TPU.</p>
<h2 id="4linestotpu">4 lines to TPU</h2>
<pre><code class="language-python">import os
import tensorflow as tf
from tensorflow.contrib.tpu.python.tpu import keras_support

tpu_grpc_url = &quot;grpc://&quot;+os.environ[&quot;COLAB_TPU_ADDR&quot;]
tpu_cluster_resolver = tf.contrib.cluster_resolver.TPUClusterResolver(tpu_grpc_url)
strategy = keras_support.TPUDistributionStrategy(tpu_cluster_resolver)
model = tf.contrib.tpu.keras_to_tpu_model(model, strategy=strategy)
</code></pre>
<p>That&apos;s it. (That&apos;s actually just a single line if you don&apos;t care about long lines)</p>
<p>If you already have a working Keras model, this is all you need to get it running in colab. Train it as usual with <code>model.fit_generator(...)</code></p>
<hr>
<h1 id="extrastuffnotes">Extra Stuff/Notes</h1>
<p>For completeness...</p>
<h2 id="gettingyourcodeanddataontocolab">Getting your code and data onto colab</h2>
<p>This is probably the hardest thing to do. Colab runtimes are given 50GB of temporary storage (approximately 30GB usable). If your code is on GitHub or somewhere publicly accessible, command line tools are available from within the notebook.</p>
<p>They can be downloaded easily like this:</p>
<pre><code>!git clone &lt;your-code.git&gt;
!wget http://your.data.server/dataset.tar.gz
</code></pre>
<p>Or you can click on the &apos;&gt;&apos; on the left to open a side panel where you can upload files.</p>
<h2 id="notesandcommonproblems">Notes and common problems</h2>
<ul>
<li>Copying back to CPU takes a while. Reducing the number of checkpoints will speed up the training significantly.</li>
<li>Use of a learning rate scheduler is required, even if it&apos;s just a constant.</li>
<li>The initial compilation of the TPU model might take quite a while, especially for very large models.</li>
<li>Error messages might be a little cryptic. I would definitely get a model running properly locally before running it on a TPU.</li>
</ul>
<h2 id="runnablenotebook">Runnable Notebook</h2>
<p><a href="https://colab.research.google.com/drive/12aezd43epJ-lmQdpvowIoqXfoXvFiQMX">https://colab.research.google.com/drive/12aezd43epJ-lmQdpvowIoqXfoXvFiQMX</a></p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Measuring Audio Quality]]></title><description><![CDATA[Audio perception is a complex process, involving knowledge from physical systems to psychology. Here I look at some of the attempts at applying an objective measure to what constitutes good quality audio.]]></description><link>https://dust.teckyianlim.me/audio-quality/</link><guid isPermaLink="false">5e41bf6321d5e8676ca52f0e</guid><category><![CDATA[signal processing]]></category><category><![CDATA[audio]]></category><category><![CDATA[dsp]]></category><category><![CDATA[paper reading]]></category><dc:creator><![CDATA[TeckYian Lim]]></dc:creator><pubDate>Mon, 01 Apr 2019 14:38:13 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p>At the heart of all audio processing algorithms is some notion of the quality of the resulting signal. In compression, the algorithm attempts to reduce the resources required (e.g. bitrate, bandwidth) while having as little impact as possible on the input signal. In audio enhancement, the algorithm takes an input signal and attempts to produce a signal that scores better on some quality metric.</p>
<p>But measuring quality is hard. The perception of audio is as much a psychological process as it is a physical one. Given a reference signal, we can always use the L2 distance, SNR, or some other mathematically defined metric as a measure of quality. However, such objective distances do not always correlate closely with how an (averaged) human listener perceives quality.</p>
<h1 id="subjectivemeasure">Subjective Measure</h1>
<h2 id="meanopinionscoremos">Mean Opinion Score (MOS)</h2>
<p>Since people&apos;s opinions might differ, it seems reasonable to collect the opinions of multiple listeners so as to obtain an average opinion on the quality. Expert listeners (people trained to pick up problems with audio) were asked to rate an audio sample against an original on the following scale.<sup class="footnote-ref"><a href="#fn1" id="fnref1">[1]</a></sup></p>
<table>
<thead>
<tr>
<th>Rating</th>
<th>Speech Quality</th>
</tr>
</thead>
<tbody>
<tr>
<td>5</td>
<td>Excellent</td>
</tr>
<tr>
<td>4</td>
<td>Good</td>
</tr>
<tr>
<td>3</td>
<td>Fair</td>
</tr>
<tr>
<td>2</td>
<td>Poor</td>
</tr>
<tr>
<td>1</td>
<td>Bad</td>
</tr>
</tbody>
</table>
<p>The arithmetic mean is then computed to obtain the MOS.</p>
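<p>As a tiny worked example with made-up ratings from five listeners:</p>

```python
ratings = [5, 4, 4, 3, 5]          # hypothetical expert ratings
mos = sum(ratings) / len(ratings)  # arithmetic mean of the opinions
print(mos)  # 4.2
```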
<h2 id="mushra">MUSHRA</h2>
<p>In a similar spirit to MOS, the MUltiple Stimuli with Hidden Reference and Anchor (MUSHRA) test is another method of obtaining an averaged opinion of human listeners. This test is aimed at audio of intermediate quality. <sup class="footnote-ref"><a href="#fn2" id="fnref2">[2]</a></sup></p>
<h1 id="objectivemeasures">Objective Measures</h1>
<p>While having expert human listeners in a well-controlled environment is definitely the gold standard for determining the quality of an audio clip, it&apos;s not always practical or scalable, especially when tuning an audio processing algorithm. Here, we look at some objective measures of speech (audio) quality and their definitions. This list is by no means exhaustive.</p>
<p>We define $x$ as the reference signal and $y$ as the signal under test. Capital letters denote a frequency domain representation.</p>
<h2 id="segsnrsegmentalsignaltonoiseratio">SegSNR (Segmental Signal-to-Noise Ratio)<sup class="footnote-ref"><a href="#fn3" id="fnref3">[3]</a></sup></h2>
<p>Defined as:<br>
$$\frac{10}{N}\sum^N_{i=1}\log_{10}\left(\frac{\sum^M_jx^2_{i,j}}{\sum^M_j(y_{i,j}-x_{i,j})^2}\right)$$</p>
<p>Where $x$ is the reference signal and $y$ is the signal under test. The subscript $i$ indexes segments and $j$ indexes samples within each segment of length $M$. This computes the SNR (in dB) of each segment and then averages over all segments.</p>
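<p>A minimal NumPy sketch of this metric; the segment length and numerical floor are arbitrary choices here, and practical implementations usually also clamp each segment&apos;s SNR to a range such as $[-10, 35]$ dB:</p>

```python
import numpy as np

def seg_snr(x, y, seg_len=256, eps=1e-10):
    """Average of the per-segment SNRs (in dB) between reference x and test y."""
    n_seg = len(x) // seg_len
    vals = []
    for i in range(n_seg):
        xs = x[i * seg_len:(i + 1) * seg_len]
        ys = y[i * seg_len:(i + 1) * seg_len]
        num = np.sum(xs ** 2)                  # segment signal energy
        den = np.sum((ys - xs) ** 2) + eps     # segment error energy
        vals.append(10 * np.log10(num / den + eps))
    return float(np.mean(vals))

# Sanity check: a 10x noisier signal should score about 20 dB lower
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 440 * np.arange(8000) / 8000)
noise = rng.standard_normal(len(x))
```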
<h2 id="lsdlogspectraldistance">LSD (Log Spectral Distance)</h2>
<p>Defined as:<br>
$$ \frac{1}{N}\sum^N_{i=1}\sqrt{\frac{1}{M/2+1}\sum_{j=0}^{M/2}\left(10\log_{10}\frac{|Y_{i,j}|}{|X_{i,j}|}\right)^2} $$<br>
A frequency domain assessment of speech audio quality.<sup class="footnote-ref"><a href="#fn4" id="fnref4">[4]</a></sup> $X, Y$ are the STFT spectra of the original signal and the signal under test, subscripted by time index $i$ and frequency bin $j$.</p>
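<p>A corresponding NumPy sketch; for simplicity it uses non-overlapping rectangular FFT frames where a real implementation would use a windowed, overlapping STFT:</p>

```python
import numpy as np

def lsd(x, y, n_fft=512, eps=1e-10):
    """Mean over frames of the RMS log-spectral difference (in dB)."""
    n = min(len(x), len(y)) // n_fft
    X = np.abs(np.fft.rfft(x[:n * n_fft].reshape(n, n_fft))) + eps
    Y = np.abs(np.fft.rfft(y[:n * n_fft].reshape(n, n_fft))) + eps
    d = (10 * np.log10(Y / X)) ** 2            # squared dB difference per bin
    return float(np.mean(np.sqrt(np.mean(d, axis=1))))

rng = np.random.default_rng(0)
x = rng.standard_normal(4096)
```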
<h2 id="wssweightedspectralslope">WSS (Weighted Spectral Slope)</h2>
<p>$$\frac{1}{N}\sum_{i=1}^{N}\left(\frac{\sum_{j=1}^M W_{i,j}(S^x_{i,j}-S^y_{i,j})^2}{\sum_{j=1}^M W_{i,j}}\right)$$</p>
<p>An auditory-model-based frequency domain assessment of speech audio quality.<sup class="footnote-ref"><a href="#fn5" id="fnref5">[5]</a></sup><sup class="footnote-ref"><a href="#fn6" id="fnref6">[6]</a></sup> The main idea behind this algorithm is to compare the spectral slopes of the reference and test signals, $S^x$ and $S^y$, grouped into weighted sub-bands indexed by $j$ for each frame $i$.</p>
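<p>A heavily simplified sketch of the idea, using uniform frequency bands and uniform weights where the actual measure uses critical bands and perceptually derived weights:</p>

```python
import numpy as np

def wss_like(x, y, n_fft=512, n_bands=16, eps=1e-10):
    """Weighted mean squared difference of per-band spectral slopes."""
    n = min(len(x), len(y)) // n_fft
    X = 20 * np.log10(np.abs(np.fft.rfft(x[:n * n_fft].reshape(n, n_fft))) + eps)
    Y = 20 * np.log10(np.abs(np.fft.rfft(y[:n * n_fft].reshape(n, n_fft))) + eps)
    m = X.shape[1] // n_bands * n_bands        # drop leftover bins
    Xb = X[:, :m].reshape(n, n_bands, -1).mean(axis=2)   # band levels, dB
    Yb = Y[:, :m].reshape(n, n_bands, -1).mean(axis=2)
    Sx = np.diff(Xb, axis=1)                   # spectral slope of the reference
    Sy = np.diff(Yb, axis=1)                   # spectral slope of the test signal
    W = np.ones_like(Sx)                       # uniform stand-in for the weights
    return float(np.mean(np.sum(W * (Sx - Sy) ** 2, axis=1) / np.sum(W, axis=1)))

rng = np.random.default_rng(0)
x = rng.standard_normal(4096)
```

Because only slopes are compared, a pure gain change leaves the score at (almost) zero; the measure reacts to changes in spectral shape rather than level.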
<h2 id="pesqperceptualevaulationofspeechquality">PESQ (Perceptual Evaluation of Speech Quality)</h2>
<p>This is a very involved objective metric with the goal of reproducing the MOS of human listeners. Several preprocessing steps were performed to align and equalize the input audio, and finally, a simple neural net is used to predict the MOS scores.<sup class="footnote-ref"><a href="#fn7" id="fnref7">[7]</a></sup></p>
<h1 id="suitabilityofobjectivemeasures">Suitability of Objective Measures</h1>
<p>A study<sup class="footnote-ref"><a href="#fn8" id="fnref8">[8]</a></sup> was conducted to investigate how well the objective measures listed above compare with MOS scores. The authors found that not all objective measures correlate well with the scores given by human listeners; some measures may correlate well on one type of noise but not on others.</p>
<p>Three types of noise were added to original signals (from <a href="https://catalog.ldc.upenn.edu/LDC2017S04">TIMIT</a>): white noise, factory noise, and babble noise. It was found that SegSNR performed poorly under all noise types, while the following correlated well:</p>
<ul>
<li>White Noise: LSD, WSS, PESQ</li>
<li>Factory Noise: LSD, WSS, PESQ</li>
<li>Babble Noise: LSD, PESQ</li>
</ul>
<p>The authors also claimed that LSD correlates the best with human listeners.</p>
<h1 id="differencesbetweenspeechandmusic">Differences Between Speech and Music<sup class="footnote-ref"><a href="#fn9" id="fnref9">[9]</a></sup></h1>
<p>In order to better understand the perception of audio quality, it&apos;s important to understand some properties of audio signals. Equipped with an understanding of the statistics of audio signals, we can then apply these models to enhance or regenerate missing components, thereby improving the perceived audio quality. We can broadly classify audio into 2 categories, namely speech and music. We will look at how these signals can be modelled and also some characteristics of sounds and audio.</p>
<h2 id="speechsignals">Speech Signals</h2>
<p>Speech is an important form of human communication. Due to its importance, there is a wealth of studies on its properties, and specialized algorithms are often created just for speech signals alone. PESQ, above, is just one example; applying PESQ to music probably will not give you a reliable score.</p>
<p>A good quality speech signal should be natural sounding and intelligible. Speech signals can be split into 2 components: the voiced component, modelled as an impulse train at the speaker&apos;s pitch, and the noise-like unvoiced component, modelled by a white noise generator (Figure 1).</p>
<p><img src="https://raw.githubusercontent.com/moodoki/phantom-dust/assets/assets/2019/03/SpeechProductionBlock.png" alt="SpeechProductionBlock" loading="lazy"></p>
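<p>The source-filter model in Figure 1 can be sketched in a few lines: an impulse train or white noise excites a resonant filter standing in for the vocal tract. The pitch, resonance frequency, and bandwidth below are arbitrary illustrative choices.</p>

```python
import numpy as np

fs = 16000                 # sample rate, Hz
f0 = 120                   # assumed pitch, Hz
n = fs // 2                # half a second of samples

# Voiced excitation: impulse train at the pitch period
voiced = np.zeros(n)
voiced[::fs // f0] = 1.0

# Unvoiced excitation: white noise
rng = np.random.default_rng(0)
unvoiced = rng.standard_normal(n)

def resonator(x, f_res, bw, fs):
    """One 2nd-order resonance as a crude stand-in for the vocal-tract filter."""
    r = np.exp(-np.pi * bw / fs)                  # pole radius from bandwidth
    a1 = -2 * r * np.cos(2 * np.pi * f_res / fs)  # feedback coefficients
    a2 = r * r
    y = np.zeros_like(x)
    for i in range(len(x)):
        y[i] = x[i]
        if i >= 1:
            y[i] -= a1 * y[i - 1]
        if i >= 2:
            y[i] -= a2 * y[i - 2]
    return y

# A crude vowel-like sound: mostly voiced excitation through one resonance
speech = resonator(0.9 * voiced + 0.05 * unvoiced, 700, 150, fs)
```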
<p>While the fundamental frequency of human speech tends to range from around 85Hz to 255Hz (covering both adult males and females), harmonics can be observed up to 8kHz. Energy can also be observed at even higher frequencies due to unvoiced portions of speech that aren&apos;t produced by the vocal cords. Besides the spectral content of segments, the ratio of silent to non-silent time segments is also an important property.</p>
<p><img src="https://raw.githubusercontent.com/moodoki/phantom-dust/assets/assets/2019/03/its_all_greek_spectrogram.png" alt="It&apos;s All Greek to Me" loading="lazy"><br>
Spectrogram of <a href="https://commons.wikimedia.org/wiki/File:En-us-it%27s_all_Greek_to_me.ogg">It&apos;s all Greek to me.</a></p>
<h2 id="music">Music</h2>
<p>Music, on the other hand, tends to have clear bandpass characteristics and regular temporal patterns as seen in the spectrogram. The shape of the spectrum is largely dependent on the instrument.</p>
<p><img src="https://raw.githubusercontent.com/moodoki/phantom-dust/assets/assets/2019/03/music_20-22s.png" alt="Spectrum of Mid-Air Machine - Those Who Discard the World" loading="lazy"><br>
Spectrum of the 20s-22s segment from Mid-Air Machine - Those Who Discard the World<sup class="footnote-ref"><a href="#fn10" id="fnref10">[10]</a></sup></p>
<p>Given the differences in the statistics of music and human speech, we should expect objective measures of quality to differ for voice and music. In fact, the ITU has also published a PEAQ measure, in a similar spirit to PESQ for speech. I have yet to find a study on how well these objective measures compare to an (averaged) human listener&apos;s evaluation.</p>
<h1 id="finalwords">Final Words</h1>
<p>Now that we are all playing around with deep neural nets, selecting or designing a good loss function is paramount to the success of the network. If we select a loss function that doesn&apos;t reflect how human listeners perceive audio, all of it might just be a fool&apos;s errand.</p>
<hr>
<h1 id="revisions">Revisions</h1>
<ul>
<li>6/5/2019: Fix MathJax/Markdown problems resulting in equations not rendering</li>
</ul>
<hr>
<h1 id="appendix">Appendix:</h1>
<p>Implementations for computing some of the above metrics.<br>
<a href="https://github.com/moodoki/audio_metrics">Github link</a></p>
<hr class="footnotes-sep">
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn1" class="footnote-item"><p>P. 800: Methods for subjective determination of transmission quality <a href="#fnref1" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn2" class="footnote-item"><p>BS.1534 : Method for the subjective assessment of intermediate quality levels of coding systems <a href="#fnref2" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn3" class="footnote-item"><p>Hansen, J. H., &amp; Pellom, B. L. (1998). An effective quality evaluation protocol for speech enhancement algorithms. In Fifth International Conference on Spoken Language Processing. <a href="#fnref3" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn4" class="footnote-item"><p>Beh, J., Baran, R. H., &amp; Ko, H. (2006). Dual channel based speech enhancement using novelty filter for robust speech recognition in automobile environment. IEEE Transactions on Consumer Electronics, 52(2), 583-589. <a href="#fnref4" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn5" class="footnote-item"><p>Klatt, D. (1982, May). Prediction of perceived phonetic distance from critical-band spectra: A first step. In ICASSP&apos;82. IEEE International Conference on Acoustics, Speech, and Signal Processing (Vol. 7, pp. 1278-1281). IEEE. <a href="#fnref5" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn6" class="footnote-item"><p>Kokkinakis, K., &amp; Loizou, P. C. (2011, May). Evaluation of objective measures for quality assessment of reverberant speech. In 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 2420-2423). IEEE. <a href="#fnref6" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn7" class="footnote-item"><p>Rix, A. W., Beerends, J. G., Hollier, M. P., &amp; Hekstra, A. P. (2001). Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs. In 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No. 01CH37221) (Vol. 2, pp. 749-752). IEEE. <a href="#fnref7" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn8" class="footnote-item"><p>Jie, Z., Zhao, X., Xu, J., &amp; Yang, Z. (2014, July). Suitability of speech quality evaluation measures in speech enhancement. In 2014 International Conference on Audio, Language and Image Processing (pp. 22-26). IEEE. <a href="#fnref8" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn9" class="footnote-item"><p>Aarts, R. M., Larsen, E., &amp; Ouweltjes, O. (2003, October). A unified approach to low-and high-frequency bandwidth extension. In Audio Engineering Society Convention 115. Audio Engineering Society. <a href="#fnref9" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn10" class="footnote-item"><p>Creative Commons music from <a href="http://freemusicarchive.org/music/Ask%20Again/Mid-Air_Machine_-_Singles/Those_Who_Discard_the_World">http://freemusicarchive.org/music/Ask Again/Mid-Air_Machine_-_Singles/Those_Who_Discard_the_World</a> <a href="#fnref10" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
</ol>
</section>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Wolfson DAC for the Raspberry Pi 1]]></title><description><![CDATA[Necroed my old Raspberry Pi 1 with the Wolfson DAC that was collecting dust. Amazing audio quality from this tiny thing. ]]></description><link>https://dust.teckyianlim.me/wolfson-dac-for-the-raspberry-pi-1/</link><guid isPermaLink="false">5e41bf6321d5e8676ca52f10</guid><category><![CDATA[raspberrypi]]></category><category><![CDATA[raspberrypi1]]></category><category><![CDATA[wolfson_dac]]></category><dc:creator><![CDATA[TeckYian Lim]]></dc:creator><pubDate>Thu, 28 Mar 2019 18:16:32 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p>Necroed my old Raspberry Pi 1 with the Wolfson DAC that was collecting dust.</p>
<p>Card specs<sup class="footnote-ref"><a href="#fn1" id="fnref1">[1]</a></sup>:</p>
<ul>
<li>Multiple analogue I/O</li>
<li>Digital IO (SPDIF)</li>
<li>Class-D amp for direct connection to speakers (some headers need to be soldered)</li>
<li>Stereo MEMS microphones</li>
<li>24-bit, 192kHz output</li>
</ul>
<p>Much better than on-board audio<sup class="footnote-ref"><a href="#fn2" id="fnref2">[2]</a></sup>:</p>
<ul>
<li>Analogue sound generated using PLL</li>
<li>11-bit, 48kHz analogue out</li>
<li>no input</li>
</ul>
<p>Used to require some image from element14/farnell. Now supported in official images. (For quite a while now<sup class="footnote-ref"><a href="#fn3" id="fnref3">[3]</a></sup>...)</p>
<h2 id="filestoedit">Files to edit</h2>
<h3 id="bootconfig"><code>/boot/config.txt</code></h3>
<pre><code>...
# Wolfson audio
dtoverlay=rpi-cirrus-wm5102
</code></pre>
<h3 id="etcmodprobedcirrusconf"><code>/etc/modprobe.d/cirrus.conf </code></h3>
<pre><code>softdep arizona-spi pre: arizona-ldo1
#Fix card numbering, wolfson(cirrus) 0, onboard 1
options snd slots=snd-soc-rpi-cirrus,snd-bcm2835
</code></pre>
<h2 id="configuringthecardsinputsandoutputs">Configuring the card&apos;s inputs and outputs</h2>
<p>Download the helper scripts <a href="http://www.horus.com/~hias/tmp/cirrus/cirrus-ng-scripts.tgz">here</a>. Extract them and put them somewhere in your <code>PATH</code>.</p>
<p>These are basically <code>amixer</code> scripts to help you configure for various tasks:</p>
<ul>
<li><code>Record_from_*.sh</code> to choose recording input</li>
<li><code>Playback_to_*.sh</code> to choose output</li>
<li><code>Reset_paths.sh</code> to set inputs and outputs to defaults.</li>
<li><code>Cirrus_listen.sh</code> to configure IO mixing. (eg. SPDIF to lineout)</li>
</ul>
<h2 id="appendixes">Appendixes</h2>
<p>Original docs <a href="https://www.horus.com/~hias/cirrus-driver.html">here</a></p>
<p>Finally, some high-res audio to play with <a href="http://www.2l.no/hires/">here</a>.</p>
<hr class="footnotes-sep">
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn1" class="footnote-item"><p><a href="https://www.farnell.com/datasheets/1805130.pdf">https://www.farnell.com/datasheets/1805130.pdf</a> <a href="#fnref1" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn2" class="footnote-item"><p><a href="https://www.raspberrypi.org/forums/viewtopic.php?t=59823">https://www.raspberrypi.org/forums/viewtopic.php?t=59823</a> <a href="#fnref2" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn3" class="footnote-item"><p><a href="https://www.horus.com/~hias/">https://www.horus.com/~hias/</a> <a href="#fnref3" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
</ol>
</section>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Portable Ubuntu USB Stick with Persistent Storage]]></title><description><![CDATA[<!--kg-card-begin: markdown--><p>UNetBootin allows creation of such a stick, but it only takes care of creating stuff that&apos;s needed for legacy BIOS boot<sup class="footnote-ref"><a href="#fn1" id="fnref1">[1]</a></sup>. If you are using UEFI (Macs don&apos;t do legacy boot), things won&apos;t work as expected. To get it working, add <code>persistent</code></p>]]></description><link>https://dust.teckyianlim.me/portable-ubuntu-usb-stick/</link><guid isPermaLink="false">5e41bf6321d5e8676ca52f02</guid><category><![CDATA[linux]]></category><category><![CDATA[ubuntu]]></category><dc:creator><![CDATA[TeckYian Lim]]></dc:creator><pubDate>Mon, 25 Mar 2019 11:09:07 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p>UNetBootin allows creation of such a stick, but it only takes care of creating stuff that&apos;s needed for legacy BIOS boot<sup class="footnote-ref"><a href="#fn1" id="fnref1">[1]</a></sup>. If you are using UEFI (Macs don&apos;t do legacy boot), things won&apos;t work as expected. To get it working, add <code>persistent</code> to the &quot;Try Ubuntu without installing&quot; entry of <code>grub.cfg</code> and <code>loopback.cfg</code> in the <code>/boot/grub</code> folder of the USB stick that was created. (Or create a new menu entry)</p>
<pre><code>...
menuentry &quot;Try Ubuntu without installing&quot; {
	set gfxpayload=keep
	linux	/casper/vmlinuz  file=/cdrom/preseed/ubuntu.seed boot=casper  quiet splash persistent ---  
	initrd	/casper/initrd.lz
}
...
</code></pre>
<p>The <code>casper-rw</code> file is where persistent changes are stored. This is simply a large file formatted with a filesystem, say <code>ext4</code>.</p>
<p>Creating this file in Linux is easy:</p>
<pre><code class="language-bash"># stay just under FAT32&apos;s 4GiB file size limit
dd if=/dev/zero of=casper-rw count=4095 bs=1M
mkfs.ext4 -F casper-rw
</code></pre>
<p>With this file on the usb stick, passing the <code>persistent</code> parameter to the kernel boot options will mount this file when booting. Any changes made to the root file system will be stored in this file.</p>
<p>To verify that this is configured correctly, run <code>df -h</code>. You should see a line that says:</p>
<pre><code>Filesystem             Size  Used Avail Use% Mounted on
...
/cow                   3.9G 1019M  2.7G  28% /
...
</code></pre>
<h1 id="bonusround0hostname">Bonus Round 0 - Hostname!</h1>
<p>Perhaps you want to have a fixed host name for your usb stick. This can be done easily by editing the same menu entry by adding <code>hostname=yournamehere</code>.</p>
<h1 id="bonusround1largerpersistentstorage">Bonus Round 1 - Larger Persistent Storage</h1>
<p>The limitation of using a loopback file is that the file can&apos;t be larger than the maximum file size supported by the underlying filesystem. Since the Ubuntu stick uses FAT32, we are stuck with a maximum of 4GB. This can be overcome by creating an actual partition for the files instead, so long as the partition has the label <code>casper-rw</code>.</p>
<h2 id="specialnames">Special names</h2>
<p>Besides <code>casper-rw</code>, you can also create a <code>home-rw</code> to be automatically mounted as <code>/home</code>.</p>
<p>Furthermore, you can also create <code>casper-sn*</code> and <code>home-sn*</code> to be used as snapshots. These snapshots are copied to the filesystem after the persistent volumes are mounted. (More details <a href="http://manpages.ubuntu.com/manpages/bionic/man7/casper.7.html">here</a>.)</p>
<h1 id="bonusround2encryptedhome">Bonus Round 2 - Encrypted <code>home</code></h1>
<p>First create a fully encrypted partition. Current Ubuntu Live images (last checked 18.04) have dm-crypt included.</p>
<h2 id="createencryptedpartitonwithdmcryptandluks">Create encrypted partiton with dm-crypt and LUKS</h2>
<ol>
<li>Install cryptsetup if not available: <code>apt install cryptsetup-bin</code></li>
<li>Create encrypted partition: <code>cryptsetup -v -y luksFormat /dev/sdXX</code></li>
<li>Open encrypted partition: <code>cryptsetup luksOpen /dev/sdXX home-rw</code></li>
<li>Check status if desired: <code>cryptsetup -v status home-rw</code></li>
<li>Fill with zeros for security: <code>dd if=/dev/zero of=/dev/mapper/home-rw bs=1M status=progress</code> (This can take a <em>very</em> long time)</li>
<li>Format with desired filesystem: <code>mkfs.ext4 /dev/mapper/home-rw</code></li>
</ol>
<h2 id="automounting">Automounting</h2>
<p>Encrypted partitions can&apos;t be picked up by casper boot&apos;s automounting, and editing /etc/fstab doesn&apos;t work either, as this file is regenerated on each boot. <s>Instead edit <code>/usr/share/initramfs-tools/scripts/casper-bottom/12fstab</code>.</s> (No longer works with 18.04; you&apos;ll have to regenerate the squashfs file, too much work.)</p>
<p>Run <code>blkid</code> and take note of the UUID of the encrypted partition. Edit <code>/etc/crypttab</code> so that the volume will be set up automatically during boot. Not specifying a passkey will present you with a prompt to enter the passkey during boot.</p>
<pre><code>#name device passkey type
home-rw UUID=&quot;...&quot; none luks 
</code></pre>
<p>Finally, add/create a line in <code>rc.local</code> to mount:</p>
<pre><code class="language-rc.local">#!/bin/sh -e
mount -t ext4 /dev/mapper/home-rw /home
</code></pre>
<p>Note: since this is done after the live system creates the user, the default ubuntu user will have no home directory, and graphical login for the ubuntu user will fail. This can be fixed by copying over the added home directory.</p>
<h1 id="finalnotes">Final notes</h1>
<p>Live systems can be fragile. <code>apt upgrade</code> can break stuff. I recommend keeping the home partition separate and upgrading the live image every once in a while rather than upgrading individual packages. Also, surprisingly, Nvidia&apos;s .run driver installation works.</p>
<p>Maybe a completely customized LiveUSB is more worth the time? Maybe next time.</p>
<hr>
<p>Edit:</p>
<ul>
<li>2/6/2019 - fixed typos and missing <code>luksFormat</code> in encryption setup</li>
</ul>
<hr class="footnotes-sep">
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn1" class="footnote-item"><p>If you don&apos;t bother about legacy boot, simply copying the files over from the iso image will work as well. Remember to include all hidden files ad well, and to set boot flags if there&apos;s more than 1 partition. <a href="#fnref1" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
</ol>
</section>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Mirrored Strategy]]></title><description><![CDATA[Have more than 1 GPU on your setup? Data parallelism on multiple identical GPUs is easy when training with Tensorflow Estimators, and just marginally less convenient with Keras' `model.fit`.]]></description><link>https://dust.teckyianlim.me/mirrored-strategy/</link><guid isPermaLink="false">5e41bf6321d5e8676ca52f0d</guid><category><![CDATA[tensorflow]]></category><category><![CDATA[tf.estimator]]></category><category><![CDATA[wip]]></category><dc:creator><![CDATA[TeckYian Lim]]></dc:creator><pubDate>Tue, 12 Mar 2019 16:38:31 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p>Have more than 1 GPU on your setup? Data parallelism on multiple identical GPUs is easy when training with Tensorflow Estimators, and just marginally less convenient with Keras&apos; <code>model.fit</code>.</p>
<h2 id="estimators">Estimators</h2>
<pre><code class="language-python">strat = tf.distribute.MirroredStrategy(local_gpu_list)
runconfig = tf.estimator.RunConfig(train_distribute=strat,
                                   eval_distribute=strat,
                                  )
</code></pre>
<p>If the evaluation dataset&apos;s <code>input_fn</code> is something that Tensorflow can&apos;t figure out how to split/shard, you might run into errors during evaluation. The exact same input function can work properly during training, but throw errors when doing evaluation.</p>
<h2 id="kerasmodels">Keras models</h2>
<p>For Keras models to take advantage of multiple GPUs, it&apos;s just slightly more annoying. The model has to be created and compiled in the strategy scope.</p>
<p>Example with sequential model:</p>
<pre><code class="language-python">import tensorflow as tf
from tensorflow.keras import models, layers

strat = tf.distribute.MirroredStrategy(local_gpu_list)
with strat.scope():
    model = models.Sequential([layers.InputLayer(input_shape=[64, 64, 3]),
                               layers.Conv2D(64, 3, padding=&apos;same&apos;),
                               ...,
                              ])
    model.compile(loss=&apos;binary_crossentropy&apos;, optimizer=&apos;adam&apos;)
</code></pre>
<hr>
<h1 id="todo">TODO:</h1>
<ul>
<li>Splitting datasets/sharding</li>
</ul>
<hr>
<p>Edits:</p>
<p>- 5/6/2019: Added snippet for Keras models</p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Discovering Compute Devices in Tensorflow]]></title><description><![CDATA[Add simple device discovery to your code so that it can run on multiple devices perhaps with differing configs without needing to configure them explicitly.]]></description><link>https://dust.teckyianlim.me/discovering-devices/</link><guid isPermaLink="false">5e41bf6321d5e8676ca52f0c</guid><category><![CDATA[tensorflow]]></category><dc:creator><![CDATA[TeckYian Lim]]></dc:creator><pubDate>Tue, 12 Mar 2019 15:37:28 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><pre><code class="language-python">from tensorflow.python.client import device_lib

devices = device_lib.list_local_devices()
</code></pre>
<p>Each item is a <code>DeviceAttribute</code>, which we can use to find out the device types and names. Somehow this isn&apos;t well documented in Tensorflow&apos;s documentation. The attributes <code>name</code>, <code>device_type</code>, and <code>memory_limit</code> are the most useful.</p>
<h1 id="attributes">Attributes</h1>
<h2 id="name"><code>name</code></h2>
<p>A string that can be used to specify the compute device in Tensorflow, eg, <code>tf.device(devices[0].name)</code>, to explicitly state device placement.</p>
<h2 id="device_type"><code>device_type</code></h2>
<p>A string that specifies the type of device. It can take the following values:</p>
<ul>
<li><code>CPU</code> for the CPUs</li>
<li><code>GPU</code> for GPUs that are visible to Tensorflow</li>
<li><code>TPU</code> for Google&apos;s own TPUs; we&apos;ll probably never see this (or maybe we will? Google AIY <a href="https://cloud.google.com/edge-tpu/">Edge TPUs</a>)</li>
</ul>
<p>More recently, there are also <a href="https://www.tensorflow.org/xla">XLA (Accelerated Linear Algebra)</a> devices: <code>XLA_CPU</code> and <code>XLA_GPU</code>.</p>
<h2 id="memory_limit"><code>memory_limit</code></h2>
<p>Total memory in bytes. Perhaps this can be used to compute the batch size to use?</p>
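<p>To make that idea concrete, here is a purely hypothetical heuristic; the 0.8 headroom factor and power-of-two preference are arbitrary assumptions, and in practice activation memory depends heavily on the model:</p>

```python
def suggest_batch_size(memory_limit_bytes, bytes_per_sample, headroom=0.8):
    """Largest power-of-two batch whose input tensors fit in a memory budget."""
    budget = memory_limit_bytes * headroom
    batch = 1
    while batch * 2 * bytes_per_sample <= budget:
        batch *= 2
    return batch

# e.g. an 8 GiB device and float32 224x224x3 images
print(suggest_batch_size(8 * 2**30, 224 * 224 * 3 * 4))  # 8192
```

Treat this strictly as a starting point for experimentation, not a guarantee against out-of-memory errors.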
<h2 id="otherattributes">Other Attributes</h2>
<p>There&apos;s also the <code>incarnation</code> and <code>locality</code> attributes. <code>locality</code> isn&apos;t meaningful here as we are referring to local devices; thus it&apos;s always an empty <code>dict</code>.</p>
<p>I have no idea what <code>incarnation</code> is.</p>
<h1 id="practicalusageexampleforusingmultiplegpuswithestimators">Practical usage example for using multiple GPUs with Estimators</h1>
<pre><code class="language-python">local_gpus = [d.name for d in devices if d.device_type == &apos;GPU&apos;]

strat = tf.distribute.MirroredStrategy(local_gpus)

runconfig = tf.estimator.RunConfig(train_distribute=strat,
                                   eval_distribute=strat)

est = tf.estimator.Estimator(..., config=runconfig)
</code></pre>
<hr>
<h2 id="obtainingdevicelistswithtfsession">Obtaining device lists with tf.Session()</h2>
<p>Another way of obtaining compute devices is with:</p>
<pre><code class="language-python">import tensorflow as tf

with tf.Session() as sess:
  devices = sess.list_devices()
</code></pre>
<p>This, however, is slightly different: the objects returned are of type <code>session._DeviceAttributes</code>. Its <code>name</code> string now also includes where the device lives, and it can also reference remote devices if an address is passed into <code>tf.Session</code>, say a TPU worker.</p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Using Tensorflow&apos;s Dataset API]]></title><description><![CDATA[<!--kg-card-begin: markdown--><p>TensorFlow&apos;s new Dataset API (available from 1.8) makes creating input pipelines much easier. Using it should be painless if you have something is an iterable, one of the common formats (files in a folder, csv, numpy array) or TFRecord, life is gonna be much easier, and <code>from_</code></p>]]></description><link>https://dust.teckyianlim.me/using-tensorflows-dataset-api/</link><guid isPermaLink="false">5e41bf6321d5e8676ca52f01</guid><category><![CDATA[tensorflow]]></category><dc:creator><![CDATA[TeckYian Lim]]></dc:creator><pubDate>Mon, 01 Oct 2018 07:43:51 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p>TensorFlow&apos;s new Dataset API (available from 1.8) makes creating input pipelines much easier. Using it should be painless: if you have an iterable, one of the common formats (files in a folder, csv, numpy array), or TFRecord, life is gonna be much easier, and <code>from_generator</code> is perhaps the easiest way to get any dataset into TensorFlow.</p>
<p>Usage pattern:</p>
<ol>
<li>Create the &apos;raw&apos; <code>dataset</code> with one of:
<ul>
<li><code>tf.data.Dataset.from_generator()</code> for some function with a <code>yield</code></li>
<li><code>tf.data.TFRecordDataset()</code> for reading from TFRecords</li>
<li><code>tf.data.Dataset.from_tensor_slices()</code> for numpy arrays. (Sparse version available too)</li>
<li><code>tf.data.TextLineDataset()</code> for text files like <code>.csv</code>s</li>
</ul>
</li>
<li>Apply transforms, if desired, with <code>dataset.map(...)</code></li>
<li>Randomize order with <code>.shuffle(buffer_size=n)</code></li>
<li>Set the number of repeats with <code>.repeat(n)</code>. (Pass nothing for it to repeat forever)</li>
<li>Set batch size with <code>.batch(n)</code></li>
<li>Obtain iterator with  <code>iter = dataset.make_one_shot_iterator()</code></li>
<li>Elements can now be obtained with: <code>x, y = iter.get_next()</code></li>
<li>Build graph using <code>x, y</code> directly. <em>No placeholders needed!</em></li>
</ol>
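<p>The <code>shuffle</code> and <code>batch</code> steps above have simple semantics that can be sketched in plain Python (an illustration of the behaviour only, not TensorFlow&apos;s actual implementation):</p>

```python
import random

def shuffle(iterable, buffer_size, seed=0):
    """Mimics Dataset.shuffle: keep a buffer of elements and yield a
    random one as each new element arrives. Only approximately random
    when buffer_size is smaller than the dataset."""
    rng = random.Random(seed)
    buf = []
    for item in iterable:
        buf.append(item)
        if len(buf) >= buffer_size:
            yield buf.pop(rng.randrange(len(buf)))
    while buf:  # drain the remaining buffered elements
        yield buf.pop(rng.randrange(len(buf)))

def batch(iterable, n):
    """Mimics Dataset.batch: group consecutive elements into lists of n."""
    out = []
    for item in iterable:
        out.append(item)
        if len(out) == n:
            yield out
            out = []
    if out:  # final partial batch, like batch() without drop_remainder
        yield out

batches = list(batch(shuffle(range(10), buffer_size=4), 3))
```

This also makes it clear why a larger <code>buffer_size</code> gives better mixing: elements can only move within the buffer window.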
<h1 id="migratingfromtfrecordsandqueuerunners">Migrating from TFRecords and QueueRunners</h1>
<p>If you have been using TFRecords and QueueRunners, switching over to the new Dataset API will be very painless.</p>
<p>Your original input pipeline should have something like this</p>
<pre><code class="language-python">reader = tf.TFRecordReader()
_, example = reader.read(filenamequeue)
fmt = ...
features = tf.parse_single_example(example, features=fmt)
x = features[&apos;data&apos;]
y = features[&apos;label&apos;]
</code></pre>
<p>In the new API, we gather all the parsing and preprocessing we need for each example into a function. This is then applied to the dataset using <code>.map()</code>.</p>
<pre><code class="language-python">def parse_func(example):
    fmt = { &lt;key1&gt; : tf.FixedLenFeature( &lt;shape&gt;, &lt;dtype&gt;, &lt;default_value&gt; (optional) ),
            &lt;key2&gt; : tf.VarLenFeature( &lt;dtype&gt; ), ...
          }
    parsed = tf.parse_single_example(example, fmt)
    return parsed[&lt;key1&gt;], ...
</code></pre>
<h2 id="fullbasicexamplewithtfrecord">Full basic example with TFRecord</h2>
<pre><code class="language-python">data_raw = tf.data.TFRecordDataset(filename) #or list of filenames

def _parse_func(example):
    #example is a Tensor of bytes. Needs to be parsed with parse_single_example
    example_fmt = { &apos;x&apos;: tf.FixedLenFeature((), tf.string, &apos;&apos;),
                    &apos;y&apos;: tf.FixedLenFeature((), tf.string, &apos;&apos;),
                  }
    parsed = tf.parse_single_example(example, example_fmt)
    #parsed is a dictionary of tensors
    #Can do further processing of the tensors now, or simply return them
    return (parsed[&apos;x&apos;], parsed[&apos;y&apos;])

data = data_raw.map(_parse_func)

#Make it shuffled, repeated, and batched
data = data.repeat().shuffle(buffer_size=BATCH_SIZE*10).batch(BATCH_SIZE)

iter = data.make_one_shot_iterator()
x, y = iter.get_next()

#Build graph, note how x and y are used
net = tf.layers.dense(x, 512, activation=tf.nn.relu)
net = tf.layers.dense(net, 512, activation=tf.nn.relu)
pred = tf.layers.dense(net, 10)

#softmax_cross_entropy takes (onehot_labels, logits), in that order
loss = tf.losses.softmax_cross_entropy(y, pred)

#GradientDescentOptimizer needs a learning rate; 0.01 is an arbitrary choice
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for ii in range(MAX_ITER):
        _, curr_loss = sess.run([train_op, loss])
        print(&apos;Iter: {}, Loss: {}&apos;.format(ii, curr_loss))
</code></pre>
<h3 id="onparsingcompressedimages">On parsing (compressed) images</h3>
<p>Images are parsed as <code>FixedLenFeature</code>, even if they might be compressed and of different sizes (byte length). This is because <code>FixedLen</code> here refers to the tensor length, not the number of bytes in the Tensor. A variable-sized image is still a single-element Tensor of type BytesList.</p>
<h1 id="fromalmostanythingelsewithgenerators">From (almost) anything else with generators</h1>
<p>I find that the best part of the Dataset API is <code>from_generator()</code>. So long as you know how to iterate through the examples, you should be able to wrap them into the Dataset API without much difficulty.</p>
<h2 id="examplewithhdf5generators">Example with HDF5 + generators</h2>
<p>Here&apos;s a toy example of reading an HDF5 file with keys <code>x</code> and <code>y</code>. (Unfortunately, in h5py&apos;s documentation, these are called datasets.)</p>
<pre><code class="language-python">import h5py
import tensorflow as tf
import numpy as np

in_file = h5py.File(&apos;data.h5&apos;, &apos;r&apos;)
x_in = in_file.get(&apos;x&apos;)
y_in = in_file.get(&apos;y&apos;)

def gen():
    for x, y in zip(x_in, y_in):
        yield x, y

#Let&apos;s assume that x is a 3-vector of floats and y is a scalar int
d = tf.data.Dataset.from_generator(gen,
                                   output_types=(tf.float32, tf.int32),
                                   output_shapes=([3], [])
                                   )
</code></pre>
<p>That&apos;s it! The HDF5 file (or whichever esoteric reader you might have) is now wrapped in a nice Dataset API, with all the batching, pipelined reading, and shuffling available to you!</p>
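<p>For instance, a plain CSV reader can be wrapped the same way (a sketch with a made-up two-column layout; the <code>from_generator</code> call is shown as a comment since it mirrors the HDF5 example above):</p>

```python
import csv
import io

# Pretend this came from open('data.csv') -- two float features and a label
csv_text = "0.5,1.5,0\n2.0,3.0,1\n"

def gen():
    for row in csv.reader(io.StringIO(csv_text)):
        # first columns are the features, the last is the label
        yield [float(v) for v in row[:-1]], int(row[-1])

# Wrapping it is identical to the HDF5 case:
# d = tf.data.Dataset.from_generator(gen,
#                                    output_types=(tf.float32, tf.int32),
#                                    output_shapes=([2], []))

rows = list(gen())
```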
<hr>
<h2 id="todo">TODO:</h2>
<p>Planned updates:</p>
<ul>
<li>[x] Notes on parse function for TFRecord</li>
<li>[ ] Fancy initializers</li>
<li>[ ] Using with graphs built with placeholders</li>
<li>[ ] Boilerplate file gist?</li>
</ul>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Deploying Ghost to Heroku]]></title><description><![CDATA[<!--kg-card-begin: markdown--><p>or how this blog was made.</p>
<p>I was in search of a Markdown based blogging platform that I could use for free. Jekyll and Github Pages is really nice, but the lack of a web/app interface to write stuff is rather limiting at times.</p>
<p>Fortunately, there&apos;s Heroku</p>]]></description><link>https://dust.teckyianlim.me/deploying-ghost-to-heroku/</link><guid isPermaLink="false">5e41bf6321d5e8676ca52ef9</guid><category><![CDATA[web]]></category><category><![CDATA[heroku]]></category><dc:creator><![CDATA[TeckYian Lim]]></dc:creator><pubDate>Thu, 19 Jul 2018 07:02:15 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p>or how this blog was made.</p>
<p>I was in search of a Markdown based blogging platform that I could use for free. Jekyll and Github Pages is really nice, but the lack of a web/app interface to write stuff is rather limiting at times.</p>
<p>Fortunately, there&apos;s Heroku and Ghost. Heroku lets you run small-scale web apps for free, and with some configuration, you can get Ghost to run on it.</p>
<h2 id="prelimiaries">Prelimiaries</h2>
<p>For the free databases to work, billing needs to be set up on your Heroku account. Create your Heroku account, install the command line tool, and enable billing. On top of being able to use add-ons, you get more free dyno hours too! Don&apos;t worry, everything used here is free.</p>
<h2 id="gettingabareminimumblogworking">Getting a bare minimum blog working</h2>
<p>Once you have done all the preliminary stuff, paste the commands below and you&apos;ll have a (mostly) working blog!</p>
<p><em>Copy and paste without knowing what&apos;s going on? Who cares.. I&apos;ve added comments if you are confused.</em></p>
<pre><code class="language-bash">APP_NAME=ghostblog

#Download the archive, newest available at https://ghost.org/developers
wget https://github.com/TryGhost/Ghost/releases/download/1.24.8/Ghost-1.24.8.zip
mkdir $APP_NAME
cd $APP_NAME
unzip ../Ghost-1.24.8.zip

#Create a git repo and commit
git init
git add -A
git commit -m &quot;Initial everything&quot;

#The heroku command will create the app and set up the git remote
heroku create $APP_NAME
#pushing to heroku will automatically deploy the website
git push heroku master

#Free DB plan (kitefin) from JAWS DB
heroku addons:create jawsdb:kitefin

#JAWSDB_URL is set by the addon after the previous command
DBURL=`heroku config:get JAWSDB_URL`
#This returns mysql://&lt;user&gt;:&lt;password&gt;@&lt;server&gt;:&lt;port&gt;/&lt;database&gt;

#Some regex magic to get the params from the address
DBUSER=`sed &apos;s/.*\/\(.*\):.*@.*/\1/&apos; &lt;&lt;&lt; $DBURL`
DBPASS=`sed &apos;s/.*\/.*:\(.*\)@.*/\1/&apos; &lt;&lt;&lt; $DBURL`
DBSERVER=`sed &apos;s/.*@\(.*\):.*/\1/&apos; &lt;&lt;&lt; $DBURL`

#These config vars corresponds to database.connection.* if set in config.*.js
heroku config:set \
    database__connection__user=$DBUSER\
    database__connection__password=$DBPASS\
    database__connection__host=$DBSERVER\
    database__connection__database=${DBURL##*/}  

#Initialize the database with Ghost&apos;s initializer
heroku run knex-migrator init

#Set some server parameters
#Heroku seems to start on random ports
echo export server__port=\$PORT npm start &gt; .profile
git add .profile
git commit -m &quot;Server port config&quot;
git push heroku

#Listen on all addresses, url is read by engine to provide a &quot;Home&quot; link
heroku config:set \
    url=https://$APP_NAME.herokuapp.com \
    server__host=0.0.0.0
    
</code></pre>
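<p>The sed incantations above just pick apart the database URL; if the regex magic feels fragile, the same extraction can be sketched with Python&apos;s <code>urllib.parse</code> (the URL here is a made-up example, not real credentials):</p>

```python
from urllib.parse import urlparse

# Same shape as JAWSDB_URL: mysql://<user>:<password>@<server>:<port>/<database>
url = urlparse("mysql://alice:s3cret@db.example.com:3306/blogdb")

user = url.username              # 'alice'
password = url.password          # 's3cret'
host = url.hostname             # 'db.example.com'
database = url.path.lstrip("/")  # 'blogdb'
```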
<p>After this you should be able to see your pretty Ghost site at <code>https://$APP_NAME.herokuapp.com</code>!<br>
<img src="https://raw.githubusercontent.com/moodoki/phantom-dust/assets/assets/2018/07/ghost_ready.jpg" alt="ghost_ready" loading="lazy"></p>
<h2 id="configuringthesite">Configuring the site</h2>
<p>Now that all is done, you should be able to access your admin page at <code>https://&lt;app-name&gt;.herokuapp.com/ghost</code>. When you first access this, you will get to create your account and stuff. Next, you probably will want to delete the &quot;Ghost&quot; user to get rid of the example posts.</p>
<p>If you aren&apos;t planning to upload any images through Ghost&apos;s interface, you are done! If not read on.</p>
<h2 id="sslbecausetheinternetisdangerous">SSL, because the Internet is dangerous</h2>
<p>If you are using Heroku&apos;s subdomain (i.e. <code>appname.herokuapp.com</code>), SSL is automatic. Heroku will only allow you to configure SSL certificates if you upgrade to a paid plan. However, all is not lost: use Cloudflare as your DNS and you&apos;ll get free SSL from Cloudflare. This is as simple as setting the <code>CNAME</code> entry of your site to <code>appname.herokuapp.com</code>.</p>
<p>After doing this, you might want to change the <code>url</code> config variable to point to your custom domain.</p>
<h2 id="gettingfileuploadstoworkcorrectly">Getting file uploads to work correctly</h2>
<p>Web apps on Heroku run in an ephemeral VM. Once it shuts down, files are lost. This means that images that are uploaded will stop existing once your app shuts down due to inactivity. Fortunately, Ghost allows custom storage adapters, meaning we can make use of some free services out there.</p>
<p>I chose to base <a href="https://github.com/moodoki/ghost-github">mine</a> on <a href="https://github.com/ifvictr/ghost-github">ghost-github</a>.</p>
<p>However, the author&apos;s documentation stated that you are required to have your access tokens/passwords in the config file, stored in the clear, and perhaps inadvertently on some publicly accessible repository like GitHub.</p>
<p>To add files to GitHub programmatically, you will need to get a personal access token from <a href="https://github.com/settings/tokens/new">here</a>.<br>
<em>Aside: you probably want to create a separate machine user and share the repo with this user for this purpose. Obtain the token for this machine user<sup class="footnote-ref"><a href="#fn1" id="fnref1">[1]</a></sup>.</em></p>
<p><em>Note: The assets repo seems to need to be a public repo. The adapter generates some access token if the repo is private, but it doesn&apos;t seem to work.</em></p>
<pre><code class="language-bash">#Get dependencies in yarn.lock and packages.json
yarn install ghost-github

cd content/adapters/storage
git submodule add https://github.com/moodoki/ghost-github.git

git commit -a -m &quot;Add storage adapter&quot;
git push heroku master

#Config vars read by the adapter
#if you are using a shared repo and a machine user, 
#the REPO_OWNER should be set to the actual owner
heroku config:set \
    GHOST_GH_DESTINATION=&lt;folder&gt; \
    GHOST_GH_REPO=&lt;repo_name&gt; \
    GHOST_GH_REPO_OWNER=&lt;github_repo_owner_username&gt; \
    GHOST_GH_BRANCH=&lt;repo_branch&gt; \
    GHOST_GH_TYPE=token \
    GHOST_GH_TOKEN=&lt;access_token&gt;

#Tell ghost to use the adapter
heroku config:set \
    storage__active=ghost-github
</code></pre>
<p>File uploads should be working now!</p>
<p><em>Disclaimer: I&apos;m no expert with JavaScript or NodeJS. I have no idea how Ghost is able to get the config vars from either environment variables or config files transparently.</em></p>
<h3 id="links">Links</h3>
<ol>
<li>Ghost publishing platform. <a href="https://github.com/TryGhost/Ghost">[Source]</a><a href="https://ghost.org">[Official Website]</a><a href="https://ghost.org/developers">[Archive download]</a></li>
<li>My fork of ghost-github storage adapter. <a href="https://github.com/moodoki/ghost-github">[Github]</a></li>
</ol>
<hr>
<p>In hindsight, perhaps running everything in a free VM on GCP might be a lot easier. Although connections are metered, the always-free tier is rather generous, most likely more than sufficient for a moderately sized website. More on this perhaps next time.</p>
<hr class="footnotes-sep">
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn1" class="footnote-item"><p>Machine users are allowed in Github terms, read <a href="https://developer.github.com/v3/guides/managing-deploy-keys/#machine-users">this</a> <a href="#fnref1" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
</ol>
</section>
<!--kg-card-end: markdown-->]]></content:encoded></item></channel></rss>