Tuesday, November 11, 2014

Deploy ownCloud with Bitnami in Google Cloud Platform

Google Cloud Platform recently announced the availability of Ubuntu images for Google Compute Engine. To take this favorite operating system of mine for a test drive, I decided to deploy an application on one of the new images.

The first thing that came to mind was ownCloud, an open-source alternative to document sync solutions such as Dropbox, Box, etc. Traditionally, ownCloud is a software package deployed on-premise. With virtual machines (VM) available from IaaS providers, it is conceivable that one could run ownCloud in a public cloud, so that a personal version of Dropbox can be had without the usual security and privacy concerns. (There are commercial file sync solutions that address this issue by design, including SpiderOak and Wuala.)

Initial deployment

First, I created an Ubuntu VM instance in Google Compute Engine. I made sure to select both "Allow HTTP traffic" and "Allow HTTPS traffic". Doing so adds the tags http-server and https-server to the instance, allowing the native firewall to open the corresponding ports for the VM. (A rough gcloud equivalent is sketched below.)
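For reference, the same instance can be created from the command line. This is only a sketch: the instance name, zone, and image family are illustrative placeholders, and the flags may vary by gcloud version (the post itself uses the web console).
$ gcloud compute instances create owncloud-vm \
    --zone us-central1-a \
    --image-family ubuntu-1404-lts --image-project ubuntu-os-cloud \
    --tags http-server,https-server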

The complete ownCloud stack requires a web server with PHP support (such as Apache) and MySQL as the backing database. It could get quite complex if one had to deploy the ownCloud software, as well as all the necessary dependencies, on a VM. So I chose Bitnami, which specializes in packaged solutions for easier deployment.

There are two ways one can use Bitnami stacks.
  1. Deploy the standalone Bitnami installer for ownCloud.
  2. Deploy the Bitnami installer for LAMP stack, and then add the Bitnami module for ownCloud.
For my purposes, I wanted to go live within hours and fiddle as little as possible, so the first option was the natural choice.

After downloading the Bitnami installer onto my Ubuntu instance (via wget), I simply ran the installer script (via sudo); a sketch of these steps follows the list below. Only a few essential items needed to be prepared before the installation.
  • Admin user info (username, password, and a descriptive name). Email is optional since I don't intend to enable SMTP for ownCloud.
  • Access address for ownCloud. Choose the external IP address of the Ubuntu VM instance running in Google Compute Engine. Do not use the internal IP, even though it is the default: since you will only ever reach the VM remotely, accepting the internal IP would lock you out of the web user interface (UI) until you change the trusted domain setting later.
  • Choose production settings (i.e., option 2) for the installation type.
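The download-and-run steps look roughly like this. The URL placeholder and installer file name are illustrative; grab the exact Linux x64 installer link from Bitnami's ownCloud page.
$ wget {bitnami_owncloud_installer_url}
$ chmod +x bitnami-owncloud-*-linux-x64-installer.run
$ sudo ./bitnami-owncloud-*-linux-x64-installer.run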
At the end of the installation, the script asks you if you want to launch ownCloud. If the answer is no, you can manually start it by running the following script from the installation directory.
$ ./ctlscript.sh start
More info is published here.

Change the web root

By default, the Bitnami installation serves ownCloud from the web URI /owncloud. This is intended for installing ownCloud on a shared web server. Since I only intend to run ownCloud on this VM, I would very much like it to run from the root path, and I have learned to make this adjustment before configuring any desktop clients. The ownCloud way is a bit clunky, as it requires a code change; I find the Bitnami way much easier to work with (a sketch follows). Do restart the ownCloud stack after the change.
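The gist, assuming Bitnami's usual layout (the exact file names vary by stack version, so treat this as a sketch and check Bitnami's documentation): point Apache's DocumentRoot at ownCloud's htdocs and drop the /owncloud alias.
$ cd {bitnami_installation_folder}
# In apache2/conf/bitnami/bitnami.conf, set (illustrative):
#   DocumentRoot "{bitnami_installation_folder}/apps/owncloud/htdocs"
# In apps/owncloud/conf/httpd-prefix.conf, comment out the "Alias /owncloud ..." lines.
$ sudo ./ctlscript.sh restart apache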

Now that ownCloud was alive and well, I still needed to configure secure access.

Set up the domain

I used Google Domains to register a domain name that I liked, which is reasonably affordable.

The recommended steps are as follows.
  1. Create an A record for the host DNS name, such as mycloud.mycompany.com, and assign the current ephemeral IP address to it. Because that IP address is temporary, set the TTL to something short, such as a few minutes. (More on this later.)
  2. Change the trusted domains setting in ownCloud to use the host name instead of the IP address. I decided to just modify the PHP config file directly and only allow access from that host name (see the sketch after this list).
  3. Restart Apache.
  4. Access the web UI using the host name and make sure it works.
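A sketch of steps 2 and 3, assuming Bitnami's layout (the config path below is illustrative):
$ sudo nano {bitnami_installation_folder}/apps/owncloud/htdocs/config/config.php
# In config.php, set:
#   'trusted_domains' => array(
#       0 => 'mycloud.mycompany.com',
#   ),
$ sudo {bitnami_installation_folder}/ctlscript.sh restart apache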

Set up the SSL certificate

I acquired a cheap SSL certificate from Namecheap, whose PositiveSSL single-domain certificate costs only $9 a year and is issued almost instantly, pending domain ownership verification. (Note that Namecheap's certificates use Comodo as the certificate authority.)

Now we move on to the digital certificate. First, create a private key with the following command.
$ sudo openssl genrsa -des3 \
    -out {host_dns_name}.key 2048
Next, prepare the certificate signing request (CSR). On the VM, run the following to generate a CSR signed with a SHA-256 hash. This is important because of the recent Chrome controversy about deprecating SHA-1.
$ sudo openssl req -new -sha256 \
    -key {host_dns_name}.key \
    -out {host_dns_name}.csr \
    -subj "/C=US/CN={host_dns_name}"
You then need to fill out a request form on Namecheap's web page using the blob in the generated .csr file. Also, choose apache2 as the web server type.

Because I couldn't receive emails at the domain that I used (for lack of MX records), I worked with Comodo technical support over a web chat session. My request was approved via domain verification using a temporary CNAME entry. At the end of the live chat, I received two .crt files: one for the signed certificate (which I renamed to {host_dns_name}.crt) and the other a combined file of the root and intermediate certificates.

I later received a ZIP file from Comodo by email as well, but in it the root and intermediate certificates are separate files, which requires manually concatenating the certificate blobs into a single .crt file, roughly as sketched below.
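Something along these lines; the file names are illustrative, and the order matters (intermediates first, root last):
$ cat COMODORSADomainValidationSecureServerCA.crt \
      COMODORSAAddTrustCA.crt \
      AddTrustExternalCARoot.crt \
      > {host_dns_name}-bundle.crt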

I ran these two commands to ensure that the hashes of the private key's modulus and of the certificate's modulus match.
$ sudo openssl rsa -noout -modulus \
    -in {host_dns_name}.key \
    | openssl sha256
$ sudo openssl x509 -noout -modulus \
    -in {host_dns_name}.crt \
    | openssl sha256
Needless to say, one has to handle all three files with care. Besides proper backups, I created a folder for these files that is readable only by root, along the lines below.
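One way to do it; the target path is an illustrative choice:
$ sudo mkdir -p /etc/ssl/owncloud
$ sudo mv {host_dns_name}.key {host_dns_name}.crt {host_dns_name}-bundle.crt /etc/ssl/owncloud/
$ sudo chown -R root:root /etc/ssl/owncloud
$ sudo chmod 700 /etc/ssl/owncloud
$ sudo chmod 600 /etc/ssl/owncloud/*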

These two files, along with the private key, are all you need to configure Apache. The nice thing about Bitnami is that the changes are consolidated in {bitnami_installation_folder}/apache2/conf/bitnami/bitnami.conf. The relevant entries to point to these files are SSLCertificateFile, SSLCertificateKeyFile, and SSLCertificateChainFile; by default they point to a dummy self-signed certificate. Remember to use the approved digital certificate file, not the original CSR file, for SSLCertificateFile, even though both may be using the .crt file extension. A sketch of the relevant entries follows.
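The Apache directives inside bitnami.conf end up looking something like this; the paths are illustrative and follow the folder chosen above.
$ sudo nano {bitnami_installation_folder}/apache2/conf/bitnami/bitnami.conf
# Point these directives at the new files:
#   SSLCertificateFile      "/etc/ssl/owncloud/{host_dns_name}.crt"
#   SSLCertificateKeyFile   "/etc/ssl/owncloud/{host_dns_name}.key"
#   SSLCertificateChainFile "/etc/ssl/owncloud/{host_dns_name}-bundle.crt"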

Lastly, restart Apache and verify that the web UI works with HTTPS.

Finalize the deployment

After having proved that my ownCloud instance could be accessed via HTTPS, the first thing I did was remove the tag http-server from the VM instance. This effectively changed the firewall rule, forcing all communication through TLS only. There are other ways to enforce that in Apache or ownCloud, but I have found this to be the most expedient.
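From the command line this is a one-liner; the instance name and zone are the same illustrative placeholders as before.
$ gcloud compute instances remove-tags owncloud-vm \
    --zone us-central1-a --tags http-server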

Immediately afterwards, I changed the admin password again, since I had used the initial password several times in the clear before HTTPS was configured.

Now I need to make a few final touches to make the VM permanent.
  1. Un-check the option "Delete boot disk when instance is deleted", so that the ownCloud deployment survives even if the VM instance is deleted.
  2. Create a new static IP and assign it to the VM (see the sketch after this list). This way, I keep the same fixed IP address even if I have to restart the instance in the future.
  3. Update the DNS A record with the fixed IP address, now with a much longer TTL (such as a week).
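Reserving and attaching the static IP can also be done with gcloud. This sketch assumes the same placeholder names, and the access-config name may differ on your instance (check with gcloud compute instances describe):
$ gcloud compute addresses create owncloud-ip --region us-central1
$ gcloud compute instances delete-access-config owncloud-vm \
    --zone us-central1-a --access-config-name "external-nat"
$ gcloud compute instances add-access-config owncloud-vm \
    --zone us-central1-a --access-config-name "external-nat" \
    --address {reserved_static_ip}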

For further peace of mind, I created a snapshot of the persistent disk associated with the VM, roughly as follows.
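Again with illustrative names; the boot disk usually shares the instance's name.
$ gcloud compute disks snapshot owncloud-vm \
    --zone us-central1-a --snapshot-names owncloud-backup-1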

Configure the sync client(s)

There are cross-platform desktop clients available from ownCloud. Mobile clients are available on the iOS App Store, Google Play, and the Amazon Appstore. Desktop clients are free to use, but mobile clients cost 99 cents each.


Bitnami pros and cons

The Bitnami advantage for deploying ownCloud (or other packaged solutions) is fast time-to-market. All the dependencies are pre-packaged and the entire solution is self-contained. For example, it does not require installing Apache, PHP, or MySQL individually. All components can be managed using Bitnami scripts and/or tools.

That said, using Bitnami also implies that you are not using the operating system's package management tools. For example, Apache is not installed via apt-get, so you won't get the most recent security patches pushed through Ubuntu's package repository. Instead, you rely on Bitnami to package the next version for you to upgrade to.

It is possible, however, to enhance the initial Bitnami installation with more advanced configuration. For example, one can configure ownCloud to start at system boot instead of having to invoke the start-up script manually (a sketch follows). It is further possible to replace the MySQL back-end with, say, Google Cloud SQL, or to introduce Nginx as the web server. Yet all of this implies deviating from Bitnami and starting down the path of self-administering ownCloud.
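For the boot-time start, one possible approach, assuming the stock ctlscript.sh is usable as a sysvinit script (it accepts start/stop/restart, though update-rc.d may warn about missing LSB headers):
$ sudo ln -s {bitnami_installation_folder}/ctlscript.sh /etc/init.d/bitnami-owncloud
$ sudo update-rc.d bitnami-owncloud defaults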

Benefits of ownCloud in the cloud

So why am I doing this?

To start off, for the essential file sync functionality, ownCloud does pretty much what Dropbox offers, with fewer concerns about privacy and security. Granted, my data reside with the IaaS provider, which is Google in this case. However, the industry is moving toward customer-managed keys; when Google Cloud Platform supports customer-managed encryption keys, that will be the next step in further securing the data.

Secondly, using a public cloud takes advantage of the IaaS infrastructure. All the current commercial solutions impose storage caps; Dropbox has a 1TB limit. In the case of Google Compute Engine, I could technically provision a persistent disk as large as 10TB, and attach four of them to the VM. The sync client would also enjoy great sync bandwidth from the cloud, compared with hosting ownCloud on a personal server built at home, where most ISPs cap upstream bandwidth at a few Mbps, orders of magnitude lower than what IaaS providers offer.

There is also a cost advantage to deploying ownCloud in the cloud. The majority of the monthly spend is on persistent disks and network egress. Assuming monthly network egress equal to 10% of the stored data, 1TB of data comes to about $53 per month, plus the computing hours. This is costlier than Dropbox Pro's $9.99, but most unlimited solutions (such as SpiderOak Professional) would charge $100 for the same 1TB.

If you opt to build your own ownCloud hardware, the cloud would still be ahead. The hardware cost of a reasonable server is about $600; let's amortize that amount over three years. Then there is the electricity bill (100W at a rate of 15 cents per kWh). Further factor in about 3 hours each month for maintenance and care, priced at the minimum wage of $7.25 per hour. The total monthly spend is now roughly $53, and that is with almost no redundancy, plus the difficulty of hardware migration down the road.

Last but not least, with your own ownCloud, you never have to worry about the file sync provider going out of business.

Now I'm going to sip a cup of coffee while my initial file sync is in progress.

Sunday, October 28, 2012

Coming Back to Windows for HTPC

Several months ago, I started the slow process of a DIY project to build a home theater PC (HTPC) for my living room.

My primary goal is to be able to use it as a low-cost multi-purpose computer that can do the following.

  • Watch over-the-air local television programs with DVR capability.
  • Watch some ethnic programs from Kylin TV, an IPTV provider.
  • Allow easy access to the DVD library.
  • Serve as a Skype client.
  • Casual web browsing.
There are tons of instructions on how to build an HTPC, so I am not going to bore you with details.  I did have some idea of what components I wanted, waited for them to go on sale at Newegg or Amazon, and accumulated them over several months.  I don't need the most current models, but I am very conscious of cost, energy consumption, and heat dissipation (fan noise) requirements.  I do not plan on using it as a gaming console, so I wanted a CPU with embedded graphics and no need for a discrete graphics card.  Here is what I have.
  • Foxconn ITX motherboard H67S, which is HDMI-ready.
  • Intel Pentium G630T Sandy Bridge, which has a very low TDP rating of 35W.
  • 4GB of memory
  • DVD-RW drive
  • 60GB SSD as the primary drive and 2TB HDD as the storage drive.
  • TV tuner card
  • 430W PSU
  • nMEDIAPC case, as its form factor allows it to be stacked on traditional living room media components.
The only tidbit that I wanted to mention is that the stock CPU fan from Intel was not very good.  The CPU temperature reading was over 50°C when idle.  The case is low profile and height-challenged, so I cannot fit most CPU coolers with 120mm fans, and I don't care for the noise from those bigger monster coolers, either.  So I opted for some better thermal paste, which lowered the CPU temperature by about 5°C.  It now runs under 45°C when idle.

Initially I chose to use Ubuntu and XBMC, but that gave me endless problems.  The biggest was that it was prone to sudden freezes in the middle of media playback; I blamed it on the quality of the drivers.  XBMC playback was another problem: XBMC couldn't handle some videos that otherwise played fine in VLC.  Thirdly, the Skype client on Linux had a very primitive user interface, and video calls were hit and miss.

The most lacking area was remote control.  I use a wireless keyboard/mousepad combo from Logitech, but it is clumsy to use from the couch.  I tried the XBMC remote app on my Android phone, and it worked.  I could put the HTPC into standby from the remote, but I still needed to walk over to the HTPC and press the power button to wake it up.  In summary, it was good enough for geeks like me, but the build was not wife-ready.

That was enough for me to purchase a copy of Windows 7 Home Premium and start over.

I could not be happier with the result.  The whole system is more stable than ever.  I have both Windows Media Center and XBMC configured.  Windows Media Center is great for TV viewing; its TV guide and DVR capabilities are flawless.  My wife cannot tell whether she is watching from the HTPC or directly from the TV.

There are two minor tweaks that are needed for Windows Media Center.  The first is to move the temporary and recording storage locations from the default SSD to the HDD, so as to reduce wear and tear on the SSD.  The second is to stop Windows Media Center from automatically waking the machine from standby just to download the daily TV schedules.

Windows Media Center also provides two improvements.  The first is the remote.  There are many Windows Media Center remote controls (from $15).  The thing I like most is the ability to wake the HTPC from standby with a press of a remote button, just as you would expect from any other media component.  The second is the ability to show media information (TV channels, DVD chapters, etc.) on a programmable LCD ($35), so you can see it on the HTPC front panel.  Those features make a Windows Media Center-based HTPC much friendlier for the living room.

I still prefer using XBMC to work with the DVD library.  Windows Media Center has a very basic linear user interface, and cover art is not automatically downloaded.  XBMC is more mature in that area, although its configuration is still quite confusing.  I did play with different skins and found that Xeebo is better for keyboard-based navigation and controls, but eventually I went back to Confluence, the default skin.  Most importantly, I find that XBMC on Windows no longer exhibits the problems with some videos that its Linux cousin did.

I really wish the Ubuntu-based system could improve, not only in system stability, but also in usability with regard to remote control and front panel LCD support.  Until then, I'm sticking to Windows.  (I'm holding off on Windows 8 as I have heard that Media Center is not included in its base package.)

My takeaway from this episode of home improvement is that open-source offerings may still be a long way from being ready for average consumers, compared with what commercial products and solutions can deliver.

Monday, October 3, 2011

Google 2-Step Verification Is Not Two-Factor Authentication

Earlier this year, Google announced that 2-step verification was rolled out to all Google accounts.  It requires the user to provide a one-time password (OTP) after entering the memorized password.  One way to generate the OTP is via the Google Authenticator mobile app on a smartphone, which works very similarly to the two-factor authentication devices used by VeriSign VIP, RSA SecurID, and Yubikey.  But there are a few things that make the OTP for Google accounts different from these two-factor authentication providers.

The first thing to notice is that Google has made its OTP generation algorithm open-source; the OTP is computed from a shared secret key and synchronized time.  Although the configuration user interface makes it a bit difficult to retrieve the key after the user initially sets up 2-step verification, the key itself is still available, especially when the user has printed out the QR code (or simply instructed the web page to display the key string when configuring a camera-less smartphone such as a BlackBerry).  One can scan the QR code again to get the key and load it on more than one device.  I, for example, have managed to enable Google Authenticator on several different devices (one BlackBerry device and two Android devices) for my own convenience.  This ability fundamentally weakens "the second factor," which traditionally is something you own, in contrast to the password, which is something you know.  One can argue that if multiple devices can generate the same OTP, you are less able to be sure that you have possession of all of them at all times.

Think about it: when you can write the key down or print the QR code out and later re-scan it, the OTP does not actually represent something that you own, but another thing that you know.  So if a key logger can steal your password, another piece of malware may steal the key to the OTP if you print the QR code to something like a PDF file, or store the key in a text file.  With a published algorithm for generating the OTP, this flavor of 2-step verification is really a two-password scheme (see the sketch below).
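To illustrate the point: Google Authenticator implements the standard TOTP algorithm (RFC 6238), so anyone who holds the base32 secret can reproduce the same six-digit codes with any TOTP tool. The secret below is a made-up example, and the commands assume a Debian/Ubuntu system.
$ sudo apt-get install oathtool
$ oathtool --totp -b JBSWY3DPEHPK3PXP   # prints the current six-digit code for this secret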

Furthermore, with a single-password mechanism, the server should store the password as a salted one-way hash (neither as plain text nor as an un-salted hash).  In the event of an internal attack, we have reasonable assurance that even if the master user database were breached, hackers could not easily extract the password from the hash.  But it is a totally different scenario with the OTP key, as Google must be able to decrypt it (if it is encrypted) from the user identity store in order to compute the OTP value to compare with the user's submission.  In some respects, it is harder for Google to protect this key from internal attack than the user password.  Theoretically speaking, both Google and users need to raise the bar for guarding the OTP key.

It also makes Google the sole target for hackers, as it must maintain both the password and the OTP key.  In the case of Yubikey or VeriSign VIP, at least the key is stored by a third party.  We can only imagine that Google does this to lower the cost of providing more security without passing on to consumers the fees payable to such a third party.

Don't get me wrong: even with these implications, 2-step verification makes account hacking a lot more difficult, as a brute-force attack would have to cover at least six extra digits on top of the existing password, making the attack space much bigger.  Aside from that, the initial login process requires at least two HTTP requests to Google's servers, which tremendously slows down the rate of attempts an attacker can achieve.

In conclusion, if you really want the most secure two-factor authentication, your only choice is not to use the Google Authenticator app, but to activate 2-step verification via SMS.  You also want to make sure that you provide a real mobile number for SMS, not a service that can relay SMS messages to other devices, which probably excludes Sprint devices linked to Google Voice.  This is the only way to ensure that you have a single device that receives the authentication code.  However, there are other issues with the SMS mechanism, such as its failure to work when the phone is out of range of any cell tower, or when the carrier has problems that delay SMS delivery.  Most importantly, you pay the usurious SMS rate for every OTP you receive, unless you already have the unlimited SMS plan, which is equally exorbitant.

Monday, June 13, 2011

The Cost of Encryption

For Matrix fans, the green letter-fall on a computer screen was symbolic of a view into an exotic world built on machine language, cryptic to the naked eye.  In other words, encryption was sexy and hard to attain for the regular Joe.

Fast forward a decade: technology has advanced so much that there is a wealth of digital security tools available to corporations and consumers alike.  With powerful, free and open-source software such as TrueCrypt, we are entering a new world where security-conscious users embrace crypto technologies more than ever.  Ubuntu offers whole-disk encryption right out of the box.  More companies issue laptops to employees that are equipped with Trusted Platform Modules (TPM) to secure corporate assets.  The old way of protecting a hard drive via a BIOS password is no longer sufficient nowadays.

Recently, I purchased a few large-capacity (2TB) hard disk drives as offline storage for my personal backup files.  I tried to set up these drives with whole-disk encryption using TrueCrypt.

I first started initializing one disk from a newly built computer.  It has an AMD four-core CPU clocked at 3.4GHz.  The bare drive was mounted in a docking station plugged into a USB 3.0 port.  Once I started the process, it said that I had to wait for four hours, and the initial write was progressing at a rate of 122MB/s.  I knew it would take some time since I had done something like this before.  A few hours was what I had in mind, so I turned off the flat screen monitor and let it run.

As I had another drive to initialize, I started the same process from a spare laptop that is over six years old.  That laptop has a single-core Intel Pentium CPU (1.6GHz) running Windows XP Service Pack 3.  I immediately noticed that the wizard advised about 25 hours.  Uh, what?  25 hours?  That was way beyond what I had expected.  When I last checked, the encryption was moving along at a rate of 20.4MB/s.  The raw math says that it would be almost 30 hours, but I knew the speed would vary as the drive head moves across the platters, so it would pick up steam later.  Nevertheless, the rate is pathetic in my opinion.  I mean, the USB 2.0 dock for these bare drives is rated at 480Mbps (roughly 60MB/s), no?  As I poked around, I noticed that the CPU consumption was a sustained 80% while no other application was running.  The bottleneck has got to be the computational burden as the initialization process writes pseudo-random 0's and 1's to the drive.  Then I looked at the power consumption on my UPS unit.  It was about 36W after I closed the laptop lid to save the 9W for the LCD.

That works out to 0.9 kWh, which is about 10 cents of damage to my electric bill.  Not a big deal, you would say.  But I was dumbfounded by the long wait that I had to endure.  Formatting a disk merely prepares it for normal use, yet it takes over one day and all the computing power of the poor little laptop.  Wow!  Imagine: over the life of that encrypted hard drive, I will perform many hours of reads and writes, all of which involve encryption and decryption, computationally intensive operations.  The cost of employing encryption is orders of magnitude higher than if I chose not to encrypt at all.

Meanwhile, the faster machine consumed about 190W while eating up 30% of overall CPU cycles.  So the initialization there would cost somewhat less energy, but not orders of magnitude less.

Furthermore, when I use an unencrypted hard drive, a disk write only touches the bytes that are affected.  With encryption, however, the entire block has to be re-encrypted and re-written.  So more computation and more writes.  A lot more.

Here is the thought: when technology affords improvements in speed and efficiency, we tend to leverage it to do more complex things, so the efficiency gain does not translate into an overall productivity gain.  Encryption alone is net-new operational spend that could be prohibitively costly if we did not have the luxury of fast computers.

Security has never been more prominent than it is these days, in the wake of major cyber hacks from Sony to the International Monetary Fund.  Why don't they have better security?  Why don't they encrypt the data and traffic?  we ask.  The truth is that security comes at a price.

Some people criticize Amazon's cloud storage service for not providing data encryption.  In contrast, Dropbox does encrypt customers' files in its cloud storage.  Putting aside who owns the encryption keys, Dropbox must absorb the additional cost of encrypting and decrypting documents.  That cost is not trivial.  Someone could calculate the additional carbon footprint due to encryption, to gauge the environmental impact of that added security.

Encryption is the basis of all security.  It is not cheap.  It costs money.  That is IT!