Introducing Grive

Mehdi Poustchi Amin

Issue #224, December 2012

An independent open-source implementation of Google Drive—up in the cloud.

Earlier this year, Google introduced its Google Drive cloud storage service. Cloud storage is a model of networked on-line storage where data is stored in virtualized pools of storage that third parties generally host. Hosting companies operate large data centers, and people who require hosting buy or lease storage capacity from them. The data-center operators, in the background, virtualize the resources according to their customers' requirements and expose them as storage pools, which the customers themselves can use to store files or data objects. Physically, the resources may span multiple servers.

Cloud storage services are accessed through a Web service application programming interface (API), a cloud storage gateway or a Web-based user interface. Google Drive is a place where you can create, share, collaborate and keep all of your stuff. You can upload and access all of your files, including videos, photos, Google Docs, PDFs and more.

The advantage of Google Drive is that it can become your centralized storage vault for all your documents in the cloud. It lets you create and collaborate in the same way Google Docs did.

1) Create and collaborate:

Google Docs is built right into Google Drive, so you can work with others in real time on documents, spreadsheets and presentations. Once you choose to share content with others, you can add and reply to comments on anything (PDFs, images, video files and so on) and receive notifications when other people comment on shared items. You can upload files with up to 30 different extensions, which many other cloud services don't allow.

2) Store everything safely and access it anywhere (especially while on the go):

Synchronization is one of the coolest things ever. When you use a synchronizing storage product, like Google Drive, any file you put in the drive, and anything you change that's stored in the drive, is updated automatically—not just in the cloud but on all the other devices that you have connected to the Drive. So you can start working on a file on one computer, close it and then open it on a second computer, and what you'll see is the version you closed on the first one. You can access your stuff from anywhere—on the Web, in your home, at the office or while running errands, from all of your devices. You can install Drive on your Mac or PC and can download the Drive app to your Android phone or tablet. And regardless of platform, blind users can access Drive with a screen reader.

3) Search everything by keyword and filter by file type, owner and more:

Drive even can recognize text in scanned documents using Optical Character Recognition (OCR) technology. Let's say you upload a scanned image of an old newspaper clipping. You can search for a word from the text of the actual article. Drive uses image recognition so that if you drag and drop photos from your Grand Canyon trip into Drive, you later can search for “Grand Canyon”, and photos of it should pop up. This technology is still in its early stages, and it's expected to improve over time.

Figure 1. Google Drive on a Mac, PC and Android Phone or Tablet

4) Get your old files:

Drive keeps a copy of every file you edit in it. It also tracks changes to documents and keeps revisions for up to 30 days or the last 100 revisions.

5) Easy sharing to Google+:

Backing up photos is one of the most popular uses of cloud storage services, and with that in mind, it's no surprise to see that Google has made it easy for Google+ users to share images from its new Google Drive service. Google has integrated the two services to allow Google Drive users to share photos from the cloud service to Google+ from within the social network itself.

Google's new cloud storage solution features 5GB of free space. If you have never used a cloud-based storage service like Google Drive before, take a moment to consider the advantages of keeping your files on-line. Because files can be accessed from any computer with an Internet connection, Drive eliminates the need to e-mail or save a file to a USB drive. And because Drive allows you to share files, working with others becomes easier than ever before.

Google Drive and Privacy

The biggest problem with the Google Drive privacy policy is that there isn't actually a specific Google Drive privacy policy. There's just Google's new unified terms of service and privacy policy. Google states, “You retain ownership of any intellectual property rights that you hold in that content. In short, what belongs to you stays yours.”

That means Google can't use your content for commercial purposes without your consent. However, the terms of service also state:

You give Google (and those we work with) a worldwide license to use, host, store, reproduce, modify, create derivative works (such as those resulting from translations, adaptations or other changes we make so that your content works better with our Services), communicate, publish, publicly perform, publicly display and distribute such content. The rights you grant in this license are for the limited purpose of operating, promoting, and improving our Services, and to develop new ones.

For content that is yours, Google can't re-use it for its own purposes, but it can use content you upload in order to serve you. This can include integrating services together (like reading your scanned pictures in order to OCR them), and it can include analyzing your files to target advertisements to you. Google already does this in Gmail.

Google doesn't currently serve ads in Google Docs (now called Google Drive), but it may, according to its license agreement, use data about the content you upload to target ads to you anywhere on the service.

Google also may give up your data in response to a legal demand, like a subpoena. If you want your cloud storage to be a little more out of reach, you might be interested in Wuala, which has no storage servers in the United States. Or, you might just want to keep your data off the Internet.

Google Drive and Security

Google Drive encrypts data between your computer and the Google servers. If you're using your Google Drive over the Web, the connection defaults to secure (HTTPS), and when you use the software that makes your Google Drive appear on your computer like a local hard drive, the data between your computer and Google is likewise encrypted. No casual hacker will be able to grab your files by monitoring or intercepting your Internet connection to Google. Your data also is stored under lock and key at Google itself, but it is not encrypted on the Google servers. You will have to encrypt your own files ahead of time. A Google rep explained why: encrypting files stored at Google would prevent you from previewing them on the Web, and it also would prevent services like Google Goggles and its OCR engine from accessing files on your behalf. (I presume it also would prevent Google's ad-serving algorithms from scanning your data to serve you more targeted messages, and this is how Google makes its money.)
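Because files are not encrypted on Google's servers, one option is to encrypt sensitive files locally before they reach the synced folder. The following is only a minimal sketch, not something Google or the Grive project prescribes. It assumes OpenSSL 1.1.1 or newer is installed (for the -pbkdf2 option), and the function names and the passphrase-file argument are hypothetical:

```shell
# Sketch: encrypt a file locally before copying it into the synced folder.
# Assumption: OpenSSL 1.1.1+ is available; function names are illustrative only.

# encrypt_for_drive FILE PASSFILE: write FILE.enc next to FILE (AES-256-CBC).
encrypt_for_drive() {
    # The first line of PASSFILE is used as the passphrase.
    openssl enc -aes-256-cbc -salt -pbkdf2 -in "$1" -out "$1.enc" -pass "file:$2"
}

# decrypt_from_drive FILE.enc PASSFILE: print the decrypted content on stdout.
decrypt_from_drive() {
    openssl enc -d -aes-256-cbc -pbkdf2 -in "$1" -pass "file:$2"
}
```

Only the resulting .enc file would then be placed in the Drive folder. The trade-off is the one described above: Google's preview and search features cannot read content encrypted this way.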

Linux Installation

If you install the Google Drive client app for your Windows or Mac PC, you can sync all your on-line Google Docs to your computer. You can be selective with your syncing—sync Google Docs, all of My Drive or individual folders, and items in Shared with me.

But for all the celebration, there is something missing: a Linux client. The Google Drive client is not yet available for Linux, and the old Google Docs FS does not work well with Google Drive. Besides the Web interface, the best solution for getting Google Drive on Linux right now is Grive, an unofficial, open-source, command-line Linux client for Google Drive. Grive has been put together by a third-party developer, and it looks quite nice. Thanks, Google, for making Drive an open platform.

The purpose of the Grive project is to provide an independent implementation of the Google Drive client. It uses the Google Document List API to talk to Google's servers. The code is written in standard C++.

As of version 0.2.0, Grive can do two-sided synchronization between Google Drive and the local directory. It can download and upload changed files. New directories in Google Drive and the local directory also can be downloaded/uploaded. It cannot yet do the following:

  • Wait for changes in the filesystem to occur and upload the files afterward. Sync is performed only when you run Grive.

  • Support symbolic links.

  • Sync all files/folders with multiple parents and download Google Documents.

  • Support files larger than 2GB.

Those things will be added in the future, possibly during the next release.
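Given the limitations listed above, it can be useful to know which items in a local folder Grive would not handle before a sync is run. The helper below is a rough sketch under that assumption. The function name is hypothetical, and it checks only for symbolic links and files over 2GB, not for the other limitations:

```shell
# list_unsyncable DIR: print symbolic links and regular files larger than 2GB
# found under DIR. (Hypothetical helper; not part of Grive.)
list_unsyncable() {
    # 2GB is written as 2097152 one-kilobyte units, a form that both GNU and
    # BSD find accept; "+" means strictly larger than that size.
    find "$1" \( -type l -o \( -type f -size +2097152k \) \) -print
}

# Example: list_unsyncable ~/.grive
```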

At the time of this writing, you easily can install Grive in Ubuntu 11.10 or later using the following commands:

system :~$ sudo add-apt-repository ppa:nilarimogard/webupd8
system :~$ sudo apt-get update
system :~$ sudo apt-get install grive

On other Linux distributions, the installation is a little more involved, as a Grive package is not yet available. You have to compile it from source.

Grive source code is available at GitHub if you want to tinker with it. Setup is a bit of a pain right now, and it requires some command-line work, which is not uncommon in Linux. But, I am here to show you how to install it on RPM-based Linux (CentOS, Fedora and Red Hat).

For starters, you need a dedicated machine or virtual server, as well as a reliable Internet connection.

In this scenario, let's use CentOS. As you may know, CentOS is an enterprise-class Linux distribution derived from sources freely provided to the public by the Upstream OS Provider (www.redhat.com/rhel).

I strongly recommend using CentOS 6.0, because CentOS 6.0 has been completely rebuilt with a newer build system and library checks to confirm upstream binary compatibility. It also brings a new kernel, new versions of key server components and improved virtualization support.

Before you install Grive, you need to do some preparatory work. Note: all the following commands must be run as root.

First, update your CentOS. This is optional, but I strongly recommend it. Run the yum command:

[root@system ~]# yum update 

Next, disable SELinux by editing the /etc/selinux/config file:

[root@system ~]# vi /etc/selinux/config

Then change the existing SELINUX line so that it reads as follows (the change takes effect after the next reboot):

SELINUX=disabled
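If you prefer not to edit the file by hand, the same change can be scripted. This is a small sketch with a hypothetical function name. It assumes the file already contains a line starting with SELINUX=, and it keeps a .bak copy of the original:

```shell
# set_selinux_mode MODE [FILE]: rewrite the SELINUX= line in FILE to MODE.
# FILE defaults to /etc/selinux/config; the original is saved as FILE.bak.
# (Hypothetical helper; run as root when editing the real config file.)
set_selinux_mode() {
    mode=$1
    file=${2:-/etc/selinux/config}
    cp -p "$file" "$file.bak" &&
    # Only the line that starts with "SELINUX=" is changed; comment lines and
    # the SELINUXTYPE= line are left alone.
    sed "s/^SELINUX=.*/SELINUX=$mode/" "$file.bak" > "$file"
}

# Example: set_selinux_mode disabled
```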

Now, you need to enable an extra repository, ATrpms.

ATrpms is a third-party RPM repository. Its original focus was on software used in natural sciences, especially in the field of high-energy physics, such as tools for numerical programming or for scientific publications. Since then, the repository has grown to include many non-scientific software titles, like system tools or multimedia packages, resulting in a far more generic repository. Currently, packages are built only for Red Hat Linux flavors.

Install and enable ATrpms in CentOS by running the following commands:

[root@system ~]# cd /tmp
[root@system ~]# wget http://packages.atrpms.net/RPM-GPG-KEY.atrpms
[root@system ~]# wget http://dl.atrpms.net/el6Server-i386/atrpms/
↪stable/atrpms-repo-6-5.el6.i686.rpm  
[root@system ~]# rpm --import RPM-GPG-KEY.atrpms
[root@system ~]# rpm -ivh atrpms-repo-6-5.el6.i686.rpm

Next, you must run the following commands to install the required packages:

[root@system ~]# yum install automake autoconf \
openssl openssl-devel json-c json-c-devel curl \
curl-devel libcurl-devel libcurl libarchive \
libarchive-devel libidn libidn-devel expat \
expat-devel binutils binutils-devel
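The package list above covers libraries, but the later steps also call git, cmake and make directly, and building C++ code needs a compiler. The following sketch reports which commands are not yet on your PATH. The function name is hypothetical, and g++ as the compiler command is my assumption, not something the package list states:

```shell
# missing_tools NAME...: print each command name that cannot be found on PATH.
# (Hypothetical helper for a quick pre-build check.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example: missing_tools git cmake make g++
```

An empty result means every named command was found. Anything printed would need to be installed before the build steps.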

CMake is a cross-platform build system generator. CMake is used to control the software compilation process using simple platform- and compiler-independent configuration files. Check your CMake package version:

[root@system ~]# rpm -qa | grep -i cmake

If the version of CMake is <2.8, you need to update the CMake package by running the following commands:

[root@system ~]# yum remove cmake
[root@system ~]# yum install cmake \
--disablerepo='*' --enablerepo=atrpms-testing
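The "is the installed version older than the minimum" checks in this walk-through (2.8 for CMake, 1.46 for Boost) can be done by eye, but they also can be scripted. This is a minimal sketch with a hypothetical function name. It assumes a sort command that supports the -V (version sort) option, as GNU coreutils does:

```shell
# version_lt A B: succeed (exit status 0) when version A is strictly older
# than version B. (Hypothetical helper; relies on "sort -V".)
version_lt() {
    [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Example, assuming the cmake package is installed:
# version_lt "$(rpm -q --qf '%{VERSION}' cmake)" 2.8 && echo "CMake is too old"
```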

Boost provides free peer-reviewed portable C++ source libraries. Boost libraries are intended to be widely useful and usable across a broad spectrum of applications.

Now, check your Boost package version:

[root@system ~]# rpm -qa | grep -i boost

If the version of Boost is <1.46, you need to update the Boost package by running the following commands:

[root@system ~]# yum remove boost boost-devel 
[root@system ~]# cd /tmp
[root@system ~]# wget -O boost_1_49_0.tar.gz \
http://sourceforge.net/projects/boost/files/boost/1.49.0/boost_1_49_0.tar.gz/download
[root@system ~]# tar -xvf boost_1_49_0.tar.gz
[root@system ~]# cd boost_1_49_0
[root@system ~]# ./bootstrap.sh --prefix=/usr
[root@system ~]# ./b2 install

Installing the Boost libraries will take up to 15 minutes, depending on your hardware, so be patient.

Make sure your system time and date are correct. Run the following command if required:

[root@system ~]# ntpdate -b -p 8 -u pool.ntp.org

With all the pieces in place, you are ready to install Grive. The project's Web site offers access to the source code using Git. To download Grive, run:

[root@system ~]# cd ~
[root@system ~]# mkdir -p ~/.grive
[root@system ~]# cd /tmp
[root@system ~]# git clone https://github.com/Grive/grive.git
[root@system ~]# cd grive
[root@system ~]# cmake \
-DCMAKE_INSTALL_PREFIX:PATH=/usr .
[root@system ~]# make all install

The first time you run Grive, use the -a argument to grant Grive permission to access your Google Drive. The -a option is needed only the very first time you run Grive:

[root@system ~]# cd ~/.grive/
[root@system ~]# /usr/bin/grive -a

After running the command above, a URL should be displayed in the terminal—copy that URL and paste it in a Web browser.

You will need to log in to your Google account if you have not done so. In the newly loaded page, you will be asked to give Grive permission to access your Google Drive, and after clicking “Allow access”, an authentication code will be displayed. Copy this code and paste it in the terminal where you ran Grive:

-----------------------
Please go to this URL and get an authentication code:

https://accounts.google.com/o/oauth2/auth?scope=
↪https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fuserinfo.email
↪+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fuserinfo.profile
↪+https%3A%2F%2Fdocs.google.com%2Ffeeds%2F
↪+https%3A%2F%2Fdocs.googleusercontent.com%2F
↪+https%3A%2F%2Fspreadsheets.google.com%2Ffeeds
↪%2F&redirect_uri=urn:ietf:wg:oauth:2.0:oob
↪&response_type=code&client_id=
↪22314510474.apps.googleusercontent.com

-----------------------
Please input the authentication code here

That's it. Now, each time you want to sync Google Drive with your local grive folder, navigate to the .grive folder and run /usr/bin/grive (this time without the -a, because you already have authenticated Grive with Google Drive). You can set up a cron job if you want to do this on a regular basis.
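For the cron job mentioned above, a small wrapper helps avoid two sync passes overlapping when one run takes longer than the interval. The following is only a sketch. The function name, the lock path and the script path in the crontab comment are hypothetical, and GRIVE_BIN is an illustrative override that defaults to the /usr/bin/grive path used in this article:

```shell
# grive_sync DIR: run one Grive sync pass inside DIR, unless another pass
# started by this wrapper still appears to be running.
grive_sync() {
    dir=$1
    lock="${TMPDIR:-/tmp}/grive-sync.lock"   # hypothetical lock directory
    # mkdir either creates the lock or fails if it already exists.
    if ! mkdir "$lock" 2>/dev/null; then
        echo "another sync appears to be running ($lock exists)" >&2
        return 1
    fi
    # Grive syncs the current directory, so change into DIR first.
    ( cd "$dir" && "${GRIVE_BIN:-/usr/bin/grive}" )
    status=$?
    rmdir "$lock"
    return "$status"
}

# Example crontab entry (every 30 minutes), assuming this function is saved in
# a script at /usr/local/bin/grive-sync.sh that ends with:
#   grive_sync "$HOME/.grive"
# */30 * * * * /usr/local/bin/grive-sync.sh >> /tmp/grive-sync.log 2>&1
```

The lock is kept outside the synced folder on purpose, so that Grive does not try to upload it as a new directory.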

If everything works fine, Grive will create a .grive file in your home directory. It also will start downloading files from your Google Drive to your current directory. After all this business is done, all you need to do to refresh your Google Drive files is run Grive from the terminal.

Is it as convenient as Google's official desktop solutions? No, but you are not running Linux because you are looking for easy solutions.

Ending Notes

If you get a “crash” (a bunch of error messages on the screen), you should run:

[root@system ~]# /usr/bin/grive -l log.txt

Because the file will contain a log of the sync operation, you may want to edit the file first to remove personally sensitive information and then send it to Grive developers.
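Part of that clean-up can be automated. The sketch below replaces anything that looks like an e-mail address with a placeholder. What the log actually contains is an assumption on my part, the function name is hypothetical, and file names or other private details still need a manual read-through:

```shell
# redact_log FILE: print FILE on stdout with e-mail-like strings replaced by
# a placeholder. The original file is not modified. (Hypothetical helper.)
redact_log() {
    sed -E 's/[[:alnum:]._%+-]+@[[:alnum:].-]+\.[[:alpha:]]{2,}/<redacted-email>/g' "$1"
}

# Example: redact_log log.txt > log-redacted.txt
```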

Keep up with the posts at https://plus.google.com/116411649000013553715/posts and https://github.com/grive/grive/issues for the latest information, especially regarding news on newly discovered bugs and when you need to download and build again!

Sayyed Mehdi Poustchi Amin is currently a PhD research student in computer science at SIU. His research is focused on developing honeypot decoys for analysis of Internet attacks. His professional certifications include CCNA, MCTS, MCITP and MCSE. He welcomes comments on this article and beyond at mehdi.poustchi@gmail.com.