For the past year, since Microsoft acquired GitHub, I've been hosting my Git repositories on a private server. Although I relished the opportunity and challenge of setting it all up, and the end product works well for my needs, the move was not without its sacrifices. GitHub offers a clean interface for configuring many Git features that would otherwise require more time and effort than simply clicking a button. One of the features I was most fond of, and which GitHub makes especially easy to set up, is web hooks. A web hook is executed when a specific event occurs within the GitHub application; when it fires, data is sent via an HTTP POST to a specified URL.
This article walks through how to set up a custom web hook, including configuring a web server, processing the POST data from GitHub and creating a few basic web hooks using Bash.
For the purpose of this project, let's use the Apache web server to host the web hook scripts. The module that Apache uses to run server-side shell scripts is mod_cgi, which is available on major Linux distributions.
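Enabling the module is distribution-specific. On a Debian or Ubuntu system, for example, something like the following should work; Red Hat-based systems typically load the module through a LoadModule line shipped with the httpd package:

sudo a2enmod cgi          # selects cgid automatically under a threaded MPM
sudo systemctl restart apache2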
Once the module is enabled, it's time to configure the directory permissions and virtual host within Apache. Use the /opt/hooks directory to host the web hooks, and give ownership of this directory to the user that runs Apache. To determine the user running an Apache instance, run the following command (provided Apache is currently running):
ps -e -o '%U %c' | grep 'apache2\|httpd'
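On a Debian-based system, for example, the output might look like this; the unprivileged user that owns the worker processes (www-data here) is the one to use below:

root     apache2
www-data apache2
www-data apache2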
This command returns a two-column output containing the name of the user running Apache and the name of the Apache binary (typically either httpd or apache2). Grant directory permission with the following chown command (where USER is the name of the user shown in the previous ps command):
chown -R USER /opt/hooks
Within this directory, two sub-directories will be created: html and cgi-bin. The html folder will be used as the web root for the virtual host, and cgi-bin will contain all shell scripts for the virtual host. Be aware that as new sub-directories and files are created under /opt/hooks, you may need to rerun the above chown command to maintain proper access to files and sub-directories.
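Assuming the layout just described, the two sub-directories can be created (and ownership applied) like this:

mkdir -p /opt/hooks/html /opt/hooks/cgi-bin
chown -R USER /opt/hooks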
Here's the configuration for the virtual host within Apache:
<VirtualHost *:80>
    ServerName SERVERNAME
    ScriptAlias "/cgi-bin" "/opt/hooks/cgi-bin"
    DocumentRoot /opt/hooks/html
</VirtualHost>
Change the value of the ServerName directive from SERVERNAME to the name of the host that will be accessed via the web hook. This configuration provides the base functionality to host files and execute shell scripts. The DocumentRoot directive specifies the root of the virtual host using an absolute path on the local system. The ScriptAlias directive takes two arguments: an absolute path within the virtual host and an absolute path on the local system; the path within the virtual host is mapped to the local system path. mod_cgi handles all requests made to the path specified in the ScriptAlias directive. (Note: any additional configuration, including SSL or logging, isn't covered in this article.)
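One detail that's easy to miss: mod_cgi only runs scripts that are marked executable, so after placing a script under /opt/hooks/cgi-bin, set the executable bit (and, depending on your distribution's default Apache configuration, you may also need a <Directory> block that grants access to /opt/hooks):

chmod +x /opt/hooks/cgi-bin/*.cgi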
You'll need a basic understanding of the HTTP protocol and Bash scripting to follow how CGI scripts work. When a request is made to an HTTP server, a response is generated and sent back to the client. The HTTP request contains headers that instruct the server how to handle the request; likewise, the HTTP response contains headers that instruct the client how to handle the response. Viewing and analyzing HTTP traffic is simple using the developer tools in any modern browser. Here's a simple example of an HTTP request and response:
Request:
POST /cgi-bin/clone.cgi HTTP/1.1
Host: hooks.andydoestech.com
Content-Length: 87

{"repository":{"name":"webhook-test","url":"https://github.com/bng44270/webhook-test"}}
Response:
HTTP/1.1 200 OK
Date: Tue, 11 Jun 2019 02:44:52 GMT
Content-Length: 18
Content-Type: text/json

{"success":"true"}
The request is a POST to the clone.cgi file located at http://hooks.andydoestech.com/cgi-bin/. The response contains the response code, the date/time when the request was handled, the length of the content body (in bytes) and the content body itself. Although there are instances when binary data may be sent via HTTP, the examples in this article deal only with clear-text transmissions.
Given the robust text-processing capabilities and commands available, Bash is well suited for constructing and manipulating the text in an HTTP transaction. If the above HTTP request were to be handled by a Bash script, it might look like this:
#!/bin/bash

# Capture the HTTP POST body from standard input
JSONPOST="$(cat -)"

# Response headers, a blank line and then the response body
echo "Date: $(date)"
echo "Content-Length: 18"
echo "Content-Type: text/json"
echo ""
echo "{\"success\":\"true\"}"
Although this script is lacking in logic, it nicely illustrates how the HTTP POST data is captured in the JSONPOST variable, and how the HTTP response headers and data are returned to the client via the script's standard output.
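Before pointing GitHub at the server, a script like this can be exercised with curl. Assuming it has been saved as clone.cgi under /opt/hooks/cgi-bin and made executable, the request from the earlier example can be replayed like so:

curl -s -X POST \
  -d '{"repository":{"name":"webhook-test","url":"https://github.com/bng44270/webhook-test"}}' \
  http://hooks.andydoestech.com/cgi-bin/clone.cgi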
Although many GitHub resources can trigger web hooks, this article focuses specifically on the push event that fires when data is pushed to a code repository. When the HTTP POST request of a web hook is made, a JSON object is posted to the URL. This JSON object contains many pieces of information relating to the push operation, including information about the repository and the commits contained in the push. The command used here to parse individual values out of the POST JSON is jq, which is available on major Linux distributions. Its syntax requires the desired property to be specified in dot notation. As an example, consider the following snippet of the JSON object sent by GitHub:
{
  "repository": {
    "name": "webhook-test",
    "git_url": "git://github.com/bng44270/webhook-test.git",
    "ssh_url": "git@github.com:bng44270/webhook-test.git",
    "clone_url": "https://github.com/bng44270/webhook-test.git"
  }
}
To return the value of the attribute named clone_url using jq, you would use the following syntax:
jq -r '.repository.clone_url' <<< 'JSON'
After replacing JSON with the text representation of the JSON object, this command would return the HTTPS repository clone URL. Using command substitution, the value of a JSON attribute can be assigned to a Bash variable for use within a script.
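For example, assuming the POST body already has been captured in a variable named POSTJSON (as the hook scripts below do), the clone URL can be assigned to its own variable:

REPOURL="$(jq -r '.repository.clone_url' <<< "$POSTJSON")"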
The first hook I want to cover will create a backup of the repository on the Apache server hosting the web hook scripts. The above VirtualHost configuration will be used in this example. Here's the repository backup web hook script:
1 #!/bin/bash
2
3 REPODIR="/opt/hooks/html/repos"
4
5 json_resp() {
6   echo '{"result":"'"$([[ $1 -eq 0 ]] && echo "success" || echo "failure")"'"}'
7 }
8
9 POSTJSON="$(cat -)"
10
11 REPOURL="$(jq -r ".repository.clone_url" <<< "$POSTJSON")"
12 REPONAME="$(jq -r ".repository.name" <<< "$POSTJSON")"
13
14 echo "Content-type: text/json"
15 echo ""
16
17 if [ -d "$REPODIR/$REPONAME" ]; then
18   pushd . > /dev/null
19   cd "$REPODIR/$REPONAME"
20   git pull > /dev/null
21   json_resp $?
22   popd > /dev/null
23 else
24   mkdir -p "$REPODIR/$REPONAME"
25   git clone "$REPOURL" "$REPODIR/$REPONAME" > /dev/null
26   json_resp $?
27 fi
The REPODIR variable at the beginning of the script indicates the directory that will contain all repository directories. The json_resp function allows the code that generates a JSON response to be reused multiple times in the script. Just as in the example above, the HTTP POST data is captured in the POSTJSON variable. In lines 11 and 12, the clone_url and name attributes are pulled from the POSTJSON variable using jq. Line 14 begins the creation of the HTTP response headers. The if block on lines 17–27 determines whether the repository already has been cloned. If it has, the script moves to the repository folder, pulls down repository changes and returns to the original working directory. If the folder does not exist, the directory is created, and the repository is cloned into it. Note the use of the REPODIR variable that was set at the beginning of the script. Whether the repository is cloned or updates are pulled down, the json_resp function is called to generate the response JSON, which contains a single attribute named "result" with a value of "success" or "failure" depending on the outcome of the respective git commands.
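Assuming the directory layout from earlier in the article, the path referenced by REPODIR can be created ahead of time so the first clone has a place to land (again substituting the Apache user for USER):

mkdir -p /opt/hooks/html/repos
chown -R USER /opt/hooks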
Backing up repositories can be useful. With the vast number of build tools available on the command line, it makes sense to create a web hook that will deliver a built package for code in a repository. This could be built out into a robust solution filling the need for Continuous Integration/Deployment (CI/CD). Here's the build/deploy web hook script:
1 #!/bin/bash
2
3 WEBROOT="/opt/hooks/html/archive"
4 REPODIR="/opt/hooks/html/repos"
5 WEBURL="http://hooks.andydoestech.com/archive"
6
7 json_package() {
8   [[ $1 -eq 0 ]] && echo '{"result":"success","url":"'"$2"'"}' || echo '{"result":"package failure"}'
9 }
10
11 run_make() {
12   [[ -d "$REPODIR/$REPONAME/build" ]] && make -s -C "$REPODIR/$REPONAME" clean > /dev/null
13   if [ $1 -eq 0 ]; then
14     make -s -C "$REPODIR/$REPONAME" > /dev/null
15     if [ -d "$REPODIR/$REPONAME/build" ]; then
16       FILENAME="$REPONAME-$COMMITTIME.tar.gz"
17       tar -czf "$WEBROOT/$FILENAME" -C "$REPODIR/$REPONAME/build" .
18       json_package "$?" "$WEBURL/$FILENAME"
19     else
20       echo '{"result":"build failure"}'
21     fi
22   else
23     echo '{"result":"clone/pull failure"}'
24   fi
25 }
26
27 POSTJSON="$(cat -)"
28
29 REPOURL="$(jq -r ".repository.url" <<< "$POSTJSON")"
30 REPONAME="$(jq -r ".repository.name" <<< "$POSTJSON")"
31 COMMITTIME="$(jq -r '.commits[0].timestamp' <<< "$POSTJSON" | date -d "$(cat -)" +"%m-%d-%YT%H-%M-%S")"
32
33 echo "Content-type: text/json"
34 echo ""
35
36 if [ -d "$REPODIR/$REPONAME" ]; then
37   pushd . > /dev/null
38   cd "$REPODIR/$REPONAME"
39   git pull > /dev/null
40   run_make $?
41   popd > /dev/null
42 else
43   mkdir -p "$REPODIR/$REPONAME"
44   git clone "$REPOURL" "$REPODIR/$REPONAME" > /dev/null
45   run_make $?
46 fi
In a similar manner to Hook #1, variables are defined at the beginning of the script to specify the directory where repositories will be cloned, the directory where build packages will be stored and the base URL for the build packages. The two functions defined on lines 7–25 are used later in the script. Lines 27–31 capture the JSON POST data and parse attributes out into shell variables using jq. Note that the format of the date in COMMITTIME is modified from its original form (this will make sense later). Lines 33–46 are almost identical to Hook #1 in terms of setting the HTTP headers and cloning or pulling the repository, with the addition of a call to the run_make function. The return status of the clone/pull is passed to the run_make function. If the clone/pull ran successfully, the function assumes there is a Makefile in the root of the repository. The Makefile is assumed to behave in the following manner (a minimal example appears after the list):
- When make is executed, the solution is built into a folder named "build" within the repository.
- When make clean is executed, the "build" folder is deleted.
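Any Makefile that honors that contract will do. As a sketch, and assuming a hypothetical project whose deliverables live in a src/ directory, a minimal Makefile might look like this (recipe lines must be indented with tabs):

.PHONY: all clean

all:
	mkdir -p build
	cp src/*.sh build/    # stand-in for a real compile or packaging step

clean:
	rm -rf build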
Beginning on line 12, if the build folder exists, make clean is executed to remove it. If the make on line 14 is successful, an archive filename is constructed from REPONAME and COMMITTIME. Note that the value of COMMITTIME contains no spaces, so it makes a proper filename. The status code of the tar command on line 17 is passed into the json_package function along with the URL of the archive. If the archive was created successfully, a JSON object containing two attributes is returned: result is set to "success", and url is set to the URL of the archive. If the archive could not be created, the result attribute is set to "package failure".
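Under those assumptions, a successful run returns a JSON object along these lines (the timestamp portion of the filename comes from COMMITTIME), while a failed packaging step produces the second response:

{"result":"success","url":"http://hooks.andydoestech.com/archive/webhook-test-06-11-2019T02-44-52.tar.gz"}
{"result":"package failure"}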
GitHub provides many features, but without question, web hooks provide the DevOps engineer with tools to accomplish almost any task. Leveraging Apache, CGI and Bash scripting in a way that GitHub can consume allows for almost endless possibilities.
For more information on topics mentioned in this article, refer to the following links: