Jenkins Installation

Our primary (master) Jenkins server is http://jenkins1.internal.lab7.io:8000, which runs in a KVM virtual machine on our
IBM S822L (POWER8) server. The Jenkins VM runs Ubuntu 16.04.3 and has three virtual disks (LVM-backed libvirt volumes):

  • /dev/vda: 10-GB volume for the guest OS installation
  • /dev/vdb: 50-GB data disk for Jenkins
  • /dev/vdc: 25-GB data disk for Docker

(Docker is currently used to run the container that builds the esp_sys tarballs and will eventually be used to run ESP
test containers as well.)

After initializing the VM and installing the guest OS (on /dev/vda), do the following as root on the Jenkins VM:

  1. Use lsblk to find the data disks:
    NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
    vda    253:0    0   10G  0 disk 
    ├─vda1 253:1    0    7M  0 part 
    ├─vda2 253:2    0  954M  0 part [SWAP]
    └─vda3 253:3    0  9.1G  0 part /
    vdb    253:16   0   50G  0 disk
    vdc    253:32   0   25G  0 disk 
    
  2. Create ext4 filesystems on each data disk:
    JENKINS_FS_UUID=`uuidgen | tr A-F a-f`
    mkfs.ext4 -m 0 -i $((8*1024)) \
        -U ${JENKINS_FS_UUID} \
        -O extent,sparse_super2,flex_bg,resize_inode,mmp \
        -O dir_index,dir_nlink,large_file,filetype,ext_attr,^quota \
        /dev/vdb
    
    DOCKER_FS_UUID=`uuidgen | tr A-F a-f`
    mkfs.ext4 -m 0 -i $((4*1024)) \
        -U ${DOCKER_FS_UUID} \
        -O extent,sparse_super2,flex_bg,resize_inode,mmp \
        -O dir_index,dir_nlink,large_file,filetype,ext_attr,^quota \
        /dev/vdc
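
    As an optional sanity check, blkid can be used to confirm that the new filesystems got the intended UUIDs:

    blkid /dev/vdb /dev/vdc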
    
  3. Add the following to /etc/fstab:
    UUID=${JENKINS_FS_UUID} /srv/jenkins     ext4  defaults,noatime,nodiratime,nodev  0 2
    UUID=${DOCKER_FS_UUID}  /var/lib/docker  ext4  defaults,noatime,nodiratime  0 2
    

    We specify which filesystems to mount using UUIDs rather than device names (e.g., /dev/vdb) so things don’t break
    in cases where modifying the VM disk configuration causes block device name changes.

    If you did not explicitly set the filesystem UUID using uuidgen and the mkfs -U option above, you can get the
    UUID for an existing filesystem using the dumpe2fs command:

    dumpe2fs -h /dev/vdb 2>/dev/null | grep -i 'uuid:' | awk '{print $3;}'
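
    To append the entries above without editing /etc/fstab by hand, the UUID variables from step 2 can be reused if
    they are still set in the current shell (a sketch; double-check the resulting file before rebooting):

    {
      echo "UUID=${JENKINS_FS_UUID} /srv/jenkins     ext4  defaults,noatime,nodiratime,nodev  0 2"
      echo "UUID=${DOCKER_FS_UUID}  /var/lib/docker  ext4  defaults,noatime,nodiratime  0 2"
    } >> /etc/fstab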
    
  4. Mount the data filesystems:
    mount /var/lib/docker
    mount /srv/jenkins
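
    Note that neither mount point is created by the preceding steps (/var/lib/docker in particular will not exist
    until Docker is installed in step 6), so if the mount commands complain about missing directories, create them
    and retry:

    mkdir -p /srv/jenkins /var/lib/docker
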
  5. Install Jenkins:
    wget -q -O - https://pkg.jenkins.io/debian-stable/jenkins.io.key | \
    sudo apt-key --keyring "/etc/apt/trusted.gpg.d/jenkins.gpg" add -
    echo 'deb https://pkg.jenkins.io/debian-stable binary/' > /etc/apt/sources.list.d/jenkins.list
    apt-get update
    apt-get install jenkins openjdk-8-jre-headless
  6. Install Docker for POWER (ppc64le):
    add-apt-repository -y ppa:docker/experimental
    apt-get update
    apt-get install docker.io
    

    Add the local server administrator and Jenkins server accounts to the docker group so they can run
    docker-related commands without sudo/su privileges.

    usermod -a -G docker srvadmin
    usermod -a -G docker jenkins
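
    Group membership changes only take effect on a new login session. As an optional check that the docker group is
    set up correctly (docker info is simply a convenient command that requires access to the Docker daemon), run:

    id jenkins
    sudo -u jenkins -H docker info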
    
  7. Stop Jenkins so we can move the installation to /srv/jenkins:
    systemctl stop jenkins
  8. Update the Jenkins home directory in the user account database:
    usermod -d /srv/jenkins jenkins
  9. Move Jenkins-associated files and directories to /srv/jenkins:
    rsync -avAHX /var/lib/jenkins/ /srv/jenkins/
    mv /var/cache/jenkins /srv/jenkins/cache
    mkdir -p /srv/jenkins/logs
    mv /var/log/jenkins/* /srv/jenkins/logs
    rm -rf /var/{cache,lib,log}/jenkins
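
    Because the mkdir above runs as root, /srv/jenkins/logs (and possibly the /srv/jenkins mount point itself) will
    be owned by root at this point, so it is worth making sure everything under /srv/jenkins ends up owned by the
    jenkins account:

    chown -R jenkins:jenkins /srv/jenkins
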
  10. Edit /etc/default/jenkins and set the following options:
    • HTTP_PORT=8000
    • JAVA_ARGS="-Djava.net.preferIPv4Stack=true"
    • JENKINS_HOME=/srv/$NAME
    • JENKINS_LOG=$JENKINS_HOME/logs/$NAME.log
    • JENKINS_ARGS="--webroot=$JENKINS_HOME/cache/war --httpPort=$HTTP_PORT"
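
    After editing, the relevant settings in /etc/default/jenkins should end up looking like the following (each on
    its own existing line in the file; the remaining settings are left at their defaults):

    HTTP_PORT=8000
    JAVA_ARGS="-Djava.net.preferIPv4Stack=true"
    JENKINS_HOME=/srv/$NAME
    JENKINS_LOG=$JENKINS_HOME/logs/$NAME.log
    JENKINS_ARGS="--webroot=$JENKINS_HOME/cache/war --httpPort=$HTTP_PORT"
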
  11. Remove the /srv/jenkins/cache/war directory (rm -rf /srv/jenkins/cache/war). Failing to do this will prevent
    Jenkins from restarting properly.
  12. Restart Jenkins:
    sudo systemctl daemon-reload
    sudo systemctl start jenkins
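
    Once Jenkins is back up, a quick check that the service started cleanly and is logging to the new location (the
    log path follows from the settings in step 10) is worthwhile:

    systemctl status jenkins
    tail -n 20 /srv/jenkins/logs/jenkins.log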

Jenkins Basic Configuration

This section describes how we set up some “system-level” things. It does not describe how things like the current set
of build jobs were created/configured; for that, refer to the Jenkins user documentation and
the current job configurations on the system.

  1. Connect to http://jenkins1.internal.lab7.io:8000 using your browser.
  2. Log in using the password copied from /srv/jenkins/secrets/initialAdminPassword (the Jenkins home directory was
    moved to /srv/jenkins during installation).
  3. When prompted to customize Jenkins, select “Install suggested plugins”, and let the installer do its thing.
  4. Create the first admin user. Since we ultimately want to use Google Apps for authentication, what’s entered here is
    not terribly important; we simply need an account to perform the initial configuration with. However, avoid using
    ${first_name}.${last_name}@lab7.io as the first username, as that may conflict with usernames pulled from the
    “lab7.io” Google Apps domain.
  5. Log into Jenkins as the admin user. Navigate to the “Manage Jenkins” > “Manage Plugins” page, and install the
    following plugins from the “Available” tab:

    • build-name-setter
    • Git Parameter
    • Google Login
    • Green Balls
    • Slack Notification
    • TextFinder

Google Apps Integration

The Google Login plugin provides single sign-on access to Jenkins for users
in the “lab7.io” Google Apps domain.

To enable this, we first need to create an OAuth client ID for the Jenkins server:

  1. Log into the Google API Credentials page, and create a
    “Jenkins Auth” project.
  2. Navigate to the project’s “Credentials” section (e.g., through the left side bar link).
  3. From the “Credentials” tab, click on the “Create credentials” button and select “OAuth Client ID”. Set the
    “Application type” to “Web application”, and fill in the remaining fields (name, authorized JavaScript origins,
    and authorized redirect URIs) with values appropriate for the Jenkins server.
  4. On the “OAuth consent screen” tab, fill in the details shown to users when they are asked to authorize the login
    (product name, support email address, etc.).

Once the OAuth client ID has been created, we can then configure Jenkins:

  1. Log into Jenkins as the admin user, and navigate to the “Manage Jenkins” > “Configure Global Security” page.
  2. Make sure the “Enable Security” box is checked.
  3. Under the “Access Control” > “Security Realm” section, select “Login with Google” and enter the following values:
    • Client ID: <Value from Google Developers Console>
    • Client secret: <Value from Google Developers Console>
    • Google Apps Domain: lab7.io
  4. Click “Save” or “Apply” to enable Google Apps support. Users should now be able to use their “lab7.io” Google
    accounts to log into Jenkins using a process similar to the one used for Atlassian products (BitBucket, JIRA, etc.).

Slack Integration

The Slack Notification plugin allows Jenkins jobs to send status
messages to Lab7’s #notifications Slack channel.

To enable this, first create an integration token for the Jenkins server in Slack:

  1. As a “lab7io” Slack admin, navigate to the Slack “Configure Apps” page.
  2. Use the “Search App Directory” tool to find the “Jenkins CI” app.
  3. Click the “Add Configuration” button on the “Jenkins CI” app page, and enter the following values for the
    “Integration Settings”:

    • Post to Channel: #notifications
    • Customize Name: jenkins
    • Token: <Copy this value to supply to Jenkins below>
  4. Click the “Save Settings” button.

Once the Slack integration token has been created, we can then configure Jenkins:

  1. Log into Jenkins as the admin user, and use the left side bar to navigate to the “Credentials” > “System” store.
  2. Select the “Global credentials” domain.
  3. Click “Add Credentials” in the left side bar, and enter the following values:
    • Kind: Secret Text
    • Scope: Global
    • Secret: <“Token” from the Slack integration settings above>
    • Description: “Slack Token”
  4. Click “OK” to confirm creation of the Slack credential.
  5. Navigate to the “Manage Jenkins” > “Configure System” page. In the “Global Slack Notifier Settings”, enter the
    following values:

    • Base URL: <Leave blank>
    • Team Subdomain: lab7io
    • Integration Token Credential ID: Slack Token
    • Channel: notifications

    WARNING: Leave the “Integration Token” field blank; we are explicitly using Jenkins’ credential system to prevent
    sensitive information like our Slack API token from being stored as plain text in “publicly” readable files.

  6. Click “Save” or “Apply” to enable Slack integration.
  7. Slack notifications are configured on a per-job basis, by selecting “Slack notifications” from the “Post Build Actions”
    list on the job configuration page.

BitBucket Integration

An SSH key pair and a corresponding Jenkins credential need to be generated so that Jenkins jobs can access Lab7’s
BitBucket (git) repositories. To do this:

  1. SSH in as the jenkins user on the Jenkins server. From the terminal, run:
    ssh-keygen -t rsa -b 2048 -f ~jenkins/.ssh/esp_jenkins_rsa
  2. Log into Jenkins as the admin user, and use the left side bar to navigate to the “Credentials” > “System” store.
  3. Select the “Global credentials” domain.
  4. Click “Add Credentials” in the left side bar, and enter the following values:
    • Kind: SSH Username with Private Key
    • Scope: Global
    • Username: Lab7 BitBucket (Note: this doesn’t actually matter all that much, as BitBucket access is based on the
      key fingerprint and not the specific user name.)
    • Private key: “From a file on Jenkins master”
    • File: /srv/jenkins/.ssh/esp_jenkins_rsa
    • Passphrase: <Leave blank>
    • ID: <Leave blank>
    • Description: <Leave blank>
  5. Click the “Save Settings” button.

To give Jenkins access to a Lab7 BitBucket (git) repository, create a deployment key as follows:

  1. Log into BitBucket as a repository administrator.
  2. Click on “Settings” for the repository, followed by “Access Keys” in the “General” section.
  3. Click on the “Add Key” button, and enter the following values in the dialog:
    • Label: “Jenkins<n> server”
    • Key: <Contents of ~jenkins/.ssh/esp_jenkins_rsa.pub>
  4. Click the dialog’s “Add Key” button to complete the deployment key creation.

Note that this process only needs to be done once per repository, regardless of how many Jenkins jobs actually make
use of it.

Access to Lab7 BitBucket repositories is done on a per-job basis, by selecting “Git” in the “Source Code Management”
section of the job’s configuration, entering the desired repository’s BitBucket URL (usually of the form
git@bitbucket.org:lab7io/${repo_name}), and selecting “Lab7 BitBucket” for the “Credentials” to use.
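
One quick way to sanity-check a newly added deployment key from the Jenkins server (independent of any job
configuration) is to list the repository’s refs while explicitly pointing SSH at the key. This is only a suggested
check; it assumes git 2.3 or newer for GIT_SSH_COMMAND, and ${repo_name} is a placeholder for the repository name:

    # Run as root on the Jenkins VM; the key path matches the one generated above.
    sudo -u jenkins -H env GIT_SSH_COMMAND="ssh -i /srv/jenkins/.ssh/esp_jenkins_rsa" \
        git ls-remote git@bitbucket.org:lab7io/${repo_name}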


    Nitpicking: Dreadnoughtus wasn’t heavier than a jumbo jet

    Warning: Aviation geek nitpicking below.

    Last week saw quite a bit of press about the discovery of Dreadnoughtus schrani, a sauropod with the “largest calculable mass of any land animal”. Many reports also included some version of this figure:

    [Figure: Dreadnoughtus compared with other sauropods and a 737-900. Image credit: Nature news]

    That figure, in turn, led to multiple tweets appearing in my timeline claiming a dinosaur larger than a jumbo jet had been found (e.g., this one from Time). Despite an early “promise” not to do so, I eventually felt compelled to tweet a clarification:

    To explain: in aviation, the term “jumbo jet” refers to certain large, wide-body aircraft like the Boeing 747 or Airbus A380; it’s generally not used to refer to narrow-body aircraft like the 737-900 shown in the figure above. Relatedly, being specific is important when discussing aircraft weights, as various weights like the operating empty weight (OEW) and the maximum takeoff weight (MTOW) can differ significantly.

    For example, for the three aircraft mentioned in this post (ranges arise from variations in aircraft configurations):

    Model            OEW (kg)               MTOW (kg)
    Boeing 737-900   42,901 [1]             74,389–79,016 [1]
    Boeing 747-400   178,756–179,752 [2]    362,874–396,894 [2]
    Airbus A380      276,800 [3]            490,000–575,000 [4]

    Dreadnoughtus is (well, was) very much an impressive, enormous animal, but at a “mere” 60 tonnes, it’s absolutely dwarfed by jumbo jets. In fact, it’s quite a bit lighter than a loaded 737-900, a fact that’s probably not made very clear in that Nature news figure.

    Mind you, none of this is meant to take away from the importance or interestingness of Dreadnoughtus (and you should read Brian Switek’s coverage, by the way). It’s just that I get oddly picky about precision when talking about aviation and science.

    [1] “Boeing 737 Airplane Characteristics for Airport Planning: Chapter 2: Airplane Description”. (PDF)
    [2] “Boeing 747 Airplane Characteristics for Airport Planning: Chapter 2: Airplane Description”. (PDF)
    [3] Wikipedia: Airbus A380: Specifications
    [4] “Airbus A380 Airplane Characteristics: Airport and Maintenance Planning”. (PDF)

    Donors Choose Drive for #Ferguson-area schools

    Update 1: Mrs. Randoll’s writing materials have been funded!
    Update 2: Ms. Peach’s string bass has been funded!
    Update 3: Mrs. Baughman’s reading mat and bean bags have been funded! Just Mr. Brown’s 3D printer left to go!

    Last week, Drug Monkey organized two successful Donors Choose drives (here and here) to support schools in the Ferguson, Missouri area. If you’re not familiar with it, Donors Choose is an online charity that allows you to make donations to support teacher-selected projects in (generally) economically disadvantaged public schools.

    This weekend, the Bill and Melinda Gates Foundation is providing matching funds to nearly every project on DonorsChoose.org. Since today is the last day of this “sale”, I figured it would be a great opportunity to continue what Drug Monkey started and help schools affected by the situation in Ferguson. Here are just four of the many projects I think are worth supporting:

    • Mrs. Randoll at Walnut Grove Elementary School needs writing supplies:

      Many of my students enter kindergarten without even knowing how to write their name. This year I want to make writing fun and meaningful for them by allowing them to publish their own books!

      The resources for this project will help my classroom because students will be able to share their thoughts and feelings daily through their writing. Publishing their writing will help engage them, create excitement for writing, and give them something they can share at home with their families.

      Learning how to write well is fundamental to learning to communicate well, especially in this era where so much of our communication is text-based. So don’t you want to help this kindergarten class get their proper start?

    • Mrs. Baughman at Johnson Wabash Elementary School needs help with her library:

      I hope that creating a fun space to read and listen to stories will get them excited about all sorts of topics and encourage them to read more. My school is in a low-income area, so not every child has a stack of books waiting for them at home.

      The reading rug will provide a space for my K-2 students to come in and enjoy a fun story. Also, the bean bag chairs will provide some comfortable seating for all of my students to read independently. Currently, my library does not have any comfy seating.

      Browsing for hours at the local library and bringing home a stack of books to read was one of my favorite activities as a kid. Doing so certainly inspired my curiosity and set me down the path to my current science/engineering career. I suspect many of you have similar stories, so let’s give the kids of Johnson Wabash Elementary the same opportunity. After all, who wouldn’t like reading while sitting in a comfy bean bag?

    • Ms. Peach at Lee-Hamilton Elementary School needs a string bass:

      My students walk into my classroom excited about learning how to play a string instrument. Some students are trying out the instruments for the very first time; others have been playing for up to 3 years (since they started violin in 3rd grade), but they all are a talented group of musicians.

      Due to budget cuts, the one thing that’s missing from my orchestra is an upright string bass. I currently teach at 5 elementary schools and have only 1 bass (that is much too big for a majority of my students) available to my students to use.

      Playing the violin in orchestra was one of the great experiences of my middle and high school years. Music is a wonderful outlet for children, but an orchestra without a bass simply isn’t an orchestra. Shouldn’t we help Ms. Peach make her orchestra complete?

    • Mr. Brown at Cross Keys Middle School needs a 3D printer:

      I teach engineering and 3D modeling to middle school students. The students love using the computer software to digitally create 3D objects, but have no way to see them in real life. It would be great to have a 3D printer so that students can “print” their designs and see them come to life.

      [M]y focus is on 21st century skills (robotics, programming, 3D modeling, engineering). The students are excited about the program, and have lots of fun learning through building in my classroom.

      This is my “fun” suggestion and an opportunity I never had as a kid. Think of it as the modern replacement for woodshop—giving students an opportunity to design and build their own creations.

    Of course, you are under no obligation to contribute to the projects I’ve listed. You could, for example, contribute to projects from other schools in the Ferguson area or from schools anywhere in the United States. Even if you can’t afford to give, you can help by spreading the word about Donors Choose (in general) and the generous but soon-to-expire offer from the Gates Foundation (specifically).

    Thank you, Dear Readers, for all of your help!

    Using pip with an alternate CA bundle

    After recently upgrading my OS X pip install, I began having problems using it to install Python packages; for example, attempting to install requests resulted in the following:

    $ pip install requests
    Downloading/unpacking requests
      Cannot fetch index base URL https://pypi.python.org/simple/
      Could not find any downloads that satisfy the requirement requests
    Cleaning up...
    No distributions at all found for <foo>
    Storing debug log for failure in $HOME/.pip/pip.log

    The debug log reveals that this is an SSL certificate verification problem:

    Downloading/unpacking requests
      Getting page https://pypi.python.org/simple/requests/
      Could not fetch URL https://pypi.python.org/simple/requests/: connection error: [Errno 1] _ssl.c:504: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
      Will skip URL https://pypi.python.org/simple/requests/ when looking for download links for requests

    Now, pip has an internal CA bundle, but for reasons I didn’t bother looking into, that bundle no longer worked for validating the TLS certificate used by pypi.python.org. Fortunately, pip has a “--cert” option for providing an alternate CA bundle.

    Connecting to PyPI in a browser allowed me to examine the TLS certificate chain:
    [Figure: CA chain for pypi.python.org]

    This told me that the pip alternate CA bundle needed to include the DigiCert “High Assurance EV Root CA” and “High Assurance CA-3” certificates, both of which can be obtained from the DigiCert Root Certificates page.

    One caveat is that pip expects its (alternate) CA bundle to be in PEM format, which means just downloading the certs isn’t enough. So…sh, wget, and openssl to the rescue:

    $ wget -q -O- "http://cacerts.digicert.com/DigiCertHighAssuranceEVRootCA.crt" | \
        openssl x509 -inform DER -outform PEM > ~/.pip/ca-bundle.crt
    $ wget -q -O- "http://cacerts.digicert.com/DigiCertHighAssuranceCA-3.crt" | \
        openssl x509 -inform DER -outform PEM >> ~/.pip/ca-bundle.crt

    Once this alternate CA bundle had been created, I could once again use pip to install Python packages; e.g.,

    $ pip --cert ~/.pip/ca-bundle.crt install requests
    Downloading/unpacking requests
      Downloading requests-2.3.0.tar.gz (429kB): 429kB downloaded
      Running setup.py egg_info for package requests
    
    Installing collected packages: requests
      Running setup.py install for requests
        
    Successfully installed requests
    Cleaning up...

    Postscript: To make life easier, I added the following to my ~/.pip/pip.conf to make sure pip always uses the alternate CA bundle:

    [global]
    cert = /home/chl/.pip/ca-bundle.crt

    Getting back into blogging

    I recently realized that I haven’t updated this blog in more than seven months, and even before then, new posts were rather sporadic. While I certainly haven’t been lacking in things to say (see, for example, my activity on Twitter), I’ve avoided blogging due to a combination of (1) having other, higher-priority commitments and (2) being uncertain about the direction(s) I wanted my blogging to go.

    However, after talking to others and giving it some thought, I decided I needed to get back into blogging for several reasons. First, while Twitter has helped me improve aspects of my writing (mostly in condensing thoughts and choosing words carefully), I’ve noticed that lack of use has caused my “long form” writing skills to degrade, and blogging seems like a great way to polish those skills again. Secondly, the longer format of blog posts provides a much better venue for developing and explaining “complex” ideas than a tweet storm. Finally, my startup work has really distracted me from graduate school, and I’m hoping that forcing myself to blog about PhD-related topics will get me back on track.

    Going forward, I see two broad areas of focus for this blog. The first—based largely on my experiences at the startup—will be posts on programming and software development, covering topics like how to accomplish X in language Y or comparing various approaches to doing something [1]. The second, as mentioned above, will be PhD-related posts, essentially turning this blog into an open lab notebook; these will include research blogging papers I’m reading, describing methods I’m learning, and explaining my research ideas as I develop them [2]. And of course, I will continue blogging about other things, such as science communication, food, and (to a lesser extent now) politics.

    So, to my few loyal readers and hopefully to some new ones as well, welcome (back) to my restarted blog!

    [1] Inspired by Sebastian Raschka’s “One Python Benchmark Per Day”
    [2] Inspired by various friends’ grad school blogs and “Becoming A Data Scientist”

    Standing with DNLee [updated]

    [Updated Oct. 14th: As of this afternoon, Scientific American has restored DNLee’s blog post with an editor’s note on why the post was originally taken down (“…for legal reasons…”).

    While I respect the position that SciAm’s editors were in and commend them for doing the right thing by restoring the post, I still think their management of the situation as it developed was misguided. The scicomm world has a fairly well-developed BS detector, and the mix of unresponsiveness and flailing for an explanation (“not about discovering science” and “too personal” before settling on “lawyers”) certainly set it off. However, the community is also reasonably patient, and I think a lot of the outrage could have been avoided had the editors just posted a simple “hey, this may cause us some legal problems, so we’re going to pull the post until we can sort it out” message from the get-go.]

    Woke up this morning to find my Twitter feed in an uproar about Scientific American’s decision to take down one of its bloggers’ posts; as explained by SciAm’s editor-in-chief:
    [Image: SciAm's tweets explaining why DNLee's post was taken down]

    The short story behind this post so clearly not about “discovering science”? DNLee, the blogger in question, was asked by Ofek, the editor of biology-online.org, to guest blog for them; after asking about the specifics, she politely declined:

    Thank you very much for your reply.
    But I will have to decline your offer.
    Have a great day.

    In a brilliant rapport-building move, Ofek responded:

    Because we don’t pay for blog entries?
    Are you an urban scientist or an urban whore?

    Um, yeah; calling someone “an urban whore” really isn’t the way to make friends. An understandably very unhappy DNLee decided to blog about the experience—except that link is dead now because, well, see DiChristina’s tweet above (see update). Needless to say, SciAm’s decision spawned furor in the online scicomm community, as evidenced by the floods of #StandWithDNLee/#StandingWithDNLee tweets in my feed this morning.

    Granted, DNLee’s post wasn’t about a headline-grabbing new discovery. But I’d strongly argue that it is (or rather was) about “discovering science”. Effective science communication—i.e., the process of helping the public “discover science”—can’t simply be a stream of “hey look at this cool new thing scientists discovered!” articles; it also has to help people understand the process of how science is done, and that, unfortunately, also means exposing them to the uglier side of things, including the pervasive sexism that women in STEM fields face.

    Following Dr. Isis’ lead, I’m now reposting DNLee in her own words. I encourage readers to also repost (with proper attribution!) and also hope that Scientific American blogs quickly corrects their mistake and apologizes for it.


    wachemshe hao hao kwangu mtapoa

    I got this wrap cloth from Tanzania. It’s a khanga. It was the first khanga I purchased while I was in Africa for my nearly 3 month stay for field research last year. Everyone giggled when they saw me wear it and then gave a nod to suggest, “Well, okay”. I later learned that it translates to “Give trouble to others, but not me”. I laughed, thinking how appropriate it was. I was never a trouble-starter as a kid and I’m no fan of drama, but I always took this 21st century ghetto proverb most seriously:

    Don’t start none. Won’t be none.

    For those not familiar with inner city anthropology – it is simply a variation of the Golden Rule. Be nice and respectful to me and I will do the same. Everyone doesn’t live by the Golden Rule it seems.

    The Blog editor of Biology-Online dot org asked me if I would like to blog for them. I asked the conditions. He explained. I said no. He then called me out of my name.

    My initial reaction was not civil, I can assure you. I’m far from rah-rah, but the inner South Memphis in me was spoiling for a fight after this unprovoked insult. I felt like Hollywood Cole, pulling my A-line T-shirt off over my head, walking wide leg from corner to corner yelling, “Aww hell nawl!” In my gut I felt so passionately:”Ofek, don’t let me catch you on these streets, homie!”

    This is my official response:

    It wasn’t just that he called me a whore – he juxtaposed it against my professional being: Are you urban scientist or an urban whore? Completely dismissing me as a scientist, a science communicator (whom he sought for my particular expertise), and someone who could offer something meaningful to his brand.What? Now, I’m so immoral and wrong to inquire about compensation? Plus, it was obvious me that I was supposed to be honored by the request..

    After all, Dr. Important Person does it for free so what’s my problem? Listen, I ain’t him and he ain’t me. Folks have reasons – finances, time, energy, aligned missions, whatever – for doing or not doing things. Seriously, all anger aside…this rationalization of working for free and you’ll get exposure is wrong-headed. This is work. I am a professional. Professionals get paid. End of story. Even if I decide to do it pro bono (because I support your mission or I know you, whatevs) – it is still worth something. I’m simply choosing to waive that fee. But the fact is I told ol’ boy No; and he got all up in his feelings. So, go sit on a soft internet cushion, Ofek, ’cause you are obviously all butt-hurt over my rejection. And take heed of the advice on my khanga.

    You don’t want none of this

    Thanks to everyone who helped me focus my righteous anger on these less-celebrated equines. I appreciate your support, words of encouragement, and offers to ride down on his *$$.


    Git: copying a subset of commits from another branch

    Today’s git challenge was to copy a subset of commits from our current development branch (“dev” in the figures below) into our current release branch (“v2-hotfix” in the figures below). Graphically, we started with a repository looking like this:

    ... <-- A <-- B <-- C <-- D (v2-hotfix)
            ∧
            |                                   
            +---- Q <-- R <-- S <-- T <-- U <-- V (dev)

    and wanted a repository looking like this:

    ... <-- A <-- B <-- C <-- D <-- R′ <-- S′ <-- T′ (v2-hotfix)
            ∧
            |                                   
            +---- Q <-- R <-- S <-- T <-- U <-- V (dev)

    where R′, S′, and T′ are copies of the R, S, and T commits from the dev branch. The “obvious” use case for doing this is incorporating a set of dependent patches for fixing some bugs into our supported release branch (v2-hotfix).

    Copying commits the easy way

    The “easy” way of copying a sequential subset of commits is to use “git cherry-pick”. First, prepare the target v2-hotfix branch:

    $ git checkout v2-hotfix
    $ git stash
    No local changes to save
    $ git status
    # On branch v2-hotfix
    nothing to commit, working directory clean

    Strictly speaking, the “git stash” is not necessary; however, before integrating patches from elsewhere, I like having a clean working directory to reduce the chance of merge conflicts. (In general, I find it easier to fix conflicts from a “git stash pop” than trying to resolve them when performing a merge or rebase; your mileage may vary.)

    Next, use “git log” to get the commit ids for the patches of interest:

    $ git log --oneline dev
    f36a95a Message for commit V
    19ece6a Commit message for U
    58ee3ac Commit message for T    <- Want this commit...
    11ea6f6 Commit message for S    <- and this one...
    707ac6a Commit message for R    <- and this one
    824e4b8 Commit message for Q
    ...older commit messages for dev branch...

    Finally, use “git cherry-pick <from_id>..<to_id>” and the SHA-1 ids for the commits of interest to incorporate them into the target branch:

    $ git cherry-pick 707ac6a^..58ee3ac
    [v2-hotfix 1b42c27] Commit message for R
     .. files changed, .. insertions(+), .. deletions(-)
    [v2-hotfix 1f69887] Commit message for S
     .. files changed, .. insertions(+), .. deletions(-)
    [v2-hotfix 5141a8f] Commit message for T
     .. files changed, .. insertions(+), .. deletions(-)
    
    $ git status
    # On branch v2-hotfix
    nothing to commit (working directory clean)
    $ git log --oneline
    5141a8f Commit message for T    <- Now also the message for T′
    1f69887 Commit message for S    <- Now also the message for S′
    1b42c27 Commit message for R    <- Now also the message for R′
    6ffbf6f Commit message for D
    ...older commit messages for v2-hotfix...

    The key here is that because git needs a reference commit from which to generate the diff for R (and thus R′), the <from_id> argument for “git cherry-pick” must be the parent of R, i.e., 824e4b8, or equivalently, 707ac6a^.

    Copying commits the harder way

    The alternative, slightly harder, method uses “git rebase” and requires an additional command to achieve the same effect. It was, however, a good opportunity to learn how to use the “--onto” argument for “git rebase”, so I’ll discuss the approach here.

    The key command is “git rebase --onto <new_root> <old_root> <old_tip>”, where <old_root> corresponds to <from_id> in the previous approach, <old_tip> corresponds to <to_id> and <new_root> is the branch we want to add the commits to. For the example repository in this post, we would run the following instead of “git cherry-pick”:

    $ git rebase --onto v2-hotfix 707ac6a^ 58ee3ac
    First, rewinding head to replay your work on top of it...
    Applying: Commit message for R
    Applying: Commit message for S
    Applying: Commit message for T
    
    
    $ git status
    ...output for git version > 1.8...
    # HEAD detached from 6ffbf6f
    nothing to commit, working directory clean
    ...output for git version < 1.8...
    # Not currently on any branch.
    nothing to commit, working directory clean

    As with the cherry-pick approach, it’s important that the <old_root> argument refer to the parent of R so git can properly generate the commit R′.

    Warning: be sure you don’t use a branch name instead of a commit id (or tag name) for the <old_tip>. Using a branch name will cause git to rebase (i.e., move, not copy) the branch and its associated commits as descendants of the <new_root> branch, which is definitely not the behavior we’re looking for here (refer to the git-rebase manpage for more details).

    The output from “git status” tells us that we’re working in a “detached HEAD” state. In fact, the state of the repository after running “git rebase --onto” looks like this:

                              (v2-hotfix)         (HEAD)
    ... <-- A <-- B <-- C <-- D <-- R' <-- S' <-- T'
            ^
            |                                   
            +---- Q <-- R <-- S <-- T <-- U <-- V (dev)

    where the HEAD pointer is on commit T′ while the tip of the v2-hotfix branch is still on commit D. To fix this, we use:

    $ git checkout -B v2-hotfix
    Switched to and reset branch 'v2-hotfix'

    The “-B” tells git to reset the v2-hotfix branch to the HEAD commit. We can verify this using status and log commands:

    $ git status
    # On branch v2-hotfix
    nothing to commit (working directory clean)
    $ git log --oneline
    5141a8f Commit message for T    <- Now also the message for T′
    1f69887 Commit message for S    <- Now also the message for S′
    1b42c27 Commit message for R    <- Now also the message for R′
    6ffbf6f Commit message for D
    ...older commit messages for v2-hotfix...

    So there you go: two different ways of copying a subset of commits from one branch to another in git.