Setup git on Anvil

Many times you’ll use an application like GitHub Desktop to interact with GitHub.

However, occasionally you’ll need to work with GitHub through the terminal. Anvil is one of those cases.

The first time steps you need to follow in order to get your Github account connected to the repo include:

  1. Make a Github account using the same email as you used for Access ("Making an Account on Github")

  2. Setup your SSH keys and connect them to your Github account ("Setting up SSH Keys")

  3. Cloning your teams repository ("Cloning the Repository")

You only need to do the steps above once, then you can then go on to learn how to use the git CLI to make changes with Anvil. There are additional steps below to do advanced branching, if you are interested.

0. Working with git on Anvil

To open a terminal session, start by launching Jupyter Lab from ondemand.anvil.rcac.purdue.edu.

After you’ve connected to Jupyter Lab, click the blue plus sign in the upper-left corner. This will list the different launchers that are available to you.

Scroll to the bottom of the "Launcher" page, and under the "Other" category you should see the option for a "Terminal".

If this is your first time working with a terminal, some helpful commands are included below:

  • ls - this will list the contents of the directory that you are currently in.

  • cd - this will change locations to a new directory based on what is passed to it.

  • mkdir - this will make a new directory that can store files.

We’ll use these commands to setup our GitHub repository.

1. Making an Account on Github

If you don’t already have a Github account that uses the same .edu email you used to sign up for Access, you should make one now.

2. Setting up ssh Keys

In order for Github to authorize us to clone the repository to Anvil, we’ll need to setup ssh ("Secure SHell") keys.

ssh uses public key cryptography to securely send messages. Two keys are generated; one public, one private. The public key is shared with others in order for messages to be encrypted (so that only you can read the message). The private key is private (you should not share it with others) and is used to decrypt messages that were signed with your public key.

To begin, go to Anvil and login like usual: ondemand.anvil.rcac.purdue.edu/pun/sys/dashboard

Click the "Clusters" drop down menu, and select Anvil Shell Access. Run the code below to generate a key pair and change the security permissions so that ssh has the appropriate permissions. You will be prompted to input a password. Write this password down, as you will need it to authenticate each time your ssh key is used.

ssh-keygen -a 100 -t ed25519 -f ~/.ssh/id_ed25519 -C "My Anvil Key"
chmod 700 ~/.ssh
chmod 600 ~/.ssh/id_ed25519
chmod 644 ~/.ssh/id_ed25519.pub

We want to share our public key with Github so that it can authenticate us, so let’s cat the contents of the public key. Copy the output of this command.

cat ~/.ssh/id_ed25519.pub
Make sure you are logged in to the Github account that has the same .edu email address as the one you signed up with Access for.

Here you can click the green New SSH Key button in the top right corner.

  • Call it anything you like, maybe My Anvil Key.

  • Make sure the key type is set to Authentication Key.

  • Once again, using the contents of the public key that we just copied from Anvil, copy the public key here in the Key field.

Our ssh keys are now set up!

If you interested in learning how to use ssh locally or connect to Anvil via ssh, you can follow our more in depth instructions at the ssh page.

3. Cloning the Repository

Each team will be setup with a GitHub repository at the start of the project.

There is an example repository that I have setup for your reference.

On each repository, you’ll see a green <> Code button in the center-right of the screen. If you click this button, you’ll see different options listed under Clone.

We’ll use these options when we clone the repository in Anvil.

Now that we have all the pieces, we can start the steps below:

  1. In your Anvil terminal that we launched above run the command cd /anvil/projects/tdm/corporate and then run ls.

    • This will move you to the "/anvil/projects/tdm/corporate" directory and list all the contents.

  2. Find the name of your team’s project and run cd <project_name>/project/<access_id>.

    • Replace the <project_name> portion with the name of your team’s directory and the <access_id> with your ACCESS ID.

    • If you get the "no such directory found" error, first run mkdir /anvil/projects/tdm/corporate/<project_name>/project/<access_id>. Then do cd <project_name>/project/<access_id> again.

  3. Git is already installed on Anvil by default. We can see all of the commands available to use by running the git --help command.

  4. Now we are ready to clone the repository. First, navigate back to the website for your team’s GitHub repository and click the green <> Code button. Under the Clone options select ssh and copy the text provided.

  5. Navigate back to the terminal on Anvil and run the command git clone <copied_text>. Replacing the <copied_text> with the ssh link that you copied from the GitHub website.

    • This will clone the repository into your directory on Anvil.

GitHub Structure

The Anvil directories are a shared environment. This is good, because it means that you can easily share code and data with others on your team.

However, this can be challenging when working with GitHub because of the way it handles different branches.

The Data Mine team provides some guidelines for how to structure your project environment below:

  1. Have a single data folder that everyone works from.

    Don’t copy your data into other folders. This can create confusion and be hard to track.

    For example, if I have a set of data in david_working and there is another set in kevin_working it can be hard to ensure that they are aligned.

  2. In the team’s directory on Anvil The Data Mine team will create a GitHub directory for each user.

    • The directories will be located in Projects and will be named for your ACCESS ID.

    • This is where each student will work on their code and contribute to the main repository.

    • The directory will be setup for you, but you’ll setup the repository as part of the steps below.

  3. The main branch of the GitHub repository should be updated once the code is working and ready for other members of the team to use.

    • The Data Mine team should be able to check your repository and see branches for code in development as well as finalized code in the Main branch.

Working in GitHub

Now that we have cloned the repository, we are ready to create our first branch and push the change.

It should be noted that there are different techniques for working with GitHub. This is how The Data Mine plans to work for the projects, but it isn’t the only workflow.

  1. In your directory run the ls command to see the name of the new repository that you just cloned.

  2. Run cd <repo_name>. Replacing the <repo_name> with the name of your cloned repository.

  3. In your cloned repository run the command git checkout -b "first_branch_<your_name>". This will create a new branch for you to work in.

    • You should see a message that says something like Switched to branch <branch_name>.

    • Branches are your own development area. When working on new code changes they should be in a branch. Once the code is finished up it can be merged into the main branch to become part of the core project.

  4. Each repository should be created with a README file. These files are ongoing documentation for how to interact with the code in the repository. In this case we are going to make a change to the README file and merge it into the main branch.

  5. To add your name to the README file, follow the steps below:

    • Run the command vi README.md.

    • Hit i on your keyboard to enter insert mode.

    • Using the arrow keys and enter, navigate to a new line and type your first and last name.

    • Once your name is typed, hit the escape key to exit interactive mode.

    • Finally, type :wq and then enter to save the changes and exit.

  6. Now that we have made a change in our branch, we can push it to make it public to others.

    • Run git add . in the terminal to stage all your changes.

    • Run git commit -m "Adding my name.".

    • Run git push to push the changes.

      If you get an error that looks like:

      fatal: The current branch tdm_test has no upstream branch.
      To push the current branch and set the remote as upstream, use
      
          git push --set-upstream origin tdm_test

      Copy the last line into your terminal and run it.

      For example, in the error above I would copy git push --set-upstream origin tdm_test and run the command.

  7. For the last set of steps, we can navigate back to the website for our GitHub repository.

  8. On the website just under the name of the repository we can see a branches term with the number of branches listed next to it.

  9. If we click on branches, we can see the different branches that are active for the repository. Including one with the same name that we created above. We can also see a button on the right-side that says New Pull Request. Click that button for the branch that you created.

  10. At the bottom of this screen, we can see the changes that we made in our files. We can also add comments regarding the code changes at the top of the request. Add a few comments about the code you changed and why you changed it and then click Create Pull Request.

    • Many times, you’ll hear pull requests referred to as a PR.

    • It’s good to add a bit of detail in your PR comments so that others can easily know what the PR contains.

  11. This will bring up the final screen which is your PR. If everything looks good, you can click the Merge pull request button at the bottom of the screen.

    • It’s a good idea to have other team members or your TA review your code changes.

    • You can you use the comments settings or the PR settings to add potential reviewers or notes.

    • Merging the pull request will make the code part of the main branch, which is the core of the code repository.

      Sometimes you will see that the branch has conflicts. This means that there is other code that has been added to the repository that is different from what you are adding.

      It can be helpful to review the GitHub documentation on merge conflicts for help.

  12. After your branch is merged into main it will automatically be included as part of the core files for the repository.

    • In this case you should see your name appear at the top of the repository.

Using GitHub for the Project

You did it! You’ve now cloned a repo, made a branch, and merged your first change.

Now how do we use this going forward?

  1. Create a branch for the things that you are working on.

  2. Once your code is ready, review the changes with a teammate and then merge your changes into main.

  3. Other people can also work on the same branch if you are collaborating with a teammate.

  4. The goal at the end of the year is to have all your code and documentation in the main branch of the repository.

Video Resources

To help with the instructions, The Data Mine team created the videos below for the SSH key and GitHub process.

The videos follow the same set of written instructions above.

SSH Keys

GitHub on Anvil