How to build a reproducible analytic pipeline on a Linux cluster using GitHub

GitHub is a code hosting platform for version control and collaboration. It lets you and others work together on projects from anywhere.

Here is a brief tutorial on how to set up a reproducible analytic pipeline on a Linux cluster using GitHub.

First, create a GitHub account. After signing in, you will see your GitHub dashboard.

1. Click the “New” button

This opens the “Create a new repository” page.

I entered the Repository name as “HelloWorldExample” and chose “Add a README file”.

2. Upload your code (here, HelloWorldExample.py) to the repository

Then click the “Code” button, choose SSH, and copy the URL “git@github.com:zhaoyuqi616/HelloWorldExample.git”.

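3. Log in to the Linux cluster and clone the repository. Cloning over SSH only works if an SSH key from the cluster has been added to your GitHub account.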

git clone git@github.com:zhaoyuqi616/HelloWorldExample.git

Change the directory:

cd HelloWorldExample/

Execute the code:

python HelloWorldExample.py
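The contents of HelloWorldExample.py are not shown in this post. If you want something to test the steps with, here is a minimal sketch of what such a script could look like. The `greeting` function and its message are my own placeholders, not the original file.

```python
# HelloWorldExample.py (hypothetical contents; the original script is not shown)


def greeting(name: str = "World") -> str:
    """Return the message that the script prints."""
    return f"Hello, {name}!"


if __name__ == "__main__":
    # Running "python HelloWorldExample.py" prints the default greeting.
    print(greeting())
```

Keeping the logic in a function, rather than in top-level statements, makes it easier to check that a modified version still behaves as expected before you commit it.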

4.1 Modify the code to suit your project. Then save that version on its own branch, here “Project1”, so the exact code used for the project can be retrieved later.

# Check the current branch
git branch
# You will see "main"
# Then create a new branch
git checkout -b Project1
# Git reports: Switched to a new branch 'Project1'
# Stage the modified script
git add HelloWorldExample.py
# Commit the staged change
git commit -m "Modify code for New Project"
# Publish the branch to GitHub
git push origin Project1
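A branch only helps reproducibility if you know which commit produced a given result. As a rough sketch, a pipeline script could record that information at run time. The helper names `run_git` and `provenance` are hypothetical, and the fields collected are just one reasonable choice. The git subcommands used (`rev-parse` and `status --porcelain`) are standard.

```python
import platform
import subprocess


def run_git(args, cwd="."):
    """Run `git <args>` in cwd and return its output with surrounding whitespace removed."""
    return subprocess.check_output(["git", *args], cwd=cwd, text=True).strip()


def provenance(cwd=".", git=run_git):
    """Collect the details needed to trace a result back to its code."""
    return {
        "commit": git(["rev-parse", "HEAD"], cwd),
        "branch": git(["rev-parse", "--abbrev-ref", "HEAD"], cwd),
        # Any output from `git status --porcelain` means uncommitted changes.
        "uncommitted_changes": bool(git(["status", "--porcelain"], cwd)),
        "python": platform.python_version(),
    }
```

Calling `provenance()` inside the cloned HelloWorldExample directory and saving the returned dictionary next to your output files (for example with `json.dump`) ties each result to a specific commit. The `git` parameter is only there so the function can be tried out without a real repository.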

4.2 Switch to your GitHub repository in the browser. The new branch “Project1” now appears in the branch list.
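Once the branch is on GitHub, you or a collaborator can fetch exactly that version on another machine. The sketch below builds the corresponding `git clone` command. The function names are hypothetical, while `--branch` and `--single-branch` are standard `git clone` options.

```python
import subprocess


def clone_branch_command(url, branch, dest=None):
    """Build the git command that clones only the given branch."""
    cmd = ["git", "clone", "--branch", branch, "--single-branch", url]
    if dest is not None:
        # An optional target directory goes last, as on the command line.
        cmd.append(dest)
    return cmd


def clone_branch(url, branch, dest=None):
    """Run the clone and raise CalledProcessError if git fails."""
    subprocess.run(clone_branch_command(url, branch, dest), check=True)
```

For this tutorial, the call would be `clone_branch("git@github.com:zhaoyuqi616/HelloWorldExample.git", "Project1")`, which is equivalent to typing the same `git clone` command in the shell.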

It is very simple, right?
