Git Commands (Every day use)

Here are the list of git commands that I use every day:

  • Clone remote repository
  • list the branches in test-repo repository.
    • git branch -vv
  • Switch between branches
    • git checkout milestone1
    • git checkout milestone2
  • Create a new branch work1 from milestone1
    • git checkout milestone1
    • git branch work1
    • git checkout work1
    • You can combine above two commands with “git checkout -b work1”
  • Push new branch work1 to repository test-repo
    • git push origin
  • Merging branches (merge milestone2 code to work1)
    • git checkout milestone2
    • git pull (git fetch & git merge origin/milestone2)
    • git checkout work1
    • git pull
    • git merge –no-ff milestone2
    • git push origin work1
  • push a modification to file1 to work1 branch
    • git chekcout work1
    • git pull
    • git add file1
    • git commit -m “message” (or git commit -a -m “message” to skip staging area)
    • git push origin work1
  • Delete local branch
    • git branch -d work1 (to delete tracking branch)
    • git push origin  :work1 (to delete branch on remote repository)
  • You have done some changes in your local tracking branch, but you want to work on different branch and come back. If you have new file, then add it to the index.
    • git add newfile
    • Do changes in current indexed file file1.
    • git stash
    • git checkout master
    • do some work on master branch
    • git checkout work1
    • git stash pop (will pop both file1 and newfiles changes on top of work1 branch)
  • See the history of commits
    • git log
    • git log -p -1
    • git log –stat
  • Code differences
    • git diff origin/master (local uncommitted changes with remote tracking branch)
    • git diff master (local uncommitted changes with tracking branch, where you commits your changes)
    • git diff COMMIT^ COMMIT (compare commit with ancestor commit of tracking branch)
    • git diff –staged (staged changes with tracking branch recent commit)
    • git diff (not committed)
    • git diff HEAD (upstaged changes with recent commit)
    • git show (to see the commited changes, difference with parent commit)
    • git diff milestone1..milestone2 (difference between two branches)
  • Status of your changes
    • git status
  • Resetting local committed changes
    • git reset <commit-id> (HEAD is moved to the specified commit-id)
  • Resetting local staged changes
    • git rm –cached
  • Discard changes in current working directory. (not staged)
    • git checkout — file1
  • Bringing changes to exiting commit
    • git add file2
    • git commit –amend (will add file2 change into same commit, not pushed)
  • Generate patch from commit 
    • git format-patch -n1
    • git am <p (apply a specific patch)

We need to understand some theory behind how git works. Some common terms we come across while working with git:

Repository, Remote tracking branch,  Tracking branch, Local branch

Repository – A set of branches related to project. (git clone <repository link>)

Remote tracking branch –

Remote-tracking branches are references to the state of remote branches. They’re local references that you can’t move; they’re moved automatically for you whenever you do any network communication. Remote-tracking branches act as bookmarks to remind you where the branches in your remote repositories were the last time you connected to them.

On my computer origin/master is the remote tracking branch which refers to the branch on hosting computer. When I do git pull, git push,  the git command uses origin/master info to talk to hosting computer(github.com).

Tracking branch /Local branch – This is the branch on my computer, which will have all the commits I have done but not yet pushed to the remote branch.

git branch -vv
  ganga     8e22413 [origin/ganga: ahead 2] integration of allocation logic
  master    4ae3a25 [origin/master] db change
* bug1 c8676f9 [dev/bugs: ahead 4, behind 2] fix for illegal option
  fet1   62a363b container changes

In above example,  Local branch or tracking branches are : ganga, master, bug1 and testing

Remote tracking branches : origin/ganga, origin/master, dev/bugs,

Tracking branches are local branches that have a direct relationship to a remote branch. If you’re on a tracking branch and type git pull, Git automatically knows which server to fetch from and branch to merge into

Git index (aka staging area/cache/directory cache/staged files) – Changes that are added for commit.

 

Advertisements

Good string programming puzzle

Problem Definition:

You are challenged to write an algorithm to check if a given string, s, can be formed from two other strings, part1 and part2.

The restriction is that the characters in part1 and part2 are in the same order as in s.

Test Cases:

  1. s = “Bananas from Bahamas” ;  part1 =”Bahas” ; part2 =” Bananas from am”
  2. s =”Can we merge it? Yes, we can!”; part1=” an mrgei? e,we”; part2=”C wee tYs can!”
  3. s=”Making progress; part1=”Mak pross”; part2=”inggre”
  4. s=’codewars’; part1=’code’; part2=’wars’
  5. s=’codewars’; part1=’cdw’; part2=’oears’

 

Logic:

  1. Start matching character by character, there are two possibilities to start
    1. ‘s’ with ‘part1’
    2. ‘s’ with ‘part2’
  2. ‘s’ with part1:  Start matching string ‘s’ in part1 as long it matches and then move to part2 as long as it matches there. come back to part1 and try to match as much it matches then move to part2 … till the end of string ‘s’. If we can’t make progress then return False.
  3. ‘s’ with part2 :  If step2 fails, then start matching ‘s’ with part2…
  4. It is not just two possibilities, for every character match there are two possibilities, so we need to handle all those cases.
  5. For example:
    1. s = “Bananas from Bahamas” ;  part1 =”Bahas” ; part2 =” Bananas from am”
  6. For first character ‘B’ from s,  two possibilities s[0] == p1[0], s[0] == p2[0].  ‘s’ matches with ‘p1’  till “Ba”. The  rest of ‘s’ (“nanas from Bahamas”) does not match with p2 (“Bananas from am”) . so we have try with second possibility, s[0] == p2[0], this matches till “Bananas from ” in p2.  The rest of ‘s’ (“Bahamas”)  does not match with p2 (“am”). Then try to match “Bahamas” with p1 (“Bahas”), p2(“am”).
  7.    s=”Bahamas”, p1=”Bahas”, p2=”am”.  comparisons should go like this:
          'B' in s and 'B' in p1 (1)
               'a' in s and 'a' in p1   (2)
                  'h' in s and 'h' in p1  (3)
                      'a' in s and 'a' in p1 (False) 
                         'm' in s and 's' in p1 (False) 
                         'm' in s and 'a' in p2 (False) 
                      'a' in s and 'a' in p2 (True)  (4)
                         'm' in s and 'a' in p1 (False)
                         'm' in s and 'm' in p2  
                            'a' in s and 'a' in p1
                               's' in s and 's' in p1
                                  '' in s and '' in p1 (True) (5)
                                  

True value is bubbles up to (1) . In the order of 5,4,3,2,1

Python code:

Recursive solution:

def is_merge(s, part1, part2):
  if not part1:
     return s == part2
  if not part2:
     return s == part1
  if not s:
    return part1 + part2 == ''
  if s[0] == part1[0] and is_merge(s[1:], part1[1:], part2):
    return True
  if s[0] == part2[0] and is_merge(s[1:], part1, part2[1:]):
    return True
  return False

Iterative Solution:

def is_merge(s, part1, part2):
    queue = [(s,part1,part2)]
    while queue:
        str, p1, p2 = queue.pop() 
        if str:
           if p1 and str[0] == p1[0]:
              queue.append((str[1:], p1[1:], p2))
           if p2 and str[0] == p2[0]:
              queue.append((str[1:], p1, p2[1:]))
        else:
           if not p1 and not p2:
              return True
    return False

Aplication Architectures using Google cloud

The below architecture diagrams helps in understanding deployment architectures for different use cases.  I want to have a quick pointer to look at these diagrams from my blog, instead of searching for them every time.

Web applications:  https://cloud.google.com/solutions/architecture/webapp

Digital asset management and sharing:  https://cloud.google.com/solutions/architecture/digitalassets

Content Management : https://cloud.google.com/solutions/architecture/contentmanagement

High Performance Computing:  https://cloud.google.com/solutions/architecture/highperformancecomputing

IOT: https://cloud.google.com/solutions/architecture/streamprocessing

Mobile Apps and Games:  https://cloud.google.com/solutions/architecture/mobileandgames

CAP theorem

CAP theorem states that it is impossible for a distributed system to simultaneously provide all three guarantees of

  • Consistency: All nodes see the same data at the same time.
  • Availability: a guarantee that every request receives a response about whether it was successful or failed.

Availability in CAP is defined as “every request received by a non-failing [database] node in the system must result in a [non-error] response”

  • Partition tolerance: the system continues to operate despite arbitrary message loss or failure of part of the system.

A system is partition tolerant if processing can continue in both partitions in the case of a network failure

http://robertgreiner.com/2014/08/cap-theorem-revisited/

In above article, Robert Greiner, neatly explained the CAP with diagrams.

“Given that networks aren’t completely reliable, you must tolerate partitions in a distributed system, period. Fortunately, though, you get to choose what to do when a partition does occur. According to the CAP theorem, this means we are left with two options: Consistency and Availability.”

CP – Consistency/Partition Tolerance – Wait for a response from the partitioned node which could result in a timeout error. The system can also choose to return an error, depending on the scenario you desire. Choose Consistency over Availability when your business requirements dictate atomic reads and writes. ”

Some distributed systems prefer CP and in case of partitions, cluster with quorum continue to operate. other part of the cluster is either dormant/non-operational/shut down.  All the IOs are redirected to the cluster with quorum.

But there can be more than one partition in the cluster and based on the quorum policies (>50% nodes), quorum can not be established. So any IO on distributed system mail fail or it will be put into read-only mode. Availability is sacrificed in this case.

“AP – Availability/Partition Tolerance – Return the most recent version of the data you have, which could be stale. This system state will also accept writes that can be processed later when the partition is resolved. Choose Availability over Consistency when your business requirements allow for some flexibility around when the data in the system synchronizes. Availability is also a compelling option when the system needs to continue to function in spite of external errors (shopping carts, etc.)

I would highly recommend to read this IEEE article on “Consistency Tradoffs in Modern Distributed Database System Design”

http://cs-www.cs.yale.edu/homes/dna/papers/abadi-pacelc.pdf

PS: 1. This is my understanding. The information provided here may not be accurate or immature. Please comment if you find any misleading information.

2. All the words in the ” ” are not my wordings. They were part of the respective websites mentioned above.

Python, Java and C run time.

C

  • gcc compiles your c code and generates an executable code. This code has machine instructions, means CPU can understand these instructions. Kernel has to load the executable code into memory and point CPU to starting instruction.

Java

  • java converts the java code into byte code and this byte code is executed on JVM (written in C).
  • JVM understands byte code and perform actions based on byte code instructions.

Python

  • Python compiles your code and generates a byte code.  This is not a machine code, it is python representation. This byte code is executed on Python virtual machine (PVM). PVM is the run time environment of python, it is installed as part of your python installation. This process is hidden to users who are executing .py file with python binary. Python also generates .pyc (compiled python) files  We can ship the .pyc files as your product. It might be easy to decode the python source from pyc.  python can not be used to delivery a proprietary product where they don’t want to expose alogs or logics involved.  There is a concept of frozen binaries, which packages .pyc, PVM and some support files and provide a .exe file. This is how python applications are shipped. You no need to install Python on your machine.
  • To just compile python code and create .pyc files,
python -m py_compile script.py
  • Since object code is not a machine code, Byte code instruction execution takes more time because they are not executed directly on CPU. PVM needs to understand these byte code instructions and take action.
  • There is Jpython, which converts your python program to Java byte code and run these java byte code on JVM.

Reference:

How Python Runs Programs

Choosing web framework – Django

After going through this website: https://en.wikipedia.org/wiki/Comparison_of_web_application_frameworks

I felt, man! many languages and each language has several web frameworks. This will add more confusion.  After spending 12 years with C language, I wanted to learn some high level language where applications are developed using extensively available libraries (packages/modules). I choose Python because of its wide spread usage in enterprise environments.

Python supports below frameworks. Now we have full of choices.

Project Current stable version Release date License
Bottle 0.12.8 2014-12-28[31] MIT
BlueBream 1.0 (dormant) 2011-01-18 ZPL
CherryPy 3.7.0 2015-04-24[32] BSD
CubicWeb 3.20.7 2015-04-22[33] LGPL
Django 1.8.6 2015-11-04[34] BSD
Flask 0.10.1 2013-06-14[35] BSD
Grok 2.8 (dormant) 2013-02-14[36] ZPL
Nagare 0.4.1 (dormant) 2012-01-18 BSD
Pyjs 0.8.1a (dormant) 2012-05-06 Apache
Pylons 1.0.1 (dormant) 2012-08-14 BSD
Pyramid 1.5.7 2015-04-28 BSD
TACTIC 4.3.0.v02 2015-03-31[37] EPL
Tornado 4.2 2015-05-26[38] Apache
TurboGears 2.3.5 2015-04-28[39] MIT, LGPL
web2py 2.11.2 2015-05-30[40] LGPL3
Webware 1.1.1 (dormant) 2013-01-18 Python
Zope 2 2.13.23 2015-06-29[41] ZPL

I want to choose one option out of these. After interacting with some experienced engineers in these technology and doing some study in the internet.  I came to an understanding, Flask is easy to start with if we want an application built for small scale.

“So for Small applications with simpler requirements go with FLASK. Pyramid and Django are both aimed at larger applications, but take different approaches to extensibility and flexibility. Pyramid targets flexibility and lets the developer use the right tools for their project. This means the developer can choose the database, URL structure, templating style, and more. Django aims to include all the batteries a web application will need so developers need only to open the box and start working, pulling in Django’s many modules as they go.”

“Django includes an ORM out of the box, while Pyramid and Flask leave it to the developer to choose how (or if) they want their data stored. The most popular ORM for non-Django web applications is SQLAlchemy by far, but there are plenty of other options from DynamoDB and MongoDB to simple local persistence like LevelDB or plain SQLite. Pyramid is designed to use any persistence layer, even yet-to-be-invented ones.”

“Django’s ‘batteries included’ approach makes it easy for developers who know Python already to dive in to web applications quickly without needing to make a lot of decisions about their application’s infrastructure ahead of time. Django has templating, forms, routing, authentication, basic database administration, and more built in. In contrast, Pyramid includes routing and authentication, but templating and database administration require external libraries.”

“The extra work up front to choose components for Flask and Pyramid apps yields more flexibility for developers whose use case doesn’t fit a standard ORM, or who need to interoperate with different workflows or templating systems.”

Since Django has all built in a box, as a beginner I decided to go with django. The wikipedia link compares web frameworks with following parameters:

Ajax

MVC framework

MVC push-pull

i8n & L10n

ORM

Testing framework

DB migration framework

Security framework

Template framework

Caching framework

Form Validation framwork

Will talk more about these capabilities in further blogs.

PS: Para’s with ” ” are not my wordings. I used these from https://www.airpair.com/python/posts/django-flask-pyramid