SSD Performance Workloads Template

Workload on block device                                      Unit

MAX Read Bandwidth (128K random)                              MB/s
MAX Write Bandwidth (64K sequential)                          MB/s
MAX Random Read IOPS (512B)                                   kIOPS
MAX Random Read IOPS (4K)                                     kIOPS
1 Thread Read Latency (4K random)                             usec
1 Thread Write Latency (512B sequential)                      usec
MAX Sustained Random Write IOPS (4K)                          kIOPS
MAX Sustained Random 75:25 R:W IOPS (4K)                      kIOPS
Read Latency (8K random, 1QD)                                 usec
Write Latency (8K random, 1QD)                                usec
Write Latency (512B sequential, post random conditioning)     usec
Random Read IOPS (8K, 256QD)                                  kIOPS
Random Write IOPS (8K, 256QD)                                 kIOPS
Sustained Random 75:25 R:W IOPS (8K, 256QD)                   kIOPS
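
As a hedged illustration, one of these rows, say MAX Random Read IOPS (4K), could be measured with an fio job along these lines; the device path, queue depth, job count, and runtime are placeholder assumptions to be tuned per drive, and the target must be a scratch device:

# Sketch only: 4K random read IOPS on a raw block device.
# /dev/nvme0n1, iodepth, numjobs and runtime are assumptions to adjust.
fio --name=randread-4k --filename=/dev/nvme0n1 \
    --direct=1 --ioengine=libaio --rw=randread --bs=4k \
    --iodepth=32 --numjobs=4 --runtime=300 --time_based \
    --group_reporting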

SAFe Product Owner role

Recently our company adopted the SAFe agile framework for our BU, and I was asked to take on the Product Owner role in addition to my people-management job. Some changes I observed:

Before PO role:
————————

  • I was doing coding.
  • I was doing code reviews.
  • I was participating in design discussions.
  • I was identifying the list of tasks needed to deliver a component.
  • I was assigning the tasks to the team.
  • I was taking ownership of some of the tasks, depending on the urgency and delivery times.
  • I was mentoring team members on technology and career-related areas.
  • I was spending time with the team debugging issues. I enjoyed this the most; it was my core competency.
  • I was managing people (I have five people reporting to me).

With PO role:
——————–

  • I am understanding the PM's user stories and discussing them with the System Architect.
  • I am identifying the user stories for the scrum team.
  • I am writing clearly defined acceptance criteria. This is the most time-consuming task and the critical one; it is key to product success.
  • I am discussing the defined user stories with the team and splitting them into smaller user stories.
  • I am answering the team's questions related to the user story (mainly the "What" part, and some of the "How" part as well). I need to know the "How" for some user stories, for example the communication between two components: I should not care whether it is HTTP/GRPC/S3, but if the team chooses to go with S3 for a user story, I need the domain knowledge to write the acceptance criteria.
  • I am attending the ceremonies: backlog grooming, sprint planning, design discussions, and sprint reviews.

I see there is a need for separate PM and PO roles. The task sharing between these two roles looks roughly like this:

Role            | Responsibility
Product manager | Tracks the overall market and competitive dynamics.
Product manager | Manages the long-term roadmap, involving sales, marketing, implementations, clients, prospects, partners, and other groups. The ideas are expressed as epics.
Product manager | Attends iteration demos and some stand-ups.
Product manager | Supports other non-technical organizations (such as sales, marketing, and channel).
Product owner   | Leads the requirements-gathering effort on the epics as needed, consulting with product management, implementations, clients, and other stakeholders.
Product owner   | Documents story details based on the epics and after review with development.
Product owner   | Attends scrum meetings, including stand-ups, retrospectives, and demos.
Product owner   | Leads backlog grooming to decompose and estimate stories.
Product owner   | Creates mockups and works with UX on design.
Product owner   | Answers questions from developers, clarifies requirements, etc.
Product owner   | Documents the new feature for implementations and release notes.
Both            | Write acceptance criteria.
Both            | Demonstrate the latest iteration to customers (pre-release) and gather feedback.

POC vs Product

POC

  • Focus is on functionality, algorithms, and data structures; no need to handle all error cases.
  • Proving performance.
  • Reliability only to the extent of not hitting too many issues/bugs during performance tests and basic functionality tests.
  • Test coverage is limited to end-to-end testing.
  • No need for unit testing of every function/class.
  • No integration with the management layer (UI).
  • Have automation to reduce setup-related tasks (use Ansible playbooks, shell scripts, or Python scripts).
  • Have a clear document on what functionality is skipped and what needs to be taken care of in the productization phase (requirements, features, and functionality).

Productization

  • All error handling.
  • Log messages and log file management.
  • Stats to get insights into the product layers (needed for debugging).
  • Stats exposed to the customer.
  • Management (tools or UI) layer integration.
  • Class/function-level unit testing.
  • Sub-component-level testing; make sure there are no race conditions or memory leaks.
  • Integration testing (end-to-end testing).
  • Performance-related tests: stress and load testing.
  • All possible functionality-related testing at each level, both positive and negative tests. Ask developers to build the infrastructure to induce errors.
  • Tools/utilities to manage and monitor the product.
  • Call Home feature, where events are sent from the customer's machine to the servicing company.
  • Notifying users on specific events such as failures/warnings.
  • Testing on real hardware.

Code Review Guidelines

In most cases I have seen, engineers do not review efficiently because it is not really their code (it is someone else's) or they have no time for it. The mindset is very important during code review.

Importance of Code Review:

  • It saves a lot of time in fixing bugs, so that we can release the product on time.
  • It is the time when engineers learn and teach effective programming practices, language semantics/constructs, new libraries, and algorithms.
  • If an engineer finds bugs during code review, it boosts their confidence. It is a clear sign that the engineer is mature, technically strong, and experienced.
  • During the review, a whole lot of implementation details are discussed with the owner of the code; for the other members it is a quick way to learn what their teammates are doing and what is going into the product.
  • You can learn a new language quickly if you are new to it.

The owner of the code needs to get it reviewed by at least two colleagues. The reviewers should keep the following in mind while reviewing:

  • Functionality is implemented as discussed.
  • Failure cases are handled throughout the code.
  • There are enough comments in the code.
  • Check for proper copyright headers in the code.
  • Does the code scale for future requirements?
  • The product does not break during upgrade and downgrade scenarios.
  • Does it have all the stats covered for debuggability?
  • Logging is sufficient to pinpoint an issue in the code.
  • The code adheres to the language's coding guidelines.
  • Performance will not be degraded.
  • Memory allocations and frees are handled properly.
  • If the implementation involves multiple threads, make sure the code does not drive CPUs to 100% utilization. Also look for race conditions.
  • If you think the new code is too risky, ask for more unit/component test coverage.
  • Check whether security is compromised by the new code.
  • Make sure developers are not copying code from restrictively licensed public sources, such as GPL-licensed code.
  • See that timeouts are incorporated in the IO path; an IO should never hang.
  • If the developer is using open-source libraries, ask how stable they are.

Makefile example

The make utility automatically determines which pieces of a large program need to be recompiled, and issues commands to recompile them.

Most engineers who joined projects midway, or who support products that are in the maintenance phase, have little visibility into Makefiles; for them, Makefiles are of least concern. I was in this situation in my previous job. I have been in software development for almost five years, and I would like to present the Makefile basics I learnt while creating new projects, with simple examples.

In any Makefile, we mainly see these sections: variables and targets.

Variable syntax:

NAME = value

Target syntax:

target … : prerequisites
        recipe
        …

A target is usually the name of a file that is generated by a program.

A prerequisite is a file that is used as input to create the target.

A recipe is an action that make carries out. A recipe may have more than one command, either on the same line or each on its own line.
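
A minimal, hypothetical example that ties the three terms together (file names are illustrative, and note that the recipe line must start with a tab):

CC = gcc
CFLAGS = -Wall

# "hello" is the target, "hello.c" is its prerequisite,
# and the indented compiler command is the recipe.
hello: hello.c
	$(CC) $(CFLAGS) -o hello hello.c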

By default, Makefile targets are “file targets” – they are used to build files from other files. Make assumes its target is a file, and this makes writing Makefiles relatively easy.

However, sometimes you want your Makefile to run commands that do not represent physical files in the file system. Good examples of this are the common targets "clean" and "all". These special targets are called phony, and you can explicitly tell Make they are not associated with files. In terms of Make, a phony target is simply a target that is always out of date, so whenever you run make clean, its recipe will run, independent of the state of the file system:

.PHONY: clean all product-daemon subproduct-daemon iotest framework metastore

Below is an example of the Makefile at the top directory of a product:

TARGET=production

include makefiles/$(TARGET).make
export CC
export CC_PERMISIVE
export CXX
export CXX_PERMISIVE
export OPTFLAGS
export CXXFLAGS
export LDFLAGS

PROTOC = protoc
GRPC_CPP_PLUGIN = grpc_cpp_plugin
GRPC_PYTHON_PLUGIN = grpc_python_plugin
GRPC_CPP_PLUGIN_PATH ?= `which $(GRPC_CPP_PLUGIN)`
GRPC_PYTHON_PLUGIN_PATH ?= `which $(GRPC_PYTHON_PLUGIN)`

PROTOS_PATH = ./protos
PROTOS_OUTPUT_CPP = ./protos/src
PROTOS_OUTPUT_PYTHON = ./build/python

PROTO_FILES = product_daemon_rpc.proto \
 subproduct_daemon_rpc.proto \
 metadb_kv.proto

PROTO_GEN_HEADERS = $(patsubst %,$(PROTOS_OUTPUT_CPP)/%, $(PROTO_FILES:.proto=.pb.h) $(PROTO_FILES:.proto=.grpc.pb.h))
PROTO_GEN_SOURCES = $(patsubst %,$(PROTOS_OUTPUT_CPP)/%, $(PROTO_FILES:.proto=.pb.cc) $(PROTO_FILES:.proto=.grpc.pb.cc))

all: common/common.a framework/framework.a meta-store/metastore.a product-daemon subproduct-daemon iotest

common/common.a:
 cd common && $(MAKE)

framework/framework.a:
 cd framework && $(MAKE)

meta-store/metastore.a: $(PROTO_GEN_HEADERS) $(PROTO_GEN_SOURCES)
 cd meta-store && $(MAKE) STATS=1

clean:
 cd common && $(MAKE) clean
 cd framework && $(MAKE) clean
 cd meta-store && $(MAKE) clean
 cd subproduct-daemon && $(MAKE) clean
 cd product-daemon && $(MAKE) clean
 cd utils/iotest && $(MAKE) clean
 rm -f $(PROTOS_OUTPUT_CPP)/*.cc
 rm -f $(PROTOS_OUTPUT_CPP)/*.h
 rm -f $(PROTOS_OUTPUT_CPP)/*.d
 rm -f $(PROTOS_OUTPUT_PYTHON)/*.py

$(PROTOS_OUTPUT_CPP)/%.pb.h $(PROTOS_OUTPUT_CPP)/%.pb.cc: $(PROTOS_PATH)/%.proto
 $(PROTOC) -I $(PROTOS_PATH) --cpp_out=$(PROTOS_OUTPUT_CPP) $<

$(PROTOS_OUTPUT_CPP)/%.grpc.pb.h $(PROTOS_OUTPUT_CPP)/%.grpc.pb.cc: $(PROTOS_PATH)/%.proto
 $(PROTOC) -I $(PROTOS_PATH) --grpc_out=$(PROTOS_OUTPUT_CPP) --plugin=protoc-gen-grpc=$(GRPC_CPP_PLUGIN_PATH) $<

product-daemon: $(PROTO_GEN_HEADERS) $(PROTO_GEN_SOURCES) common/common.a framework/framework.a meta-store/metastore.a
 cd product-daemon && $(MAKE)

subproduct-daemon: $(PROTO_GEN_HEADERS) $(PROTO_GEN_SOURCES) common/common.a framework/framework.a
 cd subproduct-daemon && $(MAKE)

iotest:
 cd utils/iotest && $(MAKE)

framework: framework/framework.a

metastore: meta-store/metastore.a

.PHONY: clean all product-daemon subproduct-daemon iotest framework metastore

The included makefiles/production.make contains the compiler and optimization settings:

$ cat makefiles/production.make
CC=gcc
CC_PERMISIVE=gcc
CXX=g++
CXX_PERMISIVE=g++

OPTFLAGS=-g3 -ggdb -O3 -DNDEBUG

Let us walk through the interesting pieces of this Makefile:

  • include makefiles/$(TARGET).make

    The include directive tells make to suspend reading the current makefile and read one or more other makefiles before continuing.

  • export CC 
    export CC_PERMISIVE 
    export CXX
    export CXX_PERMISIVE
    export OPTFLAGS
    export CXXFLAGS
    export LDFLAGS

    When make runs a recipe, variables defined in the makefile are placed into the environment of each shell. This allows you to pass values to sub-make invocations. By default, only variables that came from the environment or the command line are passed to recursive invocations. You can use the export directive to pass other variables. To pass down, or export, a variable, make adds the variable and its value to the environment for running each line of the recipe. The sub-make, in turn, uses the environment to initialize its table of variable values.

  • GRPC_CPP_PLUGIN_PATH ?= `which $(GRPC_CPP_PLUGIN)`

This is called a conditional variable assignment operator, because it only has an effect if the variable is not yet defined. This is equivalent to

ifeq ($(origin GRPC_CPP_PLUGIN_PATH), undefined)
  GRPC_CPP_PLUGIN_PATH = `which $(GRPC_CPP_PLUGIN)`
endif

  • PROTO_GEN_HEADERS = $(patsubst %,$(PROTOS_OUTPUT_CPP)/%,
     $(PROTO_FILES:.proto=.pb.h) $(PROTO_FILES:.proto=.grpc.pb.h))

A substitution reference substitutes the value of a variable with alterations that you specify. It has the form ‘$(var:a=b)’ (or ‘${var:a=b}’) and its meaning is to take the value of the variable var, replace every a at the end of a word with b in that value, and substitute the resulting string.

For example:

foo := a.o b.o c.o
bar := $(foo:.o=.c)

sets ‘bar’ to ‘a.c b.c c.c’. This case is equivalent to ‘$(patsubst %.o,%.c,$(foo))’

  • meta-store/metastore.a: $(PROTO_GEN_HEADERS) $(PROTO_GEN_SOURCES)
     cd meta-store && $(MAKE) STATS=1

    For generating the target meta-store/metastore.a, make has to make sure the prerequisites $(PROTO_GEN_HEADERS) and $(PROTO_GEN_SOURCES) are available. The recipe is "cd meta-store && make STATS=1"; this sub-make gets all the exported variables in its environment.

  • $(PROTOS_OUTPUT_CPP)/%.pb.h $(PROTOS_OUTPUT_CPP)/%.pb.cc: $(PROTOS_PATH)/%.proto
     $(PROTOC) -I $(PROTOS_PATH) --cpp_out=$(PROTOS_OUTPUT_CPP) $<

    A rule with multiple targets is equivalent to writing many rules, each with one target, and all identical aside from that. The same recipe applies to all the targets. In the above case, for each matching target, make makes sure the corresponding prerequisite exists and runs the corresponding recipe.

    Targets:

    $(PROTOS_OUTPUT_CPP)/%.pb.h $(PROTOS_OUTPUT_CPP)/%.pb.cc

    Prerequisite:

    $(PROTOS_PATH)/%.proto

    Recipe:

    $(PROTOC) -I $(PROTOS_PATH) --cpp_out=$(PROTOS_OUTPUT_CPP) $<

    Make has some special variables called automatic variables. Their values are computed afresh for each rule that is executed, based on the target and prerequisites of the rule. The widely used ones are:

    • $@ The file name of the target of the rule.
    • $% The target member name.

Example1:

foolib(hack.o) : hack.o
        ar cr foolib hack.o

When the target is an archive member such as foo.a(bar.o), '$%' is bar.o and '$@' is foo.a. '$%' is empty when the target is not an archive member.

  • $< The name of the first prerequisite.
  • $^ The names of all the prerequisites.

Example2:

all: library.cpp main.cpp

In this case, inside the rule's recipe:

  • $@ evaluates to all
  • $< evaluates to library.cpp
  • $^ evaluates to library.cpp main.cpp

The common directory Makefile:

CXX?=g++
CC?=gcc
ASM?=g++
OPTFLAGS?=-g3 -ggdb -O0

PACKAGES=libconfig++ libglog
LIBS=-pthread -lgtest

CXXFLAGS+=-std=c++14 -MMD -Wall -I. -I../init -I/usr/local/include `pkg-config --cflags ${PACKAGES}` $(OPTFLAGS)
CFLAGS+=-MMD -Wall $(OPTFLAGS)
LDFLAGS+=-L/usr/local/lib `pkg-config --libs ${PACKAGES}` ${LIBS}

ASM_SRC=codeword_utils.o
CRC_SRC=crc.o

OBJECTDIR=build

ASM_OBJS=$(patsubst %,$(OBJECTDIR)/%,$(ASM_SRC))
CRC_OBJS=$(patsubst %,$(OBJECTDIR)/%,$(CRC_SRC))
VPATH+=.

SRC=$(CRC_OBJS) $(ASM_OBJS)

all: codeword_utils mpmcq

codeword_utils: $(CRC_OBJS) $(ASM_OBJS)
 ar rcs common.a $(SRC)

mpmcq:
 cd mpmcq && $(MAKE)

$(OBJECTDIR)/%.o: %.cpp
 $(CXX) $(CXXFLAGS) -c -o $@ $<

$(OBJECTDIR)/%.o: %.s
 $(ASM) $(CXXFLAGS) -c -o $@ $<

-include $(OBJECTDIR)/*.d

clean :
 rm -f $(OBJECTDIR)/*.o $(OBJECTDIR)/*.d common.a codeword_utils
 cd mpmcq && $(MAKE) clean

.PHONY: clean all codeword_utils mpmcq

Most of the above sections have already been discussed. The new ones are:

  • VPATH+=.

    The value of the make variable VPATH specifies a list of directories that make should search. Most often, the directories are expected to contain prerequisite files that are not in the current directory; however, make uses VPATH as a search list for both prerequisites and targets of rules.

  • -include $(OBJECTDIR)/*.d

    This acts like include in every way except that there is no error (not even a warning) if any of the filenames (or any prerequisites of any of the filenames) do not exist or cannot be remade. Here, the *.d files are the dependency files that the -MMD compiler flag generates; a short illustration follows.
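
A hypothetical build/crc.d, generated by g++ -MMD when compiling crc.cpp, would contain nothing more than a plain make rule (the exact header list is illustrative):

# build/crc.d, auto-generated by the compiler's -MMD option (illustrative)
build/crc.o: crc.cpp crc.h

Pulling such rules in with -include means that a change to crc.h triggers a rebuild of build/crc.o on the next make run, without anyone maintaining header dependencies by hand.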

meta-store makefile

CXX?=g++
CC?=gcc
CXX_PERMISIVE?=g++
OPTFLAGS?=-g3 -ggdb -O0

PACKAGES=libconfig++

CXXFLAGS+=-std=c++14 -MMD -Wall -I. -I/usr/local/include -Iapi -Istore -Icluster -Itransactions -Igen-cpp2 -I../protos/src `pkg-config --cflags ${PACKAGES}` $(OPTFLAGS)
CXXFLAGS+= -I transactions/meta_keys -Itransactions/sql -Itransactions/key-value

ifeq ($(STATS), 1)
CXXFLAGS+= -D_ENABLE_STATS_
endif

CFLAGS+=-MMD -Wall $(OPTFLAGS)

SRCPROTOS = metadb_kv.pb.o

VPATH+=api:store:cluster:transactions:tests:gen-cpp2
VPATH+=../protos/src
VPATH+=transactions/sql \
 transactions/key-value

SRC = meta1.o \
 meta2.o


THRIFT_CPP_FILES = $(wildcard gen-cpp2/*.cpp)
THRIFT_CPP_FILES := $(filter-out gen-cpp2/non-shared_constants.cpp, $(THRIFT_CPP_FILES))
SRC += $(patsubst gen-cpp2/%.cpp,%.o,$(THRIFT_CPP_FILES))

SRC += $(SRCPROTOS)

OBJECTDIR=build

OBJS=$(patsubst %,$(OBJECTDIR)/%,$(SRC))

all: db test

db: $(OBJS)
 ar rcs metastore.a $(OBJS)

test: db
 cd tests && $(MAKE)

$(OBJECTDIR)/%.o: %.cpp
 $(CXX) $(CXXFLAGS) -c -o $@ $<

$(OBJECTDIR)/%.o: %.cc
 $(CXX) $(CXXFLAGS) -c -o $@ $<

-include $(OBJECTDIR)/*.d

clean :
 rm -f $(OBJECTDIR)/*.o $(OBJECTDIR)/*.d
 rm -f metastore.a
 cd tests && $(MAKE) clean

.PHONY: clean all db test

  • THRIFT_CPP_FILES = $(wildcard gen-cpp2/*.cpp)

    Wildcard expansion does not happen when you simply define a variable. However, if you use the value of the variable in a target or prerequisite, wildcard expansion does take place there. To force the expansion inside a variable assignment, we use the wildcard function, as above; a short illustration follows.
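
A small, self-contained illustration of the difference, assuming a directory with a few .cpp files (the variable names are hypothetical):

# Without wildcard, the variable literally holds the string "*.cpp",
# so the substitution below produces the useless string "*.o".
PLAIN_SRCS = *.cpp
PLAIN_OBJS = $(PLAIN_SRCS:.cpp=.o)

# With the wildcard function, the glob is expanded at assignment time,
# so the substitution yields one .o name per real .cpp file.
GLOB_SRCS = $(wildcard *.cpp)
GLOB_OBJS = $(GLOB_SRCS:.cpp=.o)

show:
	@echo 'PLAIN_OBJS = $(PLAIN_OBJS)'
	@echo 'GLOB_OBJS  = $(GLOB_OBJS)'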

Race condition Scenario 1

Recently we hit an issue where the client program got stuck. The client is written in such a way that it sends requests to the server synchronously (it sends the next request only after receiving the acknowledgement for the current one). The issue happens intermittently, and that symptom suggests a race condition.

On the server side, there are two threads:

thread 1

  1. Submits a request to one of its internal submission queues.
  2. Increments the io_submitted value.

thread 2

  1. Picks up the item from the submission queue and issues an asynchronous IO using libaio.
    1. The libaio thread calls the callback function passed to it once the IO is done.
    2. As part of the callback, the aio thread enqueues the request into a completion queue.
  2. Picks up the item from the completion queue and increments the io_completed value.
  3. Then checks io_submitted == io_completed to decide whether to do the next set of tasks.
  4. After completing the next set of tasks, sends a response to the client.

The problem is that the client never receives the acknowledgment. Why?

There is a race: before thread 1 increments io_submitted, thread 2 increments io_completed and performs the comparison check. This can happen if thread 1 is scheduled out after queuing the request but before incrementing io_submitted.

A couple of solutions (a minimal sketch follows):

  1. Move the increment of io_submitted to before the request is put on the internal submission queue.
  2. Use a spin lock to protect io_submitted.
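
Below is a minimal C++ sketch of solution 1, assuming the two counters are shared atomics. Request, SubmissionQueue, and run_next_tasks_and_ack_client() are hypothetical stand-ins; the real server uses its own request objects, lock-protected queues, and libaio callbacks.

#include <atomic>
#include <cstdint>
#include <queue>

// Illustrative stand-ins for the real server types.
struct Request {};
using SubmissionQueue = std::queue<Request*>;

std::atomic<uint64_t> io_submitted{0};
std::atomic<uint64_t> io_completed{0};

void run_next_tasks_and_ack_client() { /* send the response to the client */ }

// Thread 1: count the IO *before* it becomes visible to thread 2.
void submit_request(Request* req, SubmissionQueue& sq) {
    io_submitted.fetch_add(1);   // increment first ...
    sq.push(req);                // ... then hand the request to thread 2
}

// Thread 2 / aio completion path.
void handle_completion() {
    io_completed.fetch_add(1);
    if (io_completed.load() == io_submitted.load()) {
        // Every queued request was counted before it was queued, so this
        // check can no longer fire ahead of thread 1's increment.
        run_next_tasks_and_ack_client();
    }
}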

 

 

Working with Docker

Docker is an orchestration layer on top of Linux containers. It creates a lightweight work environment, similar to BSD jails or Solaris zones. You need to install a set of packages on your Ubuntu OS:

https://docs.docker.com/engine/installation/linux/ubuntulinux/

1. Create a Docker image from a Dockerfile

docker build -t image1 .    (the Dockerfile is located in ./)

2. List Docker images

docker images

3. Make changes to an image and commit them

docker run -i -t --name guest image1:latest /bin/bash
apt-get install vim
docker stop guest
docker commit -m "message" -a "author" <container-id of guest> image1:latest

docker rm guest

docker run -i -t --name guest image1:latest /bin/bash

Check that vim is installed:

dpkg --get-selections | grep vim
4. Update an existing image

docker pull mysql
docker stop my-mysql-container
docker rm my-mysql-container
docker run --name=my-mysql-container --restart=always \
 -e MYSQL_ROOT_PASSWORD=mypwd -v /my/data/dir:/var/lib/mysql -d mysql

Your data should have been stored on the -v volume.

5. List containers

docker ps -a

6. Stop a container

docker stop guest

7. Start a container

docker start guest

8. Attach to a running container (you exited the session, but the container is still running)

docker attach guest

9. Remove a container

docker rm guest

10. Remove dangling Docker images

docker rmi $(docker images -f "dangling=true" -q)

 

In general, engineers use a script to start a container and share directories from the host machine. For example:

rundocker.sh

#!/bin/sh

mkdir -p $HOME/$1-shared

docker run --rm -t -i --name $1 -v $HOME/.gitconfig:/root/.gitconfig -e SSH_AUTH_SOCK=$SSH_AUTH_SOCK -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix:rw --privileged --net=host -v $HOME/.Xauthority:/root/.Xauthority -v $HOME/$1-shared:/root/shared image1:latest /bin/bash

Option descriptions:

--rm: remove the container when it exits.

--name: name of the container.

-i: keep STDIN open even if the container is not attached.

-t: allocate a pseudo-TTY.

-v: map a directory from the host into the container (sharing of host directories).
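
A hypothetical invocation, assuming the script above is saved as rundocker.sh and image1:latest exists locally:

chmod +x rundocker.sh
./rundocker.sh dev1     # creates $HOME/dev1-shared and starts a container named dev1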