Perspectives of Software Architecture

Several perspectives arise when architecting software. These views can be captured using Kruchten's 4+1 view model, which describes the perspectives we need to consider when architecting a system.

  • Logical view
    • The logical view focuses mostly on achieving the functional requirements of the system. The context is the services that should be provided to end users.  In practice, the logical view usually involves the objects of the system.  From these objects, you can create a UML class diagram that illustrates the logical view.
    • Tools: UML class diagrams, UML state diagrams.
  • Process view
    • Nonfunctional requirements:  performance and availability
    • Execution order of different objects
    • Concurrent and asynchronous behavior is also considered.
    • Tools: UML sequence diagrams (objects and their interactions), UML activity diagrams (processes and their activities, e.g., sending and receiving a message in an email application)
  • Development view
    • Hierarchical software structure
    • Programming languages, libraries, and toolsets: the details of software development and what is needed to support it.
    • Besides code, this includes management details such as scheduling, budgets, work assignments, and project management.
    • Tools: UML component diagrams
  • Physical view
    • The physical view handles how elements in the logical, process, and development views must be mapped to different nodes or hardware for running the system.
    • Tools: UML deployment diagrams, which represent how software pieces are deployed and executed on hardware.
  • Scenarios
    • Use cases and user tasks of the system.

Not all software architectures need to be documented using the 4+1 view model. If any of the views is thought to be useless, it can be omitted. For example, if the logical and development views are so similar that they might as well be the same, they can be described together.

Reference: https://en.wikipedia.org/wiki/4%2B1_architectural_view_model


Object-Oriented Design Process

You probably associate the term “object-oriented” with coding and software development. While that is true, the notion of being object-oriented can apply outside of the role of a developer. Object-oriented thinking involves examining the problems or concepts at hand, breaking them down into component parts, and thinking of those as objects.

It is good practice to prepare for object-oriented design by accustoming yourself to thinking about the world around you in terms of objects, and the attributes and behaviors those objects might have.

When software is developed, it generally goes through a process: in simple terms, the process takes a problem and creates a solution that involves software. An object-oriented design process is generally iterative. Each iteration takes a set of requirements based on the identified problem(s) and uses them to create conceptual design mock-ups and technical design diagrams, which are then used to create a working software implementation that must also pass testing.

Step 1: Collect requirements and understand the trade-offs.

Requirements are conditions or capabilities that must be implemented in a product, based on a client or user request. They are the starting point of a project: you must understand what your client wants. In addition to establishing the specific needs of the project, it is also important to establish the potential trade-offs the client may need to make in the solution. This is the process of understanding the customer's scope and vision, and also what the customer does not really mean or consider crucial. It usually involves writing a customer requirements document.

Step 2: Come up with conceptual designs and deepen the understanding of customer requirements.

Conceptual designs are created with an initial set of requirements as a basis. The conceptual design identifies the appropriate components, connections, and responsibilities of the software product; it outlines the more high-level concepts of the final product. Conceptual designs are expressed or communicated through conceptual mock-ups: visual notations that provide initial thoughts for how the requirements will be satisfied. Once you start to create a mock-up, you may more easily see which components are missing or may not work. Such flaws require further clarification with your client or additional conceptual design work.

Typical artifacts: wireframe mock-ups, web/mobile screens, and simple component diagrams with their responsibilities.

Step 3: Create the technical designs.

In the conceptual design, the major components and connections of the software being developed, as well as their associated responsibilities, are outlined. The technical design brings this information to the next stage: it aims to describe how these responsibilities are met. To accomplish this, technical designs begin by splitting components into smaller and smaller components that are specific enough to be designed in detail. By breaking components down further and further, each with specific responsibilities, you get to a level where you can do a detailed design of a particular component. The final result is that each component has its technical details specified. Technical diagrams are used to visualize how to address specific issues for each component; UML class diagrams and UML sequence and state diagrams are mainly used here.

These three steps are crucial before we start implementing the software product. The more time we spend refining the conceptual designs and continuing discussions with the customer, the fewer faults we can expect later. At the end of step 3, we need high confidence so that there will be fewer surprises; arguably half of a product's success depends on how well this phase is done.

There are many design techniques that may be used to get the most out of the design process, and several design quality attributes that need to be discussed and balanced during the design:

  • Trade-offs, for example between performance and security.
  • Context: corporate-user vs. regular-user profiles.
  • Consequences: small data vs. big data.
  • Functional requirements describe what the system or application is expected to do.
  • Non-functional requirements specify how well the system or application does what it does. Non-functional requirements to satisfy might include performance, resource usage, and efficiency; flexibility, reusability, and maintainability are other aspects.
  • Reviews and tests should also be used to verify that the required qualities are met.
  • Regular discussions and reviews with customers and peers.
  • Compromises:
    • Performance and maintainability – high-performance code may be less clear and less modular, making it harder to maintain. Similarly, extra code for backward compatibility may affect both performance and maintainability.
    • Performance and security – extra overhead for high security may lessen performance.

A balance between qualities must be understood and taken into account during design. It is important to prioritize and understand what qualities are needed. A good question to ask to help you determine what compromises can be made is: Is there a way to cut back on a certain quality to balance another?

Some common qualities to take into account in software design include performance, maintainability, security, and backward compatibility.

Race condition Scenario 2

Case 1:

I was implementing a file-watch service using the Apache Commons IO framework. With this framework we can monitor a particular directory (in my case); any file created or deleted in that directory is reported to the registered callback. The callback is invoked on a background thread that is started when we start monitoring, and we need to provide a periodic check interval for detecting changes. I believe it takes a snapshot of the current directory contents and compares it with the previous one to detect changes.

During the initialization of my server, I need to check whether any files already exist in those directories and treat them as newly created. So I had code like:

  1. Check for existing files.
  2. Register for file-change notifications.

The problem with this order is that some files may be created between steps 1 and 2; this is a race condition. To fix the issue, we have to flip the order. It is fine to process the same file multiple times (duplicates are possible with the reversed order).
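The fix can be sketched with a toy polling watcher standing in for Commons IO (the file names, the poll loop, and the handler below are illustrative, not the framework's API): start watching first, then scan for pre-existing files, and keep the handler idempotent so the possible duplicate notification is harmless.

```python
import os
import tempfile
import threading
import time

seen = set()            # processed files; a set makes reprocessing harmless
lock = threading.Lock()

def handle(name):
    # Idempotent handler: seeing the same file twice is fine.
    with lock:
        seen.add(name)

def watch(directory, stop, interval=0.05):
    # Toy poller standing in for the framework's monitor thread: snapshot
    # the directory every tick and report names absent from the last tick.
    known = set()
    while not stop.is_set():
        current = set(os.listdir(directory))
        for name in current - known:
            handle(name)
        known = current
        time.sleep(interval)

with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "old.txt"), "w").close()   # pre-existing file

    stop = threading.Event()
    t = threading.Thread(target=watch, args=(d, stop))
    t.start()                        # step 1: start the watcher first
    for name in os.listdir(d):       # step 2: then scan for existing files
        handle(name)
    open(os.path.join(d, "new.txt"), "w").close()   # created after startup

    time.sleep(0.5)
    stop.set()
    t.join()

print(sorted(seen))  # ['new.txt', 'old.txt'] -- old.txt may be handled twice
```

Flipping the two steps back (scan first, then start watching) would reopen the window in which new.txt could appear unnoticed.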

 

Case 2:

During a design discussion of a workflow manager, one of the problems mentioned was this: there is a submission queue and a completion queue. Work items are submitted by subscribers, and we should be able to avoid duplicate work submissions from the subscribers. To implement this we decided to keep a work item in the submission queue until it is completed; once completed, we move it to the completion queue. Moving an item from the submission queue to the completion queue must be done atomically (we should never see the item in both queues, nor in neither queue), and we can use transactions for this move. The problem is that the submitter/subscriber checks these queues before submitting an item to the submission queue (the item should be in neither the submission nor the completion queue), and looking at both queues cannot be done in a single transaction. Because of this, races are possible that can lead to duplicate items being submitted. One solution I proposed:

  1. Look into the submission queue.
  2. Look into the completion queue.
  3. Look into the submission queue again to confirm no changes happened between steps 1 and 3.
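A minimal sketch of that three-step check, using in-memory sets as hypothetical stand-ins for the two queues (the names try_submit and complete are mine, not from the actual design). Note that the re-check narrows the race window but cannot fully close it without a cross-queue transaction, so downstream handlers should still tolerate rare duplicates.

```python
# Hypothetical in-memory stand-ins for the two durable queues.
submission, completion = set(), set()

def try_submit(item):
    # Step 1: look into the submission queue.
    if item in submission:
        return False
    # Step 2: look into the completion queue.
    if item in completion:
        return False
    # Step 3: look into the submission queue again to confirm nothing
    # changed between steps 1 and 3 (e.g., another subscriber submitted
    # the item while we were checking the completion queue).  Without a
    # cross-queue transaction a small window still remains.
    if item in submission:
        return False
    submission.add(item)
    return True

def complete(item):
    # The real system moves the item transactionally; here it is two steps.
    submission.discard(item)
    completion.add(item)

print(try_submit("job-1"))  # True: first submission accepted
print(try_submit("job-1"))  # False: still pending, duplicate rejected
complete("job-1")
print(try_submit("job-1"))  # False: already completed
```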

Transaction Isolation levels in Database

Transactions in a database can happen in isolation, without affecting other transactions. A database with strict mutual exclusion is not performance-efficient, so to avoid the performance penalty, databases provide some flexibility that lets transactions proceed with some level of understanding on the application's side. Without strict mutual exclusion, the transactions run against a database by multiple applications (or threads within an application) can undergo concurrency anomalies. The common anomalies are dirty reads, non-repeatable reads, phantom reads, and lost updates.

Dirty reads:

Thread2 – Transaction2: writes data (not committed to the database)

Thread1 – Transaction1: reads the data (it can see the uncommitted write by Thread2)

Problem: Transaction2 can roll back the write, but Transaction1 has already used the data.

Non-repeatable reads:

Thread1 – Transaction1: reads data

Thread2 – Transaction2: writes the data (committed to the database)

Thread1 – Transaction1: reads the data again (it can see the committed write)

Phantom reads:

Thread1 – Transaction1: reads data

Thread2 – Transaction2: inserts a new row into the table

Thread1 – Transaction1: reads the data again (it can see the newly inserted row)
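These interleavings can be played out with plain Python variables standing in for database rows. This toy sketch shows the non-repeatable-read sequence; no real database, threads, or locking are involved:

```python
# A dict stands in for the "database"; assignments stand in for committed
# writes.  Transaction 1 reads the same row twice and sees different values
# because transaction 2 commits a write in between.
db = {"row": "v1"}

first_read = db["row"]   # Thread1 / Transaction1: reads 'v1'
db["row"] = "v2"         # Thread2 / Transaction2: writes and commits
second_read = db["row"]  # Thread1 / Transaction1: reads again, sees 'v2'

print((first_read, second_read))  # ('v1', 'v2') -- the read did not repeat
```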

Databases provide different isolation levels (visibility levels) to prevent the above-mentioned anomalies and to let applications trade off between performance and anomaly protection. The names below are the JDBC constants:

TRANSACTION_READ_UNCOMMITTED (allows all anomalies)

TRANSACTION_READ_COMMITTED (prevents dirty reads)

TRANSACTION_REPEATABLE_READ (prevents dirty reads and non-repeatable reads)

TRANSACTION_SERIALIZABLE (prevents dirty reads, non-repeatable reads, and phantom reads)

There is another isolation level, snapshot isolation, which prevents anomalies much like serializable isolation does. But where serializable isolation is typically implemented using locks (at a performance cost), under snapshot isolation all reads work on a committed snapshot copy of the database (implemented using versioning) and take no locks, so it is performance-efficient.

In case of conflicting updates between transactions, a transaction will be aborted if the isolation level is snapshot isolation or repeatable read.

Updates can be silently lost under the READ_UNCOMMITTED/READ_COMMITTED isolation levels.
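The lost-update case is easy to reproduce with a toy read-modify-write interleaving (plain variables stand in for rows; the account name and amounts are made up):

```python
# Both "transactions" read the same starting balance, so the second write
# silently clobbers the first -- the lost-update anomaly.  No real
# database is involved; assignments stand in for committed writes.
balance = {"acct": 100}

snapshot1 = balance["acct"]        # transaction 1 reads 100
snapshot2 = balance["acct"]        # transaction 2 reads 100

balance["acct"] = snapshot1 - 30   # transaction 1 writes 70
balance["acct"] = snapshot2 - 50   # transaction 2 writes 50

print(balance["acct"])  # 50, not the expected 20 -- transaction 1 is lost
```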

Database Design concepts

I was looking at someone’s DB schema diagrams and could not understand a few notations and keywords, so I read through these concepts quickly and also watched some YouTube videos. Studytonight (https://www.studytonight.com/dbms) has nice videos on normalization of DB tables, and the Lucidchart website (https://www.lucidchart.com/pages/ER-diagram-symbols-and-meaning) gives a crisp explanation of the notations. Lucidchart uses Crow’s Foot notation.

Some jargon I came across while reading more about DB design:

Primary key: a non-null, unique, and unchanging field in a DB table.
Foreign key: a field referencing the primary key of some other table.
Normalization: a technique for reducing the footprint of redundant data and removing insert/delete/update anomalies in a table design.

  • 1NF

    • Rule 1: Single Valued Attributes
    • Rule 2: Attribute domain should not change (the type of each value should match the schema)
    • Rule 3: Unique name for Attributes/Columns
    • Rule 4: Order doesn’t matter
  •  2NF
    • There should be no partial dependency. Dependency means that all the other columns in a table depend on and relate to the primary key. A partial dependency is when an attribute depends on only part of the primary key rather than the whole key (possible when the primary key is a composite key). For example, teacher name has a partial dependency on subject_id when the primary key is student_id + subject_id.
  •  3NF
    • The table should not have a transitive dependency: a non-prime attribute depending on other non-prime attributes rather than on the prime attributes or the primary key.
  • BCNF
    • For every dependency A → B, A should be a super key. In simple words, for a dependency A → B, A cannot be a non-prime attribute if B is a prime attribute.
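The 2NF fix for the teacher-name example above can be sketched in SQL (table and column names are invented for illustration; sqlite3 just makes the snippet runnable): teacher_name moves out of the score table, whose key is (student_id, subject_id), into a subject table keyed by subject_id alone.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Before 2NF: one table keyed by (student_id, subject_id) also carried
# teacher_name, which depends only on subject_id -- a partial dependency.
# After 2NF: teacher_name lives in its own subject table.
cur.executescript("""
CREATE TABLE score (
    student_id INTEGER,
    subject_id INTEGER,
    marks      INTEGER,
    PRIMARY KEY (student_id, subject_id)
);
CREATE TABLE subject (
    subject_id   INTEGER PRIMARY KEY,
    teacher_name TEXT
);
""")

cur.execute("INSERT INTO subject VALUES (1, 'Alice')")
cur.executemany("INSERT INTO score VALUES (?, ?, ?)",
                [(10, 1, 85), (11, 1, 92)])

# The teacher's name is now stored once, not once per student row,
# and a join recovers the original combined view.
rows = cur.execute("""
    SELECT s.student_id, sub.teacher_name
    FROM score AS s JOIN subject AS sub USING (subject_id)
    ORDER BY s.student_id
""").fetchall()
print(rows)  # [(10, 'Alice'), (11, 'Alice')]
```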

Some SQL query-language jargon to remember:

  • WHERE vs. HAVING: HAVING specifies conditions on aggregated values in a query that uses GROUP BY, while WHERE filters rows before aggregation.
  • GROUP BY is used to apply aggregate functions like count, sum, and avg to the selected columns.
  • ORDER BY is used to arrange the output by a specified column in ascending or descending order.
  • INDEX: if you create an index on a specific column, querying on that field will be faster; otherwise, the values are stored in such a way that retrieval is fast only via the primary key.
  • CROSS JOIN: all combinations of rows from the two tables.
  • INNER JOIN: mutually matching entries (NATURAL JOIN is a variety of it).
  • LEFT OUTER JOIN: the matched data from the two tables, then the remaining rows of the left table with nulls in the right table’s columns.
  • RIGHT OUTER JOIN: the matched data from the two tables, then the remaining rows of the right table with nulls in the left table’s columns.
  • FULL OUTER JOIN: the matched data from the two tables, then the remaining rows of both tables with nulls in the other table’s columns.
  • In the case of a foreign key, if the referenced primary-key row is deleted from the main table, we can either delete all matching rows in the referencing table (cascade delete) or fill those rows with NULL values.
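A few of the constructs above can be tried out quickly with sqlite3 (the tables and data are invented for illustration; RIGHT/FULL OUTER JOIN are omitted because older SQLite versions do not support them):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
CREATE TABLE customers (name TEXT, city TEXT);
CREATE TABLE orders (customer TEXT, amount INTEGER);
INSERT INTO customers VALUES ('ann', 'Oslo'), ('bob', 'Pune'), ('cat', 'Lima');
INSERT INTO orders VALUES ('ann', 10), ('ann', 30), ('bob', 5);
""")

# WHERE filters individual rows before grouping; HAVING filters the
# aggregated groups that GROUP BY produces.
big_spenders = cur.execute("""
    SELECT customer, SUM(amount) AS total
    FROM orders
    WHERE amount > 0
    GROUP BY customer
    HAVING SUM(amount) >= 20
    ORDER BY total DESC
""").fetchall()
print(big_spenders)  # [('ann', 40)]

# LEFT OUTER JOIN keeps every customer; 'cat' has no orders, so the
# right table's column comes back as NULL (None in Python).
joined = cur.execute("""
    SELECT c.name, o.amount
    FROM customers AS c LEFT OUTER JOIN orders AS o ON o.customer = c.name
    ORDER BY c.name, o.amount
""").fetchall()
print(joined)  # [('ann', 10), ('ann', 30), ('bob', 5), ('cat', None)]
```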